Subliminal Learning: How AI Models Inherit Hidden Dangers
Researchers have uncovered an unexpected flaw in distillation, one of the most common techniques for building smaller, cheaper AI models. When a “student” model is trained on filtered outputs from a larger “teacher,” it can still inherit the teacher’s quirks and unsafe behaviors, even when those traits never appear in the training data. They’re […]
The post Subliminal Learning: How AI Models Inherit Hidden Dangers appeared first on Analytics Vidhya.