Notes from reading "A Comprehensive Survey on Knowledge Distillation".
Introduction
- Need for Knowledge Distillation (KD) in deep learning
- KD vs other model compression techniques
- What is Knowledge Distillation?
- Key challenges in KD
- Coverage of the survey
Sources of Knowledge
Logit-based Distillation
- Loss functions (see the sketch after this list)
- Variants of logit-based distillation
- Disadvantages of logit-based distillation
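As a concrete example of the loss functions bullet above, here is a minimal PyTorch sketch of the classic logit-based objective (Hinton et al., 2015): KL divergence between temperature-softened teacher and student logits, blended with cross-entropy on the hard labels. The function name and the T/alpha values are illustrative assumptions, not anything prescribed by the survey.

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Soft targets: soften both output distributions with temperature T.
    soft_student = F.log_softmax(student_logits / T, dim=-1)
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    # T^2 keeps soft-target gradients on a comparable scale across temperatures.
    distill = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * T * T
    # Hard targets: the usual cross-entropy against ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * distill + (1.0 - alpha) * ce
```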
Feature-based Distillation
- Advantages of feature-based distillation
- Loss functions (see the sketch after this list)
- Variants of feature-based distillation
- Challenges of feature-based distillation
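For the feature-based loss functions bullet, a minimal sketch in the FitNets "hint" style: an L2 loss between an intermediate teacher feature map and the student's, with a 1x1 convolution (an assumption here, as is the class name) reconciling channel widths. Which layers to match and how to align them vary widely across the variants the survey lists.

```python
import torch.nn as nn
import torch.nn.functional as F

class HintLoss(nn.Module):
    def __init__(self, student_channels, teacher_channels):
        super().__init__()
        # Project student features into the teacher's channel space.
        self.regressor = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, student_feat, teacher_feat):
        # Assumes matching spatial sizes; detach() makes the frozen teacher
        # a fixed target so no gradients flow into it.
        return F.mse_loss(self.regressor(student_feat), teacher_feat.detach())
```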
Similarity-based Distillation
- TODO (illustrative sketch below)
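These notes are still TODO; as a hedged placeholder, one well-known instance is similarity-preserving KD (Tung & Mori, 2019), which matches batch-wise pairwise similarity matrices rather than raw activations. All names and shapes below are assumptions for illustration.

```python
import torch.nn.functional as F

def similarity_preserving_loss(student_feat, teacher_feat):
    # Flatten each sample's features to a vector: (B, C, H, W) -> (B, C*H*W).
    s = student_feat.flatten(1)
    t = teacher_feat.flatten(1)
    # B x B pairwise similarity matrices, L2-normalized per row.
    g_s = F.normalize(s @ s.t(), p=2, dim=1)
    g_t = F.normalize(t @ t.t(), p=2, dim=1)
    # Penalize disagreement in relational structure, not in the features themselves.
    return F.mse_loss(g_s, g_t.detach())
```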
Schemes
Offline Distillation
- Definition and process (see the sketch after this list)
- Advantages and disadvantages
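A minimal sketch of the offline process: the teacher is pretrained and frozen, and only the student is updated. Toy models and random data stand in for real ones; kd_loss is the logit-based sketch from earlier in these notes.

```python
import torch
import torch.nn as nn

teacher = nn.Linear(16, 10)  # stand-in for a large pretrained network
student = nn.Linear(16, 10)  # stand-in for a compact network
teacher.eval()               # offline: the teacher never trains
for p in teacher.parameters():
    p.requires_grad_(False)

opt = torch.optim.SGD(student.parameters(), lr=0.1)
x, y = torch.randn(32, 16), torch.randint(0, 10, (32,))
with torch.no_grad():
    t_logits = teacher(x)    # knowledge flows one way, teacher -> student
loss = kd_loss(student(x), t_logits, y)
opt.zero_grad(); loss.backward(); opt.step()
```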
Online Distillation
- Definition and process (see the sketch after this list)
- Advantages and disadvantages
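A minimal sketch of the online process in the deep-mutual-learning style (Zhang et al., 2018): two peers train simultaneously, each distilling from the other's softened predictions, so no pretrained teacher is needed. Models and data are toy stand-ins.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

net_a, net_b = nn.Linear(16, 10), nn.Linear(16, 10)
opt = torch.optim.SGD(list(net_a.parameters()) + list(net_b.parameters()), lr=0.1)
x, y = torch.randn(32, 16), torch.randint(0, 10, (32,))

la, lb = net_a(x), net_b(x)
# Each peer: cross-entropy on labels plus KL toward the other peer's outputs.
loss_a = F.cross_entropy(la, y) + F.kl_div(
    F.log_softmax(la, dim=-1), F.softmax(lb.detach(), dim=-1), reduction="batchmean")
loss_b = F.cross_entropy(lb, y) + F.kl_div(
    F.log_softmax(lb, dim=-1), F.softmax(la.detach(), dim=-1), reduction="batchmean")
opt.zero_grad(); (loss_a + loss_b).backward(); opt.step()
```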
Self-Distillation
- Definition and process (see the sketch after this list)
- Advantages and disadvantages
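A minimal sketch of one common self-distillation flavor, born-again style (Furlanello et al., 2018): a converged snapshot of the same architecture serves as its own teacher for the next generation. Again a toy model, and kd_loss is the logit-based sketch above.

```python
import copy
import torch
import torch.nn as nn

model = nn.Linear(16, 10)
snapshot = copy.deepcopy(model).eval()  # the "teacher" is an earlier self
for p in snapshot.parameters():
    p.requires_grad_(False)

opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(32, 16), torch.randint(0, 10, (32,))
with torch.no_grad():
    t_logits = snapshot(x)
loss = kd_loss(model(x), t_logits, y)
opt.zero_grad(); loss.backward(); opt.step()
```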
Algorithms (TODO)
- Attention-based Distillation
- Adversarial Distillation
- Multi-teacher Distillation
- Cross-modal Distillation
- Graph-based Distillation
- Adaptive Distillation
- Contrastive Distillation