Deep Learning
Application domains:
- Computer Vision
- Natural Language Processing
- Speech Recognition
Related methods:
- Incremental Learning
- Reinforcement Learning
- Transfer Learning
Refer to: Awesome Incremental Learning Papers
All survey papers are sorted by their first submission date.
Paper: paper on arxiv
Code: code on GitHub
First Submission: 2021-01-25
Latest Submission: 2021-06-01
Dataset: CIFAR-100, MiniImageNet, CORe50-NC, NonStationary-MiniImageNet, CORe50-NI
Methods:
Related Survey
Focus
Trends
Paper: paper on arxiv
Code: code on GitHub
First Submission: 2020-11-03
Latest Submission: 2020-12-15
Dataset: ILSVRC, VGGFACE2, Google Landmarks, CIFAR-100
Methods:
Related Survey
None
Focus
Trends
Paper: paper on arxiv
Code: code on GitHub
First Submission: 2020-10-28
Latest Submission: 2021-05-06
Dataset: CIFAR-100, Oxford Flowers, MIT Indoor Scenes, CUB-200-2011 Birds, Stanford Cars, FGVC Aircraft, Stanford Actions, VGGFace2, ImageNet
Methods:
Related Survey
Focus
Trends
Paper: paper on arxiv, paper on IEEE
Code: code on GitHub
First Submission: 2019-09-18
Latest Submission: 2021-04-16
Dataset: Tiny ImageNet, iNaturalist, RecogSeq (Oxford Flowers, MIT Scenes, Caltech-UCSD Birds, Stanford Cars, FGVC-Aircraft, VOC Actions, Letters, SVHN)
Methods:
Related Survey
None
Focus
Trends
Paper: paper on arxiv
Code: code on GitHub
First Submission: 2019-04-15
Latest Submission: 2019-04-15
Dataset: MNIST (split MNIST & permuted MNIST)
Methods:
Related Survey
None
Focus
Trends
None
Paper: paper on arxiv
Code: None
First Submission: 2018-02-21
Latest Submission: 2019-02-11
Dataset: None
Methods:
Related Survey
None
Focus
Trends
Paper: paper on arxiv
Code: code on GitHub
First Submission: 2016-06-29
Latest Submission: 2017-02-14
Focus: image classification problems with Convolutional Neural Network classifiers.
Parameters
A CNN has a set of shared parameters $\theta_s$ (e.g., the 5 convolutional layers and 2 fully connected layers of the AlexNet architecture).
Task-specific parameters for previously learned tasks, $\theta_o$ (e.g., the output layer for ImageNet classification and its corresponding weights).
Randomly initialized task-specific parameters for the new tasks, $\theta_n$.
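A minimal PyTorch sketch of this parameter split (the backbone slicing and head sizes are illustrative assumptions, not taken from the paper):

```python
import torch.nn as nn
import torchvision.models as models

backbone = models.alexnet(weights=None)

# Shared parameters theta_s: the convolutional layers plus the first
# two fully connected layers of the AlexNet backbone.
theta_s = nn.Sequential(
    backbone.features, backbone.avgpool, nn.Flatten(),
    backbone.classifier[:-1],   # drop the original 1000-way head
)

# Task-specific parameters theta_o for a previously learned task,
# e.g. a 1000-way ImageNet head.
theta_o = nn.Linear(4096, 1000)

# Randomly initialized task-specific parameters theta_n for a new
# task, e.g. a 20-way head for PASCAL VOC.
theta_n = nn.Linear(4096, 20)
```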
Related work
Feature extraction: $\theta_s$ and $\theta_o$ are unchanged, and the outputs of one or more layers are used as features for training $\theta_n$ on the new tasks. Drawback: feature extraction typically underperforms on the new task because the shared parameters fail to represent information that is discriminative for it.
Fine-tuning: $\theta_s$ and $\theta_n$ are both optimized for the new tasks, while $\theta_o$ is fixed. Drawback: fine-tuning degrades performance on previously learned tasks because the shared parameters change without any guidance for the original task-specific prediction parameters.
Joint Training: all parameters $\theta_s$, $\theta_o$, $\theta_n$ are jointly optimized. Drawback: joint training becomes increasingly cumbersome as more tasks are learned, and is impossible if the training data for previously learned tasks is unavailable. (The three regimes are contrasted in the sketch below.)
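The three regimes differ only in which parameter groups receive gradients. A hedged sketch with stand-in modules (the helper name and layer sizes are my own, not from the paper):

```python
import torch.nn as nn

# Stand-ins for the three parameter groups defined above.
theta_s = nn.Linear(9216, 4096)   # shared backbone
theta_o = nn.Linear(4096, 1000)   # head(s) of previously learned tasks
theta_n = nn.Linear(4096, 20)     # head of the new task

def set_trainable(module: nn.Module, flag: bool) -> None:
    # Freeze or unfreeze every parameter of a module.
    for p in module.parameters():
        p.requires_grad = flag

# Feature extraction: theta_s and theta_o frozen, only theta_n trained.
for m, flag in [(theta_s, False), (theta_o, False), (theta_n, True)]:
    set_trainable(m, flag)

# Fine-tuning: theta_s and theta_n trained, theta_o frozen.
for m, flag in [(theta_s, True), (theta_o, False), (theta_n, True)]:
    set_trainable(m, flag)

# Joint training: everything trained (needs the old tasks' data).
for m, flag in [(theta_s, True), (theta_o, True), (theta_n, True)]:
    set_trainable(m, flag)
```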
Algorithm of LwF
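LwF first records the old heads' responses on the new-task images, then jointly optimizes $\theta_s$, $\theta_o$, $\theta_n$ with a new-task loss plus a knowledge-distillation loss that keeps those recorded responses stable. A minimal sketch of the combined loss (temperature $T=2$ follows the paper; `lambda_o` and the KL-divergence formulation are common implementation choices, not necessarily the authors' exact code):

```python
import torch
import torch.nn.functional as F

def lwf_loss(new_logits, new_targets, old_logits, recorded_old_logits,
             T: float = 2.0, lambda_o: float = 1.0):
    # New-task loss: standard cross-entropy on the theta_n outputs.
    loss_new = F.cross_entropy(new_logits, new_targets)

    # Old-task loss: keep the theta_o outputs on new-task images close
    # to the responses recorded before training (distillation at T=2).
    log_p = F.log_softmax(old_logits / T, dim=1)
    q = F.softmax(recorded_old_logits / T, dim=1)
    loss_old = F.kl_div(log_p, q, reduction="batchmean") * (T * T)

    return loss_new + lambda_o * loss_old
```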
Backbone
AlexNet, VGG
Dataset
ImageNet, Places2, VOC
Paper: paper on arxiv, paper on CVF
Code: code on GitHub
First Submission: 2016-11-23
Latest Submission: 2017-04-14
First definition of class-incremental learning:
An algorithm must satisfy the following three properties to qualify as class-incremental:
1. It should be trainable from a stream of data in which examples of different classes occur at different times.
2. It should at any time provide a competitive multi-class classifier for the classes observed so far.
3. Its computational requirements and memory footprint should remain bounded, or at least grow very slowly, with respect to the number of classes seen so far.
Components of iCaRL
Introduction
Classification
Training
Why nearest-mean-of-exemplars classification
NME overcomes two major problems of the incremental learning setting:
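In code, the rule itself is simple. A minimal sketch, assuming a feature extractor `phi` and one stored exemplar batch per class (function and variable names are illustrative, not iCaRL's code):

```python
import torch
import torch.nn.functional as F

def nme_classify(phi, x, exemplar_sets):
    """Nearest-mean-of-exemplars: assign each input to the class whose
    mean exemplar feature is closest in (normalized) feature space."""
    # Class prototypes: mean of the L2-normalized exemplar features.
    prototypes = torch.stack([
        F.normalize(phi(P), dim=1).mean(dim=0) for P in exemplar_sets
    ])                                            # (n_classes, d)
    feats = F.normalize(phi(x), dim=1)            # (batch, d)
    # Predict the class with the smallest Euclidean distance.
    return torch.cdist(feats, prototypes).argmin(dim=1)
```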
Why representation learning
Exemplar management
Overall, iCaRL's exemplar selection and reduction steps fit the incremental learning setting exactly: the selection step is required only once per class, when it is first observed and its training data is available. At later times only the reduction step is called, and it does not need access to any earlier training data. (A herding-style selection sketch follows below.)
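A minimal sketch of the selection step in the herding style iCaRL uses: exemplars are picked greedily so that the running mean of their features tracks the class mean, which also gives them the priority order that makes reduction a simple truncation (names and the duplicate-masking detail are illustrative):

```python
import torch
import torch.nn.functional as F

def select_exemplars(phi, images, m):
    """Greedily pick m images whose feature mean best approximates the
    class mean; earlier picks have higher priority."""
    with torch.no_grad():
        feats = F.normalize(phi(images), dim=1)    # (n, d)
    class_mean = feats.mean(dim=0)
    chosen = torch.zeros(feats.size(0), dtype=torch.bool)
    order, running_sum = [], torch.zeros_like(class_mean)
    for k in range(1, m + 1):
        # Distance from the class mean to each candidate exemplar mean.
        gap = class_mean - (running_sum + feats) / k   # (n, d)
        dist = gap.norm(dim=1).masked_fill(chosen, float("inf"))
        i = int(dist.argmin())
        order.append(i)
        chosen[i] = True
        running_sum = running_sum + feats[i]
    return [images[j] for j in order]

# Reduction under a shrinking memory budget: exemplars are stored in
# priority order, so keeping the first m' is enough:
# exemplar_set = exemplar_set[:m_prime]
```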
Related work
Dataset
CIFAR-100, ImageNet ILSVRC 2012
Future work