Train/dev/test distributions

Choose a dev set and test set that reflect the data you expect to get in the future and consider important to do well on.

Size of the dev and test sets

Traditionally, train/test: 70%, 30%; or train/dev/test: 60%, 20%, 20%.

Currently, train/dev/test: roughly 98%, 1%, 1% for deep learning algorithms trained on very large datasets.
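A minimal sketch of such a split, assuming the examples are addressed by integer index and that a single random shuffle is acceptable. The function name `split_indices`, the 1% fractions, and the dataset size of 1,000,000 are illustrative assumptions, not part of the notes.

```python
import random


def split_indices(n_examples, dev_frac=0.01, test_frac=0.01, seed=0):
    """Shuffle example indices and cut them into train/dev/test lists.

    Hypothetical helper: dev and test are drawn from the same shuffled
    pool, so they share one distribution.
    """
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)

    n_dev = int(round(n_examples * dev_frac))
    n_test = int(round(n_examples * test_frac))

    dev = indices[:n_dev]
    test = indices[n_dev:n_dev + n_test]
    train = indices[n_dev + n_test:]
    return train, dev, test


# Assumed dataset size, for illustration only.
train, dev, test = split_indices(1_000_000)
print(len(train), len(dev), len(test))
```

With one million examples, a 1% dev set still holds 10,000 examples, which is why the large traditional fractions are no longer needed.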

Size of the test set: set your test set to be big enough to give high confidence in the overall performance of your system.
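One rough way to judge "big enough" is to look at how the uncertainty of a measured error rate shrinks with test set size. The sketch below uses the normal approximation to a binomial proportion, which assumes independent test examples; the function name `error_margin`, the 5% error rate, and the 95% level (z = 1.96) are assumptions chosen for illustration.

```python
import math


def error_margin(error_rate, n_test, z=1.96):
    """Approximate half-width of a confidence interval for an error rate.

    Normal approximation to the binomial; z=1.96 corresponds to roughly
    a 95% interval. Assumes test examples are independent.
    """
    return z * math.sqrt(error_rate * (1.0 - error_rate) / n_test)


# Hypothetical 5% measured error at several test set sizes.
for n_test in (1_000, 10_000, 100_000):
    print(n_test, error_margin(0.05, n_test))
```

The margin falls with the square root of the test set size, so a tenfold larger test set narrows the interval by only about a factor of three.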

If doing well on your metric and dev/test set doesn't correspond to doing well on your application, change your metric and/or dev/test set.

Human-level error: used as a proxy for Bayes error
Training error:
- Training error - human-level error = "avoidable bias". To reduce it:
- Train a bigger model
- Train longer / use better optimization algorithms
- NN architecture/hyperparameter search
Dev error:
- Dev error - training error = "variance". To reduce it:
- More data
- Regularization: L2, dropout, data augmentation
- NN architecture/hyperparameter search
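The two gaps above can be turned into a small decision helper. This is a sketch only: the function name `diagnose`, the rule of focusing on whichever gap is larger (ties go to bias), and the example error rates are assumptions for illustration.

```python
# Remedies as listed in the notes above.
BIAS_REMEDIES = [
    "train a bigger model",
    "train longer / use better optimization algorithms",
    "NN architecture/hyperparameter search",
]
VARIANCE_REMEDIES = [
    "more data",
    "regularization: L2, dropout, data augmentation",
    "NN architecture/hyperparameter search",
]


def diagnose(human_error, train_error, dev_error):
    """Compute avoidable bias and variance (errors given in percent)
    and suggest remedies for the larger gap. Hypothetical helper."""
    avoidable_bias = train_error - human_error
    variance = dev_error - train_error
    if avoidable_bias >= variance:  # assumed tie-break: bias first
        focus, remedies = "avoidable bias", BIAS_REMEDIES
    else:
        focus, remedies = "variance", VARIANCE_REMEDIES
    return {
        "avoidable_bias": avoidable_bias,
        "variance": variance,
        "focus": focus,
        "remedies": remedies,
    }


# Hypothetical error rates, in percent.
report = diagnose(human_error=1.0, train_error=8.0, dev_error=10.0)
print(report["focus"], report["remedies"])
```

With the same training and dev errors but a human-level error of 7.5%, the avoidable bias shrinks to 0.5 points and the helper would point at variance instead.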

                                    General speech recognition    Rearview mirror speech data
Human-level error                   "Human level": 4%
Error on examples trained on        "Training error": 7%
Error on examples not trained on    "Training-dev error": 10%     "Dev/test error": 6%

- Human level (4%) to training error (7%): avoidable bias
- Training error (7%) to training-dev error (10%): variance
- Training-dev error (10%) to dev/test error (6%): data mismatch
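The gaps in the table can be computed directly from its four error rates. The sketch below uses the table's own numbers; the function name `error_gaps` and the dictionary keys are illustrative assumptions.

```python
def error_gaps(human, train, train_dev, dev):
    """Break the path from human-level error to dev/test error into
    three gaps (all values in percent). Hypothetical helper."""
    return {
        "avoidable_bias": train - human,
        "variance": train_dev - train,
        "data_mismatch": dev - train_dev,
    }


# Error rates taken from the table, in percent.
gaps = error_gaps(human=4.0, train=7.0, train_dev=10.0, dev=6.0)
for name, value in gaps.items():
    print(f"{name}: {value:+.1f} points")
```

Here the data mismatch gap is negative: the dev/test error (6%) is lower than the training-dev error (10%), which suggests the target data is easier for the model than the general data it was trained on.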