On Wed, Apr 03, 2019 at 08:54:51AM -0400, Andreas Mueller wrote:
> If the loss decomposes, the result might be different b/c of different
> test set sizes, but I'm not sure if they are "worse" in some way?
Mathematically, a cross-validation estimates a double expectation: one
expectation on the model (i.e. on the train data), and another on the test
data (see for instance eq. 3 in https://europepmc.org/articles/pmc5441396,
sorry for the self-citation; this is seldom discussed in the literature).
The correct way to compute this double expectation is to average first
inside each fold and then across the folds. Other ways of computing the
error estimate other quantities, which are harder to study mathematically
and are not comparable to the objects studied in the literature.

Another problem with cross_val_predict is that some people use metrics
like correlation (which is a terrible metric and does not decompose across
folds). Computed on the pooled predictions, it will then pick up things
like correlations across folds.

All these problems are made worse when the data are not iid, and hence the
folds risk not being iid.

G
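PS: a minimal sketch of the two aggregations, in case it helps; the
synthetic data and the Ridge model are arbitrary choices for illustration,
not from the paper:

from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score, cross_val_predict
from sklearn.metrics import r2_score

# Toy data and model: arbitrary choices, just for illustration
X, y = make_regression(n_samples=200, n_features=20, noise=10.0,
                       random_state=0)
model = Ridge()
cv = KFold(n_splits=5)

# Average first inside each fold, then across folds: this estimates the
# double expectation (over train data and over test data)
per_fold = cross_val_score(model, X, y, cv=cv, scoring='r2')
print("mean of per-fold scores:", per_fold.mean())

# Pool all held-out predictions and score them once: this estimates a
# different quantity, and a metric that does not decompose across folds
# can pick up structure across folds
pooled = cross_val_predict(model, X, y, cv=cv)
print("score on pooled predictions:", r2_score(y, pooled))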