Hi list,
I spotted this issue a while ago with the cross_val_score
function when using a LeaveOneOut iterator over a regressor whose default
score function is r2_score. I'm finally taking the time to raise it here
for sharing and discussion.
The issue comes from an incompatibility between these two points:
1. the cross_val_score function is designed to return a list of scores,
one per train/test split;
2. r2_score returns 0. when its denominator (proportional to the variance
of the target) equals 0.
As a result, scoring with r2_score under a LeaveOneOut iterator will
always return a list of zeros: each testing set is reduced to a single
element, so the estimated variance of its target is 0.
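To make the degenerate case concrete, here is a small plain-Python sketch. The r2 helper is a hypothetical stand-in that only mimics the zero-denominator convention described above; it is not the library's code, and the numbers are made up for illustration.

```python
def r2(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, returning 0. when SS_tot is 0.

    Hypothetical helper: it follows the convention described in this
    post, not the actual scikit-learn implementation.
    """
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0.0:
        # A single-sample test set always ends up here:
        # the target has no spread around its own mean.
        return 0.0
    return 1.0 - ss_res / ss_tot


# Made-up targets and held-out predictions, one per left-out sample.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.5, 2.5, 3.5]

# Per-fold scoring: every test set holds exactly one point.
per_fold = [r2([t], [p]) for t, p in zip(y_true, y_pred)]

# Pooled scoring: one score over all held-out predictions at once.
pooled = r2(y_true, y_pred)

print(per_fold)
print(pooled)
```

The per-fold list carries no information about the fit, whereas the pooled score does, which is the motivation for the proposal below.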
IMHO, the leave-one-out case is a very special case of cross-validation
iterators compared to KFold... I'd rather implement a cross_val_predictor
function that would only return a list of cross-validated predictions. Then
anyone could apply the appropriate score function on this set of
cross-validated predictions, yielding a more meaningful estimate of the
predictor's score.
A simple raw implementation of the cross_val_predictor function could be as
follows:
def dummy_score_function_that_returns_predicted_targets(y_true, y_pred):
    """The name sounds explicit, doesn't it?"""
    return y_pred

def cross_val_predictor(predictor, X, y):
    return cross_val_score(predictor, X, y,
        score_func=dummy_score_function_that_returns_predicted_targets)
But I also discovered that passing a custom score function to
cross_val_score is deprecated as of 0.13 (at least), and you guys might
have better ideas!
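One way to avoid the deprecated score_func argument might be to loop over the leave-one-out splits by hand. The following is only a rough sketch under assumptions: it expects any object exposing fit(X, y) and predict(X), works on plain lists, and uses a hypothetical MeanPredictor toy class in place of a real regressor.

```python
import copy


class MeanPredictor:
    """Hypothetical toy regressor: always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def cross_val_predictor(predictor, X, y):
    """Return leave-one-out predictions, one per sample, in input order."""
    y_pred = []
    for i in range(len(y)):
        # Train on everything except sample i ...
        X_train = [x for j, x in enumerate(X) if j != i]
        y_train = [t for j, t in enumerate(y) if j != i]
        # ... using a fresh copy, so the caller's predictor stays untouched.
        model = copy.deepcopy(predictor).fit(X_train, y_train)
        # Predict the single held-out sample.
        y_pred.append(model.predict([X[i]])[0])
    return y_pred


# Made-up data for illustration.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [1.0, 2.0, 3.0, 6.0]
y_pred = cross_val_predictor(MeanPredictor(), X, y)
print(y_pred)
```

Any score function can then be applied once to (y, y_pred) as a whole, instead of once per fold.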
I can open an issue on GitHub if this makes sense.
Cheers,
Vincent
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general