I use

sum((cross_val_predict(model, X, y) - y)**2) / len(y)        (*)

to evaluate the performance of a model. This is consistent with Murphy, Machine Learning, section 6.5.3, and Hastie et al., The Elements of Statistical Learning, eq. 7.48. However, according to the documentation of cross_val_predict, "it is not appropriate to pass these predictions into an evaluation metric". While it is clear that cross_val_predict is different from cross_val_score, I do not see what would be wrong with (*).
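For concreteness, here is a minimal, self-contained sketch of the comparison I have in mind (the synthetic data, the Ridge model and the fold count are only placeholders, not my actual setup):

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_predict, cross_val_score

    X, y = make_regression(n_samples=200, n_features=5, noise=1.0, random_state=0)
    model = Ridge()

    # (*): pool the out-of-fold predictions, then compute one MSE over all samples
    pred = cross_val_predict(model, X, y, cv=5)
    mse_pooled = np.mean((pred - y) ** 2)

    # cross_val_score: compute the MSE separately per fold, then average the fold scores
    fold_mse = -cross_val_score(model, X, y, cv=5, scoring="neg_mean_squared_error")

    print(mse_pooled, fold_mse.mean())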

Also, the explanation that cross_val_predict "simply returns the labels (or probabilities)" (https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.cross_val_predict.html#sklearn.model_selection.cross_val_predict) is unclear, if not wrong. As I understand it, this function returns estimates, not labels or probabilities.
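A toy illustration of what I mean (the data sets and estimators are chosen only for demonstration): for a classifier the function does return class labels, or probabilities when method="predict_proba" is requested, but for a regressor it returns continuous predictions.

    from sklearn.datasets import load_iris, make_regression
    from sklearn.linear_model import LogisticRegression, Ridge
    from sklearn.model_selection import cross_val_predict

    # Classification: the returned array holds predicted class labels,
    # or class probabilities when method="predict_proba" is passed.
    X_c, y_c = load_iris(return_X_y=True)
    labels = cross_val_predict(LogisticRegression(max_iter=1000), X_c, y_c, cv=5)
    proba = cross_val_predict(LogisticRegression(max_iter=1000), X_c, y_c, cv=5,
                              method="predict_proba")

    # Regression: the returned array holds continuous out-of-fold predictions.
    X_r, y_r = make_regression(n_samples=100, n_features=3, noise=1.0, random_state=0)
    estimates = cross_val_predict(Ridge(), X_r, y_r, cv=5)

    print(labels[:5], proba[:2], estimates[:3])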

Regards, Boris

