On 03.04.19 at 23:46, Joel Nothman wrote:
Pull requests improving the documentation are always welcome. At a minimum, users need to know that these compute different things.

Accuracy is not precision. Precision is the number of true positives divided by the number of positive predictions (true positives plus false positives). It therefore cannot be decomposed as a sample-wise measure without knowing the rate of positive predictions, and this rate depends on the training data and the algorithm.
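
To make the distinction concrete, here is a small sketch (the labels are made up for illustration) showing that accuracy is a per-sample mean of an indicator loss, while precision is a ratio whose denominator is the number of predicted positives:

    import numpy as np
    from sklearn.metrics import accuracy_score, precision_score

    y_true = np.array([1, 0, 0, 0])
    y_pred = np.array([1, 1, 0, 0])

    # Accuracy is the mean of a per-sample indicator loss.
    print(np.mean(y_pred == y_true), accuracy_score(y_true, y_pred))   # 0.75 0.75

    # Precision = TP / (TP + FP); the denominator is the number of predicted
    # positives, which is set by the classifier, not by the sample size.
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    print(tp / (tp + fp), precision_score(y_true, y_pred))             # 0.5 0.5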

In my last post, I referred to your remark that "for precision ... you can't say the same". Since precision can't be computed with formula (*), even with a different loss function, I pointed out that (*) can be used to compute the accuracy if the loss function is an indicator function.

It is still not clear to me what your point is with the remark that "for precision ... you can't say the same". I assume you mean that it is not advisable to compute TP, FP, and FN from cross_val_predict and then derive precision and recall from them. If that is what you mean, I'd like you to explain why.

I'm not a statistician and cannot speak to issues of computing a mean of means, but if what we are trying to estimate is the performance on a sample of size approximately n_t of a model trained on a sample of size approximately N - n_t, then I wouldn't have thought taking a mean over such measures (with whatever score function) to be unreasonable.

In general, a mean of means is not the mean of the original data. The pooled mean is the correct metric in this case. However, the pooled mean equals the mean of means if all folds are exactly the same size.
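
As a quick numeric check (the loss values are made up), the two only coincide when the folds have equal size:

    import numpy as np

    # Per-fold losses; the folds deliberately have different sizes.
    fold_losses = [np.array([0.0, 1.0, 1.0]),   # fold of size 3
                   np.array([1.0, 1.0])]        # fold of size 2

    mean_of_means = np.mean([f.mean() for f in fold_losses])   # (2/3 + 1) / 2 = 0.833...
    pooled_mean = np.concatenate(fold_losses).mean()           # 4 / 5         = 0.8
    print(mean_of_means, pooled_mean)

    # With folds of equal size the two coincide.
    equal_folds = [np.array([0.0, 1.0]), np.array([1.0, 1.0])]
    print(np.mean([f.mean() for f in equal_folds]),             # 0.75
          np.concatenate(equal_folds).mean())                   # 0.75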

On Thu., 4 Apr. 2019, 3:51 am Boris Hollas, <hol...@informatik.htw-dresden.de> wrote:

    On 03.04.19 at 13:59, Joel Nothman wrote:
    The equations in Murphy and Hastie very clearly assume a metric
    decomposable over samples (a loss function). Several popular metrics
    are not.

    For a metric like MSE it will be almost identical assuming the test
    sets have almost the same size.

    What will be almost identical to what? I suppose you mean that (*)
    is consistent with the scores of the models in the folds (i.e., the
    result of cross_val_score) if the loss function is (x-y)².
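
    To check this concretely, one could compare the two quantities (a sketch with an arbitrary dataset and model; with the default, equally sized KFold splits the numbers should agree up to floating point):

        import numpy as np
        from sklearn.datasets import make_regression
        from sklearn.linear_model import LinearRegression
        from sklearn.model_selection import cross_val_predict, cross_val_score

        X, y = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)
        model = LinearRegression()

        # Formula (*): pooled squared error over the cross-validated predictions.
        pred = cross_val_predict(model, X, y, cv=5)
        print(np.sum((pred - y) ** 2) / len(y))

        # Mean of the per-fold MSEs, as reported by cross_val_score.
        fold_mse = -cross_val_score(model, X, y, cv=5, scoring="neg_mean_squared_error")
        print(fold_mse.mean())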
    For something like Recall
    (sensitivity) it will be almost identical assuming similar test set
    sizes *and* stratification. For something like precision whose
    denominator is determined by the biases of the learnt classifier on
    the test dataset, you can't say the same.

    I can't follow here. If the loss function is L(x,y) = 1_{x = y},
    then (*) gives the accuracy.
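
    A small sketch of what I mean (illustrative data and classifier; the folds here have equal size, so the pooled accuracy from cross_val_predict matches the mean of the per-fold accuracies, while the two precision figures need not coincide):

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import accuracy_score, precision_score
        from sklearn.model_selection import StratifiedKFold, cross_val_predict

        X, y = make_classification(n_samples=100, n_informative=3, class_sep=0.5, random_state=0)
        model = LogisticRegression()
        cv = StratifiedKFold(n_splits=5)

        # (*) with an indicator loss: the pooled accuracy, plus the pooled precision.
        pred = cross_val_predict(model, X, y, cv=cv)
        print(np.mean(pred == y), accuracy_score(y, pred), precision_score(y, pred))

        # Per-fold accuracies and precisions.
        fold_acc, fold_prec = [], []
        for train, test in cv.split(X, y):
            fold_pred = model.fit(X[train], y[train]).predict(X[test])
            fold_acc.append(accuracy_score(y[test], fold_pred))
            fold_prec.append(precision_score(y[test], fold_pred))
        print(np.mean(fold_acc), np.mean(fold_prec))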
    For something like ROC AUC
    score, relying on some decision function that may not be equivalently
    calibrated across splits, evaluating in this way is almost
    meaningless.
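
    If I understand the concern, it could be illustrated like this (a sketch with arbitrary data; it pools decision scores from models fitted on different training folds and compares the result with the per-fold AUCs):

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import roc_auc_score
        from sklearn.model_selection import StratifiedKFold, cross_val_predict

        X, y = make_classification(n_samples=200, n_informative=3, class_sep=0.5, random_state=0)
        model = LogisticRegression()
        cv = StratifiedKFold(n_splits=5)

        # AUC over pooled out-of-fold decision scores (scores from differently
        # fitted models are mixed on one ranking scale).
        scores = cross_val_predict(model, X, y, cv=cv, method="decision_function")
        print(roc_auc_score(y, scores))

        # Mean of the per-fold AUCs.
        fold_auc = [roc_auc_score(y[test],
                                  model.fit(X[train], y[train]).decision_function(X[test]))
                    for train, test in cv.split(X, y)]
        print(np.mean(fold_auc))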

    In any case, I still don't see what may be wrong with (*). If
    nothing is, the warning in the documentation about the use of
    cross_val_predict should be removed or revised.

    On the other hand, an example in the documentation uses
    cross_val_scores.mean(). This is debatable since this computes a
    mean of means.

    On Wed, 3 Apr 2019 at 22:01, Boris Hollas <hol...@informatik.htw-dresden.de> wrote:
    I use

    sum((cross_val_predict(model, X, y) - y)**2) / len(y)        (*)

    to evaluate the performance of a model. This conforms with Murphy: Machine Learning,
    section 6.5.3, and Hastie et al: The Elements of Statistical Learning, eq. 7.48.
    However, according to the documentation of cross_val_predict, "it is not appropriate
    to pass these predictions into an evaluation metric". While it is obvious that
    cross_val_predict is different from cross_val_score, I don't see what should be wrong
    with (*).

    Also, the explanation that "cross_val_predict simply returns the labels (or
    probabilities)" is unclear, if not wrong. As I understand it, this function returns
    estimates, not labels or probabilities.

    Regards, Boris

    _______________________________________________
    scikit-learn mailing list
    scikit-learn@python.org
    https://mail.python.org/mailman/listinfo/scikit-learn

