Hi,
I ran into the same issue with the R2 score for regression:
http://sourceforge.net/mailarchive/message.php?msg_id=31136945
Most scores use averages over a test sample.
Hence, I think the choice between mean(scores) and score(concatenation)
depends on the CV iterator:
- For KFold it makes sense to use mean(scores), since each KFold test
fold holds more than one element.
- For LeaveOneOut it doesn't, since there is only one element per test fold
(averaging over a single element does not make sense).
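
To make the two options concrete, here is a minimal sketch. It assumes a recent scikit-learn, so it uses sklearn.model_selection, roc_auc_score and cross_val_predict rather than the score_func / auc_score names from the quoted call, and it runs on synthetic data from make_classification rather than Josh's X and y. The variable names are only illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneOut, StratifiedKFold, cross_val_predict
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption, not the data from the thread).
X, y = make_classification(n_samples=120, n_features=10, random_state=0)
clf = SVC(C=1)

# Option 1, mean(scores): one AUC per test fold, then average.
# Stratified folds keep both classes in every test fold, so AUC is defined.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_aucs = []
for train_idx, test_idx in skf.split(X, y):
    clf.fit(X[train_idx], y[train_idx])
    fold_scores = clf.decision_function(X[test_idx])
    fold_aucs.append(roc_auc_score(y[test_idx], fold_scores))
mean_of_scores = float(np.mean(fold_aucs))

# Option 2, score(concatenation): with LeaveOneOut each test fold holds a
# single sample, so collect the held-out decision values for all samples
# and compute one AUC over the pooled values.
loo_scores = cross_val_predict(
    clf, X, y, cv=LeaveOneOut(), method="decision_function"
)
pooled_auc = float(roc_auc_score(y, loo_scores))

print("mean of per-fold AUCs (5-fold):", mean_of_scores)
print("AUC on pooled LOO predictions: ", pooled_auc)
```

One caveat with the pooled variant: the decision values come from a different fitted model for each held-out sample, so a ranking metric such as AUC ends up comparing scores across models. That is probably harmless when the models are nearly identical, as in LOO, but it is worth keeping in mind.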
Cheers,
Vincent
2013/7/8 Josh Wasserstein <[email protected]>
> The following call runs into an error:
>
> clf = GridSearchCV(SVC(C=1), tuned_parameters,
>                    score_func=sklearn.metrics.auc_score,
>                    verbose=2, n_jobs=1, cv=loo)
> clf.fit(X, y)
>
> with:
> /opt/python/virtualenvs/work/lib/python2.7/site-packages/sklearn/metrics/metrics.pyc in auc(x, y, reorder)
>      64     # XXX: Consider using ``scipy.integrate`` instead, or moving to
>      65     # ``utils.extmath``
> ---> 66     x, y = check_arrays(x, y)
>      67     if x.shape[0] < 2:
>      68         raise ValueError('At least 2 points are needed to compute'
>
> even though X and y hold more than 100 examples with 20+ positives.
>
> It looks like sklearn cannot obtain AUC scores with LOO, since AUC requires
> at least two points (and probably a mix of positives and negatives), and in
> LOO each fold has only one point.
>
> However, one way to circumvent this limitation could be to concatenate the
> prediction of each fold in LOO (concatenate all predictions), and only then
> measure AUC.
>
> In fact, this is a whole different way of evaluating the performance of a
> model with cross validation. Rather than averaging the scores across folds,
> one could always concatenate the prediction results and measure the
> performance. This way score functions can always be measured directly on
> the prediction of the full dataset.
>
> This also raises an interesting ML question, since
> mean(scores) != score(concatenation).
>
> Is there anything wrong with this approach?
>
> Thanks,
>
> Josh
>
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general