Re: [Scikit-learn-general] Grid search with validation set

Gael Varoquaux Fri, 28 Oct 2011 07:14:22 -0700

On Fri, Oct 28, 2011 at 04:06:35PM +0200, Olivier Grisel wrote:
> To address Andreas' use case (which seems valid to me) I think we
> should have a new grid_search utility function that does not try to
> implement the `fit` API, which is too restrictive for this use case.


I am not too excited about this. The reason is that it breaks any
advanced but legitimate usage, such as nested cross-validation. Finding
the best parameters and measuring the prediction error on the same
dataset is overfitting.
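
For illustration, a minimal sketch of nested cross-validation. It assumes
the present-day sklearn.model_selection API, which postdates this message;
the iris data, the linear SVC and the grid of C values are arbitrary
stand-ins, not something taken from the thread. It relies on the grid
search object implementing `fit`, so that it can be passed to
cross_val_score like any other estimator.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

# Placeholder data and model (assumptions for the sake of the example).
X, y = load_iris(return_X_y=True)

# Inner loop: choose C using only the training part of each outer fold.
inner_cv = KFold(n_splits=3, shuffle=True, random_state=0)
search = GridSearchCV(SVC(kernel="linear"), {"C": [0.1, 1, 10]}, cv=inner_cv)

# Outer loop: score the tuned estimator on data the search never saw.
outer_cv = KFold(n_splits=5, shuffle=True, random_state=1)
nested_scores = cross_val_score(search, X, y, cv=outer_cv)
print(nested_scores.mean())
```

Each outer fold reruns the whole parameter search, so the reported error
is not measured on the data that picked the parameters.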

Thus it seems to me that the function that would really need to be
replaced would be cross_val_score, but it is a bit trivial to replace:

    estimator.fit(X_train, y_train).score(X_test, y_test)

A ShuffleSplit can be used inside this, in combination with a GridSearch,
to do parameter selection with only one fold. Indeed, inside the train
data, there is seldom a predefined test and train subgroup.
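
A possible sketch of that combination, again assuming the present-day
sklearn.model_selection API rather than the 2011 modules. The dataset,
the estimator, the parameter grid and the 25% split sizes are
illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, ShuffleSplit, train_test_split
from sklearn.svm import SVC

# Placeholder data, split once into train and held-out test sets.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# A single random train/validation split drawn inside the training data.
one_split = ShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
search = GridSearchCV(SVC(kernel="linear"), {"C": [0.1, 1, 10]}, cv=one_split)

# Parameter selection sees only the training data; the refitted best
# estimator is then scored on the held-out test data.
test_score = search.fit(X_train, y_train).score(X_test, y_test)
print(search.best_params_, test_score)
```

With a single split the selected parameters depend on one random
partition, so the choice is noisier than with several folds.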

I am actually not sure that I have understood the use case that we are
discussing.

G
