Hi,
I'm trying to optimize the time it takes to make a prediction with my
model(s). I realized that when I perform feature selection during the
model fit(), these features are likely still computed when I go
to predict() or predict_proba(). An optimization would then involve
actually eliminating those features from the computation before prediction.
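
(Not from the thread, just a sketch of one possible way to do this,
assuming a text setup with CountVectorizer + SelectKBest and toy data;
all names below are illustrative.) After fitting, you can rebuild the
vectorizer around only the selected terms, so the dropped features are
never computed at predict time:

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression

docs = ["spam spam ham", "ham eggs", "spam eggs spam", "ham ham"]
y = [1, 0, 1, 0]

# fit the full chain once: vectorize -> select -> classify
vec = CountVectorizer()
X = vec.fit_transform(docs)
sel = SelectKBest(chi2, k=2).fit(X, y)
clf = LogisticRegression().fit(sel.transform(X), y)

# rebuild a vectorizer whose vocabulary holds only the selected terms,
# so unselected tokens are never even counted at predict() time
kept = np.asarray(vec.get_feature_names_out())[sel.get_support()]
slim_vec = CountVectorizer(vocabulary=kept)

print(clf.predict(slim_vec.transform(["spam and eggs"])))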
Hi all,
In the documentation (
http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html#sklearn.feature_extraction.text.CountVectorizer)
it is written that when a callable tokenizer is passed
into (Count/TfIdf)Vectorizer, it "Only applies if analyzer == 'word'".
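
(Just an illustration, not from the thread; the toy tokenizer below
stands in for whatever callable you pass.) With the default
analyzer='word' the callable is used to split each preprocessed
document into tokens; with any other analyzer it is simply not used:

from sklearn.feature_extraction.text import CountVectorizer

def my_tokenizer(text):
    # stand-in for a real custom tokenizer (e.g. NLTK- or spaCy-based)
    return text.lower().split()

# analyzer defaults to 'word', so the callable tokenizer is honoured
vec = CountVectorizer(tokenizer=my_tokenizer)
vec.fit(["Hello world", "hello again"])
print(sorted(vec.vocabulary_))   # ['again', 'hello', 'world']

# with analyzer='char' (or a callable analyzer) tokenizer= is not used;
# recent scikit-learn versions warn that the parameter is ignored
vec_char = CountVectorizer(tokenizer=my_tokenizer, analyzer='char')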
> ... different algorithms you'd like to compare and select the model &
> algorithm that gives you the "best" unbiased estimate (average of the
> outer loop validation scores). After that, you select this "best" learning
> algorithm and tune it again via "regular" k-fold cross-validation.
Hi all,
My question is mostly technical, but partly about ML best practice. I am
performing (Randomized/)GridSearchCV to 'optimize' the hyperparameters
of my estimator. However, if I want to do model selection after this,
it would be best to do nested cross-validation to get a more unbiased
estimate of performance.
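
One way to get that (a minimal sketch, with an arbitrary estimator and
grid standing in for your own): wrap the search object you already use
inside an outer cross_val_score, so the hyperparameter tuning is
repeated within every outer fold:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score

X, y = load_iris(return_X_y=True)
inner = GridSearchCV(LogisticRegression(max_iter=1000),
                     {"C": [0.01, 0.1, 1, 10]}, cv=5)   # inner loop: tuning
outer_scores = cross_val_score(inner, X, y, cv=5)       # outer loop: estimate
print(outer_scores.mean(), "+/-", outer_scores.std())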