Thanks, Robert and everyone else. I am still having a strange problem with
GridSearchCV in 0.14-git. Even though the number returned by
===================================
len(list(ParameterGrid(tuned_parameters)))
===================================
is 191,880, my call to:
===================================
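from sklearn.cross_validation import StratifiedKFold
from sklearn.grid_search import GridSearchCV
from sklearn.svm import SVC

skf = StratifiedKFold(y, 4)
clf = GridSearchCV(SVC(C=1, cache_size=5000), tuned_parameters,
                   scoring=score_name, verbose=1, n_jobs=1, cv=skf)
clf.fit(X, y)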
===================================
reports 661,250 jobs after 14 minutes (and it keeps going). Below are the
parameters that I am using:
===================================
# Set the parameters by cross-validation
import numpy as np

param_range = dict()
# Use a float base: integer 2 raised to negative integer exponents
# does not give fractional values in NumPy.
param_range['C'] = np.power(2.0, np.arange(-18, 18, 0.5))
param_range['gamma'] = param_range['C']
param_range['coef0'] = np.power(2.0, np.arange(-18, 18, 1))
# Defined but not used in the grid below
param_range['poly_coef0'] = np.power(2.0, np.arange(-10, 10, 2))
param_range['poly_degrees'] = np.arange(2, 4)
tuned_parameters = [
    {'kernel': ['linear'], 'C': param_range['C']},
    {'kernel': ['rbf'], 'C': param_range['C'],
     'gamma': param_range['gamma']},
    {'kernel': ['sigmoid'], 'C': param_range['C'],
     'gamma': param_range['gamma'], 'coef0': param_range['coef0']},
]
===================================
Why is there a discrepancy between the number of jobs reported by clf.fit
and the count returned by ParameterGrid? Is it because of the number of
folds in CV? How can I get the actual number of jobs that GridSearchCV
will run?
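
One way to test the fold hypothesis is to do the count by hand. The sketch
below is plain Python with no scikit-learn calls. It assumes, without having
confirmed it in the GridSearchCV source, that each reported job is one fit of
one parameter setting on one CV fold. grid_size and N_FOLDS are names made up
for this check, and the value lists only reproduce the lengths of the arrays
in the grid above.

```python
from math import prod

# Same value counts as the grid above: 72 values for C and gamma
# (exponents -18 to 17.5 in steps of 0.5), 36 for coef0 (-18 to 17).
C_values = [2.0 ** (k / 2) for k in range(-36, 36)]
gamma_values = C_values
coef0_values = [2.0 ** k for k in range(-18, 18)]

tuned_parameters = [
    {'kernel': ['linear'], 'C': C_values},
    {'kernel': ['rbf'], 'C': C_values, 'gamma': gamma_values},
    {'kernel': ['sigmoid'], 'C': C_values, 'gamma': gamma_values,
     'coef0': coef0_values},
]


def grid_size(param_grid):
    """Count the settings in a list-of-dicts grid (hypothetical helper)."""
    return sum(prod(len(values) for values in grid.values())
               for grid in param_grid)


N_FOLDS = 4  # matches StratifiedKFold(y, 4)
n_candidates = grid_size(tuned_parameters)  # 72 + 72*72 + 72*72*36
n_fits = n_candidates * N_FOLDS  # assumption: one job per (setting, fold)

print(n_candidates)
print(n_fits)
```

If the assumption holds, the total would be 191,880 x 4 = 767,520 fits, which
would be consistent with seeing 661,250 jobs reported while the search is
still running.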
Josh
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general