This is odd. I can successfully run the example `grid_search_digits.py`.
However, I am unable to do a grid search on my own data.
I have the following setup:
===============
import sklearn
from sklearn.svm import SVC
from sklearn.grid_search import GridSearchCV
from sklearn.cross_validation import LeaveOneOut, StratifiedKFold
from sklearn.metrics import auc_score
# ... Build X and y ....
tuned_parameters = [{'kernel': ['rbf'], 'gamma': [1e-3, 1e-4],
                     'C': [1, 10, 100, 1000]},
                    {'kernel': ['linear'], 'C': [1, 10, 100, 1000]}]
loo = LeaveOneOut(len(y))
clf = GridSearchCV(SVC(C=1), tuned_parameters, score_func=auc_score)
clf.fit(X, y, cv=loo)
....
print clf.best_estimator_
....
===============
But I never get past `clf.fit` (I left it running for ~1 hr).
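For scale, the search itself is small. Below is a rough count of the model fits, as a sketch assuming GridSearchCV fits every parameter combination once per fold. The helper name `count_candidates` is made up for illustration, and the final refit on the full data is not counted:

```python
from itertools import product

# Same grid as in the setup above.
tuned_parameters = [
    {'kernel': ['rbf'], 'gamma': [1e-3, 1e-4], 'C': [1, 10, 100, 1000]},
    {'kernel': ['linear'], 'C': [1, 10, 100, 1000]},
]


def count_candidates(param_grid):
    """Count parameter combinations across a list of grid dicts (hypothetical helper)."""
    total = 0
    for grid in param_grid:
        # Each dict expands to the Cartesian product of its value lists.
        total += len(list(product(*grid.values())))
    return total


n_samples = 27                                      # rows in X
n_candidates = count_candidates(tuned_parameters)   # rbf combinations plus linear ones
n_loo_fits = n_candidates * n_samples               # leave-one-out: one fold per sample

print("%d candidates, %d leave-one-out fits" % (n_candidates, n_loo_fits))
```

That comes to 12 candidates and 324 fits under leave-one-out, which on its own does not look like enough work to account for an hour on a 27x26 array.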
I have also tried with
clf.fit(X, y, cv=10)
and with
skf = StratifiedKFold(y,2)
clf.fit(X, y, cv=skf)
and had the same problem (it never finishes the clf.fit statement). My data
is simple:
> X.shape
(27,26)
> y.shape
5
> y.dtype
dtype('int64')
> ?y
Type: ndarray
String Form:[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1]
Length: 27
File: /home/jacob04/opt/python/numpy/numpy-1.7.1/lib/python2.7/site-packages/numpy/__init__.py
Docstring: <no docstring>
Class Docstring:
ndarray(shape, dtype=float, buffer=None, offset=0,
strides=None, order=None)
> ?X
Type: ndarray
String Form:
[[ -3.61238468e+03 -3.61253920e+03 -3.61290196e+03
-3.61326679e+03
7.84590361e+02 0.0000 <...> 0000e+00 2.22389150e+00
2.53252959e+00
2.11606216e+00 -1.99613432e+05 -1.99564828e+05]]
Length: 27
File: /home/jacob04/opt/python/numpy/numpy-1.7.1/lib/python2.7/site-packages/numpy/__init__.py
Docstring: <no docstring>
Class Docstring:
ndarray(shape, dtype=float, buffer=None, offset=0,
strides=None, order=None)
This is all with the latest version of scikit-learn (0.13.1) and:
$ pip freeze
Cython==0.19.1
PIL==1.1.7
PyXB==1.2.2
PyYAML==3.10
argparse==1.2.1
distribute==0.6.34
epc==0.0.5
ipython==0.13.2
jedi==0.6.0
matplotlib==1.3.x
nltk==2.0.4
nose==1.3.0
numexpr==2.1
numpy==1.7.1
pandas==0.11.0
pyparsing==1.5.7
python-dateutil==2.1
pytz==2013b
rpy2==2.3.1
scikit-learn==0.13.1
scipy==0.12.0
sexpdata==0.0.3
six==1.3.0
stemming==1.0.1
-e git+https://github.com/PyTables/PyTables.git@df7b20444b0737cf34686b5d88b4e674ec85575b#egg=tables-dev
tornado==3.0.1
wsgiref==0.1.2
Thanks,
Jacob
PS: This thread is based on the following StackOverflow post:
http://stackoverflow.com/questions/17455302/clf-fit-freezes-on-small-dataset-in-scikit-learn