Dear list,
I'm using GridSearchCV to do some simple model selection for a text
classification task. I've got it working (see below for caveat), but I'm
not convinced that I'm making the best use of this tool. If someone has the
time/inclination, I'd love a set of eyes to check the following gist to see
if I'm doing this correctly:
https://gist.github.com/e2ca1910450819a8a28
Also, for some reason this is throwing errors when I set n_jobs to anything
other than 1. I'm on OS X 10.7.4, using sklearn 0.13. The traceback looks
like:
Process PoolWorker-1:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/process.py", line 232, in _bootstrap
    self.run()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/process.py", line 88, in run
    self._target(*self._args, **self._kwargs)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 59, in worker
    task = get()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/queues.py", line 352, in get
    return recv()
TypeError: ('data type not understood', <type 'numpy.dtype'>, ('S0', 0, 1))
Process PoolWorker-2:
[...etc etc ad infinitum]
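The error is raised in recv(), i.e. while a worker is unpickling the task it was sent, so one thing that could be checked is whether the inputs survive a pickle round trip at all. A minimal sketch of such a check follows; the helper name and the sample data are hypothetical stand-ins, not the objects from the gist, and a passing check would not rule out other causes:

```python
import pickle


def survives_pickle_roundtrip(obj):
    """Return True if obj can be pickled and unpickled without raising."""
    try:
        pickle.loads(pickle.dumps(obj))
    except Exception:
        return False
    return True


# Hypothetical stand-ins for the real documents and labels.
docs = ["first document", "second document"]
labels = [0, 1]

print(survives_pickle_roundtrip(docs))    # plain lists of strings
print(survives_pickle_roundtrip(labels))  # plain lists of ints
```

Running the same helper on the actual arrays and estimator passed to the grid search might narrow down which object the workers cannot decode.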
Has anyone else come across this, or does anyone have any insight into what's
going on? Needless to say, this grid search is taking FOREVER (ca. 10 hrs
thus far, and only about halfway through), and I'd love to be able to
parallelize it.
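For a sense of scale, here is a rough sketch of how the number of model fits grows with the grid: an exhaustive search fits every parameter combination once per cross-validation fold (not counting a final refit). The parameter names, values, and fold count below are hypothetical and are not the grid from the gist:

```python
from itertools import product


def iter_candidates(param_grid):
    """Yield every parameter combination in the grid as a dict."""
    keys = sorted(param_grid)
    for combo in product(*(param_grid[k] for k in keys)):
        yield dict(zip(keys, combo))


def count_fits(param_grid, n_folds):
    """Number of fits for an exhaustive search: candidates times folds."""
    n_candidates = sum(1 for _ in iter_candidates(param_grid))
    return n_candidates * n_folds


# Hypothetical grid: 2 n-gram ranges x 3 regularization strengths.
param_grid = {
    "vect__ngram_range": [(1, 1), (1, 2)],
    "clf__C": [0.1, 1.0, 10.0],
}

print(count_fits(param_grid, n_folds=5))
```

Since the fits are independent of each other, this count is also the number of tasks that could in principle be spread over workers.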
Many thanks,
Fred.
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general