Re: [Scikit-learn-general] GridSearch example

2012-11-18 Thread Gael Varoquaux
On Fri, Nov 16, 2012 at 12:41:14PM -0800, Fred Mailhot wrote: > This doesn't appear to be documented (at least not at > http://scikit-learn.org/dev/modules/generated/sklearn.grid_search.GridSearchCV.html)... Thanks for reporting, I have addressed this: https://github.com/scikit-learn/scikit-lear

Re: [Scikit-learn-general] GridSearch example

2012-11-17 Thread Andreas Mueller
On 11/16/2012 08:41 PM, Fred Mailhot wrote: On 15 November 2012 23:20, Andreas Mueller wrote: [...] You can give GridSearchCV not only a grid but also a list of grids. I would go with that. (is that sufficiently documented?) This doesn't app
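To illustrate the "list of grids" suggestion above, here is a minimal pure-Python sketch of how such a list expands into candidate settings. It is not scikit-learn's implementation; `expand_grids` is a hypothetical helper, and the parameter names and values (`kernel`, `C`, `gamma`) are illustrative assumptions. The point is that each grid is expanded on its own, so a parameter that only makes sense for one setting never gets combined with the others.

```python
from itertools import product


def expand_grids(param_grid):
    """Expand one grid (a dict) or a list of grids into candidate dicts.

    Hypothetical helper for illustration; each grid is expanded
    independently, so keys from one grid never mix with another.
    """
    grids = [param_grid] if isinstance(param_grid, dict) else list(param_grid)
    candidates = []
    for grid in grids:
        names = sorted(grid)
        # Cartesian product of the value lists within this single grid.
        for values in product(*(grid[name] for name in names)):
            candidates.append(dict(zip(names, values)))
    return candidates


# Illustrative values only: "gamma" is searched for the "rbf" grid alone.
param_grid = [
    {"kernel": ["linear"], "C": [1, 10]},
    {"kernel": ["rbf"], "C": [1, 10], "gamma": [0.01, 0.001]},
]
candidates = expand_grids(param_grid)
```

With scikit-learn itself, the same list of dicts would be passed as the `param_grid` argument of `GridSearchCV`, as described in the message above.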

Re: [Scikit-learn-general] GridSearch example

2012-11-16 Thread Fred Mailhot
On 15 November 2012 23:20, Andreas Mueller wrote: > [...] > You can give GridSearchCV not only a grid but also a list of grids. > I would go with that. > (is that sufficiently documented?) This doesn't appear to be documented (at least not at http://scikit-learn.org/dev/modules/generated/sklearn

Re: [Scikit-learn-general] GridSearch example

2012-11-16 Thread Ronnie Ghose
Ahh.. sorry >_<. I thought I made a new thread... sigh. On 16 November 2012 15:33, Fred Mailhot wrote: > Check out SGDClassifier and partial_fit()...I've used these to good effect. > > Also, PROTIP: if you want decent help, don't piggy-back on threads that > have nothing to do with your questio

Re: [Scikit-learn-general] GridSearch example

2012-11-16 Thread Fred Mailhot
Check out SGDClassifier and partial_fit()...I've used these to good effect. Also, PROTIP: if you want decent help, don't piggy-back on threads that have nothing to do with your question. Just sayin'. On 16 November 2012 12:23, Ronnie Ghose wrote: > Any ideas for online learning with Scikit? I
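The `partial_fit()` pattern mentioned above can be sketched without any dependencies: stream the data in mini-batches and update the model one batch at a time, so the full data set never has to fit in memory. `TinyPerceptron` and `iter_minibatches` below are hypothetical stand-ins, not scikit-learn code, and the toy data is made up; with scikit-learn one would instead call `SGDClassifier.partial_fit(X, y, classes=...)` inside the same loop.

```python
from itertools import islice


def iter_minibatches(rows, batch_size):
    """Yield successive lists of at most batch_size items from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


class TinyPerceptron:
    """Hypothetical stand-in learner with a partial_fit-style update."""

    def __init__(self, n_features):
        self.w = [0.0] * n_features
        self.b = 0.0

    def predict_one(self, x):
        score = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if score > 0 else -1

    def partial_fit(self, X, y):
        # Classic perceptron rule: update only on misclassified samples.
        for x, target in zip(X, y):
            if self.predict_one(x) != target:
                self.w = [wi + target * xi for wi, xi in zip(self.w, x)]
                self.b += target
        return self


# Toy stream (made up): label is +1 when the first feature dominates.
stream = [((2.0, 0.0), 1), ((0.0, 2.0), -1),
          ((3.0, 1.0), 1), ((1.0, 3.0), -1)] * 5

model = TinyPerceptron(n_features=2)
for batch in iter_minibatches(stream, batch_size=4):
    X = [x for x, _ in batch]
    y = [label for _, label in batch]
    model.partial_fit(X, y)  # only one batch is held in memory at a time
```

For a file too large for memory, `stream` would be a generator that parses lines lazily rather than a list.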

Re: [Scikit-learn-general] GridSearch example

2012-11-16 Thread Ronnie Ghose
Any ideas for online learning with Scikit? I have a data set larger than 20 GB that I want to train on. I don't think I can do that easily, so what should I do? Thanks, Shomiron Ghose On 15 November 2012 15:45, Fred Mailhot wrote: > Dear list, > > I'm using GridSearchCV to do some simple model

Re: [Scikit-learn-general] GridSearch example

2012-11-16 Thread Olivier Grisel
This is a really weird low-level error. Maybe a Python bug. I don't have time to investigate, but if someone else can reproduce it, it would be interesting to try and make a minimalistic reproduction script that just uses the Python multiprocessing API. --

Re: [Scikit-learn-general] GridSearch example

2012-11-16 Thread Fred Mailhot
Thanks to all for the tips on GridSearch with FeatureUnion, I'll be trying those out today. And @amueller I've been following the development of your PR for the random sampling of param space with great interest. But back to the initial problem...it seems that an empty input is the cause. My raw d

Re: [Scikit-learn-general] GridSearch example

2012-11-15 Thread Andreas Mueller
Sorry for not being able to help you with the actual problem, but another hint: I have a pull request for randomly sampling the parameter space, which should be much more efficient in a model with so many parameters. https://github.com/scikit-learn/scikit-learn/pull/1194
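As a rough illustration of why random sampling scales better than an exhaustive grid, here is a minimal pure-Python sketch. It is not the code from the pull request; `sample_params` is a hypothetical helper and the parameter names and values are illustrative assumptions. A full grid grows multiplicatively with every added parameter, while random sampling evaluates a fixed budget of `n_iter` candidates regardless of how many parameters there are. Current scikit-learn releases offer `RandomizedSearchCV` for this purpose.

```python
import random


def sample_params(param_distributions, n_iter, seed=0):
    """Draw n_iter random parameter settings from lists of candidate values.

    Hypothetical helper for illustration: each parameter is sampled
    independently and uniformly from its list.
    """
    rng = random.Random(seed)  # seeded so the draw is reproducible
    names = sorted(param_distributions)
    return [
        {name: rng.choice(param_distributions[name]) for name in names}
        for _ in range(n_iter)
    ]


# Illustrative search space: 4 * 3 * 2 * 2 = 48 grid points in total.
space = {
    "C": [0.01, 0.1, 1, 10],
    "ngram_max": [1, 2, 3],
    "analyzer": ["word", "char_wb"],
    "use_idf": [True, False],
}
grid_size = 1
for values in space.values():
    grid_size *= len(values)

samples = sample_params(space, n_iter=10)  # evaluate 10 settings, not 48
```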

Re: [Scikit-learn-general] GridSearch example

2012-11-15 Thread Andreas Mueller
>> 2) how would I go about grid search over different vectorizers (e.g. >> CountVectorizer(analyzer="word"), CountVectorizer(analyzer="char_wb"), and a >> FeatureUnion of the two)? > You could always use a FeatureUnion and give it different TransformerLists via the GridSearchCV (at least I think t

Re: [Scikit-learn-general] GridSearch example

2012-11-15 Thread Mathieu Blondel
On Fri, Nov 16, 2012 at 3:28 PM, Gael Varoquaux < gael.varoqu...@normalesup.org> wrote: > On Thu, Nov 15, 2012 at 05:07:24PM -0800, Fred Mailhot wrote: > > 1) there are a few LinearSVC options (penalty/loss, penalty/dual) for > which > > certain values are incompatible, but which are not documente

Re: [Scikit-learn-general] GridSearch example

2012-11-15 Thread Gael Varoquaux
On Thu, Nov 15, 2012 at 05:07:24PM -0800, Fred Mailhot wrote: > 1) there are a few LinearSVC options (penalty/loss, penalty/dual) for which > certain values are incompatible, but which are not documented as such...this > makes grid search a bit of a pain. Indeed, they should be documented. Pull re

Re: [Scikit-learn-general] GridSearch example

2012-11-15 Thread Fred Mailhot
I already know that things work with n_jobs=1. I just tried n_jobs=-1 with a few smaller datasets (100 & 1000 items) and things seem to have worked fine (without LinearSVC, see below). Possibly there's something wrong with the larger dataset...investigating now. A couple of points related to grid

Re: [Scikit-learn-general] GridSearch example

2012-11-15 Thread Andreas Mueller
Are you sure the error is related to n_jobs, not a specific classifier? Could you run with n_jobs=1 and a very small training set (like 100 examples or something) and see if it runs through? (Actually I'm totally clueless but that doesn't look like a multiprocessing error to me) On 11/15/201

Re: [Scikit-learn-general] GridSearch example

2012-11-15 Thread Fred Mailhot
Argh, copy-paste error: https://gist.github.com/e2ca1910450819a8a287 As for Accelerate, I'm not 100% sure how to check that (I cloned & ran "setup.py build" and "setup.py install" without making any changes, if memory serves), but this leads me to think "yes": $ otool -L /Users/aboutuser/Development/

Re: [Scikit-learn-general] GridSearch example

2012-11-15 Thread Andreas Mueller
Hi Fred. The link is dead for me. Do you link against Accelerate (not sure if this is relevant)? Cheers, Andy On 11/15/2012 08:45 PM, Fred Mailhot wrote: Dear list, I'm using GridSearchCV to do some simple model selection for a text classification task. I've got it working (see below for cave

[Scikit-learn-general] GridSearch example

2012-11-15 Thread Fred Mailhot
Dear list, I'm using GridSearchCV to do some simple model selection for a text classification task. I've got it working (see below for caveat), but I'm not convinced that I'm making the best use of this tool. If someone has the time/inclination, I'd love a set of eyes to check the following gist t