Re: [Scikit-learn-general] Gridsearch pickle error with scipy distributions

2015-08-15 Thread Joel Nothman
This is a known scipy deficiency. See https://github.com/scipy/scipy/pull/4821 and related issues. On 15 August 2015 at 05:37, Jason Sanchez wrote: > This code raises a PicklingError: > > from sklearn.datasets import load_boston > from sklearn.pipeline import Pipeline > from sklearn.ensemble imp

[Scikit-learn-general] Gridsearch pickle error with scipy distributions

2015-08-14 Thread Jason Sanchez
This code raises a PicklingError: from sklearn.datasets import load_boston from sklearn.pipeline import Pipeline from sklearn.ensemble import RandomForestRegressor from sklearn.grid_search import RandomizedSearchCV from sklearn.externals import joblib from scipy.stats import randint X, y = load_b

Re: [Scikit-learn-general] gridsearch with sparsematrices

2014-09-11 Thread Manoj Kumar
This is simply because centering a sparse matrix (that is, subtracting its mean) would make it dense, and there would be no point in it being sparse in the first place. On Thu, Sep 11, 2014 at 9:09 PM, Pagliari, Roberto wrote: > This is what I’m getting when using sparse matric
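
To make the point above concrete, here is a small plain-Python illustration (no scikit-learn calls; `center`, `scale_without_centering` and `nonzeros` are hypothetical helper names, and the data is made up). It only assumes that `with_mean=False` means "divide by the standard deviation but do not subtract the mean", which is how the error message reads:

```python
def nonzeros(column):
    """Count the entries a sparse format would actually have to store."""
    return sum(1 for value in column if value != 0)


def center(column):
    """Subtract the column mean from every entry (what centering does)."""
    mean = sum(column) / len(column)
    return [value - mean for value in column]


def scale_without_centering(column):
    """Divide by the standard deviation only; zeros stay zero."""
    n = len(column)
    mean = sum(column) / n
    std = (sum((value - mean) ** 2 for value in column) / n) ** 0.5
    std = std or 1.0  # avoid dividing by zero for a constant column
    return [value / std for value in column]


# A mostly-zero feature column, as in a typical sparse matrix.
column = [0.0] * 8 + [4.0, 6.0]

print("stored entries, raw:           ", nonzeros(column))
print("stored entries, centered:      ", nonzeros(center(column)))
print("stored entries, scaled only:   ", nonzeros(scale_without_centering(column)))
```

Centering turns every zero into minus the mean, so nothing is left to skip; scaling alone keeps the zeros where they were, which is why the data is still rescaled but not shifted.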

[Scikit-learn-general] gridsearch with sparsematrices

2014-09-11 Thread Pagliari, Roberto
This is what I'm getting when using sparse matrices with grid search: Cannot center sparse matrices: pass `with_mean=False` instead. See docstring for motivation and alternatives. If I set with_mean=False, does that mean the data will not be scaled with respect to the mean? If so, why would one

Re: [Scikit-learn-general] GridSearch comparing two preprocessors (or graph paths)

2014-08-07 Thread Joel Nothman
This is possible with https://github.com/scikit-learn/scikit-learn/pull/1769, which includes an example of something quite similar. Reviews would be greatly appreciated! On 8 August 2014 07:32, Ronnie Ghose wrote: > No afaik but it's easy enough to build in :) > On Aug 7, 2014 5:03 PM, "Satraji

Re: [Scikit-learn-general] GridSearch comparing two preprocessors (or graph paths)

2014-08-07 Thread Ronnie Ghose
No afaik but it's easy enough to build in :) On Aug 7, 2014 5:03 PM, "Satrajit Ghosh" wrote: > hi folks, > > is there a way for GridSearch in scikit learn to choose between two > preprocessors (e.g., PCA vs FeatureAgglomeration). more generally, whether > there is something to search through diff

[Scikit-learn-general] GridSearch comparing two preprocessors (or graph paths)

2014-08-07 Thread Satrajit Ghosh
hi folks, is there a way for GridSearch in scikit learn to choose between two preprocessors (e.g., PCA vs FeatureAgglomeration). more generally, whether there is something to search through different paths of a pipeline graph. i think i have seen this being discussed, but my keywords were not ret

Re: [Scikit-learn-general] GridSearch and Gaussian Bayes

2014-03-19 Thread Felipe Eltermann
To enhance evaluation metrics you still can: - try some feature selection methods - use more data (if possible) On Wed, Mar 19, 2014 at 1:09 PM, James Bergstra wrote: > If there are no hyper-parameters, then you don't need to do any such > optimization :) > > > On Wed, Mar 19, 2014 at 8:04 AM, A

Re: [Scikit-learn-general] GridSearch and Gaussian Bayes

2014-03-19 Thread James Bergstra
If there are no hyper-parameters, then you don't need to do any such optimization :) On Wed, Mar 19, 2014 at 8:04 AM, Ahmed Ibrahim wrote: > Hi, > > 1) Is it possible to optimise Gaussian Naive Bayes using GridSearch? > > 2) Since there are no parameters in GaussianNB, do I have to use another >

[Scikit-learn-general] GridSearch and Gaussian Bayes

2014-03-19 Thread Ahmed Ibrahim
Hi, 1) Is it possible to optimise Gaussian Naive Bayes using GridSearch? 2) Since there are no parameters in GaussianNB, do I have to use another approach for optimisation? Thanks Ahmed

Re: [Scikit-learn-general] GridSearch with sample_weights

2013-06-01 Thread Joel Nothman
Ahh and I'd forgotten that 1574 included support in grid search. I should perhaps take a look at that. On Sun, Jun 2, 2013 at 1:10 AM, Andreas Mueller wrote: > On 06/01/2013 01:03 PM, Joel Nothman wrote: > > I haven't seen any patch for this precisely, though it's a known issue > > (even if it d

Re: [Scikit-learn-general] GridSearch with sample_weights

2013-06-01 Thread Andreas Mueller
On 06/01/2013 01:03 PM, Joel Nothman wrote: > I haven't seen any patch for this precisely, though it's a known issue > (even if it doesn't seem to be explicitly ticketed; it's closest to > https://github.com/scikit-learn/scikit-learn/issues/1179). There are > various tricky cases not currently s

Re: [Scikit-learn-general] GridSearch with sample_weights

2013-06-01 Thread Joel Nothman
I haven't seen any patch for this precisely, though it's a known issue (even if it doesn't seem to be explicitly ticketed; it's closest to https://github.com/scikit-learn/scikit-learn/issues/1179). There are various tricky cases not currently supported for which it's easiest to roll your own search

Re: [Scikit-learn-general] GridSearch with sample_weights

2013-05-31 Thread Andreas Mueller
Hi Peter. What you are trying to achieve is currently not possible afaik. There was a branch by Noel to implement this but I'm not sure about its state. You could give it a shot. Alternatively you can use the IterGrid (or ParameterGrid in the dev version I think) from the grid_search module and write th
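
A hand-rolled search loop of the kind suggested above could look roughly like the following. This is only a sketch in plain Python under stated assumptions: `parameter_grid` is a hypothetical stand-in for IterGrid/ParameterGrid, `WeightedRidge1D` is a made-up toy estimator used in place of AdaBoostClassifier, and the fold split is a simple modulo split rather than anything scikit-learn does. The point is just that the loop slices `sample_weight` together with the data and hands it to `fit` and `score` itself:

```python
from itertools import product


def parameter_grid(grid):
    """Yield one parameter dict per combination (hypothetical stand-in for ParameterGrid)."""
    keys = sorted(grid)
    for values in product(*(grid[key] for key in keys)):
        yield dict(zip(keys, values))


class WeightedRidge1D:
    """Hypothetical toy estimator: fits y ~ w * x with an L2 penalty alpha."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y, sample_weight):
        num = sum(s * x * t for s, x, t in zip(sample_weight, X, y))
        den = sum(s * x * x for s, x in zip(sample_weight, X)) + self.alpha
        self.w_ = num / den
        return self

    def score(self, X, y, sample_weight):
        # Negative weighted mean squared error, so that higher is better.
        err = sum(s * (t - self.w_ * x) ** 2 for s, x, t in zip(sample_weight, X, y))
        return -err / sum(sample_weight)


def take(seq, indices):
    """Select the given positions from a list."""
    return [seq[i] for i in indices]


def weighted_grid_search(grid, X, y, sample_weight, n_folds=3):
    """Try every parameter combination, passing the matching weights to each fold."""
    best_params, best_score = None, float("-inf")
    for params in parameter_grid(grid):
        scores = []
        for fold in range(n_folds):
            test = [i for i in range(len(X)) if i % n_folds == fold]
            train = [i for i in range(len(X)) if i % n_folds != fold]
            est = WeightedRidge1D(**params)
            est.fit(take(X, train), take(y, train), take(sample_weight, train))
            scores.append(
                est.score(take(X, test), take(y, test), take(sample_weight, test))
            )
        mean_score = sum(scores) / n_folds
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


# Made-up data: y is exactly 2 * x, later samples weighted more heavily.
X = [float(i) for i in range(1, 13)]
y = [2.0 * x for x in X]
weights = [1.0] * 6 + [3.0] * 6
best_params, best_score = weighted_grid_search(
    {"alpha": [0.0, 10.0, 100.0]}, X, y, weights
)
print(best_params, best_score)
```

With a real estimator the inner body would stay the same shape; only the estimator class and the way folds are drawn would change.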

[Scikit-learn-general] GridSearch with sample_weights

2013-05-31 Thread Peter Retzlaff
Hello everyone, I'm trying to execute a grid search with the GridSearchCV class (for an AdaBoostClassifier) and want to use a custom sample_weight vector to begin with. However, I can't figure out how to do this. Passing the parameter to GridSearchCV's fit()-method gives me the message that t

Re: [Scikit-learn-general] GridSearch for Multilabel OneVsRestClassifier?

2013-01-09 Thread Andrew Winterman
I used: param_grid = dict(estimator__clf__gamma=10.0 ** np.arange(-5, 4), estimator__clf__c=10.0 ** np.arange(-2, 9), estimator__clf__degree=(1, 2, 3, 4), estimator__reduce_dim__n_components=(10, 100, 200)) I just up

Re: [Scikit-learn-general] GridSearch for Multilabel OneVsRestClassifier?

2013-01-09 Thread Mathieu Blondel
What did you use for param_grid? Your example doesn't contain it. Mathieu

Re: [Scikit-learn-general] GridSearch for Multilabel OneVsRestClassifier?

2013-01-09 Thread Andreas Mueller
Yes, you can open an issue. Which version of sklearn are you using? There was recently a fix for grid-search with lists.

Re: [Scikit-learn-general] GridSearch for Multilabel OneVsRestClassifier?

2013-01-08 Thread Andrew Winterman
Converting to a numpy array gave me a different and strange error message: /Users/andrewwinterman/Documents/sparks-honey/classifier/lib/python2.7/site-packages/sklearn/grid_search.pyc in fit_grid_point(X, y, base_clf, clf_params, train, test, loss_func, score_func, verbose, **fit_params) 109

Re: [Scikit-learn-general] GridSearch for Multilabel OneVsRestClassifier?

2013-01-08 Thread Andrew Winterman
X is a sparse matrix: X <926x1238 sparse matrix of type '' with 43973 stored elements in Compressed Sparse Row format> Y is a regular python list of 926 lists of strings: Y[0:10] [['29'], ['3', '24'], ['48'], ['29'], ['37'], ['3'], ['14'], ['21'], ['16', '48', '50'], ['48']]

Re: [Scikit-learn-general] GridSearch for Multilabel OneVsRestClassifier?

2013-01-08 Thread Andreas Mueller
On 01/09/2013 12:38 AM, Andrew Winterman wrote: I've also posted this question to Stack Overflow. I'm trying to use GridSearch for a multilabel problem with OneVsRestClassifier as follows. |#import

[Scikit-learn-general] GridSearch for Multilabel OneVsRestClassifier?

2013-01-08 Thread Andrew Winterman
I've also posted this question to Stack Overflow. I'm trying to use GridSearch for a multilabel problem with OneVsRestClassifier as follows. #imports from sklearn.svm import SVC from sklearn.pipeline import P

Re: [Scikit-learn-general] GridSearch example

2012-11-18 Thread Gael Varoquaux
On Fri, Nov 16, 2012 at 12:41:14PM -0800, Fred Mailhot wrote: > This doesn't appear to be documented (at least not at > http://scikit-learn.org/dev/modules/generated/sklearn.grid_search.GridSearchCV.html)... Thanks for reporting, I have addressed this: https://github.com/scikit-learn/scikit-lear

Re: [Scikit-learn-general] GridSearch example

2012-11-17 Thread Andreas Mueller
On 11/16/2012 08:41 PM, Fred Mailhot wrote: On 15 November 2012 23:20, Andreas Mueller wrote: [...] You can give GridSearchCV not only a grid but also a list of grids. I would go with that. (is that sufficiently documented?) This doesn't app
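
As a rough picture of what "a list of grids" buys you, the plain-Python sketch below expands both forms into the combinations that would be tried. `expand` is a hypothetical helper, not scikit-learn code, and the `penalty`/`loss` values are illustrative assumptions rather than anything taken from the gist in this thread:

```python
from itertools import product


def expand(param_grid):
    """Expand a single grid (dict) or a list of grids into parameter dicts."""
    grids = [param_grid] if isinstance(param_grid, dict) else param_grid
    combos = []
    for grid in grids:
        keys = sorted(grid)
        for values in product(*(grid[key] for key in keys)):
            combos.append(dict(zip(keys, values)))
    return combos


# One grid: the full cross product, including a combination we want to avoid.
single = {"penalty": ["l1", "l2"], "loss": ["hinge", "squared_hinge"]}

# A list of grids: each dict is expanded on its own, so incompatible
# pairs are simply never generated.
split = [
    {"penalty": ["l2"], "loss": ["hinge", "squared_hinge"]},
    {"penalty": ["l1"], "loss": ["squared_hinge"]},
]

print(len(expand(single)), "combinations from the single grid")
print(len(expand(split)), "combinations from the list of grids")
```

Splitting the grid this way is also a cheap way to keep the search small when some options only make sense together.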

Re: [Scikit-learn-general] GridSearch example

2012-11-16 Thread Fred Mailhot
On 15 November 2012 23:20, Andreas Mueller wrote: > [...] > You can give GridSearchCV not only a grid but also a list of grids. > I would go with that. > (is that sufficiently documented?) > This doesn't appear to be documented (at least not at http://scikit-learn.org/dev/modules/generated/sklearn

Re: [Scikit-learn-general] GridSearch example

2012-11-16 Thread Ronnie Ghose
Ahh.. sorry >_<. I thought I made a new thread... sigh. On 16 November 2012 15:33, Fred Mailhot wrote: > Check out SGDClassifier and partial_fit()...I've used these to good effect. > > Also, PROTIP: if you want decent help, don't piggy-back on threads that > have nothing to do with your questio

Re: [Scikit-learn-general] GridSearch example

2012-11-16 Thread Fred Mailhot
Check out SGDClassifier and partial_fit()...I've used these to good effect. Also, PROTIP: if you want decent help, don't piggy-back on threads that have nothing to do with your question. Just sayin'. On 16 November 2012 12:23, Ronnie Ghose wrote: > Any ideas for online learning with Scikit? I

Re: [Scikit-learn-general] GridSearch example

2012-11-16 Thread Ronnie Ghose
Any ideas for online learning with Scikit? I have a data set that is > 20gb that I want to train on. I don't think I can do that easily, so what should I do? Thanks, Shomiron Ghose On 15 November 2012 15:45, Fred Mailhot wrote: > Dear list, > > I'm using GridSearchCV to do some simple model

Re: [Scikit-learn-general] GridSearch example

2012-11-16 Thread Olivier Grisel
This is a really weird low level error. Maybe a python bug. I don't have time to investigate, but if someone else can reproduce it, it would be interesting to try and make a minimalistic reproduction script that just uses the python multiprocessing API.

Re: [Scikit-learn-general] GridSearch example

2012-11-16 Thread Fred Mailhot
Thanks to all for the tips on GridSearch with FeatureUnion, I'll be trying those out today. And @amueller I've been following the development of your PR for the random sampling of param space with great interest. But back to the initial problem...it seems that an empty input is the cause. My raw d

Re: [Scikit-learn-general] GridSearch example

2012-11-15 Thread Andreas Mueller
Sorry for not being able to help you with the actual problem, but another hint: I have a pull request for randomly sampling the parameter space, which should be much more efficient in a model with so many parameters. https://github.com/scikit-learn/scikit-learn/pull/1194

Re: [Scikit-learn-general] GridSearch example

2012-11-15 Thread Andreas Mueller
>> 2) how would I go about grid search over different vectorizers (e.g. >> CountVectorizer(analyzer="word"), CountVectorizer(analyzer="char_wb"), and a >> FeatureUnion of the two)? > You could always use a FeatureUnion and give it different TransformerLists via the GridSearchCV (at least I think t

Re: [Scikit-learn-general] GridSearch example

2012-11-15 Thread Mathieu Blondel
On Fri, Nov 16, 2012 at 3:28 PM, Gael Varoquaux < gael.varoqu...@normalesup.org> wrote: > On Thu, Nov 15, 2012 at 05:07:24PM -0800, Fred Mailhot wrote: > > 1) there are a few LinearSVC options (penalty/loss, penalty/dual) for > which > > certain values are incompatible, but which are not documente

Re: [Scikit-learn-general] GridSearch example

2012-11-15 Thread Gael Varoquaux
On Thu, Nov 15, 2012 at 05:07:24PM -0800, Fred Mailhot wrote: > 1) there are a few LinearSVC options (penalty/loss, penalty/dual) for which > certain values are incompatible, but which are not documented as such...this > makes grid search a bit of a pain. Indeed, they should be documented. Pull re

Re: [Scikit-learn-general] GridSearch example

2012-11-15 Thread Fred Mailhot
I already know that things work with n_jobs=1. I just tried n_jobs=-1 with a few smaller datasets (100 & 1000 items) and things seem to have worked fine (without LinearSVC, see below). Possibly there's something wrong with the larger dataset...investigating now. A couple of points related to grid

Re: [Scikit-learn-general] GridSearch example

2012-11-15 Thread Andreas Mueller
Are you sure the error is related to n_jobs, not a specific classifier? Could you run with n_jobs=1 and a very small training set (like 100 examples or something) and see if it runs through? (Actually I'm totally clueless but that doesn't look like a multiprocessing error to me) On 11/15/201

Re: [Scikit-learn-general] GridSearch example

2012-11-15 Thread Fred Mailhot
Argh, copy-paste error: https://gist.github.com/e2ca1910450819a8a287 As for Accelerate, I'm not 100% sure how to check that (I cloned & ran "setup.py build" and "setup.py install" without making any changes, if memory serves), but this leads me to think "yes": $ otool -L /Users/aboutuser/Development/

Re: [Scikit-learn-general] GridSearch example

2012-11-15 Thread Andreas Mueller
Hi Fred. The link is dead for me. Do you link against Accelerate (not sure if this is relevant)? Cheers, Andy On 11/15/2012 08:45 PM, Fred Mailhot wrote: Dear list, I'm using GridSearchCV to do some simple model selection for a text classification task. I've got it working (see below for cave

[Scikit-learn-general] GridSearch example

2012-11-15 Thread Fred Mailhot
Dear list, I'm using GridSearchCV to do some simple model selection for a text classification task. I've got it working (see below for caveat), but I'm not convinced that I'm making the best use of this tool. If someone has the time/inclination, I'd love a set of eyes to check the following gist t

Re: [Scikit-learn-general] GridSearch over min_n and max_n in CountVectorizer

2012-08-12 Thread Robert Layton
On 13 August 2012 01:56, Andreas Mueller wrote: > On 08/12/2012 01:56 PM, Alexandre Gramfort wrote: > >> Hey Everybody. > >> I was just trying to use CountVectorizer but I have trouble using > >> Gridsearch using both max_n and min_n. > >> I guess the problem is that the parameters are conditione

Re: [Scikit-learn-general] GridSearch over min_n and max_n in CountVectorizer

2012-08-12 Thread Andreas Mueller
On 08/12/2012 01:56 PM, Alexandre Gramfort wrote: >> Hey Everybody. >> I was just trying to use CountVectorizer but I have trouble using >> Gridsearch using both max_n and min_n. >> I guess the problem is that the parameters are conditioned on each other. >> Is there a nice way to do this? >> I gue

Re: [Scikit-learn-general] GridSearch over min_n and max_n in CountVectorizer

2012-08-12 Thread Alexandre Gramfort
> Hey Everybody. > I was just trying to use CountVectorizer but I have trouble using > Gridsearch using both max_n and min_n. > I guess the problem is that the parameters are conditioned on each other. > Is there a nice way to do this? > I guess I could generate lists of param_grids, i.e. one for e

[Scikit-learn-general] GridSearch over min_n and max_n in CountVectorizer

2012-08-11 Thread Andreas Mueller
Hey Everybody. I was just trying to use CountVectorizer but I have trouble using Gridsearch using both max_n and min_n. I guess the problem is that the parameters are conditioned on each other. Is there a nice way to do this? I guess I could generate lists of param_grids, i.e. one for each value
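
The "one grid per value" idea from this message can be sketched in a few lines of plain Python. `ngram_grids` is a hypothetical helper and the `vect__` prefix assumes a pipeline step named `vect`; neither comes from the thread. Each dict fixes `min_n` and only lists the `max_n` values that are at least as large, so invalid pairs never appear:

```python
def ngram_grids(max_size):
    """Build one grid per min_n, listing only max_n values >= min_n."""
    return [
        {
            "vect__min_n": [min_n],
            "vect__max_n": list(range(min_n, max_size + 1)),
        }
        for min_n in range(1, max_size + 1)
    ]


grids = ngram_grids(3)

# Flatten the grids to see which (min_n, max_n) pairs would be searched.
pairs = [
    (grid["vect__min_n"][0], max_n)
    for grid in grids
    for max_n in grid["vect__max_n"]
]
print(pairs)
```

The resulting list would be handed to the grid search as its parameter grid. Later scikit-learn releases fold the two settings into a single `ngram_range` tuple, in which case the valid tuples can simply be listed directly.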

Re: [Scikit-learn-general] GridSearch

2012-02-06 Thread Andreas
On 02/03/2012 01:59 PM, Mathias Verbeke wrote: Hi Andreas, You would have to add it to the "fit" method of SVC, not GridSearchCV. How can this be done in the digits example, since there's only one fit there, namely the one of GridSearch? > Does this mean class weighting isn't possib

Re: [Scikit-learn-general] GridSearch

2012-02-03 Thread Gilles Louppe
Hi, You can inject your fit params using the `fit_params` parameter in GridSearchCV. Gilles On 3 February 2012 13:59, Mathias Verbeke wrote: > Hi Andreas, > >> You would have to add it to the "fit" method of SVC, not GridSearchCV. > > > How can this be done in the digits example, since there's

Re: [Scikit-learn-general] GridSearch

2012-02-03 Thread Mathias Verbeke
Hi Andreas, You would have to add it to the "fit" method of SVC, not GridSearchCV. > How can this be done in the digits example, since there's only one fit there, namely the one of GridSearch? > > Does this mean class weighting isn't possible at all with GridSearch? > At the moment, yes. > > If

Re: [Scikit-learn-general] GridSearch

2012-02-03 Thread Andreas
On 02/03/2012 01:47 PM, Mathias Verbeke wrote: > Hi Andreas, > > Thanks for the answer. Hm, that's a pity. When I add it as a parameter > to fit, I get > > AssertionError: Invalid parameter class_weight for estimator GridSearchCV > You would have to add it to the "fit" method of SVC, not GridSearc

Re: [Scikit-learn-general] GridSearch

2012-02-03 Thread Mathias Verbeke
Hi Andreas, Thanks for the answer. Hm, that's a pity. When I add it as a parameter to fit, I get AssertionError: Invalid parameter class_weight for estimator GridSearchCV Does this mean class weighting isn't possible at all with GridSearch? Thanks, Mathias On Fri, Feb 3, 2012 at 1:30 PM, And

Re: [Scikit-learn-general] GridSearch

2012-02-03 Thread Andreas
Hi Mathias. As far as I know the use of class weights in grid search is not possible in SVC at the moment. It can be used as a parameter to fit, but this prevents one from using it for grid searches. This is a known issue and the class_weight should be moved to the initialization of SVC. I am (som

Re: [Scikit-learn-general] GridSearch

2012-02-03 Thread Mathias Verbeke
Hi Olivier, That's something I tried already, but then I get: AssertionError: Invalid parameter class_weight for estimator SVC Any idea what can be wrong? Thanks, Mathias On Fri, Feb 3, 2012 at 12:19 PM, Olivier Grisel wrote: > 2012/2/3 Mathias Verbeke : > > Hi Andreas, > > > > Thanks a lot;

Re: [Scikit-learn-general] GridSearch

2012-02-03 Thread Olivier Grisel
2012/2/3 Mathias Verbeke : > Hi Andreas, > > Thanks a lot; that answers my questions. Just a quick check to be sure I > understand it correctly: the results in the classification report for the > best classifier are the ones on the test set, right? It prints the performance measured on the test set

Re: [Scikit-learn-general] GridSearch

2012-02-03 Thread Mathias Verbeke
Hi Andreas, Thanks a lot; that answers my questions. Just a quick check to be sure I understand it correctly: the results in the classification report for the best classifier are the ones on the test set, right? And another small question: could you tell me how/where I need to set the class_weight

Re: [Scikit-learn-general] GridSearch

2012-02-03 Thread Andreas
Hi Mathias. First, please note that you are looking at an "old" version of the docs. We are in the process of including a warning. Please refer to http://scikit-learn.org/stable/auto_examples/grid_search_digits.html instead. For

[Scikit-learn-general] GridSearch

2012-02-03 Thread Mathias Verbeke
Hi all, I'm currently looking at the GridSearch example (http://scikit-learn.org/0.9/auto_examples/grid_search_digits.html), and I don't completely get the point of using cross-validation twice. Why aren't the parameters and the classifier selected in one cross-validation step? Furthermore, I wa