Re: [Scikit-learn-general] Defining a Density Estimation Interface

2013-07-10 Thread Gael Varoquaux
On Mon, Jul 08, 2013 at 07:56:28PM +0200, Olivier Grisel wrote: > > Then you'd have pdf, logpdf, cdf, logcdf, sf, rvs (not wild about this > > one, and I think we use sample in places), etc. > I am not fond of acronyms, especially when they're not common at all > such as `rvs`. I think RVS stands f
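For readers unfamiliar with the method names quoted above: they follow the scipy.stats naming, where `sf` is the survival function (1 - cdf) and `rvs` draws random variates. The following is only a minimal sketch of what an object exposing those methods could look like; the class name `GaussianDensity`, its constructor arguments, and the choice of `sample` instead of `rvs` are assumptions made for illustration and are not the interface discussed or adopted in the thread.

```python
import math
import random


class GaussianDensity:
    """Hypothetical 1-D Gaussian exposing the method names from the thread."""

    def __init__(self, mean=0.0, std=1.0):
        self.mean = mean
        self.std = std

    def logpdf(self, x):
        # Log of the normal density, computed directly for numerical stability.
        z = (x - self.mean) / self.std
        return -0.5 * z * z - math.log(self.std * math.sqrt(2.0 * math.pi))

    def pdf(self, x):
        return math.exp(self.logpdf(x))

    def cdf(self, x):
        # Normal CDF expressed through the error function.
        z = (x - self.mean) / (self.std * math.sqrt(2.0))
        return 0.5 * (1.0 + math.erf(z))

    def logcdf(self, x):
        return math.log(self.cdf(x))

    def sf(self, x):
        # Survival function: probability mass above x.
        return 1.0 - self.cdf(x)

    def sample(self, n_samples, seed=None):
        # Named `sample` rather than `rvs` (an illustrative choice only).
        rng = random.Random(seed)
        return [rng.gauss(self.mean, self.std) for _ in range(n_samples)]
```

Deriving `pdf` from `logpdf` keeps a single formula as the source of truth, which is one reason an interface might treat the log variants as primary.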

Re: [Scikit-learn-general] Improving Text Classification

2013-07-10 Thread Olivier Grisel
2013/7/10 Mike Hansen : > I have been using Scikit's text classification for several weeks, and I > really like it. I use my own corpus (self-generated) and prepare each > document using the NLTK. Presently I am relying on this tutorial/code-base, > only making changes when absolutely necessary f

Re: [Scikit-learn-general] Improving Text Classification

2013-07-10 Thread Mike Hansen
I have been using Scikit's text classification for several weeks, and I really like it.  I use my own corpus (self-generated) and prepare each document using the NLTK.  Presently I am relying on this tutorial/code-base, only making changes when absolutely necessary for my documents to work. The

Re: [Scikit-learn-general] # of jobs run by GridSearchCV?

2013-07-10 Thread Olivier Grisel
2013/7/10 Josh Wasserstein : > Thanks Olivier. Would you mind elaborating on why having a grid that is this > fine-grained is a problem? Is it because the library runs into overflow > problems? Something else? No, it's just that it will run forever and you are probably just wasting CPU time.

Re: [Scikit-learn-general] Py3 port: FAILED (SKIP=2, errors=1, failures=5)

2013-07-10 Thread Justin Vincent
Ended up going with the approach of one of the SO folks. https://github.com/scikit-learn/scikit-learn/pull/2150 On Wed, Jul 10, 2013 at 1:56 PM, Olivier Grisel wrote: > Maybe you can try to put: > > from __future__ import unicode_literals > > in some appropriate place? > > Also: > http://stacko

Re: [Scikit-learn-general] # of jobs run by GridSearchCV?

2013-07-10 Thread Olivier Grisel
If you do 4 CV folds on each parameter combination of your fine grid, then: 191880 * len(skf) == 191880 * 4 == 767520 -- Olivier 2013/7/10 Olivier Grisel : > 2013/7/10 Josh Wasserstein : >> Thanks Olivier. Would you mind elaborating on why having a grid that is this >> fine-grained is a problem?
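The arithmetic above can be sketched in a few lines of plain Python: the number of model fits is the number of parameter combinations times the number of CV folds. The helper name `count_fits` and the small example grid are hypothetical; only the figures 191880 and 4 come from the thread.

```python
def count_fits(param_grid, n_folds):
    """Return the number of fits a full grid search would run.

    param_grid maps each parameter name to the list of values to try
    (the same shape of dict that is passed to a grid search).
    """
    n_candidates = 1
    for values in param_grid.values():
        n_candidates *= len(values)
    return n_candidates * n_folds


# Hypothetical small grid: 3 values of C times 2 values of gamma, 4 folds.
small_grid = {"C": [1, 10, 100], "gamma": [0.1, 0.01]}
print(count_fits(small_grid, n_folds=4))

# Figures from the thread: 191880 combinations evaluated on 4 folds.
print(191880 * 4)
```

Running this kind of count before launching a search gives a quick sense of whether the grid is too fine to finish in reasonable time.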

Re: [Scikit-learn-general] # of jobs run by GridSearchCV?

2013-07-10 Thread Gael Varoquaux
On Wed, Jul 10, 2013 at 03:15:46PM -0400, Josh Wasserstein wrote: > Thanks Olivier. Would you mind elaborating on why having a grid that is this > fine-grained is a problem? Is it because the library runs into overflow > problems? Something else? Because it's going to take ages. G --

Re: [Scikit-learn-general] # of jobs run by GridSearchCV?

2013-07-10 Thread Josh Wasserstein
Thanks Olivier. Would you mind elaborating on why having a grid that is this fine-grained is a problem? Is it because the library runs into overflow problems? Something else? Thanks, Josh On Wed, Jul 10, 2013 at 1:29 PM, Olivier Grisel wrote: > 2013/7/10 Josh Wasserstein : > > Thanks Robert a

Re: [Scikit-learn-general] Py3 port: FAILED (SKIP=2, errors=1, failures=5)

2013-07-10 Thread Olivier Grisel
Maybe you can try to put: from __future__ import unicode_literals in some appropriate place? Also: http://stackoverflow.com/questions/13473971/multi-version-support-for-python-doctests
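As a minimal sketch of what the suggested import does (not the fix that was eventually merged): on Python 2 it makes plain string literals unicode, and on Python 3 it is accepted but has no effect because `str` is already unicode. The variable name `label` is only illustrative. Note that the import has to be the first statement in the module.

```python
from __future__ import unicode_literals

# With the import above, this literal is unicode text on both
# Python 2 and Python 3, without writing an explicit u'' prefix.
label = "blah"

# print() shows the text itself, so the output carries no u'' prefix
# on either version, unlike repr() on Python 2.
print(label)
```

Printing values instead of showing their repr is one common way to keep doctest output identical across both versions.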

Re: [Scikit-learn-general] Py3 port: FAILED (SKIP=2, errors=1, failures=5)

2013-07-10 Thread Olivier Grisel
2013/7/10 Justin Vincent : > We are a little further than that unfortunately. Some failures in conf.py > were preventing tests other than doctests from running. With conf.py and the > doctests fixed, we are at 4 errors and 10 failures, which is still pretty > good. > > What do we think should be do

Re: [Scikit-learn-general] # of jobs run by GridSearchCV?

2013-07-10 Thread Olivier Grisel
2013/7/10 Josh Wasserstein : > Thanks Robert and everyone else. I am still having a strange problem with > GridSearchCV with 0.14-git. Even though the number returned by > > === > len(list(ParameterGrid(tuned_parameters))) > === > > is

Re: [Scikit-learn-general] # of jobs run by GridSearchCV?

2013-07-10 Thread Josh Wasserstein
Thanks Robert and everyone else. I am still having a strange problem with GridSearchCV with 0.14-git. Even though the number returned by === len(list(ParameterGrid(tuned_parameters))) === is 191,880, my call to: ===

[Scikit-learn-general] IPython Drinkup in Paris !

2013-07-10 Thread Nelle Varoquaux
Hi pythonistas, As there are many scientific developers using IPython, I think some of you may be interested in this IPython Drinkup in Paris. Of course, we will have our own during the sprint, but if anyone wants to meet some of the IPython devs, this is the place to be! If you are interested in

[Scikit-learn-general] Help to fund the sprint

2013-07-10 Thread Jaques Grobler
Hi List, I've added a PayPal button to the main page of the scikit-learn website, as well as to the 'About us' section. As you may know, our second International Scikit-learn Code-sprint is around the corner, and we would be very appreciative of any form of donation. Donations can be made through

Re: [Scikit-learn-general] Py3 port: FAILED (SKIP=2, errors=1, failures=5)

2013-07-10 Thread Justin Vincent
We are a little further than that unfortunately. Some failures in conf.py were preventing tests other than doctests from running. With conf.py and the doctests fixed, we are at 4 errors and 10 failures, which is still pretty good. What do we think should be done with u'blah' type strings in the do

Re: [Scikit-learn-general] Extremely poor SVM performance

2013-07-10 Thread Andreas Mueller
On 07/09/2013 12:44 AM, Josh Wasserstein wrote: > Peter - Yes. That also puzzles me. So odd. > > Thanks Olivier - I am using auc_score, not roc_curve. My scikit-learn > installation does not complain about it. I will try to get the master > git installed. > Well, it doesn't complain, but it doesn

[Scikit-learn-general] Paris Sprint location

2013-07-10 Thread Alexandre Gramfort
Hi everyone, our next sprint will take place at Telecom ParisTech (http://www.telecom-paristech.fr) located at: 46 Rue Barrault, 75013 Paris, France Google Maps link: http://goo.gl/maps/k6QPL We'll be in room B312 every day starting at 9am. See wiki page: https://github.com/scikit-learn/sciki