[Scikit-learn-general] adding contextual learning references to skl documentation

2013-11-08 Thread colorado reed
Hello, we're contributors to the open source project Metacademy (http://www.metacademy.org, http://github.com/metacademy), which is creating a curated web of machine learning concepts. The idea is that Metacademy can tell users how to efficiently learn a concept and all of its necessary prerequisi

Re: [Scikit-learn-general] Automated benchmarking

2013-11-08 Thread Karol Pysniak
Yes, it would be perfect. Do you have any work planned or scheduled on this problem? Let me know if there is something I could help with. Thanks, Karol 2013/11/8 Vlad Niculae > It should be written in such a way that you can add more benchmarks > with a PR to that repo (the master branch) an

Re: [Scikit-learn-general] Automated benchmarking

2013-11-08 Thread Vlad Niculae
It should be written in such a way that you can add more benchmarks with a PR to that repo (the master branch) and it should "just work". Many parts of the framework are still hackish, though. Yours, Vlad On Fri, Nov 8, 2013 at 7:53 PM, Karol Pysniak wrote: > Awesome, thanks Vlad, that's exact

Re: [Scikit-learn-general] Automated benchmarking

2013-11-08 Thread Karol Pysniak
Awesome, thanks Vlad, that's exactly what I've been looking for! Thanks, Karol 2013/11/8 Vlad Niculae > We have an instance of vbench continuously running [1] that I did as a > GSoC project last year. > > For some reason it seems that the links don't generate properly now, > but it still works

Re: [Scikit-learn-general] Automated benchmarking

2013-11-08 Thread Karol Pysniak
Looks good, but I was more interested in whether we want to have a single script or set of scripts that would produce a single number that could be used to compare changes. What do you think? Thanks, Karol 2013/11/8 Skipper Seabold > On Fri, Nov 8, 2013 at 6:30 PM, Karol Pysniak wrote: > > Hi All, > > >
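
For illustration, a minimal sketch of the kind of single-number script being discussed (a hypothetical standalone file, not part of scikit-learn; the RandomForestClassifier workload and the sizes are placeholder choices, and the imports follow the pre-0.18 module layout):

    # bench_single_number.py -- hypothetical single-number benchmark sketch
    import time

    import numpy as np
    from sklearn.cross_validation import train_test_split  # pre-0.18 layout
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    def run_once(seed):
        """Time one fit on a fixed synthetic workload and score it."""
        X, y = make_classification(n_samples=5000, n_features=50, random_state=seed)
        X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=seed)
        clf = RandomForestClassifier(n_estimators=100, random_state=seed)
        start = time.time()
        clf.fit(X_train, y_train)
        return time.time() - start, clf.score(X_test, y_test)

    if __name__ == "__main__":
        times, scores = zip(*[run_once(seed) for seed in range(3)])
        # One number per commit (median fit time); accuracy printed alongside.
        print("fit time: %.3fs  accuracy: %.3f" % (np.median(times), np.median(scores)))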

Re: [Scikit-learn-general] Automated benchmarking

2013-11-08 Thread Vlad Niculae
We have an instance of vbench continuously running [1] that I did as a GSoC project last year. For some reason it seems that the links don't generate properly now, but it still works (though all data got lost in a Jenkins setup incident this summer). Here are some linear model benchmarks for exam

Re: [Scikit-learn-general] Automated benchmarking

2013-11-08 Thread Skipper Seabold
On Fri, Nov 8, 2013 at 6:30 PM, Karol Pysniak wrote: > Hi All, > > Has there been any discussion on adding some automated benchmarks for both > speed and accuracy of the algorithms we have? I think it would be very > interesting if such a script could be automatically executed after every > commit so

[Scikit-learn-general] Automated benchmarking

2013-11-08 Thread Karol Pysniak
Hi All, Has there been any discussion on adding some automated benchmarks for both speed and accuracy of the algorithms we have? I think it would be very interesting if such a script could be automatically executed after every commit so that we could follow the performance of scikit-learn or, at leas

[Scikit-learn-general] PR #2391, Implemented Determinant ECOC

2013-11-08 Thread Karol Pysniak
Hi All, I added a new ECOC implementation some time ago. Would it be possible to get some review and feedback? Also, would you recommend any datasets that could be used for verification? I am especially concerned about what type and size of data sets I should use. I would appreciate any help and suggesti
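
As a starting point for such a verification run, here is a rough sketch using the existing OutputCodeClassifier with random codes on the digits dataset (the dataset, base estimator and code_size are only example choices, and this does not use the determinant-based construction from the PR):

    from sklearn.cross_validation import cross_val_score  # pre-0.18 layout
    from sklearn.datasets import load_digits
    from sklearn.multiclass import OutputCodeClassifier
    from sklearn.svm import LinearSVC

    digits = load_digits()
    # Random output codes as a baseline; the code construction from the PR
    # could be compared against this on the same folds.
    ecoc = OutputCodeClassifier(LinearSVC(random_state=0),
                                code_size=1.5, random_state=0)
    scores = cross_val_score(ecoc, digits.data, digits.target, cv=5)
    print("mean accuracy: %.3f (+/- %.3f)" % (scores.mean(), scores.std()))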

Re: [Scikit-learn-general] Random forest with zero features

2013-11-08 Thread Michal Romaniuk
Has anyone worked on this problem (exceptions raised by classifiers in grid search) since then? I would be happy to do some work to fix it, but would need some advice. It seems to me that the easiest way around the issue is to wrap the call to clf.fit() in a try statement and catch the exceptio
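
A rough sketch of that try/except idea, written as a hypothetical wrapper estimator (the class name and the sentinel-score behaviour are assumptions, not existing scikit-learn API):

    import numpy as np
    from sklearn.base import BaseEstimator, clone

    class FaultTolerantClassifier(BaseEstimator):
        """Swallow exceptions in fit so a grid search can keep going."""

        def __init__(self, estimator, failure_score=-np.inf):
            self.estimator = estimator
            self.failure_score = failure_score

        def fit(self, X, y):
            self.estimator_ = clone(self.estimator)
            self.failed_ = False
            try:
                self.estimator_.fit(X, y)
            except Exception as exc:      # e.g. "zero features" errors
                self.failed_ = True
                self.error_ = exc
            return self

        def score(self, X, y):
            if self.failed_:
                return self.failure_score  # failed settings rank last
            return self.estimator_.score(X, y)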

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-08 Thread Gael Varoquaux
On Fri, Nov 08, 2013 at 11:56:24AM +0100, Olivier Grisel wrote: > In retrospect I would have preferred it named something explicit like > "regularization" or "l2_reg" instead of "alpha". Agreed. > Still, I like the (alpha, l1_ratio) parameterization better than the > (l2_reg, l1_reg) parameter set
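
One practical argument for the (alpha, l1_ratio) pair is that l1_ratio is bounded in [0, 1], which makes its grid easy to choose; a sketch (the grid values are placeholders, and the import path follows the pre-0.18 layout):

    import numpy as np
    from sklearn.linear_model import ElasticNet
    from sklearn.grid_search import GridSearchCV  # pre-0.18 layout

    param_grid = {
        "alpha": np.logspace(-4, 1, 6),          # overall regularization strength
        "l1_ratio": [0.1, 0.5, 0.7, 0.9, 1.0],   # bounded mixing parameter
    }
    search = GridSearchCV(ElasticNet(max_iter=10000), param_grid, cv=5)
    # search.fit(X, y)  # X, y: any regression dataset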

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-08 Thread Alexandre Gramfort
Just a remark: in LogisticRegression you can use L1 and L2 reg and there is a single param, which is alpha. It's not trivial to have consistent naming for the regularization param. In SVC it is C, as that's the common naming... but it corresponds to 1/l2_reg in what you suggest... Alex
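
To make the inverse convention concrete: C multiplies the data-fit term, so larger C means weaker regularization, roughly C ~ 1/l2_reg (the exact scaling depends on the loss and its normalization, so treat the mapping as approximate):

    from sklearn.linear_model import LogisticRegression

    # Larger C -> weaker L2 penalty (roughly a small l2_reg);
    # smaller C -> stronger L2 penalty (roughly a large l2_reg).
    weak_reg = LogisticRegression(penalty="l2", C=100.0)
    strong_reg = LogisticRegression(penalty="l2", C=0.01)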

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-08 Thread Olivier Grisel
We cannot use lambda as a parameter name because it is a reserved keyword of the Python language (for defining anonymous functions). This is why "alpha" was used instead of "lambda" for the ElasticNet / Lasso models initially, and then this notation was reused in more recently implemented estimators such a
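
For the record, the constraint looks like this (a toy class, just to show why `lambda` cannot appear in a signature while `alpha` or a trailing-underscore spelling can):

    # `lambda` is a Python keyword, so this is a SyntaxError:
    #     def __init__(self, lambda=1.0): ...
    # hence `alpha` (a legal alternative would be `lambda_`, just uglier):
    class ToyEstimator(object):
        def __init__(self, alpha=1.0):
            self.alpha = alpha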

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-08 Thread Peter Prettenhofer
SGDClassifier adopted the parameter names of ElasticNet (which has been around in sklearn for longer) for consistency reasons. I agree that we should strive for concise and intuitive parameter names such as ``l1_ratio``. Naming in sklearn is actually quite unfortunate since the popular R package

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-08 Thread Thomas Unterthiner
Just my $0.02 as a user: I was also confused/put off by `alpha` and `l1_ratio` when I first explored SGDClassifier; I found those names to be pretty inconsistent --- plus I tend to call my regularization parameters `lambda` and use `alpha` for learning rates. I'm sure other people associate y

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-08 Thread Mathieu Blondel
And lambda is a reserved keyword in Python ;-) On Fri, Nov 8, 2013 at 4:59 PM, Olivier Grisel wrote: > 2013/11/7 Mathieu Blondel : > > > > On Fri, Nov 8, 2013 at 12:28 AM, Vlad Niculae > wrote: > >> > >> I feel like this would go against "explicit is better than implicit", > >> but without it g

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-08 Thread Vlad Niculae
Re: the discussion we had at PyCon.fr, I noticed that the internal elastic net coordinate descent functions are parametrized with `l1_reg` and `l2_reg`, but the exposed classes and functions have `alpha` and `l1_ratio`. Only yesterday there was somebody on IRC who couldn't match Ridge with Elastic
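
For reference, the mapping between the two parameterizations, assuming the penalty is written as in the ElasticNet docstring, i.e. alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2 (the internal coordinate-descent code may additionally scale these by n_samples):

    def to_reg_pair(alpha, l1_ratio):
        """(alpha, l1_ratio) -> (l1_reg, l2_reg)."""
        return alpha * l1_ratio, alpha * (1.0 - l1_ratio)

    def to_mixing_pair(l1_reg, l2_reg):
        """(l1_reg, l2_reg) -> (alpha, l1_ratio)."""
        alpha = l1_reg + l2_reg
        return alpha, (l1_reg / alpha if alpha else 0.0)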

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-08 Thread Olivier Grisel
About the LBFGS-B residuals (non-)issue: I was probably confused by the overlapping curves on the plot and misinterpreted the location of the PG-l1 and PG-l2 curves. -- Olivier

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-08 Thread Olivier Grisel
2013/11/7 Mathieu Blondel : > > On Fri, Nov 8, 2013 at 12:28 AM, Vlad Niculae wrote: >> >> I feel like this would go against "explicit is better than implicit", >> but without it grid search would indeed be awkward. Maybe: >> >> if self.alpha_coef == 'same': >> alpha_coef = self.alpha_comp >>