Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-21 Thread Gael Varoquaux
On Wed, Mar 21, 2012 at 08:09:26PM -0400, David Warde-Farley wrote: > I think it's less about disagreeing with libsvm than disagreeing with the > notation of every textbook presentation I know of. I agree that libsvm is no > golden calf. But it is also the case for the lasso: the loss term is th

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-21 Thread Mathieu Blondel
On Thu, Mar 22, 2012 at 9:09 AM, David Warde-Farley wrote: > In particular, doing 1 vs rest for logistic regression seems like > an odd choice when there is a perfectly good multiclass generalization of > logistic regression. Mathieu clarified to me last night how liblinear is > calculating "

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-21 Thread Mathieu Blondel
On Thu, Mar 22, 2012 at 3:35 AM, James Bergstra wrote: > Also, isn't the feature normalization supposed to be done on a > fold-by-fold basis? If you're doing that, you have a different kernel > matrix in every fold anyway. Indeed, if you really want to be clean, you would need to do that bu
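The fold-by-fold normalization discussed here is what a Pipeline gives you for free: with the scaler inside the pipeline, it is refit on each training fold during cross-validation, so no statistics leak from the held-out fold into the kernel matrix. A minimal sketch, written against the current scikit-learn API (module paths have moved since this 2012 thread):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# The scaler is refit on each training fold, so the test fold's
# statistics never influence the fitted kernel machine.
pipe = make_pipeline(StandardScaler(), SVC(C=1.0))
scores = cross_val_score(pipe, X, y, cv=5)
```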

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-21 Thread Olivier Grisel
On 22 March 2012 at 01:09, David Warde-Farley wrote: > >> That said, I agree with James that the docs should be much more >> explicit about what is going on, and how what we have differs from >> libsvm. > > I think that renaming sklearn's scaled version of "C" is probably a start. > Using the name

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-21 Thread David Warde-Farley
On 2012-03-21, at 7:25 PM, Gael Varoquaux wrote: > I'd like to stress that I don't think that following libsvm is much of a > goal per se. I understand that it makes the life of someone like James > easier, because he knows libsvm well and can relate to it. I think it's less about disagreeing wi

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-21 Thread Gael Varoquaux
I've stayed quiet in this discussion because I was busy elsewhere. The good thing is that it has allowed me to hear the point of view of different people. Here is mine. First, the decision we took can be undone. It is not final, and the way that it should be taken is to make our users' lives easiest

Re: [Scikit-learn-general] Online Non Negative Matrix Factorization GSoC

2012-03-21 Thread Gael Varoquaux
On Wed, Mar 21, 2012 at 06:42:36PM +0100, Alexandre Gramfort wrote: > > In short, I think it could be interesting to implement the scout method too: > > "We show that ridge regression, the lasso, and the elastic net are > > special cases of covariance-regularized regression" > > http://www-stat.sta

Re: [Scikit-learn-general] Grid search on regularization parameter with recursive feature elimination RFECV

2012-03-21 Thread Gael Varoquaux
On Wed, Mar 21, 2012 at 08:16:10PM +0100, Andreas Mueller wrote: >@devs: Is there a piece in the user guide that describes the pipeline? I >can't find it. I guess it should be in the model selection chapter, which never got much love. Gael

Re: [Scikit-learn-general] Grid search on regularization parameter with recursive feature elimination RFECV

2012-03-21 Thread Gael Varoquaux
On Wed, Mar 21, 2012 at 07:06:13PM +, Conrad Lee wrote: >Unsurprisingly, the above code doesn't work because it's not possible to >initialize a RFECV object without an estimator.  But I can't pass it an >estimator yet because I want to vary the value of C that is used in >initia

Re: [Scikit-learn-general] Buildbot failure - Sorry

2012-03-21 Thread Alexandre Gramfort
should be fixed Alex On Wed, Mar 21, 2012 at 9:42 PM, Andreas wrote: > Hey everybody. > It seems I broke the buildbot by clicking the green button too quickly. > I cannot really reproduce the behavior, though. > Any help would be appreciated. > > Sorry, > Andy

[Scikit-learn-general] Buildbot failure - Sorry

2012-03-21 Thread Andreas
Hey everybody. It seems I broke the buildbot by clicking the green button too quickly. I cannot really reproduce the behavior, though. Any help would be appreciated. Sorry, Andy -- This SF email is sponsored by: Try Wind

Re: [Scikit-learn-general] Grid search on regularization parameter with recursive feature elimination RFECV

2012-03-21 Thread Andreas Mueller
Hi Conrad. The Pipeline is designed to do exactly this: http://scikit-learn.org/dev/modules/generated/sklearn.pipeline.Pipeline.html#sklearn.pipeline.Pipeline Example here: http://scikit-learn.org/dev/auto_examples/feature_selection_pipeline.html#example-feature-selection-pipeline-py You can use
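A hedged sketch of the route Andreas points to, written against the current scikit-learn API (which has moved since 2012): RFE itself exposes predict, so it can be handed directly to GridSearchCV, and the nested parameter name estimator__C reaches the C of the wrapped model:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=100, n_features=10,
                           n_informative=3, random_state=0)

# RFE delegates predict to its inner estimator, so GridSearchCV can
# cross-validate the whole selector; "estimator__C" addresses the
# regularization parameter of the wrapped LogisticRegression.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=3)
grid = GridSearchCV(rfe, {"estimator__C": [0.1, 1.0, 10.0]}, cv=3)
grid.fit(X, y)
```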

[Scikit-learn-general] Grid search on regularization parameter with recursive feature elimination RFECV

2012-03-21 Thread Conrad Lee
I want to do a grid search that does two things at once: chooses the right value for C, the regularization parameter, and does feature selection with recursive feature elimination. As a reminder, here's how you usually use the Recursive Feature Elimination Cross Validation (RFECV) method: from sk

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-21 Thread James Bergstra
On Wed, Mar 21, 2012 at 6:46 AM, Olivier Grisel wrote: > On 21 March 2012 at 11:14, Mathieu Blondel wrote: >> On Mon, Mar 19, 2012 at 1:22 AM, Andreas wrote: >> >>> Are there any other options? >> >> Another solution is to perform cross-validation using non-scaled C >> values, select the best one

Re: [Scikit-learn-general] Online Non Negative Matrix Factorization GSoC

2012-03-21 Thread Alexandre Gramfort
> Okay, that sounds reasonable to me too. > It appears to me that it might be in everyone's interest if I apply for > a different project. I'm considering "Coordinated descent in linear > models beyond squared loss (eg Logistic)" > I'm currently working on a p>>N problem using the R scout package, >

Re: [Scikit-learn-general] linear model, ridge, Bayesian priors

2012-03-21 Thread Mathieu Blondel
Did you have a look into Bayesian Ridge Regression? http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.BayesianRidge.html Mathieu
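For reference, a minimal usage sketch of the linked class on synthetic data (current scikit-learn API; the coefficients and noise level below are made up):

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

rng = np.random.RandomState(0)
X = rng.randn(40, 3)
y = X @ np.array([1.0, 2.0, -1.0]) + 0.1 * rng.randn(40)

# BayesianRidge estimates the regularization strength from the data
# instead of requiring a fixed alpha up front.
model = BayesianRidge()
model.fit(X, y)
```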

Re: [Scikit-learn-general] Logistic Regression coefficients

2012-03-21 Thread David Warde-Farley
On 2012-03-21, at 4:57 AM, Olivier Grisel wrote: > I think the docstring is wrong. Anybody can confirm? Ran into this myself last night while answering the other thread. Yeah, it appears to be. David

Re: [Scikit-learn-general] linear model, ridge, Bayesian priors

2012-03-21 Thread Andreas
Hi Jeremias. I haven't thought that through but shouldn't it be possible to achieve the same effect by doing a linear transformation of your data and labels and then shrinking to zero? Cheers, Andy On 03/21/2012 03:12 PM, Jeremias Engelmann wrote: Hi I'm using scikit learn's linear model's ri
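A sketch of the transformation Andy hints at, under the assumption that "shrinking toward a prior w0" means penalizing ||w - w0||^2: substituting v = w - w0 turns the problem into ordinary ridge on the residual targets y - X w0, after which w0 is added back (the prior and true coefficients below are made up):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.randn(50, 3)
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true
w0 = np.array([1.5, -0.5, 0.0])  # hypothetical prior mean for the coefficients

# Penalizing ||w - w0||^2 is equivalent to plain ridge on the
# correction v = w - w0, fit against the residual y - X w0;
# the correction is shrunk toward zero, then w0 is added back.
ridge = Ridge(alpha=1.0, fit_intercept=False)
ridge.fit(X, y - X @ w0)
w_hat = ridge.coef_ + w0
```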

[Scikit-learn-general] linear model, ridge, Bayesian priors

2012-03-21 Thread Jeremias Engelmann
Hi, I'm using scikit-learn's linear model ridge regression to do ridge regression with large sparse matrices. I know that, by design, ridge regression penalizes parameters for moving away from zero. What I actually want is to penalize parameters for moving away from a certain prior (each parameter h

Re: [Scikit-learn-general] Online Non Negative Matrix Factorization GSoC

2012-03-21 Thread Immanuel B
2012/3/21 Gael Varoquaux: > On Wed, Mar 21, 2012 at 12:24:39PM +0900, Mathieu Blondel wrote: >> If the online NMF and SGD-based matrix factorization proposals are >> merged as I suggested before, I think it would make a decent GSOC >> project. Besides, if two different students were to work on the

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-21 Thread Olivier Grisel
On 21 March 2012 at 11:14, Mathieu Blondel wrote: > On Mon, Mar 19, 2012 at 1:22 AM, Andreas wrote: > >> Are there any other options? > > Another solution is to perform cross-validation using non-scaled C > values, select the best one and scale it before refitting with the > entire dataset (to tak

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-21 Thread Mathieu Blondel
On Mon, Mar 19, 2012 at 1:22 AM, Andreas wrote: > Are there any other options? Another solution is to perform cross-validation using non-scaled C values, select the best one and scale it before refitting with the entire dataset (to take into account that the entire dataset is bigger than a train
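A toy illustration of the rescaling Mathieu suggests, assuming the effective regularization behaves like C / n_samples: keeping that ratio constant when moving from a training fold to the full dataset means multiplying the selected C by the size ratio (the sizes and the selected C below are made up):

```python
# Hypothetical sizes and cross-validated C, for illustration only.
n_train_fold = 800   # size of one CV training fold (assumed)
n_full = 1000        # size of the entire dataset (assumed)
best_C_cv = 1.0      # C selected by cross-validation (assumed)

# Keep C / n_samples constant between the fold and the full fit.
C_refit = best_C_cv * n_full / n_train_fold  # -> 1.25
```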

Re: [Scikit-learn-general] Logistic Regression coefficients

2012-03-21 Thread Mathieu Blondel
On Wed, Mar 21, 2012 at 5:57 PM, Olivier Grisel wrote: > If there are only two classes, 0 or -1 is treated as negative and 1 is > treated as positive. To complement Olivier's answer, by convention in scikit-learn, the negative label is in self.classes_[0] and the positive one is in self.classes_
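A quick check of the convention Mathieu describes, written against the current scikit-learn API: labels are stored sorted, so the negative one lands in classes_[0] (the tiny dataset below is made up):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

clf = LogisticRegression().fit(X, y)
# classes_ is sorted: the negative label comes first, the positive second.
labels = list(clf.classes_)
```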

Re: [Scikit-learn-general] Logistic Regression coefficients

2012-03-21 Thread Kerui Min
Although I haven't checked the code, I guess this is the usual way to store the coefficients. To calculate P(C=i|x), we can use the formula: exp(sum_j Coef[i,j] x_j) / Z, where Z = sum_i exp(sum_j Coef[i,j] x_j). Sincerely, Kerui Min On Wed, Mar 21, 2012 at 4:57 PM, Olivier Grisel wrote: > On 21 March 2012
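The formula Kerui writes down is the usual softmax over the rows of the coefficient matrix; a small self-contained sketch with a hypothetical 3-class matrix (not taken from any fitted model):

```python
import numpy as np

# Hypothetical coefficient matrix (rows = classes) and input x,
# illustrating P(C=i|x) = exp(sum_j Coef[i,j] x_j) / Z.
coef = np.array([[1.0, -0.5],
                 [0.2,  0.3],
                 [-1.0, 0.8]])
x = np.array([0.5, 1.0])

scores = coef @ x                 # one decision value per class
probs = np.exp(scores) / np.exp(scores).sum()  # normalize by Z
```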

Re: [Scikit-learn-general] Logistic Regression coefficients

2012-03-21 Thread Olivier Grisel
On 21 March 2012 at 07:49, Andrew Cepheus wrote: > The LogisticRegression class holds a coef_ attribute which is said to hold > the coefficients in the decision function. > High (positive) coefficients mean more correlation with the class, while low > (negative) ones mean an opposite correlation wi

[Scikit-learn-general] Logistic Regression coefficients

2012-03-21 Thread Andrew Cepheus
The LogisticRegression class holds a coef_ attribute which is said to hold the coefficients in the decision function. High (positive) coefficients mean more correlation with the class, while low (negative) ones mean an opposite correlation with the class. - Assuming that I have two classes in that ta

Re: [Scikit-learn-general] scikit-learn gsoc idea: Neural Networks

2012-03-21 Thread Gael Varoquaux
On Wed, Mar 21, 2012 at 05:19:44PM +0900, Mathieu Blondel wrote: > +1 for a pure cython implementation without dependency. Also, I agree > with what Andreas said in another thread: scikit-learn should include > every classical / textbook algorithm. So, MLP is more than welcome in > scikit-learn eve

Re: [Scikit-learn-general] scikit-learn gsoc idea: Neural Networks

2012-03-21 Thread Mathieu Blondel
On Wed, Mar 21, 2012 at 4:59 PM, Olivier Grisel wrote: > If we are to add implementation for some neural nets to the project I > would rather have it implemented in pure cython without any further > dependencies and providing less flexibility on the structure of the > networks and the list of hyp

Re: [Scikit-learn-general] scikit-learn gsoc idea: Neural Networks

2012-03-21 Thread Andreas
On 03/21/2012 01:39 AM, Olivier Grisel wrote: > On 21 March 2012 at 01:21, David Marek wrote: > >> Hi >> >> I think I was a little confused, I'll try to summarize what I >> understand is needed: >> >> * the goal is to have multilayer perceptron with stochastic gradient >> descent and maybe othe

Re: [Scikit-learn-general] scikit-learn gsoc idea: Neural Networks

2012-03-21 Thread Olivier Grisel
On 21 March 2012 at 04:55, David Warde-Farley wrote: > On 2012-03-20, at 9:16 PM, Rami Al-Rfou' wrote: > >> Hi All, >> >> I think Torch7 and Theano are fast and powerful libraries that would be nice >> to take advantage of them. > > They're also rather heavy dependencies. > > In this case, since I

Re: [Scikit-learn-general] Logistic Regression Predict function

2012-03-21 Thread Mathieu Blondel
> one-vs-rest with liblinear? yep! Mathieu

Re: [Scikit-learn-general] Logistic Regression Predict function

2012-03-21 Thread Alexandre Gramfort
> It's normalizing by the sum of the probabilities output by each > one-vs-rest classifier... one-vs-rest with liblinear? Alex