On Wed, Mar 21, 2012 at 08:09:26PM -0400, David Warde-Farley wrote:
> I think it's less about disagreeing with libsvm than disagreeing with the
> notation of every textbook presentation I know of. I agree that libsvm is no
> golden calf.
But it is also the case for the lasso: the loss term is th
On Thu, Mar 22, 2012 at 9:09 AM, David Warde-Farley
wrote:
> In particular, doing 1 vs rest for logistic regression seems like
> an odd choice when there is a perfectly good multiclass generalization of
> logistic regression. Mathieu clarified to me last night how liblinear is
> calculating "
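A minimal sketch of the contrast being discussed, assuming the current scikit-learn API (OneVsRestClassifier plus LogisticRegression, not the 2012 liblinear wrapper):

    # One-vs-rest fits one binary logistic regression per class, while a single
    # LogisticRegression fits the multinomial (softmax) generalization.
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.multiclass import OneVsRestClassifier

    X, y = load_iris(return_X_y=True)
    ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)
    multinomial = LogisticRegression(max_iter=1000).fit(X, y)
    print(len(ovr.estimators_), multinomial.coef_.shape)  # 3 binary models vs one (3, 4) weight matrix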
On Thu, Mar 22, 2012 at 3:35 AM, James Bergstra
wrote:
> Also, isn't the feature normalization supposed to be done on a
> fold-by-fold basis? If you're doing that, you have a different kernel
> matrix in every fold anyway.
Indeed, if you really want to be clean, you would need to do that
bu
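A minimal sketch of fold-by-fold normalization with the current API: with the scaler inside a Pipeline it is refit on each training fold, so every fold indeed gets its own kernel matrix.

    # The scaler lives inside the Pipeline, so cross_val_score refits it on
    # every training fold rather than on the full dataset.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)
    pipe = Pipeline([("scale", StandardScaler()), ("svc", SVC(kernel="rbf", C=1.0))])
    print(cross_val_score(pipe, X, y, cv=5))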
On 22 March 2012 01:09, David Warde-Farley wrote:
>
>> That said, I agree with James that the docs should be much more
>> explicit about what is going on, and how what we have differs from
>> libsvm.
>
> I think that renaming sklearn's scaled version of "C" is probably a start.
> Using the name
On 2012-03-21, at 7:25 PM, Gael Varoquaux wrote:
> I'd like to stress that I don't think that following libsvm is much of a
> goal per se. I understand that it makes the life of someone like James
> easier, because he knows libsvm well and can relate to it.
I think it's less about disagreeing with libsvm than disagreeing with the
notation of every textbook presentation I know of.
I've stayed quiet in this discussion because I was busy elsewhere. The
good thing is that it has allowed me to hear the points of view of
different people. Here is mine.
First, the decision we took can be undone. It is not final, and the way
it should be taken is the one that makes our users' lives easiest
On Wed, Mar 21, 2012 at 06:42:36PM +0100, Alexandre Gramfort wrote:
> > In short, I think it could be interesting to implement the scout method too:
> > "We show that ridge regression, the lasso, and the elastic net are
> > special cases of covariance-regularized regression"
> > http://www-stat.sta
On Wed, Mar 21, 2012 at 08:16:10PM +0100, Andreas Mueller wrote:
>@devs: Is there a piece in the user guide that describes the pipeline? I
>can't find it.
I guess it should be in the model selection chapter, which never got much
love.
Gael
On Wed, Mar 21, 2012 at 07:06:13PM, Conrad Lee wrote:
>Unsurprisingly, the above code doesn't work because it's not possible to
>initialize an RFECV object without an estimator. But I can't pass it an
>estimator yet because I want to vary the value of C that is used in
>initia
should be fixed
Alex
On Wed, Mar 21, 2012 at 9:42 PM, Andreas wrote:
> Hey everybody.
> It seems I broke the buildbot by clicking the green button too quickly.
> I cannot really reproduce the behavior, though.
> Any help would be appreciated.
>
> Sorry,
> Andy
Hey everybody.
It seems I broke the buildbot by clicking the green button too quickly.
I cannot really reproduce the behavior, though.
Any help would be appreciated.
Sorry,
Andy
Hi Conrad.
The Pipeline is designed to do exactly this:
http://scikit-learn.org/dev/modules/generated/sklearn.pipeline.Pipeline.html#sklearn.pipeline.Pipeline
Example here:
http://scikit-learn.org/dev/auto_examples/feature_selection_pipeline.html#example-feature-selection-pipeline-py
You can use
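A hedged sketch of that idea (not the exact example linked above): a Pipeline chaining RFE and a linear SVM lets one grid search tune the C of both steps together, using the step__param naming convention.

    # RFE and the final classifier are tuned jointly through a Pipeline.
    from sklearn.datasets import load_breast_cancer
    from sklearn.feature_selection import RFE
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.svm import LinearSVC

    X, y = load_breast_cancer(return_X_y=True)
    pipe = Pipeline([
        ("rfe", RFE(LinearSVC(dual=False, max_iter=10000), n_features_to_select=10)),
        ("clf", LinearSVC(dual=False, max_iter=10000)),
    ])
    param_grid = {"rfe__estimator__C": [0.01, 0.1, 1.0], "clf__C": [0.01, 0.1, 1.0]}
    search = GridSearchCV(pipe, param_grid, cv=5).fit(X, y)
    print(search.best_params_)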
I want to do a grid search that does two things at once: chooses the right
value for C, the regularization parameter, and does feature selection with
recursive feature elimination.
As a reminder, here's how you usually use the Recursive Feature Elimination
Cross Validation (RFECV) method:
from sk
On Wed, Mar 21, 2012 at 6:46 AM, Olivier Grisel
wrote:
> On 21 March 2012 11:14, Mathieu Blondel wrote:
>> On Mon, Mar 19, 2012 at 1:22 AM, Andreas wrote:
>>
>>> Are there any other options?
>>
>> Another solution is to perform cross-validation using non-scaled C
>> values, select the best one
> Okay, that sounds reasonable to me too.
> It appears to me that it might be in everyone's interest if I apply for
> a different project. I'm considering "Coordinated descent in linear
> models beyond squared loss (eg Logistic)"
> I'm currently working on a p>>N problem using the R scout package,
>
Did you have a look into Bayesian Ridge Regression?
http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.BayesianRidge.html
Mathieu
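A minimal usage sketch of BayesianRidge on synthetic data (illustrative only):

    # BayesianRidge learns the regularization from the data instead of taking
    # a fixed alpha; coef_, alpha_ and lambda_ hold the fitted values.
    import numpy as np
    from sklearn.linear_model import BayesianRidge

    rng = np.random.RandomState(0)
    X = rng.randn(100, 5)
    y = X @ np.array([1.0, 2.0, 0.0, 0.0, -1.0]) + 0.1 * rng.randn(100)
    reg = BayesianRidge().fit(X, y)
    print(reg.coef_)     # estimated weights
    print(reg.alpha_)    # estimated noise precision
    print(reg.lambda_)   # estimated precision of the weights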
On 2012-03-21, at 4:57 AM, Olivier Grisel wrote:
> I think the docstring is wrong. Anybody can confirm?
Ran into this myself last night while answering the other thread. Yeah, it
appears to be.
David
Hi Jeremias.
I haven't thought that through, but shouldn't it be possible
to achieve the same effect by doing a linear transformation of your data
and labels
and then shrinking to zero?
Cheers,
Andy
On 03/21/2012 03:12 PM, Jeremias Engelmann wrote:
Hi
I'm using scikit learn's linear model's ridge regression to do ridge
regression with large sparse matrices.
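A hedged sketch of that suggestion (the name w0 below is illustrative): penalizing ||w - w0||^2 instead of ||w||^2 is the same as fitting an ordinary ridge on the shifted targets y - X @ w0 and adding w0 back to the result.

    # Ridge shrinkage towards a prior w0 instead of towards zero:
    # with v = w - w0, minimizing ||y - Xw||^2 + alpha*||w - w0||^2
    # is ordinary ridge on (X, y - X @ w0).
    import numpy as np
    from scipy import sparse
    from sklearn.linear_model import Ridge

    rng = np.random.RandomState(0)
    X = sparse.random(200, 50, density=0.1, format="csr", random_state=rng)
    w0 = rng.randn(50)                      # prior each coefficient is shrunk towards
    y = X @ (w0 + 0.5 * rng.randn(50))

    ridge = Ridge(alpha=1.0, fit_intercept=False)
    ridge.fit(X, y - X @ w0)                # penalty now measures distance from w0
    w = w0 + ridge.coef_                    # coefficients shrunk towards w0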
Hi
I'm using scikit learn's linear model's ridge regression to do ridge
regression with large sparse matrices.
I know that, by design, ridge regression penalizes parameters for moving
away from zero. What I actually want is to penalize parameters for moving away
from a certain prior (each parameter h
2012/3/21 Gael Varoquaux :
> On Wed, Mar 21, 2012 at 12:24:39PM +0900, Mathieu Blondel wrote:
>> If the online NMF and SGD-based matrix factorization proposals are
>> merged as I suggested before, I think it would make a decent GSOC
>> project. Besides, if two different students were to work on the
On 21 March 2012 11:14, Mathieu Blondel wrote:
> On Mon, Mar 19, 2012 at 1:22 AM, Andreas wrote:
>
>> Are there any other options?
>
> Another solution is to perform cross-validation using non-scaled C
> values, select the best one and scale it before refitting with the
> entire dataset (to tak
On Mon, Mar 19, 2012 at 1:22 AM, Andreas wrote:
> Are there any other options?
Another solution is to perform cross-validation using non-scaled C
values, select the best one and scale it before refitting with the
entire dataset (to take into account that the entire dataset is bigger
than a train
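A hedged sketch of that recipe with the current API; the assumption here is that the effective regularization scales with n_samples, so the selected C is multiplied by the ratio of full-data size to training-fold size before the final refit.

    # Select C by cross-validation, then rescale it so that C / n_samples stays
    # roughly constant when refitting on the whole (larger) dataset.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV, KFold
    from sklearn.svm import LinearSVC

    X, y = load_iris(return_X_y=True)
    cv = KFold(n_splits=5, shuffle=True, random_state=0)
    search = GridSearchCV(LinearSVC(dual=False), {"C": [0.01, 0.1, 1.0, 10.0]}, cv=cv)
    search.fit(X, y)

    fold_train_size = len(X) * (cv.n_splits - 1) / cv.n_splits
    best_C = search.best_params_["C"] * len(X) / fold_train_size
    final = LinearSVC(C=best_C, dual=False).fit(X, y)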
On Wed, Mar 21, 2012 at 5:57 PM, Olivier Grisel
wrote:
> If there are only two classes, 0 or -1 is treated as negative and 1 is
> treated as positive.
To complement Olivier's answer, by convention in scikit-learn, the
negative label is in self.classes_[0]
and the positive one is in self.classes_[1].
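A tiny illustration of that convention with the current API:

    # In the binary case classes_[0] is the negative class and classes_[1] the
    # positive one; a positive decision_function value means classes_[1].
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
    y = np.array([0, 0, 1, 1])
    clf = LogisticRegression().fit(X, y)
    print(clf.classes_)                    # [0 1]
    print(clf.decision_function([[3.0]]))  # > 0, so the prediction is classes_[1]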
Although I haven't checked the code, I guess this is the usual way to store
the coefficients. To calculate P(C=i|x), we can use the formula
P(C=i|x) = exp(sum_j Coef[i,j] * x_j) / Z, where Z = sum_i exp(sum_j Coef[i,j] * x_j).
Sincerely,
Kerui Min
On Wed, Mar 21, 2012 at 4:57 PM, Olivier Grisel wrote:
> On 21 March 2012
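To make the formula concrete, a hedged numpy sketch (assuming the multinomial setting; with one-vs-rest the per-class outputs are renormalized instead, as noted elsewhere in this thread):

    # Recover class probabilities from coef_ and intercept_ with a softmax;
    # with the multinomial solver this matches predict_proba.
    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    X, y = load_iris(return_X_y=True)
    clf = LogisticRegression(max_iter=1000).fit(X, y)

    scores = X @ clf.coef_.T + clf.intercept_         # same as clf.decision_function(X)
    proba = np.exp(scores - scores.max(axis=1, keepdims=True))
    proba /= proba.sum(axis=1, keepdims=True)         # softmax over the classes
    print(np.allclose(proba, clf.predict_proba(X)))   # True in the multinomial case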
On 21 March 2012 07:49, Andrew Cepheus wrote:
> The LogisticRegression class holds a coef_ attribute which is said to hold
> the coefficients in the decision function.
> High (positive) coefficients mean more correlation with the class, while low
> (negative) ones mean an opposite correlation wi
The LogisticRegression class holds a coef_ attribute which is said to hold
the coefficients in the decision function.
High (positive) coefficients mean more correlation with the class, while
low (negative) ones mean an opposite correlation with the class.
- Assuming that I have two classes in that ta
On Wed, Mar 21, 2012 at 05:19:44PM +0900, Mathieu Blondel wrote:
> +1 for a pure cython implementation without extra dependencies. Also, I agree
> with what Andreas said in another thread: scikit-learn should include
> every classical / textbook algorithm. So, MLP is more than welcome in
> scikit-learn eve
On Wed, Mar 21, 2012 at 4:59 PM, Olivier Grisel
wrote:
> If we are to add implementation for some neural nets to the project I
> would rather have it implemented in pure cython without any further
> dependencies and providing less flexibility on the structure of the
> networks and the list of hyp
On 03/21/2012 01:39 AM, Olivier Grisel wrote:
> On 21 March 2012 01:21, David Marek wrote:
>
>> Hi
>>
>> I think I was a little confused, I'll try to summarize what I
>> understand is needed:
>>
>> * the goal is to have multilayer perceptron with stochastic gradient
>> descent and maybe othe
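For reference, a hedged sketch of what that goal looks like with today's scikit-learn; MLPClassifier arrived long after this thread and is shown only to make the summary concrete.

    # A multilayer perceptron trained with plain SGD, roughly the estimator
    # sketched in this thread (illustrated with today's MLPClassifier).
    from sklearn.datasets import load_digits
    from sklearn.neural_network import MLPClassifier

    X, y = load_digits(return_X_y=True)
    X = X / 16.0                                   # scale pixel values to [0, 1]
    clf = MLPClassifier(hidden_layer_sizes=(64,), solver="sgd",
                        learning_rate_init=0.1, max_iter=300, random_state=0)
    clf.fit(X, y)
    print(clf.score(X, y))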
On 21 March 2012 04:55, David Warde-Farley wrote:
> On 2012-03-20, at 9:16 PM, Rami Al-Rfou' wrote:
>
>> Hi All,
>>
>> I think Torch7 and Theano are fast and powerful libraries that it would be
>> nice to take advantage of.
>
> They're also rather heavy dependencies.
>
> In this case, since I
> one-vs-rest with liblinear?
yep !
Mathieu
> It's normalizing by the sum of the probabilities output by each
> one-vs-rest classifier...
one-vs-rest with liblinear?
Alex