Re: [Scikit-learn-general] poor svm performance

2012-02-17 Thread Mathieu Blondel
On Fri, Feb 17, 2012 at 3:53 PM, Andreas wrote:
> ASSET: Approximate Stochastic Subgradient Estimation Training for Support Vector Machines, Sangkyun Lee and Stephen J. Wright
> Code available online but didn't try yet.
> I think I gave this reference before when we were talking about GSoC. S

Re: [Scikit-learn-general] Why does polynomial SVC perform so poorly on document classification?

2012-02-17 Thread Olivier Grisel
2012/2/17 Lars Buitinck:
> 2012/2/13 Olivier Grisel:
>> I don't know about the polynomial kernel part, but since C is scaled according to the number of samples, C=1e4 or more is required for text classification.
> I finally had time to try this out. You're absolutely right; C should be aroun

Re: [Scikit-learn-general] Why does polynomial SVC perform so poorly on document classification?

2012-02-17 Thread Lars Buitinck
2012/2/13 Andreas:
> On 02/13/2012 09:49 PM, Lars Buitinck wrote:
>> I verified that the features coming from text.Vectorizer are normalized; they're all in the range [-1, 1].
> I guess that is not the problem here but chi2 is only defined for positive input, right?
Strictly, yes, so I sh

Re: [Scikit-learn-general] Why does polynomial SVC perform so poorly on document classification?

2012-02-17 Thread Lars Buitinck
2012/2/13 Olivier Grisel:
> I don't know about the polynomial kernel part, but since C is scaled according to the number of samples, C=1e4 or more is required for text classification.
I finally had time to try this out. You're absolutely right; C should be around 1e11 for a quadratic kernel SVM to
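The remark about C being scaled by the number of samples can be illustrated with a little arithmetic. This is only a sketch under an assumption not spelled out in the thread: that "scaled" means the per-sample penalty in the objective is C divided by n_samples. The helper name `effective_C` and the corpus size of 10,000 documents are hypothetical, chosen for illustration.

```python
def effective_C(C, n_samples, scale_by_n_samples=True):
    """Per-sample penalty actually applied to each slack term.

    Assumption (hypothetical, for illustration): when scaling is on,
    the nominal C is divided by the number of training samples.
    """
    return C / n_samples if scale_by_n_samples else C


n_docs = 10_000  # hypothetical corpus size

# With the default-looking C=1.0, each document's penalty is tiny.
weak = effective_C(1.0, n_docs)

# Raising the nominal C to 1e4 restores a per-sample penalty of 1.0.
restored = effective_C(1e4, n_docs)

# Without scaling, the nominal C is used as-is.
unscaled = effective_C(1.0, n_docs, scale_by_n_samples=False)
```

Under that assumption, a nominal C that looks large (1e4) simply undoes the division by the corpus size, which would be consistent with the advice quoted above. It does not by itself account for the much larger value reported for the quadratic kernel.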

Re: [Scikit-learn-general] poor svm performance

2012-02-17 Thread Andreas
On 02/17/2012 01:35 PM, Olivier Grisel wrote:
> 2012/2/17 Andreas:
>>> With regards to LaSVM, I would rather pitch a summer of code project as having a good on-line SVM solver, that would incorporate the core ideas from LaSVM, but that would also be useable in a real out-of-c

Re: [Scikit-learn-general] poor svm performance

2012-02-17 Thread Andreas
On 02/15/2012 02:04 AM, Ian Goodfellow wrote:
> Further update: I talked to Adam Coates and his code doesn't implement a standard SVM. Instead it's an "L2 SVM" which squares all the slack variables. So this probably explains the difference in performance I observed prior to building this test
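For readers unfamiliar with the distinction drawn in the quote, here is a minimal sketch of the two per-sample losses: the standard hinge loss (the slack variable itself) and the "L2 SVM" variant (the slack variable squared). The function names and the example margins are hypothetical, and this is not taken from Adam Coates's code. In current scikit-learn releases, LinearSVC exposes the same choice through its `loss` parameter ("hinge" vs. "squared_hinge").

```python
def hinge_loss(margin):
    """Standard SVM loss: the slack max(0, 1 - y * f(x))."""
    return max(0.0, 1.0 - margin)


def squared_hinge_loss(margin):
    """'L2 SVM' loss: the same slack, squared."""
    return max(0.0, 1.0 - margin) ** 2


# margin = y * f(x); values below 1 violate the margin.
margins = [2.0, 0.5, -1.0]  # hypothetical example values

hinge = [hinge_loss(m) for m in margins]
squared = [squared_hinge_loss(m) for m in margins]
```

Squaring shrinks the penalty for small violations (slack below 1) and inflates it for large ones, so the two formulations can settle on different decision boundaries for the same C.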

Re: [Scikit-learn-general] poor svm performance

2012-02-17 Thread Olivier Grisel
2012/2/17 Andreas:
>> With regards to LaSVM, I would rather pitch a summer of code project as having a good on-line SVM solver, that would incorporate the core ideas from LaSVM, but that would also be useable in a real out-of-core setting. I believe that doing this right, including worr

Re: [Scikit-learn-general] poor svm performance

2012-02-17 Thread Andreas
> With regards to LaSVM, I would rather pitch a summer of code project as having a good on-line SVM solver, that would incorporate the core ideas from LaSVM, but that would also be useable in a real out-of-core setting. I believe that doing this right, including worrying about the parameter