[Scikit-learn-general] Scipy2012 tutorial & video

2012-09-14 Thread Jake Vanderplas
Hello, The video of the scikit-learn tutorial from Scipy2012 has (finally) been posted: http://www.youtube.com/watch?v=33L_EXLtJPE&feature=plcp The tutorial material can be found at this site: http://astroml.github.com/sklearn_tutorial/ Enjoy! Jake

[Scikit-learn-general] [ANN] John Hunter has been awarded the first Distinguished Service Award by the PSF

2012-09-14 Thread Fernando Perez
Hi folks, you may have already seen this, but in case you haven't, I'm thrilled to share that the Python Software Foundation has just created its newest and highest distinction, the Distinguished Service Award, and has chosen John as its first recipient: http://pyfound.blogspot.com/2012/09/announ

Re: [Scikit-learn-general] Combining TFIDF and LDA features

2012-09-14 Thread Peter Prettenhofer
2012/9/14 Philipp Singer: > Hey! > > On 14.09.2012 15:10, Peter Prettenhofer wrote: >> >> I totally agree - I had such an issue in my research as well >> (combining word presence features with SVD embeddings). >> I followed Blitzer et al. 2006 and normalized** both feature groups >> separately -

Re: [Scikit-learn-general] Combining TFIDF and LDA features

2012-09-14 Thread Philipp Singer
On 14.09.2012 15:28, Philipp Singer wrote: > Okay, so I did a fast chi2 check and it seems like some LDA features > have high p-values, so they should be helpful at least. Oh, sorry. We want the lowest p-values, right? But that's the same case. There are many with low p-values. > > On 14.09.201
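The correction above (low p-values, not high ones, suggest a useful feature) can be illustrated with a small, dependency-free sketch. It assumes a two-class problem and non-negative feature values such as topic proportions. The function name `chi2_binary` and the toy data are hypothetical, and this is not the code used in the thread. In scikit-learn the corresponding tool would be `sklearn.feature_selection.chi2`.

```python
import math


def chi2_binary(feature_values, labels):
    """Chi-squared statistic and p-value for one non-negative feature
    against a two-class label (1 degree of freedom)."""
    total = sum(feature_values)
    n = len(labels)
    stat = 0.0
    for cls in (0, 1):
        # Observed: feature mass that falls into this class.
        observed = sum(v for v, y in zip(feature_values, labels) if y == cls)
        # Expected: feature mass split in proportion to the class frequency.
        expected = total * sum(1 for y in labels if y == cls) / n
        if expected > 0:
            stat += (observed - expected) ** 2 / expected
    # Survival function of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical toy data: six documents, two classes.
labels = [0, 0, 0, 1, 1, 1]
informative = [5, 4, 6, 0, 1, 0]    # concentrated in class 0
uninformative = [1, 2, 1, 2, 1, 1]  # spread evenly over both classes

stat_inf, p_inf = chi2_binary(informative, labels)
stat_un, p_un = chi2_binary(uninformative, labels)
```

Here the feature concentrated in one class gets a large statistic and a small p-value, while the evenly spread feature gets a statistic of zero and a p-value of one.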

Re: [Scikit-learn-general] Combining TFIDF and LDA features

2012-09-14 Thread Philipp Singer
Hey! On 14.09.2012 15:10, Peter Prettenhofer wrote: > > I totally agree - I had such an issue in my research as well > (combining word presence features with SVD embeddings). > I followed Blitzer et al. 2006 and normalized** both feature groups > separately - e.g. you could normalize word presen
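One possible reading of "normalize both feature groups separately" is sketched below. The snippet is cut off before the exact scheme is given, so unit L2 norm per group is an assumption here, and the helper names and toy data are hypothetical. With scikit-learn and SciPy one would more likely apply `sklearn.preprocessing.normalize` to each block and join them with `scipy.sparse.hstack`.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length; leave all-zero vectors as-is."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


def combine_groups(word_features, embedding_features):
    """Normalize each feature group on its own, then concatenate per sample."""
    return [
        l2_normalize(words) + l2_normalize(embedding)
        for words, embedding in zip(word_features, embedding_features)
    ]


# Hypothetical toy data: binary word-presence features and dense embeddings.
word_presence = [[1, 0, 1, 1], [0, 1, 0, 0]]
embeddings = [[0.02, -0.01], [0.3, 0.4]]

combined = combine_groups(word_presence, embeddings)
```

After this step each group contributes a vector of length one per sample, so neither group dominates simply because its raw values are larger.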

Re: [Scikit-learn-general] Combining TFIDF and LDA features

2012-09-14 Thread Philipp Singer
Okay, so I did a fast chi2 check and it seems like some LDA features have high p-values, so they should be helpful at least. On 14.09.2012 15:06, Andreas Müller wrote: > I'd be interested in the outcome. > Let us know when you get it to work :) > > > - Original Message - > From: "Phili

Re: [Scikit-learn-general] Combining TFIDF and LDA features

2012-09-14 Thread Philipp Singer
On 14.09.2012 15:10, amir rahimi wrote: > Have you done tests using some other classifiers such as gradient > boosting which has a kind of internal feature selection? Actually not, but I wanted to try that out, if the runtime allows it. > > On Fri, Sep 14, 2012 at 5:36 PM, Andreas Müller > mailt

Re: [Scikit-learn-general] Combining TFIDF and LDA features

2012-09-14 Thread amir rahimi
Have you done tests using some other classifiers such as gradient boosting which has a kind of internal feature selection? On Fri, Sep 14, 2012 at 5:36 PM, Andreas Müller wrote: > I'd be interested in the outcome. > Let us know when you get it to work :) > > > - Original Message - > From

Re: [Scikit-learn-general] Combining TFIDF and LDA features

2012-09-14 Thread Peter Prettenhofer
2012/9/14 Andreas Müller: > Hi Philipp. > First, you should ensure that the features all have approximately the same > scale. > For example they should all be between zero and one - if the LDA features > are much smaller than the other ones, then they will probably not be weighted > much. I tot

Re: [Scikit-learn-general] Combining TFIDF and LDA features

2012-09-14 Thread Andreas Müller
I'd be interested in the outcome. Let us know when you get it to work :) - Original Message - From: "Philipp Singer" To: scikit-learn-general@lists.sourceforge.net Sent: Friday, 14 September 2012 14:00:48 Subject: Re: [Scikit-learn-general] Combining TFIDF and LDA features On 14.

Re: [Scikit-learn-general] Combining TFIDF and LDA features

2012-09-14 Thread Philipp Singer
On 14.09.2012 14:53, Andreas Müller wrote: > Hi Philipp. Hey Andreas! > First, you should ensure that the features all have approximately the same > scale. > For example they should all be between zero and one - if the LDA features > are much smaller than the other ones, then they will probably

Re: [Scikit-learn-general] Combining TFIDF and LDA features

2012-09-14 Thread Andreas Müller
Hi Philipp. First, you should ensure that the features all have approximately the same scale. For example they should all be between zero and one - if the LDA features are much smaller than the other ones, then they will probably not be weighted much. Which LDA package did you use? I am not ver
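The scaling advice above can be sketched as follows: rescale the topic features column by column to the range zero to one, then append them to the TF-IDF vectors. This is a minimal, dependency-free illustration with hypothetical toy data and a hypothetical helper `min_max_scale`. It is not the poster's code, and in scikit-learn `sklearn.preprocessing.MinMaxScaler` would be the usual choice for dense data.

```python
def min_max_scale(rows):
    """Rescale each column of a dense row-major matrix to [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # Constant columns carry no information; map them to 0.0.
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(row, lows, highs)
        ])
    return scaled


# Hypothetical toy data: 3 documents, 2 TF-IDF columns and 2 topic columns.
tfidf = [[0.0, 0.8], [0.5, 0.0], [1.0, 0.4]]
topics = [[0.01, 0.03], [0.02, 0.01], [0.04, 0.02]]  # much smaller scale

scaled_topics = min_max_scale(topics)

# Append the rescaled topic features to each document's TF-IDF vector.
combined = [t + s for t, s in zip(tfidf, scaled_topics)]
```

The minimum and maximum should be estimated on the training set only and then reused for the test set, so that both are transformed in the same way.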

[Scikit-learn-general] Combining TFIDF and LDA features

2012-09-14 Thread Philipp Singer
Hey there! I have seen a few research papers in the past that combined TF-IDF-based features with LDA topic model features and increased their accuracy by a useful margin. I now want to do the same. As a simple first step I just appended the topic features to each train and test sam

Re: [Scikit-learn-general] : SVM and Sparse Data in (latest) version of sklearn

2012-09-14 Thread Dimitrios Pritsos
On 09/13/2012 09:27 PM, Lars Buitinck wrote: > 2012/9/13 Dimitrios Pritsos: >> There is a Great difference in the performance of SVM.fit() method >> (OneClassSVM in particular) depending on the input. When the input is a >> Sparse Matrix the Training is Extremely slow for a very small amount of >>