Re: [Scikit-learn-general] threading error when training a RFC on a big dataset

2012-09-22 Thread Olivier Grisel
2012/9/22 Christian Jauvin:
> Hi,
>
> I have been doing multiple experiments using a RandomForestClassifier (trained with the parallel code option) recently, without encountering any particular problem. However, as soon as I began using a much bigger dataset (with the exact same code), I got

[Scikit-learn-general] threading error when training a RFC on a big dataset

2012-09-22 Thread Christian Jauvin
Hi, I have been doing multiple experiments using a RandomForestClassifier (trained with the parallel code option) recently, without encountering any particular problem. However, as soon as I began using a much bigger dataset (with the exact same code), I got this threading error: Exception in thre

Re: [Scikit-learn-general] Congrats to Vivek for winning yet another kaggle contest!

2012-09-22 Thread Vivek Sharma
Thanks Olivier, Andreas. And, again to the text classification module authors. sklearn rocks! I think I was quite lucky, but I'm not complaining! :) My feature set was almost the same as the char and word features that Andreas used. I found that SVC gave me better performance than LR. And, some n

Re: [Scikit-learn-general] Congrats to Vivek for winning yet another kaggle contest!

2012-09-22 Thread Olivier Grisel
2012/9/22 Andreas Mueller:
> On 09/22/2012 12:17 PM, Olivier Grisel wrote:
>> and to Andreas who finished in the 6th position out of 50 final submitters.
>>
>> This contest was about text classification:
>>
>> http://www.kaggle.com/c/detecting-insults-in-social-commentary
>>
>> Any feedback on

Re: [Scikit-learn-general] Congrats to Vivek for winning yet another kaggle contest!

2012-09-22 Thread Andreas Mueller
On 09/22/2012 12:17 PM, Olivier Grisel wrote:
> and to Andreas who finished in the 6th position out of 50 final submitters.
>
> This contest was about text classification:
>
> http://www.kaggle.com/c/detecting-insults-in-social-commentary
>
> Any feedback on what scikit-learn models were used,

Re: [Scikit-learn-general] Congrats to Vivek for winning yet another kaggle contest!

2012-09-22 Thread Andreas Mueller
Congratulations to Vivek also from me :)

On 09/22/2012 12:17 PM, Olivier Grisel wrote:
> and to Andreas who finished in the 6th position out of 50 final submitters.

Thanks Olivier. I'll write a short blog post, but my best model is pretty boring :-/ There was a pretty big gap between the first

[Scikit-learn-general] Congrats to Vivek for winning yet another kaggle contest!

2012-09-22 Thread Olivier Grisel
and to Andreas who finished in the 6th position out of 50 final submitters. This contest was about text classification: http://www.kaggle.com/c/detecting-insults-in-social-commentary Any feedback on what scikit-learn models were used, which feature extraction / blending techniques were useful

Re: [Scikit-learn-general] TF-Idf

2012-09-22 Thread Olivier Grisel
2012/9/22 Ark:
> Hello,
> I am trying to classify a large document set with LinearSVC. I get good accuracy. However, I was wondering how to optimize the interface to this classifier. E.g., if I have a predict interface that accepts the raw document,

You can use the Pipeline class to
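
The reply above is cut off in this listing, so the following is only a minimal sketch of one way the Pipeline suggestion might look, not the code from the thread. It assumes the TF-IDF step is scikit-learn's `TfidfVectorizer` chained with `LinearSVC`; the toy corpus, the labels, and the `predict_raw` helper are hypothetical and exist purely for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy corpus and labels (not data from the thread).
train_docs = [
    "cheap pills buy now",
    "buy cheap watches now",
    "meeting agenda for monday",
    "project meeting notes attached",
]
train_labels = ["spam", "spam", "ham", "ham"]

# Chain TF-IDF feature extraction and the linear SVM into one estimator,
# so fit() and predict() both accept raw text documents.
text_clf = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("svc", LinearSVC(random_state=0)),
])
text_clf.fit(train_docs, train_labels)


def predict_raw(documents):
    """Classify a list of raw text documents with the fitted pipeline."""
    return text_clf.predict(documents).tolist()


predictions = predict_raw(["cheap pills now"])
print(predictions)
```

With this arrangement the vectorizer's vocabulary and IDF weights are learned once in `fit` and reused at prediction time, so callers never handle the feature matrix themselves.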