Re: [Scikit-learn-general] South Tahoe NIPS Conference Dec. 3-8, 2012

2012-11-07 Thread Gael Varoquaux
On Wed, Nov 07, 2012 at 05:23:50PM -0800, Doug Coleman wrote: > Is anyone planning on attending this conference? I am not going this time. Sorry. > Any of the tutorials/talks/workshops look especially interesting? NIPS is usually very interesting and of high quality. Gaël

Re: [Scikit-learn-general] South Tahoe NIPS Conference Dec. 3-8, 2012

2012-11-07 Thread Joseph Turian
The biglearn workshop looks killer. On Wed, Nov 7, 2012 at 5:23 PM, Doug Coleman wrote: > Hello, > > Is anyone planning on attending this conference? Any of the > tutorials/talks/workshops look especially interesting? > > http://nips.cc/Conferences/2012/Program/ > > I'm debating whether to go or

[Scikit-learn-general] South Tahoe NIPS Conference Dec. 3-8, 2012

2012-11-07 Thread Doug Coleman
Hello, Is anyone planning on attending this conference? Do any of the tutorials/talks/workshops look especially interesting? http://nips.cc/Conferences/2012/Program/ I'm debating whether to go or not and which sessions to attend because it's a 40-minute drive from where I live. If anyone is going,

Re: [Scikit-learn-general] OvR, Logistic Regression and SGD

2012-11-07 Thread Abhi
> > Indeed Abhi, which specific section of the documentation (or > docstring) led you to ask this question? > > The note on this page is pretty explicit: > > http://scikit-learn.org/dev/modules/multiclass.html > > Along with the docstring: > > http://scikit-learn.org/dev/modules/gener
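As a rough illustration of the one-vs-rest behaviour those docs describe, here is a minimal sketch against a recent scikit-learn API; the iris data is only a stand-in, and SGDClassifier already applies OvR internally for multiclass targets, so the explicit wrapper is shown just to make the strategy visible:

    from sklearn.datasets import load_iris
    from sklearn.linear_model import SGDClassifier
    from sklearn.multiclass import OneVsRestClassifier

    # explicit one-vs-rest wrapper around a linear SGD classifier
    X, y = load_iris(return_X_y=True)
    ovr = OneVsRestClassifier(SGDClassifier(random_state=0))
    ovr.fit(X, y)
    print(ovr.predict(X[:5]))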

Re: [Scikit-learn-general] RandomForest - optimisation of min_samples_split

2012-11-07 Thread Andreas Mueller
On 07.11.2012 15:48, paul.czodrow...@merckgroup.com wrote: > However, f1_score is not found. I would have suspected that this works in > analogy to the recall_score. How do you mean it is not found? It is in sklearn.metrics. >> If you want your confusion matrix to be more balanced, you can try tw
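For reference, a minimal sketch with assumed toy labels, showing that f1_score is imported from sklearn.metrics exactly like recall_score:

    from sklearn.metrics import f1_score, recall_score

    y_true = [0, 0, 1, 1, 1]  # assumed toy labels for illustration
    y_pred = [0, 1, 1, 1, 0]
    print(recall_score(y_true, y_pred))  # 0.666...
    print(f1_score(y_true, y_pred))      # 0.666...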

Re: [Scikit-learn-general] RandomForest - optimisation of min_samples_split

2012-11-07 Thread Paul . Czodrowski
Dear Andreas, Dear Gilles, Dear SciKitters, > Hi Paul > Tuning min_samples_split is a good idea but not related to imbalanced > classes. > First, you should specify what you want to optimize. Accuracy is usually > not a good measure for imbalanced classes. Maybe F-score? How would one do that? I j

Re: [Scikit-learn-general] RandomForest - optimisation of min_samples_split

2012-11-07 Thread Andreas Mueller
Hi Paul Tuning min_samples_split is a good idea but not related to imbalanced classes. First, you should specify what you want to optimize. Accuracy is usually not a good measure for imbalanced classes. Maybe F-score? If you want your confusion matrix to be more balanced, you can try two things (
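A minimal sketch of the suggestion above, assuming a synthetic, imbalanced stand-in for the 622 x 177 data and a recent scikit-learn API: grid-search min_samples_split while optimising F-score rather than accuracy.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV

    # synthetic, imbalanced stand-in for the real data (assumption for illustration)
    X, y = make_classification(n_samples=622, n_features=177,
                               weights=[0.9, 0.1], random_state=0)
    param_grid = {"min_samples_split": [2, 5, 10, 20]}
    search = GridSearchCV(RandomForestClassifier(n_estimators=100, random_state=0),
                          param_grid, scoring="f1", cv=5)
    search.fit(X, y)
    print(search.best_params_, search.best_score_)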

Re: [Scikit-learn-general] RF optimisation - class weights etc.

2012-11-07 Thread Gilles Louppe
Hello Paul, > Do fully developed trees make sense for rather small datasets? Overall, I > have 622 samples with 177 features each. Isn't there the risk of > overfitting? Yes, overfitting might happen, but it should be limited since you are building randomized trees and averaging them together. > >
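One way to sanity-check that averaging fully grown randomized trees is not overfitting badly is to compare the training score with the out-of-bag estimate; a sketch on synthetic stand-in data of the same shape:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=622, n_features=177, random_state=0)
    forest = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
    forest.fit(X, y)
    print("train accuracy:", forest.score(X, y))  # close to 1.0 for fully grown trees
    print("OOB accuracy:  ", forest.oob_score_)   # more honest estimate of generalisation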

[Scikit-learn-general] RandomForest - optimisation of min_samples_split

2012-11-07 Thread Paul . Czodrowski
Dear SciKitters, given a dataset of 622 samples with 177 features each, I want to classify them according to an experimental label of "0" or "1". After splitting into training and test sets, I trained a RandomForest the following way: " from sklearn.ensemble import RandomForestClassifie
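The quoted snippet is cut off; a hedged reconstruction of how such a run typically looks with a recent scikit-learn API, with synthetic data standing in for the 622 x 177 set:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import confusion_matrix
    from sklearn.model_selection import train_test_split

    # synthetic stand-in for the real dataset (assumption for illustration)
    X, y = make_classification(n_samples=622, n_features=177, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    forest = RandomForestClassifier(n_estimators=100, random_state=0)
    forest.fit(X_train, y_train)
    print(confusion_matrix(y_test, forest.predict(X_test)))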