Re: [Scikit-learn-general] SO question for the tree growers

2013-04-17 Thread Paul . Czodrowski
Dear Gilles, sorry to jump into that discussion, but it raised my interest. In the R randomForest package, MeanDecreaseGini can be calculated. Does scikit-learn somehow scale MeanDecreaseGini to the percentage scale? Please find attached the variable importance as computed by scikit-learn's RF

Re: [Scikit-learn-general] SO question for the tree growers

2013-04-05 Thread Gilles Louppe
Hi Paul,

> sorry to jump into that discussion, but it raised my interest.
> In the R randomForest package, MeanDecreaseGini can be calculated.
> Does scikit-learn somehow scale MeanDecreaseGini to the percentage scale?

Yes, in the randomForest R package there is basically no scaling or normaliz
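A note on the scaling question: one simple way to put raw importance values on a common scale is to divide each by their total, so the results sum to 1 and can be read as fractions or percentages. The sketch below assumes that kind of normalization. The function name and the feature values are hypothetical, and this is not code from scikit-learn or from the randomForest package.

```python
def normalize_importances(raw):
    """Scale non-negative raw importances so that they sum to 1."""
    total = sum(raw.values())
    if total == 0:
        # No split contributed anything: report zero importance everywhere.
        return {name: 0.0 for name in raw}
    return {name: value / total for name, value in raw.items()}


# Hypothetical raw MeanDecreaseGini-style values for three features.
raw = {"x1": 12.0, "x2": 6.0, "x3": 2.0}
scaled = normalize_importances(raw)

for name, value in scaled.items():
    print(f"{name}: {value:.1%}")
```

Normalized values only preserve the ranking and the relative sizes of the raw values, so they are not directly comparable with unscaled numbers from another tool.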

Re: [Scikit-learn-general] SO question for the tree growers

2013-04-05 Thread Paul . Czodrowski
Dear Gilles, sorry to jump into that discussion, but it raised my interest. In the R randomForest package, MeanDecreaseGini can be calculated. Does scikit-learn somehow scale MeanDecreaseGini to the percentage scale? Please find attached the variable importance as computed by scikit-learn's RF

Re: [Scikit-learn-general] SO question for the tree growers

2013-04-04 Thread Olivier Grisel
Thank you to both of you! I learned something new today :)

Re: [Scikit-learn-general] SO question for the tree growers

2013-04-04 Thread Gilles Louppe
Hi Olivier, There are indeed several ways to get feature "importances". As often, there is no strict consensus about what this word means. In our case, we implement the importance as described in [1] (often cited, but unfortunately rarely read...). It is sometimes called "gini importance" or "mea
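For readers following the thread, the "gini importance" idea can be illustrated on a single split. This is a minimal sketch, assuming the Gini impurity 1 - sum_k p_k^2 and an impurity decrease weighted by the fraction of training samples that reach the node. The function names and the toy data are hypothetical and are not scikit-learn API.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right, n_total):
    """Weighted impurity decrease of one split.

    n_total is the size of the whole training set, so the decrease is
    weighted by the fraction of samples that reach this node.
    """
    n = len(parent)
    children = (len(left) * gini(left) + len(right) * gini(right)) / n
    return (n / n_total) * (gini(parent) - children)


# Hypothetical toy node: 8 samples, perfectly separated by the split.
parent = [0, 0, 0, 0, 1, 1, 1, 1]
left, right = parent[:4], parent[4:]
decrease = impurity_decrease(parent, left, right, n_total=len(parent))
print(decrease)  # 0.5
```

Under these assumptions, a feature's importance in one tree would be the sum of such decreases over all nodes that split on it, and a forest-level value would average that over the trees.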

Re: [Scikit-learn-general] SO question for the tree growers

2013-04-04 Thread Peter Prettenhofer
I posted a brief description of the algorithm. The method that we implement is briefly described in ESLII. Gilles is the expert here, he can give more details on the issue.

2013/4/4 Olivier Grisel
> The variable importance in scikit-learn's implementation of random
> forest is based on the prop

[Scikit-learn-general] SO question for the tree growers

2013-04-04 Thread Olivier Grisel
The variable importance in scikit-learn's implementation of random forest is based on the proportion of samples that were classified by the feature at some point during the evaluation of one of the decision trees. http://scikit-learn.org/stable/modules/ensemble.html#feature-importance-evaluation This method