On Mon, Sep 08, 2014 at 10:05:58AM +0100, Luca Puggini wrote:
> for personal reason I am writing a function to compute the outlier
> measure from random forest
> http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#outliers
> with a little more work I can include the function in the sklearn
> random forest class.

Do you have a guesstimate of the amount of code it would add to the
codebase? Also, is there a canonical paper on this approach that we could
read?

> Is the community interested? Should I do it?

As always, it's very hard to judge whether a method should be included. I
personally think that outlier detection is very important, and I'd like to
see more of it in scikit-learn. However, we need to choose the methods that
bring the most benefit to users in solving that problem. Thus we need to be
convinced that the situations in which the method works well are reasonably
common. This requires understanding these situations, and that's usually a
bit hard.

Thanks a lot for that proposal!

Gaël

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
