[Scikit-learn-general] Random Forest Feature Importances Citation

2016-05-16 Thread Gavin Gray
In the scikit-learn documentation, feature importances are described as coming from the relative depths at which features are used as decision nodes, averaged across the trees in the forest. Does anyone know which paper discusses this method? Breiman's original paper seems to just talk about randomly permuti

Re: [Scikit-learn-general] Shared scikit/ipython server

2014-09-01 Thread Gavin Gray
I've used git-annex recently. It works basically like git, with a few caveats. I don't know if SparkleShare deals with large files in the same way, but git-annex has no problems with very large data files. -Gavin On Mon, Sep 1, 2014 at 9:36 AM, Gael Varoquaux <

[Scikit-learn-general] Beta regression

2014-07-21 Thread Gavin Gray
Checking the documentation, it looks like scikit-learn does not have an implementation of a generalized linear model where the target variable lies within the unit interval. In R they call it beta regression. Are there models like

Re: [Scikit-learn-general] Flexible Naive Bayes

2014-06-11 Thread Gavin Gray
KDE: http://www.cs.iastate.edu/~jtian/cs573/Papers/John-UAI-95.pdf I guess then you wouldn't have to specify, but it seems strange to try to estimate the distribution of a feature you know is Bernoulli, for instance. On Wed, Jun 11, 2014 at 3:52 PM, Lars Buitinck wrote: > 2014-06-11 15:

[Scikit-learn-general] Flexible Naive Bayes

2014-06-11 Thread Gavin Gray
Hi, I need to use Naive Bayes for mixed categorical and numerical data and was thinking of implementing a flexible Naive Bayes algorithm similar to Weka's, instead of hacking my way around it by converting the numerical features to categorical or similar. Is there a good reason I shouldn't do this? Is anyone el