Re: Feature importance for RandomForestRegressor in Spark 1.5

Yanbo Liang, Sun, 17 Jan 2016 00:48:07 -0800

Hi Robin,

#1 This feature is available from Spark 1.5.0.
#2 You should use the new ML package rather than the old MLlib package to
train the Random Forest model and get featureImportances, because it is only
exposed in the ML package. You can refer to the documentation:
https://spark.apache.org/docs/latest/ml-classification-regression.html#random-forest-classifier
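
To illustrate point #2, here is a minimal sketch of what this might look
like with the ML package, assuming the Spark 1.5/1.6 Scala API (SQLContext
and org.apache.spark.mllib.linalg vectors). The object name, the helper
name, the toy data and the parameter values are made up for illustration
and are not taken from the pull request or the docs.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.ml.regression.RandomForestRegressor
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.sql.SQLContext

// Hypothetical object name, used only for this sketch.
object FeatureImportanceSketch {

  // Trains a small forest on toy data and returns its feature importances.
  def importances(sqlContext: SQLContext): Vector = {
    // Made-up (label, features) rows, present only so the sketch runs.
    val training = sqlContext.createDataFrame(Seq(
      (1.0, Vectors.dense(1.0, 0.5, 3.0)),
      (2.0, Vectors.dense(2.0, 0.4, 1.0)),
      (3.0, Vectors.dense(3.0, 0.6, 2.0)),
      (4.0, Vectors.dense(4.0, 0.5, 0.0)),
      (5.0, Vectors.dense(5.0, 0.3, 4.0)),
      (6.0, Vectors.dense(6.0, 0.7, 2.5))
    )).toDF("label", "features")

    // ML (DataFrame-based) estimator, not mllib.tree.RandomForest.
    val rf = new RandomForestRegressor()
      .setLabelCol("label")
      .setFeaturesCol("features")
      .setNumTrees(10) // arbitrary value for the sketch
      .setSeed(42L)    // arbitrary seed for repeatable runs

    // fit() returns a RandomForestRegressionModel.
    val model = rf.fit(training)

    // One importance score per input feature.
    model.featureImportances
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("FeatureImportanceSketch")
      .setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val scores = importances(new SQLContext(sc))
      scores.toArray.zipWithIndex.foreach { case (score, i) =>
        println(s"feature $i: $score")
      }
    } finally {
      sc.stop()
    }
  }
}
```

On Spark 2.x and later the same idea should carry over, but the entry
point is SparkSession and the vector classes live in
org.apache.spark.ml.linalg, so the imports would need adjusting.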


Thanks
Yanbo

2016-01-16 0:16 GMT+08:00 Robin East <robin.e...@xense.co.uk>:

> re 1.
> The pull requests reference the JIRA ticket in this case
> https://issues.apache.org/jira/browse/SPARK-5133. The JIRA says it was
> released in 1.5.
>
>
>
> -------------------------------------------------------------------------------
> Robin East
> *Spark GraphX in Action* Michael Malak and Robin East
> Manning Publications Co.
> http://www.manning.com/books/spark-graphx-in-action
>
>
>
>
>
> On 15 Jan 2016, at 16:06, Scott Imig <si...@richrelevance.com> wrote:
>
> Hello,
>
> I have a couple of quick questions about this pull request, which adds
> feature importance calculations to the random forests in MLlib.
>
> https://github.com/apache/spark/pull/7838
>
> 1. Can someone help me determine the Spark version where this is first
> available?  (1.5.0?  1.5.1?)
>
> 2. Following the templates in this documentation to construct a
> RandomForestModel, should I be able to retrieve model.featureImportances?
> Or is there a different pattern for random forests in more recent Spark
> versions?
>
> https://spark.apache.org/docs/1.2.0/mllib-ensembles.html
>
> Thanks for the help!
> Imig
> --
> S. Imig | Senior Data Scientist Engineer | *rich**relevance *|m:
> 425.999.5725
>
> I support Bip 101 and BitcoinXT <https://bitcoinxt.software/>.
>
>
>
