I tried Spark 1.6, where fromOld takes numberOfFeatures as a fifth argument,
but I still cannot get the feature importances. The call to
featureImportances() now throws a NullPointerException (stack trace below).
RandomForestClassifier rfc =
    getRandomForestClassifier(numTrees, maxBinSize, maxTreeDepth, seed, impurity);
RandomForestClassificationModel rfm =
    RandomForestClassificationModel.fromOld(model, rfc, categoricalFeatures,
        numberOfClasses, numberOfFeatures);
System.out.println(rfm.featureImportances());
Stack Trace:
Exception in thread "main" java.lang.NullPointerException
    at org.apache.spark.ml.tree.impl.RandomForest$.computeFeatureImportance(RandomForest.scala:1152)
    at org.apache.spark.ml.tree.impl.RandomForest$$anonfun$featureImportances$1.apply(RandomForest.scala:1111)
    at org.apache.spark.ml.tree.impl.RandomForest$$anonfun$featureImportances$1.apply(RandomForest.scala:1108)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at org.apache.spark.ml.tree.impl.RandomForest$.featureImportances(RandomForest.scala:1108)
    at org.apache.spark.ml.classification.RandomForestClassificationModel.featureImportances$lzycompute(RandomForestClassifier.scala:237)
    at org.apache.spark.ml.classification.RandomForestClassificationModel.featureImportances(RandomForestClassifier.scala:237)
    at com.markmonitor.antifraud.ce.ml.CheckFeatureImportance.main(CheckFeatureImportance.java:49)
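
The top frame is computeFeatureImportance, which walks the nodes of each tree.
One possible reading, which is an assumption and not confirmed anywhere in this
thread, is that a model converted with fromOld carries no per-node statistics,
so the walk dereferences a null. The toy below is plain Java, not Spark code:
ImportanceSketch, Node, Stats, accumulate and importances are hypothetical
names, and the weighting (gain times instance count, normalized to sum to 1) is
a simplified sketch of a gain-based importance, not Spark's exact
implementation. It only shows how a missing statistics object breaks such a
walk and how a guard turns that into a readable error.

```java
import java.util.HashMap;
import java.util.Map;

class ImportanceSketch {

    /** Hypothetical per-node statistics; only the instance count is used. */
    static final class Stats {
        final double count;
        Stats(double count) { this.count = count; }
    }

    /** Hypothetical tree node; a leaf is a node without children. */
    static final class Node {
        final int feature;   // index of the split feature (unused for leaves)
        final double gain;   // impurity gain of the split
        final Stats stats;   // may be null, e.g. after a lossy conversion (assumption)
        final Node left;
        final Node right;

        Node(int feature, double gain, Stats stats, Node left, Node right) {
            this.feature = feature;
            this.gain = gain;
            this.stats = stats;
            this.left = left;
            this.right = right;
        }

        static Node leaf() { return new Node(-1, 0.0, null, null, null); }

        boolean isLeaf() { return left == null && right == null; }
    }

    /** Adds gain * count of every internal node to its split feature. */
    static void accumulate(Node node, Map<Integer, Double> out) {
        if (node.isLeaf()) {
            return;
        }
        if (node.stats == null) {
            // Without this guard, node.stats.count would throw a bare NPE.
            throw new IllegalStateException(
                "node splitting on feature " + node.feature + " has no statistics");
        }
        out.merge(node.feature, node.gain * node.stats.count, Double::sum);
        accumulate(node.left, out);
        accumulate(node.right, out);
    }

    /** Returns per-feature importances for one tree, normalized to sum to 1. */
    static Map<Integer, Double> importances(Node root) {
        Map<Integer, Double> out = new HashMap<>();
        accumulate(root, out);
        double total = 0.0;
        for (double v : out.values()) {
            total += v;
        }
        if (total > 0) {
            final double t = total;
            out.replaceAll((feature, v) -> v / t);
        }
        return out;
    }

    public static void main(String[] args) {
        Node inner = new Node(1, 0.25, new Stats(40), Node.leaf(), Node.leaf());
        Node root = new Node(0, 0.5, new Stats(80), inner, Node.leaf());
        System.out.println(importances(root));

        // Same shape, but the root has no statistics attached.
        Node broken = new Node(0, 0.5, null, Node.leaf(), Node.leaf());
        try {
            importances(broken);
        } catch (IllegalStateException e) {
            System.out.println("cannot compute importances: " + e.getMessage());
        }
    }
}
```

If that reading is right, the values passed to fromOld would not matter, because
the statistics are missing from the converted nodes themselves.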
From: Rachana Srivastava
Sent: Wednesday, January 13, 2016 3:30 PM
To: '[email protected]'; '[email protected]'
Subject: Random Forest FeatureImportance throwing NullPointerException
I have a random forest model for which I am trying to get the
featureImportances vector.
// Empty map: no features are declared as categorical.
Map<Object,Object> categoricalFeaturesParam = new HashMap<>();
// Convert the Java map to the immutable Scala map that fromOld expects.
scala.collection.immutable.Map<Object,Object> categoricalFeatures =
    (scala.collection.immutable.Map<Object,Object>)
        scala.collection.immutable.Map$.MODULE$.apply(
            JavaConversions.mapAsScalaMap(categoricalFeaturesParam).toSeq());
int numberOfClasses = 2;
RandomForestClassifier rfc = new RandomForestClassifier();
// model is the existing random forest model being converted.
RandomForestClassificationModel rfm =
    RandomForestClassificationModel.fromOld(model, rfc, categoricalFeatures,
        numberOfClasses);
System.out.println(rfm.featureImportances());
When I run the code above, featureImportances comes back as null. Do I need to
set anything specific to get the feature importances for a random forest model?
Thanks,
Rachana