Well, sounds trivial now! Thanks ;-)

2016-07-02 10:04 GMT+02:00 Yanbo Liang <yblia...@gmail.com>:
> Hi Mathieu,
>
> Using the new ml package to train a RandomForestClassificationModel, you
> can get feature importance. Then you can convert the prediction result to
> an RDD and feed it into BinaryClassificationMetrics for the ROC curve. You
> can refer to the following code snippet:
>
> import org.apache.spark.ml.classification.RandomForestClassifier
> import org.apache.spark.ml.linalg.Vector // org.apache.spark.mllib.linalg.Vector on Spark 1.x
> import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
> import org.apache.spark.sql.Row
>
> val rf = new RandomForestClassifier()
> val model = rf.fit(trainingData)
>
> val predictions = model.transform(testData)
>
> // Pair the score of the positive class (index 1) with the true label.
> val scoreAndLabels =
>   predictions.select(model.getRawPredictionCol, model.getLabelCol).rdd.map {
>     case Row(rawPrediction: Vector, label: Double) => (rawPrediction(1), label)
>     case Row(rawPrediction: Double, label: Double) => (rawPrediction, label)
>   }
> val metrics = new BinaryClassificationMetrics(scoreAndLabels)
> metrics.roc()
>
> Thanks
> Yanbo
>
> 2016-06-15 7:13 GMT-07:00 matd <matd...@gmail.com>:
>
>> Hi ml folks!
>>
>> I'm using a Random Forest for a binary classification.
>> I'm interested in getting both the ROC *curve* and the feature importance
>> from the trained model.
>>
>> If I'm not missing something obvious, the ROC curve is only available in
>> the old mllib world, via BinaryClassificationMetrics. In the new ml package,
>> only the areaUnderROC and areaUnderPR are available through
>> BinaryClassificationEvaluator.
>>
>> The feature importance is only available in the ml package, through
>> RandomForestClassificationModel.
>>
>> Any idea how to get both?
>>
>> Mathieu
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Get-both-feature-importance-and-ROC-curve-from-a-random-forest-classifier-tp27175.html
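
For anyone finding this thread later, here is a minimal end-to-end sketch of the approach Yanbo describes, returning both the feature importances and the ROC curve from one trained model. It assumes Spark 2.x or later (so `Vector` comes from `org.apache.spark.ml.linalg`). The object name `RfRocSketch`, the `run` helper, and the tiny synthetic data set are all made up for illustration and are not part of the original snippet.

```scala
import org.apache.spark.ml.classification.RandomForestClassifier
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
import org.apache.spark.sql.{Row, SparkSession}

// Hypothetical helper object, for illustration only.
object RfRocSketch {

  /** Returns (feature importances, ROC points as (FPR, TPR), area under ROC). */
  def run(spark: SparkSession): (Vector, Array[(Double, Double)], Double) = {
    // Synthetic data (an assumption): feature 0 separates the classes,
    // feature 1 is unrelated noise.
    val rows = (0 until 200).map { i =>
      val label = (i % 2).toDouble
      (label, Vectors.dense(label + 0.1 * (i % 5), (i % 7).toDouble))
    }
    val data = spark.createDataFrame(rows).toDF("label", "features")
    val Array(trainingData, testData) = data.randomSplit(Array(0.7, 0.3), seed = 42L)

    // Train with the ml package; this model exposes featureImportances.
    val model = new RandomForestClassifier().setSeed(42L).fit(trainingData)
    val importances = model.featureImportances

    // Drop down to an RDD of (score, label) pairs for the mllib metrics class.
    // rawPrediction(1) is the raw score of the positive class.
    val scoreAndLabels = model.transform(testData)
      .select(model.getRawPredictionCol, model.getLabelCol)
      .rdd
      .map { case Row(raw: Vector, label: Double) => (raw(1), label) }

    val metrics = new BinaryClassificationMetrics(scoreAndLabels)
    val rocPoints = metrics.roc().collect() // (false positive rate, true positive rate)
    (importances, rocPoints, metrics.areaUnderROC())
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[2]").appName("rf-roc-sketch").getOrCreate()
    val (importances, rocPoints, auc) = run(spark)
    println(s"Feature importances: $importances")
    rocPoints.foreach { case (fpr, tpr) => println(s"$fpr\t$tpr") }
    println(s"Area under ROC: $auc")
    spark.stop()
  }
}
```

In spark-shell, pasting the object and calling `RfRocSketch.run(spark)` should give the same three values. The only step that leaves the ml package is the conversion to an RDD of (score, label) pairs; training and `featureImportances` stay on the DataFrame side.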