Re: Is there a way to get the AUC metric for CrossValidator?

2016-09-29 Thread Rich Tarro
According to the documentation, cvModel.avgMetrics gives the average cross-validation metrics for each paramMap in CrossValidator.estimatorParamMaps, in the corresponding order. So when using areaUnderROC as the evaluator, cvModel.avgMetrics gives (in this example using Scala, but the API appears to work
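Because avgMetrics follows the order of estimatorParamMaps, the two arrays can be zipped to see which parameter combination produced which score. The following is only a sketch of that idea, not code from the thread: the object name CvMetrics and the helpers rank and rankModel are hypothetical, and it assumes a larger metric is better, as it is for areaUnderROC.

```scala
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.ml.tuning.CrossValidatorModel

// Hypothetical helper object, named here for illustration only.
object CvMetrics {
  // Pair each candidate ParamMap with its average metric and sort
  // best-first. Assumes a larger metric is better (e.g. areaUnderROC).
  def rank(paramMaps: Array[ParamMap],
           avgMetrics: Array[Double]): Array[(ParamMap, Double)] =
    paramMaps.zip(avgMetrics).sortBy { case (_, metric) => -metric }

  // Convenience wrapper: read both arrays from a fitted CrossValidatorModel.
  def rankModel(cvModel: CrossValidatorModel): Array[(ParamMap, Double)] =
    rank(cvModel.getEstimatorParamMaps, cvModel.avgMetrics)
}
```

With a fitted cvModel, something like CvMetrics.rankModel(cvModel).foreach { case (params, auc) => println(s"$auc $params") } would list every grid point with its averaged AUC.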

How to extract bestModel parameters from a CrossValidatorModel

2016-09-29 Thread Rich Tarro
I'm able to successfully extract parameters from a PipelineModel using model.stages. However, when I try to extract parameters from the bestModel of a CrossValidatorModel using cvModel.bestModel.stages, I get this error. error: value stages is not a member of org.apache.spark.ml.Model[_$4]
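The compiler error arises because bestModel is declared as a generic Model[_], which has no stages member; the value has to be narrowed to PipelineModel first. Here is a minimal sketch of one way to do that, assuming the CrossValidator's estimator was a Pipeline (the snippet does not say so explicitly). The object name BestModelParams and the helpers bestPipeline and describeStages are hypothetical.

```scala
import org.apache.spark.ml.PipelineModel
import org.apache.spark.ml.tuning.CrossValidatorModel

// Hypothetical helper object, named here for illustration only.
object BestModelParams {
  // bestModel is typed as Model[_], so `stages` is only visible after
  // narrowing the type. A pattern match avoids a ClassCastException when
  // the tuned estimator was not a Pipeline.
  def bestPipeline(cvModel: CrossValidatorModel): Option[PipelineModel] =
    cvModel.bestModel match {
      case pipeline: PipelineModel => Some(pipeline)
      case _                       => None
    }

  // One description per stage: the stage uid followed by its parameters
  // as rendered by explainParams().
  def describeStages(cvModel: CrossValidatorModel): Seq[String] =
    bestPipeline(cvModel).toSeq
      .flatMap(_.stages.toSeq)
      .map(stage => s"${stage.uid}:\n${stage.explainParams()}")
}
```

If the estimator is known to be a Pipeline, cvModel.bestModel.asInstanceOf[PipelineModel].stages is the shorter equivalent; the pattern match merely fails more gracefully.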

Model Persistence

2016-08-18 Thread Rich Tarro
The following Databricks blog on Model Persistence states "Internally, we save the model metadata and parameters as JSON and the data as Parquet." https://databricks.com/blog/2016/05/31/apache-spark-2-0-preview-machine-learning-model-persistence.html What data associated with a model or

Re: Random Forest Classification

2016-07-01 Thread Rich Tarro
; > On Tue, Jun 28, 2016 at 4:40 PM, Rich Tarro <richta...@gmail.com> wrote: > >> Thanks for the response, but in my case I reversed the meaning of >> "prediction" and "predictedLabel". It seemed to make more sense to me that >> way, but in ret

Re: Random Forest Classification

2016-06-28 Thread Rich Tarro
ork on the DataFrame of predictions. Any other suggestions? Thanks. On Tue, Jun 28, 2016 at 4:21 PM, Rich Tarro <richta...@gmail.com> wrote: > I created an ML pipeline using the Random Forest Classifier - similar to > what is described here except in my case the source data is in CSV format &

Random Forest Classification

2016-06-28 Thread Rich Tarro
I created an ML pipeline using the Random Forest Classifier, similar to what is described here, except that in my case the source data is in CSV format rather than LIBSVM. https://spark.apache.org/docs/latest/ml-classification-regression.html#random-forest-classifier I am able to successfully train