According to the documentation, cvModel.avgMetrics gives average
cross-validation metrics for each paramMap in
CrossValidator.estimatorParamMaps,
in the corresponding order.
So when using areaUnderROC as the evaluator, cvModel.avgMetrics gives (in
this example using Scala, but API appears to work
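
A minimal sketch of how each average metric could be lined up with the param map it belongs to, assuming cvModel is an already fitted CrossValidatorModel. The object and helper names (CvMetrics, metricsByParamMap, printRanked) are hypothetical and only for illustration. The pairing relies on the documented guarantee that avgMetrics follows the order of estimatorParamMaps.

```scala
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.ml.tuning.CrossValidatorModel

object CvMetrics {
  // Hypothetical helper: pair each candidate ParamMap with its average
  // cross-validation metric. Both arrays are in the same order.
  def metricsByParamMap(cvModel: CrossValidatorModel): Array[(ParamMap, Double)] =
    cvModel.getEstimatorParamMaps.zip(cvModel.avgMetrics)

  // Print the candidates, best first. Assumes a larger metric is better,
  // which holds for areaUnderROC but not for error metrics such as RMSE.
  def printRanked(cvModel: CrossValidatorModel): Unit =
    metricsByParamMap(cvModel)
      .sortBy { case (_, metric) => -metric }
      .foreach { case (params, metric) => println(s"$metric\t$params") }
}
```

Calling CvMetrics.printRanked(cvModel) in spark-shell should list one line per param map, which makes it easier to see which setting produced which areaUnderROC.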
I'm able to successfully extract parameters from a PipelineModel using
model.stages. However, when I try to extract parameters from the bestModel
of a CrossValidatorModel using cvModel.bestModel.stages, I get this error:
error: value stages is not a member of org.apache.spark.ml.Model[_$4]
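
One likely cause: bestModel is declared with the general type Model[_], so the compiler cannot see the stages member even when the fitted object is really a PipelineModel. A minimal sketch of a workaround, assuming the CrossValidator's estimator was a Pipeline. The object and method names (BestModelStages, printBestStageParams) are hypothetical.

```scala
import org.apache.spark.ml.PipelineModel
import org.apache.spark.ml.tuning.CrossValidatorModel

object BestModelStages {
  // Narrow bestModel from Model[_] to PipelineModel before touching stages.
  // A pattern match is used instead of asInstanceOf so that a non-pipeline
  // estimator is reported rather than failing with a ClassCastException.
  def printBestStageParams(cvModel: CrossValidatorModel): Unit =
    cvModel.bestModel match {
      case pipeline: PipelineModel =>
        pipeline.stages.foreach { stage =>
          // extractParamMap() returns the params set on this stage,
          // including defaults.
          println(s"${stage.uid}:\n${stage.extractParamMap()}")
        }
      case other =>
        println(s"bestModel is a ${other.getClass.getName}, not a PipelineModel")
    }
}
```

If the estimator is known to be a Pipeline, cvModel.bestModel.asInstanceOf[PipelineModel].stages is the shorter form of the same idea.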
The following Databricks blog on Model Persistence states "Internally, we
save the model metadata and parameters as JSON and the data as Parquet."
https://databricks.com/blog/2016/05/31/apache-spark-2-0-preview-machine-learning-model-persistence.html
What data associated with a model or
> On Tue, Jun 28, 2016 at 4:40 PM, Rich Tarro <richta...@gmail.com> wrote:
>
>> Thanks for the response, but in my case I reversed the meaning of
>> "prediction" and "predictedLabel". It seemed to make more sense to me that
>> way, but in ret
work on the
DataFrame of predictions.
Any other suggestions? Thanks.
On Tue, Jun 28, 2016 at 4:21 PM, Rich Tarro <richta...@gmail.com> wrote:
> I created an ML pipeline using the Random Forest Classifier - similar to
> what is described here except in my case the source data is in csv format
I created an ML pipeline using the Random Forest Classifier - similar to
what is described here except in my case the source data is in csv format
rather than libsvm.
https://spark.apache.org/docs/latest/ml-classification-regression.html#random-forest-classifier
I am able to successfully train