According to the documentation, cvModel.avgMetrics gives average cross-validation metrics for each paramMap in CrossValidator.estimatorParamMaps, in the corresponding order.
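If the ordering is literal (i.e., avgMetrics(i) pairs with estimatorParamMaps(i)), you should be able to zip the two arrays to see which metric belongs to which hyperparameter combination. A minimal sketch, assuming cvModel is an already-fitted CrossValidatorModel:

  // Pair each hyperparameter combination with its average CV metric;
  // assumes avgMetrics(i) lines up with getEstimatorParamMaps(i).
  val paramsAndMetrics = cvModel.getEstimatorParamMaps.zip(cvModel.avgMetrics)

  paramsAndMetrics.foreach { case (paramMap, metric) =>
    println(s"avg metric = $metric for:\n$paramMap")
  }

  // Best combination, for a metric where larger is better (e.g. areaUnderROC):
  val (bestParams, bestMetric) = paramsAndMetrics.maxBy(_._2)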
So when using areaUnderROC as the evaluator, cvModel.avgMetrics gives something like the following (this example is Scala, but the API appears to work the same in PySpark):

  Array(0.8706074097889074, 0.9409529716549123, 0.9618787730606256,
        0.8838019837612303, 0.9397610587835981, 0.9591275359721634,
        0.8829088978012987, 0.9394137261180164, 0.9584085992609841,
        0.8706074097889079, 0.9628051240960216, 0.9827490959747656,
        0.8838019837612294, 0.9636100965080932, 0.9826906885021736,
        0.8829088978013016, 0.9627072956991051, 0.9809166441709806,
        0.8508340706851226, 0.7325352788119097, 0.7208072472539231,
        0.8553496724213554, 0.7354481892254211, 0.7251511314439787,
        0.8546551939595262, 0.7358349987841173, 0.7251408416244391)

My understanding is that each value is the average areaUnderROC across the folds for one parameter-grid combination used during cross-validation (27 values here, so 27 combinations). I have yet to confirm exactly what "in the corresponding order" means, but if it is the literal index order, zipping the two arrays as in the sketch above shows which areaUnderROC value corresponds to which set of hyperparameters.

Hopefully this is what you are looking for.

On Thu, Sep 29, 2016 at 3:18 PM, evanzamir <zamir.e...@gmail.com> wrote:

> I'm using CrossValidator (in PySpark) to create a logistic regression
> model. There is "areaUnderROC", which I assume gives the AUC for the
> bestModel chosen by CV. But how do I get the areaUnderROC for the test
> data during cross-validation?
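To make that concrete, here is a rough end-to-end sketch of a setup that would produce an avgMetrics array like the one above. The estimator, grid values, and names such as training are placeholders, not taken from this thread; a 3 x 3 x 3 grid is used only because it yields 27 combinations:

  import org.apache.spark.ml.classification.LogisticRegression
  import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
  import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

  val lr = new LogisticRegression()

  // areaUnderROC is this evaluator's default metric; set it explicitly for clarity.
  val evaluator = new BinaryClassificationEvaluator().setMetricName("areaUnderROC")

  // Hypothetical 3 x 3 x 3 grid -> 27 param maps -> 27 entries in avgMetrics.
  val paramGrid = new ParamGridBuilder()
    .addGrid(lr.regParam, Array(0.01, 0.1, 1.0))
    .addGrid(lr.elasticNetParam, Array(0.0, 0.5, 1.0))
    .addGrid(lr.maxIter, Array(10, 50, 100))
    .build()

  val cv = new CrossValidator()
    .setEstimator(lr)
    .setEvaluator(evaluator)
    .setEstimatorParamMaps(paramGrid)
    .setNumFolds(3)

  // training is a placeholder DataFrame with "label" and "features" columns.
  val cvModel = cv.fit(training)

  // One average areaUnderROC per grid combination, in paramGrid order.
  cvModel.avgMetrics.foreach(println)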