Paul Shearer created SPARK-14740:
------------------------------------

             Summary: CrossValidatorModel.bestModel does not include hyper-parameters
                 Key: SPARK-14740
                 URL: https://issues.apache.org/jira/browse/SPARK-14740
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 1.6.1
            Reporter: Paul Shearer

If you tune hyperparameters using a CrossValidator object in PySpark, you may not be able to extract the parameter values of the best model.

    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.evaluation import BinaryClassificationEvaluator
    from pyspark.mllib.linalg import Vectors
    from pyspark.ml.tuning import ParamGridBuilder, CrossValidator

    dataset = sqlContext.createDataFrame(
        [(Vectors.dense([0.0]), 0.0),
         (Vectors.dense([0.4]), 1.0),
         (Vectors.dense([0.5]), 0.0),
         (Vectors.dense([0.6]), 1.0),
         (Vectors.dense([1.0]), 1.0)] * 10,
        ["features", "label"])
    lr = LogisticRegression()
    grid = ParamGridBuilder().addGrid(lr.regParam, [0.1, 0.01, 0.001, 0.0001]).build()
    evaluator = BinaryClassificationEvaluator()
    cv = CrossValidator(estimator=lr, estimatorParamMaps=grid, evaluator=evaluator)
    cvModel = cv.fit(dataset)

I can get the regression coefficient out, but I can't get the regularization parameter:

    In [3]: cvModel.bestModel.coefficients
    Out[3]: DenseVector([3.1573])

    In [4]: cvModel.bestModel.explainParams()
    Out[4]: ''

    In [5]: cvModel.bestModel.extractParamMap()
    Out[5]: {}

    In [15]: cvModel.params
    Out[15]: []

    In [36]: cvModel.bestModel.params
    Out[36]: []

For a simple example please see http://stackoverflow.com/questions/36697304/how-to-extract-model-hyper-parameters-from-spark-ml-in-pyspark
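
Until bestModel reports its own parameters, one possible workaround is to recover the winning grid entry from per-candidate metrics. The following is only a sketch under stated assumptions, not a fix for this issue. The helper `best_param_map` is hypothetical (not part of PySpark), and the metric values are made-up stand-ins. It also assumes a list of average metrics, one per grid entry, is available. As far as I know, later PySpark releases expose this as `cvModel.avgMetrics`, while the 1.6.1 release reported here may not, so treat that attribute as an assumption.

```python
def best_param_map(param_maps, avg_metrics, larger_is_better=True):
    """Hypothetical helper: return (param_map, metric) for the best candidate.

    param_maps  -- list of parameter maps, in the same order as the grid
    avg_metrics -- list of average cross-validation metrics, one per map
    """
    if not param_maps:
        raise ValueError("param_maps is empty")
    if len(param_maps) != len(avg_metrics):
        raise ValueError("param_maps and avg_metrics must have the same length")
    # Pick the index of the best metric; ties resolve to the first candidate.
    pick = max if larger_is_better else min
    best_index = pick(range(len(avg_metrics)), key=lambda i: avg_metrics[i])
    return param_maps[best_index], avg_metrics[best_index]


# Stand-in data: plain dicts instead of Spark param maps, and invented
# metric values, used only to show the selection logic.
grid = [{"regParam": 0.1}, {"regParam": 0.01},
        {"regParam": 0.001}, {"regParam": 0.0001}]
metrics = [0.71, 0.74, 0.73, 0.72]

best_params, best_metric = best_param_map(grid, metrics)
print(best_params, best_metric)
```

With a fitted model, the call would presumably look like `best_param_map(cv.getEstimatorParamMaps(), cvModel.avgMetrics)`, again assuming `avgMetrics` exists in the PySpark version in use. BinaryClassificationEvaluator defaults to areaUnderROC, where larger is better, so the default `larger_is_better=True` fits the example in this report.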