I used CrossValidator to tune the regularization parameters. My code is here:

from pyspark.ml.classification import LogisticRegression
from pyspark.ml.tuning import ParamGridBuilder, CrossValidator
from pyspark.ml.evaluation import BinaryClassificationEvaluator

reg = 100.0  # note: defined but never used below
lr = LogisticRegression(maxIter=500)
paramGrid = ParamGridBuilder() \
    .addGrid(lr.regParam, [0.02, 0.01, 0.2, 0.1, 1.0, 2.0, 10.0, 15.0, 20.0, 100.0]) \
    .addGrid(lr.elasticNetParam, [0.0, 0.5, 1.0]) \
    .build()
crossval = CrossValidator(estimator=lr,
                          estimatorParamMaps=paramGrid,
                          evaluator=BinaryClassificationEvaluator(),
                          numFolds=3)
model = crossval.fit(data_train_df)

And finally I predicted the values:

prediction = model.transform(data_test_df)
prediction.select("label", "prediction").show()

+-----+----------+
|label|prediction|
+-----+----------+
|  1.0|       1.0|
|  1.0|       1.0|
|  1.0|       1.0|
|  1.0|       1.0|
|  1.0|       1.0|
|  0.0|       1.0|
|  0.0|       1.0|
|  0.0|       1.0|
|  0.0|       1.0|
|  0.0|       1.0|
|  0.0|       1.0|
|  0.0|       1.0|
|  0.0|       1.0|
|  0.0|       1.0|
|  0.0|       1.0|
|  0.0|       1.0|
|  0.0|       1.0|
+-----+----------+

Why does the model predict 1.0 for every row, even where the true label is 0.0?

--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Regularized-Logistic-regression-tp19432p19448.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.