zhengruifeng created SPARK-18739:
------------------------------------

             Summary: Models in pyspark.classification support setXXXCol methods
                 Key: SPARK-18739
                 URL: https://issues.apache.org/jira/browse/SPARK-18739
             Project: Spark
          Issue Type: Improvement
          Components: ML, PySpark
            Reporter: zhengruifeng

Currently, models in pyspark don't support {{setXXXCol}} methods at all.

I updated the models in {{classification.py}} according to the hierarchy on the Scala side:
1, add {{setFeaturesCol}} and {{setPredictionCol}} to class {{JavaPredictionModel}}
2, add {{setRawPredictionCol}} to class {{JavaClassificationModel}}
3, create class {{JavaProbabilisticClassificationModel}}, which inherits from {{JavaClassificationModel}}, and add {{setProbabilityCol}} to it
4, {{LogisticRegressionModel}}, {{DecisionTreeClassificationModel}}, {{RandomForestClassificationModel}} and {{NaiveBayesModel}} inherit from {{JavaProbabilisticClassificationModel}}
5, {{GBTClassificationModel}} and {{MultilayerPerceptronClassificationModel}} inherit from {{JavaClassificationModel}}
6, {{OneVsRestModel}} inherits from {{JavaModel}} and adds the {{setFeaturesCol}} and {{setPredictionCol}} methods.

With regard to the clustering and feature models, I suggest that we first add some abstract classes like {{ClusteringModel}}, {{ProbabilisticClusteringModel}} and {{FeatureModel}} on the Scala side; otherwise we need to add the setXXXCol methods manually, one by one.
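
The hierarchy in points 1 to 6 can be sketched in plain Python roughly as follows. This is only an illustration, not the actual patch: the classes are stand-ins that keep column names in a dict instead of delegating to the JVM as the real pyspark wrappers do, and the {{_set}} / {{getOrDefault}} helpers and the default column names used here are assumptions made for the example. Only two of the concrete models are shown.

```python
class JavaModel:
    """Stand-in for pyspark's JavaModel; params live in a plain dict (assumption)."""

    def __init__(self):
        self._params = {}

    def _set(self, **kwargs):
        self._params.update(kwargs)
        return self  # returning self lets setters be chained

    def getOrDefault(self, name):
        return self._params[name]


class JavaPredictionModel(JavaModel):
    # Point 1: setFeaturesCol and setPredictionCol live here.
    def __init__(self):
        super().__init__()
        self._set(featuresCol="features", predictionCol="prediction")

    def setFeaturesCol(self, value):
        return self._set(featuresCol=value)

    def setPredictionCol(self, value):
        return self._set(predictionCol=value)


class JavaClassificationModel(JavaPredictionModel):
    # Point 2: adds setRawPredictionCol.
    def __init__(self):
        super().__init__()
        self._set(rawPredictionCol="rawPrediction")

    def setRawPredictionCol(self, value):
        return self._set(rawPredictionCol=value)


class JavaProbabilisticClassificationModel(JavaClassificationModel):
    # Point 3: adds setProbabilityCol.
    def __init__(self):
        super().__init__()
        self._set(probabilityCol="probability")

    def setProbabilityCol(self, value):
        return self._set(probabilityCol=value)


class LogisticRegressionModel(JavaProbabilisticClassificationModel):
    """Point 4: a probabilistic classifier inherits all four setters."""


class GBTClassificationModel(JavaClassificationModel):
    """Point 5: no probability column, so no setProbabilityCol."""


class OneVsRestModel(JavaModel):
    # Point 6: not a JavaPredictionModel, so the two setters are added by hand.
    def __init__(self):
        super().__init__()
        self._set(featuresCol="features", predictionCol="prediction")

    def setFeaturesCol(self, value):
        return self._set(featuresCol=value)

    def setPredictionCol(self, value):
        return self._set(predictionCol=value)


# Usage: setters chain, and each model exposes only the columns it produces.
lr = LogisticRegressionModel().setFeaturesCol("f").setProbabilityCol("p")
gbt = GBTClassificationModel().setRawPredictionCol("raw")
ovr = OneVsRestModel().setPredictionCol("label_hat")
```

Placing each setter on the most general class that owns the corresponding column means a concrete model gets the right set of methods just by choosing its base class, which is the same reason given above for wanting abstract clustering and feature model classes before extending this to other modules.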