zhengruifeng created SPARK-18757:
------------------------------------

             Summary: Models in PySpark support column setters
                 Key: SPARK-18757
                 URL: https://issues.apache.org/jira/browse/SPARK-18757
             Project: Spark
          Issue Type: Brainstorming
          Components: ML, PySpark
            Reporter: zhengruifeng
Recently, I found several models in which column setters are missing, e.g. KMeansModel and BisectingKMeansModel. These models directly inherit {{Model}}, which has no column setters, so I had to add the missing setters manually in [SPARK-18625] and [SPARK-18520]. For now, many models in PySpark still don't support column setters.

I suggest that we keep the hierarchy of PySpark models in line with the one on the Scala side:

For classification and regression algs, I'm making a trial in [SPARK-18379].

For clustering algs, I think we may first create abstract classes {{ClusteringModel}} and {{ProbabilisticClusteringModel}} on the Scala side, and make the clustering algs inherit them. Then, on the Python side, we copy the hierarchy so that we don't need to add setters for each alg separately.

For feature algs, we can do the same thing with an abstract class {{FeatureModel}} on the Scala side.

What are your opinions? [~yanboliang][~josephkb][~sethah][~srowen]

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
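To illustrate the idea, here is a minimal, self-contained Python sketch of the proposed hierarchy. The class names {{ClusteringModel}} follow the proposal above, but the bodies are assumptions for illustration only, not the actual pyspark.ml implementation: the point is that the column setters are defined once on an abstract base and inherited by every concrete model.

```python
# Hypothetical sketch of the proposed hierarchy -- NOT the real pyspark.ml code.
# Column setters live on one shared base class, so concrete models such as a
# KMeans-style model inherit them instead of redefining them one by one.

class ClusteringModel:
    """Hypothetical abstract base providing the shared column setters."""

    def __init__(self):
        # Default column names, mirroring pyspark.ml conventions.
        self.featuresCol = "features"
        self.predictionCol = "prediction"

    def setFeaturesCol(self, value):
        self.featuresCol = value
        return self  # return self so calls can be chained, as pyspark setters do

    def setPredictionCol(self, value):
        self.predictionCol = value
        return self


class ExampleKMeansModel(ClusteringModel):
    """Stand-in concrete model: no per-model setter code needed."""
    pass


class ExampleBisectingKMeansModel(ClusteringModel):
    """A second model inheriting the same setters for free."""
    pass


if __name__ == "__main__":
    model = ExampleKMeansModel()
    model.setFeaturesCol("scaled_features").setPredictionCol("cluster")
    print(model.featuresCol, model.predictionCol)
```

With such a base class mirrored on the Python side, adding a new clustering model would no longer require manually re-adding setters, which is exactly the gap that [SPARK-18625] and [SPARK-18520] had to patch model by model.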