zhengruifeng created SPARK-18757:
------------------------------------

             Summary: Models in Pyspark support column setters
                 Key: SPARK-18757
                 URL: https://issues.apache.org/jira/browse/SPARK-18757
             Project: Spark
          Issue Type: Brainstorming
          Components: ML, PySpark
            Reporter: zhengruifeng


Recently, I found three places in which column setters are missing: 
KMeansModel, BisectingKMeansModel and BisectingKMeansModel.
These three models inherit directly from `Model`, which doesn't have column setters, so 
I had to add the missing setters manually in [SPARK-18625] and [SPARK-18520].
For now, models in PySpark still don't support column setters at all.
I suggest that we keep the hierarchy of PySpark models in line with that on the 
Scala side:
For classification and regression algorithms, I'm making a trial in [SPARK-18379].
For clustering algorithms, I think we could first create abstract classes 
{{ClusteringModel}} and {{ProbabilisticClusteringModel}}, and make the clustering 
algorithms inherit from them. Then, on the Python side, we copy the hierarchy so that 
we don't need to add setters for each algorithm.
For feature algorithms, we can also use an abstract class {{FeatureModel}} on the 
Scala side, and do the same thing.
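
To make the clustering part of the idea concrete, here is a rough sketch of how the Python side could look if the setters lived in shared base classes. It is plain Python, not the real pyspark.ml code: the `Model` stand-in, its `_set`/`getOrDefault` helpers, the default column names, and the use of `GaussianMixtureModel` as the probabilistic example are all assumptions made for illustration only.

```python
# Hypothetical sketch only: these classes are NOT the real pyspark.ml API.

class Model:
    """Stand-in for a base model that keeps params in a plain dict."""

    def __init__(self):
        self._params = {}

    def _set(self, **kwargs):
        # Store the given params and return self so setters can be chained.
        self._params.update(kwargs)
        return self

    def getOrDefault(self, name):
        return self._params[name]


class ClusteringModel(Model):
    """Assumed abstract base that owns the shared column setters."""

    def __init__(self):
        super().__init__()
        # Assumed default column names.
        self._set(featuresCol="features", predictionCol="prediction")

    def setFeaturesCol(self, value):
        return self._set(featuresCol=value)

    def setPredictionCol(self, value):
        return self._set(predictionCol=value)


class ProbabilisticClusteringModel(ClusteringModel):
    """Assumed base for clustering models that also output probabilities."""

    def __init__(self):
        super().__init__()
        self._set(probabilityCol="probability")

    def setProbabilityCol(self, value):
        return self._set(probabilityCol=value)


# Concrete models inherit the setters instead of redefining them.
class KMeansModel(ClusteringModel):
    pass


class GaussianMixtureModel(ProbabilisticClusteringModel):
    pass


if __name__ == "__main__":
    km = KMeansModel().setFeaturesCol("scaled").setPredictionCol("cluster")
    print(km.getOrDefault("featuresCol"), km.getOrDefault("predictionCol"))
```

With a layout like this, a new clustering model would get its column setters by inheritance, which is the point of mirroring the hierarchy.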

What are your opinions? [~yanboliang] [~josephkb] [~sethah] [~srowen]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
