Hello all,

I have been looking at some of the missing items for complete feature parity between spark.ml and spark.mllib. Here is a proposal for porting mllib.stats, the descriptive statistics package:
https://docs.google.com/document/d/1ELVpGV3EBjc2KQPLN9_9_Ge9gWchPZ6SGtDW5tTm_50/edit?usp=sharing

The umbrella ticket for this task is:
https://issues.apache.org/jira/browse/SPARK-4591

Please comment on the document. Also, if you want to work on one of the algorithms, the design doc and the umbrella ticket have subtasks that you can assign yourself to.

The cutoff deadline for Spark 2.2 is rapidly approaching, and it would be great if we could claim parity for this release!

Cheers

Tim