[jira] [Commented] (SPARK-14523) Feature parity for Statistics ML with MLlib
[ https://issues.apache.org/jira/browse/SPARK-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16088254#comment-16088254 ]

Apache Spark commented on SPARK-14523:
--------------------------------------

User 'WeichenXu123' has created a pull request for this issue:
https://github.com/apache/spark/pull/14950

> Feature parity for Statistics ML with MLlib
> -------------------------------------------
>
> Key: SPARK-14523
> URL: https://issues.apache.org/jira/browse/SPARK-14523
> Project: Spark
> Issue Type: Sub-task
> Components: ML
> Reporter: yuhao yang
>
> Some statistics functions have been supported by DataFrame directly. Use this
> jira to discuss/design the statistics package in Spark.ML and its function
> scope. Hypothesis test and correlation computation may still need to expose
> independent interfaces.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Commented] (SPARK-14523) Feature parity for Statistics ML with MLlib
[ https://issues.apache.org/jira/browse/SPARK-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881567#comment-15881567 ]

Joseph K. Bradley commented on SPARK-14523:
-------------------------------------------

Alright, given that there are now 3 more subtasks for stats, I'll close this one in favor of those other 3.

[jira] [Commented] (SPARK-14523) Feature parity for Statistics ML with MLlib
[ https://issues.apache.org/jira/browse/SPARK-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866295#comment-15866295 ]

Timothy Hunter commented on SPARK-14523:
----------------------------------------

Also, the correlation is missing the multivariate case. I will take this task over unless someone else expresses interest.

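To make the "multivariate case" in the comment above concrete: a pairwise correlation call returns one coefficient for two columns, while the multivariate form returns the full correlation matrix across all columns. The sketch below illustrates that idea in plain Python. It is not Spark code, and `correlation_matrix` and the sample `rows` are hypothetical names chosen only for this example.

```python
import math


def correlation_matrix(rows):
    """Return the Pearson correlation matrix of the columns of `rows`.

    `rows` is a list of equal-length lists of floats (one list per observation).
    Assumes no column is constant, otherwise a norm below would be zero.
    """
    n = len(rows)
    d = len(rows[0])
    # Per-column means.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    # Center every observation around the column means.
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Euclidean norm of each centered column.
    norms = [math.sqrt(sum(c[j] ** 2 for c in centered)) for j in range(d)]
    # Entry (a, b) is the normalized dot product of centered columns a and b.
    return [
        [sum(c[a] * c[b] for c in centered) / (norms[a] * norms[b]) for b in range(d)]
        for a in range(d)
    ]


# Hypothetical sample data: column 1 is twice column 0, column 2 decreases.
rows = [
    [1.0, 2.0, 9.0],
    [2.0, 4.0, 7.0],
    [3.0, 6.0, 4.0],
    [4.0, 8.0, 1.0],
]
m = correlation_matrix(rows)
```

Here `m[0][1]` comes out at (approximately) 1 because the second column is a linear function of the first, and `m[0][2]` is negative because the third column falls as the first rises.
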
[jira] [Commented] (SPARK-14523) Feature parity for Statistics ML with MLlib
[ https://issues.apache.org/jira/browse/SPARK-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15861789#comment-15861789 ]

Joseph K. Bradley commented on SPARK-14523:
-------------------------------------------

I'd like to keep this open until we have linked tasks for the missing functionality.

[~hujiayin] This is for parity w.r.t. the RDD-based API, not for adding new functionality to MLlib. I think there's already a JIRA for ARIMA somewhere.

[jira] [Commented] (SPARK-14523) Feature parity for Statistics ML with MLlib
[ https://issues.apache.org/jira/browse/SPARK-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15252162#comment-15252162 ]

yuhao yang commented on SPARK-14523:
------------------------------------

||mllib.statistics||ml||
|colStats|column|
|corr|Pearson only|
|chiSqTest|seems to see it somewhere in dataFrame|
|kolmogorovSmirnovTest|missing|
|StreamingTest|missing|

[~josephkb] As you suggested, we can add the function to ml via DataFrame API.

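As a rough illustration of what the "Pearson only" entry for corr leaves out: the RDD-based `Statistics.corr` also accepts a Spearman method, which correlates ranks rather than raw values. The plain-Python sketch below shows the difference on a monotonic but non-linear relationship. It is not Spark code, and the helper names `pearson`, `ranks`, and `spearman` are hypothetical, introduced only for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical data: y grows monotonically with x, but not linearly.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [x ** 3 for x in xs]
p = pearson(xs, ys)
s = spearman(xs, ys)
```

On this data `s` is (approximately) 1 because the ordering of `ys` matches the ordering of `xs` exactly, while `p` stays below 1 because the relationship is not linear.
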
[jira] [Commented] (SPARK-14523) Feature parity for Statistics ML with MLlib
[ https://issues.apache.org/jira/browse/SPARK-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242661#comment-15242661 ]

hujiayin commented on SPARK-14523:
----------------------------------

We can add ARIMA to Spark.