[jira] [Commented] (SPARK-14523) Feature parity for Statistics ML with MLlib

2017-07-14 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16088254#comment-16088254
 ] 

Apache Spark commented on SPARK-14523:
--

User 'WeichenXu123' has created a pull request for this issue:
https://github.com/apache/spark/pull/14950

> Feature parity for Statistics ML with MLlib
> ---
>
> Key: SPARK-14523
> URL: https://issues.apache.org/jira/browse/SPARK-14523
> Project: Spark
> Issue Type: Sub-task
> Components: ML
> Reporter: yuhao yang
>
> Some statistics functions are already supported directly by DataFrame. Use this
> JIRA to discuss and design the statistics package in Spark.ML and the scope of
> its functions. Hypothesis tests and correlation computation may still need to
> expose independent interfaces.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-14523) Feature parity for Statistics ML with MLlib

2017-02-23 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881567#comment-15881567
 ] 

Joseph K. Bradley commented on SPARK-14523:
---

Alright, given that there are now 3 more subtasks for stats, I'll close this 
one in favor of those other 3.




[jira] [Commented] (SPARK-14523) Feature parity for Statistics ML with MLlib

2017-02-14 Thread Timothy Hunter (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866295#comment-15866295
 ] 

Timothy Hunter commented on SPARK-14523:


Also, the correlation is missing the multivariate case.

I will take over this task unless someone else expresses interest.
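
For context on the multivariate case: a pairwise correlation takes two columns and returns a single number, while the multivariate form takes a set of feature vectors and returns the full correlation matrix. The sketch below shows that difference in plain Python rather than Spark; `pearson` and `correlation_matrix` are hypothetical helper names used only for illustration and are not part of any Spark API.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pairwise case: Pearson correlation between two equal-length columns."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Multivariate case: correlate every pair of features across the rows.

    Hypothetical helper for illustration; each row is one feature vector.
    """
    columns = list(zip(*rows))  # transpose rows into per-feature columns
    return [[pearson(ci, cj) for cj in columns] for ci in columns]


# Made-up sample data: the second feature is exactly twice the first.
rows = [(1.0, 2.0, 9.0), (2.0, 4.0, 7.0), (3.0, 6.0, 2.0), (4.0, 8.0, 1.0)]
matrix = correlation_matrix(rows)
```

Here `matrix[0][1]` is close to 1.0 because the second feature is a multiple of the first, and `matrix[0][2]` is negative because the third feature falls as the first rises.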




[jira] [Commented] (SPARK-14523) Feature parity for Statistics ML with MLlib

2017-02-10 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15861789#comment-15861789
 ] 

Joseph K. Bradley commented on SPARK-14523:
---

I'd like to keep this open until we have linked tasks for the missing 
functionality.

[~hujiayin] This is for parity w.r.t. the RDD-based API, not for adding new 
functionality to MLlib.  I think there's already a JIRA for ARIMA somewhere.




[jira] [Commented] (SPARK-14523) Feature parity for Statistics ML with MLlib

2016-04-21 Thread yuhao yang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15252162#comment-15252162
 ] 

yuhao yang commented on SPARK-14523:


||mllib.statistics||ml||
|colStats|column|
|corr|Pearson only|
|chiSqTest|I seem to have seen it somewhere in DataFrame|
|kolmogorovSmirnovTest|missing|
|StreamingTest|missing|

[~josephkb] As you suggested, we can add these functions to ml via the DataFrame API.
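
For readers unfamiliar with the Kolmogorov-Smirnov test listed as missing above, here is a minimal plain-Python sketch of the one-sample, two-sided KS statistic, which is the largest gap between the empirical CDF of a sample and a reference CDF. It is an illustration only: `ks_statistic` and `standard_normal_cdf` are hypothetical names, this is not how Spark implements the test, and it computes only the statistic, not a p-value.

```python
import math
from typing import Callable, Sequence


def ks_statistic(sample: Sequence[float], cdf: Callable[[float], float]) -> float:
    """Largest distance between the sample's empirical CDF and `cdf`.

    Hypothetical helper for illustration; assumes a non-empty sample.
    """
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x, so check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def standard_normal_cdf(x: float) -> float:
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Made-up sample compared against a standard normal reference distribution.
d_normal = ks_statistic([-1.2, -0.3, 0.1, 0.4, 1.5], standard_normal_cdf)
```

A small statistic suggests the sample is consistent with the reference distribution; turning it into a decision requires the KS distribution for the p-value, which this sketch leaves out.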





[jira] [Commented] (SPARK-14523) Feature parity for Statistics ML with MLlib

2016-04-15 Thread hujiayin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242661#comment-15242661
 ] 

hujiayin commented on SPARK-14523:
--

We can add ARIMA to Spark.
