You can use Spark MLlib:
http://spark.apache.org/docs/latest/ml-guide.html#announcement-dataframe-based-api-is-primary-api
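
For the t-test case specifically, here is a rough sketch of the per-group logic in plain Python. It is not taken from the MLlib guide. It assumes a one-sample t-test against a hypothesized mean, and the names one_sample_t, t_by_group and mu0 are made up for illustration. The point is that the t statistic needs only the count, mean and sample standard deviation of each group, so in Spark those three values could come from ordinary grouped aggregates instead of collecting every group into an array.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def one_sample_t(n, sample_mean, sample_std, mu0=0.0):
    """t statistic for H0: population mean == mu0, from summary statistics only."""
    return (sample_mean - mu0) / (sample_std / math.sqrt(n))


def t_by_group(rows, mu0=0.0):
    """rows: iterable of (key, value) pairs; returns {key: t statistic}.

    Each group needs at least two values with non-zero spread,
    otherwise the sample standard deviation is undefined or zero.
    """
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return {
        key: one_sample_t(len(vals), mean(vals), stdev(vals), mu0)
        for key, vals in groups.items()
    }


# Hypothetical toy data: two groups, "a" and "b".
rows = [("a", 1.0), ("a", 2.0), ("a", 3.0), ("b", 4.0), ("b", 6.0)]
result = t_by_group(rows, mu0=0.0)
```

Here the grouping is done in memory with a dict purely to keep the sketch self-contained. On a real DataFrame, only one_sample_t would be reused, fed with the per-group aggregates.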

On Tue, Aug 23, 2016 at 7:34 AM, Wen Pei Yu <yuw...@cn.ibm.com> wrote:

> Hi
>
> We have a DataFrame and want to group it and apply an ML algorithm or a
> statistic (say, a t-test) to each group. Is there an efficient way to do
> this?
>
> Currently, we switch to PySpark, use groupByKey, and apply a NumPy function
> to each array. But this isn't an efficient way, right?
>
> Regards.
> Wenpei.
>



-- 

Thanks & regards,
Nirmal

Team Lead - WSO2 Machine Learner
Associate Technical Lead - Data Technologies Team, WSO2 Inc.
Mobile: +94715779733
Blog: http://nirmalfdo.blogspot.com/
