On Tue, Aug 23, 2016 at 10:56 AM, Wen Pei Yu <yuw...@cn.ibm.com> wrote:

> We can group a dataframe by one column like
>
> df.groupBy(df.col("gender"))
>

On this DF, use one filter per group value to extract each group as a
separate DF. Then you can apply ML to each DF.

e.g.: xyzDF.filter(col("x").equalTo(x))
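
A minimal PySpark sketch of the filter-per-group idea, assuming the number of
distinct values in the grouping column is small enough to collect to the
driver. The helper names split_by_column and fit_per_group and the
make_estimator argument are made up for illustration; only select, distinct,
collect, filter and an estimator's fit are existing Spark calls.

```python
def split_by_column(df, column):
    """Return {distinct value: DataFrame filtered to that value}."""
    # Collect the distinct keys to the driver (assumes there are few of them).
    keys = [row[column] for row in df.select(column).distinct().collect()]
    # One lazy filter per key; nothing is computed until an action runs.
    return {key: df.filter(df[column] == key) for key in keys}


def fit_per_group(df, column, make_estimator):
    """Fit a fresh estimator (anything with .fit(df)) on each group's DataFrame."""
    return {
        key: make_estimator().fit(group_df)
        for key, group_df in split_by_column(df, column).items()
    }
```

For the "gender" example, split_by_column(df, "gender") would give one DF per
gender value, and fit_per_group(df, "gender", make_estimator) would fit one
model per group, where make_estimator is any function that returns a new
Spark ML estimator. Each fit runs as its own set of Spark jobs, so this is
likely only practical for a modest number of groups, and rows with a null
key are not matched by the equality filter.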

>
> It is like splitting a dataframe into multiple dataframes. Currently, we
> can only apply simple SQL functions such as agg, max, etc. to this
> GroupedData.
>
> What we want is to apply one ML algorithm to each group.
>
> Regards.
>
>
> From: Nirmal Fernando <nir...@wso2.com>
> To: Wen Pei Yu/China/IBM@IBMCN
> Cc: User <user@spark.apache.org>
> Date: 08/23/2016 01:14 PM
>
> Subject: Re: Apply ML to grouped dataframe
> ------------------------------
>
>
>
> Hi Wen,
>
> AFAIK Spark MLlib implements its machine learning algorithms on top of
> the Spark DataFrame API. What did you mean by a grouped dataframe?
>
> On Tue, Aug 23, 2016 at 10:42 AM, Wen Pei Yu <yuw...@cn.ibm.com> wrote:
>
>    Hi Nirmal
>
>    I didn't get your point.
>    Can you tell me more about how to apply MLlib to a grouped dataframe?
>
>    Regards.
>    Wenpei.
>
>
>    From: Nirmal Fernando <nir...@wso2.com>
>    To: Wen Pei Yu/China/IBM@IBMCN
>    Cc: User <user@spark.apache.org>
>    Date: 08/23/2016 10:26 AM
>    Subject: Re: Apply ML to grouped dataframe
>    ------------------------------
>
>
>
>
>    You can use Spark MLlib:
>    http://spark.apache.org/docs/latest/ml-guide.html#announcement-dataframe-based-api-is-primary-api
>
>    On Tue, Aug 23, 2016 at 7:34 AM, Wen Pei Yu <yuw...@cn.ibm.com> wrote:
>       Hi
>
>          We have a dataframe and want to group it and apply an ML algorithm
>          or a statistic (say, a t-test) to each group. Is there an efficient
>          way to do this?
>
>          Currently, we switch to PySpark, use groupByKey, and apply a NumPy
>          function to each group's array. But this isn't an efficient way, right?
>
>          Regards.
>          Wenpei.
>
>
>
>
>    --
>
>    Thanks & regards,
>    Nirmal
>
>    Team Lead - WSO2 Machine Learner
>    Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>    Mobile: +94715779733
>    Blog: http://nirmalfdo.blogspot.com/
>


-- 

Thanks & regards,
Nirmal

Team Lead - WSO2 Machine Learner
Associate Technical Lead - Data Technologies Team, WSO2 Inc.
Mobile: +94715779733
Blog: http://nirmalfdo.blogspot.com/
