Hi Nirmal

Filter works fine if I want to handle one of the grouped dataframes. But I
have multiple grouped dataframes, and I would like to apply an ML algorithm to
all of them in one job, not in a for loop.

Wenpei.
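
For comparison, a rough sketch of the "one job" pattern, using the RDD
groupByKey route mentioned in the original question. It assumes each group is
small enough to fit in memory on a single executor, and it uses a one-sample
t statistic from the Python standard library purely as a stand-in for whatever
per-group algorithm is wanted. The names one_sample_t and t_per_group, and the
"score" column in the usage line, are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu=0.0):
    """t statistic for H0: mean == mu.

    Needs at least two values with non-zero spread; otherwise the
    standard library raises StatisticsError or ZeroDivisionError.
    """
    xs = list(values)  # groupByKey hands over an iterable, not a list
    return (mean(xs) - mu) / (stdev(xs) / sqrt(len(xs)))


def t_per_group(pairs_rdd, mu=0.0):
    """pairs_rdd is assumed to be an RDD of (group_key, numeric_value) pairs.

    groupByKey gathers each group's values in one place and mapValues runs
    the plain Python function there, so every group is processed in a
    single Spark job instead of a driver-side loop.
    """
    return pairs_rdd.groupByKey().mapValues(lambda vs: one_sample_t(vs, mu))
```

With a DataFrame df that has "gender" and "score" columns, this could be
called as
t_per_group(df.rdd.map(lambda r: (r["gender"], float(r["score"])))).collect().
Each group is handled by ordinary single-machine Python code here, not by a
distributed MLlib algorithm.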



From:   Nirmal Fernando <nir...@wso2.com>
To:     Wen Pei Yu/China/IBM@IBMCN
Cc:     User <user@spark.apache.org>
Date:   08/23/2016 01:55 PM
Subject:        Re: Apply ML to grouped dataframe





On Tue, Aug 23, 2016 at 10:56 AM, Wen Pei Yu <yuw...@cn.ibm.com> wrote:
  We can group a dataframe by one column like

  df.groupBy(df.col("gender"))



On top of this DF, use a filter to extract each group as a separate DF.
Then you can apply ML on top of each DF.

eg: xyzDF.filter(col("x").equalTo(x))
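
A minimal PySpark sketch of the filter approach described above (PySpark
because that is what the original poster mentions using). The helper name
split_by_value is hypothetical, not a Spark API; select, distinct, collect and
filter are standard DataFrame methods.

```python
def split_by_value(df, column):
    """Return a dict mapping each distinct value of `column` to a filtered DataFrame."""
    # One small job to find the distinct group values on the driver.
    values = [row[0] for row in df.select(column).distinct().collect()]
    # One lazily evaluated DataFrame per value; nothing is computed
    # until an action (for example fitting a model) runs on it.
    return {value: df.filter(df[column] == value) for value in values}
```

Usage would look like split_by_value(df, "gender")["F"], after which a model
can be fitted on each returned DataFrame. This still means a driver-side loop
over the groups, with separate Spark jobs per group.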

  It is like splitting a dataframe into multiple dataframes. Currently, we
  can only apply simple SQL functions such as agg, max, etc. to this
  GroupedData.

  What we want is to apply one ML algorithm to each group.

  Regards.


  From: Nirmal Fernando <nir...@wso2.com>
  To: Wen Pei Yu/China/IBM@IBMCN
  Cc: User <user@spark.apache.org>
  Date: 08/23/2016 01:14 PM
  Subject: Re: Apply ML to grouped dataframe



  Hi Wen,

  AFAIK Spark MLlib implements its machine learning algorithms on top of
  the Spark DataFrame API. What do you mean by a grouped dataframe?

  On Tue, Aug 23, 2016 at 10:42 AM, Wen Pei Yu <yuw...@cn.ibm.com> wrote:
        Hi Nirmal

        I didn't get your point.
        Can you tell me more about how to apply MLlib to a grouped dataframe?

        Regards.
        Wenpei.



        From: Nirmal Fernando <nir...@wso2.com>
        To: Wen Pei Yu/China/IBM@IBMCN
        Cc: User <user@spark.apache.org>
        Date: 08/23/2016 10:26 AM
        Subject: Re: Apply ML to grouped dataframe




        You can use Spark MLlib
        
http://spark.apache.org/docs/latest/ml-guide.html#announcement-dataframe-based-api-is-primary-api


        On Tue, Aug 23, 2016 at 7:34 AM, Wen Pei Yu <yuw...@cn.ibm.com>
        wrote:
                    Hi

                    We have a dataframe and want to group it and apply an
                    ML algorithm or a statistic (say, a t test) to each
                    group. Is there an efficient way to do this?

                    Currently, we switch to pyspark, use groupByKey, and
                    apply a numpy function to each group's array. But this
                    isn't an efficient way, right?

                    Regards.
                    Wenpei.




        --

        Thanks & regards,
        Nirmal

        Team Lead - WSO2 Machine Learner
        Associate Technical Lead - Data Technologies Team, WSO2 Inc.
        Mobile: +94715779733
        Blog: http://nirmalfdo.blogspot.com/
















