Re: how to find the nearest holiday

2017-04-25 Thread Wen Pei Yu
TypeError: unorderable types: str() >= datetime.date() You should convert the string to a date type before comparing. Yu Wenpei. - Original message - From: Zeming Yu To: user Cc: Subject: how to find the nearest holiday Date: Tue, Apr 25, 2017 3:39 PM I
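
The error above comes from comparing a str with a datetime.date. A minimal plain-Python sketch of the suggested fix follows: parse every value to a date first, then compare. The helper names to_date and nearest_holiday, the "%Y-%m-%d" format, and the sample dates are assumptions for illustration only; they are not taken from the original thread.

```python
from datetime import date, datetime


def to_date(value, fmt="%Y-%m-%d"):
    """Return a datetime.date for a date, datetime, or date string (hypothetical helper)."""
    if isinstance(value, datetime):
        # datetime is a subclass of date but cannot be compared with a plain date
        return value.date()
    if isinstance(value, date):
        return value
    # Assumed input format; adjust fmt to match the real data
    return datetime.strptime(value, fmt).date()


def nearest_holiday(day, holidays):
    """Return the holiday with the smallest absolute distance in days from day."""
    day = to_date(day)
    return min((to_date(h) for h in holidays), key=lambda h: abs((h - day).days))


# Illustrative dates only, mixing strings and date objects
print(nearest_holiday("2017-04-25", ["2017-04-14", date(2017, 5, 29)]))
```

Inside a PySpark job the same conversion could be applied in a UDF or done on the column before comparing, but that part is not shown here.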

Re: Aggregated column name

2017-03-23 Thread Wen Pei Yu
expr) Yu Wenpei. From: Kevin Mellott <kevin.r.mell...@gmail.com> To: Wen Pei Yu <yuw...@cn.ibm.com> Cc: user <user@spark.apache.org> Date: 03/24/2017 09:48 AM Subject:Re: Aggregated column name I'm not sure of the answer to your question; however, when

Aggregated column name

2017-03-23 Thread Wen Pei Yu
Hi All, I found that some Spark versions (Spark 1.4) return the aggregated column name in upper case, and some return it in lower case. For example, df.groupby(col("...")).agg(count("number")) may return COUNT(number) (Spark 1.4) or count(number) (Spark 1.6). Does anyone know if there is a configuration parameter for

Re: Apply ML to grouped dataframe

2016-08-23 Thread Wen Pei Yu
[2.0,16.0]| |12462589343|3| [1.0,1.0]| +---+-++ From: ayan guha <guha.a...@gmail.com> To: Wen Pei Yu/China/IBM@IBMCN Cc: user <user@spark.apache.org>, Nirmal Fernando <nir...@wso2.com> Date: 08/23/2016 05:13 PM Subject:Re: A

Re: Apply ML to grouped dataframe

2016-08-23 Thread Wen Pei Yu
Hi Nirmal, Filter works fine if I want to handle one of the grouped dataframes. But I have multiple grouped dataframes, and I wish I could apply an ML algorithm to all of them in one job, not in for loops. Wenpei. From: Nirmal Fernando <nir...@wso2.com> To: Wen Pei Yu/China/IBM@IBMCN Cc:

Re: Apply ML to grouped dataframe

2016-08-22 Thread Wen Pei Yu
: Nirmal Fernando <nir...@wso2.com> To: Wen Pei Yu/China/IBM@IBMCN Cc: User <user@spark.apache.org> Date: 08/23/2016 01:14 PM Subject:Re: Apply ML to grouped dataframe Hi Wen, AFAIK Spark MLlib implements its machine learning algorithms on top of Spark dataframe 

Re: Apply ML to grouped dataframe

2016-08-22 Thread Wen Pei Yu
Hi Nirmal, I didn't get your point. Can you tell me more about how to apply MLlib to a grouped dataframe? Regards. Wenpei. From: Nirmal Fernando <nir...@wso2.com> To: Wen Pei Yu/China/IBM@IBMCN Cc: User <user@spark.apache.org> Date: 08/23/2016 10:26 AM Subject:

Apply ML to grouped dataframe

2016-08-22 Thread Wen Pei Yu
Hi, We have a dataframe, and we want to group it and apply an ML algorithm or a statistic (say, a t-test) to each group. Is there an efficient way to do this? Currently, we switch to PySpark, use groupByKey, and apply a numpy function to each array. But this isn't an efficient way, right? Regards.
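
To make the pattern in the question concrete, here is a small plain-Python sketch of "group by key, then compute a statistic per group", using a one-sample t statistic. It runs locally, not on Spark, and uses the standard library in place of numpy. The function names, the (key, value) row layout, and the null-hypothesis mean mu0 are all assumptions for illustration.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu0=0.0):
    """One-sample t statistic: (mean - mu0) / (sample stdev / sqrt(n)).

    Needs at least two values and a non-zero standard deviation.
    """
    n = len(values)
    return (mean(values) - mu0) / (stdev(values) / sqrt(n))


def t_by_group(rows, mu0=0.0):
    """Collect (key, value) rows by key, then apply the statistic to each group."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return {key: one_sample_t(vals, mu0) for key, vals in groups.items()}


# Hypothetical sample rows: (group key, measurement)
rows = [("a", 1.0), ("a", 2.0), ("a", 3.0), ("b", 2.0), ("b", 4.0)]
print(t_by_group(rows))
```

In the PySpark version described in the message, the per-group function would be applied to the grouped values of each key instead of a local dictionary.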

Re: LogisticsRegression in ML pipeline help page

2016-01-06 Thread Wen Pei Yu
You can find older documentation under http://spark.apache.org/documentation.html The linear methods docs for 1.5.2 are here: http://spark.apache.org/docs/1.5.2/mllib-linear-methods.html#logistic-regression http://spark.apache.org/docs/1.5.2/ml-linear-methods.html Regards. Yu Wenpei. From: Arunkumar Pillai