Re: CATALYST rule join

2018-02-27 Thread Yong Zhang
: tan shai <tan.shai...@gmail.com> Sent: Tuesday, February 27, 2018 4:19 AM To: user@spark.apache.org Subject: Re: CATALYST rule join Hi, I need to write a rule to customize the join function using the Spark Catalyst optimizer. The objective is to duplicate the second dataset using this p

Re: CATALYST rule join

2018-02-27 Thread tan shai
Hi, I need to write a rule to customize the join function using the Spark Catalyst optimizer. The objective is to duplicate the second dataset using this process: - Execute a udf on the column called x; this udf returns an array - Execute an explode function on the new column Using SQL terms, my

CATALYST rule join

2018-02-25 Thread tan shai
Hi, I need to write a rule to customize the join function using the Spark Catalyst optimizer. The objective is to duplicate the second dataset using this process: - Execute a udf on the column called x; this udf returns an array - Execute an explode function on the new column Using SQL terms, my
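
For readers skimming the thread, the two steps in the question (a udf on column x that returns an array, then an explode on the new column) can be illustrated without Spark. The sketch below is plain Python, not Catalyst or DataFrame code. It only shows how the row duplication behaves. The names `expand_x`, `udf_then_explode`, `x_exploded`, and the sample rows are hypothetical and do not come from the original message, whose actual udf is not shown.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def expand_x(x: int) -> List[int]:
    """Hypothetical stand-in for the udf: maps one x value to an array."""
    return [x, x + 1]


def udf_then_explode(
    rows: List[Row],
    column: str,
    udf: Callable[[int], List[int]],
    out_column: str,
) -> List[Row]:
    """Apply `udf` to `column`, then emit one output row per array element.

    This mimics the semantics of a udf followed by an explode: a row whose
    array has n elements is duplicated n times, and a row whose array is
    empty produces no output row.
    """
    exploded: List[Row] = []
    for row in rows:
        for value in udf(row[column]):
            new_row = dict(row)          # copy the original columns
            new_row[out_column] = value  # add the exploded element
            exploded.append(new_row)
    return exploded


# Assumed sample data standing in for the "second dataset".
second = [{"x": 1, "name": "a"}, {"x": 10, "name": "b"}]
duplicated = udf_then_explode(second, "x", expand_x, "x_exploded")

for row in duplicated:
    print(row)
```

Each input row appears once per element of the array returned for it, which is the duplication the join rule is meant to apply to the second dataset before joining.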