[jira] [Commented] (PIG-5157) Upgrade to Spark 2.0

Nandor Kollar (JIRA) Wed, 24 May 2017 01:10:13 -0700

    [ 
https://issues.apache.org/jira/browse/PIG-5157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16022491#comment-16022491
 ]


Nandor Kollar commented on PIG-5157:
------------------------------------

[~kellyzly], [~jeffzhang] I think (and correct me if I'm wrong) we don't have 
to change the physical and logical plans, but we do have to modify how the plan 
is mapped to Spark: replace the RDD converters with Dataset converters.
I'd recommend splitting this into two tasks. The first is upgrading to Spark 2.1 
while still being able to compile with Spark 1.6. I'm close to finishing this; 
there were a few API changes, and I'll attach the patch soon for comments. Once 
this is done, we should try to migrate to the Dataset API, for Spark 2.1 only. 
As far as I know, Spark 1.6 has the DataFrame API, but since it was experimental 
at that time, I think we shouldn't change that code path; RDDs are fine for 
Spark 1.6. Any thoughts?
[~pallavi.rao] I saw that you investigated the DataFrame API for PoS before but 
didn't find it suitable. What was the issue with it?
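
For illustration only, the first task (one code base that builds against both 
Spark 1.6 and 2.1) could be approached with a thin adapter layer. The sketch 
below is an assumption, not the pending patch: every type and method name in it 
is hypothetical, and it uses stand-in interfaces instead of real Spark classes 
so that it compiles on its own. The API difference it picks as an example is 
that FlatMapFunction.call returns an Iterable in Spark 1.x and an Iterator in 
Spark 2.x; whether this is among the changes the patch handles is not stated 
above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Sketch only: every type here is a hypothetical stand-in, not Pig or Spark code.
class SparkShimSketch {

    // Mirrors the return type of FlatMapFunction.call in Spark 1.x (Iterable).
    interface FlatMap1x<T, R> { Iterable<R> call(T input); }

    // Mirrors the return type of FlatMapFunction.call in Spark 2.x (Iterator).
    interface FlatMap2x<T, R> { Iterator<R> call(T input); }

    // Version-neutral function that converter code would be written against.
    interface NeutralFlatMap<T, R> { Iterator<R> apply(T input); }

    // Adapter for the 1.x shape: the Iterable produces the iterator on demand.
    static <T, R> FlatMap1x<T, R> forSpark1(NeutralFlatMap<T, R> f) {
        return input -> () -> f.apply(input);
    }

    // Adapter for the 2.x shape: the iterator is passed straight through.
    static <T, R> FlatMap2x<T, R> forSpark2(NeutralFlatMap<T, R> f) {
        return f::apply;
    }

    // Example converter logic, written once against the neutral interface.
    static final NeutralFlatMap<String, String> SPLIT =
            line -> Arrays.asList(line.split(" ")).iterator();

    // Drains an iterator into a list so the two paths can be compared.
    static List<String> collect(Iterator<String> it) {
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    static List<String> tokens1x(String line) {
        return collect(forSpark1(SPLIT).call(line).iterator());
    }

    static List<String> tokens2x(String line) {
        return collect(forSpark2(SPLIT).call(line));
    }

    public static void main(String[] args) {
        System.out.println(tokens1x("a b c"));
        System.out.println(tokens2x("a b c"));
    }
}
```

With this layout the converter logic is written once, and only the small 
adapter chosen at build time depends on the Spark version.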

> Upgrade to Spark 2.0
> --------------------
>
>                 Key: PIG-5157
>                 URL: https://issues.apache.org/jira/browse/PIG-5157
>             Project: Pig
>          Issue Type: Improvement
>          Components: spark
>            Reporter: Nandor Kollar
>            Assignee: Nandor Kollar
>             Fix For: 0.18.0
>
>
> Upgrade to Spark 2.0 (or latest)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
