[
https://issues.apache.org/jira/browse/PIG-5157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16022491#comment-16022491
]
Nandor Kollar commented on PIG-5157:
------------------------------------
[~kellyzly], [~jeffzhang] I think (and correct me if I'm wrong) we don't have
to change the physical and logical plans, but we do have to modify how the plan
is mapped to Spark: replace the RDD converters with Dataset converters.
I'd recommend splitting this into two tasks. The first is upgrading to Spark
2.1 while still being able to compile with Spark 1.6. I'm close to finishing
this; there were a few API changes, and I'll attach the patch soon for
comments. Once this is done, we should try to migrate to the Dataset API for
Spark 2.1 only. As far as I know, Spark 1.6 has the DataFrame API, but since it
was experimental at that time, I think we shouldn't change that; RDDs are fine
for Spark 1.6. Any thoughts?
[~pallavi.rao] I saw that you investigated the DataFrame API for PoS before but
didn't find it suitable. What was the issue with it?
> Upgrade to Spark 2.0
> --------------------
>
> Key: PIG-5157
> URL: https://issues.apache.org/jira/browse/PIG-5157
> Project: Pig
> Issue Type: Improvement
> Components: spark
> Reporter: Nandor Kollar
> Assignee: Nandor Kollar
> Fix For: 0.18.0
>
>
> Upgrade to Spark 2.0 (or latest)
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)