[ https://issues.apache.org/jira/browse/PIG-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang resolved PIG-4518.
------------------------------
    Resolution: Fixed

Committed to Spark branch. Thanks, Mohit.

> SparkOperator should correspond to complete Spark job
> -----------------------------------------------------
>
>                 Key: PIG-4518
>                 URL: https://issues.apache.org/jira/browse/PIG-4518
>             Project: Pig
>          Issue Type: Bug
>          Components: spark
>            Reporter: Mohit Sabharwal
>            Assignee: Mohit Sabharwal
>             Fix For: spark-branch
>
>         Attachments: PIG-4518.1.patch, PIG-4518.patch
>
>
> SparkPlan, which was added in PIG-4374, creates a new SparkOperator for every
> shuffle boundary (denoted by presence of POGlobalRearrange in the
> corresponding physical plan). This is unnecessary for Spark engine since it
> relies on Spark to do the shuffle (using groupBy(), reduceByKey() and
> CoGroupRDD) and does not need to explicitly identify "map" and "reduce"
> operations.
> It is also cleaner if a single SparkOperator represents a single complete
> Spark job.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)