-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32031/
-----------------------------------------------------------
(Updated March 13, 2015, 10:42 a.m.)
Review request for pig, liyun zhang and Mohit Sabharwal.
Changes
-------
Added license header
Bugs: PIG-4193
https://issues.apache.org/jira/browse/PIG-4193
Repository: pig-git
Description
-------
Moved the getNextTuple(boolean proceed) method from POCollectedGroup to
POCollectedGroupSpark.
When used with MapReduce (mr), collected group performs the group operation on
the map side, after making sure that all data for the same key exists on a
single map. In Spark, this behaviour is achieved with a single map function
that uses the POCollectedGroup operator.
TODO:
- Avoid using rdd.count() in CollectedGroupConverter.
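For reviewers less familiar with collected group, the map-side idea described
above can be sketched in plain Java, without any Pig or Spark classes. This is
only an illustration under assumptions: CollectedGroupSketch and
groupConsecutive are hypothetical names that are not part of this patch, and
the input is assumed to already have equal keys adjacent within one partition.
The last group is emitted when the input iterator is exhausted, so a sketch
like this needs no up-front record count, which may be one possible direction
for the rdd.count() TODO.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Objects;

// Hypothetical illustration only; not the code in this review request.
public class CollectedGroupSketch {

    /**
     * Lazily groups consecutive entries that share a key.
     * Assumes equal keys are already adjacent in the input (one partition).
     */
    public static <K, V> Iterator<Map.Entry<K, List<V>>> groupConsecutive(
            final Iterator<Map.Entry<K, V>> input) {
        return new Iterator<Map.Entry<K, List<V>>>() {
            // First entry of the next group, read ahead while closing the previous one.
            private Map.Entry<K, V> pending = null;

            @Override
            public boolean hasNext() {
                return pending != null || input.hasNext();
            }

            @Override
            public Map.Entry<K, List<V>> next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                Map.Entry<K, V> first = (pending != null) ? pending : input.next();
                pending = null;
                K key = first.getKey();
                List<V> bag = new ArrayList<>();
                bag.add(first.getValue());
                // Collect values until the key changes or the input ends.
                while (input.hasNext()) {
                    Map.Entry<K, V> entry = input.next();
                    if (Objects.equals(entry.getKey(), key)) {
                        bag.add(entry.getValue());
                    } else {
                        pending = entry; // belongs to the next group
                        break;
                    }
                }
                // Reaching the end of the input flushes the final group here.
                return new SimpleEntry<>(key, bag);
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> partition = new ArrayList<>();
        partition.add(new SimpleEntry<>("a", 1));
        partition.add(new SimpleEntry<>("a", 2));
        partition.add(new SimpleEntry<>("b", 3));
        partition.add(new SimpleEntry<>("c", 4));
        partition.add(new SimpleEntry<>("c", 5));

        Iterator<Map.Entry<String, List<Integer>>> groups =
                groupConsecutive(partition.iterator());
        while (groups.hasNext()) {
            System.out.println(groups.next()); // a=[1, 2] then b=[3] then c=[4, 5]
        }
    }
}
```

If equal keys are not adjacent in the input, this sketch emits the same key
more than once, which is why the collected-group precondition on the data
layout matters.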
Diffs (updated)
-----
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCollectedGroup.java
7f2f18e52e083b3e8e90ba02d07f12bcbc9be859
src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java
ca7a45f33320064e22628b40b34be7b9f7b07c36
src/org/apache/pig/backend/hadoop/executionengine/spark/converter/CollectedGroupConverter.java
3d04ba11855c39960e00d6f51b66654d1c70ebad
src/org/apache/pig/backend/hadoop/executionengine/spark/operator/POCollectedGroupSpark.java
PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/spark/plan/SparkCompiler.java
PRE-CREATION
Diff: https://reviews.apache.org/r/32031/diff/
Testing
-------
Ran TestCollectedGroup; there are no new successes or failures.
Thanks,
Praveen R