-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32031/
-----------------------------------------------------------
(Updated March 13, 2015, 10:42 a.m.)


Review request for pig, liyun zhang and Mohit Sabharwal.


Changes
-------

Added license header.


Bugs: PIG-4193
    https://issues.apache.org/jira/browse/PIG-4193


Repository: pig-git


Description
-------

Moved the getNextTuple(boolean proceed) method from POCollectedGroup to POCollectedGroupSpark.

Collected group, when used with MR, performs the group operation on the map side after making sure that all data for the same key exists on a single map. In Spark, this behaviour is achieved by a single map function using the POCollectedGroup operator.

TODO:
- Avoid using rdd.count() in CollectedGroupConverter.


Diffs (updated)
-----

  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCollectedGroup.java 7f2f18e52e083b3e8e90ba02d07f12bcbc9be859
  src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java ca7a45f33320064e22628b40b34be7b9f7b07c36
  src/org/apache/pig/backend/hadoop/executionengine/spark/converter/CollectedGroupConverter.java 3d04ba11855c39960e00d6f51b66654d1c70ebad
  src/org/apache/pig/backend/hadoop/executionengine/spark/operator/POCollectedGroupSpark.java PRE-CREATION
  src/org/apache/pig/backend/hadoop/executionengine/spark/plan/SparkCompiler.java PRE-CREATION

Diff: https://reviews.apache.org/r/32031/diff/


Testing
-------

Ran TestCollectedGroup; there are no new successes or failures.


Thanks,

Praveen R