[ https://issues.apache.org/jira/browse/PIG-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075869#comment-14075869 ]
Rohini Palaniswamy commented on PIG-4057:
-----------------------------------------

+1

> Group All followed by CROSS with default parallelism produces wrong results
> ---------------------------------------------------------------------------
>
>                 Key: PIG-4057
>                 URL: https://issues.apache.org/jira/browse/PIG-4057
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>            Assignee: Daniel Dai
>             Fix For: 0.14.0
>
>         Attachments: PIG-4057-1.patch, PIG-4057-2.patch, PIG-4057-3.patch,
>                      PIG-4057-4.patch, PIG-4057-5.patch, PIG-4057-6.patch
>
>
> SET default_parallel 199;
> ......
> by_size = ...
> uniq_vals = .....
> grpd = group uniq_vals all;
> all_vals = FOREACH grpd GENERATE uniq_vals;
> cross_result = CROSS by_size, all_vals;
> store cross_result into '/tmp/roh/cross/out/recipient_asns';
>
> Job1: grpd, all_vals, cross_result (The plan does GFCross function here for
> all_vals assuming cross parallelism to be 1 taking that of the current job
> even though it should consider default parallelism 199 of Job 2. Parallelism
> of Job1 is 1 because of group all)
> Job2: cross_result (Actual CROSS of by_size and all_vals)

--
This message was sent by Atlassian JIRA
(v6.2#6252)
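
To illustrate why a parallelism mismatch between the two jobs can drop rows, here is a simplified Python model of a grid-partitioned CROSS. It is only a sketch, not Pig's code: it assumes GFCross places each tuple on a grid whose side is ceil(parallelism^(1/num_inputs)), picks a random coordinate in the tuple's own dimension, and replicates the tuple across the other dimensions. The names grid_side, cross_keys and cross are hypothetical.

```python
import itertools
import math
import random
from collections import defaultdict


def grid_side(parallelism, num_inputs):
    # Assumed rule: side length of the partition grid for a given parallelism.
    return math.ceil(parallelism ** (1.0 / num_inputs))


def cross_keys(input_index, parallelism, num_inputs, rng):
    # One random coordinate in the tuple's own dimension,
    # every coordinate in all other dimensions.
    side = grid_side(parallelism, num_inputs)
    own = rng.randrange(side)
    dims = [[own] if d == input_index else range(side)
            for d in range(num_inputs)]
    return list(itertools.product(*dims))


def cross(left, right, p_left, p_right, seed=0):
    # p_left / p_right: the parallelism each side assumed when tagging tuples.
    rng = random.Random(seed)
    cells = defaultdict(lambda: ([], []))
    for row in left:
        for key in cross_keys(0, p_left, 2, rng):
            cells[key][0].append(row)
    for row in right:
        for key in cross_keys(1, p_right, 2, rng):
            cells[key][1].append(row)
    # Pairs are produced only inside a grid cell that both sides reached.
    return [(l, r) for ls, rs in cells.values() for l in ls for r in rs]


by_size = list(range(100))      # stand-in for the by_size relation
all_vals = ["all"]              # single row, as produced by "group ... all"

consistent = cross(by_size, all_vals, 199, 199)  # both sides assume 199
mismatched = cross(by_size, all_vals, 199, 1)    # all_vals tagged with 1
print(len(consistent), len(mismatched))
```

Under this model, when both sides assume 199 every pair meets in exactly one cell, so all 100 pairs come out. When all_vals is tagged with parallelism 1 it lands only in cell (0, 0), so any by_size row whose random coordinate is not 0 never meets it and is lost from the output.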