[ https://issues.apache.org/jira/browse/PIG-1060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771390#action_12771390 ]
Ankur commented on PIG-1060:
----------------------------

Here's a sample script to illustrate the issue. Note that the sample data isn't important here, since both optimization and execution fail.

=== test.pig ====
data = LOAD 'dummy' as (name:chararray, freq:int);

filter1 = FILTER data BY freq < 5;
group1 = GROUP filter1 BY name;
proj1 = FOREACH group1 GENERATE FLATTEN(group), 'string1', SUM(filter1.freq);

filter2 = FILTER data by freq > 5;
group2 = GROUP filter2 BY name;
proj2 = FOREACH group2 GENERATE FLATTEN(group), 'string2', SUM(filter2.freq);

filter3 = FILTER filter2 by freq < 10;
group3 = GROUP filter3 By name;
proj3 = FOREACH group3 GENERATE FLATTEN(group), 'string3', SUM(filter3.freq);

filter4 = FILTER filter3 by freq > 7;
group4 = GROUP filter4 By name;
proj4 = FOREACH group4 GENERATE FLATTEN(group), 'string4', SUM(filter4.freq);

M1 = LIMIT proj1 10;
M2 = LIMIT proj2 10;
M3 = LIMIT proj3 10;
M4 = LIMIT proj4 10;

U = UNION M1, M2, M3, M4;
STORE U INTO 'res' USING PigStorage();

The dot output can be dumped via the command "explain -dot -script test.pig;" to visualize the scenario.

Surprisingly, even with MultiQuery turned off using -M, the MultiQuery optimizer still seems to run and fails the script.

> MultiQuery optimization throws error for multi-level splits
> -----------------------------------------------------------
>
>                 Key: PIG-1060
>                 URL: https://issues.apache.org/jira/browse/PIG-1060
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.5.0
>            Reporter: Ankur
>
> Consider the following scenario:
> 1. Multi-level splits in the map plan.
> 2. Each split branch further progressing across a local-global rearrange.
> 3. Output of each of these finally merged via a UNION.
> MultiQuery optimizer throws the following error in such a case:
> "ERROR 2146: Internal Error. Inconsistency in key index found during optimization."

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.