Tushar Pradhan created PIG-2752:
-----------------------------------

             Summary: Infinite (?) parser loop with complex FOREACH expression
                 Key: PIG-2752
                 URL: https://issues.apache.org/jira/browse/PIG-2752
             Project: Pig
          Issue Type: Bug
          Components: parser
    Affects Versions: 0.10.0
         Environment: Linux
            Reporter: Tushar Pradhan

The following Pig script seems to hang in the parser for Pig 0.10.0. It works fine for Pig 0.8.1.

----
X = LOAD 'X' USING PigStorage(',') AS (
  term: chararray, dcount: long,
  dcount_0: long, dcount_1: long, dcount_2: long, dcount_4: long,
  dcount_5: long, dcount_6: long, dcount_7: long, dcount_8: long,
  dcount_9: long, dcount_10: long, dcount_11: long, dcount_12: long,
  dcount_13: long, dcount_U: long, dcount_L: long, dcount_C: long,
  dcount_M: long, dcount_P: long, dcount_T: long, dcount_S: long,
  dcount_R: long, dcount_Z: long, dcount_K: long);

Y = FOREACH X GENERATE term,
  ( (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_1 > 1 OR dcount_1 == 1 AND dcount == 1) ? 1 :
  ( (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_2 > 1 OR dcount_2 == 1 AND dcount == 1) ? 2 :
  ( (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_7 > 1 OR dcount_7 == 1 AND dcount == 1) ? 7 :
  ( (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_9 > 1 OR dcount_9 == 1 AND dcount == 1) ? 9 :
  ( (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_11 > 1 OR dcount_11 == 1 AND dcount == 1) ? 11 :
  ( dcount_5 > 1 OR dcount_5 == 1 AND dcount == 1 ? 5 :
  ( dcount_6 > 1 OR dcount_6 == 1 AND dcount == 1 ? 6 :
  ( dcount_8 > 1 OR dcount_8 == 1 AND dcount == 1 ? 8 :
  ( dcount_10 > 1 OR dcount_10 == 1 AND dcount == 1 ? 10 :
  ( dcount_12 > 1 OR dcount_12 == 1 AND dcount == 1 ? 12 :
  ( (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_13 > 0 OR dcount_13 == 1 AND dcount == 1) ? 13 :
  ( dcount_4 > 0 ? 4 : 0)))))))))))
  ) AS besttype;

STORE Y INTO 'Y';
----

2012-06-12 08:04:46,435 [main] INFO org.apache.pig.Main - Apache Pig version 0.10.0-SNAPSHOT (rexported) compiled May 08 2012, 08:26:29
2012-06-12 08:04:46,435 [main] INFO org.apache.pig.Main - Logging error messages to: /tmp/pig_1339513486431.log
2012-06-12 08:04:46,950 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///

The hang occurs in both local and Hadoop modes.

If I simplify the 'besttype' expression in the FOREACH a bit, the script works fine.

The input 'X' directory isn't necessary, as the processing gets stuck in the parser, but if needed it can contain a sample 'part-r-00000' file with the line:

#1,49,1,0,0,0,0,0,0,0,0,0,0,0,48,0,0,0,0,49,1,2,0,0,43
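
The report does not include the simplified variant that works. As a purely illustrative sketch (a hypothetical reduction, not the reporter's actual script), the nested conditional could be cut down to three levels while keeping the LOAD schema unchanged. Whether this particular depth avoids the hang on 0.10.0 is an assumption and has not been verified; adding the removed levels back one at a time would be one way to find where parsing stops returning.

----
-- Hypothetical reduced test case: same schema as the report, shallower nesting.
X = LOAD 'X' USING PigStorage(',') AS (
  term: chararray, dcount: long,
  dcount_0: long, dcount_1: long, dcount_2: long, dcount_4: long,
  dcount_5: long, dcount_6: long, dcount_7: long, dcount_8: long,
  dcount_9: long, dcount_10: long, dcount_11: long, dcount_12: long,
  dcount_13: long, dcount_U: long, dcount_L: long, dcount_C: long,
  dcount_M: long, dcount_P: long, dcount_T: long, dcount_S: long,
  dcount_R: long, dcount_Z: long, dcount_K: long);

-- Only the first two branches and the final fallback of the original
-- expression are kept; each branch is copied verbatim from the report.
Y = FOREACH X GENERATE term,
  ( (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_1 > 1 OR dcount_1 == 1 AND dcount == 1) ? 1 :
  ( (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_2 > 1 OR dcount_2 == 1 AND dcount == 1) ? 2 :
  ( dcount_4 > 0 ? 4 : 0))
  ) AS besttype;

STORE Y INTO 'Y';
----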