[ https://issues.apache.org/jira/browse/PIG-921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-921:
---------------------------

        Fix Version/s: 0.6.0
    Affects Version/s:     (was: 0.3.0)
                       0.4.0
               Status: Patch Available  (was: Open)

> Strange use case for Join which produces different results in local and map reduce mode
> ---------------------------------------------------------------------------------------
>
>                 Key: PIG-921
>                 URL: https://issues.apache.org/jira/browse/PIG-921
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.4.0
>         Environment: Hadoop 18 and Hadoop 20
>            Reporter: Viraj Bhat
>            Assignee: Daniel Dai
>             Fix For: 0.6.0
>
>         Attachments: A.txt, B.txt, joinusecase.pig, PIG-921-1.patch
>
>
> I have a script of the following form, which loads from two files, A.txt and B.txt:
> {code}
> A = LOAD 'A.txt' as (a:tuple(a1:int, a2:chararray));
> B = LOAD 'B.txt' as (b:tuple(b1:int, b2:chararray));
> C = JOIN A by a.a1, B by b.b1;
> DESCRIBE C;
> DUMP C;
> {code}
> A.txt contains the following lines:
> {code}
> (1,a)
> (2,aa)
> {code}
> B.txt contains the following lines:
> {code}
> (1,b)
> (2,bb)
> {code}
> Running the above script in local and map reduce mode on Hadoop 18 and Hadoop 20 produces the following:
> Hadoop 18
> =====================================================================
> (1,1)
> (2,2)
> =====================================================================
> Hadoop 20
> =====================================================================
> (1,1)
> (2,2)
> =====================================================================
> Local Mode: Pig with Hadoop 18 jar release
> =====================================================================
> 2009-08-13 17:15:13,473 [main] INFO org.apache.pig.Main - Logging error messages to: /homes/viraj/pig-svn/trunk/pigscripts/pig_1250208913472.log
> 09/08/13 17:15:13 INFO pig.Main: Logging error messages to: /homes/viraj/pig-svn/trunk/pigscripts/pig_1250208913472.log
> C: {a: (a1: int,a2: chararray),b: (b1: int,b2: chararray)}
> 2009-08-13 17:15:13,932 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1002: Unable to store alias C
> 09/08/13 17:15:13 ERROR grunt.Grunt: ERROR 1002: Unable to store alias C
> Details at logfile: /homes/viraj/pig-svn/trunk/pigscripts/pig_1250208913472.log
> =====================================================================
> Caused by: java.lang.NullPointerException
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.getNext(POPackage.java:206)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:191)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
>         at org.apache.pig.backend.local.executionengine.physicalLayer.counters.POCounter.getNext(POCounter.java:71)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getNext(POStore.java:117)
>         at org.apache.pig.backend.local.executionengine.LocalPigLauncher.runPipeline(LocalPigLauncher.java:146)
>         at org.apache.pig.backend.local.executionengine.LocalPigLauncher.launchPig(LocalPigLauncher.java:109)
>         at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execute(LocalExecutionEngine.java:165)
>         ... 9 more
> =====================================================================
> Local Mode: Pig with Hadoop 20 jar release
> =====================================================================
> ((1,a),(1,b))
> ((2,aa),(2,bb))
> =====================================================================

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.