[ https://issues.apache.org/jira/browse/PIG-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238023#comment-13238023 ]
Zach Garner commented on PIG-1767:
----------------------------------

I have the same problem. Order works using "-x local" but not via "-x mapreduce".

I'm running on Ubuntu, using the system packages for hadoop & pig.

zach@ubuntu:~$ uname -a
Linux ubuntu 2.6.38-8-generic #42-Ubuntu SMP Mon Apr 11 03:31:50 UTC 2011 i686 i686 i386 GNU/Linux
zach@ubuntu:~$ dpkg -l|grep pig
ii  pig     0.9.2
zach@ubuntu:~$ dpkg -l|grep hadoop
ii  hadoop  1.0.1

> Intermediate data lost in local mode
> ------------------------------------
>
>                 Key: PIG-1767
>                 URL: https://issues.apache.org/jira/browse/PIG-1767
>             Project: Pig
>          Issue Type: Bug
>   Affects Versions: 0.7.0, 0.8.0
>         Environment: Windows 7/64bit running in Cygwin with Pig in MR mode
>                      and Hadoop using the local file system
>            Reporter: John Meagher
>
> When running the script below it fails because one of the intermediate data
> files is not available to one of the follow-on steps. This fails under both
> 0.7.0 and the 0.8.0-rc. This works fine on 0.7.0 when run against HDFS.
> cat test.txt
> a,1,2
> b,1,3
> c,1,4
> d,1,5
> e,1,6
> f,1,7
> g,1,8
> h,1,9
> i,1,10
> j,1,11
> k,1,12
> l,1,13
> a,1,100
> y,10,20
>
> profileTimeInfo = LOAD 'test.txt' USING PigStorage(',')
>     AS (id:chararray, created:long, timestamp:long);
> timesById = GROUP profileTimeInfo BY id;
> ageById = FOREACH timesById GENERATE group,
>     (MAX(profileTimeInfo.timestamp) - MIN(profileTimeInfo.created)) AS age;
> sortedAges = ORDER ageById BY age DESC;
> topAges = LIMIT sortedAges 10;
> DUMP timesById;  -- Succeeds
> DUMP ageById;    -- Succeeds
> DUMP sortedAges; -- Fails, see exception below
> DUMP topAges;    -- Fails, see exception below
>
> Exception dumped in grunt:
> 2010-12-14 11:59:02,248 [Thread-72] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
> 2010-12-14 11:59:02,251 [Thread-72] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0005
> java.lang.RuntimeException: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/C:/jmeagher/devel/sample_data/pigsample_21114123_1292345941552
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:139)
>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:527)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:613)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/C:/jmeagher/devel/sample_data/pigsample_21114123_1292345941552
>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:224)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigFileInputFormat.listStatus(PigFileInputFormat.java:37)
>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
>     at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:153)
>     at org.apache.pig.impl.io.ReadToEndLoader.<init>(ReadToEndLoader.java:115)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:112)
>     ... 6 more
>
> Exception from the pig log file:
> Pig Stack Trace
> ---------------
> ERROR 1066: Unable to open iterator for alias sortedAges
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias sortedAges
>     at org.apache.pig.PigServer.openIterator(PigServer.java:754)
>     at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
>     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
>     at org.apache.pig.Main.run(Main.java:465)
>     at org.apache.pig.Main.main(Main.java:107)
> Caused by: java.io.IOException: Job terminated with anomalous status FAILED
>     at org.apache.pig.PigServer.openIterator(PigServer.java:744)
>     ... 7 more
> ================================================================================
> Pig Stack Trace
> ---------------
> ERROR 1066: Unable to open iterator for alias topAges
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias topAges
>     at org.apache.pig.PigServer.openIterator(PigServer.java:754)
>     at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
>     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
>     at org.apache.pig.Main.run(Main.java:465)
>     at org.apache.pig.Main.main(Main.java:107)
> Caused by: java.io.IOException: Job terminated with anomalous status FAILED
>     at org.apache.pig.PigServer.openIterator(PigServer.java:744)
>     ... 7 more
> ================================================================================
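
For anyone checking a fix or workaround, it may help to know what the quoted script should produce once the ORDER step runs. The following is a rough Python sketch, not Pig code, that mimics the GROUP / MAX - MIN / ORDER / LIMIT steps on the test.txt rows from the report. The function names (compute_ages, top_ages) are made up for illustration, and the order of tied ages (j and y both come out at 10) is an assumption, since the report does not say how Pig orders ties.

```python
from collections import defaultdict

# Rows of test.txt from the quoted report: id, created, timestamp.
ROWS = """a,1,2
b,1,3
c,1,4
d,1,5
e,1,6
f,1,7
g,1,8
h,1,9
i,1,10
j,1,11
k,1,12
l,1,13
a,1,100
y,10,20"""


def compute_ages(text):
    """Mimic GROUP BY id, then MAX(timestamp) - MIN(created) per group."""
    created = defaultdict(list)
    timestamps = defaultdict(list)
    for line in text.splitlines():
        id_, c, t = line.split(",")
        created[id_].append(int(c))
        timestamps[id_].append(int(t))
    return {k: max(timestamps[k]) - min(created[k]) for k in created}


def top_ages(ages, limit=10):
    """Mimic ORDER ... BY age DESC followed by LIMIT.

    Assumption: ties keep first-seen order (Python's stable sort);
    Pig may order tied rows differently.
    """
    return sorted(ages.items(), key=lambda kv: -kv[1])[:limit]


ages = compute_ages(ROWS)
top = top_ages(ages)
for group, age in top:
    print(f"({group},{age})")
```

If DUMP topAges succeeds, it should show ten tuples starting with (a,99), with the j/y pair possibly swapped relative to this sketch.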