[jira] [Commented] (PIG-1767) Intermediate data lost in local mode

Zach Garner (Commented) (JIRA) Sun, 25 Mar 2012 18:01:01 -0700

    [ https://issues.apache.org/jira/browse/PIG-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238023#comment-13238023 ]


Zach Garner commented on PIG-1767:
----------------------------------

I have the same problem: ORDER works with "-x local" but fails with
"-x mapreduce".

I'm running on Ubuntu, using the system packages for Hadoop and Pig.


zach@ubuntu:~$ uname -a
Linux ubuntu 2.6.38-8-generic #42-Ubuntu SMP Mon Apr 11 03:31:50 UTC 2011 i686 i686 i386 GNU/Linux
zach@ubuntu:~$ dpkg -l|grep pig
ii  pig                                   0.9.2
zach@ubuntu:~$ dpkg -l|grep hadoop
ii  hadoop                                1.0.1


                
> Intermediate data lost in local mode
> ------------------------------------
>
>                 Key: PIG-1767
>                 URL: https://issues.apache.org/jira/browse/PIG-1767
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.7.0, 0.8.0
>         Environment: Windows 7/64bit running in Cygwin with Pig in MR mode 
> and Hadoop using the local file system
>            Reporter: John Meagher
>
> When running the script below, it fails because one of the intermediate data
> files is not available to one of the follow-on steps.  This fails under both
> 0.7.0 and the 0.8.0-rc.  It works fine on 0.7.0 when run against HDFS.
> cat test.txt
> a,1,2
> b,1,3
> c,1,4
> d,1,5
> e,1,6
> f,1,7
> g,1,8
> h,1,9
> i,1,10
> j,1,11
> k,1,12
> l,1,13
> a,1,100
> y,10,20
> profileTimeInfo = LOAD 'test.txt' USING PigStorage(',')
>     AS (id:chararray, created:long, timestamp:long);
> timesById = GROUP profileTimeInfo BY id;
> ageById = FOREACH timesById GENERATE group,
>     (MAX(profileTimeInfo.timestamp) - MIN(profileTimeInfo.created)) AS age;
> sortedAges = ORDER ageById BY age DESC;
> topAges = LIMIT sortedAges 10;
> DUMP timesById;  -- Succeeds
> DUMP ageById;  -- Succeeds
> DUMP sortedAges;  -- Fails, see exception below
> DUMP topAges;  -- Fails, see exception below
> Exception dumped in grunt:
> 2010-12-14 11:59:02,248 [Thread-72] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
> 2010-12-14 11:59:02,251 [Thread-72] WARN  org.apache.hadoop.mapred.LocalJobRunner - job_local_0005
> java.lang.RuntimeException: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/C:/jmeagher/devel/sample_data/pigsample_21114123_1292345941552
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:139)
>         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
>         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>         at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:527)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:613)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/C:/jmeagher/devel/sample_data/pigsample_21114123_1292345941552
>         at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:224)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigFileInputFormat.listStatus(PigFileInputFormat.java:37)
>         at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
>         at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:153)
>         at org.apache.pig.impl.io.ReadToEndLoader.<init>(ReadToEndLoader.java:115)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:112)
>         ... 6 more
> Exception from the pig log file:
> Pig Stack Trace
> ---------------
> ERROR 1066: Unable to open iterator for alias sortedAges
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias sortedAges
>         at org.apache.pig.PigServer.openIterator(PigServer.java:754)
>         at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
>         at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
>         at org.apache.pig.Main.run(Main.java:465)
>         at org.apache.pig.Main.main(Main.java:107)
> Caused by: java.io.IOException: Job terminated with anomalous status FAILED
>         at org.apache.pig.PigServer.openIterator(PigServer.java:744)
>         ... 7 more
> ================================================================================
> Pig Stack Trace
> ---------------
> ERROR 1066: Unable to open iterator for alias topAges
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias topAges
>         at org.apache.pig.PigServer.openIterator(PigServer.java:754)
>         at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
>         at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
>         at org.apache.pig.Main.run(Main.java:465)
>         at org.apache.pig.Main.main(Main.java:107)
> Caused by: java.io.IOException: Job terminated with anomalous status FAILED
>         at org.apache.pig.PigServer.openIterator(PigServer.java:744)
>         ... 7 more
> ================================================================================
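
For anyone checking a fix or workaround, the following Python sketch computes what the quoted script would be expected to return if the ORDER and LIMIT steps succeeded. It is not Pig code and not part of the original report: the function name top_ages and the TEST_TXT constant are hypothetical, and the relative order of the two ids that tie on age (j and y) is an artifact of Python's stable sort, since Pig does not promise any particular order for equal sort keys.

```python
from collections import defaultdict

# Contents of test.txt from the report: id,created,timestamp
TEST_TXT = """\
a,1,2
b,1,3
c,1,4
d,1,5
e,1,6
f,1,7
g,1,8
h,1,9
i,1,10
j,1,11
k,1,12
l,1,13
a,1,100
y,10,20
"""


def top_ages(text, limit=10):
    """Mirror GROUP BY id, MAX(timestamp) - MIN(created), ORDER DESC, LIMIT."""
    groups = defaultdict(list)
    for line in text.strip().splitlines():
        id_, created, timestamp = line.split(",")
        groups[id_].append((int(created), int(timestamp)))

    # One (id, age) pair per group, like the ageById relation.
    ages = [
        (id_, max(ts for _, ts in rows) - min(cr for cr, _ in rows))
        for id_, rows in groups.items()
    ]

    # ORDER ... BY age DESC; ties keep their first-seen order (an assumption).
    ages.sort(key=lambda pair: pair[1], reverse=True)
    return ages[:limit]


if __name__ == "__main__":
    for row in top_ages(TEST_TXT):
        print(row)
```

With the sample data there are 13 distinct ids, so the limit of 10 drops the three smallest ages (b, c and d). The first row is ('a', 99) and the last is ('e', 5).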

