Question: do normal map-reduce jobs run on this cluster? Like the example jar jobs?

Guy

On Mar 9, 2011, at 2:29 PM, Kris Coward <k...@melon.org> wrote:

>
> Also, reading some uncompressed data off the same cluster using
> PigStorage shows a failure to even read the data in the first place :|
>
> -K
>
> On Tue, Mar 08, 2011 at 09:24:18PM -0500, Kris Coward wrote:
>>
>> None of the nodes have more than 20% utilization on any of their disks;
>> so it must be the cluster figuring that it can get away with this sort
>> of thing when the sysadmin's not around to set it straight.. clearly a
>> cluster of redundant/load-sharing sysadmins is also needed :)
>>
>> -K
>>
>> On Tue, Mar 08, 2011 at 03:24:50PM -0800, Dmitriy Ryaboy wrote:
>>> Check task logs. I am guessing you ran out of either hdfs or local disk on
>>> the nodes.
>>>
>>> Also, never let your sysadmin go on vacation, that's what makes things
>>> break! :)
>>>
>>> D
>>>
>>> On Tue, Mar 8, 2011 at 2:53 PM, Kris Coward <k...@melon.org> wrote:
>>>
>>>>
>>>> So I queued up a batch of jobs last night to run overnight (and into the
>>>> day a bit, owing to a bottleneck on the scheduler the way that things
>>>> are currently implemented), made sure they were running correctly, went
>>>> to sleep, and when I woke up in the morning, they were failing all over
>>>> the place.
>>>>
>>>> Since each of these jobs was basically the same pig script being run with
>>>> a different set of parameters, I tried re-running it with the
>>>> parameters that it had run (successfully) with the night before, and it
>>>> also failed. So I started whittling away at steps to try and find the
>>>> origin of the failure, until I was even getting a failure loading the
>>>> initial data, and dumping it out.
>>>> Basically, I've reduced things to a
>>>> matter of
>>>>
>>>> apa = LOAD
>>>> '/rawfiles/08556ecf5c6841d59eb702e9762e649a/{1296432000,1296435600,1296439200,1296442800,1296446400,1296450000,1296453600,1296457200,1296460800,1296464400,1296468000,1296471600,1296475200,1296478800,1296482400,1296486000,1296489600,1296493200,1296496800,1296500400,1296504000,1296507600,1296511200,1296514800}/*/apa'
>>>> USING com.twitter.elephantbird.pig.load.LzoTokenizedLoader(',') AS
>>>> (timestamp:long, type:chararray, appkey:chararray, uid:chararray,
>>>> uniq:chararray, shortUniq:chararray, profUid:chararray, addr:chararray,
>>>> ref:chararray);
>>>> dump apa;
>>>>
>>>> and after getting all the happy messages from the loader like:
>>>>
>>>> 2011-03-08 21:48:46,454 [Thread-12] INFO
>>>> com.twitter.elephantbird.pig.load.LzoBaseLoadFunc - Got 117 LZO slices in
>>>> total.
>>>> 2011-03-08 21:48:48,044 [main] INFO
>>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>>> - 0% complete
>>>> 2011-03-08 21:50:17,612 [main] INFO
>>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>>> - 100% complete
>>>>
>>>> It went straight to:
>>>>
>>>> 2011-03-08 21:50:17,612 [main] ERROR
>>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>>> - 1 map reduce job(s) failed!
>>>> 2011-03-08 21:50:17,662 [main] ERROR
>>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>>> - Failed to produce result in:
>>>> "hdfs://master.hadoop:9000/tmp/temp-2121884028/tmp-268519128"
>>>> 2011-03-08 21:50:17,664 [main] INFO
>>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>>> - Failed!
>>>> 2011-03-08 21:50:17,668 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>>>> ERROR 1066: Unable to open iterator for alias apa
>>>> Details at logfile: /home/kris/pig_1299620898192.log
>>>>
>>>> And looking at the stack trace in the logfile, I've got:
>>>>
>>>> Pig Stack Trace
>>>> ---------------
>>>> ERROR 1066: Unable to open iterator for alias apa
>>>>
>>>> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to
>>>> open iterator for alias apa
>>>> at org.apache.pig.PigServer.openIterator(PigServer.java:482)
>>>> at
>>>> org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:539)
>>>> at
>>>> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:241)
>>>> at
>>>> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
>>>> at
>>>> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
>>>> at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
>>>> at org.apache.pig.Main.main(Main.java:352)
>>>> Caused by: java.io.IOException: Job terminated with anomalous status FAILED
>>>> at org.apache.pig.PigServer.openIterator(PigServer.java:476)
>>>> ... 6 more
>>>>
>>>> ================================================================================
>>>>
>>>> My sysadmin's off on vacation for the week, but left information on the
>>>> scripts to restart the cluster, so I tried that, and the problem is
>>>> still persisting, so I was hoping someone here might have an idea what's
>>>> wrong (and how to fix it).
>>>>
>>>> Thanks,
>>>> Kris
>>>>
>>>> --
>>>> Kris Coward http://unripe.melon.org/
>>>> GPG Fingerprint: 2BF3 957D 310A FEEC 4733 830E 21A4 05C7 1FEB 12B3
>>>>
>
> --
> Kris Coward http://unripe.melon.org/
> GPG Fingerprint: 2BF3 957D 310A FEEC 4733 830E 21A4 05C7 1FEB 12B3
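
For anyone reproducing this: the 24 directory names in the LOAD path above are consecutive hourly Unix timestamps (3600 seconds apart) covering 2011-01-31 UTC. Since the thread mentions the same script being run with different parameters, here is a rough sketch of how such a brace glob could be built per day. The helper name hourly_glob and its arguments are made up for illustration; the thread does not show how the parameters were actually generated.

```python
from datetime import datetime, timezone


def hourly_glob(day_start_utc, hours=24):
    """Build a '{t0,t1,...}' brace glob of hourly epoch-second timestamps.

    Hypothetical helper: the name and signature are assumptions, not
    something taken from the original script.
    """
    start = int(day_start_utc.timestamp())
    # One directory per hour, named by the epoch second at the top of the hour.
    stamps = [str(start + 3600 * h) for h in range(hours)]
    return "{" + ",".join(stamps) + "}"


# The day used in the LOAD statement quoted in the thread.
glob = hourly_glob(datetime(2011, 1, 31, tzinfo=timezone.utc))
path = "/rawfiles/08556ecf5c6841d59eb702e9762e649a/" + glob + "/*/apa"
```

Shrinking hours to 1 gives a single-directory path, which can be a quick way to check whether the failure depends on how much input is read.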