Combiner timing out

2011-11-03 Thread Christopher Egner
Hi all, Let me preface this with my understanding of how tasks work. If a task takes a long time (default 10min) and demonstrates no progress, the task tracker will decide the process is hung, kill it, and start a new attempt. Normally, one uses a Reporter instance's progress method to provide

Problems with MR Job running really slowly

2011-11-03 Thread Steve Lewis
I have a job which takes an xml file - the splitter breaks the file into tags, the mapper parses each tag and sends the data to the reducer. I am using a custom splitter which reads the file looking for start and end tags. When I run the code in the splitter and the mapper - generating separate ta

Re: Is there any way for a single map job to show progress

2011-11-03 Thread Milind.Bhandarkar
By implementing getProgress(). The problem with LineRecordReader is this: if (codec != null) { in = new LineReader(codec.createInputStream(fileIn), job); end = Long.MAX_VALUE; } And getProgress() is: Math.min(1.0f, (pos - start) / (float)(end - start)); Afte
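
A minimal sketch of why the quoted formula reports almost no progress once `end` is set to `Long.MAX_VALUE`. It reuses the `Math.min` expression from the snippet above; the class name `ProgressSketch`, the `start == end` guard, and the byte counts are illustrative assumptions, not code from Hadoop.

```java
public class ProgressSketch {
    // Same expression as the getProgress() quoted in the message above.
    // The start == end guard is an assumption added to avoid dividing by zero.
    public static float progress(long start, long pos, long end) {
        if (start == end) {
            return 0.0f;
        }
        return Math.min(1.0f, (pos - start) / (float) (end - start));
    }

    public static void main(String[] args) {
        long start = 0L;
        long pos = 64L * 1024 * 1024; // 64 MiB consumed (illustrative value)

        // Uncompressed input: end is a real byte offset, so the ratio is meaningful.
        System.out.println(progress(start, pos, 128L * 1024 * 1024));

        // Compressed input: end was set to Long.MAX_VALUE, so the ratio is effectively zero.
        System.out.println(progress(start, pos, Long.MAX_VALUE));
    }
}
```

With a real end offset the first call reports half the split consumed; with `Long.MAX_VALUE` as the denominator the second call stays indistinguishable from zero no matter how much input has been read.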

Re: Is there any way for a single map job to show progress

2011-11-03 Thread Steve Lewis
So how does a custom reader do that??? On Thu, Nov 3, 2011 at 10:28 AM, wrote: > Individual map task progress is indicative of what percentage of input > chunk has been consumed so far by the map task. However, the responsibility > of feeding this info to the framework is the responsibility of t

Classpath issues

2011-11-03 Thread Russell Brown
Hi, I tried searching the mailing list archive, but couldn't find this issue. I have a map reduce job in a jar, and the jars the job depends on are in the lib/ directory of the jar. However, there is an older version of one of the libraries (jackson-mapper-asl-1.0.1.jar in hadoop/lib) so I get clas

Re: Is there any way for a single map job to show progress

2011-11-03 Thread Milind.Bhandarkar
Individual map task progress is indicative of what percentage of input chunk has been consumed so far by the map task. However, the responsibility of feeding this info to the framework is the responsibility of the record reader. * Milind * From: Steve Lewis mailto:lordjoe2...@gmail.com>> R
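
As a rough illustration of the point above, a reader can report progress by counting how much of its split it has consumed. The sketch below is a hypothetical stand-in, not the Hadoop RecordReader API: the class `CountingTagReader`, its `nextTag()` method, and the in-memory byte array are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a custom record reader; it does not use Hadoop classes.
public class CountingTagReader {
    private final byte[] split; // the input chunk assigned to this reader (assumed in memory)
    private int pos = 0;        // number of bytes consumed so far

    public CountingTagReader(byte[] split) {
        this.split = split;
    }

    // Returns the text up to and including the next '>', or null when the split is exhausted.
    public String nextTag() {
        if (pos >= split.length) {
            return null;
        }
        int begin = pos;
        while (pos < split.length) {
            if (split[pos++] == '>') {
                break;
            }
        }
        return new String(split, begin, pos - begin, StandardCharsets.US_ASCII);
    }

    // Fraction of the split consumed so far, clamped to 1.0.
    public float getProgress() {
        if (split.length == 0) {
            return 1.0f;
        }
        return Math.min(1.0f, pos / (float) split.length);
    }

    public static void main(String[] args) {
        byte[] data = "<a><b><c><d>".getBytes(StandardCharsets.US_ASCII);
        CountingTagReader reader = new CountingTagReader(data);
        while (reader.nextTag() != null) {
            System.out.println(reader.getProgress());
        }
    }
}
```

The design choice is simply to keep the consumed-byte counter inside the reader, so that progress is available whenever the caller asks for it.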

The Elephant in the Room: You are invited

2011-11-03 Thread Milind.Bhandarkar
Hello Hadoopers ! Are you attending Hadoop World in New York on November 8? This is your invitation to join Greenplum and other industry innovators in celebrating the Hadoop Movement. The Elephant in the Room - A Hadoop World Party Extravaganza! Enjoy bowling, shooting pool, and a few surprises

Re: Streaming question.

2011-11-03 Thread Dan Young
Praveen, So is the KeyFieldBasedPartitioner broken in the current release (0.21 or 0.20.x)? The bug link you reference refers to a fix in 0.22. Is there anywhere I could download 0.22 to try this out? What I really need to do is to have all the keys for a given group written out to separate par

Re: Streaming question.

2011-11-03 Thread Dan Young
Hello Praveen, I'm using 0.20.2. I can try it with 0.21 this morning when I get into the office. Regards, Dan On Nov 2, 2011 11:47 PM, "Praveen Sripati" wrote: > Dan, > > It is a known bug (https://issues.apache.org/jira/browse/MAPREDUCE-1888) > which has been identified in the 0.21.0 release. Whic