Re: Is there any way for a single map job to show progress

2011-11-03 Thread Milind.Bhandarkar
By implementing getProgress(). The problem with LineRecordReader is this: if (codec != null) { in = new LineReader(codec.createInputStream(fileIn), job); end = Long.MAX_VALUE; } And getProgress() is: Math.min(1.0f, (pos - start) / (float)(end - start)); Afte
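To make the arithmetic in the snippet above concrete, here is a minimal standalone sketch (plain Java, no Hadoop dependency). It reuses the quoted getProgress() formula and compares a split whose end offset is known with one where end has been set to Long.MAX_VALUE, as happens in the codec branch. The class name ProgressSketch, the 64 MB and 128 MB figures, and the end == start guard are assumptions made for illustration, not taken from the Hadoop source.

```java
// Hypothetical demo class; only the formula inside progress() comes from the quoted snippet.
class ProgressSketch {

    // start, pos and end are byte offsets, as in the quoted getProgress().
    static float progress(long start, long pos, long end) {
        if (end == start) {
            return 0.0f; // guard added for this sketch to avoid dividing by zero
        }
        return Math.min(1.0f, (pos - start) / (float) (end - start));
    }

    public static void main(String[] args) {
        long start = 0L;
        long pos = 64L * 1024 * 1024; // 64 MB consumed so far (illustrative figure)

        // Known end offset: the ratio is meaningful.
        System.out.println("known end:      " + progress(start, pos, 128L * 1024 * 1024));

        // end = Long.MAX_VALUE, as in the codec branch: the ratio is effectively zero.
        System.out.println("Long.MAX_VALUE: " + progress(start, pos, Long.MAX_VALUE));
    }
}
```

With a known end the task reports half done; with end = Long.MAX_VALUE the same position yields a value so close to zero that the task appears to make no progress, however much input it has consumed.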

Re: Is there any way for a single map job to show progress

2011-11-03 Thread Milind.Bhandarkar
Individual map task progress indicates what percentage of the input chunk the map task has consumed so far. However, feeding this information to the framework is the responsibility of the record reader. * Milind * From: Steve Lewis mailto:lordjoe2...@gmail.com>> R
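As a rough illustration of a reader that owns its progress reporting, the sketch below counts the bytes it has handed out and divides by the total size of its chunk. It is plain Java over an in-memory byte array, not a Hadoop RecordReader; the class CountingLineReader and its method names are hypothetical, and it assumes ASCII input with '\n' line endings.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical reader for illustration only; it does not implement any Hadoop interface.
class CountingLineReader {
    private final byte[] data; // the whole "input chunk"
    private int consumed = 0;  // bytes consumed so far, including newlines

    CountingLineReader(byte[] data) {
        this.data = data;
    }

    // Returns the next line without its trailing newline, or null at end of input.
    String nextLine() {
        if (consumed >= data.length) {
            return null;
        }
        int lineStart = consumed;
        while (consumed < data.length && data[consumed] != '\n') {
            consumed++;
        }
        String line = new String(data, lineStart, consumed - lineStart, StandardCharsets.US_ASCII);
        if (consumed < data.length) {
            consumed++; // step over the newline
        }
        return line;
    }

    // Fraction of the chunk consumed so far, in [0, 1].
    float getProgress() {
        if (data.length == 0) {
            return 1.0f; // nothing to read, so treat as complete
        }
        return Math.min(1.0f, consumed / (float) data.length);
    }

    public static void main(String[] args) {
        byte[] chunk = "a\nbb\nccc\n".getBytes(StandardCharsets.US_ASCII);
        CountingLineReader reader = new CountingLineReader(chunk);
        String line;
        while ((line = reader.nextLine()) != null) {
            System.out.println(line + " -> " + reader.getProgress());
        }
    }
}
```

The point of the design is that only the reader knows both how much it has consumed and how large its chunk is, so the progress figure has to be computed there rather than by the caller.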

The Elephant in the Room: You are invited

2011-11-03 Thread Milind.Bhandarkar
Hello Hadoopers ! Are you attending Hadoop World in New York on November 8? This is your invitation to join Greenplum and other industry innovators in celebrating the Hadoop Movement. The Elephant in the Room - A Hadoop World Party Extravaganza! Enjoy bowling, shooting pool, and a few surprises

Re: Performance of mappers

2011-08-05 Thread Milind.Bhandarkar
No, I was wondering if you are specifying -cacheArchive or -cacheFile. These are fetched by the tasktracker prior to task startup, and can delay task launch. * Milind * --- Milind Bhandarkar Greenplum Labs, EMC (Disclaimer: Opinions expressed in this email are those of the author, and do no

Re: Performance of mappers

2011-08-05 Thread Milind.Bhandarkar
Iman, Are you using cache archives? If yes, what's the size of the cache archive? - milind

Re: Reducer Run on Which Machine?

2011-08-05 Thread Milind.Bhandarkar
Arun, As we discussed several years ago, there are many use cases where control over reducer scheduling would be beneficial. Suhendry, Currently it is not possible to specify hints for reducer scheduling, so patches welcome. - milind

Re: How does Hadoop reuse the objects?

2011-08-04 Thread Milind.Bhandarkar
HADOOP-2399 has caused a lot of problems for users so far, and the saga still continues :-( I remember spending 18 straight hours in 2008 with a user debugging this issue. - milind