By implementing getProgress().
The problem with LineRecordReader is this:
if (codec != null) {
  in = new LineReader(codec.createInputStream(fileIn), job);
  end = Long.MAX_VALUE;
}
And getProgress() is:
Math.min(1.0f, (pos - start) / (float)(end - start));
After end is set to Long.MAX_VALUE for compressed input, (end - start) is
effectively infinite, so getProgress() reports a value near zero for the
entire life of the task.
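A minimal sketch of a record reader that reports progress from the position in
the compressed stream instead (class and field names are illustrative, and it
assumes a non-splittable codec so the split spans the whole file):

import java.io.IOException;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;
import org.apache.hadoop.mapred.FileSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.util.LineReader;

/**
 * Sketch: a line record reader whose getProgress() is driven by the position
 * in the *compressed* stream, so it does not sit near 0 the way the stock
 * reader does once end == Long.MAX_VALUE.
 */
public class CompressedLineRecordReader implements RecordReader<LongWritable, Text> {

  private final FSDataInputStream fileIn;  // raw (compressed) stream
  private final LineReader in;             // reads decompressed lines
  private final long fileLength;           // compressed file length in bytes
  private long pos = 0;                    // uncompressed offset, used as the key

  public CompressedLineRecordReader(JobConf job, FileSplit split) throws IOException {
    Path file = split.getPath();
    FileSystem fs = file.getFileSystem(job);
    fileLength = fs.getFileStatus(file).getLen();
    fileIn = fs.open(file);
    CompressionCodec codec = new CompressionCodecFactory(job).getCodec(file);
    // Assumes the split covers the whole file (true for non-splittable codecs).
    in = new LineReader(codec != null ? codec.createInputStream(fileIn) : fileIn, job);
  }

  public boolean next(LongWritable key, Text value) throws IOException {
    key.set(pos);
    int bytesRead = in.readLine(value);
    pos += bytesRead;
    return bytesRead != 0;
  }

  public LongWritable createKey() { return new LongWritable(); }
  public Text createValue() { return new Text(); }
  public long getPos() throws IOException { return fileIn.getPos(); }
  public void close() throws IOException { in.close(); }

  // Fraction of compressed bytes consumed so far, instead of the
  // uncompressed-offset formula that degenerates to ~0.
  public float getProgress() throws IOException {
    if (fileLength == 0) {
      return 0.0f;
    }
    return Math.min(1.0f, fileIn.getPos() / (float) fileLength);
  }
}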
Individual map task progress indicates what percentage of the input chunk has
been consumed so far by the map task. However, feeding this information to the
framework is the responsibility of the record reader.
* Milind
*
From: Steve Lewis <mailto:lordjoe2...@gmail.com>
Hello Hadoopers !
Are you attending Hadoop World in New York on November 8?
This is your invitation to join Greenplum and other industry innovators in
celebrating the Hadoop Movement.
The Elephant in the Room - A Hadoop World Party Extravaganza!
Enjoy bowling, shooting pool, and a few surprises.
No, I was wondering if you are specifying -cacheArchive or -cacheFile. These
are fetched by the tasktracker prior to task startup, and can delay task launch.
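For reference, the Java-side equivalent of those streaming flags is the
DistributedCache API; a hedged sketch (the HDFS paths and class name below are
made up):

import java.net.URI;

import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.mapred.JobConf;

public class CacheSetup {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(CacheSetup.class);

    // Equivalent of streaming's -cacheFile: a single file the tasktracker
    // localizes before the job's first task starts on that node.
    DistributedCache.addCacheFile(
        new URI("hdfs:///user/example/lookup.txt#lookup"), conf);

    // Equivalent of -cacheArchive: the archive is copied *and* unpacked on
    // the tasktracker, so a large archive adds noticeably to task launch time.
    DistributedCache.addCacheArchive(
        new URI("hdfs:///user/example/dictionaries.zip#dict"), conf);

    // Make the #lookup / #dict fragments appear as symlinks in the task's
    // working directory.
    DistributedCache.createSymlink(conf);

    // ... set mapper/reducer/paths here, then submit with JobClient.runJob(conf).
  }
}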
* Milind
*
---
Milind Bhandarkar
Greenplum Labs, EMC
Iman,
Are you using cache archives? If yes, what's the size of the cache archive?
- milind
---
Milind Bhandarkar
Greenplum Labs, EMC
Arun,
As we had discussed several years ago, there are many use cases where
reducer scheduling control will be beneficial.
Suhendry,
Currently it is not possible to specify hints for reducer scheduling, so
patches are welcome.
- milind
---
Milind Bhandarkar
Greenplum Labs, EMC
HADOOP-2399 has caused a lot of problems for users so far, and the saga
still continues :-(
I remember spending 18 straight hours in 2008 with a user debugging this
issue.
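(If HADOOP-2399 is the change that made the framework reuse the key and value
objects handed to the combiner and reducer, the classic symptom looks like the
hypothetical reducer below; class and variable names are mine, not from the
thread.)

import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// Hypothetical reducer showing the object-reuse pitfall: the framework hands
// back the same Text instance on every call to values.next(), so buffering
// references yields N copies of the last value unless you clone.
public class BufferingReducer extends MapReduceBase
    implements Reducer<Text, Text, Text, Text> {

  public void reduce(Text key, Iterator<Text> values,
                     OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    List<Text> buffered = new ArrayList<Text>();
    while (values.hasNext()) {
      buffered.add(values.next());            // bug: stores the reused object
      // buffered.add(new Text(values.next())); // fix: copy before buffering
    }
    // Every element of 'buffered' now holds the last value of the group.
    for (Text value : buffered) {
      output.collect(key, value);
    }
  }
}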
- milind
---
Milind Bhandarkar
Greenplum Labs, EMC