I think the reason for hard-coding it is to protect the Hadoop cluster
from high network traffic. If the value is too small, there is
too much network traffic between the Map/Reduce tasks and MRAppMaster.

Please see https://issues.apache.org/jira/browse/MAPREDUCE-4381 also.

That's why you need to be very careful if you really want to change
the value.

The source code is at
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Task.java (lines 532-533)

  /** The number of milliseconds between progress reports. */
  public static final int PROGRESS_INTERVAL = 3000;

Regards,
Akira

(2014/04/16 22:17), Dharmesh Kakadia wrote:
Hi Akira,

Thanks for the quick reply.
Any particular reason for hard-coding it? Is there a workaround? I want to
be able to get the counters at as fine a granularity as possible. Also, can
you point me to the relevant source code? I am willing to take the issue and
contribute if required.

Thanks,
Dharmesh


On Wed, Apr 16, 2014 at 3:14 PM, Akira AJISAKA
<ajisa...@oss.nttdata.co.jp> wrote:

Moved mapreduce-dev@ to Bcc.

Hi Dharmesh,

The parameter sets the interval at which the client polls
MRAppMaster for progress, not the interval at which the Map/Reduce
tasks report. The tasks send their progress (including the counter
information) to MRAppMaster every 3000 milliseconds, which is hard-coded.

That's why a sudden big change in counter values happens
even if the parameter is set to a small value.
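
To illustrate the effect, here is a minimal simulation sketch, not Hadoop
code. It assumes a task that processes records at a steady rate (the rate of
100 records per second is made up), reports at the hard-coded 3000 ms
interval, and a client that polls every 1000 ms. The class and method names
are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class CounterPollingSketch {
    /** Mirrors the hard-coded 3000 ms task-to-MRAppMaster report interval. */
    static final int REPORT_INTERVAL_MS = 3000;

    /**
     * Returns the counter values a client would observe when polling every
     * pollIntervalMs for durationMs. The visible value only changes at
     * report boundaries; recordsPerSecond is an assumed, constant rate.
     */
    static List<Long> observe(int pollIntervalMs, int durationMs, long recordsPerSecond) {
        List<Long> seen = new ArrayList<>();
        for (int now = 0; now <= durationMs; now += pollIntervalMs) {
            // Time of the most recent report at or before this poll.
            int lastReport = (now / REPORT_INTERVAL_MS) * REPORT_INTERVAL_MS;
            seen.add(lastReport / 1000L * recordsPerSecond);
        }
        return seen;
    }

    public static void main(String[] args) {
        // Poll every 1000 ms for 9 seconds.
        System.out.println(observe(1000, 9000, 100));
    }
}
```

Under these assumptions it prints [0, 0, 0, 300, 300, 300, 600, 600, 600, 900].
Polling faster than the report interval only repeats the last reported value.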

Regards,
Akira


(2014/04/16 15:42), Dharmesh Kakadia wrote:

Hi Akira,

Thanks for the reply, but as I understand it, this is the interval of console
counter printing. What I am trying to do is

while (!job.isComplete()) {
    Counters counters = job.getCounters();
    // do some processing on the counters
}

Now this is running fine, but I get the same counter values
repeatedly and then suddenly a big change in counter values.
For example, getcounters for REDUCE_INPUT_RECORDS returns values like

0
0
..
0
280
280
...
280
516
516
...
516

etc.

I want to get finer-grained values, instead of jumping directly from 280 to
516.
Does that make sense? mapreduce.client.progressmonitor.pollinterval does not
seem to affect it. Is there any workaround?
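
One client-side option, sketched below, is to process a value only when it
differs from the previous poll. This does not make the values any finer; it
only avoids reprocessing duplicates. The class name, method name, and the
LongSupplier stand-in are hypothetical. In a real job the supplier would wrap
something like job.getCounters().findCounter(TaskCounter.REDUCE_INPUT_RECORDS).getValue().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

class ChangeOnlyPoller {
    /**
     * Polls counter until done returns true and records a value only when
     * it differs from the previously seen one.
     */
    static List<Long> pollChanges(LongSupplier counter, BooleanSupplier done, long sleepMs) {
        List<Long> changes = new ArrayList<>();
        long last = -1; // counter values are non-negative, so -1 means "nothing seen yet"
        while (!done.getAsBoolean()) {
            long value = counter.getAsLong();
            if (value != last) {
                changes.add(value);
                last = value;
            }
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop polling
                break;
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        // Stand-in for repeated counter reads, using the values from this message.
        long[] samples = {0, 0, 0, 280, 280, 280, 516, 516};
        int[] next = {0};
        List<Long> changes =
            pollChanges(() -> samples[next[0]++], () -> next[0] >= samples.length, 0);
        System.out.println(changes);
    }
}
```

With the sample values above, the loop records 0, 280, and 516 once each.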

Thanks,
Dharmesh




On Tue, Apr 15, 2014 at 7:51 PM, Akira AJISAKA
<ajisa...@oss.nttdata.co.jp> wrote:

  Moved to u...@hadoop.apache.org.

You can configure the interval by setting the
"mapreduce.client.progressmonitor.pollinterval" parameter.
The default value is 1000 ms.

For more details, please see
http://hadoop.apache.org/docs/stable/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml.

Regards,
Akira


(2014/04/15 15:29), Dharmesh Kakadia wrote:

  Hi,

What is the update interval of the built-in framework counters? Is that
configurable?
I am trying to collect very fine-grained information about the job
execution and am using counters for that. It would be great if someone could
point me to documentation/code for it. Thanks in advance.

Thanks,
Dharmesh







