On a small cluster, I think a value lower than 3 seconds would actually help reduce job startup time a little.
https://issues.apache.org/jira/browse/MAPREDUCE-1266

The issue got stalled a bit. If you want it, pipe up on the JIRA :)
Especially if you have hard data indicating this is a good idea (I never
had the time to really prove it).

-Todd

On Tue, Feb 9, 2010 at 10:21 AM, E. Sammer <e...@lifeless.net> wrote:
> On 2/9/10 11:52 AM, ChingShen wrote:
>>
>> Hi,
>>
>> I have a question about HEARTBEAT_INTERVAL.
>> Why is the default HEARTBEAT_INTERVAL value 3 rather than 2 or 1?
>> Any resources?
>
> Shen:
>
> While I don't have a good answer for why the number 3 was chosen (actually,
> I think it's 5 seconds on heartbeats and the 3 seconds is how often a task
> tracker thread checks if progress is being made or something like that), I
> can tell you that there's network chatter caused by the heartbeat. You
> wouldn't want the heartbeat to be any faster, as you would unnecessarily
> cause network congestion and force the job tracker to do additional
> (possibly unnecessary) work. As the cluster grows, the heartbeat interval is
> increased, leading to even less frequent check-ins, to attempt to mitigate
> the congestion / high concurrency on the JT.
>
> One of the downsides to this is that tasks aren't given to task trackers as
> quickly as they could be, but there are probably better ways of decreasing
> the amount of time required to hand out work than simply increasing
> the heartbeat rate.
>
> Keep in mind that most Hadoop jobs run for long periods of time, so the
> slight delay in handing out tasks isn't a huge problem, and 3 to 5 seconds
> is more than sufficient to know that a task tracker is alive and healthy.
>
> Hope this helps.
> --
> Eric Sammer
> e...@lifeless.net
> http://esammer.blogspot.com
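
To make the trade-off in this thread concrete, here is a rough sketch of an interval that grows with cluster size but never drops below a floor, plus the heartbeat load it puts on the job tracker. This is not Hadoop's code: the function names, the one-extra-second-per-50-trackers step, and the 3000 ms floor are assumptions for illustration only, so check the constants in your own version before relying on any numbers.

```python
import math

# Assumed values for illustration only; not taken from the Hadoop source.
MIN_INTERVAL_MS = 3000   # the 3-second floor discussed in this thread
TRACKERS_PER_STEP = 50   # hypothetical: trackers per extra second of interval


def heartbeat_interval_ms(num_trackers,
                          min_interval_ms=MIN_INTERVAL_MS,
                          trackers_per_step=TRACKERS_PER_STEP):
    """Interval that grows with cluster size but never drops below the floor."""
    scaled = 1000 * math.ceil(num_trackers / trackers_per_step)
    return max(scaled, min_interval_ms)


def heartbeats_per_second(num_trackers, interval_ms):
    """Approximate number of heartbeats the job tracker handles per second."""
    return num_trackers * 1000.0 / interval_ms


if __name__ == "__main__":
    # Compare the default floor with a lowered 1-second floor.
    for trackers in (10, 150, 500):
        for floor in (3000, 1000):
            interval = heartbeat_interval_ms(trackers, min_interval_ms=floor)
            load = heartbeats_per_second(trackers, interval)
            print(f"{trackers} trackers, floor {floor} ms: "
                  f"interval {interval} ms, {load:.1f} heartbeats/s")
```

Under these assumptions the floor is what decides the interval on a 10-node cluster, so dropping it from 3000 ms to 1000 ms takes the job tracker from roughly 3.3 to 10 heartbeats per second. On a 500-node cluster the scaled interval already exceeds either floor, so lowering it changes nothing.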