tasktracker heartbeat interval

Doug Cutting Thu, 13 Apr 2006 11:01:53 -0700

Currently, pseudo-distributed mode is *much* slower than "local" mode. It makes sense that running a trivial task on 100 nodes might take longer than running it standalone, but running it on one node over localhost should not be that much slower. In part this is due to task jvm startup time, but I think the larger part of the blame is heartbeat intervals.

The tasktracker polls for new tasks only every heartbeat interval. When running small jobs in small clusters, this interval dominates performance. But in larger clusters a short heartbeat interval would overload the jobtracker. Perhaps the tasktracker should instead get its heartbeat interval from the jobtracker. The jobtracker could return a small interval when few tasktrackers are known, and a larger interval when lots of tasktrackers are known. This would make small clusters more responsive.
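
To make the idea concrete, here is a rough sketch of how the jobtracker might pick the interval it hands back. Everything in it is an assumption for illustration: the class name HeartbeatPolicy, the method intervalMillis, and the constants (a 200 ms floor, a 10 s ceiling, and a budget of 50 heartbeats per second across the cluster) are hypothetical and not taken from the existing code.

```java
// Hypothetical sketch: the jobtracker chooses a heartbeat interval based on
// how many tasktrackers it currently knows about. All names and constants
// here are illustrative assumptions, not existing Hadoop code.
final class HeartbeatPolicy {
    // Assumed lower bound, so a tiny cluster still does not spin.
    static final long MIN_INTERVAL_MS = 200;
    // Assumed upper bound, so a huge cluster still notices lost trackers.
    static final long MAX_INTERVAL_MS = 10_000;
    // Assumed total heartbeat rate the jobtracker is willing to absorb.
    static final int TARGET_HEARTBEATS_PER_SECOND = 50;

    // Returns the interval (in milliseconds) the jobtracker would send back
    // to a tasktracker in its heartbeat response.
    static long intervalMillis(int knownTrackers) {
        if (knownTrackers < 0) {
            throw new IllegalArgumentException("knownTrackers must be >= 0");
        }
        // If every tracker polls once per interval, the jobtracker sees
        // knownTrackers / interval heartbeats per unit time. Solving for the
        // interval that keeps that at the target rate gives:
        long scaled = 1000L * knownTrackers / TARGET_HEARTBEATS_PER_SECOND;
        // Clamp to the assumed bounds.
        return Math.max(MIN_INTERVAL_MS, Math.min(MAX_INTERVAL_MS, scaled));
    }
}
```

With these assumed numbers, a single tasktracker would poll every 200 ms, 100 tasktrackers every 2 seconds, and 1000 or more would be capped at 10 seconds. The tasktracker would simply sleep for whatever interval the last response carried.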


One could use a similar mechanism in dfs.

This is a very low priority issue that I just wanted to get out of my head.

Doug
