Re: Nutch fetcher, all map tasks pending except one

Dennis Kubes Thu, 18 Jun 2009 07:32:25 -0700

There is a mapred.tasktracker.map.tasks.maximum configuration variable, which defaults to 2 for Hadoop in distributed mode. It defines the maximum number of map tasks that run concurrently per tasktracker. Even with that, though, you should have 1 task per machine/tasktracker; it seems like you only have one running in total? Any more information about your setup or what you are seeing?

To answer your previous question, fetcher.threads.fetch defines the number of fetcher threads inside each map task. So say you have 2 map tasks on each tasktracker, 10 tasktrackers, and 20 threads per map task: 2 * 10 * 20 = 400 concurrent fetchers.
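
To make the arithmetic above concrete, here is a minimal Python sketch. The function and parameter names are made up for illustration and are not part of Nutch or Hadoop; the numbers are the ones from the example above.

```python
def total_fetcher_threads(map_tasks_per_tracker, task_trackers, threads_per_task):
    """Upper bound on concurrent fetcher threads across the cluster.

    map_tasks_per_tracker: value of mapred.tasktracker.map.tasks.maximum
    task_trackers:         number of tasktracker machines
    threads_per_task:      value of fetcher.threads.fetch
    """
    return map_tasks_per_tracker * task_trackers * threads_per_task


# 2 map tasks per tasktracker, 10 tasktrackers, 20 threads per map task
total = total_fetcher_threads(2, 10, 20)
print(total)  # 400
```

If only one map task runs at a time, as in the situation described in this thread, the same formula gives 1 * 1 * 20 = 20 concurrent fetchers.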


Dennis

caezar wrote:

From many domains. It generates 15 tasks, as configured. Each task is
executed, but one by one; no tasks run simultaneously.

Dennis Kubes-2 wrote:
Are you fetching URLs from a random set or all from a single domain? If all are from a single domain (including subdomains), then the partitioner for the fetcher will put them all into a single map task.
Dennis
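
A rough Python sketch of the idea behind the partitioning described above. This is not Nutch's actual partitioner code: the helper names, the "last two host labels" domain rule, and the CRC32 hash are all assumptions chosen only to illustrate why URLs from one domain end up in one map task.

```python
import zlib
from urllib.parse import urlparse


def domain_of(url):
    """Crude domain extraction (assumption): keep the last two host labels,
    so www.example.com and news.example.com both map to example.com."""
    host = urlparse(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def partition_for(url, num_tasks):
    """Pick a map task by hashing the domain, so every URL of a domain
    lands in the same task. CRC32 is a stand-in hash for this sketch."""
    return zlib.crc32(domain_of(url).encode("utf-8")) % num_tasks


same_domain = [
    "http://www.example.com/a",
    "http://news.example.com/b",
    "http://example.com/c",
]

# Every URL from the one domain gets the same partition number,
# so only one of the 15 map tasks would have any work to do.
partitions = {partition_for(u, 15) for u in same_domain}
print(len(partitions))  # 1
```

With URLs from many different domains the hash values differ, so the URLs would normally spread across several of the 15 tasks.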
