There is a mapred.tasktracker.map.tasks.maximum configuration variable, which defaults to 2 for Hadoop in distributed mode. It defines the maximum number of map tasks that can run concurrently on each tasktracker. Even with that, though, you should have 1 task running per machine/tasktracker; it seems like you only have one running in total? Can you give any more information about your setup or what you are seeing?

To answer your previous question, fetcher.threads.fetch defines the number of fetcher threads inside each map task. So say you have 2 map tasks on each tasktracker, 10 tasktrackers, and 20 threads per task: 2 * 10 * 20 = 400 concurrent fetchers.
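
The arithmetic above can be written out as a small Python sketch. The function and parameter names are made up for illustration and are not part of Hadoop or Nutch; the three inputs stand for mapred.tasktracker.map.tasks.maximum, the number of tasktrackers, and fetcher.threads.fetch.

```python
def total_concurrent_fetchers(map_tasks_per_tracker: int,
                              task_trackers: int,
                              threads_per_task: int) -> int:
    """Upper bound on fetcher threads running at once across the cluster.

    Hypothetical helper: it assumes every tasktracker is running its
    maximum number of map tasks and every task uses all of its threads.
    """
    return map_tasks_per_tracker * task_trackers * threads_per_task


if __name__ == "__main__":
    # 2 map tasks per tasktracker, 10 tasktrackers, 20 threads per task
    print(total_concurrent_fetchers(2, 10, 20))
```

If only one map task runs at a time in the whole cluster, the same formula gives 1 * 1 * 20 = 20 concurrent fetchers.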

Dennis

caezar wrote:
From many domains. It generates 15 tasks, as configured. Each task is
executed one by one; no tasks run simultaneously.

Dennis Kubes-2 wrote:
Are you fetching URLs from a random set, or all from a single domain? If they all come from a single domain (including subdomains), then the partitioner for the fetcher will put them all into a single map task.
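
To illustrate why that happens, here is a rough Python sketch of domain-based partitioning. It is not Nutch's actual partitioner: the function names, the CRC32 hash, and the naive "last two labels" domain rule (which ignores suffixes such as co.uk) are all assumptions made for the example.

```python
import zlib
from urllib.parse import urlsplit


def domain_key(url: str) -> str:
    """Naive domain: the last two labels of the hostname (an assumption)."""
    host = urlsplit(url).hostname or ""
    labels = host.lower().split(".")
    return ".".join(labels[-2:])


def partition_for(url: str, num_partitions: int) -> int:
    """Map a URL to a partition using a stable hash of its domain key."""
    return zlib.crc32(domain_key(url).encode("utf-8")) % num_partitions


def count_busy_partitions(urls, num_partitions: int) -> int:
    """Number of distinct partitions (map tasks) that receive any URL."""
    return len({partition_for(u, num_partitions) for u in urls})


if __name__ == "__main__":
    same_domain = [
        "http://example.com/a",
        "http://www.example.com/b",
        "http://blog.example.com/c",
    ]
    # Every URL shares the key "example.com", so only one partition has work.
    print(count_busy_partitions(same_domain, 15))
```

With URLs spread over many domains, the keys hash to different partitions, so more than one map task can receive work.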

Dennis

