Java threads do take advantage of multiple cores, and the fetcher is
multithreaded. Running several fetcher tasks on the same machine will
also use more of the CPU. Even so, with 50 threads on one machine the
utilization may not climb much higher, depending on the websites being
crawled. Most of the time in fetching is spent waiting on DNS lookups
and on responses from the sites being fetched, not on computation.
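
To put rough numbers on the point about waiting, here is a minimal
back-of-the-envelope sketch. It is not Nutch code: the class name, the
method, and the 20 ms / 1980 ms per-fetch timings are all hypothetical,
and the model simply assumes each thread is on the CPU only for the
fraction of a fetch that is not spent waiting.

```java
public class FetcherUtilizationSketch {

    /**
     * Rough estimate of overall CPU utilization (0.0 to 1.0) for a pool of
     * fetch threads. Assumption: each fetch costs cpuMillisPerFetch of real
     * computation and waitMillisPerFetch of idle waiting (DNS, network).
     */
    static double estimateCpuUtilization(int threads, int cores,
                                         double cpuMillisPerFetch,
                                         double waitMillisPerFetch) {
        // Fraction of its lifetime a single thread actually needs a core.
        double busyFraction =
                cpuMillisPerFetch / (cpuMillisPerFetch + waitMillisPerFetch);
        // Total demand, expressed in "cores' worth" of work.
        double demand = threads * busyFraction;
        // Utilization cannot exceed 100% of the available cores.
        return Math.min(1.0, demand / cores);
    }

    public static void main(String[] args) {
        // Hypothetical figures: 50 threads, 2 cores, 20 ms of CPU work
        // and 1980 ms of waiting per fetch.
        double u = estimateCpuUtilization(50, 2, 20.0, 1980.0);
        System.out.printf("Estimated CPU utilization: %.0f%%%n", u * 100);
    }
}
```

With those made-up timings, 50 threads keep a dual-core machine only
about a quarter busy, so adding threads helps only until the remote
sites and DNS become the limit.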
Dennis
Raymond Balmès wrote:
I use a dual-core Intel machine, and I have observed that CPU load during
crawls never gets above the 50% mark, even though I used -threads 50. Does
Nutch take advantage of multiple cores?
Am I missing a setting somewhere?
-Ray-