Actually, even if I set 100 threads it does not go any faster. I have a
30 Mbit/s fiber internet connection, so that shouldn't be the problem.

I thought that with more threads I could fetch more sites in parallel and so
use more of the bandwidth and the CPU, so the time spent waiting on DNS
should be hidden.
Or do I need to run multiple fetchers in parallel? I'm not sure how to do
that and merge the results back at the end.
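
To make clear what I expected, here is a minimal sketch of the idea. It is
not Nutch code: Thread.sleep stands in for the time a fetch spends waiting on
DNS or a remote server, and the class name, method names, and the 200 ms
figure are all made up for illustration. If the waits really overlap, the
same number of simulated fetches should finish sooner with more threads,
while the CPU stays mostly idle because the threads are only waiting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class WaitBoundDemo {

    // Hypothetical stand-in for one fetch: the thread only waits,
    // as it would on a DNS lookup or a slow web server.
    static void simulatedFetch(long waitMillis) throws InterruptedException {
        Thread.sleep(waitMillis);
    }

    // Runs `tasks` simulated fetches on a pool of `threads` workers and
    // returns the elapsed wall-clock time in milliseconds.
    static long runFetches(int threads, int tasks, long waitMillis) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        List<Future<?>> pending = new ArrayList<>();
        for (int i = 0; i < tasks; i++) {
            pending.add(pool.submit(() -> {
                simulatedFetch(waitMillis);
                return null;
            }));
        }
        // Block until every simulated fetch has finished.
        for (Future<?> f : pending) {
            f.get();
        }
        pool.shutdown();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) throws Exception {
        // 200 ms per fetch is an arbitrary assumption, not a measurement.
        System.out.println("1 thread:   " + runFetches(1, 20, 200) + " ms");
        System.out.println("20 threads: " + runFetches(20, 20, 200) + " ms");
    }
}
```

In this toy case the extra threads help because nothing limits how many waits
can overlap. My crawl does not behave like this, which is why I suspect a
setting or a limit somewhere else.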

-Ray-

2009/4/28 Dennis Kubes <[email protected]>

> Java Threads do take advantage of multiple cores.  The fetcher does use
> multiple threads.  Also having multiple fetcher tasks on a single machine
> will utilize more of the CPU.  Even with 50 threads on a single machine,
> depending on the websites being crawled the utilization might not get that
> much higher.  Much of the time spent in fetching is spent waiting on DNS and
> the websites being fetched.
>
> Dennis
>
>
> Raymond Balmès wrote:
>
>> I use a dual-core Intel. I observed that the crawl never gets above the
>> 50% CPU load mark, despite the fact that I used -threads 50... Does Nutch
>> take advantage of multiple cores?
>> Did I miss a setting somewhere?
>>
>> -Ray-
>>
>>
