Currently I use the default setting for generate.max.per.host, so there is no 
limit on the fetchlist size per host. This might be the cause; I'll run a test. 
I have one question about Fetcher2:
I began my fetch with 200 threads. After some time, 199 active threads were 
left, 198 of them spinWaiting. What is the difference between the 
one thread that died and the 198 spinWaiting threads?

In my understanding, if a thread "A" starts by fetching website "A", then 
thread A can only be used to fetch website A during its whole life cycle. If 
thread A takes too long (more than 0.5s) to finish fetching and parsing 
a page, thread A will be set to spinWaiting. Thread A will die if no pages 
are left for website A. Please refer to the code below:
if (feeder.isAlive() || fetchQueues.getTotalSize() > 0) {
  LOG.debug(getName() + " spin-waiting ...");
  // spin-wait.
  spinWaiting.incrementAndGet();
  try {
    Thread.sleep(500);
  } catch (Exception e) {
  }
  spinWaiting.decrementAndGet();
  continue;
} else {
  // all done, finish this thread
  return;
}

Is my understanding right?
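
To make the control flow of the snippet above easier to experiment with, here is a small standalone sketch of the same spin-wait pattern. It is not Nutch code: the class name SpinWaitDemo, the single shared queue, the runDemo method, and the shortened sleep times are all assumptions made for illustration, and it says nothing about how Fetcher2 assigns hosts to threads. It only shows a worker that sleeps while a feeder is alive or items remain, and exits once both conditions are false.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class SpinWaitDemo {

    /** Runs the demo and returns how many items the workers processed. */
    static int runDemo(int workerCount, int itemCount) {
        // Hypothetical stand-ins for the shared state seen in the snippet.
        final ConcurrentLinkedQueue<String> queue = new ConcurrentLinkedQueue<>();
        final AtomicInteger spinWaiting = new AtomicInteger(0);
        final AtomicInteger processed = new AtomicInteger(0);

        // Feeder: adds items slowly, then terminates.
        final Thread feeder = new Thread(() -> {
            for (int i = 0; i < itemCount; i++) {
                queue.add("item-" + i);
                try {
                    Thread.sleep(5);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        });

        Runnable worker = () -> {
            while (true) {
                String item = queue.poll();
                if (item != null) {
                    processed.incrementAndGet(); // stands in for fetch + parse
                    continue;
                }
                if (feeder.isAlive() || !queue.isEmpty()) {
                    // Nothing available right now, but more may come: spin-wait.
                    spinWaiting.incrementAndGet();
                    try {
                        Thread.sleep(20);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    } finally {
                        spinWaiting.decrementAndGet();
                    }
                } else {
                    // Feeder finished and queue drained: this thread is done.
                    return;
                }
            }
        };

        // Start the feeder first so feeder.isAlive() is true when workers check it.
        feeder.start();
        Thread[] workers = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++) {
            workers[i] = new Thread(worker, "worker-" + i);
            workers[i].start();
        }
        try {
            feeder.join();
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println("processed=" + runDemo(4, 20));
    }
}
```

In this sketch a worker counted in spinWaiting is still alive and simply idle for one sleep interval; a worker only ends when it takes the else branch.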


----- Original Message ----- 
From: "Sami Siren" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Tuesday, April 03, 2007 12:29 AM
Subject: Re: Fetcher2 too many spinWaiting, How to tune?


> hi,
> 
> 
> qi wu wrote:
>> Hi, I am using Fetcher2 with 200 threads started. I get a satisfactory
>> speed (about 20 pages/s) at the beginning, but after no more
>> than one hour there are many spinWaiting threads. Where might the
>> bottleneck be? Network, memory, or somewhere else? Could you also give me
>> some hints on how to get more detailed debug info?
> 
> Not specific to fetcher2, but how are the pages distributed among
> different hosts in the fetchlist? Have you configured a reasonable setting
> for generate.max.per.host in nutch conf?
> 
> If you generate too many pages for too few hosts there's no way
> fetcher|fetcher2 can fetch them fast unless you make it non polite.
> 
> --
> Sami Siren
> 
>
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
