Hi,

Regarding politeness, 3 threads per queue is not really polite :)

Cheers

-----Original message-----
> From: jc <jvizu...@gmail.com>
> Sent: Fri 01-Mar-2013 15:08
> To: user@nutch.apache.org
> Subject: Re: a lot of threads spinwaiting
>
> Hi Roland and lufeng,
>
> Thank you very much for your replies. I already tested lufeng's advice, with
> results pretty much as expected.
>
> By the way, my Nutch installation is based on version 2.1, with HBase as
> crawldb storage.
>
> Roland, maybe the fetcher.server.delay param has something to do with that as
> well. I set it to 3 secs; would setting it to 0 be impolite?
>
> All the info you provided has helped me a lot. Only one issue remains
> unfixed: there are more than 60 URLs from different hosts in my seed file, and
> only 20 queues. It may seem that the other 40 hosts have no more URLs to
> generate, but I really haven't seen any URL coming from those hosts since
> the creation of the crawldb.
>
> Based on my limited experience, the following params should allow 60
> queues for my vertical crawl. Am I missing something?
>
> topN = 1 million
> fetcher.threads.per.queue = 3
> fetcher.threads.per.host = 3 (just in case, I remember you told me to use
> per.queue instead)
> fetcher.threads.fetch = 200
> seed urls of different hosts = 60 or more (regex-urlfilter.txt allows only
> urls from these hosts, they're all there, I checked)
> crawldb record count > 1 million
>
> Thanks again for all your help
>
> Regards,
> JC
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/a-lot-of-threads-spinwaiting-tp4043801p4043988.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
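
For reference, the arithmetic behind the numbers quoted above can be sketched as follows. This is only a rough illustration, assuming that a fetch queue never has more than fetcher.threads.per.queue threads working on it at once; the function name is hypothetical and is not part of Nutch.

```python
def max_busy_threads(num_queues, threads_per_queue, total_threads):
    """Upper bound on fetcher threads that can be fetching at the same time.

    Assumption: each queue is served by at most `threads_per_queue` threads,
    so the bound is the smaller of the thread pool and queues * per-queue cap.
    """
    return min(total_threads, num_queues * threads_per_queue)


TOTAL_THREADS = 200      # fetcher.threads.fetch from the quoted message
THREADS_PER_QUEUE = 3    # fetcher.threads.per.queue from the quoted message

# Observed situation: only 20 queues are populated.
busy_observed = max_busy_threads(20, THREADS_PER_QUEUE, TOTAL_THREADS)
idle_observed = TOTAL_THREADS - busy_observed

# Expected situation: one queue per seed host, 60 hosts.
busy_expected = max_busy_threads(60, THREADS_PER_QUEUE, TOTAL_THREADS)
idle_expected = TOTAL_THREADS - busy_expected

print(f"20 queues: at most {busy_observed} busy, at least {idle_observed} waiting")
print(f"60 queues: at most {busy_expected} busy, at least {idle_expected} waiting")
```

Under that assumption, 20 queues keep at most 60 of the 200 threads busy, which would be consistent with many threads spin-waiting. Even with all 60 queues populated, at least 20 threads would have nothing to do.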