Ken:
Thank you very much for the info. I applied it to my testing environment
and I could see big changes in my bandwidth utilization. I have tried
it on a simple server and I could get a rather constant 25-29
pages/sec in a vertical crawl. Previously I was getting about 5-7
pages/sec.
Cheers
Zahee
On 6/28/06, Ken Krugler <[EMAIL PROTECTED]> wrote:
Hi Doug,
Did you ever resolve your 0.8 vs 0.7 crawling performance question? I'm
running into a similar problem.
We wound up dramatically increasing the number of threads, which
seemed to help solve the bandwidth utilization problem. With Nut
Ok, this isn't true because it is sorted by using HashComparator. For
some reason the generated list contains some parts which are more or
less sorted by host and some parts that look more "random".
This is consistent with what I am seeing; the Fetcher slowing down for
a while, sometimes coming to a virt
Fetchlist seems to be sorted by URL. This leads to many threads being
Ok, this isn't true because it is sorted by using HashComparator. For
some reason the generated list contains some parts which are more or
less sorted by host and some parts that look more "random".
--
Sami Siren
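One rough way to check whether a fetchlist is clustered by host is to measure how many consecutive entries share a host. The sketch below is only an illustration: the `host_runs` and `hash_order` helpers and the `example` URLs are hypothetical, and ordering by an MD5 of the URL is an assumed stand-in for hash-based sorting, not Nutch's HashComparator.

```python
import hashlib
from itertools import groupby
from urllib.parse import urlsplit


def host_runs(urls):
    """Return the lengths of consecutive runs of URLs that share a host."""
    hosts = (urlsplit(u).hostname for u in urls)
    return [sum(1 for _ in group) for _, group in groupby(hosts)]


def hash_order(urls):
    """Order URLs by an MD5 of the full URL (assumed stand-in, not Nutch code)."""
    return sorted(urls, key=lambda u: hashlib.md5(u.encode("utf-8")).hexdigest())


# Hypothetical fetchlist: 4 hosts with 5 pages each.
urls = [f"http://host{h}.example/page{p}" for h in range(4) for p in range(5)]

# Sorting by URL groups every host's pages together, so runs are long.
print("sorted by URL: ", host_runs(sorted(urls)))
# Hash ordering usually breaks the runs up, but short clusters can remain.
print("ordered by hash:", host_runs(hash_order(urls)))
```

Long runs mean that many neighbouring entries point at the same host, which fits the description above of a list that is partly sorted by host and partly "random".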
+1 for a solution to this pressing issue!
I am seeing the same problem, in my case two symptoms:
1) low fetch speeds
2) crawls end "before their time" with "aborting with xxx hung
threads" error message
I am doing a focused crawl on about 70,000 domains.
crawl.ignore.external.links is set to t
Ken Krugler wrote:
Hi Doug,
Did you ever resolve your 0.8 vs 0.7 crawling performance question? I'm
running into a similar problem.
We wound up dramatically increasing the number of threads, which
seemed to help solve the bandwidth utilization problem. With Nutch 0.7
we were running about
Hi Doug,
Did you ever resolve your 0.8 vs 0.7 crawling performance question? I'm
running into a similar problem.
We wound up dramatically increasing the number of threads, which
seemed to help solve the bandwidth utilization problem. With Nutch
0.7 we were running about 200 threads per crawl
Byron,
Did you ever resolve your 0.8 vs 0.7 crawling performance question? I'm
running into a similar problem.
--
View this message in context:
http://www.nabble.com/.8-svnfetcher-performance..-tf1170232.html#a5076764
Sent from the Nutch - User forum at Nabble.com.
Byron Miller wrote:
Anything I should change/tweak on my fetcher config
for the .8 release? I'm only getting 5 pages/sec and I was
getting nearly 50 on .7 with 125 threads going. Does
.8 not use threads like .7 did?
Byron,
Have you tried again more recently? A number of bugs have been fixed in
0
Anything I should change/tweak on my fetcher config
for the .8 release? I'm only getting 5 pages/sec and I was
getting nearly 50 on .7 with 125 threads going. Does
.8 not use threads like .7 did?
I believe I'm just using the standard protocol-http
support, not http-client