Hi Vishal,

On 5/23/07, Vishal Shah <[EMAIL PROTECTED]> wrote:
> Hi Dogacan,
>
>    Fetcher2 gives better performance when the number of hosts per task is
> greater than the number of threads that the task can use. In this case, Fetcher
> might block on some hosts, whereas Fetcher2 will use that idle time to
> crawl some other host.
>
>    It could be that the number of hosts per task is not significantly higher
> than the number of threads per task. In that case, ideally you should see
> similar performance from Fetcher2 and Fetcher (assuming the same URL list and
> network bandwidth).
>
>   Also, as Andrzej suggested, it would be good to have some more debugging
> info.

Have you tested Fetcher2 after NUTCH-474? There were a couple of bugs
in Fetcher2 that made it work just like Fetcher (because lib-http
still blocked threads, making Fetcher2's queue logic useless).

Looking at the code, I can't see any other bugs, but I am still
testing; perhaps I will find a couple more (or perhaps I will find out
that something in my conf is broken).


> Regards,
>
> -vishal.

-----Original Message-----
From: Dogacan Güney [mailto:[EMAIL PROTECTED]
Sent: Wednesday, May 23, 2007 8:21 PM
To: [email protected]
Subject: Re: [Nutch-general] Fetcher2 slowness?

On 5/23/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> So what was Fetcher2's performance like when its number of threads was the
> same as that of Fetcher?

It is still slower. I tried giving Fetcher2 more threads; it is still
worse than Fetcher, but a bit better than Fetcher2 with fewer threads
(Fetcher finished in 1 hour, Fetcher2 in about 2.5). I have, however,
run other tests where their performance is similar (and I have no idea
why). I am trying to find the cause of the problem, but so far I have
had no luck.

>
> Otis
>

[snip]

--
Dogacan Güney




--
Doğacan Güney
