RE: Re[2]: what contibute to fetch slowing down

2005-10-11 Thread ogjunk-nutch
-----Original Message----- From: Daniele Menozzi [mailto:[EMAIL PROTECTED] Sent: Monday, October 10, 2005 5:42 PM To: nutch-dev@lucene.apache.org Subject: Re: Re[2]: what contibute to fetch slowing down On 03:36:45 03/Oct , Michael wrote: 3mbit, 100 threads = 15 pages/sec cpu is low

Re: Re[2]: what contibute to fetch slowing down

2005-10-10 Thread Daniele Menozzi
On 03:36:45 03/Oct, Michael wrote: "3mbit, 100 threads = 15 pages/sec. CPU is low during fetch, so it's the bandwidth limit." Yes, CPU is low, and even memory is quite free. But with 10MB in/out I cannot obtain good results (and I do not parse results, I simply fetch them). If I use 100 threads, I

Re: what contibute to fetch slowing down

2005-10-10 Thread Daniele Menozzi
On 09:59:45 03/Oct, Doug Cutting wrote: "I suspect threads are hanging, probably in the parser". I tried not parsing, but without good results. If I use 100 threads, I can download pages at 500KB/s for about 5 seconds, but after that, the download rate falls to 0. If I set 20 threads, I can

Re: what contibute to fetch slowing down

2005-10-03 Thread Doug Cutting
Fuad Efendi wrote: I found this in the J2SE API for setReuseAddress (default: false): When a TCP connection is closed the connection may remain in a timeout state for a period of time after the connection is closed (typically known as the TIME_WAIT state or 2MSL wait state). For applications
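The API call quoted above can be sketched as follows. This is only an illustration of java.net.Socket.setReuseAddress, not the fetcher's actual code; the class and method names (ReuseAddressSketch, newReusableSocket, reuseAddressEnabled) are hypothetical, and whether enabling the option helps fetch throughput at all is something the thread leaves to real tests.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Nutch: shows where SO_REUSEADDR is set.
class ReuseAddressSketch {

    /** Returns an unconnected socket with SO_REUSEADDR enabled, bound to an OS-chosen port. */
    static Socket newReusableSocket() {
        Socket socket = new Socket();               // unconnected and unbound
        try {
            socket.setReuseAddress(true);           // set the option before bind()
            socket.bind(new InetSocketAddress(0));  // port 0 lets the OS pick a free local port
            return socket;
        } catch (IOException e) {
            try { socket.close(); } catch (IOException ignored) { }
            throw new UncheckedIOException(e);
        }
    }

    /** Opens a socket, reports whether the option took effect, and closes it again. */
    static boolean reuseAddressEnabled() {
        try (Socket socket = newReusableSocket()) {
            return socket.getReuseAddress();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The option has to be set on an unbound socket, which is why the sketch uses the no-argument Socket constructor instead of one that connects immediately.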

RE: what contibute to fetch slowing down

2005-10-03 Thread Fuad Efendi
Socket.close()... But we need to perform real tests anyway. -----Original Message----- From: Doug Cutting [mailto:[EMAIL PROTECTED] Sent: Monday, October 03, 2005 1:05 PM To: nutch-dev@lucene.apache.org Subject: Re: what contibute to fetch slowing down Fuad Efendi wrote: If I am right, we are simply

RE: what contibute to fetch slowing down

2005-10-02 Thread Fuad Efendi
not only for us but also for Production Web Sites. Thanks, Fuad -----Original Message----- From: Fuad Efendi [mailto:[EMAIL PROTECTED] Sent: Friday, September 30, 2005 10:58 PM To: nutch-dev@lucene.apache.org; [EMAIL PROTECTED] Subject: RE: what contibute to fetch slowing down Dear Nutchers, I

Re: what contibute to fetch slowing down

2005-10-02 Thread AJ Chen
Update on fetch performance of my current run: download speed has been stable at 3.8 pages/sec, about 640kbps. This is probably limited by my bandwidth - regular DSL service, promising up to 1.5 mbps inbound but realistically only 640 kbps. More than 1 million pages were fetched, but it took
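As a rough check on the figures in this update, the arithmetic can be sketched as below. It assumes 1 kbit = 1000 bits, a fully used link, and a constant page rate; the class and method names are hypothetical. Under those assumptions, 640 kbps at 3.8 pages/sec works out to roughly 21 KB per page, and one million pages at that rate would need about 73 hours. The elapsed time actually reported in the message is cut off here, so the second number is only an estimate.

```java
// Hypothetical helper for back-of-the-envelope fetch estimates.
class ThroughputSketch {

    /** Average bytes per page implied by a link rate in kilobits/s (1 kbit = 1000 bits) and a page rate. */
    static double bytesPerPage(double kbps, double pagesPerSec) {
        return kbps * 1000.0 / 8.0 / pagesPerSec;
    }

    /** Hours needed to fetch the given number of pages at a constant page rate. */
    static double hoursToFetch(long pages, double pagesPerSec) {
        return pages / pagesPerSec / 3600.0;
    }

    public static void main(String[] args) {
        System.out.printf("%.0f bytes/page%n", bytesPerPage(640, 3.8));
        System.out.printf("%.1f hours%n", hoursToFetch(1_000_000L, 3.8));
    }
}
```

Applied to the other figures quoted in this thread (3 mbit, 15 pages/sec), the same helper gives 25,000 bytes per page, which is in the same range.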

RE: what contibute to fetch slowing down

2005-10-02 Thread Michael Ji
but also for Production Web Sites. Thanks, Fuad -----Original Message----- From: Fuad Efendi [mailto:[EMAIL PROTECTED] Sent: Friday, September 30, 2005 10:58 PM To: nutch-dev@lucene.apache.org; [EMAIL PROTECTED] Subject: RE: what contibute to fetch slowing down Dear Nutchers

Re: what contibute to fetch slowing down

2005-10-02 Thread Ken Krugler
Update on fetch performance of my current run: download speed has been stable at 3.8 pages/sec, about 640kbps. This is probably limited by my bandwidth - regular DSL service, promising up to 1.5 mbps inbound but realistically only 640 kbps. More than 1 million pages were fetched, but it took

Re: what contibute to fetch slowing down

2005-10-02 Thread Ken Krugler
Correction to my previous post. I'd said: When you use the FetchListTool to emit multiple lists, it intentionally divides up the list using the MD5 value for the link, so that you get hosts scattered between the lists. But for a single list, this doesn't happen, and thus the max threads/host
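The idea in the quoted statement, splitting a fetch list by a hash of each link so that URLs are scattered across lists, can be sketched as below. This is not FetchListTool's code: the class name, the method names, and the choice of the first four digest bytes are all assumptions made for illustration. Note also that this post is a correction of the quoted statement, and the corrected part is cut off here, so the sketch shows the general technique only.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the Nutch implementation.
class Md5PartitionSketch {

    /** Maps a URL to a list index in [0, numLists) using the first four bytes of its MD5 digest. */
    static int listIndexFor(String url, int numLists) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(url.getBytes(StandardCharsets.UTF_8));
            int h = ((d[0] & 0xff) << 24) | ((d[1] & 0xff) << 16)
                    | ((d[2] & 0xff) << 8) | (d[3] & 0xff);
            return Math.floorMod(h, numLists);   // floorMod keeps the index non-negative
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Splits the URLs into numLists lists; the same URL always lands in the same list. */
    static List<List<String>> partition(List<String> urls, int numLists) {
        List<List<String>> lists = new ArrayList<>();
        for (int i = 0; i < numLists; i++) {
            lists.add(new ArrayList<>());
        }
        for (String url : urls) {
            lists.get(listIndexFor(url, numLists)).add(url);
        }
        return lists;
    }
}
```

Because the hash is taken over the whole link and not the host name, pages from one host tend to end up in different lists, which matches the scattering described in the quoted statement.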

RE: what contibute to fetch slowing down

2005-10-02 Thread Fuad Efendi
, Nutch needs a few days. 8mbps/800kbps, download/upload. -----Original Message----- From: Michael Ji [mailto:[EMAIL PROTECTED] Sent: Sunday, October 02, 2005 5:37 PM To: nutch-dev@lucene.apache.org Subject: RE: what contibute to fetch slowing down Kelvin's OC implementation is queuing fetching

RE: what contibute to fetch slowing down

2005-10-02 Thread Fuad Efendi
://grinder.sourceforge.net - very simple Java based proxy) -----Original Message----- From: Michael Ji [mailto:[EMAIL PROTECTED] Sent: Sunday, October 02, 2005 5:37 PM To: nutch-dev@lucene.apache.org Subject: RE: what contibute to fetch slowing down Kelvin's OC implementation is queuing fetching

RE: what contibute to fetch slowing down

2005-09-30 Thread Fuad Efendi
? Should I create a new bug report at JIRA? SUN's Socket, Apache's HttpClient, UNIX's networking... -----Original Message----- From: Daniele Menozzi [mailto:[EMAIL PROTECTED] Sent: Wednesday, September 28, 2005 4:42 PM To: nutch-dev@lucene.apache.org Subject: Re: what contibute to fetch slowing down

what contibute to fetch slowing down

2005-09-28 Thread AJ Chen
I started the crawler with about 2000 sites. The fetcher could achieve 7 pages/sec initially, but the performance gradually dropped to about 2 pages/sec, sometimes even 0.5 pages/sec. The fetch list had 300k pages and I used 500 threads. What are the main causes of this slowing down? Below

Re: what contibute to fetch slowing down

2005-09-28 Thread Jack Tang
Hi AJ, I guess it is the growing number of threads. You can show the thread ID in the log. I think it makes sense. Regards, /Jack On 9/29/05, AJ Chen [EMAIL PROTECTED] wrote: I started the crawler with about 2000 sites. The fetcher could achieve 7 pages/sec initially, but the performance gradually dropped to

Re: what contibute to fetch slowing down

2005-09-28 Thread Daniele Menozzi
On 10:27:55 28/Sep, AJ Chen wrote: I started the crawler with about 2000 sites. The fetcher could achieve 7 pages/sec initially, but the performance gradually dropped to about 2 pages/sec, sometimes even 0.5 pages/sec. The fetch list had 300k pages and I used 500 threads. What are the