You won't be able to use the data in its incomplete state, an unfortunate 
limitation of the Hadoop-based versions of Nutch.
 
When you look at the logs, is it fetching only from a few hosts? Is it fetching 
pages at all anymore?
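
One rough way to answer that is to count hosts in the tail of the fetcher log. 
The sketch below is only an illustration: it assumes each fetch shows up as a 
log line containing "fetching <url>", and the function name hosts_in_log is 
made up for this example, so adjust the pattern to whatever your logs contain.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumption: each fetch appears in the log as "... fetching <url>".
FETCH_RE = re.compile(r"fetching\s+(\S+)")


def hosts_in_log(lines):
    """Count fetched URLs per host in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = FETCH_RE.search(line)
        if not match:
            continue  # not a fetch line
        host = urlsplit(match.group(1)).hostname
        if host:  # skip entries that do not parse as absolute URLs
            counts[host] += 1
    return counts
```

Feed it the last few thousand lines of the log and look at 
hosts_in_log(lines).most_common(10). If only a handful of hosts appear, the 
remaining tasks are most likely being throttled by the per-host delay.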

I suspect your fetch list has reached a point where only a few hosts are left 
in it, but with enough pages from those hosts to stall everything. A patch was 
recently applied to trunk to help solve that problem. Until then, the generator 
had not been working to its full capacity for some time.
 
----- Original Message ----
From: Shailendra Mudgal <[EMAIL PROTECTED]>
To: [email protected]
Sent: Thursday, January 18, 2007 1:54:09 AM
Subject: Re: How to stop a slow fetch?


Hi,

Oh, I'm sorry. I forgot to mention the version.
I am using Nutch 0.9 with Hadoop 0.9.1.

Thanks

On 1/18/07, Sean Dean <[EMAIL PROTECTED]> wrote:
>
> Which version of Nutch are you using in this case?
>
> Depending on that, you will either have to wait it out or you will have
> the choice of stopping the process and using what has already been fetched
> for the remaining tasks.
>
>
> ----- Original Message ----
> From: Shailendra Mudgal <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Thursday, January 18, 2007 12:26:27 AM
> Subject: How to stop a slow fetch?
>
>
> Hi,
>
> I am crawling around 4M pages using Nutch. With my configuration, it took
> around 3 hours to complete up to 95% of the fetch job. But since then it
> has already taken more than 10 hours. When I looked at the task list, I
> found that 3 tasks are still running, and they are very slow. I want to
> stop this job so that I can use this data for further steps.
>
> Does anybody have any idea about this?
>
> Your kind help will be appreciated.
>
> Thanks and Regards,
>
> Shailendra
>
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
