Re: [Nutch-general] checksum error in segment merger

Andrzej Bialecki Tue, 16 Jan 2007 09:06:43 -0800

Brian Whitman wrote:
>
> On Jan 15, 2007, at 2:05 PM, Brian Whitman wrote:
>
>>
>> OK-- I will check the RAID overnight and run the crawl again on a 
>> different drive. I can't just re-run the segment merge because the 
>> re-crawl script deletes all the segment directories whether or not 
>> they were successfully merged.
>>
>
>
> Just an update, there were errors on the RAID which we repaired. The 
> crawl ran fine last night. Thanks for the info about the status codes 
> -- I updated my crawl script to check for them.


Great, nice to know that there was some real cause to this!

BTW, this is not the first time I've seen Hadoop detect non-obvious
errors in hardware or connectivity on a cluster. On one hand, it would
be nice if it were less susceptible to this kind of error; on the other
hand, it makes for a good diagnostic tool ;)

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



