I think you can run segmentMergeTool to merge the new
segments with the previous ones (documents with a duplicated
URL and content MD5 will be discarded) and then run
the indexer on the merged segment.

I am not sure this is the best approach for daily
refetching; it is just my reading of the code I dug
through.
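
To make the duplicate handling concrete, here is a rough sketch of the idea in Python. It is not Nutch code: merge_segments is a hypothetical helper, segments are modelled as plain lists of (url, content) pairs, and I am assuming that a document is dropped when either its URL or its content MD5 has already been seen, with the newest segment taking precedence. The real tool may behave differently.

```python
import hashlib


def merge_segments(segments):
    """Merge segments (oldest first), dropping duplicate documents.

    Hypothetical illustration only: each segment is a list of
    (url, content) pairs, not a real Nutch segment.
    """
    seen_urls = set()
    seen_hashes = set()
    merged = []
    # Walk the newest segment first so a fresh fetch wins over an older copy
    # (an assumption, not something confirmed from the Nutch source).
    for segment in reversed(segments):
        for url, content in segment:
            digest = hashlib.md5(content.encode("utf-8")).hexdigest()
            # Assumed rule: a repeated URL or repeated content hash is a duplicate.
            if url in seen_urls or digest in seen_hashes:
                continue
            seen_urls.add(url)
            seen_hashes.add(digest)
            merged.append((url, content))
    return merged


if __name__ == "__main__":
    previous = [("http://a.example/", "old page"), ("http://b.example/", "same text")]
    nightly = [("http://a.example/", "new page"), ("http://c.example/", "same text")]
    for url, content in merge_segments([previous, nightly]):
        print(url, content)
```

With a scheme like this, the nightly job would only fetch into a new segment, merge it with the existing ones, and reindex, rather than deleting the whole crawl.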

Michael Ji,

--- Lokkju <[EMAIL PROTECTED]> wrote:

> I have searched through the mail archives and seen this question
> asked a lot, but no answer ever seems to come back. I am going to be
> using Nutch against 5 sites, and I want to update the index on a
> nightly basis. Besides deleting the previous crawl and then running it
> again, what method of doing nightly updates is recommended?
> 
> Thanks,
> Nick
> 



        
                
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
