Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The "bin/nutch mergedb" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/bin/nutch%20mergedb?action=diff&rev1=2&rev2=3

  Mergedb is an alias for org.apache.nutch.crawl.CrawlDbMerger
  
- This tool merges several crawldb's into one, optionally filtering URLs 
through the current URLFilters, to skip prohibited pages. It is possible to use 
this tool just for filtering - in that case only one crawldb should be 
specified in arguments. If more than one CrawlDb contains information about the 
same URL, only the most recent version is retained, as determined by the value 
of ''''' org.apache.nutch.crawl.CrawlDatum#getFetchTime()'''''. However, all 
metadata information from all versions is accumulated, with newer values taking 
precedence over older values.
+ This tool merges several crawldb's into one, optionally filtering URLs 
through the current URLFilters, to skip prohibited pages. It is possible to use 
this tool just for filtering - in that case only one crawldb should be 
specified in arguments. If more than one crawldb contains information about the 
same URL, only the most recent version is retained, as determined by the value 
of org.apache.nutch.crawl.CrawlDatum#getFetchTime(). However, all metadata 
information from all versions is accumulated, with newer values taking 
precedence over older values.
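
The merge rule described above can be illustrated with a small sketch. This is not the CrawlDbMerger implementation; it is a simplified Python model, assuming each crawldb is a plain dict of URL to record. The names `Datum`, `merge_crawldbs` and `url_filter` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Datum:
    """Hypothetical stand-in for a CrawlDatum: a fetch time plus metadata."""
    fetch_time: int
    metadata: dict = field(default_factory=dict)


def merge_crawldbs(*crawldbs, url_filter=None):
    """Merge dicts of url -> Datum following the rule described above.

    url_filter is an assumed optional predicate; URLs for which it
    returns False are skipped, mimicking the optional filtering step.
    """
    # Group every version of each URL across all input crawldbs.
    grouped = {}
    for db in crawldbs:
        for url, datum in db.items():
            if url_filter is not None and not url_filter(url):
                continue
            grouped.setdefault(url, []).append(datum)

    merged = {}
    for url, versions in grouped.items():
        # Order versions from oldest to newest by fetch time.
        ordered = sorted(versions, key=lambda d: d.fetch_time)
        # Accumulate metadata; later (newer) values overwrite older ones.
        metadata = {}
        for datum in ordered:
            metadata.update(datum.metadata)
        # Keep only the most recent version, with the accumulated metadata.
        merged[url] = Datum(ordered[-1].fetch_time, metadata)
    return merged


# Two toy crawldbs that both know about the same URL.
db_a = {"http://example.com/": Datum(100, {"lang": "en", "score": "1"})}
db_b = {
    "http://example.com/": Datum(200, {"score": "2"}),
    "http://example.org/": Datum(50),
}
merged = merge_crawldbs(db_a, db_b)

# Filtering only: a single crawldb passed through a filter.
filtered = merge_crawldbs(db_b, url_filter=lambda u: u.endswith(".com/"))
```

In this sketch the merged record for http://example.com/ keeps the newer fetch time (200), retains the "lang" key that only the older version had, and takes "score" from the newer version.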
   
  
  Usage: 
