I've been reviewing the four different merge commands (as of nutch v0.9):
$ nutch | grep merg
  mergedb      merge crawldb-s, with optional filtering
  mergesegs    merge several segments, with optional filtering and slicing
  mergelinkdb  merge linkdb-s, with optional filtering
  merge        merge several segment indexes
Here are the javadocs:
mergedb --
http://lucene.apache.org/nutch/apidocs/org/apache/nutch/crawl/CrawlDbMerger.html
mergesegs --
http://lucene.apache.org/nutch/apidocs/org/apache/nutch/segment/SegmentMerger.html
mergelinkdb --
http://lucene.apache.org/nutch/apidocs/org/apache/nutch/crawl/LinkDbMerger.html
merge --
http://lucene.apache.org/nutch/apidocs/org/apache/nutch/indexer/IndexMerger.html
Naively: why are there four merge commands? Are some of them subsets of the
others? Are they meant to be used in conjunction? What are the usage scenarios
for each?
I notice that Andrzej wrote the first three, and that they have wiki entries
(pretty much the same as the javadoc). I found these via
http://www.mail-archive.com/[EMAIL PROTECTED]/msg03588.html:
http://wiki.apache.org/nutch/nutch-0.8-dev/bin/nutch_mergedb
http://wiki.apache.org/nutch/nutch-0.8-dev/bin/nutch_mergelinkdb
http://wiki.apache.org/nutch/nutch-0.8-dev/bin/nutch_mergesegs
It seems most of the nutch-user discussions I've seen so far relate to the
simple merge command. Are the first three "advanced commands"?