Is it possible to dedup a Lucene index on a Hadoop filesystem against a finished Lucene index?
I build my index with Nutch as usual, but I would like to inject single URLs and merge the result into the final index without having to run a full crawl.

Cheers
Rob

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
