: What I need is to log the existing urlid and the new urlid (the two will
: differ, of course) when an .xml file with the same id (unique field) is posted.
: 
: I want to do this by modifying the Solr source. Which file do I need to
: modify so that I can get the above details in the log?
: 
: I tried DirectUpdateHandler2.java (which removes the duplicate
: entries), but my efforts were in vain.

DirectUpdateHandler2.java (on the trunk) delegates to Lucene-Java's 
IndexWriter.updateDocument method when you have a uniqueKey and you aren't 
allowing duplicates -- this method doesn't give you any way to access the 
old document(s) that had that existing key.

The easiest way to make a change like the one you are interested in might be 
an UpdateProcessor that does a lookup/search for the uniqueKey of each 
document about to be added, to see if it already exists.  That's probably 
about as efficient as you can get, and would be nicely encapsulated.
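
To make the idea concrete, here is a rough, self-contained sketch of the lookup-then-log pattern. It deliberately does not use the real Solr classes: AddProcessor, InMemoryIndex, DuplicateLoggingProcessor and the doc() helper are all hypothetical stand-ins invented for illustration, and the "id" and "urlid" field names are taken from your description. In a real Solr build the same logic would presumably live in a subclass of UpdateRequestProcessor, with the lookup going through the index searcher rather than a map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DuplicateKeyLoggingSketch {

    /** Hypothetical stand-in for one step in an update chain. */
    public interface AddProcessor {
        void processAdd(Map<String, String> doc);
    }

    /** Hypothetical stand-in for the index: maps uniqueKey value to the stored document. */
    public static class InMemoryIndex implements AddProcessor {
        private final String keyField;
        private final Map<String, Map<String, String>> docs = new HashMap<>();

        public InMemoryIndex(String keyField) {
            this.keyField = keyField;
        }

        /** Returns the stored document with this key, or null if there is none. */
        public Map<String, String> lookup(String key) {
            return docs.get(key);
        }

        /** Adds the document, silently overwriting any document with the same key. */
        @Override
        public void processAdd(Map<String, String> doc) {
            docs.put(doc.get(keyField), new HashMap<>(doc));
        }
    }

    /** Looks up the uniqueKey before each add and records old and new values on a collision. */
    public static class DuplicateLoggingProcessor implements AddProcessor {
        public final List<String> log = new ArrayList<>();
        private final InMemoryIndex index;
        private final AddProcessor next;
        private final String keyField;
        private final String loggedField;

        public DuplicateLoggingProcessor(InMemoryIndex index, AddProcessor next,
                                         String keyField, String loggedField) {
            this.index = index;
            this.next = next;
            this.keyField = keyField;
            this.loggedField = loggedField;
        }

        @Override
        public void processAdd(Map<String, String> doc) {
            String key = doc.get(keyField);
            Map<String, String> existing = index.lookup(key);
            if (existing != null) {
                // The old document is still readable here because the overwrite has not happened yet.
                log.add(keyField + "=" + key
                        + " old " + loggedField + "=" + existing.get(loggedField)
                        + " new " + loggedField + "=" + doc.get(loggedField));
            }
            next.processAdd(doc); // hand the document on to the rest of the chain
        }
    }

    /** Builds a two-field document for the demo. */
    public static Map<String, String> doc(String id, String urlid) {
        Map<String, String> d = new HashMap<>();
        d.put("id", id);
        d.put("urlid", urlid);
        return d;
    }

    public static void main(String[] args) {
        InMemoryIndex index = new InMemoryIndex("id");
        DuplicateLoggingProcessor processor =
                new DuplicateLoggingProcessor(index, index, "id", "urlid");

        processor.processAdd(doc("42", "url-a")); // first add: nothing to report
        processor.processAdd(doc("42", "url-b")); // same id: old and new urlid are recorded

        for (String line : processor.log) {
            System.out.println(line);
        }
    }
}
```

The point of the design is that the check happens before the document is handed to the step that overwrites, which is the only moment at which both the old and the new urlid are available.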

You might also want to take a look at SOLR-799, where some work is being 
done to create UpdateProcessors that can do "near duplicate" detection...

http://wiki.apache.org/solr/Deduplication
https://issues.apache.org/jira/browse/SOLR-799






-Hoss
