What sorts of rules would govern which one should be kept? Say you were adding three indexes and there was a document in each that was identical. Which one should be kept?
I suspect any rule would be wrong at least part of the time....

FWIW
Erick

On Mon, Feb 22, 2010 at 5:02 PM, Michael McCandless <luc...@mikemccandless.com> wrote:

> addIndexes doesn't make this possible.
>
> Maybe add the indexes but then make a 2nd pass to dedup?
>
> Mike
>
> On Mon, Feb 22, 2010 at 4:26 PM, jchang <jchangkihat...@gmail.com> wrote:
> >
> > When I call IndexWriter.addIndexes, is there anything I can do to make it
> > filter out duplicates based on a certain field (or group of fields)? If I
> > know that the id field of the document is unique, can I make addIndexes know
> > that if it finds a new document with the same id, the new one is valid and
> > the old one should be overwritten (or deleted and the new one added in its
> > place)?
> >
> > I don't see anything like a unique constraint in the Field class; I know
> > Lucene is not a SQL database, but I just wanted to check to make sure I'm
> > not missing anything.
> >
> > --
> > View this message in context: http://old.nabble.com/can-IndexWriter.addIndexes-de-dupe-documents--tp27694763p27694763.html
> > Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org
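
To make the "which one should be kept?" question concrete, here is a minimal sketch of one possible rule: when several indexes hold a document with the same id, the copy from the index added last wins. This is plain Java with no Lucene calls; Doc and mergeLastWins are hypothetical names used only for illustration, and "last one wins" is an assumed policy, not something addIndexes does. In Lucene itself, the closest per-document equivalent is probably IndexWriter.updateDocument(Term, Document), which deletes any documents containing the term and then adds the new one, so a second pass could re-add documents that way instead of relying on addIndexes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DedupSketch {

    /** Hypothetical stand-in for an indexed document: a unique id plus some content. */
    static final class Doc {
        final String id;
        final String body;

        Doc(String id, String body) {
            this.id = id;
            this.body = body;
        }
    }

    /**
     * Merges the given "indexes" in order, keeping one document per id.
     * Assumed policy: a document from a later index replaces a document
     * with the same id from an earlier index.
     */
    static List<Doc> mergeLastWins(List<List<Doc>> indexes) {
        // LinkedHashMap keeps ids in the order they were first seen.
        Map<String, Doc> byId = new LinkedHashMap<>();
        for (List<Doc> index : indexes) {
            for (Doc doc : index) {
                byId.put(doc.id, doc); // overwrites any earlier doc with this id
            }
        }
        return new ArrayList<>(byId.values());
    }

    public static void main(String[] args) {
        List<Doc> older = Arrays.asList(new Doc("1", "old"), new Doc("2", "two"));
        List<Doc> newer = Arrays.asList(new Doc("1", "new"), new Doc("3", "three"));

        for (Doc doc : mergeLastWins(Arrays.asList(older, newer))) {
            System.out.println(doc.id + " -> " + doc.body);
        }
    }
}
```

Swapping the rule (first one wins, newest timestamp wins, and so on) only changes the line that writes into the map, which is why the choice of rule has to come from the application rather than from the index merge.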