What sorts of rules would govern which one should be
kept? Say you were adding three indexes and there
was a document in each that was identical. Which one
should be kept?

I suspect any rule would be wrong at least part of
the time....
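
To make the question concrete, here is a minimal sketch of two possible rules for the second-pass dedup suggested in the quoted message below. It uses plain Java collections rather than Lucene, so `DedupSketch`, `Doc`, `keepLast` and `keepFirst` are all hypothetical names for illustration, and it assumes each document carries a unique `id` and that the indexes are processed in the order they were added.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a dedup pass; none of these names are Lucene APIs.
class DedupSketch {

    // Stand-in for a stored document with a unique-key field.
    static final class Doc {
        final String id;
        final String body;

        Doc(String id, String body) {
            this.id = id;
            this.body = body;
        }
    }

    // Rule A: the copy from the index added last replaces earlier copies.
    static List<Doc> keepLast(List<List<Doc>> indexesInAddOrder) {
        Map<String, Doc> byId = new LinkedHashMap<>();
        for (List<Doc> index : indexesInAddOrder) {
            for (Doc doc : index) {
                byId.put(doc.id, doc); // overwrites any earlier copy with this id
            }
        }
        return new ArrayList<>(byId.values());
    }

    // Rule B: the first copy seen is kept and later copies are dropped.
    static List<Doc> keepFirst(List<List<Doc>> indexesInAddOrder) {
        Map<String, Doc> byId = new LinkedHashMap<>();
        for (List<Doc> index : indexesInAddOrder) {
            for (Doc doc : index) {
                byId.putIfAbsent(doc.id, doc); // ignores later copies with this id
            }
        }
        return new ArrayList<>(byId.values());
    }

    public static void main(String[] args) {
        List<Doc> older = List.of(new Doc("1", "old text"), new Doc("2", "only copy"));
        List<Doc> newer = List.of(new Doc("1", "new text"));
        List<List<Doc>> indexes = List.of(older, newer);

        for (Doc doc : keepLast(indexes)) {
            System.out.println("keepLast:  " + doc.id + " -> " + doc.body);
        }
        for (Doc doc : keepFirst(indexes)) {
            System.out.println("keepFirst: " + doc.id + " -> " + doc.body);
        }
    }
}
```

Both rules are a single line apart, which is the point: the code cannot decide which copy is the right one, and with three indexes "last added" may not mean "most recent" unless the caller controls the order.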

FWIW
Erick

On Mon, Feb 22, 2010 at 5:02 PM, Michael McCandless <luc...@mikemccandless.com> wrote:

> addIndexes doesn't make this possible.
>
> Maybe add the indexes but then make a 2nd pass to dedup?
>
> Mike
>
> On Mon, Feb 22, 2010 at 4:26 PM, jchang <jchangkihat...@gmail.com> wrote:
> >
> > When I call IndexWriter.addIndexes, is there anything I can do to make it
> > filter out duplicates based on a certain field (or group of fields)? If I
> > know that the id field of the document is unique, can I make addIndexes
> > know that if it finds a new document with the same id, the new one is valid
> > and the old one should be overwritten (or deleted and the new one added in
> > its place)?
> >
> > I don't see anything like a unique constraint in the Field class; I know
> > Lucene is not a SQL database, but I just wanted to check to make sure I'm
> > not missing anything.
> >
> >
> > --
> > View this message in context:
> > http://old.nabble.com/can-IndexWriter.addIndexes-de-dupe-documents--tp27694763p27694763.html
> > Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org
> >
> >
>
