could you define duplicate? As far as I know, you don't get the same (internal) doc id back more than once, so what is a duplicate?
Best Erick On Mon, Jul 21, 2008 at 9:40 AM, Sebastin <[EMAIL PROTECTED]> wrote: > > at the time search , while querying the data > markrmiller wrote: > > > > Sebastin wrote: > >> Hi All, > >> > >> Is there any possibility to avoid duplicate records in lucene 2.3.1? > >> > > I don't believe that there is a very high performance way to do this. > > You are basically going to have to query the index for an id before > > adding a new doc. The best way I can think of off the top of my head is > > to batch - first check that ids in the batch are unique, then check all > > ids in the batch against the IndexReader, then add the ones that are not > > dupes. Of course all of your docs would have to be added through this > > single choke point so that you knew other threads had not added that id > > after the first thread had looked but before it added the doc. > > > > I think Mark H has you covered if getting the dupes out after are okay. > > > > - Mark > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: [EMAIL PROTECTED] > > For additional commands, e-mail: [EMAIL PROTECTED] > > > > > > > > -- > View this message in context: > http://www.nabble.com/How-to-avoid-duplicate-records-in-lucene-tp18543588p18568862.html > Sent from the Lucene - Java Users mailing list archive at Nabble.com. > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > >