On 3/13/06, emerson cargnin <[EMAIL PROTECTED]> wrote:
> I notice some duplicated entries in my index, by just looking at it,
> and I suspect there might be more than those I found out. Is there a
> way to detect duplicate documents in an index?
>
> Emerson Cargnin

If there is a field whose value should be unique for every document, it's relatively easy. Use a TermEnum to iterate over all values for that field. Any term that matches more than one document marks a set of duplicates. For every such term, delete all but the last doc (use IndexReader.termDocs() to enumerate the documents matching a given term).

-Yonik
http://incubator.apache.org/solr Solr, The Open Source Lucene Search Server
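
The keep-the-last-doc rule can be sketched in plain Java. This is only a model of the logic, not Lucene code: the map of field values to document ids stands in for what a TermEnum and termDocs() walk would produce, and the class name DuplicateFinder and method docsToDelete are made up for illustration. It assumes the ids for each term arrive in ascending order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DuplicateFinder {

    /**
     * postings maps each value of the unique field to the ids of the
     * documents containing it, in ascending order. This is a hypothetical
     * stand-in for enumerating terms and their matching documents.
     * Returns the ids to delete: every doc for a term except the last one.
     */
    public static List<Integer> docsToDelete(Map<String, List<Integer>> postings) {
        List<Integer> toDelete = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> entry : postings.entrySet()) {
            List<Integer> docs = entry.getValue();
            // A term with a single doc is not a duplicate; the loop body never runs.
            for (int i = 0; i < docs.size() - 1; i++) {
                toDelete.add(docs.get(i));
            }
        }
        return toDelete;
    }

    public static void main(String[] args) {
        // TreeMap iterates keys in sorted order, like terms within a field.
        Map<String, List<Integer>> postings = new TreeMap<>();
        postings.put("id-1", Arrays.asList(0, 4, 7)); // three copies
        postings.put("id-2", Arrays.asList(1));       // unique
        postings.put("id-3", Arrays.asList(2, 3));    // two copies
        System.out.println(docsToDelete(postings));   // prints [0, 4, 2]
    }
}
```

Against a real index, the returned ids would be passed to the reader's delete call instead of being printed, and the map would be replaced by the term and document enumerations themselves.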