
This code won't catch uncommitted duplicates: the searcher it queries only
sees documents that have already been committed.
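
To make the gap concrete, here is a toy model in plain Java. It is not Solr
code: ToyIndex and its methods are hypothetical names, "committed" stands in
for what a searcher can see, and "pending" stands in for documents added
since the last commit. Checking only the committed view lets the same id in
twice; also checking the pending adds does not.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for an index whose searcher only sees committed documents. */
class ToyIndex {
    private final List<String> committed = new ArrayList<>(); // visible to searchers
    private final List<String> pending = new ArrayList<>();   // added, not yet committed

    /** Same idea as the quoted handler: consult only the committed view. */
    boolean addCheckingSearcherOnly(String id) {
        if (committed.contains(id)) {
            return false; // already visible, skip
        }
        pending.add(id);
        return true;
    }

    /** Variant that also consults the uncommitted adds. */
    boolean addCheckingPendingToo(String id) {
        if (committed.contains(id) || pending.contains(id)) {
            return false; // duplicate, committed or not
        }
        pending.add(id);
        return true;
    }

    /** Make all pending documents visible. */
    void commit() {
        committed.addAll(pending);
        pending.clear();
    }

    /** Number of committed documents carrying this id. */
    long countCommitted(String id) {
        return committed.stream().filter(id::equals).count();
    }

    public static void main(String[] args) {
        ToyIndex naive = new ToyIndex();
        naive.addCheckingSearcherOnly("doc-1");
        naive.addCheckingSearcherOnly("doc-1"); // slips through: nothing committed yet
        naive.commit();
        System.out.println(naive.countCommitted("doc-1")); // prints 2

        ToyIndex strict = new ToyIndex();
        strict.addCheckingPendingToo("doc-1");
        strict.addCheckingPendingToo("doc-1"); // rejected
        strict.commit();
        System.out.println(strict.countCommitted("doc-1")); // prints 1
    }
}
```

In real Solr the pending side is not a simple list, so treat this only as a
picture of the visibility problem, not as a fix.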

On Wed, Jul 31, 2013 at 9:41 AM, Dotan Cohen <dotanco...@gmail.com> wrote:

> On Tue, Jul 30, 2013 at 11:14 PM, Jack Krupansky
> <j...@basetechnology.com> wrote:
> > The Solr SignatureUpdateProcessorFactory is designed to facilitate
> > dedupe... any particular reason you did not use it?
> >
> > See:
> > http://wiki.apache.org/solr/Deduplication
> >
> > and
> >
> > https://cwiki.apache.org/confluence/display/solr/De-Duplication
> >
> Actually, the guy who made the changes (a coworker) did in fact write
> an alternative UpdateHandler. I've just noticed that there are a bunch
> of dupes right now, though.
> import java.io.IOException;
> import org.apache.lucene.index.Term;
> import org.apache.lucene.search.Query;
> import org.apache.lucene.search.TermQuery;
> import org.apache.lucene.search.TopDocs;
> import org.apache.solr.core.SolrCore;
> import org.apache.solr.search.SolrIndexSearcher;
> import org.apache.solr.update.AddUpdateCommand;
> import org.apache.solr.update.DirectUpdateHandler2;
> import org.apache.solr.util.RefCounted;
>
> public class DiscoAPIUpdateHandler extends DirectUpdateHandler2 {
>     public DiscoAPIUpdateHandler(SolrCore core) {
>         super(core);
>     }
>
>     @Override
>     public int addDoc(AddUpdateCommand cmd) throws IOException {
>         // If overwrite is false, fall through to DirectUpdateHandler2.
>         // This is for debugging: it lets us insert duplicates into Solr.
>         if (!cmd.overwrite) return super.addDoc(cmd);
>         // Ref-counted objects must have their ref count decremented
>         // when you're done with them.
>         RefCounted<SolrIndexSearcher> indexSearcher =
>                 this.core.getNewestSearcher(false);
>         try {
>             // Run an internal Lucene query to check whether this id
>             // already exists in the index.
>             Term updateTerm = null;
>             if (cmd.updateTerm != null) {
>                 updateTerm = cmd.updateTerm;
>             } else {
>                 updateTerm = new Term("id", cmd.getIndexedId());
>             }
>             Query query = new TermQuery(updateTerm);
>             TopDocs docs = indexSearcher.get().search(query, 2);
>             if (docs.totalHits > 0) {
>                 // Don't add the new document.
>                 return 0;
>             }
>         } finally {
>             // The index searcher is no longer needed; release it even
>             // if the search throws.
>             indexSearcher.decref();
>         }
>         // If we get here, it's a new document.
>         return super.addDoc(cmd);
>     }
> }
> > And I give a bunch of examples in my book.
> >
> I anticipate the book with esteem!
> --
> Dotan Cohen
> http://gibberish.co.il
> http://what-is-what.com

Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

