Jira ticket added: https://issues.apache.org/jira/browse/SOLR-1908
Wiki entry updated: http://wiki.apache.org/solr/Deduplication
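
For the archive, Mark's first point below comes down to a one-attribute
change in schema.xml. A minimal sketch, assuming the rest of the field
definition from the original post is kept as is (only indexed is changed
here; the other attributes are copied unchanged and are not a
recommendation):

    <!-- sketch: signature field with indexed="true", as Mark advises;
         all other attributes are carried over from the original post -->
    <field name="sig" type="string" stored="true" indexed="true" multiValued="true" />

Documents need to be reindexed after a schema change like this for the
signature values to become searchable.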

On Tuesday 11 May 2010 17:24:10 Mark Miller wrote:
> 1. You need to set the sig field to indexed.
> 2. This should be added to the wiki
> 3. Want to make a JIRA issue? This is not very friendly behavior (when
> you have the sig field set to indexed=false and overwriteDupes=true it
> should likely complain)
> 
> > List,
> >
> >
> > I've stumbled upon an issue with the deduplication mechanism. Depending
> > on the overwriteDupes setting, it either deletes all documents (true) or
> > does nothing at all (false).
> >
> > I use a slightly modified configuration:
> >
> >    <updateRequestProcessorChain name="dedupe">
> >      <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
> >        <bool name="enabled">true</bool>
> >        <str name="signatureField">sig</str>
> >        <bool name="overwriteDupes">true</bool>
> >        <str name="fields">content</str>
> >        <str name="signatureClass">org.apache.solr.update.processor.Lookup3Signature</str>
> >      </processor>
> >      <processor class="solr.LogUpdateProcessorFactory" />
> >      <processor class="solr.RunUpdateProcessorFactory" />
> >    </updateRequestProcessorChain>
> >
> >
> >    <field name="sig" type="string" stored="true" indexed="false" multiValued="true" />
> >
> > After importing new documents I can clearly see the correct signatures,
> > but only with overwriteDupes=false. Most documents have a distinct
> > signature, and some share the same one because the content field's value
> > is identical for those documents.
> >
> >
> > Anyway, why does it delete all my documents? Any clues? The wiki is not
> > very helpful on this subject.
> >
> >
> > Cheers.
> >
> >
> > Markus Jelsma - Technisch Architect - Buyways BV
> > http://www.linkedin.com/in/markus17
> > 050-8536620 / 06-50258350
> 

Markus Jelsma - Technisch Architect - Buyways BV
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
