Hey Andrew,

Just wondering if you ever managed to get TextProfileSignature-based
deduplication running. If so, I would appreciate it if you could send me
the relevant fragment of your solrconfig.xml.

This is what I currently have, but I am not sure I am doing it right:

  <updateRequestProcessorChain name="dedupe">
    <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
      <bool name="enabled">true</bool>
      <str name="signatureField">signature</str>
      <bool name="overwriteDupes">true</bool>
      <str name="fields">title,author,abstract</str>
      <str name="signatureClass">org.apache.solr.update.processor.TextProfileSignature</str>
      <str name="minTokenLen">3</str>
    </processor>
    <processor class="solr.LogUpdateProcessorFactory" />
    <processor class="solr.RunUpdateProcessorFactory" />
  </updateRequestProcessorChain>
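
I am also assuming two things outside this chain, going by the
Deduplication page on the Solr wiki: that the signature field has to be
declared in schema.xml, and that the update handler has to point at the
chain by name. Roughly like this (the field type and the handler class are
just what I would try on 1.4, not something I have verified; I believe
later releases renamed update.processor to update.chain):

  <!-- schema.xml: assumed declaration of the field named in signatureField -->
  <field name="signature" type="string" stored="true" indexed="true"
         multiValued="false" />

  <!-- solrconfig.xml: assumed wiring of the "dedupe" chain to /update -->
  <requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
    <lst name="defaults">
      <str name="update.processor">dedupe</str>
    </lst>
  </requestHandler>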

Thanks in advance,
-Ali
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Filtering-near-duplicates-using-TextProfileSignature-tp479039p880044.html
Sent from the Solr - User mailing list archive at Nabble.com.