Q1. Is it possible to pass *analyzed* content to the Signature class?

public abstract class Signature {
  public void init(SolrParams nl) {  }
  public abstract String calculate(String content);
}


Q2. The calculate() method uses the concatenated fields from
<str name="fields">name,features,cat</str>.
Is there any mechanism with which I could build "field-dependent signatures"?

Use case for this: I have two fields:
OWNER, TEXT
I need to disable *fuzzy* duplicates within a single owner. One clean way
would be to build a prefixed signature, "OWNER/FUZZY_SIGNATURE".
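
As a rough sketch of the prefixed-signature idea: the class below is plain Java, not the Solr Signature API, and OwnerPrefixedSignature, fuzzyPart and md5Hex are hypothetical names. The "fuzzy" part is only a stand-in (an MD5 over the lowercased, sorted tokens), assuming a real fuzzy signature would be plugged in instead.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper, not part of Solr: builds "OWNER/FUZZY_SIGNATURE".
final class OwnerPrefixedSignature {

  // Stand-in for a real fuzzy signature (assumption): lowercase the text,
  // split on anything that is not a letter or digit, sort the tokens and
  // hash them, so case, punctuation and word order do not matter.
  static String fuzzyPart(String text) {
    String[] tokens = text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
    Arrays.sort(tokens);
    return md5Hex(String.join(" ", tokens).trim());
  }

  // The owner is kept verbatim as a prefix, so two documents can only
  // collide when they belong to the same owner.
  static String calculate(String owner, String text) {
    return owner + "/" + fuzzyPart(text);
  }

  // Hex-encoded MD5 of the UTF-8 bytes of s (always 32 characters).
  static String md5Hex(String s) {
    try {
      MessageDigest md = MessageDigest.getInstance("MD5");
      byte[] digest = md.digest(s.getBytes(StandardCharsets.UTF_8));
      StringBuilder sb = new StringBuilder(digest.length * 2);
      for (byte b : digest) {
        sb.append(String.format("%02x", b));
      }
      return sb.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 not available", e);
    }
  }
}
```

With this, calculate("alice", "Hello, World!") and calculate("alice", "world hello") give the same signature, while the same text under a different owner gives a different one.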

Is the idea of making two UpdateProcessors and chaining them OK? (It is
ugly, but it would work.)

  <updateRequestProcessorChain name="signature_hard">
    <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
      <bool name="enabled">true</bool>
      <bool name="overwriteDupes">false</bool>
      <str name="signatureField">exact_signature</str>
      <str name="fields">OWNER</str>
      <str name="signatureClass">ExactSignature</str>
    </processor>
  </updateRequestProcessorChain>

exact_signature should be a field that is neither stored nor indexed.

  <updateRequestProcessorChain name="fuzzy_and_mix">
    <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
      <bool name="enabled">true</bool>
      <bool name="overwriteDupes">true</bool>
      <str name="signatureField">mixed_signature</str>
      <str name="fields">exact_signature, TEXT</str>
      <str name="signatureClass">MixedSignature</str>
    </processor>
  </updateRequestProcessorChain>

 <field name="exact_signature" type="string" stored="false"
indexed="false" multiValued="false" />
 <field name="mixed_signature" type="string" stored="true"
indexed="true" multiValued="false" />

Assuming I know how long my exact_signature is, I could calculate the
fuzzy part and mix it in properly.
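
The mixing step could look roughly like the sketch below. It is plain Java, not the Solr API. It assumes the content handed to the mixed signature starts with the fixed-length exact signature (32 hex characters here, as for MD5) followed directly by TEXT, with no separator. MixedSignatureSketch and EXACT_LENGTH are hypothetical names, and the fuzzy function is passed in rather than implemented.

```java
import java.util.function.Function;

// Hypothetical helper, not part of Solr.
final class MixedSignatureSketch {

  // Assumption: the exact signature is a 32-character hex string (e.g. MD5).
  static final int EXACT_LENGTH = 32;

  // Splits the concatenated content at the known length, keeps the exact
  // part verbatim and applies the fuzzy function only to the remaining text.
  static String calculate(String content, Function<String, String> fuzzy) {
    if (content.length() < EXACT_LENGTH) {
      throw new IllegalArgumentException("content shorter than exact signature");
    }
    String exact = content.substring(0, EXACT_LENGTH);
    String text = content.substring(EXACT_LENGTH);
    return exact + "/" + fuzzy.apply(text);
  }
}
```

Documents then only collapse into one when both the exact prefix and the fuzzy part of TEXT agree.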

Is this possible? Any better ideas?

Thanks,
eks
