[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675770#action_12675770 ]
Shalin Shekhar Mangar commented on SOLR-799:
--------------------------------------------

bq. <field name="signatureField" type="signatureField" indexed="true" stored="false" signature="solr.TextProfileSignature" fields="product_name, model_t, *_s" />

I don't think signatureField is a separate type. It is just a string, right?

bq. The patch as committed moves the specification of one field out of schema.xml file to another file.

bq. That is, the design of the signature field should go in schema.xml, and each updateRequest section should only describe how it is used with that section's declared name. Also, there should be no default field, since every field in the schema should be described in schema.xml.

The design of the signature field goes into schema.xml right now too. The wiki clearly states the following about signatureField:

{code}
The name of the field used to hold the fingerprint/signature. Be sure the field is defined in schema.xml.
{code}

bq. <field name="signatureField" type="signatureField" indexed="true" stored="false" signature="solr.TextProfileSignature" fields="product_name, model_t, *_s" />

I don't agree with the above. The method of computing the contents of the field should not be part of schema.xml.

I do not understand your concern, maybe because I'm not very familiar with this feature.

> Add support for hash based exact/near duplicate document handling
> -----------------------------------------------------------------
>
>                 Key: SOLR-799
>                 URL: https://issues.apache.org/jira/browse/SOLR-799
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Mark Miller
>            Assignee: Yonik Seeley
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: SOLR-799.patch, SOLR-799.patch, SOLR-799.patch, SOLR-799.patch
>
>
> Hash based duplicate document detection is efficient and allows for blocking as well as field collapsing. Lets put it into solr.
> http://wiki.apache.org/solr/Deduplication