[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675770#action_12675770 ]
Shalin Shekhar Mangar commented on SOLR-799:
--------------------------------------------
bq. <field name="signatureField" type="signatureField" indexed="true"
stored="false" signature="solr.TextProfileSignature" fields="product_name,
model_t, *_s" />
I don't think signatureField is a separate type. It is just a string, right?
bq. The patch as committed moves the specification of one field out of
schema.xml file to another file.
bq. That is, the design of the signature field should go in schema.xml, and
each updateRequest section should only describe how it is used with that
section's declared name. Also, there should be no default field, since every
field in the schema should be described in schema.xml.
The signature field is already defined in schema.xml today. The wiki
clearly states the following about signatureField:
{code}
The name of the field used to hold the fingerprint/signature. Be sure the field
is defined in schema.xml.
{code}
bq. <field name="signatureField" type="signatureField" indexed="true"
stored="false" signature="solr.TextProfileSignature" fields="product_name,
model_t, *_s" />
I don't agree with the above. How the contents of the field are computed
should not be part of schema.xml. I do not understand your concern, perhaps
because I'm not very familiar with this feature.
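For illustration, the split described above could look roughly like the sketch below. It is not taken from the committed patch. It assumes the parameter names listed on the Deduplication wiki page (signatureField, fields, signatureClass, overwriteDupes), a hypothetical chain name "dedupe", and the product_name and model_t fields from the quoted proposal. Exact names may differ between patch revisions.
{code:xml}
<!-- schema.xml: the signature field is declared like any other field.
     It is a plain string; nothing here says how its value is computed. -->
<field name="signatureField" type="string" indexed="true" stored="false" />

<!-- solrconfig.xml: the update processor chain says how the value is
     computed and which source fields feed the signature.
     The chain name "dedupe" is an assumption for this sketch. -->
<updateRequestProcessorChain name="dedupe">
  <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
    <!-- must match the field name declared in schema.xml -->
    <str name="signatureField">signatureField</str>
    <!-- source fields taken from the quoted proposal -->
    <str name="fields">product_name,model_t</str>
    <str name="signatureClass">org.apache.solr.update.processor.TextProfileSignature</str>
    <bool name="overwriteDupes">false</bool>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
{code}
With this layout, schema.xml still describes every field in the index, while the choice of signature class and source fields stays with the update chain that uses it.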
> Add support for hash based exact/near duplicate document handling
> -----------------------------------------------------------------
>
> Key: SOLR-799
> URL: https://issues.apache.org/jira/browse/SOLR-799
> Project: Solr
> Issue Type: New Feature
> Components: update
> Reporter: Mark Miller
> Assignee: Yonik Seeley
> Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-799.patch, SOLR-799.patch, SOLR-799.patch,
> SOLR-799.patch
>
>
> Hash based duplicate document detection is efficient and allows for blocking
> as well as field collapsing. Let's put it into Solr.
> http://wiki.apache.org/solr/Deduplication