Look at Deduplication:

http://wiki.apache.org/solr/Deduplication

It computes a hash signature of a document's fields (for example with Lookup3Signature <http://wiki.apache.org/solr/Lookup3Signature>) and stores it in a field, so the same document is not rewritten over and over. It is configured as an update processor in solrconfig.xml rather than as a field type in schema.xml.
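
A minimal sketch of what that could look like for your case, assuming the signature is computed from the text field alone and written to your text_sha1 field. The chain name "checksum" is made up for illustration. As far as I know the stock signature classes are MD5Signature, Lookup3Signature and TextProfileSignature, so this uses MD5 rather than SHA1:

```xml
<!-- solrconfig.xml: hypothetical chain that stores a hash of "text" in "text_sha1" -->
<updateRequestProcessorChain name="checksum">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <!-- field that receives the computed signature -->
    <str name="signatureField">text_sha1</str>
    <!-- false: only store the hash, do not delete documents with the same hash -->
    <bool name="overwriteDupes">false</bool>
    <!-- comma-separated list of source fields to hash -->
    <str name="fields">text</str>
    <str name="signatureClass">solr.processor.MD5Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

With this approach text_sha1 would be declared in schema.xml as an ordinary string field, with no copyField. The chain also has to be selected on the update handler, and the name of that parameter differs between Solr versions, so check the wiki page for yours.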

Lance

Staffan wrote:
Hi,

I am looking for a way to store the checksum of a field's value, something like:

<field name="text" ... />
<!-- the SHA1 checksum of text (before applying analyzer) -->
<field name="text_sha1" type="checksum" indexed="true" stored="true" />
...
<copyField source="text" dest="text_sha1" />

I haven't found anything like that in the docs or on google. Did I
miss something? If not, would a custom tokenizer be a good way to
implement it?

/Staffan
