Look at Deduplication: http://wiki.apache.org/solr/Deduplication
It computes a hash signature of each document (for example with Lookup3Signature <http://wiki.apache.org/solr/Lookup3Signature>) and uses it to avoid rewriting the same document over and over. It is configured in solrconfig.xml rather than schema.xml.
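For your case, a rough sketch of what the solrconfig.xml entry might look like, modeled on the example on that wiki page. The chain name "dedupe" is arbitrary, and the field names "text" and "text_sha1" are assumptions taken from your message. As far as I know the stock signature classes are Lookup3Signature, MD5Signature and TextProfileSignature, so a true SHA1 would need a custom Signature class; MD5Signature is used here as a stand-in.

```xml
<!-- solrconfig.xml: sketch only, field names assumed from the question -->
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <!-- field that receives the computed signature -->
    <str name="signatureField">text_sha1</str>
    <!-- false: only store the signature, do not delete matching documents -->
    <bool name="overwriteDupes">false</bool>
    <!-- source field(s) the signature is computed from, before analysis -->
    <str name="fields">text</str>
    <!-- stand-in for SHA1; swap in a custom Signature class if SHA1 is required -->
    <str name="signatureClass">solr.processor.MD5Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

With this approach text_sha1 would be declared in schema.xml as a plain string field, with no copyField and no custom tokenizer. The chain also has to be referenced from your update handler; the parameter name for that has varied between Solr versions, so check the wiki page for the one that matches yours.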
Lance

Staffan wrote:
Hi,

I am looking for a way to store the checksum of a field's value, something like:

  <field name="text"...>
  <!-- the SHA1 checksum of text (before applying analyzer) -->
  <field name="text_sha1" type="checksum" indexed="true" stored="true">
  ...
  <copyField source="text" dest="text_sha1">

I haven't found anything like that in the docs or on google. Did I miss something? If not, would a custom tokenizer be a good way to implement it?

/Staffan