Finally, I was able to implement desirable behavior using your suggestions as follows:
- Added StatelessScriptUpdateProcessorFactory before SignatureUpdateProcessorFactory in order to analyze "field1" and set analyzed value to "field1_tmp_ss" - Passed "field1_tmp_ss" to SignatureUpdateProcessorFactory - Used IgnoreFieldUpdateProcessorFactory to ignore "field1_tmp_ss" from document stored Everything seems to work fine and as expected. Thank you very much, Have a nice day, Leonidas 2017-01-25 19:19 GMT+02:00 Alexandre Rafalovitch <arafa...@gmail.com>: > It might be possible by sticking additional update request processors > before the signature one. For example clone field, regex instead of > tokenizing on the clone, then signature. If a clone is too much of a > burden, it may even be possible to then add IgnoreField URP to remove > it or map it in the schema to index/store/docValues=false field. > > Regards, > Alex. > P.s. The full all-in-one list of URPs is available at: > http://www.solr-start.com/info/update-request-processors/ > > ---- > http://www.solr-start.com/ - Resources for Solr users, new and experienced > > > On 25 January 2017 at 12:00, Markus Jelsma <markus.jel...@openindex.io> > wrote: > > Hello, > > > > This is not possible out of the box, you would need to manually pass the > input through an analyzer with a tokenizer and your steming token filter, > and put the output together again. > > > > Markus > > > > > > > > -----Original message----- > >> From:Leonidas Zagkaretos <leonz...@gmail.com> > >> Sent: Wednesday 25th January 2017 17:51 > >> To: solr-user@lucene.apache.org > >> Subject: Pass Analyzed Field to SignatureUpdateProcessorFactory > >> > >> Hi all, > >> > >> We have successfully integrated Solr in our application, and now we are > >> facing a requirement where the application should be able to search for > >> duplicate records in Solr core based on equality in 3 distinct fields. > >> > >> Tried using SignatureUpdateProcessorFactory as described in > >> https://cwiki.apache.org/confluence/display/solr/De-Duplication and > >> Lookup3Signature and everything seems to work fine, signature field is > >> being filled with unique hash values. > >> > >> One issue we have, is that we need to pass to > >> SignatureUpdateProcessorFactory the stemmed value of 1 of 3 fields. > >> Currenty, the following documents produce different hash values, and we > >> need them to produce unique. > >> Analysis for field1 and values "value1_a" and "value1_b" produce stemmed > >> value "value1" > >> > >> documentA: { > >> field1: value1_a, > >> field2: value2, > >> field3: value3, > >> signature: hash_value1 > >> } > >> > >> documentB: { > >> field1: value1_b, > >> field2: value2, > >> field3: value3, > >> signature: hash_value2 > >> } > >> > >> I would like to ask whether it is possible to have required behavior, > and > >> some tips about how to accomplish this task. > >> > >> Thank you in advance, > >> > >> Leonidas > >> >