Comments inline:

On Wed, Jun 17, 2015 at 3:18 PM, Markus.Mirsberger <markus.mirsber...@gmx.de> wrote:
> Hi,
>
> I am trying to use the dedupe feature to detect and mark near-duplicate
> content in my collections.
> I don't want to prevent duplicate content. I would like to detect it and keep
> it for further processing. That's why I'm using an extra field and not the
> document's unique field.
>
> Here is how I added it to the solrconfig.xml:
>
> <requestHandler name="/update" class="solr.UpdateRequestHandler">
>   <lst name="defaults">
>     <str name="update.chain">fill_signature</str>
>   </lst>
> </requestHandler>
>
> <updateRequestProcessorChain name="fill_signature" processor="signature">
>   <processor class="solr.RunUpdateProcessorFactory" />
> </updateRequestProcessorChain>
>
> <updateProcessor class="solr.processor.SignatureUpdateProcessorFactory" name="signature">
>   <bool name="enabled">true</bool>
>   <str name="signatureField">signature</str>
>   <bool name="overwriteDupes">false</bool>
>   <str name="fields">content</str>
>   <str name="signatureClass">solr.processor.TextProfileSignature</str>
>   <str name="quantRate">.2</str>
>   <str name="minTokenLen">3</str>
> </updateProcessor>
>
> When I initially add the documents to the cloud everything works as expected:
> the documents are added and the signature is created and added. Perfect :)
> The problem occurs when I want to update an existing document. In that
> case the update.chain=fill_signature parameter will of course be set too and
> I get a bad request error.
>
> I found this Solr issue: https://issues.apache.org/jira/browse/SOLR-3473
>
> Is that the problem I am running into?

You haven't pasted the complete error response, so I am guessing a bit
here. It is possible that you are running into the same problem, i.e. the
signature is being calculated again on the update and, because the
signature field is not multi-valued, this causes an error.

> Is it somehow possible to add parameters or set a specific update handler
> when I'm adding documents to the cloud using SolrJ?

Yes, any custom parameter can be added to a SolrJ request. There is a
setParam(String param, String value) method available in
AbstractUpdateRequest which can be used to set a custom update.chain for
each SolrJ request.

> In that case I could either set the update.chain manually and remove it from
> the request handler or write a second request handler which I only use if I
> want to set the signature field.
> I know I can do that manually when I'm using e.g. curl, but is it also possible
> with SolrJ? :)
>
> Thanks,
> Markus

--
Regards,
Shalin Shekhar Mangar.
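
For the second option raised in the thread (a separate request handler used only when the signature should be set), a configuration along these lines might work. This is only a sketch: the handler name /update/signature is hypothetical and does not appear in the original messages, and it assumes that the fill_signature chain stays defined as quoted above and that no other chain is marked as the default.

```xml
<!-- Plain update handler: no update.chain default, so ordinary updates
     (including updates to existing documents) skip the signature processor. -->
<requestHandler name="/update" class="solr.UpdateRequestHandler" />

<!-- Hypothetical second handler (name chosen for illustration only):
     used solely when the signature field should be computed. -->
<requestHandler name="/update/signature" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">fill_signature</str>
  </lst>
</requestHandler>
```

With this layout, requests sent to /update would not run the signature processor, while requests sent to the hypothetical /update/signature handler would. Alternatively, the second handler could be left out entirely and update.chain=fill_signature passed per request through setParam, as described in the reply.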