Hello everybody, I have tried to use the UniqFieldsUpdateProcessorFactory to get distinct values in multivalued fields. My configuration is below:

  <updateRequestProcessorChain name="uniq_fields">
    <processor class="org.apache.solr.update.processor.UniqFieldsUpdateProcessorFactory">
      <lst name="fields">
        <str>title</str>
        <str>tag_type</str>
      </lst>
    </processor>
    <processor class="solr.RunUpdateProcessorFactory" />
  </updateRequestProcessorChain>

  <requestHandler name="/update" class="solr.UpdateRequestHandler">
    <lst name="defaults">
      <str name="update.chain">uniq_fields</str>
    </lst>
  </requestHandler>

However, the data is indexed one value at a time, because a document may get an additional tag in a later update. I was hoping the UniqFieldsUpdateProcessorFactory would ensure that no duplicate tags end up in the field. To add a tag I send "tag_type": {"add": "foo"}, which still adds the tag without checking whether it is already part of the field. How can I achieve distinct values on the Solr side?

I suspect that writing my own processor might be a solution, but I am unsure how to do it and whether it is the proper way. Imagine an incoming update, e.g. an update of an existing document with several multivalued fields, that does not specify "add" or "set". That update would cause the document to be dropped and re-indexed without keeping any of the values previously added to the multivalued fields. So, when a field is updated, a value that is not yet in the index for that field should be added, and a value that is already there should be ignored. The processor has to decide, based on the existing index, whether a value gets added or not. Is that achievable on the Solr side?
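
To make the rule I have in mind concrete, here is a minimal sketch of just the merge step in plain Java, with no Solr classes involved. The class name DistinctMerge and the method mergeDistinct are made up for illustration, and the sample values are hypothetical. How a processor would obtain the values already stored in the index is not shown here and remains the open part of my question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DistinctMerge {

    // Returns the existing values followed by every incoming value that is
    // not already present. Insertion order is kept, duplicates are dropped.
    public static List<Object> mergeDistinct(Collection<?> existing, Collection<?> incoming) {
        Set<Object> merged = new LinkedHashSet<>();
        if (existing != null) {
            merged.addAll(existing);
        }
        if (incoming != null) {
            merged.addAll(incoming);
        }
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        // Hypothetical values already stored in tag_type.
        List<String> stored = Arrays.asList("foo", "bar");
        // Hypothetical values arriving with a later update.
        List<String> update = Arrays.asList("foo", "baz");
        System.out.println(mergeDistinct(stored, update)); // prints [foo, bar, baz]
    }
}
```

The LinkedHashSet is only there to keep the original order of the tags while dropping repeats; a plain HashSet would also give distinct values, but in no particular order.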
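
Below is my current, still mostly empty, processor class:

  import java.io.IOException;
  import java.util.Collection;

  import org.apache.solr.common.SolrInputDocument;
  import org.apache.solr.request.SolrQueryRequest;
  import org.apache.solr.response.SolrQueryResponse;
  import org.apache.solr.update.AddUpdateCommand;
  import org.apache.solr.update.processor.UpdateRequestProcessor;
  import org.apache.solr.update.processor.UpdateRequestProcessorFactory;

  public class ConditionalSolrUniqFieldValuesProcessorFactory extends UpdateRequestProcessorFactory {

    @Override
    public UpdateRequestProcessor getInstance(SolrQueryRequest sqr, SolrQueryResponse sqr1, UpdateRequestProcessor urp) {
      return new ConditionalUniqFieldValuesProcessor(urp);
    }

    class ConditionalUniqFieldValuesProcessor extends UpdateRequestProcessor {

      public ConditionalUniqFieldValuesProcessor(UpdateRequestProcessor next) {
        super(next);
      }

      @Override
      public void processAdd(AddUpdateCommand cmd) throws IOException {
        SolrInputDocument doc = cmd.getSolrInputDocument();
        Collection<String> incomingFieldNames = doc.getFieldNames();
        for (String t : incomingFieldNames) {
          /* Not implemented yet:
             is multivalued? if (doc.getField(t).) {
               If multivalued and already part of index, drop it.
               Otherwise add to multivalued field.
             }
          */
        }
        // Hand the document to the next processor in the chain;
        // without this call the update never reaches the index.
        super.processAdd(cmd);
      }
    }
  }

--
View this message in context: http://lucene.472066.n3.nabble.com/Distinct-values-in-multivalued-fields-tp4074337.html
Sent from the Solr - User mailing list archive at Nabble.com.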