Nick Hadder created SOLR-16160:
----------------------------------

Summary: UpdateXmlMessages duplicate data when data is removed and then added in the same message
Key: SOLR-16160
URL: https://issues.apache.org/jira/browse/SOLR-16160
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: search, update
Affects Versions: 8.11.1
Reporter: Nick Hadder
Attachments: image-2022-04-20-10-34-08-573.png, image-2022-04-20-10-35-05-247.png

*Replication Steps*

1. Have two multi-value fields with the following schema
{code:xml}
<field name="docTags" type="plongs" multiValued="true" indexed="true" stored="true"/>
<field name="tg0001" type="ipro_strings" multiValued="true" indexed="true" stored="true"/>

<fieldType name="plong" class="solr.LongPointField" docValues="true"/>
<fieldType name="ipro_strings" class="solr.TextField" sortMissingLast="true" multiValued="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
{code}

2. Execute the following UpdateXmlMessages
{code:xml}
<add commitWithin="1000">
  <doc>
    <field name="_id">1</field>
    <field name="docTags" update="remove"><![CDATA[1]]></field>
    <field name="tg0001" update="remove"><![CDATA[Convert to Image]]></field>
    <field name="docTags" update="remove"><![CDATA[4]]></field>
    <field name="tg0001" update="remove"><![CDATA[Large Files]]></field>
    <field name="docTags" update="remove"><![CDATA[6]]></field>
    <field name="tg0001" update="remove"><![CDATA[To Bulk-Print]]></field>
  </doc>
</add>
<add commitWithin="1000">
  <doc>
    <field name="_id">1</field>
    <field name="docTags" update="remove"><![CDATA[6]]></field>
    <field name="tg0001" update="remove"><![CDATA[To Bulk-Print]]></field>
    <field name="docTags" update="add-distinct"><![CDATA[1]]></field>
    <field name="tg0001" update="add-distinct"><![CDATA[Convert to Image]]></field>
    <field name="docTags" update="add-distinct"><![CDATA[4]]></field>
    <field name="tg0001" update="add-distinct"><![CDATA[Large Files]]></field>
  </doc>
</add>
<add commitWithin="1000">
  <doc>
    <field name="_id">1</field>
    <field name="docTags" update="remove"><![CDATA[1]]></field>
    <field name="tg0001" update="remove"><![CDATA[Convert to Image]]></field>
    <field name="docTags" update="remove"><![CDATA[4]]></field>
    <field name="tg0001" update="remove"><![CDATA[Large Files]]></field>
    <field name="docTags" update="add-distinct"><![CDATA[6]]></field>
    <field name="tg0001" update="add-distinct"><![CDATA[To Bulk-Print]]></field>
  </doc>
</add>
{code}

3. Observe the defect: duplicate values appear in those fields for that document

!image-2022-04-20-10-35-05-247.png!

*Note:* If the update="add-distinct" fields come first in the XML message and the update="remove" fields come at the bottom, it works as expected and adds only one instance of the data from the update="add-distinct" fields. The issue occurs only if the remove fields come before the add-distinct fields.

Is this because of some undocumented order the updates need to be in, or is it a true defect?
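
For comparison, here is a sketch of the ordering that the note describes as working: the third message from step 2 with the update="add-distinct" fields moved ahead of the update="remove" fields. It is reconstructed from the description in the note and is only an illustration; the field names and values are the ones used in the messages above.
{code:xml}
<add commitWithin="1000">
  <doc>
    <field name="_id">1</field>
    <!-- add-distinct fields first -->
    <field name="docTags" update="add-distinct"><![CDATA[6]]></field>
    <field name="tg0001" update="add-distinct"><![CDATA[To Bulk-Print]]></field>
    <!-- remove fields at the bottom -->
    <field name="docTags" update="remove"><![CDATA[1]]></field>
    <field name="tg0001" update="remove"><![CDATA[Convert to Image]]></field>
    <field name="docTags" update="remove"><![CDATA[4]]></field>
    <field name="tg0001" update="remove"><![CDATA[Large Files]]></field>
  </doc>
</add>
{code}
The only difference from the failing message is the order of the field elements within the doc; the set of operations is the same.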