Nick Hadder created SOLR-16160:
----------------------------------

             Summary: UpdateXmlMessages duplicate data when data is removed and then added in the same message
                 Key: SOLR-16160
                 URL: https://issues.apache.org/jira/browse/SOLR-16160
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: search, update
    Affects Versions: 8.11.1
            Reporter: Nick Hadder
         Attachments: image-2022-04-20-10-34-08-573.png, 
image-2022-04-20-10-35-05-247.png

*Steps to Reproduce*

1. Define two multi-valued fields with the following schema:
{code:java}
<field name="docTags" type="plongs" multiValued="true" indexed="true" stored="true"/>
<field name="tg0001" type="ipro_strings" multiValued="true" indexed="true" stored="true"/>

<fieldType name="plong" class="solr.LongPointField" docValues="true"/>
<fieldType name="ipro_strings" class="solr.TextField" sortMissingLast="true" multiValued="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

{code}
2. Execute the following UpdateXmlMessage
{code:java}
<add commitWithin="1000">
  <doc>
    <field name="_id">1</field>
    <field name="docTags" update="remove"><![CDATA[1]]></field>
    <field name="tg0001" update="remove"><![CDATA[Convert to Image]]></field>
    <field name="docTags" update="remove"><![CDATA[4]]></field>
    <field name="tg0001" update="remove"><![CDATA[Large Files]]></field>
    <field name="docTags" update="remove"><![CDATA[6]]></field>
    <field name="tg0001" update="remove"><![CDATA[To Bulk-Print]]></field>
  </doc>
</add>
<add commitWithin="1000">
  <doc>
    <field name="_id">1</field>
    <field name="docTags" update="remove"><![CDATA[6]]></field>
    <field name="tg0001" update="remove"><![CDATA[To Bulk-Print]]></field>
    <field name="docTags" update="add-distinct"><![CDATA[1]]></field>
    <field name="tg0001" update="add-distinct"><![CDATA[Convert to Image]]></field>
    <field name="docTags" update="add-distinct"><![CDATA[4]]></field>
    <field name="tg0001" update="add-distinct"><![CDATA[Large Files]]></field>
  </doc>
</add>
<add commitWithin="1000">
  <doc>
    <field name="_id">1</field>
    <field name="docTags" update="remove"><![CDATA[1]]></field>
    <field name="tg0001" update="remove"><![CDATA[Convert to Image]]></field>
    <field name="docTags" update="remove"><![CDATA[4]]></field>
    <field name="tg0001" update="remove"><![CDATA[Large Files]]></field>
    <field name="docTags" update="add-distinct"><![CDATA[6]]></field>
    <field name="tg0001" update="add-distinct"><![CDATA[To Bulk-Print]]></field>
  </doc>
</add>
{code}
3. Observe that the document now contains duplicate values in both fields:

!image-2022-04-20-10-35-05-247.png!
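
For comparison, here is a minimal sketch of the result I would expect from the three messages above. It is a plain-Python model, not Solr code, and it rests on assumptions: that update="remove" deletes every occurrence of a value, that update="add-distinct" appends a value only when it is absent, that operations apply in the order they appear, and that the document starts with docTags 1, 4 and 6 (the starting state is not part of this report). The helper name apply_ops is hypothetical.

```python
def apply_ops(values, ops):
    """Apply (op, value) pairs in order to a multi-valued field (assumed semantics)."""
    result = list(values)
    for op, value in ops:
        if op == "remove":
            # Assumption: remove drops every occurrence of the value.
            result = [v for v in result if v != value]
        elif op == "add-distinct":
            # Assumption: add-distinct appends only if the value is absent.
            if value not in result:
                result.append(value)
        else:
            raise ValueError(f"unsupported op: {op}")
    return result


# The docTags operations from the three <add> blocks, in document order.
messages = [
    [("remove", 1), ("remove", 4), ("remove", 6)],
    [("remove", 6), ("add-distinct", 1), ("add-distinct", 4)],
    [("remove", 1), ("remove", 4), ("add-distinct", 6)],
]

doc_tags = [1, 4, 6]  # assumed starting state, not taken from the report
for ops in messages:
    doc_tags = apply_ops(doc_tags, ops)

print(doc_tags)  # [6]
```

Under those assumptions each value should appear at most once after every message, so any repeated value in docTags or tg0001 is unexpected.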

*Note:* If the update="add-distinct" fields come first in the XML message and 
the update="remove" fields come last, it works as expected and adds only one 
instance of each value. The issue occurs only when the remove fields come 
before the add-distinct fields.

 

Is this caused by an undocumented ordering requirement for the update 
operations, or is it a genuine defect?



