[ 
https://issues.apache.org/jira/browse/SOLR-18194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18079228#comment-18079228
 ] 

ASF subversion and git services commented on SOLR-18194:
--------------------------------------------------------

Commit 4f130b8c2f48aba397ce4eae147fb75477f4201e in solr's branch 
refs/heads/branch_9x from Luke Kot-Zaniewski
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=4f130b8c2f4 ]

SOLR-18194: fix nested docs detection false positive (#4279)

Previously, if a segment contained more than one version of the same Solr 
document (a delete plus an add within one commit interval), the 
UPGRADECOREINDEX action would falsely identify the segment as having child 
documents. The action does not support child documents, so it would fail 
unnecessarily. The check now compares the cardinality of id with that of 
_root_ to identify child documents.

(cherry picked from commit 744184ab0d73993625a5df54999182a3ecad90a5)


> Handle Updated Documents in UpgradeCoreIndex Child Doc Check
> ------------------------------------------------------------
>
>                 Key: SOLR-18194
>                 URL: https://issues.apache.org/jira/browse/SOLR-18194
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Lucas Kot-Zaniewski
>            Assignee: Lucas Kot-Zaniewski
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> The child doc check in UpgradeCoreIndex compares the docCount with the terms 
> size of the {{\_root\_}} field, on the assumption that if there are more 
> Lucene documents than unique values (equivalent to the size of terms) then 
> some Lucene documents share a {{\_root\_}}. Nominally this is true. However, 
> when a document is updated, several versions of a single _Solr_ document can 
> be written to the _same_ segment if the updates fall in the _same_ commit 
> window. In this case the (Lucene) docCount will exceed the terms size of 
> {{\_root\_}} even though none of the documents have children. A better check 
> would be to compare the terms size of {{\_root\_}} explicitly with the count 
> of distinct Solr documents. Owing to the cycle-free relationship of 
> {{\_root\_}}, we know that if the count of distinct Solr documents equals the 
> number of distinct values of {{\_root\_}} then we have no child docs. We can 
> implement this by looking at the distinct values of Solr's unique key field.
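
The difference between the two checks can be illustrated with a minimal,
self-contained sketch. It is not the code from the commit: the class, the
Doc type and both method names are hypothetical, and a segment is modelled
as a plain list rather than read through Lucene. It assumes every document
carries a _root_ value, that a childless document's _root_ equals its own
id, and that the superseded version of an updated document is still present
in the segment.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ChildDocCheckSketch {

    /** Hypothetical stand-in for one Lucene document in a segment. */
    static final class Doc {
        final String id;   // value of the unique key field
        final String root; // value of the _root_ field

        Doc(String id, String root) {
            this.id = id;
            this.root = root;
        }
    }

    /** Old-style check: more Lucene docs than distinct _root_ values. */
    static boolean docCountCheck(List<Doc> segment) {
        Set<String> roots = new HashSet<>();
        for (Doc d : segment) {
            roots.add(d.root);
        }
        return segment.size() > roots.size();
    }

    /** Improved check: more distinct ids than distinct _root_ values. */
    static boolean cardinalityCheck(List<Doc> segment) {
        Set<String> ids = new HashSet<>();
        Set<String> roots = new HashSet<>();
        for (Doc d : segment) {
            ids.add(d.id);
            roots.add(d.root);
        }
        return ids.size() > roots.size();
    }

    public static void main(String[] args) {
        // Doc "1" added, then deleted and re-added in the same commit window.
        List<Doc> updated = List.of(new Doc("1", "1"), new Doc("1", "1"));
        // Parent "p1" with one child "c1"; both share _root_ = "p1".
        List<Doc> nested = List.of(new Doc("c1", "p1"), new Doc("p1", "p1"));

        System.out.println("updated, docCountCheck:    " + docCountCheck(updated));
        System.out.println("updated, cardinalityCheck: " + cardinalityCheck(updated));
        System.out.println("nested,  docCountCheck:    " + docCountCheck(nested));
        System.out.println("nested,  cardinalityCheck: " + cardinalityCheck(nested));
    }
}
```

Under these assumptions the updated segment is flagged only by the
doc-count comparison, while the nested segment is flagged by both.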


