Lucas Kot-Zaniewski created SOLR-18194:
------------------------------------------
Summary: Handle Updated Documents in UpgradeCoreIndex Child Doc Check
Key: SOLR-18194
URL: https://issues.apache.org/jira/browse/SOLR-18194
Project: Solr
Issue Type: Improvement
Reporter: Lucas Kot-Zaniewski
Assignee: Lucas Kot-Zaniewski
The child doc check in UpgradeCoreIndex compares the docCount with the terms
size of the {{\_root\_}} field, with the intention that if there are more Lucene
documents than unique values (equivalent to the size of terms) then some Lucene
documents share a {{\_root\_}}. Nominally this is true; however, in the case of
document updates, multiple versions of a single _Solr_ document can be written
to the _same_ segment if the updates fall in the _same_ commit window. In this
case the (Lucene) docCount will exceed the terms size of {{\_root\_}} even though
none of the documents have children. A better check would be to explicitly
compare the terms size of {{\_root\_}} with the count of distinct Solr documents.
Owing to the cycle-free relationship of {{\_root\_}}, we know that if the count
of distinct Solr documents equals the number of distinct values of {{\_root\_}}
then we have no child docs. We can implement this by looking at the distinct
values of Solr's unique key field.
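
As a rough illustration of the difference between the two checks, here is a
minimal sketch in plain Java. It does not use the Lucene or Solr APIs and is not
the UpgradeCoreIndex code: the class {{ChildDocCheckSketch}}, the {{LuceneDoc}}
type, and all method names are hypothetical, and a segment is modeled as a
simple list in which every Lucene document carries a unique key and a
{{\_root\_}} value (an assumption made for the example).

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ChildDocCheckSketch {

    /** Toy stand-in for one Lucene document: only its unique key and _root_ value. */
    static final class LuceneDoc {
        final String id;
        final String root;

        LuceneDoc(String id, String root) {
            this.id = id;
            this.root = root;
        }
    }

    /** Hypothetical helper to keep the examples short. */
    static LuceneDoc doc(String id, String root) {
        return new LuceneDoc(id, root);
    }

    /** Number of distinct _root_ values (the "terms size" in the description). */
    static int distinctRoots(List<LuceneDoc> segment) {
        Set<String> roots = new HashSet<>();
        for (LuceneDoc d : segment) {
            roots.add(d.root);
        }
        return roots.size();
    }

    /** Number of distinct unique key values, i.e. distinct Solr documents. */
    static int distinctIds(List<LuceneDoc> segment) {
        Set<String> ids = new HashSet<>();
        for (LuceneDoc d : segment) {
            ids.add(d.id);
        }
        return ids.size();
    }

    /** Check as described for the current code: Lucene doc count vs. distinct _root_ values. */
    static boolean currentCheckSeesChildren(List<LuceneDoc> segment) {
        return segment.size() > distinctRoots(segment);
    }

    /** Proposed check: distinct Solr documents vs. distinct _root_ values. */
    static boolean proposedCheckSeesChildren(List<LuceneDoc> segment) {
        return distinctIds(segment) != distinctRoots(segment);
    }

    public static void main(String[] args) {
        // Flat document "a" updated once inside the same commit window:
        // both versions sit in the same segment and share _root_ = "a".
        List<LuceneDoc> updatedFlat = List.of(doc("a", "a"), doc("a", "a"), doc("b", "b"));

        // Parent "p" with two real children that point at it through _root_.
        List<LuceneDoc> nested = List.of(doc("p", "p"), doc("c1", "p"), doc("c2", "p"));

        System.out.println("updated flat, current check:  " + currentCheckSeesChildren(updatedFlat));
        System.out.println("updated flat, proposed check: " + proposedCheckSeesChildren(updatedFlat));
        System.out.println("nested, current check:        " + currentCheckSeesChildren(nested));
        System.out.println("nested, proposed check:       " + proposedCheckSeesChildren(nested));
    }
}
```

In this toy model the updated flat document trips the doc-count comparison but
not the unique-key comparison, while the genuinely nested segment is flagged by
both. A real implementation would read these counts from the index rather than
from an in-memory list.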