David Smiley created SOLR-17102:
-----------------------------------
Summary: VersionBucket not needed
Key: SOLR-17102
URL: https://issues.apache.org/jira/browse/SOLR-17102
Project: Solr
Issue Type: Improvement
Security Level: Public (Default Security Level. Issues are Public)
Components: SolrCloud
Reporter: David Smiley
SolrCloud ensures that updates for the same document ID are applied in the
correct order internally, even when replication or log replay re-orders them.
To apply those updates one at a time, a lock is held on a hash of the
document's ID. A hash is used to limit the total number of locks, because the
locks are pre-created for each core (numVersionBuckets == 65k by default). The
memory cost is non-negligible with many cores, and hashing introduces the
possibility of collisions, especially if you configure a much lower bucket
count.
Here I propose doing away with the pre-created hashed bucket strategy. Instead,
I propose simply creating a lock per update being processed, letting it be
GC'ed afterwards, and using a ConcurrentHashMap to hold the locks that are in
flight. org.apache.solr.util.OrderedExecutor.SparseStripedLock already uses
this strategy, more or less.
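To make the proposal above concrete, here is a minimal sketch of a per-ID lock
held in a ConcurrentHashMap and dropped once nobody uses it. It is not Solr
code and not necessarily how SparseStripedLock works; the names PerIdLocks,
Entry, withLock and inFlightCount are hypothetical, and the reference count is
one assumed way to decide when an entry can be removed.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical sketch: one lock per in-flight document ID, removed when unused. */
public class PerIdLocks {

  /** A lock plus the number of threads currently holding or waiting for it. */
  private static final class Entry {
    final ReentrantLock lock = new ReentrantLock();
    int refs; // only mutated inside ConcurrentHashMap.compute, which is atomic per key
  }

  private final ConcurrentHashMap<String, Entry> inFlight = new ConcurrentHashMap<>();

  /** Runs the action while holding the lock for this ID only. */
  public <T> T withLock(String id, Supplier<T> action) {
    // Create the entry on first use and register our interest in it.
    Entry e = inFlight.compute(id, (k, v) -> {
      if (v == null) {
        v = new Entry();
      }
      v.refs++;
      return v;
    });
    e.lock.lock();
    try {
      return action.get();
    } finally {
      e.lock.unlock();
      // Returning null from compute removes the mapping, so the lock becomes garbage
      // once the last interested thread is done with it.
      inFlight.compute(id, (k, v) -> --v.refs == 0 ? null : v);
    }
  }

  /** Number of IDs that currently have a lock; zero when nothing is being processed. */
  public int inFlightCount() {
    return inFlight.size();
  }
}
```

With this shape, memory scales with the number of concurrent updates rather
than with a fixed bucket count per core, and two different IDs never share a
lock.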
Doing this is more tractable now that VersionBucket holds only a lock and no
longer a version (SOLR-17036).
The biggest challenge is that the code calls for the ability to use a Condition
to await/signal, which means the solution can neither simply re-use the
SparseStripedLock mentioned above nor be quite as simple as it.
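One possible way to support a Condition is sketched below, under stated
assumptions: each map entry carries a ReentrantLock together with a Condition
created from it, and a reference count keeps the entry in the map while any
thread holds or awaits it, so a signalling thread finds the same Condition.
The names PerIdConditionLocks, Handle, Action and withLock are hypothetical
and are not Solr APIs.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: per-ID locks that also expose await/signal on a Condition. */
public class PerIdConditionLocks {

  /** Handed to the action so it can wait or signal while holding the lock. */
  public static final class Handle {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private int refs; // only mutated inside ConcurrentHashMap.compute

    /** Releases the lock while waiting; returns an estimate of the nanos left. */
    public long awaitNanos(long nanos) throws InterruptedException {
      return changed.awaitNanos(nanos);
    }

    /** Wakes all waiters on this ID; the caller must hold the lock. */
    public void signalAll() {
      changed.signalAll();
    }
  }

  /** Work to run under the lock; may wait on the handle, hence the checked exception. */
  public interface Action<T> {
    T apply(Handle handle) throws InterruptedException;
  }

  private final ConcurrentHashMap<String, Handle> inFlight = new ConcurrentHashMap<>();

  public <T> T withLock(String id, Action<T> action) throws InterruptedException {
    Handle h = inFlight.compute(id, (k, v) -> {
      if (v == null) {
        v = new Handle();
      }
      v.refs++;
      return v;
    });
    h.lock.lock();
    try {
      return action.apply(h);
    } finally {
      h.lock.unlock();
      // A thread parked in awaitNanos is still inside withLock, so its reference
      // keeps the entry (and its Condition) in the map until it returns.
      inFlight.compute(id, (k, v) -> --v.refs == 0 ? null : v);
    }
  }

  public int inFlightCount() {
    return inFlight.size();
  }
}
```

A waiter would typically loop on its own predicate inside the action, calling
handle.awaitNanos(remaining) until the predicate holds or the time runs out,
while the thread that makes the predicate true calls handle.signalAll() under
the same ID's lock.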