[ https://issues.apache.org/jira/browse/SOLR-17102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated SOLR-17102:
----------------------------------
    Labels: pull-request-available  (was: )

> VersionBucket not needed
> ------------------------
>
>                 Key: SOLR-17102
>                 URL: https://issues.apache.org/jira/browse/SOLR-17102
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> SolrCloud ensures internally that updates for the same document ID are
> applied in the correct order, even when they may be re-ordered during
> replication or log replay. To ensure the updates are applied one after
> another, a lock is held on a hash of the document's ID. A hash is used to
> limit the total number of locks, because the locks are pre-created for the
> core (numVersionBuckets == 65k by default). The memory cost is
> non-negligible with many cores, and hashing introduces the possibility of
> collisions, especially if you configure a much lower bucket count.
>
> Here I propose doing away with the pre-created hashed bucket strategy.
> Instead, I propose more simply creating and GC'ing a lock per update being
> processed, and using a ConcurrentHashMap to hold the locks that are
> in flight. This strategy is already used, more or less, in
> org.apache.solr.util.OrderedExecutor.SparseStripedLock.
>
> Doing this is more tractable now that VersionBucket only holds a lock and
> no longer a version (see SOLR-17036).
>
> The biggest challenge is that the code calls for the ability to use a
> Condition to await/notify, which means the solution can't just re-use
> SparseStripedLock above, nor can it be quite so simple.
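
For illustration only, the proposal above (one lock per in-flight document ID, held in a ConcurrentHashMap and dropped once nobody uses it, with a Condition available for await/notify) could be sketched roughly as follows. This is not the code from the pull request or from Solr: the names InFlightLocks, Entry, acquire and release are hypothetical, and the reference count is just one assumed way to decide when an entry can be removed.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Hypothetical sketch: one lock per in-flight key, created on demand and
 * removed from the map when the last user releases it.
 */
final class InFlightLocks<K> {

  /** Hypothetical per-key entry: a lock, a Condition on it, and a user count. */
  static final class Entry {
    public final ReentrantLock lock = new ReentrantLock();
    // Callers holding the lock may await/signal on this condition.
    public final Condition condition = lock.newCondition();
    // Only read or written inside ConcurrentHashMap.compute for the owning key.
    int refCount;
  }

  private final ConcurrentHashMap<K, Entry> inFlight = new ConcurrentHashMap<>();

  /** Registers interest in the key, then blocks until its lock is held. */
  Entry acquire(K key) {
    Entry entry = inFlight.compute(key, (k, existing) -> {
      Entry e = (existing == null) ? new Entry() : existing;
      e.refCount++; // counted before locking, so waiters keep the entry alive
      return e;
    });
    entry.lock.lock();
    return entry;
  }

  /** Unlocks, then removes the entry if no other thread still uses it. */
  void release(K key, Entry entry) {
    entry.lock.unlock();
    inFlight.compute(key, (k, current) -> {
      if (current == null) {
        throw new IllegalStateException("release without matching acquire");
      }
      // Returning null from compute removes the mapping.
      return (--current.refCount == 0) ? null : current;
    });
  }

  /** Number of keys that currently have a live lock entry. */
  int size() {
    return inFlight.size();
  }
}
```

A caller would wrap the update in acquire(id) / try / finally release(id, entry), and could call entry.condition.await() or entry.condition.signalAll() while holding the lock. Because the count is incremented before the lock is taken, a thread blocked on the lock or waiting on the condition still pins the entry in the map, so two threads never end up with different lock objects for the same ID.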