Caleb Rackliffe created CASSANDRA-19107:
-------------------------------------------

             Summary: Revert unnecessary read lock acquisition when reading 
ring version in TokenMetadata introduced in CASSANDRA-16286
                 Key: CASSANDRA-19107
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19107
             Project: Cassandra
          Issue Type: Improvement
          Components: Legacy/Distributed Metadata
            Reporter: Caleb Rackliffe
            Assignee: Caleb Rackliffe


CASSANDRA-16286 achieved its goal of making sure that concurrent increments to 
the ring version would independently increment the version (i.e. not "merge" 
multiple invalidations into single versions), but it also unnecessarily 
replaced the volatile read on {{ringVersion}} w/ making {{readVersion}} 
non-volatile and acquiring the read lock on the fair {{ReadWriteLock}} in 
{{TokenMetadata}}. This can result in unnecessary queueing w/ high CPU 
usage/read volume. For example, you might see this on a 4.0 cluster...

{noformat}
"Native-Transport-Requests-99" #271 daemon prio=5 os_prio=0 cpu=5822566.56ms 
elapsed=19477779.40s tid=0x00007fcc96c31b00 nid=0xb7bd waiting on condition  
[0x00007fcb7f144000]
  java.lang.Thread.State: WAITING (parking)
       at jdk.internal.misc.Unsafe.park(java.base@11.0.16/Native Method)
       - parking to wait for  <0x00000005c0ab92a0> (a 
java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
       at 
java.util.concurrent.locks.LockSupport.park(java.base@11.0.16/LockSupport.java:194)
       at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11.0.16/AbstractQueuedSynchronizer.java:885)
       at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(java.base@11.0.16/AbstractQueuedSynchronizer.java:1009)
       at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(java.base@11.0.16/AbstractQueuedSynchronizer.java:1324)
       at 
java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(java.base@11.0.16/ReentrantReadWriteLock.java:738)
       at 
org.apache.cassandra.locator.TokenMetadata.getRingVersion(TokenMetadata.java:1389)
       at 
org.apache.cassandra.locator.AbstractReplicationStrategy.getCachedReplicas(AbstractReplicationStrategy.java:82)
       at 
org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalReplicas(AbstractReplicationStrategy.java:116)
       at 
org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalReplicasForToken(AbstractReplicationStrategy.java:109)
       at 
org.apache.cassandra.locator.ReplicaLayout.forTokenWriteLiveAndDown(ReplicaLayout.java:209)
       at 
org.apache.cassandra.locator.ReplicaPlans.forWrite(ReplicaPlans.java:328)
       at 
org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:1426)
       at 
org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:937)
{noformat}

Reverting to a volatile read makes this no longer possible, but keeps the fix 
from CASSANDRA-16286 intact.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to