dcapwell commented on code in PR #3664:
URL: https://github.com/apache/cassandra/pull/3664#discussion_r1835158806


##########
src/java/org/apache/cassandra/service/accord/AccordConfigurationService.java:
##########
@@ -381,7 +381,7 @@ public synchronized void onNodesRemoved(long epoch, Topology current, Set<Node.I
     private long[] nonCompletedEpochsBefore(long max)

Review Comment:
   There are three things that matter:
   
   1) `epochs.minEpoch()`
   2) `epochs.maxEpoch()`
   3) `snapshot.syncStatus`
   
   The first two share the epoch lock, and `maxEpoch` could grow between calls
   to `minEpoch`, but in this context that seems fine.
   
   That leaves `snapshot.syncStatus`, which does need the
   `AccordConfigurationService.this` lock for visibility.
   
   To limit locking as much as possible, I could either lock `getEpochSnapshot`
   or make `syncStatus` volatile. Given that this is the only code path that
   reads it without a lock, and it does so via `getEpochSnapshot`, which is only
   called here, I went with synchronizing `getEpochSnapshot`.
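   
   For reference, a minimal sketch of the two visibility options weighed above.
   The class and method names (`SyncStatusVisibility`, `LockedState`,
   `VolatileState`) and the enum values are hypothetical stand-ins, not the
   real `AccordConfigurationService` types:

   ```java
   // Hypothetical stand-ins for illustration only; not the actual Accord classes.
   final class SyncStatusVisibility {
       enum SyncStatus { NOT_STARTED, NOTIFYING, COMPLETED }

       // Option A: plain field; every read and write takes the same monitor,
       // so a reader always observes the latest write made under that lock.
       static final class LockedState {
           private SyncStatus syncStatus = SyncStatus.NOT_STARTED;

           synchronized void setSyncStatus(SyncStatus status) { syncStatus = status; }

           synchronized SyncStatus getSyncStatus() { return syncStatus; }
       }

       // Option B: volatile field; reads need no lock, and a volatile read
       // observes the most recent volatile write to the same field.
       static final class VolatileState {
           private volatile SyncStatus syncStatus = SyncStatus.NOT_STARTED;

           void setSyncStatus(SyncStatus status) { syncStatus = status; }

           SyncStatus getSyncStatus() { return syncStatus; }
       }
   }
   ```

   Either form gives the reader a happens-before edge with the writer. The
   synchronized getter keeps the field declaration unchanged, at the cost of
   taking the monitor on the one unlocked read path.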
   
   A third option is to just accept that it's stale: by the time this method
   exits, the status might flip to complete, so we would add the node to
   `remote_sync_complete` in the topologies table twice. Since this is a set,
   it will dedup for us.
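   
   A small sketch of why the double add would be harmless, using an in-memory
   `HashSet` as a hypothetical stand-in for the set-typed
   `remote_sync_complete` column (the class `RemoteSyncCompleteSketch` and its
   methods are made up for illustration, and node ids are simplified to ints):

   ```java
   import java.util.HashSet;
   import java.util.Set;

   // Hypothetical in-memory model of a set-typed column; not the real table access code.
   final class RemoteSyncCompleteSketch {
       private final Set<Integer> remoteSyncComplete = new HashSet<>();

       // Records a node as sync-complete; returns true only if it was not already present.
       boolean markComplete(int nodeId) {
           return remoteSyncComplete.add(nodeId);
       }

       // Number of distinct nodes recorded so far.
       int size() {
           return remoteSyncComplete.size();
       }
   }
   ```

   Adding the same node a second time leaves the set unchanged, so a stale read
   of the status costs a redundant write rather than a wrong result.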



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

