aweisberg commented on code in PR #4569:
URL: https://github.com/apache/cassandra/pull/4569#discussion_r2743636168


##########
src/java/org/apache/cassandra/replication/CoordinatorLog.java:
##########
@@ -283,6 +283,24 @@ Offsets.Immutable collectReconciledOffsets()
         }
     }
 
+    /**
+     * Returns the UNION of all witnessed offsets from all participants.
+     * This represents all offsets that ANY replica has witnessed.
+     */
+    Offsets.Immutable collectUnionOfWitnessedOffsets()

Review Comment:
   Spotted another wrinkle. Network messages can be delayed arbitrarily. So 
you can wait 300 milliseconds, but when you receive the broadcast you don't 
know how fresh its contents are. Both the message and its contents can be 
arbitrarily stale, because you don't know when the contents were actually 
collected from the data structures at the node sending the message. 
   
   In a request/response scenario you know when the response was sent (some 
time after the request). The broadcast message could include the current time, 
but we don't have hard guarantees on clock sync, and in the simulator we 
actively bring clocks out of sync, so relying on timestamps would probably 
cause visible failures.
   
   You can do the "wait on broadcast" thing, but the broadcast message has to 
include something from the sync coordinator node that shows the sync 
coordinator that the information was collected after the sync coordinator 
started.
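   
   To make that last point concrete, here is a rough sketch of one way it 
could work, not a proposal for the actual implementation. Every name below 
(`SyncFenceSketch`, `OffsetsBroadcast`, `echoedSyncToken`, and so on) is 
hypothetical and does not exist in the codebase. It assumes a single sync 
coordinator that hands out a monotonically increasing token when it starts, 
and replicas that echo the highest token they have seen in every broadcast. 
No wall clocks are involved.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch only; none of these types exist in Cassandra. */
public class SyncFenceSketch
{
    /** Broadcast payload: offsets plus the highest sync token the sender had seen at collection time. */
    public static final class OffsetsBroadcast
    {
        final String sender;
        final long echoedSyncToken;
        final long maxWitnessedOffset; // stand-in for the real offsets structure

        OffsetsBroadcast(String sender, long echoedSyncToken, long maxWitnessedOffset)
        {
            this.sender = sender;
            this.echoedSyncToken = echoedSyncToken;
            this.maxWitnessedOffset = maxWitnessedOffset;
        }
    }

    /** Replica side: remembers the highest token received from the sync coordinator. */
    public static final class Replica
    {
        private final String name;
        private long lastSeenSyncToken = 0; // 0 means "no sync seen yet"
        private long maxWitnessedOffset = 0;

        public Replica(String name) { this.name = name; }

        public void onSyncStart(long token) { lastSeenSyncToken = Math.max(lastSeenSyncToken, token); }

        public void witness(long offset) { maxWitnessedOffset = Math.max(maxWitnessedOffset, offset); }

        /** The token is read in the same step as the offsets, so it bounds when they were collected. */
        public OffsetsBroadcast collectBroadcast()
        {
            return new OffsetsBroadcast(name, lastSeenSyncToken, maxWitnessedOffset);
        }
    }

    /** Sync coordinator side: only counts broadcasts that echo the current token. */
    public static final class SyncCoordinator
    {
        private long currentToken = 0;
        private final Map<String, Long> accepted = new HashMap<>();

        public long startSync()
        {
            accepted.clear();
            return ++currentToken; // sent to every replica as part of starting the sync
        }

        public boolean onBroadcast(OffsetsBroadcast b)
        {
            // An older token means the contents may predate this sync, however late they arrived.
            if (b.echoedSyncToken < currentToken)
                return false;
            accepted.merge(b.sender, b.maxWitnessedOffset, Math::max);
            return true;
        }

        public int acceptedCount() { return accepted.size(); }
    }
}
```

   A broadcast that was collected before the replica heard about the sync 
carries an older token and is ignored, no matter how long it was delayed in 
the network. With multiple concurrent sync coordinators the token would need 
to be scoped per coordinator, which this sketch does not attempt.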



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

