aweisberg commented on code in PR #4569:
URL: https://github.com/apache/cassandra/pull/4569#discussion_r2743636168
##########
src/java/org/apache/cassandra/replication/CoordinatorLog.java:
##########
@@ -283,6 +283,24 @@ Offsets.Immutable collectReconciledOffsets()
}
}
+ /**
+ * Returns the UNION of all witnessed offsets from all participants.
+ * This represents all offsets that ANY replica has witnessed.
+ */
+ Offsets.Immutable collectUnionOfWitnessedOffsets()
Review Comment:
Spotted another wrinkle. Network messages can be delayed arbitrarily, so
you can wait 300 milliseconds, but when you receive the broadcast you don't
know how current its contents are. Both the message and its contents can be
arbitrarily stale, because you don't know when the contents were actually
collected from the data structures at the node sending the message.
In a request/response scenario you know the response was sent some time
after the request. The broadcast message could include the current time, but
we don't have hard guarantees on clock sync, and in the simulator we actively
bring clocks out of sync, so this would probably cause visible failures.
You can do the "wait on broadcast" thing, but the broadcast message has to
include something from the sync coordinator node that shows the sync
coordinator that the information was collected after the sync coordinator
started.
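To make the last point concrete, here is a minimal sketch of one way to do it,
not a proposal for the actual patch. It assumes the sync coordinator hands out
a locally increasing epoch when a sync starts, and that each replica echoes
the highest epoch it has seen in every broadcast. All names here
(SyncEpochSketch, Replica, SyncCoordinator, WitnessedBroadcast) are
hypothetical and do not exist in Cassandra. The offsets are reduced to a
single long to keep it short, and no clocks are compared anywhere.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: none of these types are part of Cassandra. */
public class SyncEpochSketch
{
    /** Broadcast payload: what the sender witnessed, plus the sync epoch it had seen when it collected that state. */
    static final class WitnessedBroadcast
    {
        final String sender;
        final long echoedEpoch;        // token originally issued by the sync coordinator
        final long maxWitnessedOffset; // stand-in for the real witnessed offsets

        WitnessedBroadcast(String sender, long echoedEpoch, long maxWitnessedOffset)
        {
            this.sender = sender;
            this.echoedEpoch = echoedEpoch;
            this.maxWitnessedOffset = maxWitnessedOffset;
        }
    }

    static final class Replica
    {
        private final String name;
        private long lastSeenEpoch = 0;
        private long maxWitnessedOffset = 0;

        Replica(String name) { this.name = name; }

        /** Called when the coordinator's "sync started" message arrives (assumed message). */
        void onSyncStart(long epoch) { lastSeenEpoch = Math.max(lastSeenEpoch, epoch); }

        void witness(long offset) { maxWitnessedOffset = Math.max(maxWitnessedOffset, offset); }

        /** Snapshot local state; the echoed epoch is read at collection time, not at send time. */
        WitnessedBroadcast collect() { return new WitnessedBroadcast(name, lastSeenEpoch, maxWitnessedOffset); }
    }

    static final class SyncCoordinator
    {
        private long currentEpoch = 0; // assumed to only ever increase for this coordinator
        private final Map<String, Long> accepted = new HashMap<>();

        long startSync()
        {
            accepted.clear();
            return ++currentEpoch;
        }

        /** Accept only broadcasts that prove they were collected after this sync started. */
        boolean onBroadcast(WitnessedBroadcast b)
        {
            if (b.echoedEpoch < currentEpoch)
                return false; // may predate the sync, however recently it arrived
            accepted.merge(b.sender, b.maxWitnessedOffset, Math::max);
            return true;
        }

        Map<String, Long> accepted() { return accepted; }
    }

    public static void main(String[] args)
    {
        SyncCoordinator coordinator = new SyncCoordinator();
        Replica replica = new Replica("r1");
        replica.witness(10);

        WitnessedBroadcast stale = replica.collect(); // collected before the sync exists
        long epoch = coordinator.startSync();
        replica.onSyncStart(epoch);
        replica.witness(12);
        WitnessedBroadcast fresh = replica.collect(); // collected after seeing the epoch

        System.out.println("stale accepted: " + coordinator.onBroadcast(stale)); // false
        System.out.println("fresh accepted: " + coordinator.onBroadcast(fresh)); // true
    }
}
```

The stale broadcast is rejected even if it arrives inside the wait window,
which is the case a plain timeout cannot detect. A real version would also
need the epoch to stay unique across coordinator restarts.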
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]