iamaleksey commented on code in PR #4106:
URL: https://github.com/apache/cassandra/pull/4106#discussion_r2070147166
##########
src/java/org/apache/cassandra/replication/Shard.java:
##########
@@ -94,6 +109,42 @@ void addSummaryForRange(AbstractBounds<PartitionPosition> range, boolean include
});
}
+ List<InetAddressAndPort> remoteReplicas()
+ {
+ List<InetAddressAndPort> replicas = new ArrayList<>(participants.size() - 1);
+ for (int i = 0, size = participants.size(); i < size; ++i)
+ {
+ int hostId = participants.get(i);
+ if (hostId != localHostId)
+ replicas.add(ClusterMetadata.current().directory.endpoint(new NodeId(hostId)));
+ }
+ return replicas;
+ }
+
+ /**
+ * Collects replicated offsets for the logs owned by this coordinator on this shard.
+ */
+ ShardReplicatedOffsets collectReplicatedOffsets()
+ {
+ Long2ObjectHashMap<LogReplicatedOffsets> offsets = new Long2ObjectHashMap<>();
+ for (CoordinatorLogPrimary log : primaryLogs())
Review Comment:
The concern isn't the broadcast payload size in isolation, which I agree is
ultimately not a serious issue. There is also the work you need to do with
each message when it arrives. Multiply that by the frequency of broadcasts,
and possibly by RF, and you get the final cost. There is a maximum cost that
we are willing to pay here, and since client write frequency is mostly outside
of our control, the main variable we can tune is the frequency of broadcasts.
If only the coordinator broadcasts its logs' states, then you can afford a
higher frequency of broadcasts. If every replica does, then you have to scale
down the maximum broadcast frequency by a factor of RF. And we want the
broadcasts to be *frequent* to make reads as cheap as possible. Every avoidable
delay in propagation potentially costs us blocking on reconciles that don't
really need to be done, and/or triggering SRP that could have been avoided by
a broadcast arriving earlier. Additionally, the broadcasts from non-coordinator
nodes will always be almost entirely redundant subsets of the coordinator's
broadcasts, since the coordinator will always have the freshest and fullest
picture, barring some in-flight write responses.
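
To make the scaling argument concrete, here is a rough back-of-the-envelope
sketch. It is not code from this PR: the class `BroadcastBudget`, its methods,
and all the numbers are hypothetical. It assumes the cost a node pays per log
is roughly (work per received message) x (broadcast frequency) x (number of
senders for that log), with one sender in the coordinator-only scheme and RF
senders if every replica broadcasts.

```java
// Hypothetical cost model for illustration only; none of these names exist in Cassandra.
public final class BroadcastBudget
{
    /**
     * Cost per second paid for one log: each received broadcast costs
     * workPerMessage, and every sender broadcasts at the same frequency.
     */
    public static double costPerSecond(double workPerMessage, double broadcastsPerSecond, int senders)
    {
        return workPerMessage * broadcastsPerSecond * senders;
    }

    /**
     * Highest per-sender broadcast frequency that stays within the budget.
     */
    public static double maxFrequency(double budgetPerSecond, double workPerMessage, int senders)
    {
        return budgetPerSecond / (workPerMessage * senders);
    }

    public static void main(String[] args)
    {
        int rf = 3;                  // assumed replication factor
        double budget = 600.0;       // assumed cost budget, arbitrary units per second
        double workPerMessage = 1.0; // assumed cost of handling one broadcast

        // Only the coordinator broadcasts the state of its logs: one sender.
        double coordinatorOnly = maxFrequency(budget, workPerMessage, 1);
        // Every replica broadcasts: RF senders for the same log.
        double everyReplica = maxFrequency(budget, workPerMessage, rf);

        System.out.println("coordinator-only max frequency: " + coordinatorOnly);
        System.out.println("every-replica max frequency:    " + everyReplica);
        System.out.println("ratio:                          " + (coordinatorOnly / everyReplica));
    }
}
```

Under these assumptions the ratio between the two maximum frequencies is
exactly RF, which is the scale-down described above. The real cost also
depends on what each receiver does with the offsets, so treat this as an
intuition aid rather than a measurement.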
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]