beobal commented on code in PR #4282:
URL: https://github.com/apache/cassandra/pull/4282#discussion_r2262216604
##########
src/java/org/apache/cassandra/cql3/statements/schema/CreateKeyspaceStatement.java:
##########
@@ -82,6 +82,13 @@ public Keyspaces apply(ClusterMetadata metadata)
}
KeyspaceMetadata keyspaceMetadata =
KeyspaceMetadata.create(keyspaceName, attrs.asNewKeyspaceParams());
+ // we deduplicate ReplicationParams here to use the same objects in
KeyspaceMetadata
+ // as we have as keys in metadata.placements to have a fast map lookup
+ // ReplicationParams are immutable, so it is a safe optimization
+ KeyspaceParams keyspaceParams = keyspaceMetadata.params;
+ ReplicationParams replicationParams =
metadata.placements.deduplicateReplicationParams(keyspaceParams.replication);
Review Comment:
One thing is that this logic _is not_ applied when we deserialize a
`ClusterMetadata` snapshot, which can happen during replay at startup or when
catching up from a peer.
In that case no deduplication is done, so there will still be multiple
equivalent `ReplicationParams` instances in schema. Similarly, the map keys in
`DataPlacements` will be distinct from those instances.
I don't believe this affects the intent of this patch, as the deduplication
of read/write`ReplicaGroups` in the `DataPlacement` construction is still done
(i.e. if `reads.equals(writes)` so the new reference equality check in
`ReplicaLayout` is still valid.
Of course, the memoized hashcode in `ReplicationParams` still works as
expected so despite the `KeyspaceParams` and `DataPlacements` having pointers
to different instances, the map lookup is still fine.
So I think the issue is that this deduplication may lead to some confusion
if it isn't applied consistently. Deduping the replication params instances
during deserialization may be more trouble than it's worth, especially as it's
actually the hashcode memoization & this `ReplicaGroups` deduplication that
actually provide the benefit.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]