beobal commented on code in PR #4282:
URL: https://github.com/apache/cassandra/pull/4282#discussion_r2262216604


##########
src/java/org/apache/cassandra/cql3/statements/schema/CreateKeyspaceStatement.java:
##########
@@ -82,6 +82,13 @@ public Keyspaces apply(ClusterMetadata metadata)
         }
 
         KeyspaceMetadata keyspaceMetadata = 
KeyspaceMetadata.create(keyspaceName, attrs.asNewKeyspaceParams());
+        // we deduplicate ReplicationParams here to use the same objects in 
KeyspaceMetadata
+        // as we have as keys in metadata.placements to have a fast map lookup
+        // ReplicationParams are immutable, so it is a safe optimization
+        KeyspaceParams keyspaceParams = keyspaceMetadata.params;
+        ReplicationParams replicationParams = 
metadata.placements.deduplicateReplicationParams(keyspaceParams.replication);

Review Comment:
   One thing is that this logic _is not_ applied when we deserialize a 
`ClusterMetadata` snapshot, which can happen during replay at startup or when 
catching up from a peer. 
   In that case no deduplication is done, so there will still be multiple 
equivalent `ReplicationParams` instances in schema. Similarly, the map keys in 
`DataPlacements` will be distinct from those instances. 
   I don't believe this affects the intent of this patch, as the deduplication 
of read/write`ReplicaGroups` in the `DataPlacement` construction is still done 
(i.e. if `reads.equals(writes)` so the new reference equality check in 
`ReplicaLayout` is still valid.
   Of course, the memoized hashcode in `ReplicationParams` still works as 
expected so despite the `KeyspaceParams` and `DataPlacements` having pointers 
to different instances, the map lookup is still fine.
   
   So I think the issue is that this deduplication may lead to some confusion 
if it isn't applied consistently. Deduping the replication params instances 
during deserialization may be more trouble than it's worth,  especially as it's 
actually the hashcode memoization & this `ReplicaGroups` deduplication that 
actually provide the benefit.     



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to