[ https://issues.apache.org/jira/browse/CASSANDRA-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205069#comment-17205069 ]
David Capwell commented on CASSANDRA-16148: ------------------------------------------- The change to Gossiper isn't thread safe. Previous we used memoize which guarded the state with a sync on the memoizer, now that we have removed this there are no locks/fences in place to make sure the field access is safe (tests would directly access, but read/write was the same thread so not a issue there); it should be enough to add volatile to org.apache.cassandra.gms.Gossiper#haveMajorVersion3Nodes. Also, I am concerned that there may be perf implications with larger clusters during mixed-mode case, specifically this call site org.apache.cassandra.db.filter.ColumnFilter.Builder#build; would need to look closer but this looks like the query hot path so would want to confirm. A simple way to avoid this would be to still memoize with the 1m expires, but to avoid caching when version isn't known (the assumption the race window is "small" so performance hit during this window). {code} Gossiper.instance.injectApplicationState(addressAndPort, ApplicationState.RELEASE_VERSION, new VersionedValue.VersionedValueFactory(partitioner).releaseVersion()); {code} This logic isn't correct as you need the release version of the node you are adding, this method takes the release version of the current node (example: node1 is 3.0, node2 is 4.0. node2 would update gossip to say node1 is 4.0). https://github.com/apache/cassandra/compare/trunk...jrwest:jwest/16148#diff-9d7e7c7ee33324ba61d6240be9e88f7eR51 Can you remove the comment which switches to Feature.GOSSIP? > Test failures caused by merging CASSANDRA-15833 > ----------------------------------------------- > > Key: CASSANDRA-16148 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16148 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Gossip > Reporter: Jordan West > Assignee: Jordan West > Priority: Normal > > Three issues were caused by merging CASSANDRA-15833: > 1. `GossiperTest#testHaveAnyVersion3Nodes` was failing on trunk: > https://app.circleci.com/pipelines/github/jrwest/cassandra/53/workflows/95f9f401-1ef8-4b8d-9c64-3703d9669d95/jobs/771 > 2. python dtest ReadRepairTest#test_atomic_writes[blocking] was failing > 3. In-jvm dtests being worked on as part of CASSANDRA-15977 uncovered an > issue with how CASSANDRA-15833 changes interacted with in-jvm dtests running > without {{Feature.GOSSIP}} -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org