tolbertam commented on code in PR #4304:
URL: https://github.com/apache/cassandra/pull/4304#discussion_r2296375066
##########
src/java/org/apache/cassandra/config/DatabaseDescriptor.java:
##########
@@ -5940,6 +5940,16 @@ public static void
setIncrementalRepairDiskHeadroomRejectRatio(double value)
conf.incremental_repair_disk_headroom_reject_ratio = value;
}
+ public static boolean getMixedMajorVersionRepairEnabled()
+ {
+ return conf.auto_repair.mixed_major_version_repair_enabled;
+ }
+
+ public static void setMixedMajorVersionRepairEnabled(boolean
mixed_major_version_repair_enabled)
+ {
+ conf.auto_repair.mixed_major_version_repair_enabled =
mixed_major_version_repair_enabled;
+ }
+
Review Comment:
It looks like we already have a
`AutoRepairConfig.isMixedMajorVersionRepairEnabled()` which is currently
unused. Since this is an `AutoRepair`-specific thing can we remove these
methods and use methods on `AutoRepairConfig` instead?
Would also be nice to update `SetAutoRepairConfig` to allow configuring this
value.
##########
test/unit/org/apache/cassandra/repair/autorepair/AutoRepairUtilsTest.java:
##########
@@ -232,6 +236,52 @@ public void testGetHostIdsInCurrentRing_multiple_nodes()
assertTrue(hosts.contains(hostId));
}
+ @Test
+ public void testHasMultipleLiveMajorVersionsWithSingleNode()
+ {
+ boolean result = AutoRepairUtils.hasMultipleLiveMajorVersions();
+ assertFalse(result);
+ }
+
+ @Test
+ public void
testHasMultipleLiveMajorVersionsWithMultipleNodesOfSameVersion()
+ {
+ ClusterMetadataTestHelper.addEndpoint(2);
+ // Test the current behavior with the existing cluster setup
+ // In a single-node test environment, this should return false
+ boolean result = AutoRepairUtils.hasMultipleLiveMajorVersions();
+ assertFalse(result);
+ }
+
+ @Test
+ public void
testHasMultipleLiveMajorVersionsWithMultipleNodesOfSameMajorVersionDifferentMinorVersions()
+ {
+ // add two nodes with cassandra 5 major version, but different minor
version
+ CassandraVersion differentCassandraVersion = new CassandraVersion(
+ String.format("%d.%d",
+ NodeVersion.CURRENT.cassandraVersion.major,
+ NodeVersion.CURRENT.cassandraVersion.minor+1));
+ ClusterMetadataTestHelper.addEndpoint(2, new NodeVersion(
+ differentCassandraVersion,
+ NodeVersion.CURRENT_METADATA_VERSION));
+ // With the same major versions, but different minor versions, we
should still see this function return true
+ boolean result = AutoRepairUtils.hasMultipleLiveMajorVersions();
+ assertFalse(result);
+ }
+
+ @Test
+ public void
testHasMultipleLiveMajorVersionsWithMultipleNodesOfDifferentMajorVersions()
+ {
+ // add two nodes with cassandra 5 major version, but different minor
version
Review Comment:
A bit pedantic I know, but in future the current major version will change.
```suggestion
// add two nodes with current cassandra major version, but different
minor version
```
##########
conf/cassandra.yaml:
##########
@@ -2779,6 +2779,9 @@ storage_compatibility_mode: NONE
# # for a specified duration to ensure they are indeed removed before
adjustments are made to the schedule.
# history_clear_delete_hosts_buffer_interval: 2h
# # NOTE: Each of the below settings can be overridden per repair type under
repair_type_overrides
+# # by default repair is disabled if there are mixed major versions detected
- which would happen
+# # if a major version upgrade is being performed on the cluster, but a user
can enable it using this flag
+# # mixed_major_version_repair_enabled = false;
Review Comment:
Small suggestions:
* Move `#NOTE: ...` to the line before `global_settings` as it makes more
sense to be proximate to there.
* Uncomment `mixed_major_version_repair_enabled = false` and remove
semi-colon.
```suggestion
# # by default repair is disabled if there are mixed major versions
detected - which would happen
# # if a major version upgrade is being performed on the cluster, but a
user can enable it using this flag
# mixed_major_version_repair_enabled = false
# # NOTE: Each of the below settings can be overridden per repair type
under repair_type_overrides
```
A couple of other things that would to be good to do:
1. Can you also add this to `cassandra_latest.yaml` as well?
2. Add to 'Top level settings' in auto_repair.adoc? In that file it would
also be good to expand on why repairs should not be run during major version
upgrades (previous slack thread about it:
https://the-asf.slack.com/archives/C06MR8K2Q0Z/p1729716524563759)
##########
test/distributed/org/apache/cassandra/distributed/test/log/ClusterMetadataTestHelper.java:
##########
@@ -798,19 +798,27 @@ public MoveProcess finishMove()
public static void addEndpoint(int i)
{
+ addEndpoint(i, NodeVersion.CURRENT);
+ }
+
+ public static void addEndpoint(int i, NodeVersion nodeVersion) {
try
Review Comment:
`{` on next line
```suggestion
public static void addEndpoint(int i, NodeVersion nodeVersion)
{
try
```
##########
test/distributed/org/apache/cassandra/distributed/test/log/ClusterMetadataTestHelper.java:
##########
@@ -798,19 +798,27 @@ public MoveProcess finishMove()
public static void addEndpoint(int i)
{
+ addEndpoint(i, NodeVersion.CURRENT);
+ }
+
+ public static void addEndpoint(int i, NodeVersion nodeVersion) {
try
{
- addEndpoint(InetAddressAndPort.getByName("127.0.0." + i), new
Murmur3Partitioner.LongToken(i));
+ addEndpoint(InetAddressAndPort.getByName("127.0.0." + i), new
Murmur3Partitioner.LongToken(i), nodeVersion);
}
catch (UnknownHostException e)
{
throw new RuntimeException(e);
}
}
- public static void addEndpoint(InetAddressAndPort endpoint, Token t)
+ public static void addEndpoint(InetAddressAndPort endpoint, Token t) {
+ addEndpoint(endpoint, t, NodeVersion.CURRENT);
Review Comment:
{ on next line:
```suggestion
public static void addEndpoint(InetAddressAndPort endpoint, Token t)
{
addEndpoint(endpoint, t, NodeVersion.CURRENT);
```
##########
src/java/org/apache/cassandra/repair/autorepair/AutoRepairUtils.java:
##########
@@ -425,6 +425,21 @@ public static CurrentRepairStatus
getCurrentRepairStatus(RepairType repairType,
return null;
}
+ /**
+ * Checks whether the cluster has multiple major versions
+ * @return
+ * true if more than one major versions are detected
+ * false if only one major version is detected
+ *
+ */
+ public static boolean hasMultipleLiveMajorVersions()
+ {
+ ClusterMetadata metadata = ClusterMetadata.current();
+ int maxMajorVersion =
ClusterMetadata.current().directory.clusterMaxVersion.cassandraVersion.major;
+ int minMajorVersion =
ClusterMetadata.current().directory.clusterMinVersion.cassandraVersion.major;
+ return maxMajorVersion != minMajorVersion;
+ }
+
Review Comment:
👍 nice and simple.
I had some doubts on whether this is sufficient given Cassandra's version
history. e.g. between 2.0 and 2.1 maybe this wouldn't be safe? But I'd hope
from C* 6.0 on this is probably sufficient.
##########
src/java/org/apache/cassandra/repair/autorepair/AutoRepair.java:
##########
@@ -165,6 +166,11 @@ public void repair(AutoRepairConfig.RepairType repairType)
logger.debug("Auto-repair is disabled for repair type {}",
repairType);
return;
}
+ if (!DatabaseDescriptor.getMixedMajorVersionRepairEnabled() &&
+ Gossiper.instance.hasMultipleLiveMajorVersions()) {
+ logger.info("Auto-repair is disabled when nodes in the cluster
have different major versions");
Review Comment:
I think the metrics are calculated off of / engaged by `AutoRepairState`
(accessed 2 lines below). You could add something to that. Curious on the
metric we are interested in adding? I'm guess maybe a `Counter` that
increments whenever repair is skipped because mixed versions are detected?
##########
test/unit/org/apache/cassandra/repair/autorepair/AutoRepairUtilsTest.java:
##########
@@ -232,6 +236,52 @@ public void testGetHostIdsInCurrentRing_multiple_nodes()
assertTrue(hosts.contains(hostId));
}
+ @Test
+ public void testHasMultipleLiveMajorVersionsWithSingleNode()
+ {
+ boolean result = AutoRepairUtils.hasMultipleLiveMajorVersions();
+ assertFalse(result);
+ }
+
+ @Test
+ public void
testHasMultipleLiveMajorVersionsWithMultipleNodesOfSameVersion()
+ {
+ ClusterMetadataTestHelper.addEndpoint(2);
+ // Test the current behavior with the existing cluster setup
+ // In a single-node test environment, this should return false
+ boolean result = AutoRepairUtils.hasMultipleLiveMajorVersions();
+ assertFalse(result);
+ }
+
+ @Test
+ public void
testHasMultipleLiveMajorVersionsWithMultipleNodesOfSameMajorVersionDifferentMinorVersions()
+ {
+ // add two nodes with cassandra 5 major version, but different minor
version
Review Comment:
```suggestion
// add two nodes with current cassandra major version, but different
minor version
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]