beobal commented on code in PR #4614:
URL: https://github.com/apache/cassandra/pull/4614#discussion_r2832706280
##########
test/distributed/org/apache/cassandra/distributed/test/log/RegisterTest.java:
##########
@@ -108,23 +134,24 @@ public void serializationVersionCeilingTest() throws
Throwable
cluster.get(1).runOnInstance(() -> {
try
{
- // Register a ghost node with V0 to fake-force V0
serialization. In a real world cluster we will always be upgrading from a
smaller version.
+ // Unregister to make directory empty
Review Comment:
This test isn't actually passing after these updates, for a slightly
convoluted reason.
It is failing because it attempts to register a new node with an IP address
which was previously used by another node and this is currently incompatible
with some C* configuration (specifically Accord).
The first node in the cluster is added when the test starts and has the
`broadcast_address: 127.0.0.1`. We then unregister this to force the directory
into an empty state, which is new in this patch - previous versions of the test
did not do this, instead they resorted to some other hackery to achieve the
same aim of lowering the min common serialization version. At this point in the
test, we commit a `Register` which uses the same endpoint address (`127.0.0.1`
via `NodeAddresses.current()`) and run into the problem with Accord:
[CASSANDRA-21026 Reusing the address of a removed node is not possible with
Accord enabled](https://issues.apache.org/jira/browse/CASSANDRA-21026). This
leads to uncaught exceptions being thrown which automatically trigger the test
to fail.
Fortunately, there is a choice of simple solutions to this
* simply use a different address for the registration of the `upgraded` node
* disable Accord via node config when creating the cluster. See
`o.a.c.distributed.test.ring.DecommissionTest::testAddressReuseAfterDecommission`,
which faces the same issue.
Personally, I would lean toward the former approach as when CASSANDRA-21026
is resolved the reason for disabling in this test won't be obvious.
This is all due to pretty obscure implementation details, so I wouldn't
expect any of this to be at all obvious but the modified test is consistently
failing, which should prompt further investigation.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]