This isn't an area I know well, but what I vaguely recalled, and can now see, is that there was coincidentally a wire version bump in 2.18.0 as part of other changes; see the ARTEMIS_2_18_0_VERSION constant in PacketImpl. From that I would guess it should be possible for newer servers to tell specifically whether they are connected to <=2.17.0 or >=2.18.0. Perhaps the newer server could then handle the situation in some way, if the issue can be fixed from the new side only, by changing what it sends and expects in the existing packet?
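
To illustrate the kind of thing I mean, here is a rough, self-contained sketch. It is not Artemis code: the class name, the two-field layout and the V2_18_0 value are all made up for illustration (V2_18_0 only stands in for PacketImpl.ARTEMIS_2_18_0_VERSION), and it assumes the newer server knows the peer's wire version at the point where it encodes and decodes the packet, which I haven't verified.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch only; names, layout and values are assumptions, not Artemis internals.
final class SyncPacketSketch {

    // Placeholder standing in for PacketImpl.ARTEMIS_2_18_0_VERSION (real value not used here).
    static final int V2_18_0 = 2;

    // Encode for a peer with the given wire version: the appended field is
    // only written when the peer is new enough to expect it.
    static ByteBuffer encode(int peerVersion, long baseField, long extraField) {
        ByteBuffer buf = ByteBuffer.allocate(2 * Long.BYTES);
        buf.putLong(baseField);
        if (peerVersion >= V2_18_0) {
            buf.putLong(extraField);
        }
        buf.flip();
        return buf;
    }

    // Decode from a peer with the given wire version: the appended field is
    // only read when the peer would have sent it, otherwise a default is used.
    // Returns {baseField, extraField}.
    static long[] decode(int peerVersion, ByteBuffer buf, long extraDefault) {
        long base = buf.getLong();
        long extra = peerVersion >= V2_18_0 ? buf.getLong() : extraDefault;
        return new long[] {base, extra};
    }
}
```

The point is just that only the newer side branches on the version, so older servers would see the same bytes they always expected. Whether a sensible default exists for the data an old server never sends is a separate question I can't answer.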
If it can be handled that way, I doubt there would be appetite for releasing fixes across all the superseded intermediate versions rather than just the latest. It doesn't appear to have been widely hit in nearly a year, people using only versions >=2.18.0 won't be affected, and anyone not yet affected who could become so should use a more recent fixed release (or else patch the old superseded intermediate release with the fix themselves).

On Fri, 15 Jul 2022 at 09:57, Jan Šmucr <jan.sm...@aimtecglobal.com> wrote:
>
> Dear devs,
>
> I'd like to ask you for help with the communication incompatibility between
> pre-2.18.0 servers and the newer ones. What I've learned so far is that in
> 2.18.0 there's been a change in the REPLICATION_START_FINISH_SYNC packet, yet
> no new version of that packet has been introduced. There have been some
> additional data appended to that packet, so that newer servers expect older
> servers to send more data than they actually do, and older servers can't cope
> with the additional data they receive. The fact that until now nobody noticed
> that replication between pre-2.18.0 and post-2.18.0 does not work confuses me
> a little.
>
> Before learning the actual reason of the incompatibility, I have developed a
> test which would eventually pass after the issue has been fixed. But now I
> see that fixing it would mean releasing a set of at least five minor bugfix
> releases. Shall I even attempt? If not, will you accept at least the test
> suite so that nothing like that happens in the future? Also mentioning the
> incompatibility somewhere might help others as unfortunate as me.
>
> The WIP PR is here: https://github.com/apache/activemq-artemis/pull/4144
> ARTEMIS-3767 Fix replication incompatibility between pre 2.18.0 and SNAPSHOT
> (WIP) by jsmucr · Pull Request #4144 · apache/activemq-artemis
> This PR attempts to solve the issue described in
> https://issues.apache.org/jira/browse/ARTEMIS-3767. TL;DR replication between
> =<2.17.0 and newer Artemis versions is broken since 2.18.0.
>
> Thanks for your suggestions.
>
> Jan