... and I always thought replication would be used within the
same server.


Recently we added a test on replication versioning (compatibility test).


I will see what I can do with the versioning.
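
For reference, the version-gating idea discussed in the quoted messages below could look roughly like this. It is only a sketch under stated assumptions, not the actual Artemis code: the class, the method names, the field layout and the version number are all hypothetical, and the real packet classes use their own buffer types. The point is just that the sender appends the extra field only for peers that are new enough, and the receiver falls back to a default when the peer is old.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch; none of these names are taken from the Artemis code base.
final class SyncPacketSketch {

    // Assumed wire version at which the extra field appeared (illustrative value only,
    // not the real value of any constant in PacketImpl).
    static final int VERSION_WITH_EXTRA_FIELD = 2;

    // Always write the base field; append the extra field only if the peer understands it.
    static byte[] encode(int peerVersion, long baseField, long extraField) {
        boolean sendExtra = peerVersion >= VERSION_WITH_EXTRA_FIELD;
        ByteBuffer buf = ByteBuffer.allocate(sendExtra ? 2 * Long.BYTES : Long.BYTES);
        buf.putLong(baseField);
        if (sendExtra) {
            buf.putLong(extraField);
        }
        return buf.array();
    }

    // Returns {baseField, extraField}; uses defaultExtra when the peer is too old to send it.
    static long[] decode(int peerVersion, byte[] payload, long defaultExtra) {
        ByteBuffer buf = ByteBuffer.wrap(payload);
        long base = buf.getLong();
        long extra = defaultExtra;
        if (peerVersion >= VERSION_WITH_EXTRA_FIELD && buf.remaining() >= Long.BYTES) {
            extra = buf.getLong();
        }
        return new long[] {base, extra};
    }
}
```

With something along these lines, only the newer side needs to change: it never sends the extra bytes to an old peer and never expects them from one.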

On Fri, Jul 15, 2022 at 11:43 AM Robbie Gemmell
<robbie.gemm...@gmail.com> wrote:
>
> Perhaps. I didn't go looking at the year-old commits to see the
> relative sequence of when it changed. The problem being raised wasn't
> that the particular PR didn't change the version though (albeit the
> version either already had, or subsequently did change, which I was
> simply noting in case it wasn't already clear to Jan). Instead it's that
> it changed the packet contents without adding a new packet version,
> and it's being said that the old server can't handle the new data now
> being sent in the old packet, and also that the new server can't handle
> the absence of the new data that the old server obviously doesn't know
> about to send.
>
> Whether either or both of those is true I don't know. I do recall other
> similar cases before of suggesting not sending new fields to old servers,
> and being told it shouldn't matter as they'd simply not use them, though
> personally I argued they still should never be sent, as then they
> definitely can't cause any change in behaviour.
>
> On Fri, 15 Jul 2022 at 16:00, Clebert Suconic <clebert.suco...@gmail.com> 
> wrote:
> >
> > as far as I know that PR did not make a switch in the protocol version
> > because there was already another change in there for the same
> > version... right?
> >
> > On Fri, Jul 15, 2022 at 6:07 AM Robbie Gemmell <robbie.gemm...@gmail.com> 
> > wrote:
> > >
> > > This isn't an area I know about, but what I vaguely recalled/can see is
> > > that there was coincidentally a wire version bump in 2.18.0 as part of
> > > other changes; see the ARTEMIS_2_18_0_VERSION constant in PacketImpl.
> > > From that I would guess it should be possible for newer servers to
> > > specifically tell whether they are connected to <=2.17.0 or >=2.18.0.
> > > Perhaps the new one could then handle the situation in some
> > > way if the issue can be fixed from the new side only, by changing what
> > > it sends and expects in the existing packet?
> > >
> > > If it can be handled that way, I doubt there would be appetite for
> > > releasing fixes across all the superseded intermediate versions rather
> > > than just the latest. It doesn't appear to have been widely hit so far in
> > > nearly a year, people using only versions >=2.18.0 won't be
> > > affected, and anyone not yet affected who could become so should use a
> > > more recent fixed release (or else can patch the old superseded
> > > intermediate release with the fix themselves).
> > >
> > > On Fri, 15 Jul 2022 at 09:57, Jan Šmucr <jan.sm...@aimtecglobal.com> 
> > > wrote:
> > > >
> > > > Dear devs,
> > > >
> > > > I'd like to ask you for help with the communication incompatibility
> > > > between pre-2.18.0 servers and the newer ones. What I've learned so far
> > > > is that in 2.18.0 there's been a change in the
> > > > REPLICATION_START_FINISH_SYNC packet, yet no new version of that packet
> > > > has been introduced. Some additional data has been appended to
> > > > that packet, so newer servers expect older servers to send more
> > > > data than they actually do, and older servers can't cope with the
> > > > additional data they receive. The fact that until now nobody noticed
> > > > that replication between pre-2.18.0 and post-2.18.0 does not work
> > > > confuses me a little.
> > > >
> > > > Before learning the actual reason for the incompatibility, I
> > > > developed a test which would eventually pass once the issue has been
> > > > fixed. But now I see that fixing it would mean releasing a set of at
> > > > least five minor bugfix releases. Shall I even attempt it? If not, will
> > > > you accept at least the test suite so that nothing like that happens in
> > > > the future? Also, mentioning the incompatibility somewhere might help
> > > > others as unfortunate as me.
> > > >
> > > > The WIP PR is here: https://github.com/apache/activemq-artemis/pull/4144
> > > > ARTEMIS-3767 Fix replication incompatibility between pre 2.18.0 and
> > > > SNAPSHOT (WIP) by jsmucr · Pull Request #4144 · apache/activemq-artemis
> > > > This PR attempts to solve the issue described in
> > > > https://issues.apache.org/jira/browse/ARTEMIS-3767. TL;DR replication
> > > > between <=2.17.0 and newer Artemis versions is broken since 2.18.0.
> > > >
> > > > Thanks for your suggestions.
> > > >
> > > > Jan
> >
> >
> >
> > --
> > Clebert Suconic



-- 
Clebert Suconic
