FYI: I created these scripts for my local tests:
https://github.com/symat/zk-rolling-upgrade-test

In the long term I would also add a script that actually monitors the
state of the quorum and runs continuous traffic, rather than just one or two
smoke tests after each restart. I am not sure how important this would be, though.
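
A minimal sketch of what such a quorum monitor could look like, in Python.
It is not part of the scripts linked above. It assumes that the `srvr`
four-letter-word command is enabled on every server and that its reply
contains a `Mode:` line while the server is serving. The function names are
made up for illustration.

```python
import socket
from typing import Dict, List, Optional, Tuple


def parse_mode(srvr_output: str) -> Optional[str]:
    """Return the value of the 'Mode:' line of a 'srvr' reply, or None."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def four_letter_word(host: str, port: int, cmd: str = "srvr",
                     timeout: float = 2.0) -> str:
    """Send a four-letter-word command; return the raw reply ('' on failure)."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(cmd.encode("ascii"))
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:  # the server closes the connection after replying
                    break
                chunks.append(data)
            return b"".join(chunks).decode("utf-8", errors="replace")
    except OSError:
        return ""


def quorum_is_up(modes: Dict[str, Optional[str]]) -> bool:
    """True if there is exactly one leader and a strict majority is serving."""
    serving = [m for m in modes.values() if m in ("leader", "follower")]
    leaders = [m for m in serving if m == "leader"]
    return len(leaders) == 1 and len(serving) > len(modes) // 2


def poll_once(servers: List[Tuple[str, int]]) -> Dict[str, Optional[str]]:
    """Ask every server for its mode; unreachable servers map to None."""
    return {f"{h}:{p}": parse_mode(four_letter_word(h, p)) for h, p in servers}
```

Calling `quorum_is_up(poll_once(servers))` in a loop during the rolling
upgrade, with `servers` set to the client addresses of the test ensemble,
would log every interval in which no partition holds a majority. Continuous
client traffic would still need a real ZooKeeper client on top of this.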

On Tue, Feb 11, 2020 at 5:25 PM Enrico Olivelli <eolive...@gmail.com> wrote:

> On Tue, Feb 11, 2020 at 17:17, Andor Molnar
> <an...@apache.org> wrote:
> >
> > The most obvious one that comes to mind is the approach I previously
> > worked on:
> >
> > 1) run old version cluster,
> > 2) connect to each node and run smoke tests,
> > 3) restart one node with new code,
> > 4) goto 2) until all nodes are upgraded
> >
> > I think this wouldn’t work as a “unit test”; we probably need a separate
> > Jenkins job and a nice Python script to do this.
> >
> > Andor
> >
> >
> >
> >
> > > On 2020. Feb 11., at 16:38, Patrick Hunt <ph...@apache.org> wrote:
> > >
> > > Does anyone have ideas on how we could add testing for upgrades? It is
> > > obviously something we're missing, especially given how important it is.
>
> I will send an email in the next few days with a proposal.
> By the way, my idea is very similar to Andor's.
>
> Once we have an automated environment, we can launch it from Jenkins.
>
> Enrico
>
>
> > >
> > > Patrick
> > >
> > > On Tue, Feb 11, 2020 at 12:40 AM Enrico Olivelli <eolive...@gmail.com>
> > > wrote:
> > >
> > >> On Tue, Feb 11, 2020 at 09:12, Szalay-Bekő Máté
> > >> <szalay.beko.m...@gmail.com> wrote:
> > >>>
> > >>> Hi All,
> > >>>
> > >>> about the question from Michael:
> > >>>> Regarding the fix, can we just make 3.6.0 aware of the old protocol and
> > >>>> speak old message format when it's talking to old server?
> > >>>
> > >>> In this particular case, it might be enough. The protocol change happened
> > >>> in the 'initial message' sent by the QuorumCnxManager. Maybe it is not
> > >>> a problem if the new servers cannot initiate channels to the old servers;
> > >>> maybe it is enough if these channels get initiated by the old servers only.
> > >>> I will test it quickly.
> > >>>
> > >>> However, I have no idea whether anything else changed in the quorum protocol
> > >>> between 3.5 and 3.6. In other cases it might not be enough for the new
> > >>> servers to understand the old messages, as the old servers can break by
> > >>> not understanding the messages from the new servers. Also, the code
> > >>> currently (AFAIK) has no generic notion of protocol versions: the
> > >>> servers do not record which protocol versions they can or should use to
> > >>> communicate with each particular peer. Maybe we don't even need
> > >>> this, but I would feel better if we had more tests around these
> > >>> things.
> > >>>
> > >>> My suggestion for the long term:
> > >>> - let's fix this particular issue now in 3.6.0 quickly (I will start
> > >>> doing this today)
> > >>> - let's do some automation (backed by Jenkins) that tests a whole
> > >>> range of ZooKeeper upgrade paths by making rolling
> > >>> upgrades under some light traffic. Let's have a better definition
> > >>> of what we expect (e.g. the quorum is up, but can some clients get
> > >>> disconnected? What will happen to the ephemeral nodes? Do we want to
> > >>> gracefully close or transfer the user sessions before stopping the old
> > >>> server?) and let's see where this breaks. Just from checking the code, I
> > >>> don't think the quorum will always be up (e.g. between older 3.4 versions
> > >>> and 3.5).
> > >>
> > >>
> > >> I am happy to work on this topic
> > >>
> > >>> - we need to update the Wiki about the working rolling upgrade paths, and
> > >>> maybe about workarounds if needed
> > >>> - we might need to do some fixes (adding backward-compatible versions
> > >>> and/or specific parameters that temporarily enforce the old protocol during
> > >>> the rolling upgrade and can be switched later to the new protocol by either
> > >>> dynamic reconfig or rolling restart)
> > >>
> > >> It would be much better for the 3.6 code to have some support for
> > >> compatibility with 3.5 servers.
> > >> We can't require old code to be forward compatible, but we can make new
> > >> code compatible with old code to a certain extent.
> > >> If we can achieve this compatibility goal without a flag, that is better:
> > >> users won't have to care about this part and can simply trust us.
> > >>
> > >> The rollback story is also important, but maybe we are not ready
> > >> for it yet. In case of local changes to the store,
> > >> it is better to have a clear design and plan and to work towards a new
> > >> release (3.7?).
> > >>
> > >> Enrico
> > >>
> > >>>
> > >>> Depending on your comments, I am happy to create a few Jira tickets around
> > >>> these topics.
> > >>>
> > >>> Kind regards,
> > >>> Mate
> > >>>
> > >>> ps. Enrico, sorry about your RC... I owe you a beer, let me know if you are
> > >>> near Budapest ;)
> > >>>
> > >>> On Tue, Feb 11, 2020 at 8:43 AM Enrico Olivelli <eolive...@gmail.com>
> > >>> wrote:
> > >>>
> > >>>> Good.
> > >>>>
> > >>>> I will cancel the vote for 3.6.0rc2.
> > >>>>
> > >>>> I would very much appreciate it if Mate and his colleagues have time to
> > >>>> work on a fix.
> > >>>> Otherwise I will have cycles next week.
> > >>>>
> > >>>> I would also like to spend some time setting up a few minimal integration
> > >>>> tests for the upgrade story.
> > >>>>
> > >>>> Enrico
> > >>>>
> > >>>> On Tue, Feb 11, 2020, 07:30 Michael Han <h...@apache.org> wrote:
> > >>>>
> > >>>>> Kudos Enrico, very thorough work as the final gatekeeper of the release!
> > >>>>>
> > >>>>> Now with this, I'd like to *vote a -1* on the 3.6.0 RC2.
> > >>>>>
> > >>>>> I'd recommend we fix this issue for 3.6.0. ZooKeeper is one of the rare
> > >>>>> pieces of software that puts so much emphasis on compatibility that it
> > >>>>> just works when you upgrade or downgrade, which is amazing. One guarantee we
> > >>>>> have always had is that during a rolling upgrade the quorum is always
> > >>>>> available, leading to no service interruption. It would be sad to lose such
> > >>>>> a capability, given that this is still a tractable problem.
> > >>>>>
> > >>>>> Regarding the fix, can we just make 3.6.0 aware of the old protocol and
> > >>>>> speak old message format when it's talking to old server? Basically,
> > >>>>> an ugly if-else check against the protocol version should work, and there
> > >>>>> is no need for multiple passes of the rolling upgrade process.
> > >>>>>
> > >>>>>
> > >>>>> On Mon, Feb 10, 2020 at 10:23 PM Enrico Olivelli <eolive...@gmail.com>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> I suggest this plan:
> > >>>>>> - release 3.6.0 now
> > >>>>>> - improve the migration story, the flow outlined by Mate is
> > >>>>>> interesting, but it will take time
> > >>>>>>
> > >>>>>> 3.6.0rc2 got enough binding votes so I am going to finalize the
> > >>>>>> release this evening (within 8-10 hours) if no one comes out in the
> > >>>>>> VOTE thread with a -1
> > >>>>>>
> > >>>>>> Enrico
> > >>>>>>
> > >>>>>> On Mon, Feb 10, 2020 at 19:33, Patrick Hunt
> > >>>>>> <ph...@apache.org> wrote:
> > >>>>>>>
> > >>>>>>> On Mon, Feb 10, 2020 at 3:38 AM Andor Molnar <an...@apache.org> wrote:
> > >>>>>>>
> > >>>>>>>> Hi,
> > >>>>>>>>
> > >>>>>>>> Answers inline.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>> In my experience when you are close to a release it is better not to
> > >>>>>>>>> make big changes. (I am among the approvers of that patch, so I am
> > >>>>>>>>> responsible for this change)
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Although this statement is acceptable to me, I don’t feel that this patch
> > >>>>>>>> should have been kept out of 3.6.0. Its submission was preceded by a
> > >>>>>>>> long argument with the MAPR folks, who originally wanted it merged into the
> > >>>>>>>> 3.4 branch (considering the pace at which the ZooKeeper community is moving
> > >>>>>>>> forward), and we reached an agreement to release it with 3.6.0.
> > >>>>>>>>
> > >>>>>>>> To make a long story short, this patch had been outstanding for ages without
> > >>>>>>>> much attention from the community, and the contributors made a lot of effort
> > >>>>>>>> to get it done before the release.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>> I would like to hear from people that have been in the community for a
> > >>>>>>>>> long time, then I am ready to complete the release process for
> > >>>>>>>>> 3.6.0rc2.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Me too.
> > >>>>>>>>
> > >>>>>>>> I tend to accept the way rolling restart works now - as you described,
> > >>>>>>>> Enrico - and given that the situation was pretty much the same between 3.4
> > >>>>>>>> and 3.5, I don’t feel we have to make additional changes.
> > >>>>>>>>
> > >>>>>>>> On the other hand, the fix that Mate suggested sounds quite cool; I’m also
> > >>>>>>>> happy to work on getting it in.
> > >>>>>>>>
> > >>>>>>>> Fyi, Release Management page says the following:
> > >>>>>>>>
> > >>>>>>>> https://cwiki.apache.org/confluence/display/ZOOKEEPER/ReleaseManagement
> > >>>>>>>>
> > >>>>>>>> "major.minor release of ZooKeeper must be backwards compatible with the
> > >>>>>>>> previous minor release, major.(minor-1)"
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>> Our users, direct and indirect, value the ability to migrate to newer
> > >>>>>>> versions - esp as we drop support for older ones. Frictions such as this can
> > >>>>>>> be a reason to go elsewhere. I'm "pro" b/w compat - esp given our published
> > >>>>>>> guidelines.
> > >>>>>>>
> > >>>>>>> Patrick
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>> Andor
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>> On 2020. Feb 10., at 11:32, Enrico Olivelli <eolive...@gmail.com>
> > >>>>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>> Thank you Mate for checking and explaining this story.
> > >>>>>>>>>
> > >>>>>>>>> I find it very interesting that the cause is ZOOKEEPER-3188, as:
> > >>>>>>>>> - it is the last "big patch" committed to 3.6 before starting the
> > >>>>>>>>> release process
> > >>>>>>>>> - it is the cause of the failure of the first RC
> > >>>>>>>>>
> > >>>>>>>>> In my experience when you are close to a release it is better not to
> > >>>>>>>>> make big changes. (I am among the approvers of that patch, so I am
> > >>>>>>>>> responsible for this change)
> > >>>>>>>>>
> > >>>>>>>>> Here is a pointer to the change, for those who want to understand the
> > >>>>>>>>> context better:
> > >>>>>>>>>
> > >>>>>>>>> https://github.com/apache/zookeeper/pull/1048/files#diff-7a209d890686bcba351d758b64b22a7dR11
> > >>>>>>>>>
> > >>>>>>>>> IIUC even for the upgrade from 3.4 to 3.5 the story was the same, and
> > >>>>>>>>> if this statement holds then I feel we can continue
> > >>>>>>>>> with this release.
> > >>>>>>>>>
> > >>>>>>>>> - Reverting ZOOKEEPER-3188 is not an option for me, it is too complex.
> > >>>>>>>>> - Making 3.5 and 3.6 "compatible" can be very tricky and we do not
> > >>>>>>>>> have tools to certify this compatibility (at least not in the short
> > >>>>>>>>> term)
> > >>>>>>>>>
> > >>>>>>>>> I would like to hear from people that have been in the community for a
> > >>>>>>>>> long time, then I am ready to complete the release process for
> > >>>>>>>>> 3.6.0rc2.
> > >>>>>>>>>
> > >>>>>>>>> I will update the website and the release notes with a specific
> > >>>>>>>>> warning about the upgrade; we should also update the Wiki.
> > >>>>>>>>>
> > >>>>>>>>> Enrico
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> On Mon, Feb 10, 2020 at 11:17, Szalay-Bekő Máté
> > >>>>>>>>> <szalay.beko.m...@gmail.com> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>> Hi Enrico!
> > >>>>>>>>>>
> > >>>>>>>>>> This is caused by the different PROTOCOL_VERSION in the QuorumCnxManager.
> > >>>>>>>>>> The protocol version was previously changed in ZOOKEEPER-2186, first
> > >>>>>>>>>> released in 3.4.7 and 3.5.1, to avoid some crashes and fix some bugs. Later I
> > >>>>>>>>>> also changed the protocol version when the format of the initial message
> > >>>>>>>>>> changed in ZOOKEEPER-3188. So the quorum protocol is actually not
> > >>>>>>>>>> compatible in this case, and this is the 'expected' behavior if you upgrade
> > >>>>>>>>>> e.g. from 3.4.6 to 3.4.7, from 3.4.6 to 3.5.5, or from 3.5.6 to 3.6.0.
> > >>>>>>>>>>
> > >>>>>>>>>> We had some discussion in the PR of ZOOKEEPER-3188 back then and came to the
> > >>>>>>>>>> conclusion that it is not that bad, as there will be no data loss, as you
> > >>>>>>>>>> wrote. The tricky thing is that during a rolling upgrade we should ensure
> > >>>>>>>>>> both backward and forward compatibility, to make sure that the old and the
> > >>>>>>>>>> new part of the quorum can still speak to each other. The current solution
> > >>>>>>>>>> (simply failing if the protocol versions mismatch) is simpler and still
> > >>>>>>>>>> works just fine: as the servers are restarted one by one, the nodes with
> > >>>>>>>>>> the old protocol version and the nodes with the new protocol version
> > >>>>>>>>>> will form two partitions, but at any given time only one partition will
> > >>>>>>>>>> have the quorum.
> > >>>>>>>>>>
> > >>>>>>>>>> Still, thinking it through, as a side effect in these cases there will be a
> > >>>>>>>>>> short time when neither partition has a quorum (when an ensemble of
> > >>>>>>>>>> 2N+1 servers has N servers with the old protocol version, N servers with
> > >>>>>>>>>> the new protocol version, and one server just being restarted). I am not
> > >>>>>>>>>> sure if we can accept this.
> > >>>>>>>>>>
> > >>>>>>>>>> For ZOOKEEPER-3188 we can add a small patch to make it possible to parse
> > >>>>>>>>>> the initial message of the old protocol version with the new code. But I am
> > >>>>>>>>>> not sure if it would be enough (as the old code will not be able to parse
> > >>>>>>>>>> the new initial message).
> > >>>>>>>>>>
> > >>>>>>>>>> One option is to also make a patch for 3.5, to have a version which
> > >>>>>>>>>> supports both protocol versions (let's say 3.5.8). Then we can write in
> > >>>>>>>>>> the release notes that if you need a rolling upgrade from any version since
> > >>>>>>>>>> 3.4.7, you first have to upgrade to 3.5.8 before upgrading to 3.6.0.
> > >>>>>>>>>> We can even do the same thing on the 3.4 branch.
> > >>>>>>>>>>
> > >>>>>>>>>> But I am also new to the community... It would be great to hear the opinion
> > >>>>>>>>>> of more experienced people.
> > >>>>>>>>>> Whatever the decision will be, I am happy to make the changes.
> > >>>>>>>>>>
> > >>>>>>>>>> And sorry for breaking the RC (if we decide that this needs to be
> > >>>>>>>>>> changed...). ZOOKEEPER-3188 was a complex patch.
> > >>>>>>>>>>
> > >>>>>>>>>> Kind regards,
> > >>>>>>>>>> Mate
> > >>>>>>>>>>
> > >>>>>>>>>> On Mon, Feb 10, 2020 at 9:47 AM Enrico Olivelli <eolive...@gmail.com>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi,
> > >>>>>>>>>>> even though we had enough binding +1 votes on 3.6.0rc2, before closing the
> > >>>>>>>>>>> VOTE on 3.6.0 I wanted to finish my tests, and I have run into an apparent
> > >>>>>>>>>>> blocker.
> > >>>>>>>>>>>
> > >>>>>>>>>>> I am trying to upgrade a 3.5.6 cluster to 3.6.0, but it looks like
> > >>>>>>>>>>> peers are not able to talk to each other.
> > >>>>>>>>>>> I have a cluster of 3: server1, server2 and server3.
> > >>>>>>>>>>> When I upgrade server1 to 3.6.0rc2 I see this kind of error on the 3.5
> > >>>>>>>>>>> nodes:
> > >>>>>>>>>>>
> > >>>>>>>>>>> 2020-02-10 09:35:07,745 [myid:3] - INFO [localhost/127.0.0.1:3334:QuorumCnxManager$Listener@918] - Received connection request 127.0.0.1:62591
> > >>>>>>>>>>> 2020-02-10 09:35:07,746 [myid:3] - ERROR [localhost/127.0.0.1:3334:QuorumCnxManager@527] - org.apache.zookeeper.server.quorum.QuorumCnxManager$InitialMessage$InitialMessageException: Got unrecognized protocol version -65535
> > >>>>>>>>>>>
> > >>>>>>>>>>> Once I upgrade all of the peers the system is up and running,
> > >>>>>>>>>>> apparently without any data loss.
> > >>>>>>>>>>>
> > >>>>>>>>>>> During the upgrade, as soon as I upgrade the first node, say server1,
> > >>>>>>>>>>> server1 is not able to accept connections from clients (error "Close of
> > >>>>>>>>>>> session 0x0 java.io.IOException: ZooKeeperServer not running"). This
> > >>>>>>>>>>> is expected, because as long as it cannot talk to the other peers it
> > >>>>>>>>>>> is practically partitioned away from the cluster.
> > >>>>>>>>>>>
> > >>>>>>>>>>> My questions are:
> > >>>>>>>>>>> 1) Is this expected? I can't remember protocol changes from 3.5 to
> > >>>>>>>>>>> 3.6, but 3.6 diverged from the 3.5 branch so long ago, and I was
> > >>>>>>>>>>> not in the community as a dev at the time, so I cannot tell.
> > >>>>>>>>>>> 2) Is this a viable option for users, to have some temporary glitch
> > >>>>>>>>>>> during the upgrade and hope that the upgrade completes without
> > >>>>>>>>>>> trouble?
> > >>>>>>>>>>>
> > >>>>>>>>>>> In theory, as long as two servers are running the same major version
> > >>>>>>>>>>> (3.5 or 3.6) we have a quorum and the system is able to make progress
> > >>>>>>>>>>> and to serve clients.
> > >>>>>>>>>>> I feel that this is quite dangerous, but I don't have enough context
> > >>>>>>>>>>> to understand how this problem is possible and when we decided to
> > >>>>>>>>>>> break compatibility.
> > >>>>>>>>>>>
> > >>>>>>>>>>> The other option is that I am wrong in my test and I am messing up :-)
> > >>>>>>>>>>>
> > >>>>>>>>>>> The other upgrade path I would like to see working like a charm is the
> > >>>>>>>>>>> upgrade from 3.4 to 3.6, as I see that as soon as we release 3.6 we
> > >>>>>>>>>>> should encourage users to move to 3.6 and not to 3.5.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Regards
> > >>>>>>>>>>> Enrico
> > >>>>>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>
> >
>
