Hi everyone,

I want to second Stephan's points. I think that supporting LTS versions
fights symptoms rather than addressing the underlying problem, which is
incompatible or behaviour-changing changes between versions. I believe that
if we can address this problem, our Flink users will probably upgrade more
frequently.

Making upgrades as smooth as possible probably won't help all our users, but
we as a community also have to think about the costs of maintaining more
Flink versions or supporting certain versions for a longer period of time.
One thing is that we would need more complex processes to ensure that older
Flink versions/LTS releases receive all the required bug fixes. I think this
is already hard enough when only supporting the last two releases. Moreover,
in my experience, backporting fixes is usually not a problem if there are no
merge conflicts. However, the older the Flink code is, the more likely it is
that there are merge conflicts. In the worst case, issues need to be fixed
in a completely different way than they are solved in master.

Additionally, we have to consider that longer support periods will mean
that we need to maintain our CI infrastructure and tooling for the same
period as well. To give you an example, we probably would still have to
operate Travis if we had an LTS version. Given our existing problems with
CI, this would add a lot more to the heap. Moreover, supporting more
versions would add more load to our CI infrastructure.

All in all, I would expect that longer support periods would slow us down
on other things. One of these things could be making upgrades as smooth as
possible so that more Flink users upgrade more frequently.

@Tison: Concerning the problem that an older version contains a bug fix
that is not yet contained in the latest release of a newer version, I think
this is a valid concern. Right now, our users have to check via JIRA
whether their problems are solved in the latest version of the new release
(e.g. when upgrading from 1.13.y to 1.14.x). Faster and more frequent
releases could mitigate the problem. Faster and lower-overhead releases
could also allow us to release all versions in sync.

Cheers,
Till

On Sat, Nov 13, 2021 at 1:40 AM tison <wander4...@gmail.com> wrote:

> Hi Stephan,
>
> Thanks for your email! I agree with your points about compatibility and
> that the upgrade experience should be smooth.
>
> The problem raised by Piotr is that, even though we officially support the
> two latest versions, many users stay on an early version past its end of
> support. So the downside "no one ends up using the other versions" doesn't
> hold. I see that a number of companies like to test the latest version
> but do not apply it in prod if it's not robust - 1.9 & 1.10 are less used.
>
> If a non-LTS version resolves users' (urgent) issues and the release itself
> is robust, they will upgrade to that version. We have tried to support
> Java 9 and so have other projects.
>
> I'm a fan of keeping the two latest versions supported and am willing to
> work on improving compatibility to help users migrate. But if I have to
> choose between 4 supported versions and the LTS option, I prefer the LTS
> option.
>
> Here are several issues I met when trying to support a series of versions:
>
> 1. cherry-pick overhead grows, obviously.
> 2. the cherry-pick owner is unclear. I think committers play the role of
> "reveal all the details", and it makes (1) worse.
> 3. version policy is harder. Generally a user wants to upgrade from
> 1.14.3 (latest 1.14.x) to 1.15.2 (latest 1.15.x) without regression.
> Given 1 & 2, the cherry-pick won't always get merged in time, and
> it's possible that when 1.14.3 is released, 1.15.2 is still being
> prepared. We would have to do simultaneous releases across 4 versions,
> and because there are 4 versions, such simultaneous releases will be
> needed more often.
>
> If a company wants to maintain an early version, it is welcome to do so
> and can release it on its own - just like those who maintain Java 1.6.
> That's fine, but it is not under committers/PMC maintenance.
>
> Additional input: 1.5, 1.7, 1.8, 1.11, 1.13 are loved.
>
> Best,
> tison.
>
>
> Stephan Ewen <ewenstep...@gmail.com> wrote on Sat, Nov 13, 2021 at 2:35 AM:
>
> > I am of a slightly different opinion here; I don't think LTS is the
> > best way to go.
> >
> > To my understanding, the big issue faced by users is that an upgrade is
> > simply too hard. Too many things change in subtle ways, you cannot just
> > take the previous application and configuration and expect it to run the
> > same after the upgrade. If that was much easier, users would be able to
> > upgrade more easily and more frequently (maybe not every month, but every
> > six months or so).
> >
> > In contrast, LTS is more about how long one provides patches and
> > releases. The upgrade problem is the same: between LTS versions, upgrades
> > should still be smooth. To make LTS to LTS smooth, we need to solve the
> > same issue as making it smooth from individual version to individual
> > version. Unless we expect non-linear upgrade paths with migration tools,
> > which I am not convinced we should do. It seems to be the opposite of
> > where the industry is moving (upgrade fast and frequently by updating
> > images).
> >
> > The big downside of LTS versions is that almost no one ends up using the
> > other versions, so we get little feedback on features. We will end up
> > having a feature in Flink for three releases and still barely anyone will
> > have used it, so we will lack the confidence to turn it on by default.
> > I also see the risk that the community ends up taking compatibility with
> > non-LTS releases less seriously, because "it is not an LTS version
> > anyway".
> >
> >
> > We could look at making upgrades smoother by starting to address the
> > issues listed below.
> > I think we need to do that anyway, because that is what I hear users
> > bringing up more and more. If after that we still feel like there is a
> > problem, then let's revisit this issue.
> >   - API compatibility (signatures and behavior)
> >   - Make it possible to pin SQL semantics of a query across releases
> > (FLIP-190 works on this)
> >   - Must be possible to use same configs as before in a new Flink version
> > and keep the same behavior that way
> >   - REST API must be stable (signature and semantics)
> >   - Make it possible to mix different client/cluster versions (stable
> > serializations for JobGraph, etc.)
> >
> > The issue that we officially define two supported versions, but many
> > committers like to backport things to one more version, is something we
> > can certainly look at, to bring some consistency in there.
> >
> > Best,
> > Stephan
> >
> >
> > On Fri, Nov 12, 2021 at 9:17 AM Martijn Visser <mart...@ververica.com>
> > wrote:
> >
> > > Thanks for bringing this up for discussion Piotr. On the one hand I
> > > think it's a good idea, because of the reasons you've mentioned. On the
> > > other hand, having an LTS version will remove an incentive for some
> > > users to upgrade, which will result in fewer Flink users testing new
> > > features because they wait for the next LTS version to upgrade. I can
> > > see that particularly happening for large enterprises. Another downside
> > > I can imagine is that upgrading from one LTS version to another LTS
> > > version will become more complicated because more changes have been
> > > made between those versions.
> > >
> > > Related to my second remark, would/could introducing an LTS version
> > > also trigger a follow-up discussion about potentially introducing
> > > breaking changes in a next LTS version, like a Flink 2.0 [1]?
> > >
> > > Best regards,
> > >
> > > Martijn
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-3957
> > >
> > > On Fri, 12 Nov 2021 at 08:59, Fabian Paul <fabianp...@ververica.com>
> > > wrote:
> > >
> > > > Thanks for bringing up this topic Piotr.
> > > > I also think we should try to decouple our release cycles from our
> > > > support plans. Currently we are very limited by the approach because
> > > > faster release cycles also result in faster deprecation of versions.
> > > >
> > > >
> > > > Therefore I am also in favor of option 2, where we can align the next
> > > > LTS version with our development speed. Option 1, I think, can easily
> > > > lead to confusion when the number of supported releases constantly
> > > > changes.
> > > >
> > > > Best,
> > > > Fabian
> > > >
> > > >
> > >
> >
>
