I have a somewhat different opinion here: I don't think LTS is the best way to go.

To my understanding, the big issue users face is that an upgrade is simply too hard. Too many things change in subtle ways; you cannot just take the previous application and configuration and expect it to run the same after the upgrade. If that were much easier, users would be able to upgrade more easily and more frequently (maybe not every month, but every six months or so).

In contrast, LTS is more about how long one provides patches and releases. The upgrade problem stays the same: between LTS versions, upgrades should still be smooth. To make LTS-to-LTS upgrades smooth, we need to solve the same issue as making them smooth from individual version to individual version. The exception would be if we expected non-linear upgrade paths with migration tools, which I am not convinced we should do. It seems to be the opposite of where the industry is moving (upgrade fast and frequently by updating images).

The big downside of LTS versions is that almost no one ends up using the other versions, so we get little feedback on features. We will end up having a feature in Flink for three releases and still barely anyone will have used it, so we will lack the confidence to turn it on by default. I also see the risk that the community ends up taking compatibility with non-LTS releases less seriously, because "it is not an LTS version anyway".

We could look at making upgrades smoother by starting to address the issues listed below. I think we need to do that anyway, because that is what I hear users bringing up more and more. If after that we still feel like there is a problem, then let's revisit this issue.

  - API compatibility (signatures and behavior)
  - Make it possible to pin the SQL semantics of a query across releases (FLIP-190 works on this)
  - It must be possible to use the same configs as before in a new Flink version and keep the same behavior that way
  - REST API must be stable (signatures and semantics)
  - Make it possible to mix different client/cluster versions (stable serializations for JobGraph, etc.)

The issue that we officially define two supported versions, while many committers like to backport things to one more version, is something we can certainly look at to bring some consistency in there.

Best,
Stephan

On Fri, Nov 12, 2021 at 9:17 AM Martijn Visser <mart...@ververica.com> wrote:

> Thanks for bringing this up for discussion Piotr. On the one hand I think
> it's a good idea, because of the reasons you've mentioned. On the other
> hand, having an LTS version will remove an incentive for some users to
> upgrade, which will result in fewer Flink users who will test new features
> because they wait for the next LTS version to upgrade. I can see that
> particularly happening for large enterprises. Another downside I can
> imagine is that upgrading from one LTS version to another LTS version will
> become more complicated because more changes have been made between those
> versions.
>
> Related to my second remark, would/could introducing an LTS version
> also trigger a follow-up discussion that we could potentially introduce
> breaking changes in a next LTS version, like a Flink 2.0 [1]?
>
> Best regards,
>
> Martijn
>
> [1] https://issues.apache.org/jira/browse/FLINK-3957
>
> On Fri, 12 Nov 2021 at 08:59, Fabian Paul <fabianp...@ververica.com>
> wrote:
>
> > Thanks for bringing up this topic Piotr.
> > I also think we should try to decouple our release cycles from our
> > support plans. Currently we are very limited by the approach because
> > faster release cycles also result in faster deprecation of versions.
> >
> > Therefore I am also favoring version 2 where we can align the next LTS
> > version with our development speed. Option 1 I think can easily lead to
> > confusion when the number of supported releases constantly changes.
> >
> > Best,
> > Fabian