Since I was called :). Yes. I would be very, very careful here. You might think that we treat "SemVer" as a "cult". After all, it's just a versioning scheme we adopted, right? But for me it is way more than that. It's all about communication with our users, making promises to our users, and design decisions that impact our security policies.

I think SemVer has this nice property that we can promise our users: "if you are using the public interface of Airflow, you can upgrade without too much fear that things will break - if they do break, it will be accidental and will get fixed". BTW, we already have it very nicely defined at https://airflow.apache.org/docs/apache-airflow/stable/public-airflow-interface.html so it's pretty clear what we promise to our users. And it also has certain "security" properties - but I will get to that.

I would love to hear what others think, but I have 3 important aspects that should be considered in this discussion:

1. Promises we make to our users and what they mean for us when answering their issues

Surely we could make other promises. CalVer promises "we release regularly" - but it does not give the user any indication that whatever worked before will work in the foreseeable future and will get maintained. It makes the maintainer's life easier, yes. It however makes the user's life harder, because they cannot rely on something being available, and their upgrades might and will be more difficult.

And yes - for Snowflake it matters a lot, because they actually get paid for supporting old versions and they have no choice but to respond to users who claim that the "old supported version does not work". They cannot (as we can, and often currently do) tell the users "upgrade to the latest version - it should be easy because of the SemVer promise, as long as you only use the public interface, of course". We (community/maintainers) can very easily say that, and since we give no support and no guarantees, and no-one pays us for solving problems, this "upgrade to the latest version" is actually a very good answer - many, many times.

For maintainers who rarely respond to user questions, yes, SemVer makes it harder to add new things.
But for maintainers who actually respond a lot to users' questions, life suddenly gets way harder without SemVer - because they cannot answer "upgrade to the latest version" any more. The user will immediately answer "but I cannot - because I am using this and that feature. Tell me how to solve the problem without upgrading". And they will have every right to say that. I recommend that every maintainer sometimes spend a few weeks looking at and actually responding to users' questions. That is quite a good lesson, and this aspect should also be considered in this discussion.

2. Why do we want to introduce breaking changes?

I believe the only reason for removing features is when they are slowing us down (us - the people who develop Airflow). I don't really like softening "breaking changes" into "behaviour changes", BTW. It attempts to hide the fact that those changes are - in fact - breaking changes. So I propose to keep the name "breaking changes" in further discussion, as it tells much more about the nature of those "behaviour changes".

I see no particular reason for removing features if they do not slow us down. Let me ask it this way - would SemVer bother you if we had a way of removing features from core Airflow (i.e. making them stop slowing down development) without breaking SemVer?

Seems contradictory - but it is not. We've already done that and we continue doing it. Moving Kubernetes and Celery out of core -> we did it. They are no longer part of the SemVer promise. They never were, actually (but a number of people could assume they were). Now they are completely out of the way. Daniel, I remember how much time you spent on back-compatibility of the K8S code - but... not any more. People will be able to keep the current K8S and Celery provider implementations basically forever now, no matter how many Airflow versions we have.

By introducing a very clear Executor API and making sure we decouple it from the core, we actually made the impossible happen:

* Airflow core still keeps the SemVer promises
* Users can stick to the "old" behaviour of the K8S and Celery executors as long as they want
* We can introduce breaking changes in the K8S and Celery providers - without looking back at all

Seems like magic - but just by clearly specifying what is / is not a public API, decoupling things, and having the provider mechanism where you downgrade and upgrade providers independently and where the old versions can be used for as long as you want (even after you upgrade Airflow), we made the seemingly impossible - possible.

And... my assumption is that we can do the same with other features. Surely some are more difficult than others (SubDAG - yes, I am talking about you). But maybe, instead of breaking SemVer, we can figure out how to move SubDAG out to a separately installable provider? Why not introduce some stable API and decouple the SubDAG functionality, similarly to the Celery/K8S executors? It will be a lot harder, and performance will likely suffer - but hey, performance is not something promised by SemVer.

Apparently we already increased Airflow's resource requirements by ~30% in 2.5 (judging by an anecdotal user report). And while users complain, performance / resource usage is not promised by SemVer (nor by us). Increasing resources nowadays is just a matter of cost, and it's generally easy to increase memory for your Airflow installation. Yes, you will pay more, but Airflow's cost is usually rather low and there are other factors that might decrease the cost (such as deferrables), so this is not a big problem (and it does not matter in this discussion).

So my question is - do we have a really good reason to break up with SemVer and remove some features? Or maybe there are ways we can move them out of the way of core maintainers without breaking SemVer?

I believe more and more decoupling is a way better approach to achieving faster development than breaking SemVer.

3. Security patches

This is, I think, one of the things that will only get more important over the next few years. And we need to be ready for what's coming. I am not sure about others, but I not only follow, but also actively participate in the discussions at the Apache Software Foundation. For those who don't - I recommend reading this post on the ASF Blog: https://news.apache.org/foundation/entry/save-open-source-the-impending-tragedy-of-the-cyber-resilience-act .

We are facing - in the next few years - increased pressure from governments (EU and US mainly in our case) to force everyone to follow security practices they deem essential. We are at a pivotal moment where the software development industry is starting to be regulated. It happened in the past for many industries. And it is happening now - we are facing regulations that we've never experienced in the history of software development.

Those laws are about to be finalized (some of them in a few months). The actual scope is yet to be settled, but among the requirements there is a STRICT one: releasing security patches for all major versions of the software for 5 years (!) after release. This will be a strict requirement in the EU, and companies and organisations not following it will be forbidden to do business in the EU (similar in the US).

How it will impact the ASF - we do not know yet; our processes are sound. But there is a path where both our users and our stakeholders will expect security patches to be released for all the software that is out there and has been in use for years.

If we use SemVer - and this is the very nice side of it - by definition we only need to release patches for all the MAJOR versions we have. This is effectively what we do today. We only release security patches for the latest MINOR release of the ONLY major release (Airflow 2).

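To make the rule above concrete, here is a minimal sketch of which releases would need security patches under SemVer: only the newest release of each MAJOR line. This is not Airflow's release tooling. The function names `parse` and `lines_needing_security_patches` and the release list are made up purely for illustration, and only plain `MAJOR.MINOR.PATCH` strings are handled.

```python
from __future__ import annotations

from typing import Iterable


def parse(version: str) -> tuple[int, int, int]:
    """Parse a plain 'MAJOR.MINOR.PATCH' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def lines_needing_security_patches(released: Iterable[str]) -> dict[int, str]:
    """Map each MAJOR version to the newest release in that MAJOR line.

    Under SemVer, users on any older release of the same MAJOR line can
    upgrade to this release, so it is the only one that needs the patch.
    """
    newest: dict[int, tuple[int, int, int]] = {}
    for version in released:
        parsed = parse(version)
        major = parsed[0]
        if major not in newest or parsed > newest[major]:
            newest[major] = parsed
    return {major: ".".join(map(str, v)) for major, v in sorted(newest.items())}


# Hypothetical release history with a single MAJOR line.
released = ["2.5.3", "2.6.0", "2.6.3", "2.7.0"]
print(lines_needing_security_patches(released))  # {2: '2.7.0'}
```

With one MAJOR line there is exactly one release to patch, however many MINOR releases came before it.
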
If we start deliberately releasing breaking changes, then such a breaking release automatically becomes equivalent to a MAJOR release - because our users will not be able to upgrade and apply security fixes without - sometimes - majorly breaking their installation. This is 100% against the spirit and idea of the regulations. The regulations aim to force those who produce software to make it easy and possible for the users to upgrade immediately after security fixes are released. In a way, using SemVer is exactly what lets us tell the users "We only release security patches in the latest release because you can always easily upgrade to this version due to SemVer".

If we are looking to speed up our development and not get in the way of maintainers - CalVer is, in this respect, way worse IMHO. The regulations might actually make us slower if we go that way.

J.

On Sun, Aug 27, 2023 at 8:46 AM Daniel Standish <daniel.stand...@astronomer.io.invalid> wrote:

> And to clarify, when I talk about putting pressure on major releases, what
> I mean is that there's this notion that a major release has to have some
> very special or big feature change. One reason offered is marketing.
> Major release is an opportunity to market airflow, so take advantage of
> that. Another offered is, "well folks won't upgrade if there's not some
> special carrots in it", especially given that major releases are where we
> introduce a bunch of breaking changes all at once.
>
> Well, if we had a different policy that allowed for introducing behavior
> changes on a regular basis, then we would not have to save them all up for
> the major release, and unleash them on the users all at once. So then you
> would not have that big painful major release upgrade to deal with -- you'd
> have done it a little bit at a time. So the "carrots" become less
> important perhaps.
> Perhaps the fact that behavior changes would come out
> in dribs and drabs over time would make it more likely for users to upgrade
> sooner, because staying current would be less painful than getting too far
> behind -- though that's just a thought.
>
> But anyway, the way it is now, the major release seems to be too many
> things: big marketing push, tons of new features, *and* the only
> opportunity to make breaking changes. A policy like snowflake's seems so
> much healthier, methodical, and relaxed, allowing us to be selective about
> when and how to release behavior changes, without it having to be anything
> more than that.
>
> CalVer <https://calver.org/> may be a good option.
>
>
> On Sat, Aug 26, 2023 at 11:22 PM Daniel Standish <daniel.stand...@astronomer.io> wrote:
>
> > For whatever reason, I was reminded recently of snowflake's "behavior
> > change" policy
> >
> > See here:
> > https://docs.snowflake.com/en/release-notes/behavior-change-policy
> >
> > I think semver is problematic for core because basically you cannot
> > implement behavior changes until the "mythical" major release. The next
> > major always seems very far off. Indeed some have suggested that 3.0 might
> > never happen (looking at you @potiuk :) ).
> >
> > Given the fact that it is so incredibly uncertain when exactly 3.0 will
> > happen (and after that, subsequent major releases), it really makes the
> > developer's job a lot harder. Because you feel like you may never (or
> > practically never) be able to make a change, or fix something, etc.
> >
> > What snowflake does is they release "behavior changes" (a.k.a. breaks with
> > backward compatibility) in phases. First is testing phase (users can
> > enable optionally). Next is opt-out period (enabled by default but you can
> > disable). Final phase is general availability, when it's enabled and you
> > can't disable it.
> >
> > Moving to something like this would give us a little more flexibility to
> > evolve airflow incrementally over time, without putting so much pressure on
> > those mythical major releases. And it would not put so much pressure on
> > individual feature development, because you would know that there's a
> > reasonable path to changing things down the road.
> >
> > Because of the infrequency of our major releases, I really think for core,
> > semver is just not working for us. That's perhaps a bold claim, but that's
> > how I feel.
> >
> > Discuss!
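
For reference, the three phases of the "behavior change" policy described in the quoted message (testing, opt-out, general availability) can be sketched as a small flag-resolution function. This is only an illustration of the idea as summarized in the thread. `Phase` and `is_enabled` are hypothetical names, not part of Airflow or of Snowflake's product.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    TESTING = "testing"                  # off by default, users may opt in
    OPT_OUT = "opt-out"                  # on by default, users may disable it
    GENERAL_AVAILABILITY = "general-availability"  # always on, cannot be disabled


def is_enabled(phase: Phase, user_choice: Optional[bool] = None) -> bool:
    """Decide whether a behavior change is active for a given user.

    user_choice is True or False when the user set the flag explicitly,
    and None when the user did not express a preference.
    """
    if phase is Phase.GENERAL_AVAILABILITY:
        return True  # the user's choice is ignored in the final phase
    if user_choice is not None:
        return user_choice  # explicit opt-in or opt-out wins
    return phase is Phase.OPT_OUT  # default: off in testing, on in opt-out


# A user who explicitly disabled the change is still moved to the new
# behavior once the change reaches general availability.
print(is_enabled(Phase.OPT_OUT, user_choice=False))               # False
print(is_enabled(Phase.GENERAL_AVAILABILITY, user_choice=False))  # True
```
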