Same, I was very tempted by this at first but Jarek and Niko changed my mind. I think sticking to semver will be more beneficial in the long run.
On Wed 30 Aug 2023 at 04:09, Mehta, Shubham <shu...@amazon.com.invalid> wrote: > I couldn’t agree more with Jarek and Niko's perspective on the importance > of maintaining SemVer for Apache Airflow. > > I've had conversations with dozens of customers, and it was a lot easier > to convince them to upgrade for a more stable and secure Airflow > environment. The key selling point was that Airflow strictly follows > SemVer, so users don't have to worry about upgrades breaking their > environment. Security is the most important aspect of this. And with the > recent inflow of CVEs being addressed with every version release, I can't > imagine how difficult it would have been for customers without SemVer > promise to ensure that their Airflow deployments are secure. > > > Well, if we had a different policy that allowed for introducing behavior > changes on a regular basis, then we would not have to save them all up for > the major release, and unleash them on the users all at once. So then you > would not have that big painful major release upgrade to deal with -- you'd > have done it a little bit at a time. So the "carrots" become less important > perhaps. Perhaps the fact that behavior changes would come out in dribs and > drabs over time would make it more likely for users to upgrade sooner, > because staying current would be less painful than getting too far > > Regarding introducing behavior changes on a regular basis, I recently > analyzed improvements and new features in Airflow. I noticed that Airflow > did not strictly follow SemVer for the 1.10.x releases. As a result, there > were many users stuck on versions like "1.10.12," and these users are > hesitant even to upgrade to later 1.10 versions. Now, I see users happily > migrating to newer versions of Airflow and trying out new features. > Granted, it's not perfect due to potential breaking changes in the provider > packages, but it's far better than what Airflow experienced with the 1.10.x > series. 
> > To be honest, Airflow already faces challenges in improving the adoption > of new features, in my personal opinion. For example, it took about a year > for deferrable operators to gain awareness and interest. I also haven't > seen much excitement around data-driven scheduling among Airflow users > (which I was secretly hoping for), similar to Airflow contributors. Moving away > from SemVer would likely make this situation worse. > > > I'm sure many good ideas have emerged, but been ruled out solely based > on backcompat. > > Until we have a list of data points / ideas that were discarded, it is > hard to justify a major release for this reason. Maybe we should maintain > an active list in GitHub discussions? > > In conclusion, SemVer is easy to understand for regular Airflow users who > might not read every line in the release notes or follow every mailing list > discussion. Personally, I don't think it has made adding new features > difficult as I see a lot of new features coming in lately. In fact, I > strongly believe that SemVer helps keep Airflow contributors focused on > customer needs and encourages them to think creatively about ensuring > backward compatibility. > > Shubham > > > On 2023-08-27, 9:05 AM, "Daniel Standish" > <daniel.stand...@astronomer.io.invaLID> wrote: > > > > Since I was called :) > > > > > As though you needed to be called to chime in ;) > > > Yeah and the other thing that your comments made me think of was... how it > could make provider management more challenging. Because though currently > we have min_airflow_version set in providers and we can use that to control > behavior (and assumptions about what's in core), presently it's just about > future compat and just addition of new features.
But with a change like > this, it would expand that burden to some extent, by > requiring consideration of what's changed and what's removed, in a way that > is not a practical issue presently. > > > I see no particular reason for removing features if they do not slow us > down > > > > > Yeah so wholesale removal of features is one thing, like with the subdags > you mentioned. But the prospect of the infinitely distant 3.0 also has a > more diffuse impact on development. I'm sure many good ideas have emerged > but been ruled out solely based on backcompat. Sometimes probably on a > narrow backcompat concern where it's maybe like... is anybody really > relying on this aspect of behavior? > > > Maybe that's simply what we must deal with. But the thought occurred to > me, maybe there's some other way. > > > And yeah I shouldn't say it's "not working for us"... that's just me > writing an email 2 minutes before bedtime when an idea popped in my > head.... obviously it's working ok for us, and doing a lot of work *for* > us. > > > > > > > > > On Sun, Aug 27, 2023 at 1:33 AM Jarek Potiuk <ja...@potiuk.com> wrote: > > > > Since I was called :). > > > > Yes. I would be very, very careful here. You might think that we use > > "SemVer" as a "cult". After all, it's just a versioning scheme we adopted, > > right? But for me - this is way more. It's all about communication with > > our users, making promises to our users and design decisions that impact > > our security policies. > > > > I think Semver has this nice property that we can promise our users "if > you > > are using the public interface of Airflow, you can upgrade without too > much > > of a fear that things will break - if they will be broken, this will be > > accidental and will get fixed".
BTW we already have it very nicely defined: > > > > https://airflow.apache.org/docs/apache-airflow/stable/public-airflow-interface.html > > > > so it's pretty clear what we promise to our users. And it also has > certain > > "security" properties - but I will get to that. > > > > I would love to hear what others think, but I have 3 important aspects > that > > should be considered in this discussion > > > > 1. Promises we make to our users and what it means for us answering > > their issues. > > > > Surely we could make other promises. CalVer promises (We release > regularly) > > - but it does not give the user any indication that whatever worked > before > > will work in the foreseeable future and will get maintained. It makes > > maintainer life easier, yes. It however makes the user's life harder, > > because they cannot rely on something being available and their > > upgrades might and will be more difficult. And yes - for Snowflake it > > matters a lot, because they actually get paid for supporting old versions > > and they have no choice but to respond to users who claim the "old > > supported version does not work". They cannot (as we can, and often do > > currently) tell the users "upgrade to the latest version - it should be > > easy because of the SemVer promise" - if you follow our "use public interface > > only" rule, of course. We (community/maintainer) can very easily say that and > > since we give no support, no guarantees, no-one pays for solving > problems, > > this "upgrade to latest version" is actually a very good answer - many, > > many times. > > > > For maintainers that rarely respond to user questions, yes, Semver makes it > harder > > to add new things.
But for maintainers who actually respond a lot to > users' > > questions, life without SemVer is suddenly way harder - because they cannot answer > > "upgrade to latest version" - because immediately the user will answer > "but > > I cannot - because I am using this and that feature. Tell me how to solve > > the problem without upgrading". And they will be fully within their rights to say > that. > > I recommend that every maintainer sometimes spend a few weeks looking at and > > actually responding to users' questions. That is quite a good lesson and > > this aspect should also be considered in this discussion. > > > > 2. Why do we want to introduce breaking changes? > > > > I believe the only reason for removing features (I don't really like > > softening "breaking changes" with "behaviour changes", BTW; > > this attempts to hide the fact that those changes are - in fact - > breaking > > changes) is when they are slowing us down (us - people who > > develop airflow). So I propose to keep the name "breaking changes" in further discussion as > it > > tells much more about the nature of "behaviour changes". I see no > > particular reason for removing features if they do not slow us down. > > > > Let me ask this way - would Semver disturb you if we had a way of > removing > > features from core airflow (i.e. making them not slow down > development) > > without breaking Semver? Seems > contradictory - > > but it is not. We've already done that and continue doing it. Move > > Kubernetes and Celery out of core -> we did it. They are no longer part of > > Semver. They never were, actually (but a number of people could assume > they > > were). Now, they are completely out of the way. I remember how much time > > you, Daniel, spent on back-compatibility of K8S code - but... Not any > more. > > People will be able to keep current K8S and Celery provider > implementation > > basically forever now. No matter how many Airflow versions we have.
By > > introducing a very clear Executor API and making sure we decouple it from > > the Core - we actually made the impossible happen: > > > > * Airflow core still keeps the SemVer Promises > > * Users can stick to the "old" behaviour of K8S and Celery executors as > > long as they want > > * We can introduce breaking changes in K8S and Celery providers - without > > looking back at all > > > > Seems like magic - but just by clear specification of what is/is not a > > public API, decoupling things and having mechanisms of providers where > you > > downgrade and upgrade them independently and where the old versions can > be > > used for as long as you want - even after you upgrade Airflow, we > > seemingly made the impossible possible. And .. my assumption is - we can do > > the same with other features. Surely some are more difficult than others > > (SubDAG - yes I am talking about you). But maybe - instead of breaking > > SemVer we can figure out how to move subdag to a separately installable > > provider? Why not introduce some stable API and decouple SubDAG > > functionality similar to Celery/K8S Executors? It will be a lot harder, > and > > likely performance will suffer - but hey, performance is not something > > promised by SemVer. We already - apparently - in 2.5 increased > > Airflow's resource requirements by ~ 30% (looks like from an anecdotal > > user's report). And while users complain, performance / resource usage is > > not promised by SemVer (and by us). And while users might complain, > > increasing resources nowadays is just a matter of cost, it's generally > easy > > to increase memory for your Airflow installation. Yes you will pay more, > > but usually Airflow's cost is rather low and there are other factors that > > might decrease the cost (such as deferrables) so this is not a big > problem > > (and it does not matter in this discussion).
> > > > So my question is - do we have a really good reason to break up with > SemVer > > and remove some features? Or maybe there are ways we can separate them > out > > of the way of core maintainers without breaking SemVer? I believe more > and > > more decoupling is a way better approach to achieve faster development than > > breaking SemVer. > > > > 3. Security patches > > > > This is, I think, one of the things that will only get more important > over > > the next few years. And we need to be ready for what's coming. I am not > > sure about others but I am not only following, but also actively > > participating in discussions of the Apache Software Foundation. For those > who > > don't - I recommend reading this blog post at the ASF Blog: > > > > https://news.apache.org/foundation/entry/save-open-source-the-impending-tragedy-of-the-cyber-resilience-act > > > > We are facing - in the next few years - increased pressure from > governments > > (EU and US mainly in our case) to force everyone to follow security > > practices they deem essential. We are at a pivotal moment where the > > Software Development industry is starting to be regulated. It happened in > > the past for many industries. And it is happening now - we are facing > > regulations that we've never experienced in software development history. > > Those laws are about to be finalized (some of them in a few months). The > > actual scope of it is yet to be finalized but among them there is a > STRICT > > requirement of releasing security patches for all major versions of the > > software for 5 years (!) after it's been released. This will be a strict > > requirement in the EU and companies and organisations not following it > will > > be forbidden to do business in the EU (similar in the US). How it will > > impact ASF - we do not know yet, our processes are sound.
But there is a > > path where both our users and stakeholders will expect that there are > > security patches that are released for all the software that is out there > > and used for years. > > > > If we use SemVer - this is the very nice side of it - by definition we > only > > need to release patches for all the MAJOR versions we have. This is what > we > > do effectively today. We only release security patches for the latest > MINOR > > release of the ONLY major release (Airflow 2). If we start deliberately > > releasing breaking changes - then such a breaking release becomes > > automatically equivalent to a MAJOR release - because our users will not > be > > able to upgrade and apply security fixes without - sometimes - majorly > > breaking their installation. This is 100% against the spirit and idea of > > the regulations. The regulations aim to force those who produce software > to > > make it easy and possible for the users to upgrade immediately after > > security fixes are released. > > > > In a way - using SemVer means being able to tell the users "We only release > > security patches in the latest release because you can always easily > > upgrade to this version due to SemVer". > > > > If we are looking to speed up our development and not get into the way of > > maintainers - CalVer in this respect is way worse IMHO. The regulations > > might make us actually slower if we follow it. > > > > J. > > > > > > > > > > > > On Sun, Aug 27, 2023 at 8:46 AM Daniel Standish > > <daniel.stand...@astronomer.io.invalid> wrote: > > > > > And to clarify, when I talk about putting pressure on major releases, > > what > > > I mean is that there's this notion that a major release has to have > some > > > very special or big feature change. One reason offered is marketing. > > > A major release is an opportunity to market airflow, so take advantage of > > > that.
Another offered is, "well folks won't upgrade if there's not some > > > special carrots in it", especially given that major releases are where > we > > > introduce a bunch of breaking changes all at once. > > > > > > Well, if we had a different policy that allowed for introducing > behavior > > > changes on a regular basis, then we would not have to save them all up > > for > > > the major release, and unleash them on the users all at once. So then > > you > > > would not have that big painful major release upgrade to deal with -- > > you'd > > > have done it a little bit at a time. So the "carrots" become less > > > important perhaps. Perhaps the fact that behavior changes would come > out > > > in dribs and drabs over time would make it more likely for users to > > upgrade > > > sooner, because staying current would be less painful than getting too > > far > > > behind -- though that's just a thought. > > > > > > But anyway, the way it is now, the major release seems to be too many > > > things: big marketing push, tons of new features, *and* the only > > > opportunity to make breaking changes. A policy like Snowflake's seems > so > > > much healthier, methodical, and relaxed, allowing us to be selective > > about > > > when and how to release behavior changes, without it having to be > > anything > > > more than that. > > > > > > CalVer <https://calver.org/> may be a good > option.
> > > > > > > > > On Sat, Aug 26, 2023 at 11:22 PM Daniel Standish <daniel.stand...@astronomer.io> wrote: > > > > > > > For whatever reason, I was reminded recently of Snowflake's "behavior > > > > change" policy > > > > > > > > See here: > > > > https://docs.snowflake.com/en/release-notes/behavior-change-policy > > > > > > > > I think semver is problematic for core because basically you cannot > > > > implement behavior changes until the "mythical" major release. The > > next > > > > major always seems very far off. Indeed some have suggested that 3.0 > > > might > > > > never happen (looking at you @potiuk :) ). > > > > > > > > Given the fact that it is so incredibly uncertain when exactly 3.0 > will > > > > happen (and after that, subsequent major releases), it really makes > the > > > > developer's job a lot harder. Because you feel like you may never (or > > > > practically never) be able to make a change, or fix something, etc. > > > > > > > > What Snowflake does is they release "behavior changes" (a.k.a. breaks > > > with > > > > backward compatibility) in phases. First is a testing phase (users can > > > > optionally enable it). Next is an opt-out period (enabled by default but > you > > > can > > > > disable it). The final phase is general availability, when it's enabled and > > you > > > > can't disable it. > > > > > > > > Moving to something like this would give us a little more flexibility > > to > > > > evolve airflow incrementally over time, without putting so much > > pressure > > > on > > > > those mythical major releases. And it would not put so much pressure > > on > > > > individual feature development, because you would know that there's a > > > > reasonable path to changing things down the road. > > > > > > > > Because of the infrequency of our major releases, I really think for > > > core, > > > > semver is just not working for us.
That's perhaps a bold claim, but > > > that's > > > > how I feel. > > > > > > > > Discuss!