Re: [DISCUSSION] Assessing what is a breaking change for Airflow (SemVer context)

Bolke de Bruin Mon, 05 Dec 2022 07:37:37 -0800

on 1)

I've just skimmed the PR a bit and honestly it feels a bit clunky. It
relies yet again on some after-coding-enforcement and some kind of IDL to
generate a stub. Isn't there a kind of pragma that we could use? Or a
decorator?


like:
(see also: https://gist.github.com/latsa/1f3ed52784a6fb423a937aa030679117)

@public
def my_func():
    ...

or

def my_func():  # pragma: public
    ...

I like the immediate feedback, as opposed to MyPy checks, which only
fail during tests. I think the decorator style is used in other projects
too, but I don't have references.

Again I just skimmed it so I am shooting from the hip as they say ;-)



On Sat, 3 Dec 2022 at 10:25, Jarek Potiuk <ja...@potiuk.com> wrote:

> Sorry for not following up on this for a bit - it's been hectic these
> days for me. I think valid points were said, and from the tone of
> those I feel that we all who participated have the same sense of what
> is important:
>
> 1) users' "peace of mind" as top priority: clarity of what they can
> expect from Airflow, and avoiding surprises when upgrading
> 2) targeting minimal disruption to users' workflows (though we might
> never reach absolute 100%)
> 3) making it easy for contributors and maintainers to decide on
> breaking/non-breaking behaviours
>
> I think there is a main blocker to all of those (also mentioned in the
> discussion above):
>
> We are extremely cautious about any change because there is a lack of
> agreement/expectations with our users on what is supposed to be the
> "public API".
>
> # Proposal
>
> My proposal is to work on documenting our approach for our users (and
> for maintainers) in a single page: "What is Airflow Public API?" and
> what users can expect.
>
> There are certain areas where we can define rules and either automate
> or document (or both) our statement about what is the "public" API and
> (more importantly) what is clearly NOT, all in a single-page document.
> It should also be accompanied (where possible) by some automation and
> tooling that would help us to express it in detail (and help our users
> to validate whether they are conforming to the "public API").
>
> We won't solve it very quickly, but once we start doing it, it might
> turn out that it's not that long of a process in fact. And if we start
> it now - in a few months we might be in a different place.
>
> # Some concrete actions we might take
>
> 1) On the "Code" level - we can start to define the API that is
> considered as "public" and add verification of those for our users. We
> could implement a similar solution to what I proposed to common.sql
> https://github.com/apache/airflow/pull/27962 (where I followed Ash's
> idea to use MyPy stubgen and pre-commits to flag changes to it, and
> where we harness MyPy capabilities to control how the API is used). I
> believe that we could apply a similar solution to all providers and
> eventually even all parts of core, to make it very clear which part of
> the Airflow API is public and which is not. I think MyPy and
> strong-ish typing are taking the Python world by storm, and we could
> use it as a standard way of communicating to those who use Airflow as
> a library, which parts are "public".
>
> Having .pyi files as part of our packages, with the parts that are not
> supposed to be exposed "hidden", is not only a nice communication tool
> but is also supported from day 0 by all kinds of tooling for our users
> (IDE integrations, automations to check if the right API is used,
> etc.). We could even easily provide guidelines for the users:
> "Here is how you can check if you are using Airflow code properly".
> Not 100% foolproof but much better than anything else I can imagine.
>
> Also, having it in place will finally allow the providers to be split
> into separate repositories - and we could use MyPy checks rather than
> running the full test suite with the Providers to verify that changes
> in Airflow do not break Providers. That would finally make it possible
> to loosen the coupling we have between Providers and Airflow.
> Currently we basically run the whole suite of tests to be certain
> things are working, but we could simply run MyPy checks on the
> providers if we have proper .pyi files (not the same confidence, but
> very, very close).
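
To make the "flag changes to the public API" idea concrete, here is a
rough sketch. It is not the stubgen and pre-commit check from the PR: it
swaps in runtime inspection of function signatures, purely as an
illustration. The helper names (`snapshot_public_api`, `diff_api`) and
the toy `demo_provider` module are assumptions made up for this example:

```python
import inspect
import types
from typing import Dict, List


def snapshot_public_api(module: types.ModuleType) -> Dict[str, str]:
    """Map each public function name in the module to its signature string."""
    names = getattr(module, "__all__", None)
    if names is None:
        # Fall back to the usual convention: no leading underscore means public.
        names = [name for name in vars(module) if not name.startswith("_")]
    api = {}
    for name in names:
        obj = getattr(module, name)
        if inspect.isfunction(obj):
            api[name] = str(inspect.signature(obj))
    return api


def diff_api(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """List removals and signature changes relative to the old snapshot."""
    problems = []
    for name, signature in old.items():
        if name not in new:
            problems.append(f"removed: {name}{signature}")
        elif new[name] != signature:
            problems.append(f"changed: {name}{signature} -> {name}{new[name]}")
    return problems


def make_module(source: str) -> types.ModuleType:
    """Build a throwaway module from source, standing in for a real package."""
    module = types.ModuleType("demo_provider")
    exec(source, module.__dict__)
    return module


old = make_module("def run(sql, parameters=None): ...\ndef close(): ...\n")
new = make_module("def run(sql): ...\n")

for problem in diff_api(snapshot_public_api(old), snapshot_public_api(new)):
    print(problem)
```

A stored snapshot compared against the current code is the same shape
of check as committing generated .pyi files and flagging any diff, only
without the type information that MyPy would add.
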
>
> 2) On the DB level - we already have "AIP-44" as the foundation for
> telling the users: this is "Airflow", and this is what you can do when
> you write your DAGs. Direct DB access will be forbidden, and we can
> specifically communicate to the users "do not use the DB any more" and
> even work out warnings for when our users do. Later we could even
> block direct access by default (but that is likely only in
> Airflow 3).
>
> 3) On the UI level - we could simply explain that UI changes are
> exempt from the "no removal" policy. We might simply treat all the UI
> changes as non-breaking by default and loosen our strictness there.
> This would be very close to the Chrome/Firefox example by Bolke - I
> think UI changes are not breaking in the sense that you have to fix
> code that uses the UI; they simply require users to change their
> habits. This would just acknowledge the approach we already used when
> TreeView was replaced by GridView.
>
> 4) Airflow also has a few non-code interfaces that are considered
> part of the platform: statsd metrics is one of them. I can't think
> of any others, but maybe there are more. We could simply make an
> inventory, discuss our approach to those ONCE, and document it. This
> will avoid discussions, discussions, discussions, give our users
> clear expectations, and let maintainers make quick decisions when
> approving (or not) PRs.
>
> # Question
>
> Does it sound like a good plan? Is it worth making such an effort? Or
> maybe what we have as status-quo is "good enough" and that would be a
> waste of effort? WDYT?
>
> J.
>


--
Bolke de Bruin
bdbr...@gmail.com
