I am not sure we can do much more than periodically remind and educate
users about it. This is really a "one-time" problem for most people. Once
you get burned, you switch to the constraint method for the future (at
least most people do). I guess we just need to patiently explain and point
people to the documentation when it happens.
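
For anyone landing here from a search, here is a minimal sketch of what the
"constraint method" amounts to. The URL pattern is the one used in the
install command quoted at the bottom of this thread; the helper names
`constraints_url` and `pip_install_command` are made up for illustration
and are not part of Airflow or pip.

```python
import shlex


def constraints_url(airflow_version: str, python_version: str) -> str:
    """Build the constraints file URL for a released Airflow version.

    Hypothetical helper: the pattern is copied from the example command
    in this thread (Airflow 2.7.2, Python 3.8).
    """
    return (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{airflow_version}/constraints-{python_version}.txt"
    )


def pip_install_command(airflow_version, python_version, extras=()):
    """Return the pip argument list for a constrained Airflow install."""
    extras_part = f"[{','.join(extras)}]" if extras else ""
    return [
        "pip",
        "install",
        f"apache-airflow{extras_part}=={airflow_version}",
        "--constraint",
        constraints_url(airflow_version, python_version),
    ]


if __name__ == "__main__":
    # Print a shell-quoted command line for Airflow 2.7.2 with the celery
    # extra on Python 3.8, matching the example in the quoted message.
    print(shlex.join(pip_install_command("2.7.2", "3.8", ["celery"])))
```

The same command can of course be typed by hand; the point is only that
both the Airflow version and the Python version appear in the constraints
URL, so they have to match what you are installing.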

Also, there is a reason I repeat my explanations every time someone
reports it and creates an issue: the more of those explanations with links
to the documentation there are in public issues, the more likely someone
will find one via Google search or GitHub issue search and stumble on "this
is the solution" when searching for a similar problem - before opening an
issue. And I always write the answer "fresh" rather than copy&pasting it,
because variations in the answer might help with indexing. I think the
number of issues we see could be much bigger if not for all those repeated
explanations. And the more we have, the more likely they will be found,
read, and maybe even followed. Also, with AskAstro and similar bots, having
those answers in public issues in unstructured form is even better than
having a single "pinned" issue.

Remember that people who raise those issues are those who NEVER even
attempted to search for a similar issue. They have not even tried. If they
had just pasted the error message into the search box of Google or GitHub
issues, they would immediately have found a duplicate issue with a clear
explanation of what to do, and they would not have opened the issue. Try it
yourself:
https://github.com/apache/airflow/issues?q=is%3Aissue+No+module+named+%27connexion.decorators.validation%27+is%3Aclosed
or https://www.google.com/search?q=connexion.decorators.validation+airflow
The closed issues with clear explanations of what to do are right there at
the top. I believe we have quite a bit of "survivorship bias" - we only see
the few users who do not think to google it or simply search for a similar
problem and solution. For those users, unfortunately, no pinned issue or
best practices will help. They will keep opening "I have my issue I need to
solve" issues without even checking if there is a similar one - pretty much
regardless of what we do.
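
To illustrate how little effort that search takes, here is a small sketch
that turns an error message into the same kind of GitHub issue search link
as the one above. The function name `github_issue_search_url` and its
`repo` parameter are hypothetical; the only real API used is
`urllib.parse.urlencode` from the Python standard library.

```python
from urllib.parse import urlencode


def github_issue_search_url(error_text: str, repo: str = "apache/airflow") -> str:
    """Build a GitHub search URL for closed issues mentioning an error.

    Hypothetical helper: it only reproduces the query format of the
    search link given in this message.
    """
    # Same qualifiers as in the link above: issues only, closed only.
    query = f"is:issue {error_text} is:closed"
    return f"https://github.com/{repo}/issues?" + urlencode({"q": query})


if __name__ == "__main__":
    print(github_issue_search_url(
        "No module named 'connexion.decorators.validation'"
    ))
```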

Also, I am not sure we need to do more from our side - those breakages do
not happen "too often". We are "quite" careful to foresee which "major"
dependencies might break our builds with breaking changes and upper-bind
them (especially when they declare SemVer compatibility) - but we do it
only for the very few that we "know" well in advance will be problematic.
We had maybe 5 or 6 similar cases in the last 4 years, starting from the
infamous Werkzeug breaking 1.10.8 installation in February 2020 -
literally hours after it was released (this is why we introduced
constraints as a mechanism to prevent it). We can only upper-bind a handful
of those, however, because Airflow is not only an application but also a
library, and we do not want to prevent our users from using future versions
of libraries that may or may not be compatible with current versions. We
have no crystal ball to foresee it, unfortunately. We currently do it for
Python 3.12 (because of the distutils removal), flask, flask-appbuilder,
pendulum, sqlalchemy (and now Connexion) - when it comes to "core"
dependencies. That's about it. Few and far between.

BTW, I think the Connexion case only happened because we had never
experienced a breaking release with it, and we did not expect it to be that
breaking. Also, it turns out (see comment
https://github.com/apache/airflow/issues/35234#issuecomment-1793404192)
that we would not have had this problem at all if we had used Connexion "as
intended" - in April we implemented some optimizations and customized our
use of Connexion in a way that made our changes incompatible with Connexion
3 (before that change, I believe, Connexion 3 would not have been
breaking). Also, since we pinned Connexion in time, those issues will have
a relatively limited time-span - people will hopefully move to installing
2.7.3 soon, and there the Connexion problem is mitigated.

So.. Summarizing. I am not sure it's worth spending more time/effort than
what we already do :).


On Sat, Nov 4, 2023 at 1:18 PM Andrey Anshin <[email protected]>
wrote:

> Every single time some upstream dependency breaks Airflow functionality,
> we get a couple of open issues and discussions.
> The previous one was WTForms; I think I personally closed a couple of
> issues and answered a couple of discussions.
>
> I'm not sure what we could do better here other than accept the fact that
> some of the users could be described by the meme "If those kids could
> read they'd be very upset". I do not want to blame/assault anyone, but it
> is quite popular nowadays to ignore the official documentation for
> different software, and I don't know the nature of this behavior.
>
> Maybe we could improve the situation by pinning a third meta issue in
> GitHub Issues that describes best practices, including:
> - Reproducible install
> - What we expect of a good bug report/feature request
> - Information about third-party Managed Airflow
>
> ----
> Best Wishes
> *Andrey Anshin*
>
>
>
> On Sat, 4 Nov 2023 at 15:58, Jarek Potiuk <[email protected]> wrote:
>
> > Dear Airflow community,
> >
> > Since there were a number (more than 4 already) of issues opened recently
> > when Connexion 3 broke installation of the released Airflow version a few
> > days ago - I have a short reminder on how to install Airflow in a
> > reproducible way.
> >
> > If you want to make sure released Airflow installs in a reproducible way
> > from scratch - now and in the future, the only way to achieve that is
> > described here:
> >
> > https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html
> >
> > It involves using constraints. It only works with `pip`. There are no
> > other ways or other tools with which this can be achieved easily, so we
> > strongly recommend you use `pip` when installing Airflow.
> >
> > Example installation command for airflow with celery extra and Python
> > version 3.8 is here:
> >
> > pip install "apache-airflow[celery]==2.7.2" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.7.2/constraints-3.8.txt"
> >
> > If you **just** install airflow without constraints, situations like this
> > with Connexion breaking our installation might, and WILL happen and we
> > have no way to prevent this from happening if you are not using
> > constraints.
> >
> > If you want to understand why and get some more insights on how to add
> > your custom dependencies as the next step - you can watch my talk from
> > September's Airflow Summit in Toronto where I explain how to do it (and
> > why we need to follow this approach for Airflow):
> >
> > https://www.youtube.com/watch?v=zPjIQjjjyHI
> >
> > J.
> >
>
