> Generally, I like the idea of having such a tool, but we need to make it
> pretty general so that when such a tight dependency issue comes up,
> the tool should just be able to pull it off by building the rest of the
> ecosystem while keeping "constraints".

Absolutely. I checked that PyMSSQL has no dependencies of its own, so
bumping it does not affect the rest of the "environment". That would
not be possible in more general cases, but this will always be a very,
very rare event, and we will only do it when we have a reproducible
build problem. We NEVER update constraints when there is just a new
dependency version with a fix. This is made very clear here:
https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html#fixing-constraints-at-release-time

> The released “versioned” constraints are mostly fixed when we release Airflow 
> version and we only update them in exceptional circumstances. For example 
> when we find out that the released constraints might prevent Airflow from 
> being installed consistently from the scratch.
> In normal circumstances, the constraint files are not going to change if new 
> version of Airflow dependencies are released - not even when those versions 
> contain critical security fixes. The process of Airflow releases is designed 
> around upgrading dependencies automatically where applicable but only when we 
> release a new version of Airflow, not for already released versions.

This is the "exceptional circumstance" mentioned above.
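For readers who have not used the versioned constraints, here is a minimal sketch of how they are consumed at install time. The URL layout follows the one shown in the Airflow installation docs; the helper names `constraints_url` and `pip_install_args` are made up for this illustration and are not part of any Airflow tooling.

```python
from typing import List


def constraints_url(airflow_version: str, python_version: str) -> str:
    """Build the URL of the versioned constraints file for one Airflow release."""
    return (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{airflow_version}/constraints-{python_version}.txt"
    )


def pip_install_args(airflow_version: str, python_version: str) -> List[str]:
    """Arguments for `python -m pip` that install Airflow against its constraints."""
    return [
        "install",
        f"apache-airflow=={airflow_version}",
        "--constraint",
        constraints_url(airflow_version, python_version),
    ]


# Example: the arguments for Airflow 2.5.1 on Python 3.8.
print(" ".join(pip_install_args("2.5.1", "3.8")))
```

Because the constraints file is looked up by Airflow version, updating that file is what changes the pymssql version users get, without a new Airflow release.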

> Btw, I think this issue is also for intel machines and not just ARM
> installations. Ref:
> https://github.com/cython/cython/issues/5541#issuecomment-1641654183

Sort of. To be precise, it is triggered when none of the binary wheels
released by the PyMSSQL maintainers for a given architecture/os/cpp
library version etc. is used. If you look at
https://pypi.org/project/pymssql/2.2.7/#files you will see they
released MANY of those for x86 (but none for ARM). So generally Linux
+ x86 was covered, UNLESS someone had an architecture that did not
fall into one of the 50 or so binary variants (so yeah, it was a bit of
a simplification to say "only ARM", but generally speaking it would be a
very rare case that none of those binaries matched the actual x86
architecture). Not sure why this one did not match the binaries - they
seem to be there - maybe the person posting it had a setting that
forces compilation (via --no-binary :all: for example) when pip
installs packages. BTW, this is also one of the reasons I want to wait
until the maintainers build and publish all the binary packages
that are currently missing for 2.2.8:
https://pypi.org/project/pymssql/2.2.8/#files. Currently there are
just a few of those (only Mac binaries are present there - see the
screenshots and compare to 2.2.7).

We say ARM users are impacted largely because 100% of them are
impacted, compared to likely less than a few % for Intel. The pymssql
project does not publish ARM binaries at all, so pymssql is always
compiled locally when installed on ARM.
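To make the "binary wheel or local compilation" distinction concrete, here is a rough sketch of the decision. It is heavily simplified: real pip also matches the Python tag, the ABI tag, the OS and the libc version, while this only compares the CPU architecture suffix of the platform tag. The function names and the example filenames are illustrative assumptions, not the actual list of files on PyPI.

```python
from typing import Iterable, Set


def wheel_platform_tags(filename: str) -> Set[str]:
    """Return the platform tags encoded in a wheel filename.

    Wheel names follow name-version[-build]-python-abi-platform.whl, and the
    platform field may hold several tags separated by dots.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    platform_field = filename[: -len(".whl")].split("-")[-1]
    return set(platform_field.split("."))


def needs_local_build(
    wheels: Iterable[str], arch: str, force_source: bool = False
) -> bool:
    """True when no published wheel covers `arch`, so the sdist gets compiled.

    force_source models a setting such as --no-binary :all:.
    """
    if force_source:
        return True
    return not any(
        tag.endswith("_" + arch)
        for wheel in wheels
        for tag in wheel_platform_tags(wheel)
    )


# Hypothetical filenames, shaped like typical manylinux/macOS wheels.
example_wheels = [
    "pymssql-2.2.7-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "pymssql-2.2.7-cp310-cp310-macosx_11_0_x86_64.whl",
]
print(needs_local_build(example_wheels, "x86_64"))
print(needs_local_build(example_wheels, "aarch64"))
```

Only the code path where this returns True runs Cython at install time, which is why the Cython 3 release hit every ARM install but only the occasional x86 one.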

> Also, not sure if this can be fixed by constraining Cython, see thread:
> https://discuss.python.org/t/no-way-to-pin-build-dependencies/29833/21

I am not going to constrain Cython - this is not going to work for
the reasons described there (and in my explanation - build isolation
does not take constraints into account; they are only taken into
account when PIP_CONSTRAINT is used but not when --constraint is
used - this is why I explained that the workaround was mostly
"accidental" - PIP_CONSTRAINT working when build isolation is on
could actually be considered a bug, because it behaves differently
from the command line flag). As described above (see my
description), what I am going to do instead is bump pymssql to 2.2.8.
I had to wait for the pymssql maintainers to release 2.2.8
in order to do that. The 2.2.8 release is compatible with Cython 3, so
when someone installs Airflow 2.5.1 with the 2.5.1 constraints, for
example, and this triggers pymssql compilation, it will pull the 2.2.8
version.
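For context, the "accidental" workaround discussed above can be sketched roughly as follows. This assumes the workaround amounts to a constraints file that pins Cython below 3 and is exported through the PIP_CONSTRAINT environment variable, so that the pip subprocess filling the isolated build environment inherits it. The helper name `pinned_build_install` is made up for this illustration, and the sketch only prepares the command; it does not run pip.

```python
import os
import sys
import tempfile
from typing import Dict, List, Tuple


def pinned_build_install(
    requirement: str, build_pins: List[str]
) -> Tuple[List[str], Dict[str, str]]:
    """Prepare a pip command and environment that pin build-time dependencies.

    The pins are written to a temporary constraints file. Its path is exported
    as PIP_CONSTRAINT rather than passed as a command line flag, because only
    the environment variable is inherited by the isolated build step.
    """
    fd, path = tempfile.mkstemp(prefix="build-constraints-", suffix=".txt")
    with os.fdopen(fd, "w") as handle:
        handle.write("\n".join(build_pins) + "\n")
    env = dict(os.environ, PIP_CONSTRAINT=path)
    command = [sys.executable, "-m", "pip", "install", requirement]
    return command, env


# Assumed pin: keep Cython below 3 while pymssql 2.2.7 is compiled.
command, env = pinned_build_install("pymssql==2.2.7", ["Cython<3"])
print(command[1:], env["PIP_CONSTRAINT"])
# To actually install: subprocess.run(command, env=env, check=True)
```

Having to set this up in every custom image is the complexity that bumping the constraint to 2.2.8 avoids.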

J.



On Mon, Jul 31, 2023 at 12:36 PM Amogh Desai <[email protected]> wrote:
>
> Generally, I like the idea of having such a tool, but we need to make it
> pretty general so that when such a tight dependency issue comes up,
> the tool should just be able to pull it off by building the rest of the
> ecosystem while keeping "constraints".
>
> Btw, I think this issue is also for intel machines and not just ARM
> installations. Ref:
> https://github.com/cython/cython/issues/5541#issuecomment-1641654183
>
> Also, not sure if this can be fixed by constraining Cython, see thread:
> https://discuss.python.org/t/no-way-to-pin-build-dependencies/29833/21
>
> Thanks & Regards,
> Amogh Desai
>
> On Mon, Jul 31, 2023 at 3:57 PM Jarek Potiuk <[email protected]> wrote:
>
> > BTW. I will run the consensus question until Thursday 3rd of August
> > 2023 1pm noon.
> >
> > On Mon, Jul 31, 2023 at 12:20 PM Jarek Potiuk <[email protected]> wrote:
> > >
> > > Hello everyone,
> > >
> > > TL;DR: I have a small proposal - I would like to fix constraints for
> > > all Airflow 2.5.* and 2.6.* by updating pymssql from 2.2.7 to the
> > > 2.2.8 version (released today).
> > >
> > > I still have to wait for the "complete" release. They have not yet
> > > released Linux binary variants of the packages for 2.2.8, and people
> > > watching it have flagged it to the maintainers, but I wanted to get
> > > consensus on it before I start doing it.
> > >
> > > Currently users installing the MSSQL provider for their ARM-based Airflow
> > > are experiencing "build failed" when pymssql is installed. They have
> > > to use a workaround described here -
> > > https://github.com/apache/airflow/issues/32672#issuecomment-1647007726
> > > and the proposal aims to fix it so that the workaround will not be
> > > needed when using constraints. There are already a few issues about it
> > > in our repo:
> > >
> > > This is one of the extremely rare cases (it has happened 2 times over
> > > the last 2 years) where our "reproducible installation" stopped working
> > > for Python versions - because of the `pip` tooling update that we have
> > > no control over. Thanks to the ability to update constraints, we can
> > > fix it.
> > >
> > > If we get consensus I will use that opportunity to add some tooling to
> > > make it easier to do such updates in the future - it requires creating
> > > a new branch for every version and moving constraint tags - but this is
> > > easy to automate. And I will have an excuse to develop a small tool to
> > > help with that - which we will be able to use in the future in
> > > similar cases (I've done it manually before).
> > >
> > > Some more context:
> > >
> > > Two weeks ago, on 17th of July, Cython released a long-in-the-making
> > > 3.0.0 version with some backwards-incompatible changes, and while a
> > > lot of packages have been made compatible, pymssql was one of the
> > > packages that was not. The issue did not affect x86 users, because
> > > pymssql binaries were pre-compiled in PyPI
> > > https://pypi.org/project/pymssql/2.2.7 but ARM users have problems
> > > installing it, because it needs to be compiled on the fly for them.
> > >
> > > It caused quite a bit of mayhem in the Python ecosystem - especially for
> > > projects that are not as up-to-date as Airflow is with all our
> > > dependencies - most of our dependencies are automatically updated in
> > > the constraints as soon as new versions are released, and many of them
> > > have binary packages already. So given how big a problem it was
> > > for some other projects, having just pymssql being problematic is
> > > quite cool and shows that our approach works :).
> > >
> > > Unfortunately we have no control over which version of Cython is used
> > > when compiling PyMSSQL (this is something described by the pymssql package
> > > itself) - new versions of pip have "build isolation" enabled by default,
> > > so it's only up to the package itself to decide on the version of
> > > build tools that are used. There is a "mostly accidental" - I think -
> > > workaround with the PIP_CONSTRAINT environment variable but it is rather
> > > complex-ish to pull off, especially in custom docker images based on the
> > > slim images.
> > >
> > > I implemented the workaround for our ARM images last week to make
> > > them work - so you can see it's quite complex-ish:
> > > https://github.com/apache/airflow/pull/32748
> > >
> > > The 2.2.8 version of pymssql has only one change:
> > >
> > > > Version 2.2.8 - 2023-07-30 - Mikhail Terekhov
> > > > Compatibility with Cython. Thanks to matusvalo (Matus Valo) (fix #826).
> > >
> > > Why 2.5+?
> > >
> > > a) because ARM support for MSSQL was introduced in 2.5.1
> > > b) because 2.4 used the 2.2.5 version of PyMSSQL and there were a few more
> > > changes in 2.2.6, so there is a (low) risk it will break something
> > > else.
> > >
> > > Note that we do NOT have to rebuild our images: as long as pymssql
> > > 2.2.7 was built before Cython 3.0.0, it is good to go. Since the
> > > only change in 2.2.8 is to make it build with Cython 3, we
> > > do not need to rebuild and re-release our images.
> > >
> > > Can we get consensus on it? Does anyone have anything against it?
> > >
> > > J.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
> >
> >
