I agree with Kaxil that requirements.txt is crucial. The requirements.txt
is already part of this change
https://github.com/apache/airflow/pull/4938 (and it is essential to
the whole concept).
From what I see, no one complains about requirements.txt (the nice
thing is that we will have it automatically updated based on our
pre-commit + CI image) and we have a nice workflow to make sure it
stays up to date during development (basically it is updated
automatically by whoever changes setup.py). It will also help us
with any automated vulnerability scanning (we will be able to deploy
GitHub's Dependabot more easily then).
Let me just summarize where we are (I think):
I think the only point of contention is whether we need the
*-pinned package at all. I don't need that package for the production
image (I just need the requirements.txt). But for me there are two
arguments, one pro and one con:
Pro:
- Having the pinned package will make it easy for everyone to
consistently install airflow to begin with and test it. Right now
it might put users off if they run 'pip install
apache-airflow==1.10.8', it does not install, and they have to
google how to fix it. Sooner or later this happens to every released
Airflow version (with 1.10.8 it happened in a matter of hours). Because
of that I think Airflow might be perceived as not really stable.
Con:
- We will have two versions of airflow released ("apache-airflow" and
"apache-airflow-pinned"). I don't think it adds a lot of overhead for
testing etc. (we can automatically compare and verify the installation
of both and see that they are equivalent at the time of release). But
maybe I am wrong about that one :)
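
To illustrate the kind of automated comparison I mean, here is a minimal
sketch. It is not what our CI does today, and the function names
(parse_frozen, diff_pins) are made up for this example. It assumes we
capture 'pip freeze' output from two virtualenvs, one per package, and only
looks at exact 'name==version' pins:

```python
import re
from typing import Dict, FrozenSet, Tuple


def parse_frozen(text: str) -> Dict[str, str]:
    """Parse 'pip freeze'-style text into {normalized_name: version}."""
    pins: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "==" not in line:
            continue  # skip blanks and anything that is not an exact pin
        name, version = line.split("==", 1)
        # Normalize names the way PEP 503 does: case-insensitive,
        # with runs of '-', '_' and '.' treated as a single '-'.
        key = re.sub(r"[-_.]+", "-", name.strip()).lower()
        pins[key] = version.strip()
    return pins


def diff_pins(
    a: Dict[str, str],
    b: Dict[str, str],
    ignore: FrozenSet[str] = frozenset(),
) -> Dict[str, Tuple[str, str]]:
    """Return {package: (version_in_a, version_in_b)} for every mismatch."""
    missing = "<missing>"
    return {
        name: (a.get(name, missing), b.get(name, missing))
        for name in sorted(set(a) | set(b))
        if name not in ignore and a.get(name) != b.get(name)
    }
```

An empty result from diff_pins would mean both environments resolved to the
same dependency versions. The two top-level package names themselves would
have to go into 'ignore', since they differ by design.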
I am thinking about starting a vote on that. WDYT? Did we have enough
discussion?
J.
On Thu, Mar 19, 2020 at 1:03 PM Kaxil Naik <[email protected]> wrote:
>
> I think we can have a requirements.txt (frozen when the package is
> released, similar to yarn.lock) instead of releasing a separate
> apache-airflow-pinned package.
>
> Regards,
> Kaxil
>
> On Tue, Mar 17, 2020 at 7:38 PM Ash Berlin-Taylor <[email protected]> wrote:
>
> > I think irrespective of what we do about releasing a pinned version, using
> > this approach so our prod image is repeatable sounds good!
> >
> > On 17 March 2020 19:17:59 GMT, Jarek Potiuk <[email protected]>
> > wrote:
> > >Any other comments?
> > >
> > >I'd love to hear your thoughts. It's the one thing that maybe does not
> > >keep me from the prod image, but I would love to know if I can rely on
> > >the requirements.txt being part of the source code so that I can use it
> > >when building the prod image.
> > >
> > >J.
> > >
> > >
> > >On Mon, Mar 16, 2020 at 12:16 PM Jarek Potiuk <[email protected]> wrote:
> > >
> > >>
> > >> On Mon, Mar 16, 2020 at 11:16 AM Driesprong, Fokko <[email protected]> wrote:
> > >>
> > >>> Personally I don't like to have two versions in the PyPI repo. This
> > >>> also complicates the releases, since we need to test, release and
> > >>> verify two versions of Airflow
> > >>>
> > >>
> > >> Not necessarily. Release is automated (and I run both package builds
> > >> in CI now).
> > >> Regarding testing - at the moment of release 'apache-airflow' and
> > >> 'apache-airflow-pinned' are 1-1 equivalent. Their behaviour starts to
> > >> differ when something out of our control happens with dependencies
> > >> after the release. We can even have automated tests that install both
> > >> 'apache-airflow' and 'apache-airflow-pinned' at release time and
> > >> compare the virtualenvs on binary level to verify that they are
> > >> equivalent. This way we can just test one of them.
> > >>
> > >>
> > >>> I'm afraid that this might confuse users.
> > >>>
> > >>
> > >> I think that the current situation is much more confusing for the
> > >> users. In order to install <1.10.8 for example you need to know that
> > >> you should run 'pip install apache-airflow==1.10.8 "Werkzeug<1.0"'.
> > >> If you want to install 1.10.2 you need to add other constraints. And
> > >> tomorrow another constraint might be needed to install 1.10.9. This
> > >> is also a hard blocker for the official, production image of Apache
> > >> Airflow. One of the requirements for an official Docker image is
> > >> repeatability
> > >> https://github.com/docker-library/official-images#repeatability . We
> > >> don't have repeatability now.
> > >>
> > >> We need to consider what our alternatives are. I think our problem is
> > >> that:
> > >>
> > >> 1) we want to have it open for operator development
> > >> 2) we want the user (and official docker build) to have a reliable
> > >> way to install the released version. It has already happened, and
> > >> will happen again, that all of a sudden a released version of airflow
> > >> stops installing without a magic 'Werkzeug<1.0' or similar.
> > >>
> > >> We already have a good solution for 1) with open dependencies. I see
> > >> two solutions for 2):
> > >>
> > >> *Solution 1) Ask the users to use requirements.txt as constraint file
> > >> manually:*
> > >>
> > >> a) manually download requirements.txt from the right release on GitHub
> > >> b) pip install apache-airflow==1.10.9 --constraint requirements.txt
> > >>
> > >> Or
> > >>
> > >> *Solution 2) Use "-pinned" package (effectively does the same):*
> > >>
> > >> pip install apache-airflow-pinned==1.10.9
> > >>
> > >> I find the second one much less confusing and more straightforward.
> > >>
> > >>
> > >>
> > >> I am all ears if anyone has another solution. Can we discuss some
> > >> potential options here?
> > >>
> > >>
> > >>> Besides that, it feels a bit like we're reinventing certain
> > >>> mechanisms that are already in tooling like Poetry.
> > >>>
> > >>
> > >> Unfortunately, according to my checks, Poetry does not handle our
> > >> case where we are both a library and an application at the same time.
> > >> From what I checked, poetry uses a .lock file and always publishes the
> > >> "pinned" version in such a case. Hacking it to support both cases
> > >> would be super-difficult and would defeat the purpose of using a
> > >> ready-to-use tool. I looked at it and could not find a way to achieve
> > >> both 1) and 2). If you think we can do it, it would be great to
> > >> discuss the approach we can take. If we could agree that we only
> > >> publish the pinned version, I would be super happy and switch to
> > >> poetry immediately - but we cannot do it, I am afraid. There was a
> > >> long discussion about it and we cannot afford pinning by default,
> > >> unfortunately. This has already been discussed several times and
> > >> unless this assumption has changed - we cannot use poetry or similar.
> > >>
> > >>> My 2ct.
> > >>>
> > >> --
> > >>
> > >> Jarek Potiuk
> > >> Polidea <https://www.polidea.com/> | Principal Software Engineer
> > >>
> > >> M: +48 660 796 129 <+48660796129>
> > >>
> > >>
> > >
> >
--
Jarek Potiuk
Polidea | Principal Software Engineer
M: +48 660 796 129