It's better than nothing, but I am concerned that this will make managing
optional dependencies overly complicated. I really think that the proper
fix is patch releases.

On Thu, Aug 1, 2019 at 1:05 PM Jarek Potiuk <jarek.pot...@polidea.com>
wrote:

> Hello Everyone,
>
> Just to revive the thread - we had a discussion with Ash today, after
> the small "spanner" drama, and we came up with a possible solution.
>
> This is something we have yet to try, but it seems it should be
> possible to generate additional "pinned" extras (pinned, gcp_api-pinned,
> etc.) - the name could also be "frozen" instead of "pinned" if that
> sounds better.
>
> This way you would be able to run:
>
>    - `pip install airflow==1.10.4[all-pinned]`
>    - `pip install airflow==1.10.4[gcp_api-pinned]`
>    - ...
>
> This way it will always work, no matter whether new dependencies are
> released: it will install the "frozen" versions of dependencies that we
> know work for sure. We could update the documentation to add this as the
> recommended method of standalone installation. Then, if you need some
> other (newer) set of dependencies, you could run a custom pip install
> afterwards to change specific dependencies.
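>
> Roughly, in setup.py terms the idea could look like this - just a sketch,
> with placeholder package and version numbers rather than the actual pins
> we would ship:
>
>     # setup.py (sketch): a regular extra plus its "pinned" twin
>     # NOTE: versions below are illustrative placeholders only
>     from setuptools import setup
>
>     gcp_api = [
>         'google-api-python-client>=1.6.0',   # open range, as today
>     ]
>     gcp_api_pinned = [
>         'google-api-python-client==1.7.9',   # illustrative exact pin
>     ]
>
>     setup(
>         # ...
>         extras_require={
>             'gcp_api': gcp_api,
>             'gcp_api-pinned': gcp_api_pinned,
>             # 'all-pinned' would aggregate the pinned variants of every extra
>             'all-pinned': gcp_api_pinned,  # + the other pinned lists
>         },
>     )
>
> The unpinned extras would stay exactly as they are, so nothing changes
> for people who want the current behaviour.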
>
> What do you think? Would that work for the users of Airflow?
>
> J.
>
> On Tue, Jul 9, 2019 at 9:06 PM Driesprong, Fokko <fo...@driesprong.frl>
> wrote:
>
> > Hi Jarek,
> >
> > Thanks for bringing this up. I certainly think this is a good idea.
> > Unfortunately I'm on a plane right now, so I'm unable to read the
> > Google doc at the moment.
> >
> > GitHub recently acquired Dependabot, which even supports automatic
> > updates of dependencies. Then we at least know when something breaks.
> > The only problem right now is that this bot isn't allowed by the ASF
> > policies, since it requires write access to the repository.
> >
> > Regarding semver: I often see packages changing the public API in a
> > minor update without any notice of deprecation. In that case it is
> > impossible to make this watertight, but at least a more structured
> > process using something like Dependabot would be a big plus!
> >
> > Cheers, Fokko
> >
> >
> >
> > On Sun, Jul 7, 2019 at 11:34 Jarek Potiuk <jarek.pot...@polidea.com> wrote:
> >
> > > I'm all for a deeper release-cycle discussion. I think after 1.10.4
> > > is out we should discuss, agree on, and document the release scheme we
> > > are going to use. Semver and patching seem like a good idea.
> > >
> > > We already have quite a lot of experience backporting to the 1.10.x
> > > branch, and it was surprisingly easy - small, focused commits help
> > > with that. And if we limit patches to dependency updates and security
> > > fixes only, I don't think it will be a lot of effort.
> > >
> > > Bots and automation are definitely something we should pursue. The
> > > pyup bot is great - for one - at automating upgrades of pinned
> > > dependencies. We have been using it in Oozie-to-airflow for quite some
> > > time and it takes almost no time to upgrade deps regularly:
> > >
> > > https://github.com/GoogleCloudPlatform/oozie-to-airflow/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Aclosed+pyup
> > >
> > > Those are automated PRs we got from pyup, and it was enough to
> > > "approve" + "merge" after we saw that all the tests passed with the
> > > new version.
> > >
> > > J.
> > >
> > >
> > >
> > > On Sat, Jul 6, 2019 at 9:24 PM Philippe Gagnon <philgagn...@gmail.com>
> > > wrote:
> > >
> > > > I am +1 on pinning core packages, even though this adds a bit of
> > > > manual labor for maintenance. This latest werkzeug issue highlights
> > > > why this is a good idea.
> > > >
> > > > Also +1 on changing the versioning scheme to something more akin to
> > > > semver. The current scheme basically does not support patch-only
> > > > releases, and a 4-part version notation seems a bit much. Overall, I
> > > > think that patch-only releases would make the project healthier.
> > > >
> > > > Two points though:
> > > >
> > > > 1. I think that there should be a more in-depth discussion about
> > > > clarifying the release lifecycle policy.
> > > > 2. This implies a lot more backport-related work, which is a bit of
> > > > a burden since it is both tedious and boring. Perhaps we could look
> > > > into having a bot help out with this (similar to
> > > > https://github.com/miss-islington)?
> > > >
> > > > On Sat, Jul 6, 2019 at 1:04 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
> > > >
> > > > > I think the recent case with werkzeug calls for action here (also
> > > > > see https://issues.apache.org/jira/browse/AIRFLOW-4903). We again
> > > > > ended up with a released Airflow version that cannot be installed
> > > > > easily because of a transitive dependency upgrade.
> > > > >
> > > > > I think this is something we should at least consider for the 2.*
> > > > > versions.
> > > > >
> > > > > The problem is with simply running 'pip install airflow==1.10.3'.
> > > > > Right now this will not work - you have to hack around it and
> > > > > manually upgrade deps (like
> > > > > https://github.com/godatadriven/whirl/issues/50).
> > > > >
> > > > > I really do not like that changes beyond our control impact a
> > > > > release we have already made (and that is already out there on
> > > > > PyPI).
> > > > >
> > > > > I recently read the nice writeup
> > > > > https://docs.google.com/document/d/1x_VrNtXCup75qA3glDd2fQOB2TakldwjKZ6pXaAjAfg/edit
> > > > > about Python dependency problems, and I think the only solution is
> > > > > to pin the "core" packages. This likely means that we have to be
> > > > > ready to release sub-releases with security dependencies updated
> > > > > (like 1.10.4.1, perhaps), or change the semantics a bit towards
> > > > > semver and start releasing 2.0.0, 2.1.0, etc., and then release
> > > > > security updates as 2.0.1 and so on. If those 2.0.1-style releases
> > > > > are made only for dependency updates, security bugfixes, and some
> > > > > critical problems, and if we automate it, I don't think releasing
> > > > > those security-patched versions would be a big problem. We can have
> > > > > services like pyup (https://pyup.io/) or even GitHub itself monitor
> > > > > dependencies for us and create PRs automatically to update them.
> > > > >
> > > > > Would someone actually complain if any of the "core" packages
> > > > > (install_requires + devel) below got pinned? I am not sure that it
> > > > > would be a big problem for anyone, and even if you need (in your
> > > > > operator) some newer version - you can always upgrade it afterwards
> > > > > and ignore the fact that Airflow has it pinned.
> > > > >
> > > > > Here are the dependencies that are the "core" ones:
> > > > >
> > > > > install_requires:
> > > > >
> > > > >    -             'alembic',
> > > > >    -             'cached_property',
> > > > >    -             'configparser',
> > > > >    -             'croniter',
> > > > >    -             'dill',
> > > > >    -             'dumb-init',
> > > > >    -             'flask',
> > > > >    -             'flask-appbuilder',
> > > > >    -             'flask-caching',
> > > > >    -             'flask-login',
> > > > >    -             'flask-swagger',
> > > > >    -             'flask-wtf',
> > > > >    -             'funcsigs',
> > > > >    -             'gitpython',
> > > > >    -             'gunicorn',
> > > > >    -             'iso8601',
> > > > >    -             'json-merge-patch',
> > > > >    -             'jinja2',
> > > > >    -             'lazy_object_proxy',
> > > > >    -             'markdown',
> > > > >    -             'pendulum',
> > > > >    -             'psutil',
> > > > >    -             'pygments',
> > > > >    -             'python-daemon',
> > > > >    -             'python-dateutil',
> > > > >    -             'requests',
> > > > >    -             'setproctitle',
> > > > >    -             'sqlalchemy',
> > > > >    -             'tabulate',
> > > > >    -             'tenacity',
> > > > >    -             'text-unidecode',
> > > > >    -             'thrift',
> > > > >    -             'tzlocal',
> > > > >    -             'unicodecsv',
> > > > >    -             'zope.deprecation',
> > > > >
> > > > > Devel:
> > > > >
> > > > >    -     'beautifulsoup4',
> > > > >    -     'click',
> > > > >    -     'codecov',
> > > > >    -     'flake8',
> > > > >    -     'freezegun',
> > > > >    -     'ipdb',
> > > > >    -     'jira',
> > > > >    -     'mongomock',
> > > > >    -     'moto',
> > > > >    -     'nose',
> > > > >    -     'nose-ignore-docstring',
> > > > >    -     'nose-timer',
> > > > >    -     'parameterized',
> > > > >    -     'paramiko',
> > > > >    -     'pylint',
> > > > >    -     'pysftp',
> > > > >    -     'pywinrm',
> > > > >    -     'qds-sdk', -> should be moved to separate qubole
> > > > >    -     'rednose',
> > > > >    -     'requests_mock',
> > > > >
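> > > > > To make this concrete, pinning would just mean turning the open
> > > > > entries above into exact versions in setup.py. A very rough sketch
> > > > > (the version numbers here are only illustrative, not the pins we
> > > > > would actually pick):
> > > > >
> > > > >     # setup.py (sketch): core deps pinned to exact, known-good versions
> > > > >     # NOTE: versions below are illustrative placeholders
> > > > >     install_requires = [
> > > > >         'alembic==1.0.11',
> > > > >         'flask==1.1.1',
> > > > >         'jinja2==2.10.1',
> > > > >         'sqlalchemy==1.3.5',
> > > > >         # werkzeug is transitive (via flask), but pinning it explicitly
> > > > >         # might have avoided the breakage that started this thread
> > > > >         'werkzeug==0.15.4',
> > > > >         # ... and so on for the rest of the core list above
> > > > >     ]
> > > > >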
> > > > > J.
> > > > >
> > > > >
> > > > > On Mon, Jun 24, 2019 at 3:03 PM Ash Berlin-Taylor <a...@apache.org> wrote:
> > > > >
> > > > > > Another suggestion someone (I forget who, sorry) had was that we
> > > > > > could maintain a full list of _fully tested and supported
> > > > > > versions_ (i.e. the output of `pip freeze`) - that way people
> > > > > > _can_ use other versions if they want, but we can at least say
> > > > > > "use these versions".
> > > > > >
> > > > > > I'm not 100% sure how that would work in practice though, but
> > > > > > having it be some list we can update without having to do a
> > > > > > release is crucial.
> > > > > >
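> > > > > > In practice that list could be generated from a CI environment
> > > > > > where the full test suite passed - essentially the output of
> > > > > > `pip freeze`, sketched here in Python (the file name is made up):
> > > > > >
> > > > > >     # sketch: dump exact versions of the currently installed (tested)
> > > > > >     # environment into a "known good" list we could publish and update
> > > > > >     # without cutting a new Airflow release
> > > > > >     import pkg_resources
> > > > > >
> > > > > >     with open('known-good-requirements.txt', 'w') as out:
> > > > > >         for dist in sorted(pkg_resources.working_set,
> > > > > >                            key=lambda d: d.project_name.lower()):
> > > > > >             out.write('{}=={}\n'.format(dist.project_name, dist.version))
> > > > > >
> > > > > > Users could then pass that file to pip via its --constraint option
> > > > > > to get the tested versions, while remaining free to override them.
> > > > > >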
> > > > > > -ash
> > > > > >
> > > > > > On 24 Jun 2019, at 10:00, Jarek Potiuk <jarek.pot...@polidea.com> wrote:
> > > > > > >
> > > > > > > With the recent Sphinx problem
> > > > > > > <https://issues.apache.org/jira/browse/AIRFLOW-4841> we got back
> > > > > > > our old-time enemy. In this case sphinx-autoapi released version
> > > > > > > 1.1.0 yesterday and it started to cause our master to fail,
> > > > > > > causing a kind of emergency rush to fix it, as master (and all
> > > > > > > PRs based on it) would be broken.
> > > > > > >
> > > > > > > I think I have a proposal that can address similar problems
> > > > > > > without pushing us into emergency mode.
> > > > > > >
> > > > > > > *Context:*
> > > > > > >
> > > > > > > I wanted to return to an old discussion - how we can avoid
> > > > > > > unrelated dependencies causing emergencies on our side, where we
> > > > > > > have to quickly solve such dependency issues when they break our
> > > > > > > builds.
> > > > > > >
> > > > > > > *Change coming soon:*
> > > > > > >
> > > > > > > The problems will be partially addressed by the last stage of
> > > > > > > AIP-10 (https://github.com/apache/airflow/pull/4938 - pending
> > > > > > > only a Kubernetes test fix). It effectively freezes installed
> > > > > > > dependencies as a cached layer of the Docker image for builds
> > > > > > > which do not touch setup.py - so if setup.py does not change,
> > > > > > > the dependencies will not be updated to the latest ones.
> > > > > > >
> > > > > > > *Possibly even better long-term solution:*
> > > > > > >
> > > > > > > I think we should address it a bit better. We had a number of
> > > > > > > discussions on pinning dependencies (for example here:
> > > > > > > https://lists.apache.org/thread.html/9e775d11cce6a3473cbe31908a17d7840072125be2dff020ff59a441@%3Cdev.airflow.apache.org%3E
> > > > > > > ). I think the conclusion there was that Airflow is both a
> > > > > > > "library" (for DAGs), where dependencies should not be pinned,
> > > > > > > and an end product, where dependencies should be pinned. So it's
> > > > > > > a bit of a catch-22 situation.
> > > > > > >
> > > > > > > Looking at the problem with Sphinx, however, it came to me that
> > > > > > > maybe we can use a hybrid solution: we pin all the libraries
> > > > > > > (like Sphinx or Flask) that are used merely to build and test
> > > > > > > the end product, but we do not pin the libraries (like
> > > > > > > google-api) which are used in the context of a library (writing
> > > > > > > operators and DAGs).
> > > > > > >
> > > > > > > What do you think? Maybe that would be the best of both worlds?
> > > > > > > Then we would have to classify the dependencies and maybe
> > > > > > > restructure setup.py slightly to have an obvious distinction
> > > > > > > between those two types of dependencies.
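> > > > > > >
> > > > > > > A very rough sketch of what that split could look like in
> > > > > > > setup.py (the names and version numbers below are only
> > > > > > > illustrative):
> > > > > > >
> > > > > > >     # setup.py (sketch): pin what only builds/tests the product,
> > > > > > >     # keep the DAG/operator-facing deps as ranges
> > > > > > >     # NOTE: all versions below are illustrative placeholders
> > > > > > >
> > > > > > >     build_and_test = [   # only build/test the end product: pin hard
> > > > > > >         'sphinx==2.1.2',
> > > > > > >         'sphinx-autoapi==1.0.0',  # the 1.1.0 release broke master
> > > > > > >         'flake8==3.7.8',
> > > > > > >     ]
> > > > > > >
> > > > > > >     library_facing = [   # used from DAGs/operators: keep ranges
> > > > > > >         'google-api-python-client>=1.6.0',
> > > > > > >         # ... other provider/operator dependencies
> > > > > > >     ]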
> > > > > > >
> > > > > > > J.
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Jarek Potiuk
> > > > > > > Polidea <https://www.polidea.com/> | Principal Software
> Engineer
> > > > > > >
> > > > > > > M: +48 660 796 129 <+48660796129>
> > > > > > > [image: Polidea] <https://www.polidea.com/>
> > > > > >
> > > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Jarek Potiuk
> > > > > Polidea <https://www.polidea.com/> | Principal Software Engineer
> > > > >
> > > > > M: +48 660 796 129 <+48660796129>
> > > > > [image: Polidea] <https://www.polidea.com/>
> > > > >
> > > >
> > >
> > >
> > > --
> > >
> > > Jarek Potiuk
> > > Polidea <https://www.polidea.com/> | Principal Software Engineer
> > >
> > > M: +48 660 796 129 <+48660796129>
> > > [image: Polidea] <https://www.polidea.com/>
> > >
> >
>
>
> --
>
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
>
> M: +48 660 796 129 <+48660796129>
> [image: Polidea] <https://www.polidea.com/>
>
