Hey Jarek,

sounds good, but actually I would probably go with pinning everything by
default and have a "Dependency Bot" test new releases of packages.
But given the large amount of computing (= costs) our CI pipeline already
uses, we cannot set up a Dependency Bot at the moment, right? :/

So for a start I think something like `all-pinned` works :)
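
To make the idea a bit more concrete, here is a minimal sketch of how such
`-pinned` extras could be generated. It is only an illustration and not how
Airflow's setup.py works: the `EXTRAS` and `FROZEN` dictionaries, all version
numbers in them, and the `project_name` / `with_pinned_extras` helpers are
made up for the example.

```python
# Hypothetical sketch: derive "<name>-pinned" extras from the regular extras.
# EXTRAS, FROZEN and every version number below are illustrative placeholders,
# not values taken from Airflow's setup.py.
import re

EXTRAS = {
    "gcp_api": ["google-api-python-client>=1.6.0", "pandas-gbq"],
    "all": ["google-api-python-client>=1.6.0", "pandas-gbq", "werkzeug"],
}

# Versions assumed to work together, e.g. collected from `pip freeze`
# on a build where all tests passed.
FROZEN = {
    "google-api-python-client": "1.7.10",
    "pandas-gbq": "0.10.0",
    "werkzeug": "0.15.4",
}


def project_name(requirement):
    """Return the bare project name of a simple requirement string."""
    return re.split(r"[<>=!~\[; ]", requirement, maxsplit=1)[0].strip()


def with_pinned_extras(extras, frozen):
    """Return the extras plus a '<name>-pinned' twin with exact versions."""
    result = dict(extras)
    for extra, requirements in extras.items():
        pinned = []
        for requirement in requirements:
            name = project_name(requirement)
            if name not in frozen:
                # Fail loudly rather than ship a half-pinned extra.
                raise KeyError(f"no frozen version recorded for {name!r}")
            pinned.append(f"{name}=={frozen[name]}")
        result[f"{extra}-pinned"] = pinned
    return result


extras_require = with_pinned_extras(EXTRAS, FROZEN)
print(extras_require["gcp_api-pinned"])
```

The resulting dictionary could then be passed as `extras_require` to
`setup()`, so the unpinned extras stay available next to the pinned ones.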

Kind regards,
Felix

On 01/08/2019 at 19:05, Jarek Potiuk wrote:
> Hello Everyone,
>
> Just to revive the thread - we had a discussion with Ash today after
> today's small "spanner" drama, and we came up with a possible solution.
>
> This is something we have yet to try, but it seems that it should be
> possible to generate additional "pinned" extras (pinned, gcp_api-pinned
> etc.) - it could also be "frozen" instead of "pinned" if the name sounds
> better.
>
> This way you would be able to run:
>
>     - `pip install airflow[all-pinned]==1.10.4`
>     - `pip install airflow[gcp_api-pinned]==1.10.4`
>     - ...
>
> This way it will always work, no matter whether new versions of
> dependencies are released. It will install the "frozen" versions of
> dependencies that we know work for sure. We could update the
> documentation to add this as the recommended method of standalone
> installation. Then if you need some other (newer) set of dependencies,
> you could run a custom pip install to fix certain dependencies.
>
> What do you think? Would that work for the users of Airflow?
>
> J.
>
> On Tue, Jul 9, 2019 at 9:06 PM Driesprong, Fokko <fo...@driesprong.frl>
> wrote:
>
>> Hi Jarek,
>>
>> Thanks for bringing this up. I certainly think this is a good idea.
>> Unfortunately I'm on a plane right now, so I'm unable to read the Google
>> doc.
>>
>> GitHub recently acquired Dependabot, which even supports automatic updates
>> of dependencies. Then we at least know when something breaks. The only
>> problem right now is that this bot isn't allowed by the ASF policies since
>> it requires write access to the repository.
>>
>> Regarding semver: I often see packages changing the public API in a
>> minor update without any deprecation notice. Because of that it is
>> impossible to make this watertight, but at least a more structured process
>> using something like Dependabot would be a big plus!
>>
>> Cheers, Fokko
>>
>>
>>
>> On Sun, Jul 7, 2019 at 11:34, Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>>
>>> All for a deeper release-cycle discussion. I think after 1.10.4 is out we
>>> should discuss, agree on and document the release scheme we are going to
>>> use. Semver and patching seem like a good idea.
>>>
>>> We already have quite a bit of experience backporting to the 1.10.x
>>> branch and it was surprisingly easy - small, focused commits help with
>>> that. And if we limit patches to dependency updates and security fixes
>>> only, I don't think it will be a lot of effort.
>>>
>>> Bots and automation are definitely something we should do. The pyup bot
>>> is great, for one, for automating upgrades of pinned dependencies. We
>>> have used it in Oozie-to-airflow for quite some time and it takes almost
>>> no time to upgrade deps regularly:
>>>
>>> https://github.com/GoogleCloudPlatform/oozie-to-airflow/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Aclosed+pyup
>>>
>>> Those are automated PRs we got from pyup, and it was enough to just
>>> "approve" + "merge" after we saw that all the tests passed with the new
>>> version.
>>>
>>> J.
>>>
>>>
>>>
>>> On Sat, Jul 6, 2019 at 9:24 PM Philippe Gagnon <philgagn...@gmail.com>
>>> wrote:
>>>
>>>> I am +1 on pinning core packages, even though this adds a bit of manual
>>>> labor for maintenance. This latest werkzeug issue highlights why this is
>>>> a good idea.
>>>>
>>>> Also +1 on changing the versioning scheme to something more akin to
>>>> semver. The current scheme basically does not support patch-only
>>>> releases, and a 4-part version notation seems a bit much. Overall, I
>>>> think that patch-only releases would make the project healthier.
>>>>
>>>> Two points though:
>>>>
>>>> 1. I think that there should be a more in-depth discussion about
>>>> clarifying the release lifecycle policy.
>>>> 2. This implies a lot more backport-related work, which is a bit of a
>>>> burden since it is both tedious and boring. Perhaps we could look into
>>>> having a bot help out with this (similar to
>>>> https://github.com/miss-islington)?
>>>>
>>>> On Sat, Jul 6, 2019 at 1:04 PM Jarek Potiuk <jarek.pot...@polidea.com>
>>>> wrote:
>>>>
>>>>> I think the recent case with werkzeug calls for action here (also see
>>>>> https://issues.apache.org/jira/browse/AIRFLOW-4903 ). We again ended up
>>>>> with a released Airflow version that cannot be installed easily because
>>>>> of an upgrade of some transitive dependencies.
>>>>>
>>>>> I think this is something we should at least consider for the 2.*
>>>>> version. The problem is with simply running
>>>>> 'pip install airflow==1.10.3'. Right now this will not work - you have
>>>>> to hack it and manually upgrade deps (like
>>>>> https://github.com/godatadriven/whirl/issues/50).
>>>>>
>>>>> I really do not like that changes beyond our control impact a release
>>>>> we have already made (and that is out there in pip).
>>>>>
>>>>> I recently read the nice writeup
>>>>>
>>>>> https://docs.google.com/document/d/1x_VrNtXCup75qA3glDd2fQOB2TakldwjKZ6pXaAjAfg/edit
>>>>>
>>>>> about Python dependency problems, and I think the only solution is to
>>>>> pin the "core" packages. This likely means that we have to be ready to
>>>>> release sub-releases with security dependencies updated (like 1.10.4.1
>>>>> maybe), or change semantics a bit towards semver and start releasing
>>>>> 2.0.0, 2.1.0 and then release security updates as 2.0.1 etc. If those
>>>>> 2.0.1 etc. are released only because of dependency updates/security
>>>>> bugfixes and some critical problems, and if we automate it, I don't
>>>>> think it would be a great problem to release those security-patched
>>>>> versions. We can have services like pyup (https://pyup.io/) or even
>>>>> GitHub itself monitor dependencies for us and create PRs automatically
>>>>> to update them.
>>>>>
>>>>> Would someone actually complain if any of the "core" packages
>>>>> (install_requires + devel) below got pinned? I am not sure that would
>>>>> be a big problem for anyone, and even if you need (in your operator)
>>>>> some newer version, you can always upgrade it afterwards and ignore the
>>>>> fact that Airflow has it pinned.
>>>>>
>>>>> Here are the dependencies that are the "core" ones:
>>>>>
>>>>> install_requires:
>>>>>
>>>>>     -             'alembic',
>>>>>     -             'cached_property',
>>>>>     -             'configparser',
>>>>>     -             'croniter',
>>>>>     -             'dill',
>>>>>     -             'dumb-init',
>>>>>     -             'flask',
>>>>>     -             'flask-appbuilder',
>>>>>     -             'flask-caching',
>>>>>     -             'flask-login',
>>>>>     -             'flask-swagger',
>>>>>     -             'flask-wtf',
>>>>>     -             'funcsigs',
>>>>>     -             'gitpython',
>>>>>     -             'gunicorn',
>>>>>     -             'iso8601',
>>>>>     -             'json-merge-patch',
>>>>>     -             'jinja2',
>>>>>     -             'lazy_object_proxy',
>>>>>     -             'markdown',
>>>>>     -             'pendulum',
>>>>>     -             'psutil',
>>>>>     -             'pygments',
>>>>>     -             'python-daemon',
>>>>>     -             'python-dateutil',
>>>>>     -             'requests',
>>>>>     -             'setproctitle',
>>>>>     -             'sqlalchemy',
>>>>>     -             'tabulate',
>>>>>     -             'tenacity',
>>>>>     -             'text-unidecode',
>>>>>     -             'thrift',
>>>>>     -             'tzlocal',
>>>>>     -             'unicodecsv',
>>>>>     -             'zope.deprecation',
>>>>>
>>>>> Devel:
>>>>>
>>>>>     -     'beautifulsoup4',
>>>>>     -     'click',
>>>>>     -     'codecov',
>>>>>     -     'flake8',
>>>>>     -     'freezegun',
>>>>>     -     'ipdb',
>>>>>     -     'jira',
>>>>>     -     'mongomock',
>>>>>     -     'moto',
>>>>>     -     'nose',
>>>>>     -     'nose-ignore-docstring',
>>>>>     -     'nose-timer',
>>>>>     -     'parameterized',
>>>>>     -     'paramiko',
>>>>>     -     'pylint',
>>>>>     -     'pysftp',
>>>>>     -     'pywinrm',
>>>>>     -     'qds-sdk', -> should be moved to separate qubole
>>>>>     -     'rednose',
>>>>>     -     'requests_mock',
>>>>>
>>>>> J.
>>>>>
>>>>>
>>>>> On Mon, Jun 24, 2019 at 3:03 PM Ash Berlin-Taylor <a...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Another suggestion someone (I forget who, sorry) had was that we could
>>>>>> maintain a full list of _fully tested and supported versions_ (i.e. the
>>>>>> output of `pip freeze`) - that way people _can_ use other versions if
>>>>>> they want, but we can at least say "use these versions".
>>>>>>
>>>>>> I'm not 100% sure how that would work in practice though, but having it
>>>>>> be some list we can update without having to do a release is crucial.
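
One way this could work in practice, as a rough sketch only: keep the list as
a plain text file in `pip freeze` format, outside of the release, and have a
small script compare it with what is actually installed. The package versions
and the `parse_freeze` / `find_mismatches` helpers below are hypothetical and
only meant to illustrate the idea.

```python
# Hypothetical sketch: compare installed versions against a maintained list
# of tested versions kept in `pip freeze` format ("name==version" per line).
# All package versions below are illustrative placeholders.


def parse_freeze(text):
    """Parse `pip freeze`-style text into a {name: version} dict."""
    pinned = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "==" not in line:
            continue  # skip blank lines and entries without an exact pin
        name, version = line.split("==", 1)
        pinned[name.strip().lower()] = version.strip()
    return pinned


def find_mismatches(supported, installed):
    """Return {name: (installed, supported)} for versions that differ."""
    mismatches = {}
    for name, version in installed.items():
        expected = supported.get(name)
        if expected is not None and expected != version:
            mismatches[name] = (version, expected)
    return mismatches


SUPPORTED = parse_freeze("""
# versions the release was tested with
Flask==1.0.3
Werkzeug==0.14.1
""")

# In real use this text would come from running `pip freeze`.
INSTALLED = parse_freeze("Flask==1.0.3\nWerkzeug==0.15.0\n")

print(find_mismatches(SUPPORTED, INSTALLED))
```

A list in this format could also be handed to pip directly through its
`-c` / `--constraint` option, so that installs pick the tested versions
without a new release being needed.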
>>>>>>
>>>>>> -ash
>>>>>>
>>>>>>> On 24 Jun 2019, at 10:00, Jarek Potiuk <jarek.pot...@polidea.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> With the recent Sphinx problem
>>>>>>> <https://issues.apache.org/jira/browse/AIRFLOW-4841> we got back our
>>>>>>> old-time enemy. In this case sphinx autoapi 1.1.0 was released
>>>>>>> yesterday and it started to cause our master to fail, causing a kind
>>>>>>> of emergency rush to fix it, as master (and all PRs based on it) would
>>>>>>> be broken.
>>>>>>>
>>>>>>> I think I have a proposal that can address similar problems without
>>>>>>> pushing us into emergency mode.
>>>>>>>
>>>>>>> *Context:*
>>>>>>>
>>>>>>> I wanted to return to an old discussion - how we can prevent
>>>>>>> unrelated dependencies from causing emergencies on our side, where we
>>>>>>> have to quickly solve such dependency issues when they break our
>>>>>>> builds.
>>>>>>>
>>>>>>> *Change coming soon:*
>>>>>>>
>>>>>>> The problems will be partially addressed with the last stage of AIP-10
>>>>>>> (https://github.com/apache/airflow/pull/4938 - pending only a
>>>>>>> Kubernetes test fix). It effectively freezes installed dependencies as
>>>>>>> a cached layer of the docker image for builds which do not touch
>>>>>>> setup.py - so if setup.py does not change, the dependencies will not be
>>>>>>> updated to the latest ones.
>>>>>>>
>>>>>>> *Possibly even better long-term solution:*
>>>>>>>
>>>>>>> I think we should address it a bit better. We had a number of
>>>>>>> discussions on pinning dependencies (for example here
>>>>>>> <https://lists.apache.org/thread.html/9e775d11cce6a3473cbe31908a17d7840072125be2dff020ff59a441@%3Cdev.airflow.apache.org%3E>).
>>>>>>> I think the conclusion there was that Airflow is both a "library" (for
>>>>>>> DAGs) - where dependencies should not be pinned - and an end product
>>>>>>> (where the dependencies should be pinned). So it's a bit of a catch-22
>>>>>>> situation.
>>>>>>>
>>>>>>> Looking at the problem with Sphinx, however, it occurred to me that
>>>>>>> maybe we can use a hybrid solution. We pin all the libraries (like
>>>>>>> Sphinx or Flask) that are used merely to build and test the end
>>>>>>> product, but we do not pin the libraries (like google-api) which are
>>>>>>> used in the context of a library (writing the operators and DAGs).
>>>>>>>
>>>>>>> What do you think? Maybe that will be the best of both worlds? Then we
>>>>>>> would have to classify the dependencies and maybe restructure setup.py
>>>>>>> slightly to have an obvious distinction between those two types of
>>>>>>> dependencies.
>>>>>>>
>>>>>>> J.
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Jarek Potiuk
>>>>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>>>>>
>>>>>>> M: +48 660 796 129 <+48660796129>
>>>>>>> [image: Polidea] <https://www.polidea.com/>
