Comment (from a bit of an outsider)

Fantastic document, Valentyn.

Very, very insightful and interesting. We feel a lot of the same pain in
Apache Airflow (actually even more, because we have not 20 but 620+
dependencies), but we are also a bit more advanced in how we manage
dependencies. Some of the ideas you describe have already been tried and
tested in Airflow, and some are a bit different, but we can definitely
share "principles". We are also a little higher in the "supply chain"
(i.e. the Apache Beam Python SDK is our dependency).

I left some suggestions and comments describing in detail what the same
problems look like in Airflow and how we addressed them (if we did), and I
am happy to participate in further discussions. I am "the dependency guy"
in Airflow and happy to share my experience and help work out some
problems, especially those coming from using multiple google-client
libraries and diamond dependencies. We are dealing with a similar issue
right now, where we will likely have to do a massive update of several of
our clients, hopefully with the involvement of the Composer team. I'd
love to be involved in a joint discussion with the google client team to
work out some common expectations that we can rely on when we define our
future upgrade strategy for google clients.

I will watch this thread and will be happy to spend quite some time helping
to hash it out.

BTW, you can also watch the talk I gave last year at PyWaw, "Managing
Python dependencies at Scale"
(https://www.youtube.com/watch?v=_SjMdQLP30s&t=2549s), where I explain the
approach we took, the reasoning behind it, etc.

J.


On Wed, Aug 24, 2022 at 2:45 AM Valentyn Tymofieiev via dev <
dev@beam.apache.org> wrote:

> Hi everyone,
>
> Recently, several issues [1-3] have highlighted outage risks and
> developer inconveniences due to dependency management practices in Beam
> Python.
>
> With dependabot and other tooling that we have integrated with Beam, one
> of the missing pieces seems to be having a clear guideline of how we should
> be specifying requirements for our dependencies and when and how we should
> be updating them to have a sustainable process.
>
> As a conversation starter, I put together a retrospective
> <https://docs.google.com/document/d/1gxQF8mciRYgACNpCy1wlR7TBa8zN-Tl6PebW-U8QvBk/edit?resourcekey=0-XcHRyFh4KRPkA0GsdUmU3g#>[4]
> covering a recent incident and would like to get community opinions on the
> open questions.
>
> In particular, if you have experience managing dependencies for other
> Python libraries with rich dependency chains, knowledge of available
> tooling or first hand experience dealing with other dependency issues in
> Beam, your input would be greatly appreciated.
>
> Thanks,
> Valentyn
>
> [1] https://github.com/apache/beam/issues/22218
> [2] https://github.com/apache/beam/pull/22550#issuecomment-1217348455
> [3] https://github.com/apache/beam/issues/22533
> [4]
> https://docs.google.com/document/d/1gxQF8mciRYgACNpCy1wlR7TBa8zN-Tl6PebW-U8QvBk/edit
>
