Hi David,

This requires some thought. Avoiding breaking changes is more than a
preference.

As I understand it, the problem is not dependency incompatibilities in the
Beam Java SDK, but self-incompatibility in Google's libraries across
releases. It makes sense - inside Google's "monorepo" one does not need to
be as careful, so these things are bound to happen.

Nonetheless, in practice, breaking changes == forks. The terms "upgrade" and
"downgrade" are misleading in this case. The situation is more like a
milder version of Python 2 vs 3.

So, taking "Pubsub 1.28" and "Pubsub 1.43", which are simply different*,
which one(s) should Beam support? Should we choose to support only the
versions of the libraries that Google suggests via its BOM? Can we manage
to cleverly support both with a single PubsubIO, like we do with Flink and
ElasticSearch? Given Google's track record, there will surely be more
mutually incompatible versions to come. Any popular storage system that
wants to make breaking changes will need a connector at least for the prior
dominant version and the new ascending version, or else harm users.

If we can pin to a version of the BOM with no breaking changes in Beam or
its dependencies, then this is all a non-issue.

If adopting the BOM is a breaking change, then it would be a radical new
policy. It makes some sense, since Beam users who care about GCP connectors
are probably willing to adhere to Google's recommendations even if they
have to adjust their code one time. But notably, there are plenty of Beam
users who upgrade their Beam jars without recompilation, so the breakage
would be more severe for them. And presumably any later change to the BOM
version would also include breaking changes. Would we establish a policy of
changing it? Or would we basically never change it after the first breaking
change to pin it?

I think the scope of this proposal must be severely limited. It should have
zero impact on users outside of the GCP connectors and Dataflow runner -
any other Beam use of Google's OSS utility libraries is out of scope. Even
so, I am not sure this is best for users overall.

There's a lot to consider and flesh out. I'm not even sure how / if we can
get the data needed to guide these decisions.

Kenn

*I just made up the version numbers

On Tue, Nov 19, 2019 at 5:23 PM David Cavazos <dcava...@google.com> wrote:

> Hi Beamers,
>
> I was recently part of a discussion about some dependency
> incompatibilities in the Java SDK, specifically in the GRPC versions when
> trying to use one of the Google Cloud client libraries as part of a Beam
> pipeline. Their workaround was downgrading to an older version of the
> client library to match Beam's version of the GRPC library. However, this
> would not have been possible if they *needed* the newer version for any
> reason.
>
> I'm aware that Java development environments usually prefer to hardcode
> versions to avoid breaking changes, but it would be great to have the
> latest versions of dependencies that could be *shared* with other
> libraries, like the GRPC libraries.
>
> It looks like the Google Cloud client library team has been aware of this
> problem, as well as the tricky interactions between the hundreds of
> libraries they offer. They mentioned that they are starting to roll out a
> GCP Libraries BOM
> <https://github.com/GoogleCloudPlatform/cloud-opensource-java/wiki/The-Google-Cloud-Platform-Libraries-BOM>
> to help everyone have up-to-date versions of their libraries, including
> *guava*, *protobuf*, *grpc-java*, *google-http-java-client*, and
> *google-cloud-java*.
>
> Would everyone feel comfortable using the BOM to manage the Google
> Cloud dependency versions? If so, is there anyone comfortable with Gradle
> who is willing to make these changes?
>
> Cheers!
> David
>
