I took a look at the linkage checker and have opened a PR [1] that lets
contributors run dependency analysis on Apache Beam's Java artifacts
during upgrades.

The current PR compiles and publishes all the Java artifacts to your local
Maven repo and then runs the linkage checker against it with a specified
list of artifacts. For example, running:

  ./gradlew -Ppublishing -PjavaLinkageArtifactIds=beam-sdks-java-core,beam-sdks-java-io-jdbc :checkJavaLinkage

Produces:
Class javax.annotation.Nullable is not found;
  referenced by 1 class file
    org.apache.beam.sdk.schemas.FieldValueTypeInformation (beam-sdks-java-core-2.18.0-SNAPSHOT.jar)
Class org.brotli.dec.BrotliInputStream is not found;
  referenced by 1 class file
    org.apache.beam.repackaged.core.org.apache.commons.compress.compressors.brotli.BrotliCompressorInputStream (beam-sdks-java-core-2.18.0-SNAPSHOT.jar)
Class com.github.luben.zstd.ZstdInputStream is not found;
  referenced by 1 class file
    org.apache.beam.repackaged.core.org.apache.commons.compress.compressors.zstandard.ZstdCompressorInputStream (beam-sdks-java-core-2.18.0-SNAPSHOT.jar)
... (lots more output) ...
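
Since the report can get long, here is a rough sketch (not part of the PR) of
how one might group the missing classes by the jar that references them. The
function name and the regular expressions are my own and assume the report
looks like the excerpt in this mail; the real linkage checker output may
contain other kinds of entries that this ignores.

```python
import re
from collections import defaultdict
from typing import Dict, Set

# Assumed report shape, taken from the excerpt in this mail:
#   Class <missing class> is not found;
#     referenced by N class file(s)
#       <referencing class> (<jar file name>)
MISSING = re.compile(r"Class (\S+) is not found;")
REFERRER = re.compile(r"(\S+)\s+\((\S+\.jar)\)")


def missing_classes_by_jar(report: str) -> Dict[str, Set[str]]:
    """Map each referencing jar to the set of missing classes it refers to."""
    by_jar = defaultdict(set)
    # MISSING has one capture group, so re.split returns
    # [preamble, class1, body1, class2, body2, ...].
    parts = MISSING.split(report)
    for missing, body in zip(parts[1::2], parts[2::2]):
        # \s+ also matches newlines, so a jar name that a mail client
        # wrapped onto its own line is still paired with its class.
        for _referrer, jar in REFERRER.findall(body):
            by_jar[jar].add(missing)
    return dict(by_jar)


if __name__ == "__main__":
    import sys

    # Usage: pipe the saved report into this script on stdin.
    for jar, classes in sorted(missing_classes_by_jar(sys.stdin.read()).items()):
        print(f"{jar}: {len(classes)} missing classes")
        for name in sorted(classes):
            print(f"  {name}")
```

Grouping by jar makes it easier to see which artifact needs a dependency
added or excluded, rather than scanning the report class by class.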

I haven't tried running the linkage analysis for all Apache Beam artifacts
yet, but anyone who is interested in doing dependency clean-up or upgrades
should be able to use the PR as is.

1: https://github.com/apache/beam/pull/10184

On Wed, Nov 20, 2019 at 12:16 PM Kenneth Knowles <k...@apache.org> wrote:

> On Wed, Nov 20, 2019 at 4:05 AM Elliotte Rusty Harold <elh...@ibiblio.org>
> wrote:
>
>> BOM or no BOM is an implementation detail.
>
>
> Agreed for the most part.
>
>
>> Using com.google.cloud:libraries-bom would make dependency management
>> simpler for developers, but the real issue is whether Beam can continue to
>> work with very old versions of the many libraries it depends on. Even if
>> this is acceptable for Beam, it's unlikely to be feasible for anyone who
>> needs to mix Beam code with other code.
>
>
> I believe every version of Beam's dependencies has been, and should
> continue to be, driven by what is best for Beam's users. That does mean
> making it easy for them to use the latest compatible version of their
> favorite libraries.
>
>> There should be no self-incompatibility between Google minor version
>> releases. All the Google libraries in question follow semantic versioning.
>> E.g. Pubsub 1.43 would be fully API compatible with Pubsub 1.28, though not
>> the reverse. However there are likely to be important bug fixes in 1.43 and
>> definitely new features that 1.28 would not have. If there are any edge
>> cases where this is not true, that's a bug and if you file it against the
>> repo we'll try to fix it. We're also installing tooling to make this less
>> likely to happen by accident. However, right now any such problem is rare.
>>
>
> I'm glad we share the same ideals. If things were as good as you
> described, then we would have two good properties:
>
> 1. Users would always be able to force a newer minor version to trivially
> work around Beam's deps
> 2. Beam could always upgrade minor versions with no code change in Beam
> and no code change by users
>
> My experience is that this rarely works so simply. Generally, a user
> forces a new version of a library and it turns out that library or its
> dependencies have broken compatibility.
>
> Just reiterating that if semver really holds in these cases, then this
> proposal is fine with me. And if semver doesn't hold, I still think we
> should try to support the latest, but may also need to maintain a connector
> to support older versions that are still in wide use.
>
>
>> Looking at Beam's dependencies, the only case where there are major
>> version changes to address is Guava.
>>
>
> Beam has vendored Guava so it is mostly beside the point. Upgrading the
> vendored Guava does not interact with any of Beam's dependencies. See
> https://lists.apache.org/thread.html/c477d120a4c4626cbe675f8b03d84c6fe7938e36c8e2b55c492224cf@%3Cdev.beam.apache.org%3E
>
> Only KinesisIO and the ZetaSQL-to-Calcite translator actually have
> essential dependencies on Guava. In these cases, the version of Guava must
> necessarily be compatible with the Kinesis client and ZetaSQL,
> respectively. They may or may not be able to interop, and that is mostly
> out of our hands.
>
>> The remaining issues are pre-1.0 libraries. OpenCensus is a particular
>> thorn in my side. Ideally these should not be used, at all. However if we
>> must, we should not expose them on the Beam API surface and we need to move
>> them forward quickly as they change.
>>
>
> This might deserve its own thread. This sounds like it should be
> well-hidden, vendored, or well-marked as "experimental".
>
> Kenn
>
