I couldn't come up with a good flow that avoids clearing the org.apache.beam artifacts in .m2 before running the analysis.
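
For what it's worth, a minimal sketch of that clearing step, assuming the standard Maven repository layout in which artifacts with group id org.apache.beam sit under ~/.m2/repository/org/apache/beam. The function name clear_beam_artifacts is made up for illustration and is not part of the PR.

```shell
# Remove previously published Beam artifacts from a local Maven repository.
# Assumption: the repository uses the standard Maven layout, so everything
# with group id org.apache.beam lives under <repo>/org/apache/beam.
# clear_beam_artifacts is a hypothetical helper, not something the PR provides.
clear_beam_artifacts() {
  # Default to the usual location when no (or an empty) path is given.
  repo="${1:-$HOME/.m2/repository}"
  rm -rf "$repo/org/apache/beam"
}

# Usage (not run here): clear stale artifacts, then run the check.
# clear_beam_artifacts
# ./gradlew -Ppublishing \
#   -PjavaLinkageArtifactIds=beam-sdks-java-core,beam-sdks-java-io-jdbc \
#   :checkJavaLinkage
```

This only removes the Beam group directory, so other cached dependencies in .m2 are left alone and do not need to be downloaded again.
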
There might be a way to override the Maven local path in Gradle so that it publishes to a temporary directory, but it wasn't obvious how to do this from the docs for Gradle's Maven publishing plugin[1].

1: https://docs.gradle.org/current/userguide/publishing_maven.html

On Thu, Nov 21, 2019 at 8:43 PM Kenneth Knowles <k...@apache.org> wrote:

> If we have a bunch of leftover junk in .m2 will that pollute the analysis?
> Should we rm -rf ~/.m2 first or does it work well anyhow?
>
> On Wed, Nov 20, 2019 at 4:52 PM Luke Cwik <lc...@google.com> wrote:
>
>> I took a look at the linkage checker and have opened up this PR[1] to
>> allow contributors to aid in performing dependency analysis within Apache
>> Beam during upgrades.
>>
>> The current PR works by compiling and publishing all the Java artifacts
>> to your local maven repo and then runs the linkage checker against it with
>> a specified list of artifacts. For example by running:
>> ./gradlew -Ppublishing -PjavaLinkageArtifactIds=beam-sdks-java-core,beam-sdks-java-io-jdbc :checkJavaLinkage
>>
>> Produces:
>> Class javax.annotation.Nullable is not found;
>> referenced by 1 class file
>> org.apache.beam.sdk.schemas.FieldValueTypeInformation
>> (beam-sdks-java-core-2.18.0-SNAPSHOT.jar)
>> Class org.brotli.dec.BrotliInputStream is not found;
>> referenced by 1 class file
>> org.apache.beam.repackaged.core.org.apache.commons.compress.compressors.brotli.BrotliCompressorInputStream
>> (beam-sdks-java-core-2.18.0-SNAPSHOT.jar)
>> Class com.github.luben.zstd.ZstdInputStream is not found;
>> referenced by 1 class file
>> org.apache.beam.repackaged.core.org.apache.commons.compress.compressors.zstandard.ZstdCompressorInputStream
>> (beam-sdks-java-core-2.18.0-SNAPSHOT.jar)
>> ... (lots more output) ...
>>
>> I haven't tried running the linker analysis for all Apache Beam artifacts
>> yet but for anyone who is interested in doing dependency clean-up or
>> upgrades should be able to use the PR as is.
>>
>> 1: https://github.com/apache/beam/pull/10184
>>
>> On Wed, Nov 20, 2019 at 12:16 PM Kenneth Knowles <k...@apache.org> wrote:
>>
>>> On Wed, Nov 20, 2019 at 4:05 AM Elliotte Rusty Harold <elh...@ibiblio.org> wrote:
>>>
>>>> BOM or no BOM is an implementation detail.
>>>
>>>
>>> Agreed for the most part.
>>>
>>>
>>>> Using com.google.cloud:libraries-bom would make dependency management
>>>> simpler for developers, but the real issue is whether Beam can continue to
>>>> work with very old versions of the many libraries it depends on. Even if
>>>> this is acceptable for Beam, it's unlikely to be feasible for anyone who
>>>> needs to mix Beam code with other code.
>>>
>>>
>>> I believe every version of Beam's dependencies has been, and should
>>> continue to be, driven by what is best for Beam's users. That does mean
>>> making it easy for them to use the latest compatible version of their
>>> favorite libraries.
>>>
>>>> There should be no self-incompatibility between Google minor version
>>>> releases. All the Google libraries in question follow semantic versioning.
>>>> E.g. Pubsub 1.43 would be fully API compatible with Pubsub 1.28, though not
>>>> the reverse. However there are likely to be important bug fixes in 1.43 and
>>>> definitely new features that 1.28 would not have. If there are any edge
>>>> cases where this is not true, that's a bug and if you file it against the
>>>> repo we'll try to fix it. We're also installing tooling to make this less
>>>> likely to happen by accident. However, right now any such problem is rare.
>>>>
>>>
>>> I'm glad we share the same ideals. If things were as good as you
>>> described, then we would have two good properties:
>>>
>>> 1. Users would always be able to force a newer minor version to
>>>    trivially work around Beam's deps
>>> 2. Beam could always upgrade minor versions with no code change in Beam
>>>    and no code change by users
>>>
>>> My experience is that this rarely works so simply. Generally, a user
>>> forces a new version of a library and it turns out that library or its
>>> dependencies has broken compatibility.
>>>
>>> Just reiterating that if semver really holds in these cases, then this
>>> proposal is fine with me. And if semver doesn't hold, I still think we
>>> should try to support the latest, but may also need to maintain a connector
>>> to support older versions that are still in wide use.
>>>
>>>
>>>> Looking at Beam's dependencies, the only case where there are major
>>>> version changes to address is Guava.
>>>>
>>>
>>> Beam has vendored Guava so it is mostly beside the point. Upgrading the
>>> vendored Guava does not interact with any of Beam's dependencies. See
>>> https://lists.apache.org/thread.html/c477d120a4c4626cbe675f8b03d84c6fe7938e36c8e2b55c492224cf@%3Cdev.beam.apache.org%3E
>>>
>>> Only KinesisIO and the ZetaSQL-to-Calcite translator actually have
>>> essential dependencies on Guava. In these cases, the version of Guava must
>>> necessarily be compatible with the Kinesis client and ZetaSQL,
>>> respectively. They may or may not be able to interop, and that is mostly
>>> out of our hands.
>>>
>>>> The remaining issues are pre-1.0 libraries. OpenCensus is a particular
>>>> thorn in my side. Ideally these should not be used, at all. However if we
>>>> must, we should not expose them on the Beam API surface and we need to move
>>>> them forward quickly as they change.
>>>>
>>>
>>> This might deserve its own thread. This sounds like it should be
>>> well-hidden, vendored, or well-marked as "experimental".
>>>
>>> Kenn
>>>
>>
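
On the question at the top of this thread about publishing to a temporary directory instead of ~/.m2: one possible sketch relies on the maven.repo.local system property, which Gradle consults when locating the Maven local repository. It is an untested assumption that both the publishing step and the linkage check in the PR pick up that location. The helper name with_temp_maven_repo is made up for illustration.

```shell
# Run a command with -Dmaven.repo.local pointing at a throwaway directory,
# then delete that directory regardless of the command's outcome.
# Assumption (not verified against the PR): publishing and :checkJavaLinkage
# both resolve the local Maven repository through Gradle's mavenLocal logic,
# which honors the maven.repo.local system property.
# with_temp_maven_repo is a hypothetical helper name.
with_temp_maven_repo() {
  tmp_repo=$(mktemp -d) || return 1
  # Append the override as the last argument of the wrapped command.
  "$@" "-Dmaven.repo.local=$tmp_repo"
  status=$?
  rm -rf "$tmp_repo"
  return "$status"
}

# Usage (not run here):
# with_temp_maven_repo ./gradlew -Ppublishing \
#   -PjavaLinkageArtifactIds=beam-sdks-java-core,beam-sdks-java-io-jdbc \
#   :checkJavaLinkage
```

If this holds, leftover junk in ~/.m2 cannot pollute the analysis because nothing is read from or written to it, at the cost of re-downloading any dependencies that are resolved through the local repository.
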