I couldn't think of a good flow that avoids clearing the
org.apache.beam artifacts in .m2 before running the analysis.
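
As a rough illustration of that clean-up step, here is a minimal Python sketch. It assumes the default Maven local repository layout (~/.m2/repository, with the org.apache.beam group stored under org/apache/beam); the function name clear_beam_artifacts is hypothetical and not part of the PR.

```python
import shutil
from pathlib import Path


def clear_beam_artifacts(m2_root: Path) -> bool:
    """Delete <m2_root>/repository/org/apache/beam.

    Returns True if the directory existed and was removed, False otherwise.
    Assumes the default Maven local repository layout (an assumption, not
    something the PR configures).
    """
    beam_dir = m2_root / "repository" / "org" / "apache" / "beam"
    if not beam_dir.is_dir():
        return False
    # Only the Beam group directory is removed; other cached artifacts stay.
    shutil.rmtree(beam_dir)
    return True


if __name__ == "__main__":
    removed = clear_beam_artifacts(Path.home() / ".m2")
    print("removed" if removed else "nothing to remove")
```

Removing only the Beam group directory, rather than all of ~/.m2, keeps third-party artifacts cached so the subsequent publish does not have to download everything again.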

There might be a way to override the Maven local path in Gradle so that it
publishes to a temporary directory, but it wasn't obvious how to do this
from Gradle's Maven publishing plugin docs[1].

1: https://docs.gradle.org/current/userguide/publishing_maven.html

On Thu, Nov 21, 2019 at 8:43 PM Kenneth Knowles <k...@apache.org> wrote:

> If we have a bunch of leftover junk in .m2 will that pollute the analysis?
> Should we rm -rf ~/.m2 first or does it work well anyhow?
>
> On Wed, Nov 20, 2019 at 4:52 PM Luke Cwik <lc...@google.com> wrote:
>
>> I took a look at the linkage checker and have opened up this PR[1] to
>> allow contributors to aid in performing dependency analysis within Apache
>> Beam during upgrades.
>>
>> The current PR works by compiling and publishing all the Java artifacts
>> to your local Maven repo and then running the linkage checker against it
>> with a specified list of artifacts. For example, running:
>> ./gradlew -Ppublishing -PjavaLinkageArtifactIds=beam-sdks-java-core,beam-sdks-java-io-jdbc :checkJavaLinkage
>>
>> Produces:
>> Class javax.annotation.Nullable is not found;
>>   referenced by 1 class file
>>     org.apache.beam.sdk.schemas.FieldValueTypeInformation (beam-sdks-java-core-2.18.0-SNAPSHOT.jar)
>> Class org.brotli.dec.BrotliInputStream is not found;
>>   referenced by 1 class file
>>     org.apache.beam.repackaged.core.org.apache.commons.compress.compressors.brotli.BrotliCompressorInputStream (beam-sdks-java-core-2.18.0-SNAPSHOT.jar)
>> Class com.github.luben.zstd.ZstdInputStream is not found;
>>   referenced by 1 class file
>>     org.apache.beam.repackaged.core.org.apache.commons.compress.compressors.zstandard.ZstdCompressorInputStream (beam-sdks-java-core-2.18.0-SNAPSHOT.jar)
>> ... (lots more output) ...
>>
>> I haven't tried running the linkage analysis for all Apache Beam artifacts
>> yet, but anyone who is interested in doing dependency clean-up or
>> upgrades should be able to use the PR as is.
>>
>> 1: https://github.com/apache/beam/pull/10184
>>
>> On Wed, Nov 20, 2019 at 12:16 PM Kenneth Knowles <k...@apache.org> wrote:
>>
>>> On Wed, Nov 20, 2019 at 4:05 AM Elliotte Rusty Harold <elh...@ibiblio.org> wrote:
>>>
>>>> BOM or no BOM is an implementation detail.
>>>
>>>
>>> Agreed for the most part.
>>>
>>>
>>>> Using com.google.cloud:libraries-bom would make dependency management
>>>> simpler for developers, but the real issue is whether Beam can continue to
>>>> work with very old versions of the many libraries it depends on. Even if
>>>> this is acceptable for Beam, it's unlikely to be feasible for anyone who
>>>> needs to mix Beam code with other code.
>>>
>>>
>>> I believe every version of Beam's dependencies has been, and should
>>> continue to be, driven by what is best for Beam's users. That does mean
>>> making it easy for them to use the latest compatible version of their
>>> favorite libraries.
>>>
>>>> There should be no self-incompatibility between Google minor version
>>>> releases. All the Google libraries in question follow semantic versioning.
>>>> E.g. Pubsub 1.43 would be fully API compatible with Pubsub 1.28, though not
>>>> the reverse. However there are likely to be important bug fixes in 1.43 and
>>>> definitely new features that 1.28 would not have. If there are any edge
>>>> cases where this is not true, that's a bug and if you file it against the
>>>> repo we'll try to fix it. We're also installing tooling to make this less
>>>> likely to happen by accident. However, right now any such problem is rare.
>>>>
>>>
>>> I'm glad we share the same ideals. If things were as good as you
>>> described, then we would have two good properties:
>>>
>>> 1. Users would always be able to force a newer minor version to
>>> trivially work around Beam's deps
>>> 2. Beam could always upgrade minor versions with no code change in Beam
>>> and no code change by users
>>>
>>> My experience is that this rarely works so simply. Generally, a user
>>> forces a new version of a library and it turns out that library or its
>>> dependencies have broken compatibility.
>>>
>>> Just reiterating that if semver really holds in these cases, then this
>>> proposal is fine with me. And if semver doesn't hold, I still think we
>>> should try to support the latest, but may also need to maintain a connector
>>> to support older versions that are still in wide use.
>>>
>>>
>>>> Looking at Beam's dependencies, the only case where there are major
>>>> version changes to address is Guava.
>>>>
>>>
>>> Beam has vendored Guava so it is mostly beside the point. Upgrading the
>>> vendored Guava does not interact with any of Beam's dependencies. See
>>> https://lists.apache.org/thread.html/c477d120a4c4626cbe675f8b03d84c6fe7938e36c8e2b55c492224cf@%3Cdev.beam.apache.org%3E
>>>
>>> Only KinesisIO and the ZetaSQL-to-Calcite translator actually have
>>> essential dependencies on Guava. In these cases, the version of Guava must
>>> necessarily be compatible with the Kinesis client and ZetaSQL,
>>> respectively. They may or may not be able to interop, and that is mostly
>>> out of our hands.
>>>
>>>> The remaining issues are pre-1.0 libraries. OpenCensus is a particular
>>>> thorn in my side. Ideally these should not be used, at all. However if we
>>>> must, we should not expose them on the Beam API surface and we need to move
>>>> them forward quickly as they change.
>>>>
>>>
>>> This might deserve its own thread. This sounds like it should be
>>> well-hidden, vendored, or well-marked as "experimental".
>>>
>>> Kenn
>>>
>>
