Hi,

Thanks Alexey for bringing this topic up.

I'd be in favor of 3

Best

Etienne

On 12/05/2022 at 23:21, Brian Hulette wrote:
Regarding Option (3) "but keep and shade Avro for “core” needs as v.1.8.2 (still have an issue with CVEs)"

Do we actually need to keep avro in core for any reason? I thought we only had it in core for AvroCoder, schema support, and IOs, which I think are all reasonable to separate out into an extension (this would be comparable to the protobuf extension). To confirm I just grepped for files in core that import avro:

❯ grep -liIrn 'import org\.apache\.avro' sdks/java/core/src/main
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroByteBuddyUtils.java
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/ConvertHelpers.java
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/AvroRecordSchema.java
sdks/java/core/src/main/java/org/apache/beam/sdk/io/DynamicAvroDestinations.java
sdks/java/core/src/main/java/org/apache/beam/sdk/io/SerializableAvroCodecFactory.java
sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroSink.java
sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroSource.java
sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroSchemaIOProvider.java
sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
sdks/java/core/src/main/java/org/apache/beam/sdk/io/ConstantAvroDestination.java
sdks/java/core/src/main/java/org/apache/beam/sdk/coders/AvroGenericCoder.java
sdks/java/core/src/main/java/org/apache/beam/sdk/coders/AvroCoder.java

Brian

On Thu, May 12, 2022 at 2:08 PM Robert Bradshaw <rober...@google.com> wrote:

    Keeping avro in our public (core) API and as an internal dependency
    seems to be a recurring pain point. I would be all for pulling it out
    (basically option 3) and subsequently updating our internal version
    (hopefully no backwards compatibility issues here) and letting the
    extension live with a variety of versions (insofar as this is
    feasible).

    On Thu, May 12, 2022 at 10:29 AM Alexey Romanenko
    <aromanenko....@gmail.com> wrote:
    >
    > Hi everyone,
    >
    > Sorry in advance for a long email.
    > TL;DR: Let’s discuss the next steps to update the Avro
    dependency in Beam.
    >
    > I’d like to come back to an old and quite sensitive topic:
    the Apache Avro version update in Beam. Over time, we have
    already had several discussions on this (for example [1]) but
    without any concrete resolution in the end, iirc.
    >
    > As we all know, Beam still depends on the quite old Avro version
    1.8.2, and there have been some attempts to bump it to a more
    recent one. One of the main reasons to bump the Avro version,
    imho, is that the Avro 1.8.2 dependency brings several CVEs [2],
    whereas the latest Avro 1.11.0 brings only one [3].
    >
    > At the same time, this update will introduce some incompatible
    changes that Avro has made between versions. This may affect Beam
    users, and it may potentially affect transitive dependencies when
    using Beam with other projects that use Avro as well:
    > - Avro has completely moved from org.joda.time.* to
    java.time.*, so we need to adjust date/time conversions
    from/to Beam schema accordingly, since Beam schema still uses
    joda.time. Users will also have to regenerate any Java code
    already generated with avro-compiler; otherwise it won’t compile;
    > - Some minor changes in Avro dependencies and user API;
    > - Something else?
    >
    > I know that here, on the list, we have people from Avro
    community that are much more experienced in this than me - so,
    please correct me if I say something wrong or not 100% correct.
    >
    >
    > In Beam, we have also made several attempts to update Avro - for
    example, [4], [5], [6] and others.
    >
    > To make such updates easier in the future, we also discussed
    moving the Avro dependency out of core Beam [7], and there was an
    attempt to do that [8], but in the end this PR was closed with the
    resolution that it’s not actually needed and that we may just want
    to test Beam with different Avro versions [9].
    >
    > The latest work on this was a PR to support several versions of
    Avro in Beam (1.8.x and 1.9.x) [10] which still introduces some
    breaking changes for users, iirc.
    >
    > So, it seems that we are a bit stuck on this topic, though, imho,
    we need to decide how to move forward, mostly because of CVEs in
    old Avro versions and future Avro updates in Beam.
    >
    > The potential options (as I can see them):
    >
    > 1) Bump the Avro dependency to the latest version (1.11.0) or a
    possibly more recent one
    > - Pros:
    >   - latest/recent Avro dependency;
    >   - potentially easy to update in the future;
    > - Cons:
    >   - breaking change for users;
    >   - potential issues with other projects that use Avro (e.g.
    Apache Spark).
    >
    > 2) Support different Avro versions in Beam and make the Avro
    dependency provided
    > - Pros:
    >   - user decides which version to use;
    >   - easy to update in the future;
    > - Cons:
    >   - breaking change for users;
    >   - not certain that it’s feasible to implement in practice;
    >   - more tests needed to test Beam with different Avro versions.
    >
    > 3) Extract Avro as an extension, like we do for other formats,
    and update to latest Avro version, but keep and shade Avro for
    “core” needs as v.1.8.2 (still have an issue with CVEs)
    >
    > 4) Anything else?
    >
    >
    > Please share your thoughts on this and correct me if I stated
    something wrong. The goal of this discussion is to finally move
    forward on the Avro update topic.
    >
    > —
    > Alexey
    >
    >
    > [1] https://lists.apache.org/thread/bkwrbqg2nwp1xq1j57xt3kvmy93vpj9r
    > [2] https://mvnrepository.com/artifact/org.apache.avro/avro/1.8.2
    > [3] https://mvnrepository.com/artifact/org.apache.avro/avro/1.11.0
    > [4] https://github.com/apache/beam/pull/9779
    > [5] https://github.com/apache/beam/pull/17372
    > [6] https://github.com/apache/beam/pull/17246
    > [7] https://lists.apache.org/thread/fw4w6xgm05nl5cg502co97pt6cygt4on
    > [8] https://github.com/apache/beam/pull/12748
    > [9] https://lists.apache.org/thread/y76wjqprm8dyfxxfwcqbzxtht2qkrgzg
    > [10] https://github.com/apache/beam/pull/16271
    >
