On Thu, Sep 10, 2020 at 2:48 PM Brian Hulette <bhule...@google.com> wrote:
>
> On Tue, Sep 8, 2020 at 9:18 AM Robert Bradshaw <rober...@google.com> wrote:
>
>> IIRC Dataflow (and perhaps others) implicitly depend on Avro to write
>> out intermediate files (e.g. for non-shuffle Fusion breaks). Would
>> this break if we just removed it?
>>
>
> I think Dataflow would just need to declare a dependency on the new
> extension.
>

I'm not sure this would solve the underlying problem (it just pushes it
onto users and makes it more obscure). Maybe my reasoning is incorrect,
but from what I see

* Many Beam modules (e.g. dataflow, spark, file-based-io, sql, kafka,
  parquet, ...) depend on Avro.
* Using Avro 1.9 with the above modules doesn't work.

Doesn't this mean that, even if we remove avro from Beam core, a user that
uses Beam + Avro 1.9 will have issues with any of the above (fairly
fundamental) modules?

> We could mitigate this by first adding the new extension module and
> deprecating the core Beam counterpart for a release (or multiple releases).
>

+1 to Reuven's concerns here.