On Thu, Sep 10, 2020 at 2:48 PM Brian Hulette <bhule...@google.com> wrote:

>
> On Tue, Sep 8, 2020 at 9:18 AM Robert Bradshaw <rober...@google.com>
> wrote:
>
>> IIRC Dataflow (and perhaps others) implicitly depend on Avro to write
>> out intermediate files (e.g. for non-shuffle Fusion breaks). Would
>> this break if we just removed it?
>>
>
> I think Dataflow would just need to declare a dependency on the new
> extension.
>

I'm not sure this would solve the underlying problem (it just pushes it
onto users and makes it more obscure). Maybe my reasoning is incorrect, but
from what I see:

* Many Beam modules (e.g. dataflow, spark, file-based-io, sql, kafka,
parquet, ...) depend on Avro.
* Using Avro 1.9 with the above modules doesn't work.

Doesn't this mean that, even if we remove Avro from Beam core, a user who
uses Beam + Avro 1.9 will have issues with any of the above (fairly
fundamental) modules?

> We could mitigate this by first adding the new extension module and
> deprecating the core Beam counterpart for a release (or multiple releases).
>

+1 to Reuven's concerns here.
