Re: AvroIO.to(DynamicAvroDestinations) deprecated?

2022-09-13 Thread John Casey via user
That would be great, thanks!



Re: AvroIO.to(DynamicAvroDestinations) deprecated?

2022-09-13 Thread Steve Niemitz
Ah this is super useful context, thank you!  I can submit a couple PRs to
get AvroIO.sink up to parity if that's the way forward.



Re: AvroIO.to(DynamicAvroDestinations) deprecated?

2022-09-13 Thread John Casey via user
Hi Steve,

I've asked around, and it looks like this confusing state is due to a
migration that isn't complete (and likely won't be until Beam 3.0).

Here is the doc that explains some of the history:
https://docs.google.com/document/d/1zcF4ZGtq8pxzLZxgD_JMWAouSszIf9LnFANWHKBsZlg/edit
And a PR that implements some of the changes:
https://github.com/apache/beam/pull/3817

Based on this, AvroIO.sink is what we recommend. Please feel free to raise
issues on GitHub to account for features you're missing. In addition, if
you think they are straightforward changes, I'd be happy to discuss
designs, or look at proposed changes to make these features available.

I hope this helps,
John

On Mon, Sep 12, 2022 at 3:38 PM Steve Niemitz wrote:

> We're trying to do some semi-advanced custom logic (custom writers and
> schemas per destination) with AvroIO, and want to use
> DynamicAvroDestinations to accomplish this.
>
> However, AvroIO.to(DynamicAvroDestinations) is deprecated, and there
> doesn't seem to be any other way to accomplish what we want here.
> AvroIO.sink is much less sophisticated than the non-sink options, missing
> much of the configurability that the non-sink version has.  For example,
> there's no way to project from UserT -> OutputT with the sink version,
> only from UserT -> GenericRecord, which isn't what we want.
>
> It seems like most things would be trivial to fix or add on the
> AvroIO.sink implementation.  Is that the intended way for people to
> consume AvroIO?  I'm a little confused by FileIO.write/writeDynamic vs
> WriteFiles vs AvroIO.write; some seem deprecated, and some seem
> not-deprecated-but-not-recommended.  To add to the confusion, AvroIO.write
> uses WriteFiles, but the documentation for the deprecated
> AvroIO.to(DynamicAvroDestinations) points to FileIO.write.  Which is the
> "right" one to use?
>
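
The shape Steve describes (route each element to a destination, project it
from UserT to OutputT, and pick a writer and schema per destination) can be
modeled in a few lines of plain Java. This is only a sketch under stated
assumptions: it does not call Beam at all, and every name in it
(DynamicWriteSketch, Sink, schemaSink, schemaFor, Event, toRecord, and the
"click"/"purchase" destinations with their fields) is hypothetical. In Beam
itself the corresponding pieces would presumably be FileIO.writeDynamic
combined with AvroIO.sink, per John's recommendation above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Plain-Java model of per-destination writing. None of these names are Beam API. */
final class DynamicWriteSketch {

    /** Hypothetical input element (the "UserT" of the thread). */
    static final class Event {
        final String type;
        final String user;
        final Object detail;

        Event(String type, String user, Object detail) {
            this.type = type;
            this.user = user;
            this.detail = detail;
        }
    }

    /** Hypothetical stand-in for a file sink: accepts already-projected records. */
    interface Sink<OutputT> {
        void write(OutputT record);

        List<String> lines();
    }

    /** Made-up per-destination schemas: an ordered list of field names. */
    static List<String> schemaFor(String destination) {
        return destination.equals("click") ? List.of("user", "url") : List.of("user", "amount");
    }

    /** Sink that emits only the fields named in its schema, in schema order. */
    static Sink<Map<String, Object>> schemaSink(List<String> schema) {
        List<String> out = new ArrayList<>();
        return new Sink<Map<String, Object>>() {
            @Override
            public void write(Map<String, Object> record) {
                List<String> cells = new ArrayList<>();
                for (String field : schema) {
                    cells.add(String.valueOf(record.get(field)));
                }
                out.add(String.join(",", cells));
            }

            @Override
            public List<String> lines() {
                return out;
            }
        };
    }

    /** Projection from the user type to the output record type (UserT -> OutputT). */
    static Map<String, Object> toRecord(Event e) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("user", e.user);
        record.put(schemaFor(e.type).get(1), e.detail);
        return record;
    }

    /**
     * Routes each element to a destination, projects it from UserT to OutputT,
     * and writes it with a sink that is created once per destination.
     */
    static <UserT, DestT, OutputT> Map<DestT, Sink<OutputT>> writeDynamic(
            List<UserT> input,
            Function<UserT, DestT> destinationFn,
            Function<UserT, OutputT> outputFn,
            Function<DestT, Sink<OutputT>> sinkFn) {
        Map<DestT, Sink<OutputT>> sinks = new LinkedHashMap<>();
        for (UserT element : input) {
            DestT destination = destinationFn.apply(element);
            sinks.computeIfAbsent(destination, sinkFn).write(outputFn.apply(element));
        }
        return sinks;
    }

    /** Runs the sketch on three hypothetical events, keyed by event type. */
    static Map<String, Sink<Map<String, Object>>> demo() {
        List<Event> events = List.of(
            new Event("click", "a", "/home"),
            new Event("purchase", "b", 5),
            new Event("click", "c", "/cart"));
        return writeDynamic(
            events,
            e -> e.type,                           // destination per element
            DynamicWriteSketch::toRecord,          // UserT -> OutputT
            dest -> schemaSink(schemaFor(dest)));  // sink and schema per destination
    }
}
```

With the three sample events, DynamicWriteSketch.demo() produces two
destinations: "click" holding the lines a,/home and c,/cart, and "purchase"
holding the line b,5. The point of the model is that the projection function
and the per-destination sink factory are independent parameters, which is the
configurability the original question says is missing from the sink route.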