Great write up! Unfortunate situation :-( On Wed, Apr 5, 2017 at 3:20 PM, Stephen Sisk <[email protected]> wrote:
> Pablo - thanks for your investigation and taking the time to write this up! > > I filed https://issues.apache.org/jira/browse/BEAM-1891 for this. > > S > > On Wed, Apr 5, 2017 at 2:24 PM Ben Chambers <[email protected]> > wrote: > > Correction autovalue coder. > > On Wed, Apr 5, 2017, 2:24 PM Ben Chambers <[email protected]> wrote: > > > Serializable coder had a separate set of issues - often larger and less > > efficient. Ideally, we would have an avrocoder. > > > > On Wed, Apr 5, 2017, 2:15 PM Pablo Estrada <[email protected]> > > wrote: > > > > As a note, it seems that SerializableCoder does the trick in this case, > as > > it does not require a no-arg constructor for the class that is being > > deserialized - so perhaps we should encourage people to use that in the > > future. > > Best > > -P. > > > > On Wed, Apr 5, 2017 at 1:48 PM Pablo Estrada <[email protected]> wrote: > > > > > Hi all, > > > I was encouraged to write about my troubles to use PCollections of > > > AutoValue classes with AvroCoder; because it seems like currently, this > > is > > > not possible. > > > > > > As part of the changes to PAssert, I meant to create a SuccessOrFailure > > > class that could be passed in a PCollection to a `concludeTransform`, > > which > > > would be in charge of validating that all the assertions succeeded, and > > use > > > AvroCoder for serialization of that class. Consider this dummy example: > > > > > > @AutoValue > > > abstract class FizzBuzz { > > > ... > > > } > > > > > > class FizzBuzzDoFn extends DoFn<Integer, FizzBuzz> { > > > ... > > > } > > > > > > 1. The first problem was that the abstract class does not have any > > > attributes, so AvroCoder can not scrape them. For this, (with advice > from > > > Kenn Knowles), the Coder would need to take the AutoValue-generated > > class: > > > > > > .apply(ParDo.of(new FizzBuzzDoFn())) > > > .setCoder(AvroCoder.of((Class<FizzBuzz>) AutoValue_FizzBuzz.class)) > > > > > > 2. This errored out saying that FizzBuzz and AutoValue_FizzBuzz are > > > incompatible classes, so I just tried bypassing the type system like > so: > > > > > > .setCoder(AvroCoder.of((Class) AutoValue_FizzBuzz.class)) > > > > > > 3. This compiled properly, and encoding worked, but the problem came at > > > decoding, because Avro specifically requires the class to have a no-arg > > > constructor [1], and AutoValue-generated classes do not come with one. > > This > > > is a problem for several serialization frameworks, and we're not the > > first > > > ones to hit this [2], and the AutoValue people don't seem keen on > adding > > > this. > > > > > > Considering all that, it seems that the AutoValue-AvroCoder pair can > not > > > currently work. We'd need a serialization framework that does not > depend > > on > > > calling the no-arg constructor and then filling in the attributes with > > > reflection. I'm trying to check if SerializableCoder has different > > > deserialization techniques; but for PAssert, I just decided to use > > > POJO+AvroCoder. > > > > > > I hope my experience may be useful to others, and maybe start a > > discussion > > > on how to enable users to have AutoValue classes in their PCollections. > > > > > > Best > > > -P. > > > > > > [1] - > > > > > > http://avro.apache.org/docs/1.7.7/api/java/org/apache/avro/ > reflect/package-summary.html?is-external=true > > > [2] - https://github.com/google/auto/issues/122 > > > > > > > > > > >
