Eugene Kirpichov created BEAM-2536:
--------------------------------------

             Summary: Simplify specifying coders on PCollectionTuple
                 Key: BEAM-2536
                 URL: https://issues.apache.org/jira/browse/BEAM-2536
             Project: Beam
          Issue Type: Bug
          Components: sdk-java-core
            Reporter: Eugene Kirpichov

Currently, when using a multi-output ParDo, the user usually has to do one of the following:

1) Use an anonymous class, new TupleTag<Foo>() {}, in order to reify the Foo type and make coder inference work. A frequent problem in this case is that the anonymous class captures a large enclosing class, and either doesn't serialize at all or serializes to something bulky.
2) Explicitly call tuple.get(myTag).setCoder(...).

Both of these are suboptimal. Could we have e.g. a constructor for TupleTag that explicitly takes a TypeDescriptor? Or even a Coder? Or a family of factory methods for TupleTagList that take these? E.g.:

    in.apply(ParDo.of(...).withOutputTags(
        mainTag,
        TupleTagList.of(side1, FooCoder.of()).and(side2, BarCoder.of())));

I would suggest both: the TupleTag constructor should optionally take a TypeDescriptor, and TupleTagList.of() and .and() should optionally take a Coder.
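
To make the proposal concrete, here is a minimal sketch in plain Java, with no Beam dependency. All class names in it (SketchTag, SketchCoder, SketchTagList) are hypothetical stand-ins and not Beam classes, and a Class<T> argument stands in for TypeDescriptor as a simplifying assumption. It shows why the anonymous-subclass trick is needed for type reification today, and what an explicit-type constructor plus coder-taking of()/and() methods might look like.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.LinkedHashMap;
import java.util.Map;

final class TupleTagSketch {

  /** Hypothetical stand-in for a Coder; it only carries a name. */
  static final class SketchCoder<T> {
    final String name;

    SketchCoder(String name) {
      this.name = name;
    }
  }

  /** Hypothetical stand-in for TupleTag. */
  static class SketchTag<T> {
    // null when the element type could not be recovered at runtime
    private final Type type;

    /** Current style: T is recoverable only if the caller created a subclass. */
    SketchTag() {
      Type superType = getClass().getGenericSuperclass();
      this.type = (superType instanceof ParameterizedType)
          ? ((ParameterizedType) superType).getActualTypeArguments()[0]
          : null;
    }

    /** Proposed style (assumption): the caller supplies the type explicitly. */
    SketchTag(Class<T> explicitType) {
      this.type = explicitType;
    }

    Type elementType() {
      return type;
    }
  }

  /** Hypothetical stand-in for TupleTagList with coder-taking of()/and(). */
  static final class SketchTagList {
    private final Map<SketchTag<?>, SketchCoder<?>> coders = new LinkedHashMap<>();

    static <T> SketchTagList of(SketchTag<T> tag, SketchCoder<T> coder) {
      return new SketchTagList().and(tag, coder);
    }

    <T> SketchTagList and(SketchTag<T> tag, SketchCoder<T> coder) {
      coders.put(tag, coder);
      return this;
    }

    SketchCoder<?> coderFor(SketchTag<?> tag) {
      return coders.get(tag);
    }

    int size() {
      return coders.size();
    }
  }

  public static void main(String[] args) {
    // No subclass: the type argument is erased, so nothing can be inferred.
    SketchTag<String> plain = new SketchTag<String>();
    // Anonymous subclass: the type argument is kept in the class metadata.
    SketchTag<String> anonymous = new SketchTag<String>() {};
    // Explicit type: no anonymous class, so no enclosing instance to capture.
    SketchTag<Integer> explicit = new SketchTag<>(Integer.class);

    System.out.println("plain:     " + plain.elementType());
    System.out.println("anonymous: " + anonymous.elementType());
    System.out.println("explicit:  " + explicit.elementType());

    // Coders are attached where the tags are listed, not set afterwards.
    SketchTagList tags = SketchTagList
        .of(anonymous, new SketchCoder<String>("hypothetical-string-coder"))
        .and(explicit, new SketchCoder<Integer>("hypothetical-int-coder"));
    System.out.println("coder for explicit: " + tags.coderFor(explicit).name);
  }
}
```

In this sketch the generic signatures of of() and and() tie each coder's element type to its tag's element type, so a mismatched pair is rejected at compile time. Whether the real API should accept a TypeDescriptor, a Coder, or both is exactly the open question raised above.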