On Thu, Aug 10, 2017 at 5:06 PM, Reuven Lax <re...@google.com.invalid> wrote:
> Interestingly I've seen examples of PTransforms where the transform itself
> is unable to easily set its own coder. This happens when the transform is
> parametrized in such a way that its output coder is not determinable except
> by the caller of the PTransform. The caller can of course pass a coder into
> the constructor of the PTransform, but that's not any cleaner than simply
> calling setCoder on the output.

The argument is that having a setCoder on the output is in fact more
problematic than passing a coder to the PTransform (via the constructor, or a
builder-style method). It does introduce the ugliness that each such
PTransform must manually provide this capability, though; it'd be nice to
reduce this boilerplate.

> On Thu, Aug 10, 2017 at 4:57 PM, Eugene Kirpichov
> <kirpic...@google.com.invalid> wrote:
>
>> I've updated the guidance in PTransform Style Guide on setting coders
>> https://beam.apache.org/contribute/ptransform-style-guide/#coders
>> according to this discussion.
>> https://github.com/apache/beam-site/pull/279
>>
>> On Thu, Aug 3, 2017 at 6:27 PM Robert Bradshaw
>> <rober...@google.com.invalid> wrote:
>>
>> > On Thu, Aug 3, 2017 at 6:08 PM, Eugene Kirpichov
>> > <kirpic...@google.com.invalid> wrote:
>> > > https://github.com/apache/beam/pull/3649 has landed. The main contribution
>> > > of this PR is deprecating PTransform.getDefaultOutputCoder().
>> > >
>> > > Next steps are to get rid of all setCoder() calls in the SDK, and deprecate
>> > > setCoder().
>> > > Nearly all setCoder() calls (perhaps simply all?) I found are on the output
>> > > of mapping transforms, such as ParDo, Map/FlatMapElements, WithKeys.
>> > > I think we should simply make these transforms optionally configurable with
>> > > an output coder: e.g.
>> > > input.apply(ParDo.of(new SomeFn<>()).withOutputCoder(SomeCoder.of()))
>> > > For multi-output ParDo this is a little more complex API-wise, but doable
>> > > too.
>> > >
>> > > (another minor next step is to say in PTransform Style Guide that the
>> > > transform must set a coder on all its outputs)
>> > >
>> > > Sounds reasonable?
>> >
>> > +1
>> >
>> > I'd like to do this in a way that lowers the burden for all PTransform
>> > authors. Can't think of a better way than a special subclass of
>> > PTransform that has the setters that one would subclass...
>> >
>> > > On Thu, Aug 3, 2017 at 5:34 AM Lukasz Cwik <lc...@google.com.invalid> wrote:
>> > >
>> > >> I'm for (1) and am not sure about the feasibility of (2) without having an
>> > >> escape hatch that allows a pipeline author to specify a coder to handle
>> > >> their special case.
>> > >>
>> > >> On Tue, Aug 1, 2017 at 2:15 PM, Reuven Lax <re...@google.com.invalid> wrote:
>> > >>
>> > >> > One interesting wrinkle: I'm about to propose a set of semantics for
>> > >> > snapshotting/in-place updating pipelines. Part of this proposal is the
>> > >> > ability to write pipelines to "upgrade" snapshots to make them compatible
>> > >> > with new graphs. This relies on the ability to have two separate coders for
>> > >> > the same type - the old coder and the new coder - in order to handle the
>> > >> > case where the user has changed coders in the new pipeline.
>> > >> >
>> > >> > On Tue, Aug 1, 2017 at 2:12 PM, Robert Bradshaw
>> > >> > <rober...@google.com.invalid> wrote:
>> > >> >
>> > >> > > There are two concerns in this thread:
>> > >> > >
>> > >> > > (1) Getting rid of PCollection.setCoder(). Everyone seems in favor of this
>> > >> > > (right?)
>> > >> > >
>> > >> > > (2) Deprecating specifying Coders in favor of specifying TypeDescriptors.
>> > >> > > I'm generally in favor, but it's unclear how far we can push this through.
>> > >> > >
>> > >> > > Let's at least do (1), and separately state a preference for (2), seeing
>> > >> > > how far we can push it.
>> > >> > >
>> > >> > > On Thu, Jul 27, 2017 at 9:13 PM, Kenneth Knowles <k...@google.com.invalid>
>> > >> > > wrote:
>> > >> > >
>> > >> > > > Another thought on this: setting a custom coder to support a special data
>> > >> > > > distribution is likely often a property of the input to the pipeline.
>> > >> > > > So setting a coder during pipeline construction - more generally, when
>> > >> > > > writing a composite transform for reuse - you might not actually have the
>> > >> > > > needed information. But setting up a special indicator type descriptor lets
>> > >> > > > your users map that type descriptor to a coder that works well for their
>> > >> > > > data.
>> > >> > > >
>> > >> > > > But Robert's example of RawUnionValue seems like a deal breaker for all
>> > >> > > > approaches. It really requires .getCoder() during expand() and explicitly
>> > >> > > > building coders encoding information that is cumbersome to get into a
>> > >> > > > TypeDescriptor. While making up new type languages is a comfortable
>> > >> > > > activity for me :-) I don't think we should head down that path, for our
>> > >> > > > users' sake. So I'll stop hoping we can eliminate this pain point for now.
>> > >> > > >
>> > >> > > > Kenn
>> > >> > > >
>> > >> > > > On Thu, Jul 27, 2017 at 8:48 PM, Kenneth Knowles <k...@google.com> wrote:
>> > >> > > >
>> > >> > > > > On Thu, Jul 27, 2017 at 11:18 AM, Thomas Groh <tg...@google.com.invalid>
>> > >> > > > > wrote:
>> > >> > > > >
>> > >> > > > >> introduce a new, specialized type to represent the restricted
>> > >> > > > >> (alternatively-distributed?) data. The TypeDescriptor for this type can map
>> > >> > > > >> to the specialized coder, without having to perform a significant degree of
>> > >> > > > >> potentially wasted encoding work, plus it includes the assumptions that are
>> > >> > > > >> being made about the distribution of data.
>> > >> > > > >>
>> > >> > > > >
>> > >> > > > > This is a very cool idea, in theory :-)
>> > >> > > > >
>> > >> > > > > For complex types with a few allocations involved and/or nontrivial
>> > >> > > > > deserialization, or when a pipeline does a lot of real work, I think the
>> > >> > > > > wrapper cost won't be perceptible.
>> > >> > > > >
>> > >> > > > > But for more primitive types in pipelines that don't really do much
>> > >> > > > > computation but just move data around, I think it could matter. Certainly
>> > >> > > > > there are languages with constructs to allow type wrappers at zero cost
>> > >> > > > > (Haskell's `newtype`).
>> > >> > > > >
>> > >> > > > > This is all just speculation until we measure, like most of this thread.
>> > >> > > > >
>> > >> > > > > Kenn
>> > >> > > > >
>> > >> > > > >> > On Thu, Jul 27, 2017 at 11:00 AM, Thomas Groh <tg...@google.com.invalid>
>> > >> > > > >> > wrote:
>> > >> > > > >> >
>> > >> > > > >> > > +1 on getting rid of setCoder; just from a Java SDK perspective, my ideal
>> > >> > > > >> > > world contains PCollections which don't have a user-visible way to mutate
>> > >> > > > >> > > them.
>> > >> > > > >> > >
>> > >> > > > >> > > My preference would be to use TypeDescriptors everywhere within Pipeline
>> > >> > > > >> > > construction (where possible), and utilize the CoderRegistry everywhere to
>> > >> > > > >> > > actually extract the appropriate type. The unfortunate difficulty of having
>> > >> > > > >> > > to encode a union type and the lack of variable-length generics does
>> > >> > > > >> > > complicate that.
>> > >> > > > >> > > We could consider some way of constructing coders in the
>> > >> > > > >> > > registry from a collection of type descriptors (which should be accessible
>> > >> > > > >> > > from the point the union-type is being constructed), e.g. something like
>> > >> > > > >> > > `getCoder(TypeDescriptor output, TypeDescriptor... components)` - that does
>> > >> > > > >> > > only permit a single flat level (but since this is being invoked by the SDK
>> > >> > > > >> > > during construction it could also pass Coder...).
>> > >> > > > >> > >
>> > >> > > > >> > > On Thu, Jul 27, 2017 at 10:22 AM, Robert Bradshaw
>> > >> > > > >> > > <rober...@google.com.invalid> wrote:
>> > >> > > > >> > >
>> > >> > > > >> > > > On Thu, Jul 27, 2017 at 10:04 AM, Kenneth Knowles
>> > >> > > > >> > > > <k...@google.com.invalid> wrote:
>> > >> > > > >> > > > > On Thu, Jul 27, 2017 at 2:22 AM, Lukasz Cwik <lc...@google.com.invalid>
>> > >> > > > >> > > > > wrote:
>> > >> > > > >> > > > >>
>> > >> > > > >> > > > >> Ken/Robert, I believe users will want the ability to set the output coder
>> > >> > > > >> > > > >> because coders may have intrinsic properties where the type isn't enough
>> > >> > > > >> > > > >> information to fully specify what I want as a user.
>> > >> > > > >> > > > >> Some cases I can see are:
>> > >> > > > >> > > > >> 1) I have a cheap and fast non-deterministic coder but a different slower
>> > >> > > > >> > > > >> coder when I want to use it as the key to a GBK. For example, with a set
>> > >> > > > >> > > > >> coder, it would need to consistently order the values of the set when used
>> > >> > > > >> > > > >> as the key.
>> > >> > > > >> > > > >> 2) I know a property of the data which allows me to have a cheaper
>> > >> > > > >> > > > >> encoding. Imagine I know that all the strings have a common prefix or
>> > >> > > > >> > > > >> integers that are in a certain range, or that a matrix is sparse/dense. Not
>> > >> > > > >> > > > >> all PCollections of strings / integers / matrices in the pipeline will have
>> > >> > > > >> > > > >> this property, just some.
>> > >> > > > >> > > > >> 3) Sorting comes up occasionally, traditionally in Google this was done by
>> > >> > > > >> > > > >> sorting the encoded version of the object lexicographically during a GBK.
>> > >> > > > >> > > > >> There are good lexicographical byte representations for ASCII strings,
>> > >> > > > >> > > > >> integers, and for some IEEE number representations which could be done by
>> > >> > > > >> > > > >> the use of a special coder.
>> > >> > > > >> > > > >>
>> > >> > > > >> > > > >
>> > >> > > > >> > > > > Items (1) and (3) do not require special knowledge from the user. They are
>> > >> > > > >> > > > > easily observed properties of a pipeline. My proposal included full
>> > >> > > > >> > > > > automation for both. The suggestion is new methods
>> > >> > > > >> > > > > .getDeterministicCoder(TypeDescriptor) and
>> > >> > > > >> > > > > .getLexicographicCoder(TypeDescriptor).
>> > >> > > > >> > > >
>> > >> > > > >> > > > Completely agree--use cases (1) and (3) are an indirect use of Coders
>> > >> > > > >> > > > that are used to achieve an effect that would be better expressed
>> > >> > > > >> > > > directly.
>> > >> > > > >> > > >
>> > >> > > > >> > > > > (2) is an interesting hypothetical for massive scale where tiny incremental
>> > >> > > > >> > > > > optimization represents a lot of cost _and_ your data has sufficient
>> > >> > > > >> > > > > structure to realize a benefit _and_ it needs to be pinpointed to just some
>> > >> > > > >> > > > > PCollections. I think our experience with coders so far is that their
>> > >> > > > >> > > > > existence is almost entirely negative. It would be nice to support this
>> > >> > > > >> > > > > vanishingly rare case without inflicting a terrible pain point on the model
>> > >> > > > >> > > > > and all other users.
>> > >> > > > >> > > >
>> > >> > > > >> > > > (2) is not just about cheapness, sometimes there's other structure in
>> > >> > > > >> > > > the data we can leverage.
>> > >> > > > >> > > > Consider the UnionCoder used in
>> > >> > > > >> > > > CoGBK--RawUnionValue has an integer value that indicates the
>> > >> > > > >> > > > type of its raw Object field. Unless we want to extend the type
>> > >> > > > >> > > > language, there's not a sufficient type descriptor that can be used to
>> > >> > > > >> > > > infer the coder. I'm dubious going down the road of adding special
>> > >> > > > >> > > > cases is the right thing here.
>> > >> > > > >> > > >
>> > >> > > > >> > > > > For example, in those cases you could encode in your
>> > >> > > > >> > > > > DoFn so the type descriptor would just be byte[].
>> > >> > > > >> > > >
>> > >> > > > >> > > > As well as being an extremely cumbersome API, this would incur the
>> > >> > > > >> > > > cost of coding/decoding at that DoFn boundary even if it is fused
>> > >> > > > >> > > > away.
>> > >> > > > >> > > >
>> > >> > > > >> > > > >> On Thu, Jul 27, 2017 at 1:34 AM, Jean-Baptiste Onofré <j...@nanthrax.net>
>> > >> > > > >> > > > >> wrote:
>> > >> > > > >> > > > >>
>> > >> > > > >> > > > >> > Hi,
>> > >> > > > >> > > > >> >
>> > >> > > > >> > > > >> > That's an interesting thread and I was wondering about the relationship
>> > >> > > > >> > > > >> > between type descriptor and coder for a while ;)
>> > >> > > > >> > > > >> >
>> > >> > > > >> > > > >> > Today, in a PCollection, we can set the coder and we also have a
>> > >> > > > >> > > > >> > getTypeDescriptor(). It sounds weird to me: it should be one or the
>> > >> > > > >> > > > >> > other.
>> > >> > > > >> > > > >> >
>> > >> > > > >> > > > >> > Basically, if the Coder is not used to define the type, then I fully
>> > >> > > > >> > > > >> > agree with Eugene.
>> > >> > > > >> > > > >> >
>> > >> > > > >> > > > >> > Basically, the PCollection should define only the type descriptor, not the
>> > >> > > > >> > > > >> > coder by itself: the coder can be found using the type descriptor.
>> > >> > > > >> > > > >> >
>> > >> > > > >> > > > >> > With both coder and type descriptor on the PCollection, it sounds a bit
>> > >> > > > >> > > > >> > "decoupled" to me and it would be possible to have a coder on the
>> > >> > > > >> > > > >> > PCollection that doesn't match the type descriptor.
>> > >> > > > >> > > > >> >
>> > >> > > > >> > > > >> > I think PCollection type descriptor should be defined, and the coder
>> > >> > > > >> > > > >> > should be implicit based on this type descriptor.
>> > >> > > > >> > > > >> >
>> > >> > > > >> > > > >> > Thoughts ?
>> > >> > > > >> > > > >> >
>> > >> > > > >> > > > >> > Regards
>> > >> > > > >> > > > >> > JB
>> > >> > > > >> > > > >> >
>> > >> > > > >> > > > >> > On 07/26/2017 05:25 AM, Eugene Kirpichov wrote:
>> > >> > > > >> > > > >> >
>> > >> > > > >> > > > >> >> Hello,
>> > >> > > > >> > > > >> >>
>> > >> > > > >> > > > >> >> I've worked on a few different things recently and ran repeatedly into the
>> > >> > > > >> > > > >> >> same issue: that we do not have clear guidance on who should set the Coder
>> > >> > > > >> > > > >> >> on a PCollection: is it the responsibility of the PTransform that outputs
>> > >> > > > >> > > > >> >> it, or is it the responsibility of the user, or is it sometimes one and
>> > >> > > > >> > > > >> >> sometimes the other?
>> > >> > > > >> > > > >> >>
>> > >> > > > >> > > > >> >> I believe that the answer is "it's the responsibility of the transform" and
>> > >> > > > >> > > > >> >> moreover that ideally PCollection.setCoder() should not exist. Instead:
>> > >> > > > >> > > > >> >>
>> > >> > > > >> > > > >> >> - Require that all transforms set a Coder on the PCollections they
>> > >> > > > >> > > > >> >> produce - i.e. it should never be the responsibility of the user to "fix up"
>> > >> > > > >> > > > >> >> a coder on a PCollection produced by a transform.
>> > >> > > > >> > > > >> >>
>> > >> > > > >> > > > >> >> - Since all transforms are composed of primitive transforms, saying
>> > >> > > > >> > > > >> >> "transforms must set a Coder" means simply that all *primitive* transforms
>> > >> > > > >> > > > >> >> must set a Coder on their output.
>> > >> > > > >> > > > >> >>
>> > >> > > > >> > > > >> >> - In some cases, a primitive PTransform currently doesn't have enough
>> > >> > > > >> > > > >> >> information to infer a coder for its output collection - e.g.
>> > >> > > > >> > > > >> >> ParDo.of(DoFn<InputT, OutputT>) might be unable to infer a coder for
>> > >> > > > >> > > > >> >> OutputT. In that case such transforms should allow the user to provide a
>> > >> > > > >> > > > >> >> coder: ParDo.of(DoFn).withOutputCoder(...) [note that this differs from
>> > >> > > > >> > > > >> >> requiring the user to set a coder on the resulting collection]
>> > >> > > > >> > > > >> >>
>> > >> > > > >> > > > >> >> - Corollary: composite transforms need to only configure their primitive
>> > >> > > > >> > > > >> >> transforms (and composite sub-transforms) properly, and give them a Coder
>> > >> > > > >> > > > >> >> if needed.
>> > >> > > > >> > > > >> >>
>> > >> > > > >> > > > >> >> - Corollary: a PTransform with type parameters <FooT, BarT, ...> needs to
>> > >> > > > >> > > > >> >> be configurable with coders for all of these, because the implementation of
>> > >> > > > >> > > > >> >> the transform may change and it may introduce intermediate collections
>> > >> > > > >> > > > >> >> involving these types. However, in many cases, some of these type
>> > >> > > > >> > > > >> >> parameters appear in the type of the transform's input, e.g. a
>> > >> > > > >> > > > >> >> PTransform<PCollection<KV<FooT, BarT>>, PCollection<MooT>> will always be
>> > >> > > > >> > > > >> >> able to extract the coders for FooT and BarT from the input PCollection, so
>> > >> > > > >> > > > >> >> the user does not need to provide them. However, a coder for MooT must be
>> > >> > > > >> > > > >> >> provided. I think in most cases the transform needs to be configurable only
>> > >> > > > >> > > > >> >> with coders for its output.
>> > >> > > > >> > > > >> >>
>> > >> > > > >> > > > >> >> Here's a smooth migration path to accomplish the above:
>> > >> > > > >> > > > >> >> - Make PCollection.createPrimitiveOutputInternal() take a Coder.
>> > >> > > > >> > > > >> >> - Make all primitive transforms optionally configurable with a coder for
>> > >> > > > >> > > > >> >> their outputs, such as ParDo.of(DoFn).withOutputCoder().
>> > >> > > > >> > > > >> >> - By using the above, make all composite transforms shipped with the SDK
>> > >> > > > >> > > > >> >> set a Coder on the collections they produce; in some cases, this will
>> > >> > > > >> > > > >> >> require adding a withSomethingCoder() option to the transform and
>> > >> > > > >> > > > >> >> propagating that coder to its sub-transforms. If the option is unset,
>> > >> > > > >> > > > >> >> that's fine for now.
>> > >> > > > >> > > > >> >> - As a result of the above, get rid of all setCoder() calls in the Beam
>> > >> > > > >> > > > >> >> repo. The call will still be there, but it will just not be used anywhere
>> > >> > > > >> > > > >> >> in the SDK or examples, and we can mark it deprecated.
>> > >> > > > >> > > > >> >> - Add guidance to PTransform Style Guide in line with the above
>> > >> > > > >> > > > >> >>
>> > >> > > > >> > > > >> >> Does this sound like a good idea? I'm not sure how urgent it would be to
>> > >> > > > >> > > > >> >> actually do this, but I'd like to know whether people agree that this is a
>> > >> > > > >> > > > >> >> good goal in general.
>> > >> > > > >> > > > >> >>
>> > >> > > > >> > > > >> >
>> > >> > > > >> > > > >> > --
>> > >> > > > >> > > > >> > Jean-Baptiste Onofré
>> > >> > > > >> > > > >> > jbono...@apache.org
>> > >> > > > >> > > > >> > http://blog.nanthrax.net
>> > >> > > > >> > > > >> > Talend - http://www.talend.com
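
For anyone skimming the thread, here is a rough, self-contained sketch of the pattern being proposed: the transform takes its output coder through a builder-style method and fixes it when it produces its output, so the resulting collection never needs a setCoder(). None of the types below are Beam classes. SimpleCoder, SimpleRegistry, SimpleCollection and MapTransform are hypothetical stand-ins invented for illustration; only the method name withOutputCoder is taken from the proposal quoted above, and the real SDK API may look different.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class OutputCoderSketch {

  /** Hypothetical stand-in for a coder; this is NOT Beam's Coder class. */
  interface SimpleCoder<T> {
    byte[] encode(T value);
  }

  /** Hypothetical registry mapping a Java class to a default coder. */
  static final class SimpleRegistry {
    private final Map<Class<?>, SimpleCoder<?>> coders = new HashMap<>();

    <T> void register(Class<T> type, SimpleCoder<T> coder) {
      coders.put(type, coder);
    }

    @SuppressWarnings("unchecked")
    <T> SimpleCoder<T> lookup(Class<T> type) {
      return (SimpleCoder<T>) coders.get(type); // null when nothing is registered
    }
  }

  /** Hypothetical collection handle: the coder is fixed at construction, no setter. */
  static final class SimpleCollection<T> {
    final List<T> elements;
    final SimpleCoder<T> coder;

    SimpleCollection(List<T> elements, SimpleCoder<T> coder) {
      this.elements = elements;
      this.coder = coder;
    }
  }

  /** Hypothetical mapping transform that is optionally configurable with an output coder. */
  static final class MapTransform<InT, OutT> {
    private final Function<InT, OutT> fn;
    private final Class<OutT> outType;
    private final SimpleCoder<OutT> explicitCoder; // null unless withOutputCoder() was called

    private MapTransform(Function<InT, OutT> fn, Class<OutT> outType,
        SimpleCoder<OutT> explicitCoder) {
      this.fn = fn;
      this.outType = outType;
      this.explicitCoder = explicitCoder;
    }

    static <InT, OutT> MapTransform<InT, OutT> of(Function<InT, OutT> fn, Class<OutT> outType) {
      return new MapTransform<>(fn, outType, null);
    }

    /** Builder-style configuration; returns a new transform rather than mutating this one. */
    MapTransform<InT, OutT> withOutputCoder(SimpleCoder<OutT> coder) {
      return new MapTransform<>(fn, outType, coder);
    }

    /** The transform, not the caller, decides the coder of its output. */
    SimpleCollection<OutT> expand(List<InT> input, SimpleRegistry registry) {
      SimpleCoder<OutT> coder = explicitCoder != null ? explicitCoder : registry.lookup(outType);
      if (coder == null) {
        throw new IllegalStateException(
            "No coder for " + outType.getName() + "; configure one with withOutputCoder()");
      }
      List<OutT> out = new ArrayList<>();
      for (InT element : input) {
        out.add(fn.apply(element));
      }
      return new SimpleCollection<>(out, coder);
    }
  }

  public static void main(String[] args) {
    SimpleRegistry registry = new SimpleRegistry(); // deliberately left empty
    MapTransform<String, Integer> lengths = MapTransform.of(String::length, Integer.class);
    SimpleCoder<Integer> fixedWidth = i -> ByteBuffer.allocate(4).putInt(i).array();

    SimpleCollection<Integer> out =
        lengths.withOutputCoder(fixedWidth).expand(Arrays.asList("a", "bcd"), registry);
    System.out.println("elements: " + out.elements);
  }
}
```

The boilerplate concern raised at the top of the thread is visible here: every such transform has to carry its own coder field and its own withOutputCoder() method, which is what a shared base class might factor out.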