+1
Definitively a confusing API when you first jump on building a PTransform.
For such API change, the sooner the better, I think.

On Wed, Dec 7, 2016 at 10:37 PM, Kenneth Knowles <[email protected]>
wrote:

> Hi all,
>
> I want to bring up another major backwards-incompatible change before it is
> too late, to resolve [BEAM-438].
>
> Summary: Leave PInput.apply the same but rename PTransform.apply to
> PTransform.expand. I have opened [PR #1538] just for reference (it took 30
> seconds using IDE automated refactor)
>
> This change affects *PTransform authors* but does *not* affect pipeline
> authors.
>
> This issue was filed a long time ago. It has been a problem many times with
> actual users since before Beam started incubating. This is what goes wrong
> (often):
>
>    PCollection<Foo> input = ...
>    PTransform<PCollection<Foo>, ...> transform = ...
>
>    transform.apply(input)
>
> This type checks and even looks perfectly normal. Do you see the error?
>
> ... what we need the user to write is:
>
>     input.apply(transform)
>
> What a confusing difference! After all, the first one type-checks and the
> first one is how you apply a Function or Predicate or SerializableFunction,
> etc. But it is broken. With transform.apply(input) the transform is not
> registered with the pipeline at all.
>
> We obviously can't (and don't want to) change the most core way that
> pipeline authors use Beam, so PInput.apply (aka PCollection.apply) must
> remain the same. But we do need a way to make it impossible to mix these
> up.
>
> The simplest way I can think of is to choose a new name for the other
> method involved. Users probably won't write transform.expand(input) since
> they will never have seen it in any examples, etc. This will just make
> PTransform authors need to do a global rename, and the type system will
> direct them to all cases so there is no silent failure possible.
>
> What do you think?
>
> Kenn
>
> [BEAM-438] https://issues.apache.org/jira/browse/BEAM-438
> [PR #1538] https://github.com/apache/incubator-beam/pull/1538
>
> p.s. there is a really amusing and confusing call chain: PCollection.apply
> -> Pipeline.applyTransform -> Pipeline.applyInternal ->
> PipelineRunner.apply -> PTransform.apply
>
> After this change and work to get the runner out of the loop, it becomes
> PCollection.apply -> Pipeline.applyTransform -> PTransform.expand
>



-- 
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 6602747925
e: [email protected]
w: http://redlink.co

Reply via email to