A recent bug with SqlTransform on Dataflow Runner V2 [1] revealed an
interesting ambiguity in the Beam model: it's not clear if a composite
transform is allowed to have zero sub-transforms [2]. This may sound like
an academic concern, but it can happen if a PTransform returns its own
input, making it a no-op.

I tend to agree with Kenn's comment in the Jira that we should allow it. If
we don't, this puts a burden on SDKs: they would need to either
a) detect when a PTransform returns one of its inputs and raise an error, or
b) find and replace any such no-ops before generating a portable pipeline
graph.
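
To make the two options concrete, here is a toy sketch in plain Python. It
does not use the Beam SDK: PCollection, Composite, apply_strict and
drop_noops are hypothetical names made up for illustration, and a "no-op" is
assumed to mean that expand() hands back the very same input object.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class PCollection:
    """Hypothetical stand-in for a PCollection, identified by name only."""
    name: str


@dataclass
class Composite:
    """Hypothetical composite transform: expand() maps an input to an output."""
    name: str
    expand: Callable[[PCollection], PCollection]


def apply_strict(transform: Composite, pcoll: PCollection) -> PCollection:
    # Option (a): reject any composite that returns its own input.
    out = transform.expand(pcoll)
    if out is pcoll:
        raise ValueError(f"{transform.name} returns its own input")
    return out


def drop_noops(
    transforms: List[Composite], pcoll: PCollection
) -> Tuple[List[str], PCollection]:
    # Option (b): apply the transforms in order, but keep only those that
    # produce a new output, so no empty composite reaches the graph.
    kept = []
    for transform in transforms:
        out = transform.expand(pcoll)
        if out is not pcoll:
            kept.append(transform.name)
        pcoll = out
    return kept, pcoll


noop = Composite("NoOp", lambda p: p)
rename = Composite("Rename", lambda p: PCollection(p.name + "_out"))
kept, result = drop_noops([noop, rename, noop], PCollection("in"))
```

Either way the SDK has to inspect what every expand() returns, which is the
extra burden that allowing "empty" composites would avoid.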

If there aren't any objections to allowing "empty" composites, I'll send a
PR to clarify this in beam_runner_api.proto.

Brian

[1] https://issues.apache.org/jira/browse/BEAM-11614
[2] https://github.com/apache/beam/blob/05c8471b27e03e5611a2a13137c4a785f2d17fc9/model/pipeline/src/main/proto/beam_runner_api.proto#L152-L155
