I just want to highlight that "just returns its inputs" has a couple of non-useless instantiations:
- programmatically generating a transform (as in SqlTransform) - rearranging the Java or Python data structures (like PCollectionList and PCollectionTuple) that are noops in the protocol buffers Kenn On Wed, Jan 13, 2021 at 8:01 AM Brian Hulette <bhule...@google.com> wrote: > Thanks Robert. Docs PR is up: https://github.com/apache/beam/pull/13744 > > Brian > > On Tue, Jan 12, 2021 at 2:04 PM Robert Bradshaw <rober...@google.com> > wrote: > >> Yes, a PTansform can have no sub-transforms, as long as it only returns >> its inputs. Updating the docs would be a good idea. >> >> On Tue, Jan 12, 2021 at 1:04 PM Brian Hulette <bhule...@google.com> >> wrote: >> >>> A recent bug with SqlTransform on Dataflow Runner V2 [1] revealed an >>> interesting ambiguity in the Beam model: it's not clear if a composite >>> transform is allowed to have zero sub-transforms [2]. This may sound like >>> an academic concern, but it can happen if a PTransform returns its own >>> input, making it a no-op. >>> >>> I tend to agree with Kenn's comment in the jira that we should allow it. >>> If we don't this puts a burden on SDKs, they would need to either >>> a) detect when a PTransform returns one of its inputs and raise an >>> error, or >>> b) find and replace any such no-ops before generating a portable >>> pipeline graph >>> >>> If there aren't any objections to allowing "empty" composites I'll send >>> a PR to clarify this in beam_runner_api.proto >>> >>> Brian >>> >>> [1] https://issues.apache.org/jira/browse/BEAM-11614 >>> [2] >>> https://github.com/apache/beam/blob/05c8471b27e03e5611a2a13137c4a785f2d17fc9/model/pipeline/src/main/proto/beam_runner_api.proto#L152-L155 >>> >>