I agree there's value in this. The cause of BEAM-11614 is a good example of
your first instantiation. An optimization was added to the Convert
transform to make it recognize when it's unneeded and short-circuit itself.
This short-circuit was triggered (twice) within the SqlTransform used in
sql_taxi.py.

On Wed, Jan 13, 2021 at 11:57 AM Kenneth Knowles <k...@apache.org> wrote:

> I just want to highlight that "just returns its inputs" has a couple of
> non-useless instantiations:
>
>  - programmatically generating a transform (as in SqlTransform)
>
 - rearranging the Java or Python data structures (like PCollectionList and
> PCollectionTuple) that are noops in the protocol buffers
>
> Kenn
>
> On Wed, Jan 13, 2021 at 8:01 AM Brian Hulette <bhule...@google.com> wrote:
>
>> Thanks Robert. Docs PR is up: https://github.com/apache/beam/pull/13744
>>
>> Brian
>>
>> On Tue, Jan 12, 2021 at 2:04 PM Robert Bradshaw <rober...@google.com>
>> wrote:
>>
>>> Yes, a PTansform can have no sub-transforms, as long as it only returns
>>> its inputs. Updating the docs would be a good idea.
>>>
>>> On Tue, Jan 12, 2021 at 1:04 PM Brian Hulette <bhule...@google.com>
>>> wrote:
>>>
>>>> A recent bug with SqlTransform on Dataflow Runner V2 [1] revealed an
>>>> interesting ambiguity in the Beam model: it's not clear if a composite
>>>> transform is allowed to have zero sub-transforms [2]. This may sound like
>>>> an academic concern, but it can happen if a PTransform returns its own
>>>> input, making it a no-op.
>>>>
>>>> I tend to agree with Kenn's comment in the jira that we should allow
>>>> it. If we don't this puts a burden on SDKs, they would need to either
>>>> a) detect when a PTransform returns one of its inputs and raise an
>>>> error, or
>>>> b) find and replace any such no-ops before generating a portable
>>>> pipeline graph
>>>>
>>>> If there aren't any objections to allowing "empty" composites I'll send
>>>> a PR to clarify this in beam_runner_api.proto
>>>>
>>>> Brian
>>>>
>>>> [1] https://issues.apache.org/jira/browse/BEAM-11614
>>>> [2]
>>>> https://github.com/apache/beam/blob/05c8471b27e03e5611a2a13137c4a785f2d17fc9/model/pipeline/src/main/proto/beam_runner_api.proto#L152-L155
>>>>
>>>

Reply via email to