Re: [PR] Add documentation for Wait.On and WaitOn transforms [beam]

via GitHub Wed, 23 Jul 2025 06:04:10 -0700


dizzydroid commented on PR #35647:
URL: https://github.com/apache/beam/pull/35647#issuecomment-3108151910


   Thank you! In the basic example provided:
   ```python
   # Example 1: Basic usage
   with beam.Pipeline() as p:
       main = p | 'CreateMain' >> beam.Create([1, 2, 3])
       side = p | 'CreateSide' >> beam.Create(['a', 'b', 'c'])
   
       result = main | 'WaitOnSide' >> WaitOn(side)
       result | beam.Map(print)
    ```
   
   the line `result = main | 'WaitOnSide' >> WaitOn(side)` implements a 
coordination barrier where `main` elements `[1, 2, 3]` pass through unchanged, 
but only after `side` processing completes. 
   
   Internally, WaitOn converts `side` into a side input using `beam.Map(lambda 
x, *unused_sides: x, *sides)` - the main data (`x`) flows through unmodified 
while the side inputs (`*unused_sides`) act as completion signals. When `result 
| beam.Map(print)` executes, it prints `1, 2, 3` but with guaranteed ordering 
that `side` finished first.
   
   So it's meant to be illustrating a synchronization primitive for 
coordinating pipeline stages, as commonly used for database write ordering or 
ensuring setup tasks complete before main processing.
   
   I've added [explanatory 
comments](https://github.com/apache/beam/pull/35647/commits/3df0ac68188a72db5609f677a46f16ca67f34f90)
 to both the java and python examples, too. I hope this clears things up a 
little!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Re: [PR] Add documentation for Wait.On and WaitOn transforms [beam]

Reply via email to