tvalentyn commented on code in PR #35647:
URL: https://github.com/apache/beam/pull/35647#discussion_r2225761190


##########
website/www/site/content/en/documentation/transforms/python/other/waiton.md:
##########
@@ -0,0 +1,65 @@
+---
+title: "WaitOn"
+---
+
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# WaitOn
+
+`WaitOn` delays the processing of a main `PCollection` until one or more other 
`PCollections` (signals) have finished processing. This is useful for enforcing 
ordering or dependencies between different parts of a pipeline, especially when 
some outputs interact with external systems (such as writing to a database).
+
+When you apply `WaitOn`, the elements of the main `PCollection` will not be 
processed until all the specified signal `PCollections` have completed. In 
streaming mode, this is enforced per window: the corresponding window of each 
waited-on `PCollection` must be complete before elements are passed through.
+
+## Examples
+
+```python
+import apache_beam as beam
+from apache_beam.transforms.util import WaitOn
+
+# Example 1: Basic usage
+with beam.Pipeline() as p:
+    main = p | 'CreateMain' >> beam.Create([1, 2, 3])
+    side = p | 'CreateSide' >> beam.Create(['a', 'b', 'c'])

Review Comment:
   How about we change 'side' to 'signal' and add some processsing step after 
create, so that there is a processing step involved? Even if something simple 
as to sleep for a few seconds.



##########
website/www/site/content/en/documentation/transforms/python/other/waiton.md:
##########
@@ -0,0 +1,65 @@
+---
+title: "WaitOn"
+---
+
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# WaitOn
+
+`WaitOn` delays the processing of a main `PCollection` until one or more other 
`PCollections` (signals) have finished processing. This is useful for enforcing 
ordering or dependencies between different parts of a pipeline, especially when 
some outputs interact with external systems (such as writing to a database).

Review Comment:
   How about the following wording in both Python and Java docs:
   
   `WaitOn` returns a  `PCollection` with the contents identical to the input 
`PCollection`, but  delays the downstream processing  until one or more other 
`PCollections` (signals) have finished processing. This is useful for enforcing 
ordering or dependencies between different parts of a pipeline, especially when 
some outputs interact with external systems (such as writing to a database).`
   
   When you apply `WaitOn`, the elements of the main `PCollection` will not be 
emitted for downstream processing  until the computations required to produce 
the specified signal `PCollections` have completed. In streaming mode, this is 
enforced per window: the corresponding window of each waited-on `PCollection` 
must close before elements are passed through.
   



##########
website/www/site/content/en/documentation/transforms/java/other/wait.md:
##########
@@ -0,0 +1,56 @@
+---
+title: "Wait.On"
+---
+
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# Wait.On
+
+`Wait.On` is a transform that delays the processing of a main `PCollection` 
until one or more other `PCollections` (signals) have finished processing. This 
is useful for enforcing ordering or dependencies between different parts of a 
pipeline, especially when some outputs interact with external systems (such as 
writing to a database).
+
+When you apply `Wait.On`, the elements of the main `PCollection` will not be 
processed until all the specified signal `PCollections` have completed. In 
streaming mode, this is enforced per window: the corresponding window of each 
waited-on `PCollection` must be complete before elements are passed through.
+
+## Examples
+
+```java
+// Example 1: Basic usage
+PCollection<String> main = ...;
+PCollection<Void> signal = ...;
+
+// Wait for 'signal' to complete before processing 'main'
+// Elements pass through unchanged after 'signal' finishes
+PCollection<String> result = main.apply(Wait.on(signal));

Review Comment:
   1. Could you please add a link to the javadoc: 
https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/Wait.html
   
   2. There is no downstream processing after  Wait.on - could we add 
something? You could also reuse the example from the Javadoc if you don't feel 
like making a full example.
   
   



##########
website/www/site/content/en/documentation/transforms/python/overview.md:
##########
@@ -82,4 +82,6 @@ limitations under the License.
   most useful for adjusting parallelism or preventing coupled 
failures.</td></tr>
   <tr><td><a 
href="/documentation/transforms/python/other/windowinto">WindowInto</a></td><td>Logically
 divides up or groups the elements of a collection into finite
   windows according to a function.</td></tr>
+  <tr><td><a 
href="/documentation/transforms/python/other/waiton">WaitOn</a></td><td>Delays 
processing of a PCollection until other PCollections have finished 
processing.</td></tr>

Review Comment:
   Could you please add Wait/WaitOn to the sidebard on the left hand side of 
the page? Thanks!
   See (e.g.) Transform Catalog > Java > Other. Currently rendered version 
(available at 
https://apache-beam-website-pull-requests.storage.googleapis.com/35647/documentation/transforms/java/other/wait/index.html
 ) doesn't have it.



##########
website/www/site/content/en/documentation/transforms/python/other/waiton.md:
##########
@@ -0,0 +1,65 @@
+---
+title: "WaitOn"
+---
+
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# WaitOn
+
+`WaitOn` delays the processing of a main `PCollection` until one or more other 
`PCollections` (signals) have finished processing. This is useful for enforcing 
ordering or dependencies between different parts of a pipeline, especially when 
some outputs interact with external systems (such as writing to a database).
+
+When you apply `WaitOn`, the elements of the main `PCollection` will not be 
processed until all the specified signal `PCollections` have completed. In 
streaming mode, this is enforced per window: the corresponding window of each 
waited-on `PCollection` must be complete before elements are passed through.
+
+## Examples
+
+```python
+import apache_beam as beam
+from apache_beam.transforms.util import WaitOn
+
+# Example 1: Basic usage
+with beam.Pipeline() as p:
+    main = p | 'CreateMain' >> beam.Create([1, 2, 3])
+    side = p | 'CreateSide' >> beam.Create(['a', 'b', 'c'])
+
+    # Wait for 'side' to complete before processing 'main'
+    # Elements [1, 2, 3] pass through unchanged after 'side' finishes
+    result = main | 'WaitOnSide' >> WaitOn(side)
+    result | beam.Map(print)
+
+# Example 2: Using multiple signals
+with beam.Pipeline() as p:
+    main = p | 'CreateMain' >> beam.Create([1, 2, 3])
+    side1 = p | 'CreateSide1' >> beam.Create(['a', 'b', 'c'])
+    side2 = p | 'CreateSide2' >> beam.Create(['x', 'y', 'z'])
+
+    # Wait for both side1 and side2 to complete before processing main
+    result = main | 'WaitOnSides' >> WaitOn(side1, side2)
+    result | beam.Map(print)
+
+# Example 3: Streaming mode with windowing
+with beam.Pipeline() as p:
+    main = (p | 'CreateMain' >> beam.Create([1, 2, 3])
+             | "ApplyWindow" >> beam.WindowInto(FixedWindows(5 * 60)))
+    side = (p | 'CreateSide' >> beam.Create(['a', 'b', 'c'])

Review Comment:
   There are not timestamps associated with elements after Create, and there 
will be an error if there is two transforms labelled 'ApplyWindow'. I would 
remove example 3 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to