rosetn commented on a change in pull request #13326:
URL: https://github.com/apache/beam/pull/13326#discussion_r524389224



##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -5188,16 +5188,60 @@ restriction pairs.
 #### 12.1.1. A basic SDF {#a-basic-sdf}
 
 A basic SDF is composed of three parts: a restriction, a restriction provider, 
and a
-restriction tracker. The restriction is used to represent a subset of work for 
a given element.
-The restriction provider lets SDF authors override default implementations for 
splitting, sizing,
-watermark estimation, and so forth. In 
[Java](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java#L92)
+restriction tracker. If you want to control the watermark properly especially 
in a streaming

Review comment:
       I'd remove "properly" and add a comma after "watermark"

##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -5188,16 +5188,60 @@ restriction pairs.
 #### 12.1.1. A basic SDF {#a-basic-sdf}
 
 A basic SDF is composed of three parts: a restriction, a restriction provider, 
and a
-restriction tracker. The restriction is used to represent a subset of work for 
a given element.
-The restriction provider lets SDF authors override default implementations for 
splitting, sizing,
-watermark estimation, and so forth. In 
[Java](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java#L92)
+restriction tracker. If you want to control the watermark properly especially 
in a streaming
+pipeline, two more components are needed: a watermark estimator provider and a 
watermark estimator.
+
+The restriction is a user-defined object which is used to represent a subset of

Review comment:
       Replace "which" with "that"

##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -5188,16 +5188,60 @@ restriction pairs.
 #### 12.1.1. A basic SDF {#a-basic-sdf}
 
 A basic SDF is composed of three parts: a restriction, a restriction provider, 
and a
-restriction tracker. The restriction is used to represent a subset of work for 
a given element.
-The restriction provider lets SDF authors override default implementations for 
splitting, sizing,
-watermark estimation, and so forth. In 
[Java](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java#L92)
+restriction tracker. If you want to control the watermark properly especially 
in a streaming
+pipeline, two more components are needed: a watermark estimator provider and a 
watermark estimator.
+
+The restriction is a user-defined object which is used to represent a subset of
+work for a given element. For example, we defined OffsetRange as a restriction 
to represent offset
+positions in 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/range/OffsetRange.html)
 
+and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.restriction_trackers.html#apache_beam.io.restriction_trackers.OffsetRange).
+
+The restriction provider lets SDF authors override default implementations
+for splitting, sizing, and so forth. In 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/DoFn.ProcessElement.html)
 and 
[Go](https://github.com/apache/beam/blob/0f466e6bcd4ac8677c2bd9ecc8e6af3836b7f3b8/sdks/go/pkg/beam/pardo.go#L226),
-this is the `DoFn`. 
[Python](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/python/apache_beam/transforms/core.py#L213)
-has a dedicated RestrictionProvider type. The restriction tracker is 
responsible for tracking
-what subset of the restriction has been completed during processing.
+this is the `DoFn`. 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.RestrictionProvider)
+has a dedicated RestrictionProvider type.
+
+The restriction tracker is responsible for tracking what subset of the 
restriction has been
+completed during processing. For APIs details, please refer to 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/RestrictionTracker.html)
 
+and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.iobase.html#apache_beam.io.iobase.RestrictionTracker)
+documentations.
+There are some built-in RestrictionTracker defined in Java:
+1. 
[OffsetRangeTracker](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTracker.html)
+2. 
[GrowableOffsetRangeTracker](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/GrowableOffsetRangeTracker.html)
+3. 
[ByteKeyRangeTracker](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/ByteKeyRangeTracker.html)
+
+We also have built-in RestrictionTracker in Python:

Review comment:
       Maybe "SDFs also have a built-in `RestrictionTracker` implementation in Python:"

##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -5188,16 +5188,60 @@ restriction pairs.
 #### 12.1.1. A basic SDF {#a-basic-sdf}
 
 A basic SDF is composed of three parts: a restriction, a restriction provider, 
and a
-restriction tracker. The restriction is used to represent a subset of work for 
a given element.
-The restriction provider lets SDF authors override default implementations for 
splitting, sizing,
-watermark estimation, and so forth. In 
[Java](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java#L92)
+restriction tracker. If you want to control the watermark properly especially 
in a streaming
+pipeline, two more components are needed: a watermark estimator provider and a 
watermark estimator.
+
+The restriction is a user-defined object which is used to represent a subset of
+work for a given element. For example, we defined OffsetRange as a restriction 
to represent offset
+positions in 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/range/OffsetRange.html)
 
+and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.restriction_trackers.html#apache_beam.io.restriction_trackers.OffsetRange).
+
+The restriction provider lets SDF authors override default implementations
+for splitting, sizing, and so forth. In 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/DoFn.ProcessElement.html)
 and 
[Go](https://github.com/apache/beam/blob/0f466e6bcd4ac8677c2bd9ecc8e6af3836b7f3b8/sdks/go/pkg/beam/pardo.go#L226),
-this is the `DoFn`. 
[Python](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/python/apache_beam/transforms/core.py#L213)
-has a dedicated RestrictionProvider type. The restriction tracker is 
responsible for tracking
-what subset of the restriction has been completed during processing.
+this is the `DoFn`. 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.RestrictionProvider)
+has a dedicated RestrictionProvider type.
+
+The restriction tracker is responsible for tracking what subset of the 
restriction has been
+completed during processing. For APIs details, please refer to 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/RestrictionTracker.html)
 
+and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.iobase.html#apache_beam.io.iobase.RestrictionTracker)
+documentations.
+There are some built-in RestrictionTracker defined in Java:
+1. 
[OffsetRangeTracker](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTracker.html)
+2. 
[GrowableOffsetRangeTracker](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/GrowableOffsetRangeTracker.html)
+3. 
[ByteKeyRangeTracker](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/ByteKeyRangeTracker.html)
+
+We also have built-in RestrictionTracker in Python:
+1. 
[OffsetRangeTracker](https://beam.apache.org/releases/pydoc/current/apache_beam.io.restriction_trackers.html#apache_beam.io.restriction_trackers.OffsetRestrictionTracker)
+
+The watermark state is a user-defined object which is used to create a 
`WatermarkEstimator` from a
+`WatermarkEstimatorProvider`. The simplest watermark state could be a 
`timestamp`.
+
+The watermark estimator provider lets SDF authors to define the way of 
initializing the watermark
+state and creating a watermark estimator. In 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/DoFn.ProcessElement.html)
+this is the `DoFn`. 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.WatermarkEstimatorProvider)
+has a dedicated WatermarkEstimatorProvider type.
+
+The watermark estimator is for tracking watermark when an element-restriction 
pair is in progress.

Review comment:
       The watermark estimator tracks the watermark when an element-restriction 
pair is in progress.

##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -5188,16 +5188,60 @@ restriction pairs.
 #### 12.1.1. A basic SDF {#a-basic-sdf}
 
 A basic SDF is composed of three parts: a restriction, a restriction provider, 
and a
-restriction tracker. The restriction is used to represent a subset of work for 
a given element.
-The restriction provider lets SDF authors override default implementations for 
splitting, sizing,
-watermark estimation, and so forth. In 
[Java](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java#L92)
+restriction tracker. If you want to control the watermark properly especially 
in a streaming
+pipeline, two more components are needed: a watermark estimator provider and a 
watermark estimator.
+
+The restriction is a user-defined object which is used to represent a subset of
+work for a given element. For example, we defined OffsetRange as a restriction 
to represent offset

Review comment:
       Put OffsetRange in code font by adding backticks. Can you put all of the 
class names in code font? More information here: 
https://developers.google.com/style/code-in-text#some-specific-items-to-put-in-code-font
   
   `OffsetRange`

##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -5188,16 +5188,60 @@ restriction pairs.
 #### 12.1.1. A basic SDF {#a-basic-sdf}
 
 A basic SDF is composed of three parts: a restriction, a restriction provider, 
and a
-restriction tracker. The restriction is used to represent a subset of work for 
a given element.
-The restriction provider lets SDF authors override default implementations for 
splitting, sizing,
-watermark estimation, and so forth. In 
[Java](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java#L92)
+restriction tracker. If you want to control the watermark properly especially 
in a streaming
+pipeline, two more components are needed: a watermark estimator provider and a 
watermark estimator.
+
+The restriction is a user-defined object which is used to represent a subset of
+work for a given element. For example, we defined OffsetRange as a restriction 
to represent offset
+positions in 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/range/OffsetRange.html)
 
+and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.restriction_trackers.html#apache_beam.io.restriction_trackers.OffsetRange).
+
+The restriction provider lets SDF authors override default implementations
+for splitting, sizing, and so forth. In 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/DoFn.ProcessElement.html)
 and 
[Go](https://github.com/apache/beam/blob/0f466e6bcd4ac8677c2bd9ecc8e6af3836b7f3b8/sdks/go/pkg/beam/pardo.go#L226),
-this is the `DoFn`. 
[Python](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/python/apache_beam/transforms/core.py#L213)
-has a dedicated RestrictionProvider type. The restriction tracker is 
responsible for tracking
-what subset of the restriction has been completed during processing.
+this is the `DoFn`. 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.RestrictionProvider)
+has a dedicated RestrictionProvider type.
+
+The restriction tracker is responsible for tracking what subset of the 
restriction has been

Review comment:
       Replace "what" with "which"
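
   To make "which subset of the restriction has been completed" concrete, here is a minimal sketch in plain Python. It is not Beam's `RestrictionTracker` API: the names `ToyOffsetRange`, `ToyOffsetTracker`, `try_claim`, `completed`, and `remaining` are hypothetical and chosen only for illustration, and the half-open `[start, stop)` interval is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ToyOffsetRange:
    """Hypothetical restriction: the half-open offset interval [start, stop)."""
    start: int
    stop: int


class ToyOffsetTracker:
    """Hypothetical tracker; illustrates the idea only, not Beam's API."""

    def __init__(self, restriction: ToyOffsetRange):
        self.restriction = restriction
        self.last_claimed = None  # nothing has been claimed yet

    def try_claim(self, position: int) -> bool:
        """Claim `position`; return False if it lies past the restriction."""
        if position < self.restriction.start:
            raise ValueError("position precedes the restriction")
        if self.last_claimed is not None and position <= self.last_claimed:
            raise ValueError("positions must be claimed in increasing order")
        if position >= self.restriction.stop:
            return False
        self.last_claimed = position
        return True

    def completed(self) -> ToyOffsetRange:
        """The subset of the restriction that has been completed so far."""
        if self.last_claimed is None:
            end = self.restriction.start
        else:
            end = self.last_claimed + 1
        return ToyOffsetRange(self.restriction.start, end)

    def remaining(self) -> ToyOffsetRange:
        """The subset of the restriction that is still unprocessed."""
        return ToyOffsetRange(self.completed().stop, self.restriction.stop)


# Claim offsets 0, 1, and 2 out of the restriction [0, 5).
tracker = ToyOffsetTracker(ToyOffsetRange(0, 5))
claims = [tracker.try_claim(pos) for pos in range(3)]
```

   After the three claims, `tracker.completed()` covers `[0, 3)` and `tracker.remaining()` covers `[3, 5)`, which is the bookkeeping the sentence under review describes.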

##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -5188,16 +5188,60 @@ restriction pairs.
 #### 12.1.1. A basic SDF {#a-basic-sdf}
 
 A basic SDF is composed of three parts: a restriction, a restriction provider, 
and a
-restriction tracker. The restriction is used to represent a subset of work for 
a given element.
-The restriction provider lets SDF authors override default implementations for 
splitting, sizing,
-watermark estimation, and so forth. In 
[Java](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java#L92)
+restriction tracker. If you want to control the watermark properly especially 
in a streaming
+pipeline, two more components are needed: a watermark estimator provider and a 
watermark estimator.
+
+The restriction is a user-defined object which is used to represent a subset of
+work for a given element. For example, we defined OffsetRange as a restriction 
to represent offset
+positions in 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/range/OffsetRange.html)
 
+and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.restriction_trackers.html#apache_beam.io.restriction_trackers.OffsetRange).
+
+The restriction provider lets SDF authors override default implementations
+for splitting, sizing, and so forth. In 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/DoFn.ProcessElement.html)
 and 
[Go](https://github.com/apache/beam/blob/0f466e6bcd4ac8677c2bd9ecc8e6af3836b7f3b8/sdks/go/pkg/beam/pardo.go#L226),
-this is the `DoFn`. 
[Python](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/python/apache_beam/transforms/core.py#L213)
-has a dedicated RestrictionProvider type. The restriction tracker is 
responsible for tracking
-what subset of the restriction has been completed during processing.
+this is the `DoFn`. 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.RestrictionProvider)
+has a dedicated RestrictionProvider type.
+
+The restriction tracker is responsible for tracking what subset of the 
restriction has been
+completed during processing. For APIs details, please refer to 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/RestrictionTracker.html)
 

Review comment:
       I recommend rewording this to be more specific
   
   For API details, read the 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/RestrictionTracker.html)
 and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.iobase.html#apache_beam.io.iobase.RestrictionTracker)
 reference documentation.

##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -5188,16 +5188,60 @@ restriction pairs.
 #### 12.1.1. A basic SDF {#a-basic-sdf}
 
 A basic SDF is composed of three parts: a restriction, a restriction provider, 
and a
-restriction tracker. The restriction is used to represent a subset of work for 
a given element.
-The restriction provider lets SDF authors override default implementations for 
splitting, sizing,
-watermark estimation, and so forth. In 
[Java](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java#L92)
+restriction tracker. If you want to control the watermark properly especially 
in a streaming
+pipeline, two more components are needed: a watermark estimator provider and a 
watermark estimator.
+
+The restriction is a user-defined object which is used to represent a subset of
+work for a given element. For example, we defined OffsetRange as a restriction 
to represent offset
+positions in 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/range/OffsetRange.html)
 
+and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.restriction_trackers.html#apache_beam.io.restriction_trackers.OffsetRange).
+
+The restriction provider lets SDF authors override default implementations
+for splitting, sizing, and so forth. In 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/DoFn.ProcessElement.html)
 and 
[Go](https://github.com/apache/beam/blob/0f466e6bcd4ac8677c2bd9ecc8e6af3836b7f3b8/sdks/go/pkg/beam/pardo.go#L226),
-this is the `DoFn`. 
[Python](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/python/apache_beam/transforms/core.py#L213)
-has a dedicated RestrictionProvider type. The restriction tracker is 
responsible for tracking
-what subset of the restriction has been completed during processing.
+this is the `DoFn`. 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.RestrictionProvider)
+has a dedicated RestrictionProvider type.
+
+The restriction tracker is responsible for tracking what subset of the 
restriction has been
+completed during processing. For APIs details, please refer to 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/RestrictionTracker.html)
 
+and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.iobase.html#apache_beam.io.iobase.RestrictionTracker)
+documentations.
+There are some built-in RestrictionTracker defined in Java:

Review comment:
       I think this is missing a noun. What do you think about the following?
   
   There are some built-in RestrictionTracker implementations defined in Java:

##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -5188,16 +5188,60 @@ restriction pairs.
 #### 12.1.1. A basic SDF {#a-basic-sdf}
 
 A basic SDF is composed of three parts: a restriction, a restriction provider, 
and a
-restriction tracker. The restriction is used to represent a subset of work for 
a given element.
-The restriction provider lets SDF authors override default implementations for 
splitting, sizing,
-watermark estimation, and so forth. In 
[Java](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java#L92)
+restriction tracker. If you want to control the watermark properly especially 
in a streaming
+pipeline, two more components are needed: a watermark estimator provider and a 
watermark estimator.
+
+The restriction is a user-defined object which is used to represent a subset of
+work for a given element. For example, we defined OffsetRange as a restriction 
to represent offset
+positions in 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/range/OffsetRange.html)
 
+and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.restriction_trackers.html#apache_beam.io.restriction_trackers.OffsetRange).
+
+The restriction provider lets SDF authors override default implementations
+for splitting, sizing, and so forth. In 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/DoFn.ProcessElement.html)
 and 
[Go](https://github.com/apache/beam/blob/0f466e6bcd4ac8677c2bd9ecc8e6af3836b7f3b8/sdks/go/pkg/beam/pardo.go#L226),
-this is the `DoFn`. 
[Python](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/python/apache_beam/transforms/core.py#L213)
-has a dedicated RestrictionProvider type. The restriction tracker is 
responsible for tracking
-what subset of the restriction has been completed during processing.
+this is the `DoFn`. 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.RestrictionProvider)
+has a dedicated RestrictionProvider type.
+
+The restriction tracker is responsible for tracking what subset of the 
restriction has been
+completed during processing. For APIs details, please refer to 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/RestrictionTracker.html)
 
+and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.iobase.html#apache_beam.io.iobase.RestrictionTracker)
+documentations.
+There are some built-in RestrictionTracker defined in Java:
+1. 
[OffsetRangeTracker](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTracker.html)
+2. 
[GrowableOffsetRangeTracker](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/GrowableOffsetRangeTracker.html)
+3. 
[ByteKeyRangeTracker](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/ByteKeyRangeTracker.html)
+
+We also have built-in RestrictionTracker in Python:
+1. 
[OffsetRangeTracker](https://beam.apache.org/releases/pydoc/current/apache_beam.io.restriction_trackers.html#apache_beam.io.restriction_trackers.OffsetRestrictionTracker)
+
+The watermark state is a user-defined object which is used to create a 
`WatermarkEstimator` from a
+`WatermarkEstimatorProvider`. The simplest watermark state could be a 
`timestamp`.
+
+The watermark estimator provider lets SDF authors to define the way of 
initializing the watermark
+state and creating a watermark estimator. In 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/DoFn.ProcessElement.html)
+this is the `DoFn`. 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.WatermarkEstimatorProvider)
+has a dedicated WatermarkEstimatorProvider type.
+
+The watermark estimator is for tracking watermark when an element-restriction 
pair is in progress.
+For APIs details, please refer to 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimator.html)

Review comment:
       For API details, read the Java and Python reference documentation.

##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -5188,16 +5188,60 @@ restriction pairs.
 #### 12.1.1. A basic SDF {#a-basic-sdf}
 
 A basic SDF is composed of three parts: a restriction, a restriction provider, 
and a
-restriction tracker. The restriction is used to represent a subset of work for 
a given element.
-The restriction provider lets SDF authors override default implementations for 
splitting, sizing,
-watermark estimation, and so forth. In 
[Java](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java#L92)
+restriction tracker. If you want to control the watermark properly especially 
in a streaming
+pipeline, two more components are needed: a watermark estimator provider and a 
watermark estimator.
+
+The restriction is a user-defined object which is used to represent a subset of
+work for a given element. For example, we defined OffsetRange as a restriction 
to represent offset
+positions in 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/range/OffsetRange.html)
 
+and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.restriction_trackers.html#apache_beam.io.restriction_trackers.OffsetRange).
+
+The restriction provider lets SDF authors override default implementations
+for splitting, sizing, and so forth. In 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/DoFn.ProcessElement.html)

Review comment:
       I'd remove "and so forth." 
https://developers.google.com/style/word-list#etc
   
   The restriction provider lets SDF authors override default implementations, 
including the ones for splitting and sizing.

##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -5188,16 +5188,60 @@ restriction pairs.
 #### 12.1.1. A basic SDF {#a-basic-sdf}
 
 A basic SDF is composed of three parts: a restriction, a restriction provider, 
and a
-restriction tracker. The restriction is used to represent a subset of work for 
a given element.
-The restriction provider lets SDF authors override default implementations for 
splitting, sizing,
-watermark estimation, and so forth. In 
[Java](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java#L92)
+restriction tracker. If you want to control the watermark properly especially 
in a streaming
+pipeline, two more components are needed: a watermark estimator provider and a 
watermark estimator.
+
+The restriction is a user-defined object which is used to represent a subset of
+work for a given element. For example, we defined OffsetRange as a restriction 
to represent offset
+positions in 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/range/OffsetRange.html)
 
+and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.restriction_trackers.html#apache_beam.io.restriction_trackers.OffsetRange).
+
+The restriction provider lets SDF authors override default implementations
+for splitting, sizing, and so forth. In 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/DoFn.ProcessElement.html)
 and 
[Go](https://github.com/apache/beam/blob/0f466e6bcd4ac8677c2bd9ecc8e6af3836b7f3b8/sdks/go/pkg/beam/pardo.go#L226),
-this is the `DoFn`. 
[Python](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/python/apache_beam/transforms/core.py#L213)
-has a dedicated RestrictionProvider type. The restriction tracker is 
responsible for tracking
-what subset of the restriction has been completed during processing.
+this is the `DoFn`. 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.RestrictionProvider)
+has a dedicated RestrictionProvider type.
+
+The restriction tracker is responsible for tracking what subset of the 
restriction has been
+completed during processing. For APIs details, please refer to 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/RestrictionTracker.html)
 
+and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.iobase.html#apache_beam.io.iobase.RestrictionTracker)
+documentations.
+There are some built-in RestrictionTracker defined in Java:
+1. 
[OffsetRangeTracker](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTracker.html)
+2. 
[GrowableOffsetRangeTracker](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/GrowableOffsetRangeTracker.html)
+3. 
[ByteKeyRangeTracker](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/ByteKeyRangeTracker.html)
+
+We also have built-in RestrictionTracker in Python:
+1. 
[OffsetRangeTracker](https://beam.apache.org/releases/pydoc/current/apache_beam.io.restriction_trackers.html#apache_beam.io.restriction_trackers.OffsetRestrictionTracker)
+
+The watermark state is a user-defined object which is used to create a 
`WatermarkEstimator` from a
+`WatermarkEstimatorProvider`. The simplest watermark state could be a 
`timestamp`.
+
+The watermark estimator provider lets SDF authors to define the way of 
initializing the watermark
+state and creating a watermark estimator. In 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/DoFn.ProcessElement.html)
+this is the `DoFn`. 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.WatermarkEstimatorProvider)
+has a dedicated WatermarkEstimatorProvider type.
+
+The watermark estimator is for tracking watermark when an element-restriction 
pair is in progress.
+For APIs details, please refer to 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimator.html)
+and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.iobase.html#apache_beam.io.iobase.WatermarkEstimator)
+documentations.
+There are some built-in `WatermarkEstimator` defined in Java:

Review comment:
       Add "implementations" after `WatermarkEstimator`.

##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -5188,16 +5188,60 @@ restriction pairs.
 #### 12.1.1. A basic SDF {#a-basic-sdf}
 
 A basic SDF is composed of three parts: a restriction, a restriction provider, 
and a
-restriction tracker. The restriction is used to represent a subset of work for 
a given element.
-The restriction provider lets SDF authors override default implementations for 
splitting, sizing,
-watermark estimation, and so forth. In 
[Java](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java#L92)
+restriction tracker. If you want to control the watermark properly especially 
in a streaming
+pipeline, two more components are needed: a watermark estimator provider and a 
watermark estimator.
+
+The restriction is a user-defined object which is used to represent a subset of
+work for a given element. For example, we defined OffsetRange as a restriction 
to represent offset
+positions in 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/range/OffsetRange.html)
 
+and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.restriction_trackers.html#apache_beam.io.restriction_trackers.OffsetRange).
+
+The restriction provider lets SDF authors override default implementations
+for splitting, sizing, and so forth. In 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/DoFn.ProcessElement.html)
 and 
[Go](https://github.com/apache/beam/blob/0f466e6bcd4ac8677c2bd9ecc8e6af3836b7f3b8/sdks/go/pkg/beam/pardo.go#L226),
-this is the `DoFn`. 
[Python](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/python/apache_beam/transforms/core.py#L213)
-has a dedicated RestrictionProvider type. The restriction tracker is 
responsible for tracking
-what subset of the restriction has been completed during processing.
+this is the `DoFn`. 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.RestrictionProvider)
+has a dedicated RestrictionProvider type.
+
+The restriction tracker is responsible for tracking what subset of the 
restriction has been
+completed during processing. For APIs details, please refer to 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/RestrictionTracker.html)
 
+and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.iobase.html#apache_beam.io.iobase.RestrictionTracker)
+documentations.
+There are some built-in RestrictionTracker defined in Java:
+1. 
[OffsetRangeTracker](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTracker.html)
+2. 
[GrowableOffsetRangeTracker](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/GrowableOffsetRangeTracker.html)
+3. 
[ByteKeyRangeTracker](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/ByteKeyRangeTracker.html)
+
+We also have built-in RestrictionTracker in Python:
+1. 
[OffsetRangeTracker](https://beam.apache.org/releases/pydoc/current/apache_beam.io.restriction_trackers.html#apache_beam.io.restriction_trackers.OffsetRestrictionTracker)
+
+The watermark state is a user-defined object which is used to create a 
`WatermarkEstimator` from a
+`WatermarkEstimatorProvider`. The simplest watermark state could be a 
`timestamp`.
+
+The watermark estimator provider lets SDF authors to define the way of 
initializing the watermark

Review comment:
       "The watermark estimator provider lets SDF authors define how to 
initialize the watermark
   state and create a watermark estimator."

##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -5324,10 +5368,15 @@ resource utilization.
 A runner at any time may attempt to split a restriction while it is being 
processed. This allows the
 runner to either pause processing of the restriction so that other work may be 
done (common for
 unbounded restrictions to limit the amount of output and/or improve latency) 
or split the restriction
-into two pieces, increasing the available parallelism within the system. It is 
important to author a
-SDF with this in mind since the end of the restriction may change. Thus when 
writing the
-processing loop, it is important to use the result from trying to claim a 
piece of the restriction
-instead of assuming one can process till the end.
+into two pieces, increasing the available parallelism within the system. 
Please note that different
+runners(e.g., Dataflow, Flink, Spark) have different strategies to issue 
splits under batch and
+streaming execution.
+
+It is important to author an SDF with this in mind since the end of the 
restriction may change. Thus

Review comment:
       Cleaning up a little. WDYT about this?:
   
   Author an SDF with this in mind since the end of the restriction may change. When writing the processing loop, use the result from trying to claim a piece of the restriction instead of assuming you can process until the end.
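
To make the claim-based loop concrete, here is a minimal sketch in plain Python. It does not use the Beam API: `ToyOffsetTracker`, `split_at`, and `process` are hypothetical names invented for illustration, and the split is triggered from inside the loop only to simulate a runner shrinking the restriction mid-processing.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ToyOffsetTracker:
    """Hypothetical stand-in for a restriction tracker over [start, stop)."""
    start: int
    stop: int
    last_claimed: int = -1

    def try_claim(self, position: int) -> bool:
        # A claim succeeds only while the position lies inside the current range.
        if position >= self.stop:
            return False
        self.last_claimed = position
        return True

    def split_at(self, position: int) -> Optional[Tuple[int, int]]:
        # Simulates a runner-initiated split: the primary shrinks to
        # [start, position) and the residual [position, old stop) is returned.
        if not (self.last_claimed < position < self.stop):
            return None  # nothing left to split off at this position
        residual = (position, self.stop)
        self.stop = position
        return residual


def process(tracker: ToyOffsetTracker, split_when: Optional[int] = None) -> List[int]:
    """Emit positions, checking every claim instead of assuming a fixed end."""
    outputs = []
    position = tracker.start
    while tracker.try_claim(position):
        outputs.append(position)
        if split_when is not None and position == split_when:
            # Assumed for the demo: the "runner" splits two positions ahead.
            tracker.split_at(position + 2)
        position += 1
    return outputs
```

A loop written as `for position in range(tracker.start, tracker.stop)` with the bounds read once up front would keep emitting positions that now belong to the residual after a split, which is the failure the suggested wording warns about.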

##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -5188,16 +5188,60 @@ restriction pairs.
 #### 12.1.1. A basic SDF {#a-basic-sdf}
 
 A basic SDF is composed of three parts: a restriction, a restriction provider, 
and a
-restriction tracker. The restriction is used to represent a subset of work for 
a given element.
-The restriction provider lets SDF authors override default implementations for 
splitting, sizing,
-watermark estimation, and so forth. In 
[Java](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java#L92)
+restriction tracker. If you want to control the watermark properly especially 
in a streaming
+pipeline, two more components are needed: a watermark estimator provider and a 
watermark estimator.
+
+The restriction is a user-defined object which is used to represent a subset of
+work for a given element. For example, we defined OffsetRange as a restriction 
to represent offset
+positions in 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/range/OffsetRange.html)
 
+and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.restriction_trackers.html#apache_beam.io.restriction_trackers.OffsetRange).
+
+The restriction provider lets SDF authors override default implementations
+for splitting, sizing, and so forth. In 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/DoFn.ProcessElement.html)
 and 
[Go](https://github.com/apache/beam/blob/0f466e6bcd4ac8677c2bd9ecc8e6af3836b7f3b8/sdks/go/pkg/beam/pardo.go#L226),
-this is the `DoFn`. 
[Python](https://github.com/apache/beam/blob/f4c2734261396858e388ebef2eef50e7d48231a8/sdks/python/apache_beam/transforms/core.py#L213)
-has a dedicated RestrictionProvider type. The restriction tracker is 
responsible for tracking
-what subset of the restriction has been completed during processing.
+this is the `DoFn`. 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.RestrictionProvider)
+has a dedicated RestrictionProvider type.
+
+The restriction tracker is responsible for tracking what subset of the 
restriction has been
+completed during processing. For APIs details, please refer to 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/RestrictionTracker.html)
 
+and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.iobase.html#apache_beam.io.iobase.RestrictionTracker)
+documentations.
+There are some built-in RestrictionTracker defined in Java:
+1. 
[OffsetRangeTracker](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTracker.html)
+2. 
[GrowableOffsetRangeTracker](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/GrowableOffsetRangeTracker.html)
+3. 
[ByteKeyRangeTracker](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/ByteKeyRangeTracker.html)
+
+We also have built-in RestrictionTracker in Python:
+1. 
[OffsetRangeTracker](https://beam.apache.org/releases/pydoc/current/apache_beam.io.restriction_trackers.html#apache_beam.io.restriction_trackers.OffsetRestrictionTracker)
+
+The watermark state is a user-defined object which is used to create a 
`WatermarkEstimator` from a
+`WatermarkEstimatorProvider`. The simplest watermark state could be a 
`timestamp`.
+
+The watermark estimator provider lets SDF authors to define the way of 
initializing the watermark
+state and creating a watermark estimator. In 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/DoFn.ProcessElement.html)
+this is the `DoFn`. 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.WatermarkEstimatorProvider)
+has a dedicated WatermarkEstimatorProvider type.
+
+The watermark estimator is for tracking watermark when an element-restriction 
pair is in progress.
+For APIs details, please refer to 
[Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimator.html)
+and 
[Python](https://beam.apache.org/releases/pydoc/current/apache_beam.io.iobase.html#apache_beam.io.iobase.WatermarkEstimator)
+documentations.
+There are some built-in `WatermarkEstimator` defined in Java:
+1. 
[Manual](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimators.Manual.html)
+2. 
[MonotonicallyIncreasing](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimators.MonotonicallyIncreasing.html)
+3. 
[WallTime](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimators.WallTime.html)
+
+There are the same set of built-in `WatermarkEstimator` in Python along with 
default `WatermarkEstimatorProvider` as well:

Review comment:
       Along with the default `WatermarkEstimatorProvider`, the same set of built-in `WatermarkEstimator` implementations is available in Python:
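
For readers following the thread, the relationship between watermark state, provider, and estimator described in the diff can be sketched in plain Python. This is a toy model under stated assumptions, not Beam's implementation: `ToyMonotonicEstimator` and `ToyEstimatorProvider` are hypothetical classes, and their method names are chosen only to echo the roles named in the text.

```python
from typing import Optional


class ToyMonotonicEstimator:
    """Hypothetical estimator whose watermark never moves backwards."""

    def __init__(self, state: Optional[float] = None):
        # The watermark state here is simply a timestamp (or None if unset).
        self._watermark = state

    def observe_timestamp(self, timestamp: float) -> None:
        # Advance only when a later timestamp is seen; ignore older ones.
        if self._watermark is None or timestamp > self._watermark:
            self._watermark = timestamp

    def current_watermark(self) -> Optional[float]:
        return self._watermark

    def get_estimator_state(self) -> Optional[float]:
        # State that could be checkpointed and used to rebuild the estimator.
        return self._watermark


class ToyEstimatorProvider:
    """Hypothetical provider: initializes state and builds an estimator."""

    def initial_estimator_state(self, element, restriction) -> Optional[float]:
        # Assumed for the demo: start with no watermark at all.
        return None

    def create_watermark_estimator(self, state: Optional[float]) -> ToyMonotonicEstimator:
        return ToyMonotonicEstimator(state)


provider = ToyEstimatorProvider()
estimator = provider.create_watermark_estimator(
    provider.initial_estimator_state(element=None, restriction=None))
for ts in (5.0, 3.0, 8.0):
    estimator.observe_timestamp(ts)

# Rebuilding from saved state resumes at the same watermark.
resumed = provider.create_watermark_estimator(estimator.get_estimator_state())
```

The point of the split is that the provider owns construction from state, so an estimator can be recreated after processing of an element-restriction pair is paused and later resumed.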

##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -5324,10 +5368,15 @@ resource utilization.
 A runner at any time may attempt to split a restriction while it is being 
processed. This allows the
 runner to either pause processing of the restriction so that other work may be 
done (common for
 unbounded restrictions to limit the amount of output and/or improve latency) 
or split the restriction
-into two pieces, increasing the available parallelism within the system. It is 
important to author a
-SDF with this in mind since the end of the restriction may change. Thus when 
writing the
-processing loop, it is important to use the result from trying to claim a 
piece of the restriction
-instead of assuming one can process till the end.
+into two pieces, increasing the available parallelism within the system. 
Please note that different

Review comment:
       Different runners (e.g., Dataflow, Flink, Spark) have different strategies to issue splits under batch and streaming execution.

##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -5324,10 +5368,15 @@ resource utilization.
 A runner at any time may attempt to split a restriction while it is being 
processed. This allows the
 runner to either pause processing of the restriction so that other work may be 
done (common for
 unbounded restrictions to limit the amount of output and/or improve latency) 
or split the restriction
-into two pieces, increasing the available parallelism within the system. It is 
important to author a
-SDF with this in mind since the end of the restriction may change. Thus when 
writing the
-processing loop, it is important to use the result from trying to claim a 
piece of the restriction
-instead of assuming one can process till the end.
+into two pieces, increasing the available parallelism within the system. 
Please note that different
+runners(e.g., Dataflow, Flink, Spark) have different strategies to issue 
splits under batch and
+streaming execution.
+
+It is important to author an SDF with this in mind since the end of the 
restriction may change. Thus
+when writing the processing loop, it is important to use the result from 
trying to claim a piece of
+the restriction instead of assuming one can process till the end.
+
+One bad example could be:

Review comment:
       Replace "bad" with "incorrect." Does this still have the same meaning?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


