[ https://issues.apache.org/jira/browse/BEAM-10676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maximilian Michels updated BEAM-10676:
--------------------------------------
    Description: 
By default, the Python SDK sets a timer's output timestamp to the timestamp of 
the current element. This is problematic because:

1. The output watermark is held back at the current element's timestamp for 
every timer.
2. It doesn't match the behavior of the Java SDK, which defaults to using the 
fire timestamp as the timer output timestamp (and adds a hold on it).
3. Users cannot influence this behavior because there is no user-facing API.

https://github.com/apache/beam/blob/dfadde2d3ee0a0487362dbcca80388fdc2ef2302/sdks/python/apache_beam/runners/worker/bundle_processor.py#L650

We should use the fire timestamp as the default output timestamp.
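
To illustrate the difference, here is a minimal toy model in plain Python. It 
does not use any Beam API. The function name and the sample timestamps are 
made up for illustration, and it assumes only that the output watermark cannot 
advance past the earliest active hold.

```python
from typing import Iterable


def output_watermark(input_watermark: int, holds: Iterable[int]) -> int:
    """Toy model: the output watermark cannot pass the earliest active hold."""
    return min([input_watermark, *holds])


# Hypothetical event-time values, chosen only for this example.
element_timestamp = 10  # timestamp of the element that set the timer
fire_timestamp = 70     # event time at which the timer is due to fire
input_watermark = 50    # the timer has not fired yet

# Current Python default: the hold sits at the element's timestamp.
held_at_element = output_watermark(input_watermark, [element_timestamp])

# Proposed default (matching the Java SDK): the hold sits at the fire timestamp.
held_at_fire = output_watermark(input_watermark, [fire_timestamp])

print(held_at_element, held_at_fire)
```

In this sketch the hold at the element's timestamp keeps the output watermark 
at the element's timestamp until the timer fires, while the hold at the fire 
timestamp lets the output watermark follow the input watermark.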

> Timers by default add a hold on the input timestamp
> ---------------------------------------------------
>
>                 Key: BEAM-10676
>                 URL: https://issues.apache.org/jira/browse/BEAM-10676
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core, sdk-py-harness
>            Reporter: Maximilian Michels
>            Assignee: Maximilian Michels
>            Priority: P2


