[ 
https://issues.apache.org/jira/browse/BEAM-6695?focusedWorklogId=223210&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223210
 ]

ASF GitHub Bot logged work on BEAM-6695:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Apr/19 19:29
            Start Date: 04/Apr/19 19:29
    Worklog Time Spent: 10m 
      Work Description: ttanay commented on issue #8206: [BEAM-6695] Latest 
PTransform for Python SDK
URL: https://github.com/apache/beam/pull/8206#issuecomment-480030557
 
 
   **Observation from working on this:**
   In the Java SDK, a PCollection of TimestampedValues can be created with a 
ParDo whose DoFn wraps each value and its timestamp in a TimestampedValue 
object.
   This is not possible in the Python SDK: when the runner evaluates the 
ParDo, it converts each TimestampedValue element into a WindowedValue that 
carries the same value and timestamp.
   The value stored in that WindowedValue is the unwrapped inner value. I 
think it should instead be the TimestampedValue object itself.
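
   A minimal sketch of the unwrapping described above, in plain Python. The 
two namedtuples and the helper functions are simplified stand-ins written for 
illustration; they are not Beam's classes or the runner's actual code, and 
windowing is left out entirely.

```python
from collections import namedtuple

# Simplified stand-ins for illustration only; these are NOT Beam's classes.
TimestampedValue = namedtuple("TimestampedValue", ["value", "timestamp"])
WindowedValue = namedtuple("WindowedValue", ["value", "timestamp"])


def add_timestamp(element):
    """Plays the role of a DoFn's process(): wraps a value with its timestamp."""
    value, timestamp = element
    yield TimestampedValue(value, timestamp)


def wrap_output(result, input_timestamp=None):
    """Models the conversion: the WindowedValue receives the inner value and
    timestamp, not the TimestampedValue object itself."""
    if isinstance(result, TimestampedValue):
        return WindowedValue(result.value, result.timestamp)
    # Any other output keeps the input element's timestamp.
    return WindowedValue(result, input_timestamp)


def run_pardo(elements):
    """Applies add_timestamp to every element and wraps each output."""
    return [wrap_output(out) for e in elements for out in add_timestamp(e)]


outputs = run_pardo([("a", 10), ("b", 20)])

# What a downstream step would receive as its elements.
downstream_elements = [wv.value for wv in outputs]
```

   In this model the downstream elements are the bare values, so a later 
step never sees a TimestampedValue instance, which matches the behaviour 
reported for the Python SDK.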
   
   I have a PR on my fork with a failing test to illustrate this: 
https://github.com/ttanay/beam/pull/6
   
   I'm not sure a PCollection of TimestampedValues would be of much use; I 
found a (value, timestamp) tuple to be a workable replacement. 
   I don't know, though, whether other use cases might need it. 
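
   As a rough illustration of the tuple approach, the sketch below picks the 
latest value from (value, timestamp) tuples. The class name 
`LatestTupleCombine` and the `combine` helper are hypothetical; the class is 
plain Python that only mirrors the CombineFn method names, and it is not the 
code from the PR.

```python
class LatestTupleCombine:
    """Hypothetical sketch: keeps the (value, timestamp) tuple with the
    greatest timestamp. Mirrors the CombineFn method names only."""

    def create_accumulator(self):
        return None  # no element seen yet

    def add_input(self, accumulator, element):
        # element is a (value, timestamp) tuple
        if accumulator is None or element[1] > accumulator[1]:
            return element
        return accumulator

    def merge_accumulators(self, accumulators):
        merged = self.create_accumulator()
        for acc in accumulators:
            if acc is not None:
                merged = self.add_input(merged, acc)
        return merged

    def extract_output(self, accumulator):
        # Return only the value; None if no element was seen.
        return None if accumulator is None else accumulator[0]


def combine(fn, elements):
    """Local driver standing in for a runner; for illustration only."""
    acc = fn.create_accumulator()
    for element in elements:
        acc = fn.add_input(acc, element)
    return fn.extract_output(acc)


latest = combine(LatestTupleCombine(), [("a", 10), ("c", 30), ("b", 20)])
```

   Because the timestamp travels inside the element, this does not depend on 
how the runner wraps ParDo outputs.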
   
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 223210)
    Time Spent: 2h  (was: 1h 50m)

> Latest transform for Python SDK
> -------------------------------
>
>                 Key: BEAM-6695
>                 URL: https://issues.apache.org/jira/browse/BEAM-6695
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py-core
>            Reporter: Ahmet Altay
>            Assignee: Tanay Tummalapalli
>            Priority: Minor
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> Add a PTransform and Combine.CombineFn for computing the latest element in a 
> PCollection.
> It should offer the same API as its Java counterpart: 
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Latest.java



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
