[ https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=291472&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-291472 ]
ASF GitHub Bot logged work on BEAM-7389:
----------------------------------------

Author: ASF GitHub Bot
Created on: 08/Aug/19 17:56
Start Date: 08/Aug/19 17:56
Worklog Time Spent: 10m
Work Description: rosetn commented on pull request #9267: [BEAM-7389] Add code examples for WithTimestamps page
URL: https://github.com/apache/beam/pull/9267#discussion_r312165421

##########
File path: website/src/documentation/transforms/python/element-wise/withtimestamps.md
##########
@@ -19,10 +19,116 @@ limitations under the License.
 -->

 # WithTimestamps
+
+<script type="text/javascript">
+localStorage.setItem('language', 'language-py')
+</script>
+
 Assigns timestamps to all the elements of a collection.

 ## Examples
-See [BEAM-7389](https://issues.apache.org/jira/browse/BEAM-7389) for updates.
-## Related transforms
-* [Reify]({{ site.baseurl }}/documentation/transforms/python/elementwise/reify) converts between explicit and implicit forms of Beam values.
\ No newline at end of file
+In the following examples, we create a pipeline with a `PCollection` and attach a timestamp value to each of its elements.
+Timestamps are especially useful on streaming pipelines where windowing and late data play a more important role.
+
+### Example 1: Timestamp by event time
+
+Often times, the elements themselves already contain a timestamp field that can be used.
+`beam.window.TimestampedValue` will take a value and a timestamp in the form of seconds, where a unix timestamp can be used.
+
+```py
+{% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/with_timestamps.py tag:event_time %}```
+
+Output `PCollection` after getting the timestamps:
+
+```
+{% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/with_timestamps_test.py tag:plant_timestamps %}```
+
+<table>
+  <td>
+    <a class="button" target="_blank"
+        href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/with_timestamps.py">
+      <img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png"
+        width="20px" height="20px" alt="View on GitHub" />
+      View on GitHub
+    </a>
+  </td>
+</table>
+<br>
+
+To convert from a
+[`time.struct_time`](https://docs.python.org/3/library/time.html#time.struct_time)
+to `unix_time` you can use
+[`time.mktime`](https://docs.python.org/3/library/time.html#time.mktime).
+For more information on time formatting options, see
+[`time.strftime`](https://docs.python.org/3/library/time.html#time.strftime).
+
+```
+{% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/with_timestamps.py tag:time_tuple2unix_time %}```
+
+To convert from a
+[`datetime.datetime`](https://docs.python.org/3/library/datetime.html#datetime.datetime)
+to `unix_time` you can use convert it to a `time.struct_time` first with
+[`datetime.timetuple`](https://docs.python.org/3/library/datetime.html#datetime.datetime.timetuple).
+
+```
+{% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/with_timestamps.py tag:datetime2unix_time %}```
+
+### Example 2: Timestamp by logical clock
+
+If the elements have a chronological number, those can be used as a
+[logical clock](https://en.wikipedia.org/wiki/Logical_clock).
+They have to be converted to a *"seconds"* equivalent, which can be especially important depending on your windowing and late data rules.
+
+```py
+{% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/with_timestamps.py tag:logical_clock %}```
+
+Output `PCollection` after getting the timestamps:
+
+```
+{% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/with_timestamps_test.py tag:plant_events %}```
+
+<table>
+  <td>
+    <a class="button" target="_blank"
+        href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/with_timestamps.py">
+      <img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png"
+        width="20px" height="20px" alt="View on GitHub" />
+      View on GitHub
+    </a>
+  </td>
+</table>
+<br>
+
+### Example 3: Timestamp by processing time
+
+If the elements do not have any time data available, you can also use the current processing time for each element.
+Note that this will grab the local time of the *worker* that is processing each element.

Review comment:
"will grab"->"grabs"

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 291472)
Time Spent: 38h (was: 37h 50m)

> Colab examples for element-wise transforms (Python)
> ---------------------------------------------------
>
> Key: BEAM-7389
> URL: https://issues.apache.org/jira/browse/BEAM-7389
> Project: Beam
> Issue Type: Improvement
> Components: website
> Reporter: Rose Nguyen
> Assignee: David Cavazos
> Priority: Minor
> Time Spent: 38h
> Remaining Estimate: 0h
>

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
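
For readers of this thread: the quoted diff pulls its conversion snippets in through `github_sample` tags, so the code itself is not visible in the message. The following is a minimal standard-library sketch of the two conversions the diff describes (`time.struct_time` to unix time via `time.mktime`, and `datetime.datetime` to unix time via `datetime.timetuple`). It is not the code from the pull request; the helper names `struct_time_to_unix` and `datetime_to_unix` and the sample date are assumptions made for illustration.

```python
import time
from datetime import datetime


def struct_time_to_unix(time_tuple):
    """Convert a local-time time.struct_time to seconds since the epoch."""
    # time.mktime interprets the tuple in the local timezone of the machine.
    return time.mktime(time_tuple)


def datetime_to_unix(dt):
    """Convert a naive local datetime to seconds since the epoch."""
    # timetuple() drops microseconds, so the result has whole-second precision.
    return time.mktime(dt.timetuple())


# The same wall-clock moment expressed in both forms (hypothetical sample date).
time_tuple = time.strptime('2020-04-01 12:00:00', '%Y-%m-%d %H:%M:%S')
dt = datetime(2020, 4, 1, 12, 0, 0)

unix_from_tuple = struct_time_to_unix(time_tuple)
unix_from_datetime = datetime_to_unix(dt)
```

Because `time.mktime` uses the local timezone, the numeric result depends on where the code runs, which echoes the diff's note about processing time coming from the worker. For timestamps that are already in UTC, `calendar.timegm` may be the more appropriate conversion.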