[
https://issues.apache.org/jira/browse/STORM-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15056720#comment-15056720
]
ASF GitHub Bot commented on STORM-1187:
---------------------------------------
Github user Parth-Brahmbhatt commented on a diff in the pull request:
https://github.com/apache/storm/pull/900#discussion_r47557478
--- Diff: docs/documentation/Windowing.md ---
@@ -126,6 +126,96 @@ Time duration based tumbling window that tumbles after the specified time durati
```
+## Tuple timestamp and out of order tuples
+By default the timestamp tracked in the window is the time when the tuple is processed by the bolt. The window calculations
+are performed based on the processing timestamp. Storm has support for tracking windows based on the source generated timestamp.
+
+```java
+/**
+* Specify a field in the tuple that represents the timestamp as a long value. If this
+* field is not present in the incoming tuple, an {@link IllegalArgumentException} will be thrown.
+*
+* @param fieldName the name of the field that contains the timestamp
+*/
+public BaseWindowedBolt withTimestampField(String fieldName)
+```
+
+The value for the above `fieldName` will be looked up from the incoming tuple and considered for windowing calculations.
+If the field is not present in the tuple an exception will be thrown. Along with the timestamp field name, a time lag parameter
+can also be specified which indicates the max time limit for tuples with out of order timestamps.
+
+E.g. If the lag is 5 secs and a tuple `t1` arrived with timestamp `06:00:05` no tuples may arrive with tuple timestamp earlier than `06:00:00`. If a tuple
+arrives with timestamp 05:59:59 after `t1` and the window has moved past `t1`, it will be treated as a late tuple and not processed.
--- End diff --
Let's also document how users can find out the number of discarded tuples.
In many cases it may also be useful to provide a handler for tuples that are
discarded, but I am fine with not including that in this patch.
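To make the suggestion concrete, here is a rough sketch of what counting discarded tuples and handing them to a callback could look like. It is plain Java and not Storm code: `LateTupleTracker`, `accept`, and `getDiscardedCount` are hypothetical names used only for illustration, and the lateness check is simplified to "timestamp older than the highest timestamp seen minus the lag", ignoring whether the window has actually moved past the earlier tuple.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongConsumer;

/**
 * Hypothetical helper (not part of Storm or of this patch) that counts
 * tuples whose timestamps fall behind the allowed lag and passes them
 * to a user supplied handler.
 */
public class LateTupleTracker {
    private final long maxLagMs;
    private final LongConsumer lateHandler;
    private long maxTimestampSeen = Long.MIN_VALUE;
    private long discardedCount = 0;

    public LateTupleTracker(long maxLagMs, LongConsumer lateHandler) {
        this.maxLagMs = maxLagMs;
        this.lateHandler = lateHandler;
    }

    /** Returns true if the timestamp is within the lag, false if it is treated as late. */
    public boolean accept(long timestampMs) {
        // Simplified assumption: a tuple is late once it is older than
        // (highest timestamp seen so far - lag).
        if (maxTimestampSeen != Long.MIN_VALUE
                && timestampMs < maxTimestampSeen - maxLagMs) {
            discardedCount++;
            lateHandler.accept(timestampMs);
            return false;
        }
        maxTimestampSeen = Math.max(maxTimestampSeen, timestampMs);
        return true;
    }

    public long getDiscardedCount() {
        return discardedCount;
    }

    public static void main(String[] args) {
        List<Long> late = new ArrayList<>();
        LateTupleTracker tracker = new LateTupleTracker(5_000L, ts -> late.add(ts));

        long t1 = 21_605_000L;        // 06:00:05 as milliseconds since midnight
        tracker.accept(t1);           // within the lag
        tracker.accept(t1 - 5_000L);  // 06:00:00, exactly on the boundary
        tracker.accept(t1 - 6_000L);  // 05:59:59, older than the lag allows

        System.out.println("discarded=" + tracker.getDiscardedCount() + " late=" + late);
    }
}
```

Whether the count would be exposed as a metric, a log line, or a callback like the one above is an open design question for the patch; the sketch only shows the bookkeeping.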
> Support for late and out of order events in time based windows
> --------------------------------------------------------------
>
> Key: STORM-1187
> URL: https://issues.apache.org/jira/browse/STORM-1187
> Project: Apache Storm
> Issue Type: Sub-task
> Reporter: Arun Mahadevan
> Assignee: Arun Mahadevan
>
> Right now the time based windows use the timestamp at which the tuple is
> received by the bolt.
> However, there are use cases where tuples should be processed based on the
> time when they were actually generated rather than the time when they were
> received. So we need to add support for processing events with a time lag,
> and also have some way to specify and read tuple timestamps.