[
https://issues.apache.org/jira/browse/FLINK-5990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15934068#comment-15934068
]
sunjincheng edited comment on FLINK-5990 at 3/21/17 4:14 AM:
-------------------------------------------------------------
In this JIRA, I'll define two concepts:
1. out-of-order
2. late-event
To support these two concepts, we need to add one configuration option:
1. allowedLateness: a user-defined length of time that specifies how much
data delay is tolerated.
Whether an event is out-of-order or a late-event is determined by the value of
allowedLateness.
e.g.:
allowedLateness = 2
InputData:
```
(1L, 1, "Hello"),
(2L, 2, "Hello"),
(4L, 4, "Hello"),
(3L, 3, "Hello"),
(7L, 7, "Hello"),
(7L, 8, "Hello"),
(5L, 5, "Hello"),
(8L, 8, "Hello World"),
(20L, 20, "Hello World"),
(9L, 9, "Hello World"))
```
`(3L, 3, "Hello")` is out-of-order, because it arrives after timestamp 4 and 4 - 3 = 1 < 2.
`(9L, 9, "Hello World")` is a late-event, because it arrives after timestamp 20 and 20 - 9 = 11 > 2.
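To make the rule concrete, here is a minimal sketch in plain Python (not Flink code). It assumes lateness is measured against the highest timestamp seen so far, which matches the two cases above. The names `classify` and `INPUT_DATA` are hypothetical, and treating a delay exactly equal to allowedLateness as out-of-order is my assumption, since the example only covers `<` and `>`.

```python
from typing import Iterable, List, Optional, Tuple

Event = Tuple[int, int, str]  # (timestamp, value, text)

INPUT_DATA: List[Event] = [
    (1, 1, "Hello"),
    (2, 2, "Hello"),
    (4, 4, "Hello"),
    (3, 3, "Hello"),
    (7, 7, "Hello"),
    (7, 8, "Hello"),
    (5, 5, "Hello"),
    (8, 8, "Hello World"),
    (20, 20, "Hello World"),
    (9, 9, "Hello World"),
]


def classify(events: Iterable[Event], allowed_lateness: int) -> List[Tuple[Event, str]]:
    """Label each event by comparing it with the highest timestamp seen so far."""
    max_ts: Optional[int] = None
    labeled = []
    for event in events:
        ts = event[0]
        if max_ts is None or ts >= max_ts:
            # Not behind any earlier event: it advances (or ties) the maximum.
            label = "in-order"
            max_ts = ts
        elif max_ts - ts <= allowed_lateness:
            # Behind the maximum, but within the configured tolerance
            # (the equality case is an assumption, see above).
            label = "out-of-order"
        else:
            # Behind the maximum by more than the configured tolerance.
            label = "late-event"
        labeled.append((event, label))
    return labeled


for event, label in classify(INPUT_DATA, allowed_lateness=2):
    print(event, label)
```

A real operator would more likely compare against the watermark than against a running maximum; the running maximum is used here only because it reproduces the arithmetic in the example.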
What do you think, @fhueske?
> Add event time OVER ROWS BETWEEN x PRECEDING aggregation to SQL
> ---------------------------------------------------------------
>
> Key: FLINK-5990
> URL: https://issues.apache.org/jira/browse/FLINK-5990
> Project: Flink
> Issue Type: Sub-task
> Components: Table API & SQL
> Reporter: sunjincheng
> Assignee: sunjincheng
>
> The goal of this issue is to add support for OVER ROWS aggregations on event
> time streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT
>   a,
>   SUM(b) OVER (PARTITION BY c ORDER BY rowTime() ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS sumB,
>   MIN(b) OVER (PARTITION BY c ORDER BY rowTime() ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS minB
> FROM myStream
> {code}
> The following restrictions should initially apply:
> - All OVER clauses in the same SELECT clause must be exactly the same.
> - The PARTITION BY clause is optional (no partitioning results in
> single-threaded execution).
> - The ORDER BY clause may only have rowTime() as parameter. rowTime() is a
> parameterless scalar function that just indicates event time mode.
> - UNBOUNDED PRECEDING is not supported (see FLINK-5658)
> - FOLLOWING is not supported.
> The restrictions will be resolved in follow-up issues. If we find that some
> of the restrictions are trivial to address, we can add the functionality in
> this issue as well.
> This issue includes:
> - Design of the DataStream operator to compute OVER ROWS aggregates
> - Translation from Calcite's RelNode representation (LogicalProject with
> RexOver expression).
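
For reference, the result the example query is expected to produce can be sketched in plain Python (not Flink or Calcite code). This assumes the complete input is available and can be sorted by event time up front, which a streaming operator cannot do; it would have to buffer rows until it knows no earlier row can still arrive. The function name `over_rows` and the row layout `(row_time, a, b, c)` are hypothetical.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, List, Tuple

Row = Tuple[int, str, int, str]  # (row_time, a, b, c)


def over_rows(rows: Iterable[Row], preceding: int = 2) -> List[Tuple[str, int, int]]:
    """Compute SUM(b) and MIN(b) over ROWS BETWEEN `preceding` PRECEDING AND CURRENT ROW.

    Rows are partitioned by c and ordered by row_time. Returns (a, sumB, minB)
    tuples in event-time order.
    """
    # Stable sort: rows with equal timestamps keep their arrival order.
    ordered = sorted(rows, key=lambda r: r[0])
    # One bounded buffer per partition key; deque(maxlen=...) drops the
    # oldest value automatically once the window is full.
    windows: Dict[str, Deque[int]] = defaultdict(lambda: deque(maxlen=preceding + 1))
    result = []
    for _row_time, a, b, c in ordered:
        window = windows[c]
        window.append(b)
        result.append((a, sum(window), min(window)))
    return result
```

Recomputing `sum` and `min` over the buffer on every row keeps the sketch short; an actual operator would probably maintain incremental accumulators instead.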