[ https://issues.apache.org/jira/browse/FLINK-5653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15937910#comment-15937910 ]

ASF GitHub Bot commented on FLINK-5653:
---------------------------------------

Github user fhueske commented on the issue:

    https://github.com/apache/flink/pull/3574
  
    Hi @huawei-flink, let me explain the idea of using `MapState` and its 
benefits in more detail.
    
    I'll start with how `ListState` works. With `ListState` we get efficient
access to the head element of the list. However, when updating the `ListState`,
we cannot remove individual elements but have to clear the complete state and
reinsert all elements that should remain. Hence, every update deserializes and
serializes all elements of the `ListState`.
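
    As an illustration, here is a minimal Scala sketch (an assumption of mine,
not the code of this PR) of what retracting the oldest row from a
`ListState[Row]` looks like; the helper `retractOldest` and its placement in a
keyed function are hypothetical:

    ```scala
    import scala.collection.JavaConverters._
    import org.apache.flink.api.common.state.ListState
    import org.apache.flink.types.Row

    // Sketch only: remove and return the oldest buffered row.
    // With ListState, every buffered Row is deserialized and re-serialized.
    def retractOldest(rowState: ListState[Row]): Option[Row] = {
      val rows = Option(rowState.get()).map(_.asScala.toList).getOrElse(Nil) // reads all rows
      rows match {
        case oldest :: remaining =>
          rowState.clear()                  // a single element cannot be dropped ...
          remaining.foreach(rowState.add)   // ... so all remaining rows are written back
          Some(oldest)
        case Nil => None
      }
    }
    ```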
    
    With the `MapState` approach, we would put the elements into a map that is
keyed on their processing timestamp. Since multiple records can arrive within
the same millisecond, we use a `List[Row]` as value type for the map. To
process a new row, we have to find the "oldest" row (i.e., the one with the
smallest timestamp) and retract it from the accumulator. With `ListState` this
is trivial: it is the head element. With `MapState` we have to iterate over the
keys and find the smallest one (the smallest processing timestamp). This
requires deserializing all keys, but these are only `Long` values and not
complete rows. With the smallest key, we can get the `List[Row]` value, take
the first row from the list, and retract it from the accumulator. When updating
the state, we only update the `List[Row]` value of the smallest key (or possibly
remove the entry if its `List[Row]` becomes empty).
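
    A corresponding sketch for the `MapState` variant (again an assumption, not
the PR code): `rowMapState` is assumed to map each processing timestamp to the
rows that arrived in that millisecond, so only the `Long` keys are scanned and
a single `List[Row]` entry is read and written back:

    ```scala
    import java.util.{List => JList}
    import scala.collection.JavaConverters._
    import org.apache.flink.api.common.state.MapState
    import org.apache.flink.types.Row

    // Sketch only: remove and return the oldest buffered row.
    // Only the Long keys are deserialized; one List[Row] value is read/written.
    def retractOldest(rowMapState: MapState[java.lang.Long, JList[Row]]): Option[Row] = {
      val timestamps = rowMapState.keys().asScala.map(_.longValue()).toList // Long keys only
      if (timestamps.isEmpty) {
        None
      } else {
        val smallestTs: java.lang.Long = timestamps.min // oldest processing timestamp
        val rows = rowMapState.get(smallestTs)          // read a single List[Row] value
        val oldest = rows.remove(0)                     // assumes a mutable list (e.g. ArrayList)
        if (rows.isEmpty) {
          rowMapState.remove(smallestTs)                // the entry became empty -> drop it
        } else {
          rowMapState.put(smallestTs, rows)             // write back only this one entry
        }
        Some(oldest)
      }
    }
    ```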
    
    So the benefit of using `MapState` instead of `ListState` is that we only
read `n` `Long` keys (plus read/write one `List[Row]` value) instead of reading
and writing `n` `Row` values.
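
    To put rough, purely illustrative numbers on it: assuming 1,000 buffered
rows of ~100 serialized bytes each, the `ListState` approach moves about 200 KB
(read + write) per incoming record, while the `MapState` approach reads about
8 KB of `Long` keys plus a single small `List[Row]` entry.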


> Add processing time OVER ROWS BETWEEN x PRECEDING aggregation to SQL
> --------------------------------------------------------------------
>
>                 Key: FLINK-5653
>                 URL: https://issues.apache.org/jira/browse/FLINK-5653
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API & SQL
>            Reporter: Fabian Hueske
>            Assignee: Stefano Bortoli
>
> The goal of this issue is to add support for OVER ROWS aggregations on 
> processing time streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT 
>   a, 
>   SUM(b) OVER (PARTITION BY c ORDER BY procTime() ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS sumB,
>   MIN(b) OVER (PARTITION BY c ORDER BY procTime() ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS minB
> FROM myStream
> {code}
> The following restrictions should initially apply:
> - All OVER clauses in the same SELECT clause must be exactly the same.
> - The PARTITION BY clause is optional (no partitioning results in single-threaded 
> execution).
> - The ORDER BY clause may only have procTime() as parameter. procTime() is a 
> parameterless scalar function that just indicates processing time mode.
> - UNBOUNDED PRECEDING is not supported (see FLINK-5656)
> - FOLLOWING is not supported.
> The restrictions will be resolved in follow-up issues. If we find that some 
> of the restrictions are trivial to address, we can add the functionality in 
> this issue as well.
> This issue includes:
> - Design of the DataStream operator to compute OVER ROW aggregates
> - Translation from Calcite's RelNode representation (LogicalProject with 
> RexOver expression).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
