Fabian Hueske created FLINK-9673:
------------------------------------
Summary: Improve State efficiency of bounded OVER window operators
Key: FLINK-9673
URL: https://issues.apache.org/jira/browse/FLINK-9673
Project: Flink
Issue Type: Improvement
Components: Table API & SQL
Reporter: Fabian Hueske
Currently, the implementations of bounded OVER window aggregations store the
complete input rows for the bound interval. For example, consider the query:
{code}
SELECT user_id, count(action) OVER (PARTITION BY user_id ORDER BY rowtime RANGE
INTERVAL '14' DAY PRECEDING) action_count, rowtime
FROM (
SELECT rowtime, user_id, action, val1, val2, val3, val4 FROM user
)
{code}
The complete records with schema {{(rowtime, user_id, action, val1, val2, val3,
val4)}} are kept in state for 14 days so that they can be retracted from the
accumulators once they fall out of the window.
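To illustrate the retraction behaviour described above, here is a minimal sketch in plain Python. It is not Flink's operator code: the class name {{BoundedRangeCount}}, the in-memory deque standing in for keyed state, and the assumption that rows of one partition arrive in rowtime order are all made up for illustration.

```python
from collections import deque


class BoundedRangeCount:
    """Hypothetical stand-in for a bounded RANGE OVER count on one partition."""

    def __init__(self, range_ms, action_index):
        self.range_ms = range_ms          # window length, e.g. 14 days in ms
        self.action_index = action_index  # position of `action` in the row
        self.buffer = deque()             # (rowtime, row), oldest first
        self.count = 0                    # the accumulator for count(action)

    def add(self, rowtime, row):
        # Retract every buffered row that is older than the window's lower bound.
        lower_bound = rowtime - self.range_ms
        while self.buffer and self.buffer[0][0] < lower_bound:
            _, old_row = self.buffer.popleft()
            if old_row[self.action_index] is not None:
                self.count -= 1
        # Buffer the whole row so it can be retracted later, then accumulate it.
        self.buffer.append((rowtime, row))
        if row[self.action_index] is not None:
            self.count += 1
        return self.count


# Tiny window of 10 ms instead of 14 days, to keep the numbers readable.
window = BoundedRangeCount(range_ms=10, action_index=2)
results = [
    window.add(t, (t, "u1", "click", 1, 2, 3, 4))
    for t in (0, 5, 10, 11, 30)
]
```

Note that the buffer holds every field of every row for the full interval, although only the {{action}} field is ever read during retraction.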
However, it would be sufficient to store only those fields that are required
for the aggregations, i.e., {{action}} in the example above. All other fields
could be set to {{null}}, which would significantly reduce the amount of data
that needs to be stored in state.
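The proposed pruning could look roughly like the following plain-Python sketch. The helper {{prune_row}} and the constants are hypothetical names, not part of Flink, and the sketch assumes the row's timestamp is tracked separately (for example as the key under which the row is stored), so {{rowtime}} does not have to be kept inside the row itself.

```python
from typing import Optional, Sequence, Tuple

Row = Tuple[Optional[object], ...]


def prune_row(row: Row, required_indices: Sequence[int]) -> Row:
    """Return a copy of `row` with every non-required field replaced by None."""
    required = set(required_indices)
    return tuple(
        value if index in required else None
        for index, value in enumerate(row)
    )


# Schema from the example query.
SCHEMA = ("rowtime", "user_id", "action", "val1", "val2", "val3", "val4")

# count(action) only reads the `action` field.
REQUIRED = [SCHEMA.index("action")]

row = (1530000000000, "u1", "click", 1, 2, 3, 4)
pruned = prune_row(row, REQUIRED)
```

The pruned row keeps its arity, so field indexes used by the aggregation stay valid, while the fields set to {{null}} should serialize far more compactly than the original values.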