Fabian Hueske created FLINK-7799:
------------------------------------

             Summary: Improve performance of windowed joins
                 Key: FLINK-7799
                 URL: https://issues.apache.org/jira/browse/FLINK-7799
             Project: Flink
          Issue Type: Sub-task
          Components: Table API & SQL
    Affects Versions: 1.4.0
            Reporter: Fabian Hueske
            Priority: Critical


The performance of windowed joins can be improved by changing the state access 
patterns.
Right now, rows are inserted into a MapState with their timestamp as key. Since 
we use a time resolution of 1ms, this means that the full key space of the 
state must be iterated and many map entries must be accessed when joining or 
evicting rows. 

A better strategy would be to block the time into larger intervals and register 
the rows in their respective interval. Another benefit would be that we can 
directly access the state entries because we know exactly which timestamps to 
look up. Hence, we can limit the state access to the relevant section during 
joining and state eviction. 

The good size for intervals needs to be identified and might depend on the size 
of the window.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to