Github user huawei-flink commented on the issue:
https://github.com/apache/flink/pull/3574
@fhueske Thanks a lot for the clarification. I understand the issue better
now, and I see that you are aiming for a design that works reasonably well both
in memory and on external persistence. Considering RocksDB as the state of the
art, your choice sounds much more reasonable. We are well aware of the cost of
serialization, and its impact is definitely significant. However, low-latency
systems with strict SLAs will likely run purely in memory.
The O(n) of the MapState is guaranteed by the fact that time is monotonic, so
the sequential reading is driven by the timestamp key. The cost of each
nominally O(1) lookup in the hash map increases with the size of the window,
though, as you need to search through the map index. We definitely need better
data access patterns for state that holds "time series" types of data.
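
To make the access pattern I have in mind concrete, here is a minimal sketch in plain Java. It is not Flink code and does not use MapState: `TimeOrderedBuffer` and its methods are hypothetical names, and a `TreeMap` simply stands in for a state that is kept sorted by timestamp, so that expired rows can be read and dropped in time order without a per-key lookup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical time-ordered buffer for illustration only; not a Flink API. */
public class TimeOrderedBuffer {
    // Rows grouped by event timestamp; TreeMap keeps the keys sorted.
    private final TreeMap<Long, List<String>> rowsByTime = new TreeMap<>();

    /** Adds a row under its timestamp: O(log n) in the number of distinct timestamps. */
    public void add(long timestamp, String row) {
        rowsByTime.computeIfAbsent(timestamp, t -> new ArrayList<>()).add(row);
    }

    /** Removes and returns, in time order, all rows with timestamp strictly below lowerBound. */
    public List<String> evictOlderThan(long lowerBound) {
        List<String> evicted = new ArrayList<>();
        // headMap returns a live view of the entries whose key is < lowerBound.
        NavigableMap<Long, List<String>> expired = rowsByTime.headMap(lowerBound, false);
        for (List<String> rows : expired.values()) {
            evicted.addAll(rows);
        }
        // Clearing the view also removes those entries from the backing map.
        expired.clear();
        return evicted;
    }

    /** Number of distinct timestamps still buffered. */
    public int distinctTimestamps() {
        return rowsByTime.size();
    }

    public static void main(String[] args) {
        TimeOrderedBuffer buffer = new TimeOrderedBuffer();
        buffer.add(3L, "c");
        buffer.add(1L, "a");
        buffer.add(2L, "b");
        buffer.add(1L, "a2");
        System.out.println(buffer.evictOlderThan(3L));   // prints [a, a2, b]
        System.out.println(buffer.distinctTimestamps()); // prints 1
    }
}
```

The trade-off is that inserts become O(log n) instead of hash-map O(1), in exchange for an ordered scan over the expired range. Whether something similar is feasible on top of the existing state backends is exactly the open question.
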
I will try to internalize it and provide the MapState implementation.