[
https://issues.apache.org/jira/browse/FLINK-6219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950388#comment-15950388
]
Xiaogang Shi commented on FLINK-6219:
-------------------------------------
I prefer to use sorted states (e.g., {{SortedMapState}}) rather than a new
state backend to address the described problem. Some users have mentioned
similar demands for sorted states. Hence I think we should provide them to
facilitate the development of user applications.
Implementing such sorted states, however, may be very challenging. In
{{HeapStateBackend}}, we need a data structure that supports both
Copy-on-Write (for asynchronous snapshotting) and sorting. In
{{RocksDBStateBackend}}, we need to find an efficient way to support
customized sorting. Though RocksDBJava allows customized comparators,
performance degrades significantly once a customized comparator is used
(QPS falls to approximately 1/3 - 1/15).
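To illustrate the comparator concern, here is a minimal sketch of one commonly used workaround that this comment does not itself propose: encode keys so that plain bytewise ordering (RocksDB's default) already matches the desired order, so no customized comparator is needed. The class and method names are hypothetical, and the sketch only covers signed {{long}} keys such as timestamps.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Flink or RocksDBJava.
final class OrderedKeyEncoding {

    // Encode a signed long so that unsigned, lexicographic byte order
    // equals numeric order. ByteBuffer writes big-endian by default.
    static byte[] encode(long value) {
        // Flipping the sign bit moves negative values below positive ones
        // when the bytes are compared as unsigned.
        return ByteBuffer.allocate(Long.BYTES).putLong(value ^ Long.MIN_VALUE).array();
    }

    // Inverse of encode().
    static long decode(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong() ^ Long.MIN_VALUE;
    }

    // Unsigned lexicographic comparison, mimicking a bytewise comparator.
    static int compareBytewise(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }
}
```

This only helps when the desired order can be expressed through the key encoding; arbitrary user-defined orderings would still need a comparator.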
It's critical to address the problems mentioned above. Otherwise, a
{{ValueState}} whose value is typed {{SortedMap}} is the better way to sort
user data under the same key.
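A minimal sketch of that fallback, in plain Java so it runs without Flink: the {{TreeMap}} below stands in for the {{SortedMap}} that would be read from and written back to a {{ValueState}} for one key. The class name, method names, and the row representation as strings are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical stand-in for the value kept in a ValueState for a single key.
final class SortedValueSketch {

    // Rows grouped by timestamp; TreeMap keeps timestamps in ascending order.
    private final TreeMap<Long, List<String>> sorted = new TreeMap<>();

    // Insert a row at its timestamp; arrival order does not matter.
    void put(long timestamp, String row) {
        sorted.computeIfAbsent(timestamp, t -> new ArrayList<>()).add(row);
    }

    // Remove rows from the head and return them in time order,
    // stopping at the first timestamp greater than upTo.
    List<String> drainUpTo(long upTo) {
        List<String> out = new ArrayList<>();
        while (!sorted.isEmpty() && sorted.firstKey() <= upTo) {
            out.addAll(sorted.pollFirstEntry().getValue());
        }
        return out;
    }
}
```

With the rows from the issue description inserted in the order 2, 1, 5, 4, calling {{drainUpTo(4L)}} yields the rows for 1, 2 and 4 in that order and leaves 5 in the map. In a real job the map would additionally have to be written back to the state after each change.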
> Add a state backend which supports sorting
> ------------------------------------------
>
> Key: FLINK-6219
> URL: https://issues.apache.org/jira/browse/FLINK-6219
> Project: Flink
> Issue Type: New Feature
> Components: State Backends, Checkpointing, Table API & SQL
> Reporter: sunjincheng
>
> While implementing the OVER window of
> [FLIP11|https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%3A+Table+API+Stream+Aggregations],
> we noticed that we need a state backend which supports sorting and allows for
> efficient insertion, in-order traversal, and removal from the head.
> For example, in an event-time OVER window we need to sort by time. If the
> data is as follows:
> {code}
> (1L, 1, Hello)
> (2L, 2, Hello)
> (5L, 5, Hello)
> (4L, 4, Hello)
> {code}
> We insert the data in arbitrary order, for example:
> {code}
> put((2L, 2, Hello)), put((1L, 1, Hello)), put((5L, 5, Hello)), put((4L, 4, Hello))
> {code}
> We deal with elements in time order:
> {code}
> process((1L, 1, Hello)), process((2L, 2, Hello)), process((4L, 4, Hello)), process((5L, 5, Hello))
> {code}
> Feedback from anyone is welcome. What do you think? [~xiaogang.shi]
> [~aljoscha] [~fhueske]
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)