[ https://issues.apache.org/jira/browse/KAFKA-12314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281526#comment-17281526 ]
Sagar Rao commented on KAFKA-12314:
-----------------------------------

hey [~ableegoldman], is this something I can take up?

> Leverage custom comparator for optimized range scans on RocksDB
> ---------------------------------------------------------------
>
>                 Key: KAFKA-12314
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12314
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: A. Sophie Blee-Goldman
>            Priority: Major
>
> Currently our SessionStore has poor performance on any range scans due to the
> byte layout and possibility of varyingly sized keys. A session window
> consists of the key and two timestamps, the windowEnd and windowStart. This
> data is formatted as
> [key, windowEnd, windowStart]
> The default comparator in rocksdb is lexicographical, and so it compares
> bytes starting with the key. This means with the above format, the records
> are effectively sorted first by key and then by windowEnd. But if two keys
> are of different lengths, the comparator will start on the left and end up
> comparing the tail bytes of the longer key against the windowEnd timestamp of
> the shorter key. Due to this, we have to set the bounds on SessionStore range
> scans very conservatively, which means we end up reading way more data than
> we need.
> One way out of this would be to use a custom comparator which understands the
> window bytes format we use. So far we haven't done this because of the
> overhead in crossing the JNI with the Java Comparator; we would need a native
> comparator to avoid further performance hit.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
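
For illustration only, the ordering problem described in the issue can be reproduced with a small Python sketch. This is not Kafka Streams or RocksDB code. It assumes each timestamp is serialized as an 8-byte big-endian long, which the description does not state, and session_key, split and format_aware_compare are hypothetical helper names. Python compares bytes objects lexicographically by unsigned byte value, which stands in here for the default bytewise comparator.

```python
import struct
from functools import cmp_to_key

TIMESTAMP_SIZE = 8  # assumption: each timestamp is an 8-byte big-endian long


def session_key(key: bytes, window_end: int, window_start: int) -> bytes:
    """Serialize as [key, windowEnd, windowStart], the layout from the issue."""
    return key + struct.pack(">qq", window_end, window_start)


def split(record: bytes):
    """Split a serialized record back into (key, windowEnd, windowStart)."""
    key = record[:-2 * TIMESTAMP_SIZE]
    window_end, window_start = struct.unpack(">qq", record[-2 * TIMESTAMP_SIZE:])
    return key, window_end, window_start


def format_aware_compare(a: bytes, b: bytes) -> int:
    """Hypothetical comparator that understands the window bytes format:
    order by key first, then windowEnd, then windowStart."""
    left, right = split(a), split(b)
    return (left > right) - (left < right)


records = [
    session_key(b"a", 100, 50),
    session_key(b"a\x00", 51200, 0),  # longer key sharing the prefix b"a"
    session_key(b"a", 300, 200),
]

# Plain lexicographic order: the tail byte of the longer key is compared
# against the windowEnd bytes of the shorter key.
bytewise_order = [split(r) for r in sorted(records)]

# Order produced by the format-aware comparator.
format_aware_order = [
    split(r) for r in sorted(records, key=cmp_to_key(format_aware_compare))
]

print(bytewise_order)
print(format_aware_order)
```

In the bytewise order the record for key b"a\x00" lands between the two sessions of key b"a", so a range scan for b"a" cannot use tight bounds. With the format-aware comparator the two sessions of b"a" are adjacent.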