A. Sophie Blee-Goldman created KAFKA-12314:
----------------------------------------------

             Summary: Leverage custom comparator for optimized range scans on 
RocksDB
                 Key: KAFKA-12314
                 URL: https://issues.apache.org/jira/browse/KAFKA-12314
             Project: Kafka
          Issue Type: Improvement
            Reporter: A. Sophie Blee-Goldman


Currently our SessionStore performs poorly on range scans because of its byte 
layout and the possibility of variable-length keys. A session window consists 
of the key and two timestamps, windowEnd and windowStart. This data is 
formatted as

[key, windowEnd, windowStart]

The default comparator in RocksDB is lexicographical, so it compares bytes 
starting with the key. With the above format, records are effectively sorted 
first by key and then by windowEnd. But if two keys have different lengths, 
the comparator starts on the left and ends up comparing the tail bytes of the 
longer key against the windowEnd timestamp of the shorter key. As a result, we 
have to set the bounds on SessionStore range scans very conservatively, which 
means we end up reading far more data than we need.

One way out of this would be to use a custom comparator that understands the 
window byte format we use. So far we haven't done this because of the overhead 
of crossing the JNI boundary with a Java Comparator; we would need a native 
comparator to avoid a further performance hit.
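
As a rough sketch of the ordering such a comparator might implement, the 
following plain-Java Comparator splits off the fixed-size timestamp suffix and 
compares the key bytes on their own before looking at windowEnd and 
windowStart. This is only an illustration of the logic: the class name and the 
encode helper are hypothetical, the 8-byte big-endian timestamps are an 
assumption, and nothing here is wired into RocksDB. As noted above, a real 
implementation would need to be native to avoid the JNI overhead.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SessionKeyComparator implements Comparator<byte[]> {
    // Assumed suffix: windowEnd and windowStart, 8 bytes each.
    static final int SUFFIX_BYTES = 2 * Long.BYTES;

    // Hypothetical helper mirroring the [key, windowEnd, windowStart] layout.
    static byte[] encode(byte[] key, long windowEnd, long windowStart) {
        return ByteBuffer.allocate(key.length + SUFFIX_BYTES)
                .put(key)
                .putLong(windowEnd)
                .putLong(windowStart)
                .array();
    }

    @Override
    public int compare(byte[] a, byte[] b) {
        int aKeyLen = a.length - SUFFIX_BYTES;
        int bKeyLen = b.length - SUFFIX_BYTES;

        // 1. Compare only the key bytes, so timestamps never mix with keys.
        int c = Arrays.compareUnsigned(a, 0, aKeyLen, b, 0, bKeyLen);
        if (c != 0) {
            return c;
        }
        // 2. Same key: order by windowEnd (assumed non-negative).
        long aEnd = ByteBuffer.wrap(a, aKeyLen, Long.BYTES).getLong();
        long bEnd = ByteBuffer.wrap(b, bKeyLen, Long.BYTES).getLong();
        c = Long.compare(aEnd, bEnd);
        if (c != 0) {
            return c;
        }
        // 3. Same key and windowEnd: order by windowStart.
        long aStart = ByteBuffer.wrap(a, aKeyLen + Long.BYTES, Long.BYTES).getLong();
        long bStart = ByteBuffer.wrap(b, bKeyLen + Long.BYTES, Long.BYTES).getLong();
        return Long.compare(aStart, bStart);
    }

    public static void main(String[] args) {
        byte[] shortKey = {'a'};
        byte[] longKey = {'a', 0, 0, 0, 0, 0, 0, 0, 2};

        List<byte[]> records = new ArrayList<>();
        records.add(encode(longKey, 2L, 0L));
        records.add(encode(shortKey, 3L, 0L));
        records.add(encode(shortKey, 1L, 0L));
        records.sort(new SessionKeyComparator());

        // Print which key each record belongs to, in sorted order.
        for (byte[] r : records) {
            System.out.println(r.length == shortKey.length + SUFFIX_BYTES ? "short key" : "long key");
        }
    }
}
```

Under this ordering all records for a given key are adjacent and sorted by 
windowEnd, so a range scan could in principle use exact bounds for that key.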



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
