Michael Viamari created KAFKA-9551:
--------------------------------------

             Summary: Alternate WindowKeySchema Implementations
                 Key: KAFKA-9551
                 URL: https://issues.apache.org/jira/browse/KAFKA-9551
             Project: Kafka
          Issue Type: Improvement
            Reporter: Michael Viamari


Currently, the {{WindowKeySchema}} used by all {{WindowStore}} implementations 
serializes the key with window information as {{keyBytes + timestampBytes + 
seqNumByte}}. This is optimal for iterations and queries that have a fixed key 
with a variable time window (which I think is leveraged in KStream-KStream join 
windows).

 

In cases where the time-window is fixed, but the key range is variable, there 
is a significant overhead for iteration: all time-windows for a given key must 
be traversed. The iteration only uses 1 out of every N keys, where N is the 
number of windows.

A key serialization format that is structured as {{timestampBytes}} + 
{{keyBytes + seqNumByte}} would be much more efficient when iterating over keys 
in a fixed window.

Implementing a custom {{KeySchema}} is not easy at the moment. Currently, 
{{WindowKeySchema}} is instantiated when supplying a RocksDB instance in 
{{RocksDbWindowBytesStoreSupplier}}, but most/all other references to a 
{{KeySchema}} use static functions on {{WindowKeySchema}}. This makes 
supporting an alternate {{KeySchema}} very challenging. Additionally, making a 
custom implementation of {{KeySchema}} is complicated by the fact that although 
{{RocksDBSegmentedBytesStore.KeySchema}} is a public interface, the interface 
depends on {{HasNextCondition}} which is package-private to 
{{org.apache.kafka.streams.state.internals}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to