Re: [DISCUSS] KIP-889 Versioned State Stores

Matthias J. Sax Mon, 21 Nov 2022 18:32:49 -0800

Thanks for the KIP, Victoria. Very well written!

A couple of questions (many might just require adding some more details to the KIP):

(1) Why does the new store not extend KeyValueStore, but StateStore? In the end, it's a KeyValueStore?

(2) Should we have a ReadOnlyVersionedKeyValueStore? Even if we don't want to support IQ in this KIP, it might be good to add this interface right away to avoid complications for follow-up KIPs? Or won't there be any complications anyway?

(3) Why do we not have a `delete(key)` method? I am ok with not supporting all methods from the existing KV-store, but a `delete(key)` seems fundamental to have?

(4a) Do we need `get(key)`? It seems to be the same as `get(key, MAX_VALUE)`? Maybe it is good to have as syntactic sugar though? Just for my own clarification (should we add something to the JavaDocs?).
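
To make the question in (4a) concrete, here is a minimal sketch of `get(key)` as syntactic sugar over the timestamped lookup. It assumes `MAX_VALUE` means `Long.MAX_VALUE`; the interface and class names (`SketchVersionedStore`, `InMemorySketchStore`) are hypothetical and are not the API proposed in the KIP.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical interface for illustration only; NOT the KIP's proposed API.
interface SketchVersionedStore<K, V> {
    void put(K key, V value, long timestamp);

    // Returns the value that was valid at asOfTimestamp, or null if there is none.
    V get(K key, long asOfTimestamp);

    // Syntactic sugar: "latest" is the same as "as of the largest possible timestamp".
    default V get(K key) {
        return get(key, Long.MAX_VALUE);
    }
}

// Toy in-memory implementation (assumption: one sorted history per key).
class InMemorySketchStore<K, V> implements SketchVersionedStore<K, V> {
    // key -> (validFrom timestamp -> value)
    private final Map<K, TreeMap<Long, V>> versions = new HashMap<>();

    @Override
    public void put(K key, V value, long timestamp) {
        versions.computeIfAbsent(key, k -> new TreeMap<>()).put(timestamp, value);
    }

    @Override
    public V get(K key, long asOfTimestamp) {
        TreeMap<Long, V> history = versions.get(key);
        if (history == null) {
            return null;
        }
        // Newest version whose validFrom is <= asOfTimestamp.
        Map.Entry<Long, V> entry = history.floorEntry(asOfTimestamp);
        return entry == null ? null : entry.getValue();
    }
}
```

In this sketch a lookup before the first version returns `null` rather than throwing, which matches the behavior discussed in (4b).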

(4b) Should we throw an exception if a user queries out-of-bound instead of returning `null` (in `get(key, ts)`)?
-> You put it into "rejected alternatives", and I understand your argument. Would love to get input from others about this question though. -- It seems we also return `null` for windowed stores, so maybe the strongest argument is to align with existing behavior? Or do we have cases for which the current behavior is problematic?

(4c) JavaDoc on `get(key, ts)` says: "(up to store implementation discretion when this is the case)"
-> Should we make it a stricter contract such that the user can reason about it better (there is WIP to make retention time a strict bound for windowed stores atm)?
-> JavaDocs on `persistentVersionedKeyValueStore` seem to suggest a strict bound, too.

(5a) Do we need to expose `segmentInterval`? For windowed stores, we also use segments but hard-code it to two (it was exposed in earlier versions but it seems not useful, even if we would be open to exposing it again if there is user demand).

(5b) JavaDocs say: "Performance degrades as more record versions for the same key are collected in a single segment. On the other hand, out-of-order writes and reads which access older segments may slow down if there are too many segments." -- Wondering if JavaDocs should make any statements about expected performance? Seems to be an implementation detail?

(6) validTo timestamp is "exclusive", right? I.e., if I query `get(key, ts[=validToV1])` I would get `null` or the "next" record v2 with validFromV2=ts?
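
A small sketch of the exclusive-validTo reading asked about in (6), under the assumption that each version is valid for `validFrom <= ts < validTo` and that the validTo of a version equals the validFrom of the next one. The class `ValidToSketch` is hypothetical and tracks a single key only; it does not model deletes, so it shows the "next record v2" case, not the `null` case.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-key version history for illustration; NOT the KIP's implementation.
class ValidToSketch {
    // validFrom -> value; validTo of a version is implicitly the next validFrom.
    private final TreeMap<Long, String> history = new TreeMap<>();

    void put(String value, long validFrom) {
        history.put(validFrom, value);
    }

    // Value of the version with validFrom <= ts < validTo, or null if none exists.
    String get(long ts) {
        Map.Entry<Long, String> entry = history.floorEntry(ts);
        return entry == null ? null : entry.getValue();
    }

    // Exclusive upper bound of the version valid at ts.
    // Returns null if no version is valid at ts or if that version is the latest.
    Long validTo(long ts) {
        if (history.floorEntry(ts) == null) {
            return null;
        }
        return history.higherKey(ts);
    }
}
```

With v1 written at timestamp 10 and v2 at timestamp 20, validToV1 is 20, and a lookup at exactly 20 returns v2 because v1's interval excludes its own validTo.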

(7) The KIP says that segments are stored in the same RocksDB -- for this case, how are efficient deletes handled? For windowed stores, we can just delete a full RocksDB.

(8) Rejected alternatives: you propose to not return the validTo timestamp -- if we find it useful in the future to return it, would there be a clean path to change it accordingly?



-Matthias


On 11/16/22 9:57 PM, Victoria Xia wrote:

Hi everyone,

I have a proposal for introducing versioned state stores in Kafka Streams.
Versioned state stores are similar to key-value stores except they can
store multiple record versions for a single key. This KIP focuses on
interfaces only in order to limit the scope of the KIP.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-889%3A+Versioned+State+Stores

Thanks,
Victoria
