Hello! I have a somewhat complex question regarding suppressed window stores 
and their internal topic retention times.

We have a pretty noisy topic and a pretty expensive aggregate operation, so we 
window all the input messages into a JSON array, suppress the output until the 
window is closed, and then send the array along to our aggregate function. We 
are using the exactly_once_beta processing.guarantee, setting a 
group.instance.id to maintain static membership, and running as a StatefulSet 
with persistent storage in Kubernetes. Because the input topic is noisy, we use 
a 5-second tumbling window with no grace period and a 15-minute store 
retention, plus a WallclockTimestampExtractor so we don't have to worry about 
out-of-order events (single input topic) or grace periods.
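For concreteness, here is roughly what our topology looks like (a sketch, not our exact code — the topic names, the "batch-store" store name, and the appendToJsonArray helper are stand-ins):

```java
import java.time.Duration;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Suppressed;
import org.apache.kafka.streams.kstream.Suppressed.BufferConfig;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.state.WindowStore;

StreamsBuilder builder = new StreamsBuilder();
builder.<String, String>stream("noisy-input")
    .groupByKey()
    // 5-second tumbling windows, no grace period
    .windowedBy(TimeWindows.of(Duration.ofSeconds(5)).grace(Duration.ZERO))
    .aggregate(
        () -> "[]",
        (key, value, agg) -> appendToJsonArray(agg, value),  // stand-in batching helper
        Materialized.<String, String, WindowStore<Bytes, byte[]>>as("batch-store")
            .withRetention(Duration.ofMinutes(15)))          // 15-minute store retention
    // hold results until the window closes, then emit exactly one record per window
    .suppress(Suppressed.untilWindowCloses(BufferConfig.unbounded()))
    .toStream()
    .to("batched-output");  // downstream of this we run the expensive aggregate
```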

My question is: if my app is brought down for a day or so — which I believe is 
about the default retention.ms for the window changelog topic when the window 
retention is short — and the changelog topic is cleaned, am I going to lose 
windows that have not yet closed and been flushed through the suppress and on 
to the aggregate? If so, is a day too small as the default retention.ms? Is 
there a reason Kafka Streams specifically adds only one day to the configured 
window retention, instead of what feels like the standard/default 7-day 
retention? And is there a way to configure a window store's changelog to be 
compacted only, so that Kafka Streams can just tombstone the topic once the 
window is closed and emitted?
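To my understanding, the retention for a window-store changelog works out like this (the additional-retention config is the real StreamsConfig knob; the exact topic-level values Streams sets are my reading of the docs, so please correct me if I'm off):

```properties
# Window-store changelog topics get cleanup.policy=compact,delete, with
#   retention.ms = store retention + windowstore.changelog.additional.retention.ms
# In our case: 15 min + 1 day (the default below) ~= a bit over a day.
windowstore.changelog.additional.retention.ms=86400000

# The suppress buffer's changelog, by contrast, is compact-only
# (cleanup.policy=compact), so it has no time-based retention.
```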

I would imagine the answer to the window-loss question above is "yes", unless 
perhaps the persistent storage, static membership, or exactly_once_beta saves 
me here. Will a static member that has been down for many hours come back, be 
assigned the same partitions it had before, and thus re-use its local state? 
Does exactly_once_beta perhaps guarantee that a window is in the suppress 
changelog (which is compacted) when a thread commits? And would the suppress 
changelog then hold my to-be-emitted-24-hours-later window?

Sorry if that is a lot to digest, but thank you for your wisdom!
