Hi.

I'd like to re-consume 6 months old data with Kafka Streams.

My current topology can't because it defines aggregations with windows maintain 
durations of 3 days.
TimeWindows.of(ONE_HOUR_MILLIS).until(THREE_DAYS_MILLIS)



As discovered (and shared [1]) a few months ago, consuming a record older than 
3 days will mess up my aggregates. How do you deal with this ? Do you 
temporarily raise the windows maintain durations until all records are consumed 
? Do you always run your topologies with long durations, like a year ? I have 
no idea what would be the impact on the RAM and disk, but I guess RocksDB would 
cry a little.


Final question: il I raise the duration to 6 months, consume my records, and 
then set the duration back to 3 days, would the old aggregates automatically 
destroyed ?


[1] 
http://mail-archives.apache.org/mod_mbox/kafka-users/201610.mbox/%3ccabqkjkj42n7z4bxjdkrdyz_kmpunh738uxvm7gy24dnkx+r...@mail.gmail.com%3e
  

Thanks
Nicolas


Reply via email to