Hi Nicolas,

I've seen your previous message thread too. I think your best bet for now is to 
increase the window maintain duration to 6 months.
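
To make the reasoning concrete, here is a rough sketch in plain Java rather than the Kafka Streams API. The class and method names are made up for illustration, "6 months" is approximated as 183 days, and the check is a simplified model of retention (a record's window is kept only while it is newer than the largest timestamp seen minus the maintain duration), not how Streams implements it internally.

```java
import java.time.Duration;

public class RetentionSketch {

    // Hypothetical helper (not part of Kafka Streams): returns true if a record
    // with the given timestamp would still fall inside the retained range,
    // assuming retention is measured back from the largest timestamp seen so far.
    static boolean isWithinRetention(long recordTimestampMs,
                                     long maxObservedTimestampMs,
                                     long retentionMs) {
        return recordTimestampMs > maxObservedTimestampMs - retentionMs;
    }

    public static void main(String[] args) {
        long threeDaysMs = Duration.ofDays(3).toMillis();
        long sixMonthsMs = Duration.ofDays(183).toMillis(); // approximation of 6 months

        // Assume the application has already seen records up to "now",
        // and a 150-day-old record arrives during re-consumption.
        long maxObservedMs = System.currentTimeMillis();
        long oldRecordMs = maxObservedMs - Duration.ofDays(150).toMillis();

        System.out.println("retained with 3 days:   "
                + isWithinRetention(oldRecordMs, maxObservedMs, threeDaysMs));
        System.out.println("retained with 6 months: "
                + isWithinRetention(oldRecordMs, maxObservedMs, sixMonthsMs));
    }
}
```

Under this model the 150-day-old record falls outside a 3-day maintain duration but inside a 6-month one, which is why raising the duration should help for the re-consumption run.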

If you change your application logic, e.g., by changing the maintain duration, 
the semantics of the change aren't immediately clear, and it's worth clarifying 
them. For example, would the intention be to reprocess all the data from the 
beginning? Or to start where you left off (in which case the fact that the 
original processing went over data that is 6 months old would not be relevant, 
since you'd start from where you left off the second time)? Right now we 
support a limited way to reprocess the data by effectively resetting a Streams 
application. I wouldn't recommend using that if you want to keep the results of 
the previous run, though.


> On 12 Jan 2017, at 09:15, Nicolas Fouché <nfou...@onfocus.io> wrote:
> Hi.
> I'd like to re-consume 6 months old data with Kafka Streams.
> My current topology can't because it defines aggregations with windows 
> maintain durations of 3 days.
> As discovered (and shared [1]) a few months ago, consuming a record older 
> than 3 days will mess up my aggregates. How do you deal with this ? Do you 
> temporarily raise the windows maintain durations until all records are 
> consumed ? Do you always run your topologies with long durations, like a year 
> ? I have no idea what would be the impact on the RAM and disk, but I guess 
> RocksDB would cry a little.
> Final question: if I raise the duration to 6 months, consume my records, and 
> then set the duration back to 3 days, would the old aggregates automatically 
> be destroyed ?
> [1] 
> http://mail-archives.apache.org/mod_mbox/kafka-users/201610.mbox/%3ccabqkjkj42n7z4bxjdkrdyz_kmpunh738uxvm7gy24dnkx+r...@mail.gmail.com%3e
> Thanks
> Nicolas
