Hi I'm using Samza to aggregate data within sliding windows. I'm especially interested in the fault tolerance aspect of Samza and it's ability to restore from the latest committed offset. I've looked into the WindowableTask interface, but this implementation relies on the system time, rather then the timestamp of the messages received. The main problem is to make sure all data for one window is available and in case of node failure the complete window is restored. With the current implementation of offset committing (committed by time interval, window size or upon `taskCoordinator.commit()`) this is quite cumbersome. I have two different options in mind to overcome this problem and would like to hear your advice on this.
1. Using the Key-Value store the data of a the current window is cached and after a restore one could figure out what data to include from the cache and from Kafka. Unfortunately, this sounds like reimplementing the offset management of Kafka. 2. Set the time based commit interval to Long.MAX_VALUE and handle all commit intervals through `taskCoordinator.commit()`. In this case I could commit the offset after each sliding window and in case of failure Samza would restore from the beginning of the new sliding window. The only problem with this approach is `taskCoordinator.commit()` will commit the offsets of all partitions. But the data may be different between partitions and therefore sliding windows will not match with other partitions. I've looked into SAMZA-23 [1] and found a proposal to coordinate commits for each partition individually. Since this would solve my problem, I'd like to provide a patch for this issue. Is there any interest in this? Best Nicolas [1]: https://issues.apache.org/jira/browse/SAMZA-23
