Thanks for the quick response. Usually a small time window would provide enough granularity, but with zero tolerance for failures in the results and enforced node failures, a more robust strategy is needed. I'll work on SAMZA-23 and report back.
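To make the discussion concrete, here's a rough, untested sketch of option 2: with `task.commit.ms` set to Long.MAX_VALUE in the job config, commits only happen in `process()`, at window boundaries derived from the message's own timestamp rather than the system clock. The class name, window size, and timestamp extraction are placeholders, and `coordinator.commit()` is the commit call as discussed below (a per-partition variant is what SAMZA-23 would add):

```java
import org.apache.samza.system.IncomingMessageEnvelope;
import org.apache.samza.task.MessageCollector;
import org.apache.samza.task.StreamTask;
import org.apache.samza.task.TaskCoordinator;

/**
 * Option 2 sketch: time-based commits are effectively disabled
 * (task.commit.ms = Long.MAX_VALUE), so the task commits explicitly
 * whenever a message's event time crosses the current window boundary.
 */
public class TimestampWindowTask implements StreamTask {
  private static final long WINDOW_MS = 60_000L; // window size: placeholder
  private long windowEnd = -1L;

  @Override
  public void process(IncomingMessageEnvelope envelope,
                      MessageCollector collector,
                      TaskCoordinator coordinator) {
    // Assumes the payload carries its own event-time timestamp.
    long ts = extractTimestamp(envelope.getMessage());
    if (windowEnd < 0) {
      windowEnd = ts + WINDOW_MS;
    }
    if (ts >= windowEnd) {
      // Emit the finished window's aggregate (via collector), then
      // commit so a restore after failure starts at the new window.
      coordinator.commit();
      windowEnd = ts + WINDOW_MS;
    }
    // ... fold the message into the current window's aggregate ...
  }

  // Placeholder: extraction depends entirely on the message schema.
  private long extractTimestamp(Object message) {
    return System.currentTimeMillis();
  }
}
```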
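And for comparison, a sketch of option 1's cache approach: each message of the open window is mirrored into a changelog-backed key-value store keyed by partition and offset, so after a restore the cache tells us which messages were already processed. The store name "window-cache", the String serdes, and the key scheme are all assumptions here:

```java
import org.apache.samza.config.Config;
import org.apache.samza.storage.kv.KeyValueStore;
import org.apache.samza.system.IncomingMessageEnvelope;
import org.apache.samza.task.InitableTask;
import org.apache.samza.task.MessageCollector;
import org.apache.samza.task.StreamTask;
import org.apache.samza.task.TaskContext;
import org.apache.samza.task.TaskCoordinator;

/**
 * Option 1 sketch: cache the open window's messages in a store so
 * duplicates can be detected after restoring from an older offset.
 */
public class CachedWindowTask implements StreamTask, InitableTask {
  private KeyValueStore<String, String> cache;

  @Override
  @SuppressWarnings("unchecked")
  public void init(Config config, TaskContext context) {
    // "window-cache" must be declared in the job config with a changelog.
    cache = (KeyValueStore<String, String>) context.getStore("window-cache");
  }

  @Override
  public void process(IncomingMessageEnvelope envelope,
                      MessageCollector collector,
                      TaskCoordinator coordinator) {
    String key = envelope.getSystemStreamPartition() + "-" + envelope.getOffset();
    if (cache.get(key) != null) {
      return; // replayed after a failure; the window already has this message
    }
    cache.put(key, (String) envelope.getMessage());
    // ... fold into the current window; delete the cached entries once
    // the window closes so the store does not grow without bound ...
  }
}
```

As noted below, this amounts to reimplementing Kafka's offset management on top of the store, which is why I'd rather pursue option 2.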
On Mon, Apr 14, 2014 at 4:55 PM, Jakob Homan <[email protected]> wrote:

> For your case, wouldn't setting the window time to some small value (say,
> a second) provide small enough granularity to check for completeness and
> produce as necessary? This would generally be how what you're describing
> is handled.
>
> SAMZA-23 would be a great contribution, if you're interested.
> -jg
>
>
> On Mon, Apr 14, 2014 at 4:36 AM, Nicolas Bär <[email protected]> wrote:
>
> > Hi
> >
> > I'm using Samza to aggregate data within sliding windows. I'm especially
> > interested in the fault-tolerance aspect of Samza and its ability to
> > restore from the latest committed offset. I've looked into the
> > WindowableTask interface, but this implementation relies on the system
> > time rather than the timestamp of the messages received. The main
> > problem is to make sure all data for one window is available and, in
> > case of node failure, that the complete window is restored. With the
> > current implementation of offset committing (committed by time interval,
> > window size, or upon `taskCoordinator.commit()`) this is quite
> > cumbersome. I have two different options in mind to overcome this
> > problem and would like to hear your advice on this.
> >
> > 1. Using the key-value store, the data of the current window is cached,
> > and after a restore one could figure out what data to include from the
> > cache and what from Kafka. Unfortunately, this sounds like
> > reimplementing the offset management of Kafka.
> >
> > 2. Set the time-based commit interval to Long.MAX_VALUE and handle all
> > commits through `taskCoordinator.commit()`. In this case I could commit
> > the offset after each sliding window, and in case of failure Samza would
> > restore from the beginning of the new sliding window. The only problem
> > with this approach is that `taskCoordinator.commit()` commits the
> > offsets of all partitions. But the data may differ between partitions,
> > and therefore sliding windows will not line up across partitions. I've
> > looked into SAMZA-23 [1] and found a proposal to coordinate commits for
> > each partition individually. Since this would solve my problem, I'd like
> > to provide a patch for this issue. Is there any interest in this?
> >
> > Best
> > Nicolas
> >
> >
> > [1]: https://issues.apache.org/jira/browse/SAMZA-23
> >
>
