Hi all, I created simply app to test apex fault tolerance. It is build from three main operators: - sequence generator - operator which generate increasing numbers. One per time window - aggregator - just adds incoming number to the list and emits whole list downstream - file output - operator which writes incoming messages to the file To make it faulty, aggregator operator throws an exception for 10% of messages. Source code is here https://github.com/Matzz/apex-example
I'm running it on sandbox docker image. I thought that even if aggregation operator is faulty, application will checkpoint its state. So over the time output list should be longer and longer. Unfortunately, it looks like on each failure app is resenting it state to the beginning. Sample output: *tail -f -n 100 /tmp/stream.out * *Creating FileOutput 2018-06-16T22:07:01.033* *Creating aggreagator 2018-06-16T22:07:01.040* *Creating FileOutput 2018-06-16T22:07:01.041* *Creating FileOutput 2018-06-16T22:07:02.719* *Creating aggreagator 2018-06-16T22:07:02.722* *Creating FileOutput 2018-06-16T22:07:02.723* *Creating FileOutput 2018-06-16T22:08:48.178* *Creating aggreagator 2018-06-16T22:08:48.185* *Creating FileOutput 2018-06-16T22:08:48.186* *Creating FileOutput 2018-06-16T22:08:49.847* *Creating aggreagator 2018-06-16T22:08:49.850* *Creating FileOutput 2018-06-16T22:08:49.852* *Creating FileOutput 2018-06-16T22:08:56.736* *Creating aggreagator 2018-06-16T22:08:56.740* *Creating FileOutput 2018-06-16T22:08:56.743* *Creating FileOutput 2018-06-16T22:08:57.899* *Creating aggreagator 2018-06-16T22:08:57.899* *Creating FileOutput 2018-06-16T22:08:57.899* *Creating FileOutput 2018-06-16T22:09:10.951* *Creating FileOutput 2018-06-16T22:09:10.986* *Creating aggreagator 2018-06-16T22:09:11.001* *Failing sequence generator!2018-06-16T22:09:11.029* *Creating FileOutput 2018-06-16T22:09:19.484* *Creating FileOutput 2018-06-16T22:09:19.506* *Creating aggreagator 2018-06-16T22:09:19.518* *Failing sequence generator!2018-06-16T22:09:19.542* *Creating FileOutput 2018-06-16T22:09:28.646* *Creating FileOutput 2018-06-16T22:09:28.668* *Creating aggreagator 2018-06-16T22:09:28.680* *Failing sequence generator!2018-06-16T22:09:28.704* *[1.0]* *Creating FileOutput 2018-06-16T22:09:37.864* *Creating FileOutput 2018-06-16T22:09:37.886* *Creating aggreagator 2018-06-16T22:09:37.897* *Failing sequence generator!2018-06-16T22:09:37.924* *[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]* *[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]* *[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]* *[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]* *[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]* *[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]* *Creating FileOutput 2018-06-16T22:09:46.921* *Creating FileOutput 2018-06-16T22:09:46.944* *Creating aggreagator 2018-06-16T22:09:46.956* *Failing sequence generator!2018-06-16T22:09:46.980* *[1.0, 2.0, 3.0, 4.0]* *[1.0, 2.0, 3.0, 4.0]* *[1.0, 2.0, 3.0, 4.0]* *[1.0, 2.0, 3.0, 4.0]* *Creating FileOutput 2018-06-16T22:09:56.049* *Creating FileOutput 2018-06-16T22:09:56.070* *Creating aggreagator 2018-06-16T22:09:56.081* *Failing sequence generator!2018-06-16T22:09:56.112* *[1.0]* *[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]* *[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]* *[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]* *[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]* *[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]* *[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]* *Creating FileOutput 2018-06-16T22:10:05.213* *Creating FileOutput 2018-06-16T22:10:05.232* *Creating aggreagator 2018-06-16T22:10:05.241* *Failing sequence generator!2018-06-16T22:10:05.266**[1.0, 2.0]* *[1.0, 2.0]* Could I ask for some explanation what I'm doing wrong? Regards, Matuesz Zakarczemny