Hi, we have found an unpleasant consequence of restarting the parsers: on every restart, some of the events are indexed again. Unfortunately, several dedicated tests have confirmed this.
The likely cause is the way the Storm topology is stopped: it is killed immediately via "killTopologyWithOpts" with the option "set_wait_secs(0)". As a result, the topology has no time to commit to Kafka the current offsets of the events it has already processed. When the parser starts again, the KafkaSpout resumes reading from the last committed offset, so the events processed after that commit are read again and indexed twice.

So the question is: is there a more graceful way to stop the parser topology that avoids this problem? To be clear, we are talking about changes to the source code, not just some options or settings. If such a solution exists and the problem can be fixed, I can create the corresponding issue at https://issues.apache.org/jira/browse/METRON
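
To make the suspected mechanism concrete, here is a minimal toy model in plain Java. It is not Metron or Storm code: the class ToyConsumer, the commit interval of 10 events, and the figure of 25 processed events are all made up for illustration. It only shows why a consumer that commits offsets periodically replays events after an immediate kill, and why flushing the current position on shutdown would avoid that.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of an at-least-once consumer; hypothetical, not Storm or Metron code. */
public class OffsetReplaySketch {

    /** Stand-in for a spout that commits its offset only every N processed events. */
    static final class ToyConsumer {
        private final int commitInterval;  // assumed: commit after this many events
        private long committedOffset = 0;  // where a restarted consumer would resume
        private long position = 0;         // next offset to process in this run
        final List<Long> indexed = new ArrayList<>();

        ToyConsumer(int commitInterval) {
            this.commitInterval = commitInterval;
        }

        /** Process (index) the next `count` events, committing periodically. */
        void process(int count) {
            for (int i = 0; i < count; i++) {
                indexed.add(position);
                position++;
                if (position % commitInterval == 0) {
                    committedOffset = position;
                }
            }
        }

        /** Graceful stop: flush the current position before exiting. */
        long stopGracefully() {
            committedOffset = position;
            return committedOffset;
        }

        /** Immediate kill: progress since the last periodic commit is lost. */
        long killImmediately() {
            return committedOffset;
        }
    }

    /** Number of events that would be indexed a second time after a restart. */
    public static long duplicatesAfterRestart(boolean graceful) {
        ToyConsumer consumer = new ToyConsumer(10);
        consumer.process(25); // periodic commits happen at offsets 10 and 20
        long resumeFrom = graceful ? consumer.stopGracefully() : consumer.killImmediately();
        // Offsets in [resumeFrom, 25) were already indexed and will be read again.
        return consumer.indexed.size() - resumeFrom;
    }

    public static void main(String[] args) {
        System.out.println("duplicates after immediate kill: " + duplicatesAfterRestart(false));
        System.out.println("duplicates after graceful stop:  " + duplicatesAfterRestart(true));
    }
}
```

In this model the immediate kill replays the 5 events processed after the last periodic commit, while the graceful stop replays none. Whether a non-zero wait time alone gives the real KafkaSpout enough time to commit is exactly what would need to be verified.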
