Yes, we specially did some more tests with EPS 100 and every time got a 10 second window with reindexing
On 2019/12/17 18:42:17, Michael Miklavcic <[email protected]> wrote: > Is it always a 10 second window, or thereabouts? > > On Sun, Dec 15, 2019 at 11:11 PM Vladimir Mikhailov < > [email protected]> wrote: > > > So, we conducted a number of additional tests that confirmed the problem. > > > > The test is very simple: > > > > we generated a very stable stream of events (2 EPS) and did a parser > > restart and each time we received re-indexing of events with a 10 second > > window. > > > > Parser is simple JSONMap without enrichment or TI rules. > > > > SENSOR PARSER CONFIG: > > > > { > > "parserClassName": "org.apache.metron.parsers.json.JSONMapParser", > > "filterClassName": null, > > "sensorTopic": "netflow-load-test-json", > > "outputTopic": null, > > "errorTopic": null, > > "writerClassName": null, > > "errorWriterClassName": null, > > "readMetadata": true, > > "mergeMetadata": true, > > "numWorkers": 2, > > "numAckers": 2, > > "spoutParallelism": 2, > > "spoutNumTasks": 2, > > "parserParallelism": 2, > > "parserNumTasks": 2, > > "errorWriterParallelism": 1, > > "errorWriterNumTasks": 1, > > "spoutConfig": {}, > > "securityProtocol": null, > > "stormConfig": {}, > > "parserConfig": { > > "mapStrategy": "ALLOW", > > "jsonpQuery": "$", > > "wrapInEntityArray": false, > > "overrideOriginalString": true > > }, > > "fieldTransformations": [], > > "cacheConfig": {}, > > "rawMessageStrategy": "DEFAULT", > > "rawMessageStrategyConfig": {} > > } > > > > SENSOR ENRICHMENT CONFIG > > > > { > > "enrichment": { > > "fieldMap": {}, > > "fieldToTypeMap": {}, > > "config": {} > > }, > > "threatIntel": { > > "fieldMap": {}, > > "fieldToTypeMap": {}, > > "config": {}, > > "triageConfig": { > > "riskLevelRules": [], > > "aggregator": "MAX", > > "aggregationConfig": {} > > } > > }, > > "configuration": {} > > } > > > > INDEXING CONFIGURATIONS > > > > { > > "hdfs": { > > "batchSize": 1000, > > "enabled": true, > > "index": "netflow-load-test-json" > > }, > > "elasticsearch": { > > "batchSize": 1000, > > "enabled": true, > > "batchTimeout": 5, > > "index": "netflow-load-test-json", > > "fieldNameConverter": "NOOP" > > }, > > "solr": { > > "batchSize": 1, > > "enabled": false, > > "index": "netflow-load-test-json" > > } > > } > > > > > > Can anyone repeat this test and check if there are any recurring events in > > the index? > > > > > > On 2019/12/12 07:22:44, Vladimir Mikhailov <[email protected]> > > wrote: > > > Thanks for the clarification! > > > > > > So we need to conduct a few more tests to understand the cause of this > > problem. > > > I will write about the results. > > > > > > On 2019/12/11 14:01:45, Nick Allen <[email protected]> wrote: > > > > > And now the question: does stopping the parser with the "storm kill > > > > <name>" command mean that the topology will complete the processing of > > all > > > > current events that were read by kafkaSpout and commit the > > corresponding > > > > offset to kafka? > > > > > > > > Yes, it will wait as long as the topology's message timeout (by > > default 30 > > > > seconds), which should be plenty of time to commit offsets. > > > > http://storm.apache.org/releases/current/Command-line-client.html > > > > > > > > kill > > > > > > > > Syntax: storm kill topology-name [-w wait-time-secs] > > > > > > > > Kills the topology with the name topology-name. Storm will first > > deactivate > > > > the topology's spouts for the duration of the topology's message > > timeout to > > > > allow all messages currently being processed to finish processing. > > Storm > > > > will then shutdown the workers and clean up their state. You can > > override > > > > the length of time Storm waits between deactivation and shutdown with > > the > > > > -w flag. > > > > > > > > > > > > On Wed, Dec 11, 2019 at 5:10 AM Vladimir Mikhailov < > > > > [email protected]> wrote: > > > > > > > > > It seems that I misinformed our situation by writing about the option > > > > > "set_wait_secs (0)". > > > > > > > > > > We performed all our tests by disabling parsers in the Metron > > Management > > > > > UI, so I suppose they were all stopped using "storm kill <name>" > > (and not > > > > > "storm kill <name> -w 0"). And in these tests we got reindexing some > > > > > messages. > > > > > > > > > > And now the question: does stopping the parser with the "storm kill > > > > > <name>" command mean that the topology will complete the processing > > of all > > > > > current events that were read by kafkaSpout and commit the > > corresponding > > > > > offset to kafka? > > > > > > > > > > On 2019/12/11 06:39:28, Michael Miklavcic < > > [email protected]> > > > > > wrote: > > > > > > It only does that if the arg stopNow is true. It's always false > > per the > > > > > > previous snippets I shared. > > > > > > > > > > > > On Tue, Dec 10, 2019, 10:54 PM Vladimir Mikhailov < > > > > > > [email protected]> wrote: > > > > > > > > > > > > > Hi Michael > > > > > > > > > > > > > > I think the problem is not on the REST side, but in the > > > > > "StormCLIWrapper", > > > > > > > which it uses: > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/metron/blob/88f4d2cefe4bbb389732da3b4f5cbcf02b7b949a/metron-interface/metron-rest/src/main/java/org/apache/metron/rest/service/impl/StormCLIWrapper.java#L145 > > > > > > > > > > > > > > Each of the "StormCLIWrapper" methods: stopParserTopology, > > > > > > > stopEnrichmentTopology and stopIndexingTopology simply stop the > > > > > > > corresponding topologies with command "storm kill <name> [-w 0]", > > > > > leading > > > > > > > to the described unpleasant consequences with re-indexing. > > > > > > > > > > > > > > Perhaps, instead, we should give the topology a certain command > > to stop > > > > > > > and wait until it finishes processing current events and commits > > > > > changes to > > > > > > > kafka? > > > > > > > > > > > > > > > > > > > > > On 2019/12/10 18:18:28, Michael Miklavcic < > > [email protected] > > > > > > > > > > > > > wrote: > > > > > > > > Where are you seeing this? As far as I can tell, the UI and > > REST > > > > > > > endpoints > > > > > > > > default to a graceful shutdown. > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/metron/blob/master/metron-interface/metron-config/src/app/service/storm.service.ts#L154 > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/metron/blob/master/metron-interface/metron-rest/src/main/java/org/apache/metron/rest/controller/StormController.java#L91 > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Dec 10, 2019 at 4:11 AM Vladimir Mikhailov < > > > > > > > > [email protected]> wrote: > > > > > > > > > > > > > > > > > Hi > > > > > > > > > > > > > > > > > > We found the unpleasant consequences of each restart of the > > > > > parsers: > > > > > > > each > > > > > > > > > time part of the events are reindexed again. Unfortunately, > > this > > > > > was > > > > > > > > > confirmed by several special tests. > > > > > > > > > > > > > > > > > > Perhaps the reason for this is the method used to > > immediately stop > > > > > the > > > > > > > > > storm topology using "killTopologyWithOpts" with the option > > > > > > > "set_wait_secs > > > > > > > > > (0)". Because of this, the topology does not have time to > > commit to > > > > > > > kafka > > > > > > > > > the current offsets of already processed events. > > > > > > > > > > > > > > > > > > After the parser starts, kafkaSpout starts reading > > uncommitted > > > > > events > > > > > > > and > > > > > > > > > therefore some events are indexed twice. > > > > > > > > > > > > > > > > > > So the question is: is there a more elegant way to stop the > > parser > > > > > > > > > topology in order to avoid the problems described above? Of > > > > > course, we > > > > > > > are > > > > > > > > > talking about changes to the source code, not some options or > > > > > settings. > > > > > > > > > > > > > > > > > > If such a solution exists and the problem can be fixed, then > > I can > > > > > > > create > > > > > > > > > the corresponding issue at > > > > > > > https://issues.apache.org/jira/browse/METRON > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
