nickwallen commented on a change in pull request #1403: METRON-2109: Add option to use Metron GUID as the id in Elasticsearch URL: https://github.com/apache/metron/pull/1403#discussion_r282897127
########## File path: metron-platform/metron-indexing/README.md ##########
@@ -69,13 +69,17 @@ Depending on how you start the indexing topology, it will have either Elasticsea
 | `batchTimeout` | The timeout after which a batch will be flushed even if `batchSize` has not been met. | Defaults to a duration which is a fraction of the Storm parameter `topology.message.timeout.secs`, if left undefined or set to 0. Ignored if batchSize is `1`, since this disables batching.|
 | `enabled` | A boolean indicating whether the writer is enabled. | Defaults to `true` |
 | `fieldNameConverter` | Defines how field names are transformed before being written to the index. Only applicable to `elasticsearch`. | Defaults to `DEDOT`. Acceptable values are `DEDOT` that replaces all '.' with ':' or `NOOP` that does not change the field names . |
+| `metronId` | A boolean indicating whether the writer should use the id generated by Metron | Defaults to `false`. This setting only applies to Elasticsearch, the id used with Solr is configured in the Solr schemas.
 ### Meta Alerts
 Alerts can be grouped, after appropriate searching, into a set of alerts called a meta alert. A meta alert is useful for maintaining the context of searching and grouping during further investigations. Standard searches can return meta alerts, but grouping and other aggregation or sorting requests will not, because there's not a clear way to aggregate in many cases if there are multiple alerts contained in the meta alert. All meta alerts will have the source type of metaalert, regardless of the contained alert's origins.
 ### Elasticsearch
-Metron comes with built-in templates for the default sensors for Elasticsearch. When adding a new sensor, it will be necessary to add a new template defining the output fields appropriately. In addition, there is a requirement for a field `alert` of type `nested` for Elasticsearch 2.x installs. This is detailed at [Using Metron with Elasticsearch 2.x](../metron-elasticsearch/README.md#using-metron-with-elasticsearch-2x)
+Metron comes with built-in templates for the default sensors for Elasticsearch. When adding a new sensor, it will be necessary to add a new template defining the output fields appropriately. In addition, there is a requirement for a field `alert` of type `nested` for Elasticsearch 2.x installs. This is detailed at [Using Metron with Elasticsearch 2.x](../metron-elasticsearch/README.md#using-metron-with-elasticsearch-2x).
+
+Metron is configured by default to let Elasticsearch generated ids for performance reasons. However, due to Storm's at least once processing guarantee, it is possible for duplicate messages to be indexed when messages are replayed for whatever reason. If this scenario is less desirable, the Metron generated id stored in the `guid` field of the message can be used instead. This can be configured for individual sensors by setting the `metronId` setting to true in the [Sensor Indexing Configuration](#sensor-indexing-configuration).

Review comment:
Your description makes perfect sense; well said. It might also be helpful to include an example. Maybe something like the following. I am not sure if this should live here or elsewhere, but an example always helps.

```
{
  "elasticsearch": {
    "index": "bro",
    "metronId": true
  }
}
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
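
For readers following the replay scenario described in the README text under review, here is a minimal Python sketch of why keying documents by the Metron `guid` avoids duplicates. It is not Metron or Elasticsearch code: the `index_messages` helper, the in-memory dict standing in for an index, and the sample message are all hypothetical, and a random UUID stands in for an Elasticsearch-generated id.

```python
import uuid


def index_messages(messages, use_metron_id):
    """Simulate indexing into a store keyed by document id (hypothetical helper)."""
    index = {}
    for msg in messages:
        if use_metron_id:
            # Reuse the id Metron already assigned: a replay overwrites the same document.
            doc_id = msg["guid"]
        else:
            # Stand-in for a store-generated id: every write gets a fresh id.
            doc_id = str(uuid.uuid4())
        index[doc_id] = msg
    return index


# The same message delivered twice, as can happen under at-least-once processing.
message = {"guid": "a1b2c3", "payload": "example"}
replayed = [message, message]

auto_ids = index_messages(replayed, use_metron_id=False)
guid_ids = index_messages(replayed, use_metron_id=True)

print(len(auto_ids), len(guid_ids))  # prints "2 1"
```

With generated ids the replayed message is stored twice; with the `guid` as the id the second write replaces the first, which is the behavior the `metronId` option is described as enabling.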