METRON-1855: Make unified enrichment topology the default and deprecate split-join (mmiklavc via mmiklavc) closes apache/metron#1252
Project: http://git-wip-us.apache.org/repos/asf/metron/repo
Commit: http://git-wip-us.apache.org/repos/asf/metron/commit/bf6b07f7
Tree: http://git-wip-us.apache.org/repos/asf/metron/tree/bf6b07f7
Diff: http://git-wip-us.apache.org/repos/asf/metron/diff/bf6b07f7

Branch: refs/heads/feature/METRON-1090-stellar-assignment
Commit: bf6b07f7cbea3d210878554c7ce7a1bc091b59ee
Parents: fdfca3b
Author: mmiklavc <michael.miklav...@gmail.com>
Authored: Mon Nov 5 16:30:43 2018 -0700
Committer: Michael Miklavcic <michael.miklav...@gmail.com>
Committed: Mon Nov 5 16:30:43 2018 -0700

----------------------------------------------------------------------
 Upgrading.md                                    | 17 ++++++++
 .../configuration/metron-enrichment-env.xml     |  8 ++--
 .../METRON/CURRENT/themes/metron_theme.json     | 12 +++---
 metron-platform/Performance-tuning-guide.md     |  6 ++-
 metron-platform/metron-enrichment/README.md     | 43 +++++++++-----------
 .../main/scripts/start_enrichment_topology.sh   |  4 +-
 6 files changed, 54 insertions(+), 36 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/metron/blob/bf6b07f7/Upgrading.md
----------------------------------------------------------------------
diff --git a/Upgrading.md b/Upgrading.md
index 2124ac5..a0dd5d3 100644
--- a/Upgrading.md
+++ b/Upgrading.md
@@ -19,6 +19,23 @@ limitations under the License.
 This document constitutes a per-version listing of changes of configuration which are non-backwards compatible.

+## 0.6.0 to 0.6.1
+
+### [METRON-1855: Make unified enrichment topology the default and deprecate split-join](https://issues.apache.org/jira/browse/METRON-1855)
+The unified enrichment topology will be the new default in this release,
+and the split-join enrichment topology is now considered deprecated.
+If you wish to keep the deprecated split-join enrichment topology,
+you will need to make the following changes:
+
+* In Ambari > Metron > Config > Enrichment set the enrichment_topology setting to "Split-Join"
+* If running `start_enrichment_topology.sh` manually, pass in the parameters to start the Split-Join topology as follows
+
+    ```
+    $METRON_HOME/bin/start_enrichment_topology.sh --remote $METRON_HOME/flux/enrichment/remote-splitjoin.yaml --filter $METRON_HOME/config/enrichment-splitjoin.properties
+    ```
+
+* Restart the enrichment topology
+
 ## 0.4.2 to 0.5.0

 ### [METRON-941: native PaloAlto parser corrupts message when having a comma in the payload](https://issues.apache.org/jira/browse/METRON-941)

http://git-wip-us.apache.org/repos/asf/metron/blob/bf6b07f7/metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/configuration/metron-enrichment-env.xml
----------------------------------------------------------------------
diff --git a/metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/configuration/metron-enrichment-env.xml b/metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/configuration/metron-enrichment-env.xml
index b41c455..69dce3f 100644
--- a/metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/configuration/metron-enrichment-env.xml
+++ b/metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/configuration/metron-enrichment-env.xml
@@ -165,17 +165,17 @@
     </property>
     <property>
         <name>enrichment_topology</name>
-        <description>Which Enrichment topology to execute</description>
-        <value>Split-Join</value>
+        <description>Which Enrichment topology to execute. Note: Split-Join is deprecated in favor of the Unified topology.</description>
+        <value>Unified</value>
         <display-name>Enrichment Topology</display-name>
         <value-attributes>
             <type>value-list</type>
             <entries>
                 <entry>
-                    <value>Split-Join</value>
+                    <value>Unified</value>
                 </entry>
                 <entry>
-                    <value>Unified</value>
+                    <value>Split-Join</value>
                 </entry>
             </entries>
             <selection-cardinality>1</selection-cardinality>

http://git-wip-us.apache.org/repos/asf/metron/blob/bf6b07f7/metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/themes/metron_theme.json
----------------------------------------------------------------------
diff --git a/metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/themes/metron_theme.json b/metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/themes/metron_theme.json
index 1d7b6c5..46c06dd 100644
--- a/metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/themes/metron_theme.json
+++ b/metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/themes/metron_theme.json
@@ -125,7 +125,7 @@
           ]
         },
         {
-          "name": "section-enrichment-splitjoin",
+          "name": "section-enrichment-unified",
           "row-index": "3",
           "column-index": "0",
           "row-span": "1",
@@ -134,8 +134,8 @@
           "section-rows": "1",
           "subsections": [
             {
-              "name": "subsection-enrichment-splitjoin",
-              "display-name": "Split Join Topology",
+              "name": "subsection-enrichment-unified",
+              "display-name": "Unified Topology",
               "row-index": "0",
               "column-index": "0",
               "row-span": "1",
@@ -144,7 +144,7 @@
           ]
         },
         {
-          "name": "section-enrichment-unified",
+          "name": "section-enrichment-splitjoin",
           "row-index": "4",
           "column-index": "0",
           "row-span": "1",
@@ -153,8 +153,8 @@
           "section-rows": "1",
           "subsections": [
             {
-              "name": "subsection-enrichment-unified",
-              "display-name": "Unified Topology",
+              "name": "subsection-enrichment-splitjoin",
+              "display-name": "Split Join Topology",
               "row-index": "0",
               "column-index": "0",
               "row-span": "1",

http://git-wip-us.apache.org/repos/asf/metron/blob/bf6b07f7/metron-platform/Performance-tuning-guide.md
----------------------------------------------------------------------
diff --git a/metron-platform/Performance-tuning-guide.md b/metron-platform/Performance-tuning-guide.md
index bfc36dd..2e976e9 100644
--- a/metron-platform/Performance-tuning-guide.md
+++ b/metron-platform/Performance-tuning-guide.md
@@ -188,9 +188,11 @@ See more detail on starting parsers [here](https://github.com/apache/metron/blob

 **Enrichment**

+__Note__ These recommendations are based on the deprecated split-join enrichment topology. See [Enrichment Performance](metron-enrichment/Performance.md) for tuning recommendations for the new default unified enrichment topology.
+
 This is a mapping of the various performance tuning properties for enrichments and how they are materialized.

-Flux file found here - $METRON_HOME/flux/enrichment/remote.yaml
+Flux file found here - $METRON_HOME/flux/enrichment/remote-splitjoin.yaml

 _Note 1:_ Changes to Flux file properties that are managed by Ambari will render Ambari unable to further manage the property.

@@ -458,6 +460,8 @@ usage: start_parser_topology.sh

 ### Enrichment Tuning

+__Note__ These tuning suggestions are based on the deprecated split-join topology.
+
 We landed on the same number of partitions for enrichemnt and indexing as we did for bro - 48.

 For configuring Storm, there is a flux file and properties file that we modified. Here are the settings we changed for bro in Flux.
http://git-wip-us.apache.org/repos/asf/metron/blob/bf6b07f7/metron-platform/metron-enrichment/README.md
----------------------------------------------------------------------
diff --git a/metron-platform/metron-enrichment/README.md b/metron-platform/metron-enrichment/README.md
index 8a53e71..c72970f 100644
--- a/metron-platform/metron-enrichment/README.md
+++ b/metron-platform/metron-enrichment/README.md
@@ -31,36 +31,22 @@ data format (e.g. a JSON Map structure with `original_message` and

 ## Enrichment Architecture

-![Architecture](enrichment_arch.png)
+![Unified Architecture](unified_enrichment_arch.svg)

 ### Unified Enrichment Topology

-There is an experimental unified enrichment topology which is shipped.
-Currently the architecture, as described above, has a split/join in
-order to perform enrichments in parallel. This poses some issues in
-terms of ease of tuning and reasoning about performance.
-
-In order to deal with these issues, there is an alternative enrichment topology which
-uses data parallelism as opposed to the split/join task parallelism.
-This architecture uses a worker pool to fully enrich any message within
-a worker. This results in
+The unified enrichment topology uses data parallelism as opposed to the deprecated
+split/join topology's task parallelism. This architecture uses a worker pool to fully
+enrich any message within a worker. This results in
 * Fewer bolts in the topology
 * Each bolt fully operates on a message.
 * Fewer network hops

-![Unified Architecture](unified_enrichment_arch.svg)
-
-This architecture is fully backwards compatible; the only difference is
-how the enrichment will operate on each message (in one bolt where the
-split/join is done in a threadpool as opposed
+This architecture is fully backwards compatible with the old split-join
+topology; the only difference is how the enrichment will operate on each
+message (in one bolt where the split/join is done in a threadpool as opposed
 to split across multiple bolts).

-#### Using It
-
-In order to use this, you will need to
-* Edit `$METRON_HOME/bin/start_enrichment_topology.sh` and adjust it to use `remote-unified.yaml` instead of `remote.yaml`
-* Restart the enrichment topology.
-
 #### Configuring It

 There are two parameters which you might want to tune in this topology.
@@ -76,6 +62,19 @@ intel bolt, the configurations will be taken from the respective join bolt
 parallelism. When proper ambari support for this is added, we will add
 its own property.

+### Split-Join Enrichment Topology
+
+The now-deprecated split/join topology is also available and performs enrichments in parallel.
+This poses some issues in terms of ease of tuning and reasoning about performance.
+
+![Architecture](enrichment_arch.png)
+
+#### Using It
+
+In order to use the older, deprecated topology, you will need to
+* Edit `$METRON_HOME/bin/start_enrichment_topology.sh` and adjust it to use `remote-splitjoin.yaml` instead of `remote-unified.yaml`
+* Restart the enrichment topology.
+
 ## Enrichment Configuration

 The configuration for the `enrichment` topology, the topology primarily
@@ -85,7 +84,6 @@ defined by JSON documents stored in zookeeper.
 There are two types of configurations at the moment, `global` and
 `sensor` specific.

-
 ## Global Configuration

 There are a few enrichments which have independent configurations, such
@@ -134,7 +132,6 @@ The configuration is a complex JSON object with the following top level fields:

 ### The `enrichment` Configuration

-
 | Field | Description | Example |
 |------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------|
 | `fieldToTypeMap` | In the case of a simple HBase enrichment (i.e. a key/value lookup), the mapping between fields and the enrichment types associated with those fields must be known. This enrichment type is used as part of the HBase key. Note: applies to hbaseEnrichment only. | `"fieldToTypeMap" : { "ip_src_addr" : [ "asset_enrichment" ] }` |

http://git-wip-us.apache.org/repos/asf/metron/blob/bf6b07f7/metron-platform/metron-enrichment/src/main/scripts/start_enrichment_topology.sh
----------------------------------------------------------------------
diff --git a/metron-platform/metron-enrichment/src/main/scripts/start_enrichment_topology.sh b/metron-platform/metron-enrichment/src/main/scripts/start_enrichment_topology.sh
index 77c3a77..d3ed8ad 100755
--- a/metron-platform/metron-enrichment/src/main/scripts/start_enrichment_topology.sh
+++ b/metron-platform/metron-enrichment/src/main/scripts/start_enrichment_topology.sh
@@ -20,11 +20,11 @@ METRON_VERSION=${project.version}
 METRON_HOME=/usr/metron/$METRON_VERSION
 TOPOLOGY_JAR=${project.artifactId}-$METRON_VERSION-uber.jar

-# there are two enrichment topologies. by default, the split-join enrichment topology is executed
+# There are two enrichment topologies. By default, the unified enrichment topology is executed. Split-join is now deprecated.
 SPLIT_JOIN_ARGS="--remote $METRON_HOME/flux/enrichment/remote-splitjoin.yaml --filter $METRON_HOME/config/enrichment-splitjoin.properties"
 UNIFIED_ARGS="--remote $METRON_HOME/flux/enrichment/remote-unified.yaml --filter $METRON_HOME/config/enrichment-unified.properties"

 # by passing in different args, the user can execute an alternative enrichment topology
-ARGS=${@:-$SPLIT_JOIN_ARGS}
+ARGS=${@:-$UNIFIED_ARGS}

 storm jar $METRON_HOME/lib/$TOPOLOGY_JAR org.apache.storm.flux.Flux $ARGS
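
Note on the default-argument change: the script falls back to `$UNIFIED_ARGS` only when it is invoked with no arguments, and any arguments passed on the command line replace the default entirely. The sketch below illustrates that fallback in isolation. It is not part of the commit: `resolve_args` is a hypothetical helper, the `/usr/metron/current` fallback for `METRON_HOME` is an assumed path used only so the snippet runs on its own, and it uses `$*` inside quotes rather than the script's unquoted `$@`.

```shell
#!/bin/sh
# Assumed install location, for illustration only; the real script derives
# METRON_HOME from the Metron version.
METRON_HOME="${METRON_HOME:-/usr/metron/current}"

SPLIT_JOIN_ARGS="--remote $METRON_HOME/flux/enrichment/remote-splitjoin.yaml --filter $METRON_HOME/config/enrichment-splitjoin.properties"
UNIFIED_ARGS="--remote $METRON_HOME/flux/enrichment/remote-unified.yaml --filter $METRON_HOME/config/enrichment-unified.properties"

# Hypothetical helper: print the arguments that would be handed to Flux.
# With no arguments the unified defaults are used; otherwise the caller's
# arguments are passed through unchanged.
resolve_args() {
  printf '%s\n' "${*:-$UNIFIED_ARGS}"
}

resolve_args                    # prints the unified (default) arguments
resolve_args $SPLIT_JOIN_ARGS   # prints the split-join arguments
```

Passing only one of `--remote` or `--filter` would therefore drop the other default rather than merge with it, which is presumably why the upgrade note supplies both flags together.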