[GitHub] metron issue #940: METRON-1460: Create a complementary non-split-join enrich...
Github user mraliagha commented on the issue: https://github.com/apache/metron/pull/940 Is there any document somewhere to show how the previous approach was implemented? I would like to understand the previous architecture in detail, because some of the pros/cons didn't make sense to me. Maybe I can help predict what the impact will be. Thanks. ---
[GitHub] metron pull request #853: METRON-1337: List of facets should not be hardcode...
Github user ottobackwards commented on a diff in the pull request: https://github.com/apache/metron/pull/853#discussion_r170125479 --- Diff: metron-interface/metron-rest/src/main/java/org/apache/metron/rest/service/impl/AlertServiceImpl.java --- @@ -37,15 +47,21 @@ @Service public class AlertServiceImpl implements AlertService { --- End diff -- I think that is great. At some point, if we are going to do this a few times we should have a standard naming convention ( separators at least or something). But that isn't a thing for this PR ---
[GitHub] metron pull request #853: METRON-1337: List of facets should not be hardcode...
Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/853#discussion_r170120909 --- Diff: metron-interface/metron-rest/src/main/java/org/apache/metron/rest/service/impl/AlertServiceImpl.java --- @@ -37,15 +47,21 @@ @Service public class AlertServiceImpl implements AlertService { --- End diff -- Name is changed in the latest commit. Let me know what you think. ---
[GitHub] metron pull request #858: METRON-1344: Externalize the infrastructural compo...
Github user merrimanr closed the pull request at: https://github.com/apache/metron/pull/858 ---
[GitHub] metron pull request #941: METRON-1355: Convert metron-elasticsearch to new i...
GitHub user merrimanr opened a pull request: https://github.com/apache/metron/pull/941 METRON-1355: Convert metron-elasticsearch to new infrastructure ## Contributor Comments This PR switches metron-elasticsearch integration tests from using in-memory components to the e2e Docker infrastructure. A high-level summary of the changes:

- Updated Travis to only run the metron-elasticsearch tests
- Updated the Elasticsearch Docker image to the same version as Metron
- Removed the requirement to build a base "metron-centos" image. This removes an extra Docker build step, and I'm not sure caching it helps much anyway.
- Updated each integration test to use index names that are namespaced with the test class name
- Replaced the in-memory component setup with an Elasticsearch client configured for the Elasticsearch Docker container
- Added steps to set up/delete indices before/after each test
- Moved Elasticsearch in-memory helper methods and common integration test methods to a utils class
- Fixed a minor bug in the ElasticsearchMetaAlertDao class where the index is always hardcoded to "metaalert"
- Added some initial documentation for the metron-docker-e2e module

The scope of this PR is 3 of the 4 metron-elasticsearch integration tests:

- org.apache.metron.elasticsearch.integration.ElasticsearchMetaAlertIntegrationTest
- org.apache.metron.elasticsearch.integration.ElasticsearchSearchIntegrationTest
- org.apache.metron.elasticsearch.integration.ElasticsearchUpdateIntegrationTest

Most of the test logic in org.apache.metron.elasticsearch.integration.ElasticsearchIndexingIntegrationTest is actually in metron-indexing, so that test will be updated when we convert that module. At this point only the metron-elasticsearch tests are run in Travis. My plan is to slowly add tests for each module until we reach feature parity with master. At that point the "install" section of .travis.yml will be the same.
The alerts UI e2e tests were removed for now until the work being done to stabilize them is complete. The ES in-memory component starts up pretty fast so performance improved for the 3 tests by about 7 seconds. Curious to hear what people think. ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [x] Have you included steps or a guide to how the change may be verified and tested manually? - [ ] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via: ``` mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh ``` - [x] Have you written or updated unit tests and/or integration tests to verify your changes?
- [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] Have you verified the basic functionality of the build by building and running locally with the Vagrant full-dev environment or the equivalent? ### For documentation related changes: - [ ] Have you ensured that the format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not, then run the following commands and verify the changes via `site-book/target/site/index.html`: ``` cd site-book mvn site ``` Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your
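As a rough illustration of the index-namespacing and per-test index lifecycle described in the PR above: the class and helper names below are hypothetical stand-ins, not the PR's actual code, and a plain `Map` stands in for the shared Elasticsearch container.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the pattern the PR describes: each test class derives
// its index names from its own class name, so test classes sharing one
// Elasticsearch Docker container cannot collide, and each test creates its
// indices up front and deletes them afterwards. A HashMap stands in for the
// real cluster here.
public class IndexNamespaceSketch {
  // e.g. ElasticsearchSearchIntegrationTest + "test"
  //   -> "elasticsearchsearchintegrationtest_test"
  public static String namespacedIndex(Class<?> testClass, String baseIndex) {
    return testClass.getSimpleName().toLowerCase() + "_" + baseIndex;
  }

  // Stand-in for the shared Elasticsearch container's index set.
  static final Map<String, Map<String, Object>> CLUSTER = new HashMap<>();

  public static void setupIndex(String index) {   // would run before each test
    CLUSTER.put(index, new HashMap<>());
  }

  public static void deleteIndex(String index) {  // would run after each test
    CLUSTER.remove(index);
  }

  public static void main(String[] args) {
    String index = namespacedIndex(IndexNamespaceSketch.class, "test");
    setupIndex(index);
    System.out.println(index + " exists: " + CLUSTER.containsKey(index));
    deleteIndex(index);
    System.out.println(index + " exists: " + CLUSTER.containsKey(index));
  }
}
```

The payoff of the namespacing is that two test classes can run against the same container concurrently or in either order without one class's teardown deleting the other's data.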
[GitHub] metron pull request #940: Single bolt split join poc
Github user cestella commented on a diff in the pull request: https://github.com/apache/metron/pull/940#discussion_r170089790 --- Diff: metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/bolt/UnifiedEnrichmentBolt.java --- @@ -0,0 +1,323 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.metron.enrichment.bolt; + +import org.apache.metron.common.Constants; +import org.apache.metron.common.bolt.ConfiguredEnrichmentBolt; +import org.apache.metron.common.configuration.ConfigurationType; +import org.apache.metron.common.configuration.enrichment.SensorEnrichmentConfig; +import org.apache.metron.common.configuration.enrichment.handler.ConfigHandler; +import org.apache.metron.common.error.MetronError; +import org.apache.metron.common.performance.PerformanceLogger; +import org.apache.metron.common.utils.ErrorUtils; +import org.apache.metron.common.utils.MessageUtils; +import org.apache.metron.enrichment.adapters.geo.GeoLiteDatabase; +import org.apache.metron.enrichment.configuration.Enrichment; +import org.apache.metron.enrichment.interfaces.EnrichmentAdapter; +import org.apache.metron.enrichment.parallel.EnrichmentContext; +import org.apache.metron.enrichment.parallel.EnrichmentStrategies; +import org.apache.metron.enrichment.parallel.ParallelEnricher; +import org.apache.metron.enrichment.parallel.WorkerPoolStrategy; +import org.apache.metron.stellar.dsl.Context; +import org.apache.metron.stellar.dsl.StellarFunction; +import org.apache.metron.stellar.dsl.StellarFunctions; +import org.apache.storm.task.OutputCollector; +import org.apache.storm.task.TopologyContext; +import org.apache.storm.topology.OutputFieldsDeclarer; +import org.apache.storm.tuple.Fields; +import org.apache.storm.tuple.Tuple; +import org.apache.storm.tuple.Values; +import org.json.simple.JSONObject; +import org.json.simple.parser.JSONParser; +import org.json.simple.parser.ParseException; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.UnsupportedEncodingException; +import java.lang.invoke.MethodHandles; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.UUID; + --- End diff -- Ok, I went through and did more rigorous documentation throughout the new classes. 
Let me know if it makes sense or if there are still issues. ---
[GitHub] metron pull request #940: Single bolt split join poc
Github user cestella commented on a diff in the pull request: https://github.com/apache/metron/pull/940#discussion_r170057327 --- Diff: metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/bolt/UnifiedEnrichmentBolt.java --- @@ -0,0 +1,323 @@ (same license header and import block as quoted in the previous comment) --- End diff -- Yeah, good call. ---
Re: [DISCUSS] Alternatives to split/join enrichment
FYI, the PR for this is up at https://github.com/apache/metron/pull/940 For those interested, please comment on the actual implementation there. On Thu, Feb 22, 2018 at 12:43 PM, Casey Stella wrote: > So, these are good questions, as usual Otto :) > > > how does this affect the distribution of work through the cluster, and > resiliency of the topologies? > > This moves us to a data parallelism scheme rather than a task parallelism > scheme. This, in effect, means that we will not be distributing the > partial enrichments across the network for a given message, but rather > distributing the messages across the network for *full* enrichment. So, > the bundle of work is the same, but we're not concentrating capabilities in > specific workers. Then again, as soon as we moved to stellar enrichments > and sub-groups where you can interact with hbase or geo from within > stellar, we sorta abandoned specialization. Resiliency shouldn't be > affected and, indeed, it should be easier to reason about. We ack after > every bolt in the new scheme rather than avoid acking until we join and ack > the original tuple. In fact, I'm still not convinced there's not a bug > somewhere in that join bolt that makes it so we don't ack the right tuple. > > > Is anyone else doing it like this? > > The stormy way of doing this is to specialize in the bolts and join, no > doubt, in a fan-out/fan-in pattern. I do not think it's unheard of, > though, to use a threadpool. It's slightly peculiar inasmuch as storm has > its own threading model, but it is an embarrassingly parallel task and the > main shift is trading the unit of parallelism from enrichment task to > message for the gain of fewer network hops. That being said, as long as > you're not emitting from a different thread than you are receiving from, > there's no technical limitation. > > > Can we have multiple thread pools and group tasks together ( or separate > them ) wrt hbase?
> > We could, but I think we might consider starting with just a simple static > threadpool that we configure at the topology level (e.g. multiple worker > threads can share the same threadpool that we can configure). I think as > the trend of moving everything to stellar continues, we may end up in a > situation where we don't have a coherent or clear way to differentiate > between thread pools like we do now. > > > Also, how are we to measure the effect? > > Well, some of the benefits here are at an architectural/feature level, the > most exciting of which is that this approach opens up avenues for stellar > subgroups to depend on each other. Slightly less exciting, but still nice > is the fact that this normalizes us with *other* streaming technologies and > the decoupling work done as part of the PR (soon to be released) will make > it easy to transition if we so desire. Beyond that, for performance, > someone will have to run some performance tests or try it out in a > situation where they're having some enrichment performance issues. Until > we do that, I think we should probably just keep it as a parallel approach > that you can swap out if you so desire. > > On Thu, Feb 22, 2018 at 11:48 AM, Otto Fowler > wrote: > >> This sounds worth exploring. A couple of questions: >> >> * how does this affect the distribution of work through the cluster, and >> resiliency of the topologies? >> * Is anyone else doing it like this? >> * Can we have multiple thread pools and group tasks together ( or >> separate them ) wrt hbase? >> >> >> >> On February 22, 2018 at 11:32:39, Casey Stella (ceste...@gmail.com) >> wrote: >> >> Hi all, >> >> I've been thinking and working on something that I wanted to get some >> feedback on. The way that we do our enrichments, the split/join >> architecture was created to effectively parallelize enrichments in a >> storm-like way in contrast to OpenSoc.
>> >> There are some good parts to this architecture: >> >> - It works, enrichments are done in parallel >> - You can tune individual enrichments differently >> - It's very storm-like >> >> There are also some deficiencies: >> >> - It's hard to reason about >> - Understanding the latency of enriching a message requires looking >> at multiple bolts that each give summary statistics >> - The join bolt's cache is really hard to reason about when performance >> tuning >> - During spikes in traffic, you can overload the join bolt's cache >> and drop messages if you aren't careful >> - In general, it's hard to associate a cache size and a duration kept >> in cache with throughput and latency >> - There are a lot of network hops per message >> - Right now we are stuck at 2 stages of transformations being done >> (enrichment and threat intel). It's very possible that you might want >> stellar enrichments to depend on the output of other stellar enrichments. >> In order to implement this in split/join you'd have to create a cycle in >> the storm topology >> >> I propose a
[GitHub] metron pull request #940: Single bolt split join poc
GitHub user cestella opened a pull request: https://github.com/apache/metron/pull/940 Single bolt split join poc ## Contributor Comments There are some deficiencies to the split/join topology:

* It's hard to reason about
  * Understanding the latency of enriching a message requires looking at multiple bolts that each give summary statistics
  * The join bolt's cache is really hard to reason about when performance tuning
    * During spikes in traffic, you can overload the join bolt's cache and drop messages if you aren't careful
    * In general, it's hard to associate a cache size and a duration kept in cache with throughput and latency
* There are a lot of network hops per message
* Right now we are stuck at 2 stages of transformations being done (enrichment and threat intel). It's very possible that you might want stellar enrichments to depend on the output of other stellar enrichments. In order to implement this in split/join you'd have to create a cycle in the storm topology

I propose that we move to a model where we do enrichments in a single bolt in parallel using a static threadpool (e.g. multiple workers in the same process would share the threadpool). In all other ways, this would be backwards compatible: a transparent drop-in for the existing enrichment topology. There are some pros/cons about this too:

* Pro
  * Easier to reason about from an individual message perspective
  * Architecturally decoupled from Storm
    * This sets us up if we want to consider other streaming technologies
  * Fewer bolts
    * spout -> enrichment bolt -> threatintel bolt -> output bolt
  * Way fewer network hops per message
    * currently 2n+1 where n is the number of enrichments used (if using stellar subgroups, each subgroup is a hop)
  * Easier to reason about from a performance perspective
    * We trade cache size and eviction timeout for threadpool size
  * We set ourselves up to have stellar subgroups with dependencies, i.e. stellar subgroups that depend on the output of other subgroups. If we do this, we can shrink the topology to just spout -> enrichment/threat intel -> output
* Con
  * We can no longer tune stellar enrichments independent of HBase enrichments
    * To be fair, with enrichments moving to stellar, this is the case in the split/join approach too
  * No idea about performance

What I propose is to submit a PR that will deliver an alternative, completely backwards compatible topology for enrichment that you can use by adjusting the `start_enrichment_topology.sh` script to use `remote-unified.yaml` instead of `remote.yaml`. If we live with it for a while and have some good experiences with it, maybe we can consider retiring the old enrichment topology. To test this, spin up vagrant and edit `$METRON_HOME/bin/start_enrichment_topology.sh` to use `remote-unified.yaml` instead of `remote.yaml`. Restart enrichment and you should see a topology that looks something like: ![image](https://user-images.githubusercontent.com/540359/36556636-e0ae092e-17d3-11e8-9e45-5160b4f23451.png) ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [ ] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [ ] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve?
Pay particular attention to the hyphen "-" character. - [ ] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [ ] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [ ] Have you included steps or a guide to how the change may be verified and tested manually? - [ ] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via: ``` mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh ``` - [ ] Have you written or updated unit tests and or integration tests to verify your changes? - [ ] If adding new dependencies to the code, are these
Re: [DISCUSS] Alternatives to split/join enrichment
So, these are good questions, as usual Otto :) > how does this affect the distribution of work through the cluster, and resiliency of the topologies? This moves us to a data parallelism scheme rather than a task parallelism scheme. This, in effect, means that we will not be distributing the partial enrichments across the network for a given message, but rather distributing the messages across the network for *full* enrichment. So, the bundle of work is the same, but we're not concentrating capabilities in specific workers. Then again, as soon as we moved to stellar enrichments and sub-groups where you can interact with hbase or geo from within stellar, we sorta abandoned specialization. Resiliency shouldn't be affected and, indeed, it should be easier to reason about. We ack after every bolt in the new scheme rather than avoid acking until we join and ack the original tuple. In fact, I'm still not convinced there's not a bug somewhere in that join bolt that makes it so we don't ack the right tuple. > Is anyone else doing it like this? The stormy way of doing this is to specialize in the bolts and join, no doubt, in a fan-out/fan-in pattern. I do not think it's unheard of, though, to use a threadpool. It's slightly peculiar inasmuch as storm has its own threading model, but it is an embarrassingly parallel task and the main shift is trading the unit of parallelism from enrichment task to message for the gain of fewer network hops. That being said, as long as you're not emitting from a different thread than you are receiving from, there's no technical limitation. > Can we have multiple thread pools and group tasks together ( or separate them ) wrt hbase? We could, but I think we might consider starting with just a simple static threadpool that we configure at the topology level (e.g. multiple worker threads can share the same threadpool that we can configure).
I think as the trend of moving everything to stellar continues, we may end up in a situation where we don't have a coherent or clear way to differentiate between thread pools like we do now. > Also, how are we to measure the effect? Well, some of the benefits here are at an architectural/feature level, the most exciting of which is that this approach opens up avenues for stellar subgroups to depend on each other. Slightly less exciting, but still nice, is the fact that this normalizes us with *other* streaming technologies and the decoupling work done as part of the PR (soon to be released) will make it easy to transition if we so desire. Beyond that, for performance, someone will have to run some performance tests or try it out in a situation where they're having some enrichment performance issues. Until we do that, I think we should probably just keep it as a parallel approach that you can swap out if you so desire. On Thu, Feb 22, 2018 at 11:48 AM, Otto Fowler wrote: > This sounds worth exploring. A couple of questions: > > * how does this affect the distribution of work through the cluster, and > resiliency of the topologies? > * Is anyone else doing it like this? > * Can we have multiple thread pools and group tasks together ( or separate > them ) wrt hbase? > > > > On February 22, 2018 at 11:32:39, Casey Stella (ceste...@gmail.com) wrote: > > Hi all, > > I've been thinking and working on something that I wanted to get some > feedback on. The way that we do our enrichments, the split/join > architecture was created to effectively parallelize enrichments in a > storm-like way in contrast to OpenSoc.
> > There are some good parts to this architecture: > > - It works, enrichments are done in parallel > - You can tune individual enrichments differently > - It's very storm-like > > There are also some deficiencies: > > - It's hard to reason about > - Understanding the latency of enriching a message requires looking > at multiple bolts that each give summary statistics > - The join bolt's cache is really hard to reason about when performance > tuning > - During spikes in traffic, you can overload the join bolt's cache > and drop messages if you aren't careful > - In general, it's hard to associate a cache size and a duration kept > in cache with throughput and latency > - There are a lot of network hops per message > - Right now we are stuck at 2 stages of transformations being done > (enrichment and threat intel). It's very possible that you might want > stellar enrichments to depend on the output of other stellar enrichments. > In order to implement this in split/join you'd have to create a cycle in > the storm topology > > I propose a change. I propose that we move to a model where we do > enrichments in a single bolt in parallel using a static threadpool (e.g. > multiple workers in the same process would share the threadpool). In all > other ways, this would be backwards compatible. A transparent drop-in for > the existing enrichment topology. > > There are some pros/cons about this too: > > - Pro > -
Re: [DISCUSS] Alternatives to split/join enrichment
Also, how are we to measure the effect? Not to get all six sigma ;) On February 22, 2018 at 11:48:41, Otto Fowler (ottobackwa...@gmail.com) wrote: This sounds worth exploring. A couple of questions: * how does this affect the distribution of work through the cluster, and resiliency of the topologies? * Is anyone else doing it like this? * Can we have multiple thread pools and group tasks together ( or separate them ) wrt hbase? On February 22, 2018 at 11:32:39, Casey Stella (ceste...@gmail.com) wrote: Hi all, I've been thinking and working on something that I wanted to get some feedback on. The way that we do our enrichments, the split/join architecture was created to effectively parallelize enrichments in a storm-like way in contrast to OpenSoc. There are some good parts to this architecture: - It works, enrichments are done in parallel - You can tune individual enrichments differently - It's very storm-like There are also some deficiencies: - It's hard to reason about - Understanding the latency of enriching a message requires looking at multiple bolts that each give summary statistics - The join bolt's cache is really hard to reason about when performance tuning - During spikes in traffic, you can overload the join bolt's cache and drop messages if you aren't careful - In general, it's hard to associate a cache size and a duration kept in cache with throughput and latency - There are a lot of network hops per message - Right now we are stuck at 2 stages of transformations being done (enrichment and threat intel). It's very possible that you might want stellar enrichments to depend on the output of other stellar enrichments. In order to implement this in split/join you'd have to create a cycle in the storm topology I propose a change. I propose that we move to a model where we do enrichments in a single bolt in parallel using a static threadpool (e.g. multiple workers in the same process would share the threadpool).
In all other ways, this would be backwards compatible. A transparent drop-in for the existing enrichment topology. There are some pros/cons about this too:

- Pro
  - Easier to reason about from an individual message perspective
  - Architecturally decoupled from Storm
    - This sets us up if we want to consider other streaming technologies
  - Fewer bolts
    - spout -> enrichment bolt -> threatintel bolt -> output bolt
  - Way fewer network hops per message
    - currently 2n+1 where n is the number of enrichments used (if using stellar subgroups, each subgroup is a hop)
  - Easier to reason about from a performance perspective
    - We trade cache size and eviction timeout for threadpool size
  - We set ourselves up to have stellar subgroups with dependencies
    - i.e. stellar subgroups that depend on the output of other subgroups
    - If we do this, we can shrink the topology to just spout -> enrichment/threat intel -> output
- Con
  - We can no longer tune stellar enrichments independent of HBase enrichments
    - To be fair, with enrichments moving to stellar, this is the case in the split/join approach too
  - No idea about performance

What I propose is to submit a PR that will deliver an alternative, completely backwards compatible topology for enrichment that you can use by adjusting the start_enrichment_topology.sh script to use remote-unified.yaml instead of remote.yaml. If we live with it for a while and have some good experiences with it, maybe we can consider retiring the old enrichment topology. Thoughts? Keep me honest; if I have over- or understated the issues for split/join or missed some important architectural issue let me know. I'm going to submit a PR to this effect by the EOD today so things will be more obvious.
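The "single bolt with a static threadpool" idea above can be sketched with nothing but java.util.concurrent. This is an illustrative stand-in, not Metron's actual UnifiedEnrichmentBolt: enrichments are modeled as plain functions, the pool size is arbitrary, and the fan-out/fan-in happens entirely in-process.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch of the proposed single-bolt model. Enrichments are
// modeled as Function<message, partial-result>; the real classes differ.
public class UnifiedEnrichmentSketch {
  // One static pool shared by every bolt instance in the worker JVM; its size
  // is the tuning knob that replaces the join bolt's cache size and eviction
  // timeout. The value 4 is arbitrary for the sketch.
  static final ExecutorService POOL = Executors.newFixedThreadPool(4);

  // Fan each enrichment out to the pool, then join the partial results
  // in-process: no join bolt, no join cache, no extra network hops.
  public static Map<String, Object> enrich(
      Map<String, Object> message,
      List<Function<Map<String, Object>, Map<String, Object>>> enrichments) {
    List<Future<Map<String, Object>>> futures = enrichments.stream()
        .map(e -> POOL.submit(() -> e.apply(message)))
        .collect(Collectors.toList());
    Map<String, Object> enriched = new ConcurrentHashMap<>(message);
    for (Future<Map<String, Object>> f : futures) {
      try {
        enriched.putAll(f.get()); // block on the receiving thread
      } catch (Exception e) {
        throw new RuntimeException(e);
      }
    }
    // The caller (a bolt's execute()) would emit 'enriched' from this same
    // receiving thread, staying within Storm's threading rules.
    return enriched;
  }

  public static void main(String[] args) {
    Map<String, Object> message = new HashMap<>();
    message.put("ip_src_addr", "10.0.0.1");
    Map<String, Object> out = enrich(message, Arrays.asList(
        m -> Collections.singletonMap("geo.country", "US"),     // stand-in geo enrichment
        m -> Collections.singletonMap("threatintel.hit", false) // stand-in threat intel
    ));
    System.out.println(out);
    POOL.shutdown();
  }
}
```

Note the point made in the thread: because the results are joined and returned on the thread that received the message, the emit never happens from a pool thread, which is what keeps this compatible with Storm's threading model.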
Re: [DISCUSS] Alternatives to split/join enrichment
This sounds worth exploring. A couple of questions: * how does this affect the distribution of work through the cluster, and resiliency of the topologies? * Is anyone else doing it like this? * Can we have multiple thread pools and group tasks together ( or separate them ) wrt hbase? On February 22, 2018 at 11:32:39, Casey Stella (ceste...@gmail.com) wrote: Hi all, I've been thinking and working on something that I wanted to get some feedback on. The way that we do our enrichments, the split/join architecture was created to effectively parallelize enrichments in a storm-like way in contrast to OpenSoc. There are some good parts to this architecture: - It works, enrichments are done in parallel - You can tune individual enrichments differently - It's very storm-like There are also some deficiencies: - It's hard to reason about - Understanding the latency of enriching a message requires looking at multiple bolts that each give summary statistics - The join bolt's cache is really hard to reason about when performance tuning - During spikes in traffic, you can overload the join bolt's cache and drop messages if you aren't careful - In general, it's hard to associate a cache size and a duration kept in cache with throughput and latency - There are a lot of network hops per message - Right now we are stuck at 2 stages of transformations being done (enrichment and threat intel). It's very possible that you might want stellar enrichments to depend on the output of other stellar enrichments. In order to implement this in split/join you'd have to create a cycle in the storm topology I propose a change. I propose that we move to a model where we do enrichments in a single bolt in parallel using a static threadpool (e.g. multiple workers in the same process would share the threadpool). In all other ways, this would be backwards compatible. A transparent drop-in for the existing enrichment topology.
There are some pros/cons about this too: - Pro - Easier to reason about from an individual message perspective - Architecturally decoupled from Storm - This sets us up if we want to consider other streaming technologies - Fewer bolts - spout -> enrichment bolt -> threatintel bolt -> output bolt - Way fewer network hops per message - currently 2n+1 where n is the number of enrichments used (if using stellar subgroups, each subgroup is a hop) - Easier to reason about from a performance perspective - We trade cache size and eviction timeout for threadpool size - We set ourselves up to have stellar subgroups with dependencies - i.e. stellar subgroups that depend on the output of other subgroups - If we do this, we can shrink the topology to just spout -> enrichment/threat intel -> output - Con - We can no longer tune stellar enrichments independent from HBase enrichments - To be fair, with enrichments moving to stellar, this is the case in the split/join approach too - No idea about performance What I propose is to submit a PR that will deliver an alternative, completely backwards compatible topology for enrichment that you can use by adjusting the start_enrichment_topology.sh script to use remote-unified.yaml instead of remote.yaml. If we live with it for a while and have some good experiences with it, maybe we can consider retiring the old enrichment topology. Thoughts? Keep me honest; if I have over or understated the issues for split/join or missed some important architectural issue let me know. I'm going to submit a PR to this effect by the EOD today so things will be more obvious.
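One answer to the multiple-thread-pools question could look like the sketch below. This is purely illustrative (the class and pool names are hypothetical, not Metron APIs): slow, I/O-bound HBase lookups get their own pool so they cannot starve cheap, CPU-bound Stellar evaluations, and each pool can be sized independently.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical sketch (not Metron code): separate pools per workload type,
// routing each enrichment task to the pool that matches its cost profile.
public class EnrichmentPools {
    // CPU-bound Stellar expressions: size the pool to the core count.
    // Daemon threads so the pools never block JVM shutdown.
    static final ExecutorService STELLAR_POOL = Executors.newFixedThreadPool(
        Runtime.getRuntime().availableProcessors(),
        r -> { Thread t = new Thread(r); t.setDaemon(true); return t; });

    // Blocking HBase calls spend most of their time waiting on the network,
    // so they get a larger, separately tunable pool.
    static final ExecutorService HBASE_POOL = Executors.newFixedThreadPool(
        32, r -> { Thread t = new Thread(r); t.setDaemon(true); return t; });

    // Route a task to the pool that matches its workload type.
    public static <T> CompletableFuture<T> submit(boolean isHBaseBound, Supplier<T> task) {
        return CompletableFuture.supplyAsync(task, isHBaseBound ? HBASE_POOL : STELLAR_POOL);
    }
}
```

With this split, tuning the HBase pool size is a rough analogue of tuning the HBase enrichment bolt's parallelism in the split/join topology, which speaks to the "can no longer tune independently" concern below.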
[DISCUSS] Alternatives to split/join enrichment
Hi all,

I've been thinking and working on something that I wanted to get some feedback on. The split/join architecture for enrichments was created to effectively parallelize enrichments in a Storm-like way, in contrast to OpenSOC. There are some good parts to this architecture:
- It works; enrichments are done in parallel
- You can tune individual enrichments differently
- It's very Storm-like

There are also some deficiencies:
- It's hard to reason about
  - Understanding the latency of enriching a message requires looking at multiple bolts that each give summary statistics
  - The join bolt's cache is really hard to reason about when performance tuning
  - During spikes in traffic, you can overload the join bolt's cache and drop messages if you aren't careful
  - In general, it's hard to associate a cache size and a duration kept in cache with throughput and latency
- There are a lot of network hops per message
- Right now we are stuck at 2 stages of transformations (enrichment and threat intel). It's very possible that you might want Stellar enrichments to depend on the output of other Stellar enrichments. To implement this in split/join, you'd have to create a cycle in the Storm topology.

I propose a change: that we move to a model where we do enrichments in a single bolt in parallel using a static threadpool (e.g. multiple workers in the same process would share the threadpool). In all other ways, this would be backwards compatible, a transparent drop-in for the existing enrichment topology.

There are some pros/cons about this too:
- Pro
  - Easier to reason about from an individual message perspective
  - Architecturally decoupled from Storm
    - This sets us up if we want to consider other streaming technologies
  - Fewer bolts: spout -> enrichment bolt -> threatintel bolt -> output bolt
  - Way fewer network hops per message: currently 2n+1 where n is the number of enrichments used (if using Stellar subgroups, each subgroup is a hop)
  - Easier to reason about from a performance perspective: we trade cache size and eviction timeout for threadpool size
  - We set ourselves up to have Stellar subgroups with dependencies, i.e. Stellar subgroups that depend on the output of other subgroups
    - If we do this, we can shrink the topology to just spout -> enrichment/threat intel -> output
- Con
  - We can no longer tune Stellar enrichments independently of HBase enrichments
    - To be fair, with enrichments moving to Stellar, this is the case in the split/join approach too
  - No idea about performance

What I propose is to submit a PR that will deliver an alternative, completely backwards compatible topology for enrichment that you can use by adjusting the start_enrichment_topology.sh script to use remote-unified.yaml instead of remote.yaml. If we live with it for a while and have some good experiences with it, maybe we can consider retiring the old enrichment topology.

Thoughts? Keep me honest; if I have over- or understated the issues for split/join, or missed some important architectural issue, let me know. I'm going to submit a PR to this effect by the EOD today so things will be more obvious.
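The proposed model (all enrichments for a message fanned out onto a static, process-wide threadpool and joined in-process, with no join bolt or cache) could be sketched roughly as follows. The class name, pool size, and enrichment representation are illustrative assumptions, not the actual Metron implementation:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch of the unified enrichment model: every enrichment for a
// message runs in parallel on one static pool shared by everything in the same
// worker process, and results are merged in-process instead of in a join bolt.
public class UnifiedEnricher {
    // Static, process-wide pool: its size replaces the join bolt's cache size
    // and eviction timeout as the main tuning knob. Daemon threads so the pool
    // never blocks JVM shutdown.
    private static final ExecutorService POOL = Executors.newFixedThreadPool(
        4, r -> { Thread t = new Thread(r); t.setDaemon(true); return t; });

    public static Map<String, Object> enrich(
            Map<String, Object> message,
            Map<String, Function<Map<String, Object>, Map<String, Object>>> enrichments) {
        // Kick off every enrichment in parallel on the shared pool.
        List<CompletableFuture<Map<String, Object>>> futures = enrichments.values().stream()
            .map(e -> CompletableFuture.supplyAsync(() -> e.apply(message), POOL))
            .collect(Collectors.toList());
        // Join in-process: no join bolt, and no cache of partially joined
        // messages to size or to overflow during traffic spikes.
        Map<String, Object> enriched = new HashMap<>(message);
        futures.forEach(f -> enriched.putAll(f.join()));
        return enriched;
    }
}
```

Because the whole enrichment of a message starts and ends in one place, its latency is just the slowest of its parallel enrichment calls, which is the "easier to reason about from an individual message perspective" point above.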
[GitHub] metron issue #924: METRON-1299 In MetronError tests, don't test for HostName...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/924 +1 pending @cestella's approval. Thanks @ottobackwards. ---