[GitHub] metron issue #940: METRON-1460: Create a complementary non-split-join enrich...

2018-02-22 Thread mraliagha
Github user mraliagha commented on the issue:

https://github.com/apache/metron/pull/940
  
Is there any document somewhere to show how the previous approach was 
implemented? I would like to understand the previous architecture in detail, 
because some of the pros/cons didn't make sense to me. Maybe I can help 
predict what the impact will be. Thanks. 


---


[GitHub] metron pull request #853: METRON-1337: List of facets should not be hardcode...

2018-02-22 Thread ottobackwards
Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/853#discussion_r170125479
  
--- Diff: metron-interface/metron-rest/src/main/java/org/apache/metron/rest/service/impl/AlertServiceImpl.java ---
@@ -37,15 +47,21 @@
 @Service
 public class AlertServiceImpl implements AlertService {
 
--- End diff --

I think that is great.  At some point, if we are going to do this a few 
times, we should have a standard naming convention (separators at least, or 
something).  But that isn't a thing for this PR.


---


[GitHub] metron pull request #853: METRON-1337: List of facets should not be hardcode...

2018-02-22 Thread merrimanr
Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/853#discussion_r170120909
  
--- Diff: metron-interface/metron-rest/src/main/java/org/apache/metron/rest/service/impl/AlertServiceImpl.java ---
@@ -37,15 +47,21 @@
 @Service
 public class AlertServiceImpl implements AlertService {
 
--- End diff --

Name is changed in the latest commit.  Let me know what you think.


---


[GitHub] metron pull request #858: METRON-1344: Externalize the infrastructural compo...

2018-02-22 Thread merrimanr
Github user merrimanr closed the pull request at:

https://github.com/apache/metron/pull/858


---


[GitHub] metron pull request #941: METRON-1355: Convert metron-elasticsearch to new i...

2018-02-22 Thread merrimanr
GitHub user merrimanr opened a pull request:

https://github.com/apache/metron/pull/941

METRON-1355: Convert metron-elasticsearch to new infrastructure

## Contributor Comments
This PR switches metron-elasticsearch integration tests from using 
in-memory components to the e2e Docker infrastructure.  A high-level summary of 
the changes:

- Updated travis to only run the metron-elasticsearch tests
- Updated the Elasticsearch Docker image to the same version as Metron
- Removed the requirement to build a base "metron-centos" image.  This removes 
an extra Docker build step, and I'm not sure caching this image helps much anyway.
- Updated each integration test to use index names that are namespaced with 
the test class name (see the sketch after this list)
- Replaced the in-memory component setup with an Elasticsearch client 
configured for the Elasticsearch Docker container
- Added steps to set up/delete indices before/after each test
- Moved Elasticsearch in-memory helper methods and common integration test 
methods to a utils class
- Fixed a minor bug in the ElasticsearchMetaAlertDao class where the index 
was always hardcoded to "metaalert"
- Added some initial documentation for the metron-docker-e2e module
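
For illustration, here is a minimal sketch of the index-namespacing idea 
mentioned above.  The helper class and method names are hypothetical, purely 
illustrative rather than the actual utilities in this PR:

```
// Hypothetical helper: prefix Elasticsearch index names with the test class
// name so each integration test reads and writes its own namespaced indices
// on the shared Elasticsearch Docker container.
public class ElasticsearchTestUtils {

  // e.g. indexName(ElasticsearchSearchIntegrationTest.class, "bro")
  //   -> "elasticsearchsearchintegrationtest_bro"
  public static String indexName(Class<?> testClass, String baseIndex) {
    return testClass.getSimpleName().toLowerCase() + "_" + baseIndex;
  }
}
```

Each test would then create its namespaced indices in setup and delete them 
in teardown, which is what keeps the tests isolated on a shared container.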

The scope of this PR is 3 of the 4 metron-elasticsearch integration tests:

- org.apache.metron.elasticsearch.integration.ElasticsearchMetaAlertIntegrationTest
- org.apache.metron.elasticsearch.integration.ElasticsearchSearchIntegrationTest
- org.apache.metron.elasticsearch.integration.ElasticsearchUpdateIntegrationTest

Most of the test logic in 
org.apache.metron.elasticsearch.integration.ElasticsearchIndexingIntegrationTest 
is actually in metron-indexing, so that test will be updated when we convert 
that module.

At this point only the metron-elasticsearch tests are run in travis.  My 
plan is to slowly add tests for each module until we reach feature parity with 
master.  At that point the "install" section of .travis.yml will be the same.  
The alerts ui e2e tests were removed for now until the work being done to 
stabilize them is complete.

The ES in-memory component starts up pretty fast, so performance for the 3 
tests only improved by about 7 seconds.  Curious to hear what people think.

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  


In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [ ] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && 
dev-utilities/build-utils/verify_licenses.sh 
  ```

- [x] Have you written or updated unit tests and or integration tests to 
verify your changes?
- [x] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- [ ] Have you ensured that the format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not, then run 
the following commands and then verify the changes via 
`site-book/target/site/index.html`:

  ```
  cd site-book
  mvn site
  ```

 Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up 
for your personal repository such that your branches are built there before 
submitting a pull request.


---

[GitHub] metron pull request #940: Single bolt split join poc

2018-02-22 Thread cestella
Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/940#discussion_r170089790
  
--- Diff: metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/bolt/UnifiedEnrichmentBolt.java ---
@@ -0,0 +1,323 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.bolt;
+
+import org.apache.metron.common.Constants;
+import org.apache.metron.common.bolt.ConfiguredEnrichmentBolt;
+import org.apache.metron.common.configuration.ConfigurationType;
+import org.apache.metron.common.configuration.enrichment.SensorEnrichmentConfig;
+import org.apache.metron.common.configuration.enrichment.handler.ConfigHandler;
+import org.apache.metron.common.error.MetronError;
+import org.apache.metron.common.performance.PerformanceLogger;
+import org.apache.metron.common.utils.ErrorUtils;
+import org.apache.metron.common.utils.MessageUtils;
+import org.apache.metron.enrichment.adapters.geo.GeoLiteDatabase;
+import org.apache.metron.enrichment.configuration.Enrichment;
+import org.apache.metron.enrichment.interfaces.EnrichmentAdapter;
+import org.apache.metron.enrichment.parallel.EnrichmentContext;
+import org.apache.metron.enrichment.parallel.EnrichmentStrategies;
+import org.apache.metron.enrichment.parallel.ParallelEnricher;
+import org.apache.metron.enrichment.parallel.WorkerPoolStrategy;
+import org.apache.metron.stellar.dsl.Context;
+import org.apache.metron.stellar.dsl.StellarFunction;
+import org.apache.metron.stellar.dsl.StellarFunctions;
+import org.apache.storm.task.OutputCollector;
+import org.apache.storm.task.TopologyContext;
+import org.apache.storm.topology.OutputFieldsDeclarer;
+import org.apache.storm.tuple.Fields;
+import org.apache.storm.tuple.Tuple;
+import org.apache.storm.tuple.Values;
+import org.json.simple.JSONObject;
+import org.json.simple.parser.JSONParser;
+import org.json.simple.parser.ParseException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.UnsupportedEncodingException;
+import java.lang.invoke.MethodHandles;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.UUID;
+
--- End diff --

Ok, I went through and did more rigorous documentation throughout the new 
classes.  Let me know if it makes sense or if there are still issues.


---


[GitHub] metron pull request #940: Single bolt split join poc

2018-02-22 Thread cestella
Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/940#discussion_r170057327
  
--- Diff: metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/bolt/UnifiedEnrichmentBolt.java ---
@@ -0,0 +1,323 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.enrichment.bolt;
+
+import org.apache.metron.common.Constants;
+import org.apache.metron.common.bolt.ConfiguredEnrichmentBolt;
+import org.apache.metron.common.configuration.ConfigurationType;
+import org.apache.metron.common.configuration.enrichment.SensorEnrichmentConfig;
+import org.apache.metron.common.configuration.enrichment.handler.ConfigHandler;
+import org.apache.metron.common.error.MetronError;
+import org.apache.metron.common.performance.PerformanceLogger;
+import org.apache.metron.common.utils.ErrorUtils;
+import org.apache.metron.common.utils.MessageUtils;
+import org.apache.metron.enrichment.adapters.geo.GeoLiteDatabase;
+import org.apache.metron.enrichment.configuration.Enrichment;
+import org.apache.metron.enrichment.interfaces.EnrichmentAdapter;
+import org.apache.metron.enrichment.parallel.EnrichmentContext;
+import org.apache.metron.enrichment.parallel.EnrichmentStrategies;
+import org.apache.metron.enrichment.parallel.ParallelEnricher;
+import org.apache.metron.enrichment.parallel.WorkerPoolStrategy;
+import org.apache.metron.stellar.dsl.Context;
+import org.apache.metron.stellar.dsl.StellarFunction;
+import org.apache.metron.stellar.dsl.StellarFunctions;
+import org.apache.storm.task.OutputCollector;
+import org.apache.storm.task.TopologyContext;
+import org.apache.storm.topology.OutputFieldsDeclarer;
+import org.apache.storm.tuple.Fields;
+import org.apache.storm.tuple.Tuple;
+import org.apache.storm.tuple.Values;
+import org.json.simple.JSONObject;
+import org.json.simple.parser.JSONParser;
+import org.json.simple.parser.ParseException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.UnsupportedEncodingException;
+import java.lang.invoke.MethodHandles;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.UUID;
+
--- End diff --

Yeah, good call.


---


Re: [DISCUSS] Alternatives to split/join enrichment

2018-02-22 Thread Casey Stella
FYI, the PR for this is up at https://github.com/apache/metron/pull/940
For those interested, please comment on the actual implementation there.

On Thu, Feb 22, 2018 at 12:43 PM, Casey Stella  wrote:

> So, these are good questions, as usual Otto :)
>
> > how does this affect the distribution of work through the cluster, and
> resiliency of the topologies?
>
> This moves us to a data parallelism scheme rather than a task parallelism
> scheme.  This, in effect, means that we will not be distributing the
> partial enrichments across the network for a given message, but rather
> distributing the messages across the network for *full* enrichment.  So,
> the bundle of work is the same, but we're not concentrating capabilities in
> specific workers.  Then again, as soon as we moved to stellar enrichments
> and sub-groups where you can interact with hbase or geo from within
> stellar, we sorta abandoned specialization.  Resiliency shouldn't be
> affected and, indeed, it should be easier to reason about.  We ack after
> every bolt in the new scheme rather than avoid acking until we join and ack
> the original tuple.  In fact, I'm still not convinced there's not a bug
> somewhere in that join bolt that makes it so we don't ack the right tuple.
>
> > Is anyone else doing it like this?
>
> The stormy way of doing this is to specialize in the bolts and join, no
> doubt, in a fan-out/fan-in pattern.  I do not think it's unheard of,
> though, to use a threadpool.  It's slightly peculiar inasmuch as storm has
> its own threading model, but it is an embarrassingly parallel task and the
> main shift is trading the unit of parallelism from enrichment task to
> message to the gain of fewer network hops.  That being said, as long as
> you're not emitting from a different thread than you are receiving from,
> there's no technical limitation.
>
> > Can we have multiple thread pools and group tasks together ( or separate
> them ) wrt hbase?
>
> We could, but I think we might consider starting with just a simple static
> threadpool that we configure at the topology level (e.g. multiple worker
> threads can share the same threadpool that we can configure).  I think as
> the trend of moving everything to stellar continues, we may end up in a
> situation where we don't have a coherent or clear way to differentiate
> between thread pools like we do now.
>
> > Also, how are we to measure the effect?
>
> Well, some of the benefits here are at an architectural/feature level, the
> most exciting of which is that this approach opens up avenues for stellar
> subgroups to depend on each other.  Slightly less exciting, but still nice
> is the fact that this normalizes us with *other* streaming technologies and
> the decoupling work done as part of the PR (soon to be released) will make
> it easy to transition if we so desire.  Beyond that, for performance,
> someone will have to run some performance tests or try it out in a
> situation where they're having some enrichment performance issues.  Until
> we do that, I think we should probably just keep it as a parallel approach
> that you can swap out if you so desire.
>
> On Thu, Feb 22, 2018 at 11:48 AM, Otto Fowler 
> wrote:
>
>> This sounds worth exploring.  A couple of questions:
>>
>> * how does this affect the distribution of work through the cluster, and
>> resiliency of the topologies?
>> * Is anyone else doing it like this?
>> * Can we have multiple thread pools and group tasks together ( or
>> separate them ) wrt hbase?
>>
>>
>>
>> On February 22, 2018 at 11:32:39, Casey Stella (ceste...@gmail.com)
>> wrote:
>>
>> Hi all,
>>
>> I've been thinking and working on something that I wanted to get some
>> feedback on. The way that we do our enrichments, the split/join
>> architecture was created to effectively parallelize enrichments in a
>> storm-like way in contrast to OpenSoc.
>>
>> There are some good parts to this architecture:
>>
>> - It works, enrichments are done in parallel
>> - You can tune individual enrichments differently
>> - It's very storm-like
>>
>> There are also some deficiencies:
>>
>> - It's hard to reason about
>> - Understanding the latency of enriching a message requires looking
>> at multiple bolts that each give summary statistics
>> - The join bolt's cache is really hard to reason about when performance
>> tuning
>> - During spikes in traffic, you can overload the join bolt's cache
>> and drop messages if you aren't careful
>> - In general, it's hard to associate a cache size and a duration kept
>> in cache with throughput and latency
>> - There are a lot of network hops per message
>> - Right now we are stuck at 2 stages of transformations being done
>> (enrichment and threat intel). It's very possible that you might want
>> stellar enrichments to depend on the output of other stellar enrichments.
>> In order to implement this in split/join you'd have to create a cycle in
>> the storm topology
>>
>> I propose a change.

[GitHub] metron pull request #940: Single bolt split join poc

2018-02-22 Thread cestella
GitHub user cestella opened a pull request:

https://github.com/apache/metron/pull/940

Single bolt split join poc

## Contributor Comments
There are some deficiencies to the split/join topology.

* It's hard to reason about
  * Understanding the latency of enriching a message requires looking at 
multiple bolts that each give summary statistics
* The join bolt's cache is really hard to reason about when performance 
tuning
  * During spikes in traffic, you can overload the join bolt's cache and drop 
messages if you aren't careful
  * In general, it's hard to associate a cache size and a duration kept in 
cache with throughput and latency
* There are a lot of network hops per message
* Right now we are stuck at 2 stages of transformations being done 
(enrichment and threat intel).  It's very possible that you might want stellar 
enrichments to depend on the output of other stellar enrichments.  In order to 
implement this in split/join you'd have to create a cycle in the storm topology

I propose that we move to a model where we do enrichments in a single bolt 
in parallel using a static threadpool (e.g. multiple workers in the same 
process would share the threadpool).  In all other ways this would be 
backwards compatible: a transparent drop-in for the existing enrichment 
topology.
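
As a rough illustration of the idea, here is a minimal sketch of a shared 
static threadpool running all enrichments for one message in parallel.  The 
`Enrichment` interface and the pool-size property are hypothetical, for 
illustration only; they are not the actual classes in this PR:

```
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class ParallelEnrichmentSketch {

  // One static pool shared by every executor thread in the same worker process.
  private static final ExecutorService ENRICHMENT_POOL =
      Executors.newFixedThreadPool(Integer.getInteger("enrichment.pool.size", 8));

  // Hypothetical enrichment: message fields in, enriched fields out.
  public interface Enrichment {
    Map<String, Object> apply(Map<String, Object> message);
  }

  // Run every enrichment for a single message in parallel, then merge results.
  public static Map<String, Object> enrich(Map<String, Object> message,
                                           List<Enrichment> enrichments) {
    List<CompletableFuture<Map<String, Object>>> futures = enrichments.stream()
        .map(e -> CompletableFuture.supplyAsync(() -> e.apply(message), ENRICHMENT_POOL))
        .collect(Collectors.toList());
    // Join in the calling thread so the bolt still emits from the same
    // thread it received the tuple on.
    Map<String, Object> enriched = new HashMap<>(message);
    futures.forEach(f -> enriched.putAll(f.join()));
    return enriched;
  }
}
```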
There are some pros/cons about this too:
* Pro
  * Easier to reason about from an individual message perspective
  * Architecturally decoupled from Storm
    * This sets us up if we want to consider other streaming technologies
  * Fewer bolts
    * spout -> enrichment bolt -> threatintel bolt -> output bolt
  * Way fewer network hops per message
    * currently 2n+1 where n is the number of enrichments used (if using 
stellar subgroups, each subgroup is a hop)
  * Easier to reason about from a performance perspective
    * We trade cache size and eviction timeout for threadpool size
  * We set ourselves up to have stellar subgroups with dependencies
    * i.e. stellar subgroups that depend on the output of other subgroups
    * If we do this, we can shrink the topology to just spout -> 
enrichment/threat intel -> output
* Con
  * We can no longer tune stellar enrichments independent from HBase 
enrichments
    * To be fair, with enrichments moving to stellar, this is the case in 
the split/join approach too
  * No idea about performance

What I propose is to submit a PR that will deliver an alternative, 
completely backwards compatible topology for enrichment that you can use by 
adjusting the `start_enrichment_topology.sh` script to use 
`remote-unified.yaml` instead of `remote.yaml`.  If we live with it for a while 
and have some good experiences with it, maybe we can consider retiring the old 
enrichment topology.

To test this, spin up vagrant and edit 
`$METRON_HOME/bin/start_enrichment_topology.sh` to use `remote-unified.yaml` 
instead of `remote.yaml`.  Restart enrichment and you should see a topology 
that looks something like:

![image](https://user-images.githubusercontent.com/540359/36556636-e0ae092e-17d3-11e8-9e45-5160b4f23451.png)



## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  


In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [ ] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [ ] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [ ] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [ ] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && 
dev-utilities/build-utils/verify_licenses.sh 
  ```

- [ ] Have you written or updated unit tests and or integration tests to 
verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?


---


Re: [DISCUSS] Alternatives to split/join enrichment

2018-02-22 Thread Casey Stella
So, these are good questions, as usual Otto :)

> how does this affect the distribution of work through the cluster, and
resiliency of the topologies?

This moves us to a data parallelism scheme rather than a task parallelism
scheme.  This, in effect, means that we will not be distributing the
partial enrichments across the network for a given message, but rather
distributing the messages across the network for *full* enrichment.  So,
the bundle of work is the same, but we're not concentrating capabilities in
specific workers.  Then again, as soon as we moved to stellar enrichments
and sub-groups where you can interact with hbase or geo from within
stellar, we sorta abandoned specialization.  Resiliency shouldn't be
affected and, indeed, it should be easier to reason about.  We ack after
every bolt in the new scheme rather than avoid acking until we join and ack
the original tuple.  In fact, I'm still not convinced there's not a bug
somewhere in that join bolt that makes it so we don't ack the right tuple.
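
As a minimal sketch of the per-bolt acking pattern using the standard Storm
bolt API (this bolt is illustrative, not one of the classes in the PR):

import java.util.Map;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

// Illustrative bolt: emit anchored to the input tuple, then ack immediately,
// instead of deferring the ack to a downstream join bolt.
public class AckingEnrichmentBolt extends BaseRichBolt {
  private OutputCollector collector;

  @Override
  public void prepare(Map stormConf, TopologyContext context,
                      OutputCollector collector) {
    this.collector = collector;
  }

  @Override
  public void execute(Tuple input) {
    Object message = input.getValue(0);          // enrichment logic elided
    collector.emit(input, new Values(message));  // anchored emit
    collector.ack(input);                        // ack in this bolt
  }

  @Override
  public void declareOutputFields(OutputFieldsDeclarer declarer) {
    declarer.declare(new Fields("message"));
  }
}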

> Is anyone else doing it like this?

The stormy way of doing this is to specialize in the bolts and join, no
doubt, in a fan-out/fan-in pattern.  I do not think it's unheard of,
though, to use a threadpool.  It's slightly peculiar inasmuch as storm has
its own threading model, but it is an embarrassingly parallel task and the
main shift is trading the unit of parallelism from enrichment task to
message to the gain of fewer network hops.  That being said, as long as
you're not emitting from a different thread than you are receiving from,
there's no technical limitation.

> Can we have multiple thread pools and group tasks together ( or separate
them ) wrt hbase?

We could, but I think we might consider starting with just a simple static
threadpool that we configure at the topology level (e.g. multiple worker
threads can share the same threadpool that we can configure).  I think as
the trend of moving everything to stellar continues, we may end up in a
situation where we don't have a coherent or clear way to differentiate
between thread pools like we do now.

> Also, how are we to measure the effect?

Well, some of the benefits here are at an architectural/feature level, the
most exciting of which is that this approach opens up avenues for stellar
subgroups to depend on each other.  Slightly less exciting, but still nice
is the fact that this normalizes us with *other* streaming technologies and
the decoupling work done as part of the PR (soon to be released) will make
it easy to transition if we so desire.  Beyond that, for performance,
someone will have to run some performance tests or try it out in a
situation where they're having some enrichment performance issues.  Until
we do that, I think we should probably just keep it as a parallel approach
that you can swap out if you so desire.

On Thu, Feb 22, 2018 at 11:48 AM, Otto Fowler 
wrote:

> This sounds worth exploring.  A couple of questions:
>
> * how does this affect the distribution of work through the cluster, and
> resiliency of the topologies?
> * Is anyone else doing it like this?
> * Can we have multiple thread pools and group tasks together ( or separate
> them ) wrt hbase?
>
>
>
> On February 22, 2018 at 11:32:39, Casey Stella (ceste...@gmail.com) wrote:
>
> Hi all,
>
> I've been thinking and working on something that I wanted to get some
> feedback on. The way that we do our enrichments, the split/join
> architecture was created to effectively parallelize enrichments in a
> storm-like way in contrast to OpenSoc.
>
> There are some good parts to this architecture:
>
> - It works, enrichments are done in parallel
> - You can tune individual enrichments differently
> - It's very storm-like
>
> There are also some deficiencies:
>
> - It's hard to reason about
> - Understanding the latency of enriching a message requires looking
> at multiple bolts that each give summary statistics
> - The join bolt's cache is really hard to reason about when performance
> tuning
> - During spikes in traffic, you can overload the join bolt's cache
> and drop messages if you aren't careful
> - In general, it's hard to associate a cache size and a duration kept
> in cache with throughput and latency
> - There are a lot of network hops per message
> - Right now we are stuck at 2 stages of transformations being done
> (enrichment and threat intel). It's very possible that you might want
> stellar enrichments to depend on the output of other stellar enrichments.
> In order to implement this in split/join you'd have to create a cycle in
> the storm topology
>
> I propose a change. I propose that we move to a model where we do
> enrichments in a single bolt in parallel using a static threadpool (e.g.
> multiple workers in the same process would share the threadpool). In all
> other ways, this would be backwards compatible: a transparent drop-in for
> the existing enrichment topology.
>
> There are some pros/cons about this too:
>
> - Pro
> - 

Re: [DISCUSS] Alternatives to split/join enrichment

2018-02-22 Thread Otto Fowler
Also, how are we to measure the effect?  Not to get all six sigma ;)


On February 22, 2018 at 11:48:41, Otto Fowler (ottobackwa...@gmail.com)
wrote:

This sounds worth exploring.  A couple of questions:

* how does this affect the distribution of work through the cluster, and
resiliency of the topologies?
* Is anyone else doing it like this?
* Can we have multiple thread pools and group tasks together ( or separate
them ) wrt hbase?



On February 22, 2018 at 11:32:39, Casey Stella (ceste...@gmail.com) wrote:

Hi all,

I've been thinking and working on something that I wanted to get some
feedback on. The way that we do our enrichments, the split/join
architecture was created to effectively parallelize enrichments in a
storm-like way in contrast to OpenSoc.

There are some good parts to this architecture:

- It works, enrichments are done in parallel
- You can tune individual enrichments differently
- It's very storm-like

There are also some deficiencies:

- It's hard to reason about
- Understanding the latency of enriching a message requires looking
at multiple bolts that each give summary statistics
- The join bolt's cache is really hard to reason about when performance
tuning
- During spikes in traffic, you can overload the join bolt's cache
and drop messages if you aren't careful
- In general, it's hard to associate a cache size and a duration kept
in cache with throughput and latency
- There are a lot of network hops per message
- Right now we are stuck at 2 stages of transformations being done
(enrichment and threat intel). It's very possible that you might want
stellar enrichments to depend on the output of other stellar enrichments.
In order to implement this in split/join you'd have to create a cycle in
the storm topology

I propose a change. I propose that we move to a model where we do
enrichments in a single bolt in parallel using a static threadpool (e.g.
multiple workers in the same process would share the threadpool). In all
other ways, this would be backwards compatible: a transparent drop-in for
the existing enrichment topology.

There are some pros/cons about this too:

- Pro
- Easier to reason about from an individual message perspective
- Architecturally decoupled from Storm
- This sets us up if we want to consider other streaming
technologies
- Fewer bolts
- spout -> enrichment bolt -> threatintel bolt -> output bolt
- Way fewer network hops per message
- currently 2n+1 where n is the number of enrichments used (if
using stellar subgroups, each subgroup is a hop)
- Easier to reason about from a performance perspective
- We trade cache size and eviction timeout for threadpool size
- We set ourselves up to have stellar subgroups with dependencies
- i.e. stellar subgroups that depend on the output of other
subgroups
- If we do this, we can shrink the topology to just spout ->
enrichment/threat intel -> output
- Con
- We can no longer tune stellar enrichments independent from HBase
enrichments
- To be fair, with enrichments moving to stellar, this is the case
in the split/join approach too
- No idea about performance


What I propose is to submit a PR that will deliver an alternative,
completely backwards compatible topology for enrichment that you can use by
adjusting the start_enrichment_topology.sh script to use
remote-unified.yaml instead of remote.yaml. If we live with it for a while
and have some good experiences with it, maybe we can consider retiring the
old enrichment topology.

Thoughts? Keep me honest; if I have over or understated the issues for
split/join or missed some important architectural issue let me know. I'm
going to submit a PR to this effect by the EOD today so things will be more
obvious.


Re: [DISCUSS] Alternatives to split/join enrichment

2018-02-22 Thread Otto Fowler
This sounds worth exploring.  A couple of questions:

* how does this affect the distribution of work through the cluster, and
resiliency of the topologies?
* Is anyone else doing it like this?
* Can we have multiple thread pools and group tasks together ( or separate
them ) wrt hbase?



On February 22, 2018 at 11:32:39, Casey Stella (ceste...@gmail.com) wrote:

Hi all,

I've been thinking and working on something that I wanted to get some
feedback on. The way that we do our enrichments, the split/join
architecture was created to effectively parallelize enrichments in a
storm-like way in contrast to OpenSoc.

There are some good parts to this architecture:

- It works, enrichments are done in parallel
- You can tune individual enrichments differently
- It's very storm-like

There are also some deficiencies:

- It's hard to reason about
- Understanding the latency of enriching a message requires looking
at multiple bolts that each give summary statistics
- The join bolt's cache is really hard to reason about when performance
tuning
- During spikes in traffic, you can overload the join bolt's cache
and drop messages if you aren't careful
- In general, it's hard to associate a cache size and a duration kept
in cache with throughput and latency
- There are a lot of network hops per message
- Right now we are stuck at 2 stages of transformations being done
(enrichment and threat intel). It's very possible that you might want
stellar enrichments to depend on the output of other stellar enrichments.
In order to implement this in split/join you'd have to create a cycle in
the storm topology

I propose a change. I propose that we move to a model where we do
enrichments in a single bolt in parallel using a static threadpool (e.g.
multiple workers in the same process would share the threadpool). In all
other ways, this would be backwards compatible: a transparent drop-in for
the existing enrichment topology.

There are some pros/cons about this too:

- Pro
- Easier to reason about from an individual message perspective
- Architecturally decoupled from Storm
- This sets us up if we want to consider other streaming
technologies
- Fewer bolts
- spout -> enrichment bolt -> threatintel bolt -> output bolt
- Way fewer network hops per message
- currently 2n+1 where n is the number of enrichments used (if
using stellar subgroups, each subgroup is a hop)
- Easier to reason about from a performance perspective
- We trade cache size and eviction timeout for threadpool size
- We set ourselves up to have stellar subgroups with dependencies
- i.e. stellar subgroups that depend on the output of other
subgroups
- If we do this, we can shrink the topology to just spout ->
enrichment/threat intel -> output
- Con
- We can no longer tune stellar enrichments independent from HBase
enrichments
- To be fair, with enrichments moving to stellar, this is the case
in the split/join approach too
- No idea about performance


What I propose is to submit a PR that will deliver an alternative,
completely backwards compatible topology for enrichment that you can use by
adjusting the start_enrichment_topology.sh script to use
remote-unified.yaml instead of remote.yaml. If we live with it for a while
and have some good experiences with it, maybe we can consider retiring the
old enrichment topology.

Thoughts? Keep me honest; if I have over or understated the issues for
split/join or missed some important architectural issue let me know. I'm
going to submit a PR to this effect by the EOD today so things will be more
obvious.


[DISCUSS] Alternatives to split/join enrichment

2018-02-22 Thread Casey Stella
Hi all,

I've been thinking and working on something that I wanted to get some
feedback on.  The way that we do our enrichments, the split/join
architecture was created to effectively parallelize enrichments in a
storm-like way in contrast to OpenSoc.

There are some good parts to this architecture:

   - It works, enrichments are done in parallel
   - You can tune individual enrichments differently
   - It's very storm-like

There are also some deficiencies:

   - It's hard to reason about
  - Understanding the latency of enriching a message requires looking
  at multiple bolts that each give summary statistics
   - The join bolt's cache is really hard to reason about when performance
   tuning
  - During spikes in traffic, you can overload the join bolt's cache
  and drop messages if you aren't careful
  - In general, it's hard to associate a cache size and a duration kept
  in cache with throughput and latency
   - There are a lot of network hops per message
   - Right now we are stuck at 2 stages of transformations being done
   (enrichment and threat intel).  It's very possible that you might want
   stellar enrichments to depend on the output of other stellar enrichments.
   In order to implement this in split/join you'd have to create a cycle in
   the storm topology

I propose a change.  I propose that we move to a model where we do
enrichments in a single bolt in parallel using a static threadpool (e.g.
multiple workers in the same process would share the threadpool).  In all
other ways, this would be backwards compatible: a transparent drop-in for
the existing enrichment topology.

There are some pros/cons about this too:

   - Pro
  - Easier to reason about from an individual message perspective
  - Architecturally decoupled from Storm
 - This sets us up if we want to consider other streaming
 technologies
  - Fewer bolts
 - spout -> enrichment bolt -> threatintel bolt -> output bolt
  - Way fewer network hops per message
 - currently 2n+1 where n is the number of enrichments used (if
 using stellar subgroups, each subgroup is a hop)
  - Easier to reason about from a performance perspective
 - We trade cache size and eviction timeout for threadpool size
  - We set ourselves up to have stellar subgroups with dependencies
 - i.e. stellar subgroups that depend on the output of other
 subgroups
 - If we do this, we can shrink the topology to just spout ->
 enrichment/threat intel -> output
  - Con
  - We can no longer tune stellar enrichments independent from HBase
  enrichments
 - To be fair, with enrichments moving to stellar, this is the case
 in the split/join approach too
  - No idea about performance


What I propose is to submit a PR that will deliver an alternative,
completely backwards compatible topology for enrichment that you can use by
adjusting the start_enrichment_topology.sh script to use
remote-unified.yaml instead of remote.yaml.  If we live with it for a while
and have some good experiences with it, maybe we can consider retiring the
old enrichment topology.

Thoughts?  Keep me honest; if I have over or understated the issues for
split/join or missed some important architectural issue let me know.  I'm
going to submit a PR to this effect by the EOD today so things will be more
obvious.


[GitHub] metron issue #924: METRON-1299 In MetronError tests, don't test for HostName...

2018-02-22 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/924
  
+1 pending @cestella's approval.  Thanks @ottobackwards.


---