[jira] [Commented] (METRON-1349) Full Dev Builds Metron Twice

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290249#comment-16290249
 ] 

ASF GitHub Bot commented on METRON-1349:


Github user ottobackwards commented on the issue:

https://github.com/apache/metron/pull/866
  
Do you mean I should see two cowsay building metrons before and only one 
after?


> Full Dev Builds Metron Twice
> 
>
> Key: METRON-1349
> URL: https://issues.apache.org/jira/browse/METRON-1349
> Project: Metron
>  Issue Type: Bug
>Reporter: Nick Allen
>Assignee: Nick Allen
>
> When deploying Metron in Full Dev, the "Build Metron" step gets run twice.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1352) Integration and e2e test infrastructure

2017-12-13 Thread Otto Fowler (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290240#comment-16290240
 ] 

Otto Fowler commented on METRON-1352:
-

Do we need multiple compose configurations: a "base" one, then one with REST, 
one with the UIs, etc.?  Or some other sane set of compositions that lets you 
start different combinations of services?  Or is there a more "docker" way to 
do it?


> Integration and e2e test infrastructure
> ---
>
> Key: METRON-1352
> URL: https://issues.apache.org/jira/browse/METRON-1352
> Project: Metron
>  Issue Type: New Feature
>Reporter: Ryan Merriman
>Assignee: Ryan Merriman
>
> This feature is based on the work done in 
> https://issues.apache.org/jira/browse/METRON-1344.  That feature branch 
> serves as the base branch for this Jira and includes:
> # Dockerfile for Metron REST
> # Dockerfile for Metron UIs
> # Docker Compose application including Metron images, Elasticsearch, Kafka, 
> Zookeeper
> # Modified travis file that manages the Docker environment and runs the e2e 
> tests as part of the build
> # Maven pom.xml that installs all the required assets into the Docker e2e 
> module
> # Modified metron-alerts pom.xml that allows e2e tests to be run through Maven
> # An example integration test that has been converted to use the new 
> infrastructure
> The initial requirements are as follows:
> # All e2e and integration tests run on common infrastructure.
> # All e2e and integration tests are run automatically in the Travis build.
> # All e2e and integration tests run repeatably and reliably in the Travis 
> build.
> # Debugging options are available and documented.
> # The new infra and how to interact with it is documented.
> # Old infrastructure removed (anything unused or commented out is deleted, 
> instead of staying).
> These requirements are being actively 
> [discussed|https://lists.apache.org/thread.html/2aaa5a4a66ebdbf41323ea5ad7b059e5acd0a315d57ff871e0a7817e@%3Cdev.metron.apache.org%3E]
>  on dev list and are subject to change. 





[jira] [Commented] (METRON-1302) Split up Indexing Topology into batch and random access sections

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290238#comment-16290238
 ] 

ASF GitHub Bot commented on METRON-1302:


Github user ottobackwards commented on the issue:

https://github.com/apache/metron/pull/831
  
The batch vs. HDFS stuff still confuses me.  I thought we decided on a 
different name?


> Split up Indexing Topology into batch and random access sections
> 
>
> Key: METRON-1302
> URL: https://issues.apache.org/jira/browse/METRON-1302
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>
> Currently we have the indexing topology handle writing to both random access 
> indices (e.g. elasticsearch) as well as batch write indices (e.g. hdfs).  We 
> should split these up and configure them separately.





[jira] [Commented] (METRON-1302) Split up Indexing Topology into batch and random access sections

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290232#comment-16290232
 ] 

ASF GitHub Bot commented on METRON-1302:


Github user cestella commented on the issue:

https://github.com/apache/metron/pull/831
  
I think I managed to address the issues here.  Is there anything else 
outstanding that I missed?  If not, then bump.


> Split up Indexing Topology into batch and random access sections
> 
>
> Key: METRON-1302
> URL: https://issues.apache.org/jira/browse/METRON-1302
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>
> Currently we have the indexing topology handle writing to both random access 
> indices (e.g. elasticsearch) as well as batch write indices (e.g. hdfs).  We 
> should split these up and configure them separately.





[jira] [Created] (METRON-1359) Create guide for managing and interacting with new test infrastructure

2017-12-13 Thread Ryan Merriman (JIRA)
Ryan Merriman created METRON-1359:
-

 Summary: Create guide for managing and interacting with new test 
infrastructure
 Key: METRON-1359
 URL: https://issues.apache.org/jira/browse/METRON-1359
 Project: Metron
  Issue Type: Sub-task
Reporter: Ryan Merriman


A document should be created that clearly explains:
# How to manage the lifecycle of the infrastructure
# How to extend, add to, or modify the infrastructure
# How to learn more about the underlying technologies
# How to interact with the infrastructure at various levels





[jira] [Created] (METRON-1360) Clean up any old infrastructure

2017-12-13 Thread Ryan Merriman (JIRA)
Ryan Merriman created METRON-1360:
-

 Summary: Clean up any old infrastructure
 Key: METRON-1360
 URL: https://issues.apache.org/jira/browse/METRON-1360
 Project: Metron
  Issue Type: Sub-task
Reporter: Ryan Merriman








[jira] [Created] (METRON-1358) Create Debugging guide based on the new test infrastructure

2017-12-13 Thread Ryan Merriman (JIRA)
Ryan Merriman created METRON-1358:
-

 Summary: Create Debugging guide based on the new test 
infrastructure
 Key: METRON-1358
 URL: https://issues.apache.org/jira/browse/METRON-1358
 Project: Metron
  Issue Type: Sub-task
Reporter: Ryan Merriman


A document should be created that provides guidance on how to debug each Metron 
module.





[jira] [Created] (METRON-1357) Optimize travis file

2017-12-13 Thread Ryan Merriman (JIRA)
Ryan Merriman created METRON-1357:
-

 Summary: Optimize travis file
 Key: METRON-1357
 URL: https://issues.apache.org/jira/browse/METRON-1357
 Project: Metron
  Issue Type: Sub-task
Reporter: Ryan Merriman


After all tests have been converted, the Travis config file should be optimized 
for performance and stability. 





[jira] [Created] (METRON-1356) Add a mechanism in Java for discovering service host/ports

2017-12-13 Thread Ryan Merriman (JIRA)
Ryan Merriman created METRON-1356:
-

 Summary: Add a mechanism in Java for discovering service host/ports
 Key: METRON-1356
 URL: https://issues.apache.org/jira/browse/METRON-1356
 Project: Metron
  Issue Type: Sub-task
Reporter: Ryan Merriman


Integration tests will need to initialize clients with service urls.  These may 
change depending on where and how the infrastructure is run (Docker engine vs 
Docker for Mac).  It would be helpful to have a unified way of retrieving these 
across all integration tests.
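
As a rough illustration of what such a unified lookup could look like, here is a minimal sketch. It assumes that service locations are published through `<SERVICE>_HOST` and `<SERVICE>_PORT` environment variables; that convention, the class name `ServiceEndpoints`, and the method `hostPort` are hypothetical and are not taken from the Metron code base.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: resolves a service's host and port from environment-style settings. */
final class ServiceEndpoints {
  private final Map<String, String> env;

  ServiceEndpoints(Map<String, String> env) {
    this.env = env;
  }

  /** Convenience factory backed by the real process environment. */
  static ServiceEndpoints fromSystemEnv() {
    return new ServiceEndpoints(System.getenv());
  }

  /**
   * Returns "host:port" for the named service. Reads NAME_HOST and NAME_PORT
   * (assumed variable names) and falls back to the supplied defaults.
   */
  String hostPort(String service, String defaultHost, int defaultPort) {
    String prefix = service.toUpperCase(Locale.ROOT);
    String host = env.getOrDefault(prefix + "_HOST", defaultHost);
    String port = env.get(prefix + "_PORT");
    int resolvedPort = (port == null) ? defaultPort : Integer.parseInt(port.trim());
    return host + ":" + resolvedPort;
  }
}
```

A test would then ask for, say, `hostPort("kafka", "localhost", 9092)` and get the same answer regardless of whether the containers run on Docker engine or Docker for Mac, as long as the environment is set accordingly.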





[jira] [Assigned] (METRON-1353) Create HBase Docker service

2017-12-13 Thread Ryan Merriman (JIRA)

 [ 
https://issues.apache.org/jira/browse/METRON-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Merriman reassigned METRON-1353:
-

Assignee: (was: Ryan Merriman)

> Create HBase Docker service
> ---
>
> Key: METRON-1353
> URL: https://issues.apache.org/jira/browse/METRON-1353
> Project: Metron
>  Issue Type: Sub-task
>Reporter: Ryan Merriman
>
> This task includes adding an HBase service to the Docker compose file either 
> as an image from DockerHub or a Dockerfile.  Also includes any relevant 
> configuration changes.





[jira] [Created] (METRON-1355) Convert metron-elasticsearch to new infrastructure

2017-12-13 Thread Ryan Merriman (JIRA)
Ryan Merriman created METRON-1355:
-

 Summary: Convert metron-elasticsearch to new infrastructure
 Key: METRON-1355
 URL: https://issues.apache.org/jira/browse/METRON-1355
 Project: Metron
  Issue Type: Sub-task
Reporter: Ryan Merriman
Assignee: Ryan Merriman


Integration tests need to be converted to use the new infrastructure.  This 
includes:
# Updating clients with new infrastructure urls
# Adding a namespace to any assets the tests depend on
# Cleaning up interactions with old in memory infrastructure





[jira] [Created] (METRON-1354) Add all e2e tests back to protractor

2017-12-13 Thread Ryan Merriman (JIRA)
Ryan Merriman created METRON-1354:
-

 Summary: Add all e2e tests back to protractor
 Key: METRON-1354
 URL: https://issues.apache.org/jira/browse/METRON-1354
 Project: Metron
  Issue Type: Sub-task
Reporter: Ryan Merriman
Assignee: Ryan Merriman


In the base feature branch, all but one reliable e2e test have been temporarily 
removed until they have all been stabilized.  Once that work has been 
completed, they will need to be merged into this feature branch and updated to 
work with the new test infrastructure.





[jira] [Created] (METRON-1353) Create HBase Docker service

2017-12-13 Thread Ryan Merriman (JIRA)
Ryan Merriman created METRON-1353:
-

 Summary: Create HBase Docker service
 Key: METRON-1353
 URL: https://issues.apache.org/jira/browse/METRON-1353
 Project: Metron
  Issue Type: Sub-task
Reporter: Ryan Merriman
Assignee: Ryan Merriman


This task includes adding an HBase service to the Docker compose file either as 
an image from DockerHub or a Dockerfile.  Also includes any relevant 
configuration changes.





[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290018#comment-16290018
 ] 

ASF GitHub Bot commented on METRON-1350:


Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/867#discussion_r156800428
  
--- Diff: metron-analytics/metron-statistics/README.md ---
@@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is 
used.
   * bounds - A list of value bounds (excluding min and max) in sorted 
order.
 * Returns: Which bin N the value falls in such that bound(N-1) < value <= 
bound(N).  No min and max bounds are provided, so values smaller than the 0'th 
bound go in the 0'th bin, and values greater than the last bound go in the M'th 
bin.
 
+### Sampling Functions
+
+ `SAMPLE_ADD`
+* Description: Add a value or collection of values to a sampler.
+* Input:
--- End diff --

Actually, more than likely it'd be a separate init, since each type is 
going to have different parameters depending on the algorithm.  So a 
biased sampler would be `SAMPLE_INIT_BIASED(size, ...)`.


> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable 
> algorithms that do not necessarily support our existing probabilistic 
> sketches.  We should add a reservoir sampler and utilities to merge and 
> resample. 





[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290013#comment-16290013
 ] 

ASF GitHub Bot commented on METRON-1350:


Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/867#discussion_r156799854
  
--- Diff: metron-analytics/metron-statistics/src/main/java/org/apache/metron/statistics/sampling/UniformSampler.java ---
@@ -0,0 +1,91 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.statistics.sampling;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+
+public class UniformSampler implements Sampler {
+  private List reservoir;
+  private int seen = 0;
+  private int size;
+  private Random rng = new Random(0);
+
+  public UniformSampler() {
+    this(DEFAULT_SIZE);
+  }
+
+  public UniformSampler(int size) {
+    this.size = size;
+    reservoir = new ArrayList<>(size);
+  }
+
+  @Override
+  public Iterable get() {
+    return reservoir;
+  }
+
+  /**
+   * Add an object to the reservoir
+   * @param o
+   */
+  public void add(Object o) {
+    if(o == null) {
+      return;
+    }
+    if (reservoir.size() < size) {
+      reservoir.add(o);
+    } else {
+      int rIndex = rng.nextInt(seen + 1);
--- End diff --

Shouldn't we reference Reservoir Sampling in the documentation?  Then the 
use of Uniform and other terms would be more in context.
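
For readers without the background, here is a minimal sketch of uniform reservoir sampling (commonly called Algorithm R). It is not the code from the pull request, whose `add` method is cut off in the quoted diff; the class name `ReservoirSketch`, its accessors, and the seeded constructor are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative uniform reservoir sampler (Algorithm R); not the PR's UniformSampler. */
final class ReservoirSketch<T> {
  private final List<T> reservoir;
  private final int capacity;
  private final Random rng;
  private int seen = 0; // number of non-null items offered so far

  ReservoirSketch(int capacity, long seed) {
    if (capacity <= 0) {
      throw new IllegalArgumentException("capacity must be positive");
    }
    this.capacity = capacity;
    this.reservoir = new ArrayList<>(capacity);
    this.rng = new Random(seed);
  }

  /** Offers one item; nulls are ignored, as in the quoted diff. */
  void add(T item) {
    if (item == null) {
      return;
    }
    if (reservoir.size() < capacity) {
      // Fill phase: keep everything until the reservoir is full.
      reservoir.add(item);
    } else {
      // Pick a slot uniformly from [0, seen]. The new item is kept only if the
      // slot falls inside the reservoir, i.e. with probability capacity / (seen + 1).
      int slot = rng.nextInt(seen + 1);
      if (slot < capacity) {
        reservoir.set(slot, item);
      }
    }
    seen++;
  }

  /** Read-only view of the current sample. */
  List<T> sample() {
    return Collections.unmodifiableList(reservoir);
  }

  int seen() {
    return seen;
  }
}
```

The point of the replacement rule is that, after any number of items, each item seen so far has the same chance of being in the reservoir, which is where the "uniform" in the class name comes from.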


> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable 
> algorithms that do not necessarily support our existing probabilistic 
> sketches.  We should add a reservoir sampler and utilities to merge and 
> resample. 





[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290014#comment-16290014
 ] 

ASF GitHub Bot commented on METRON-1350:


Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/867#discussion_r156799950
  
--- Diff: metron-analytics/metron-statistics/src/main/java/org/apache/metron/statistics/sampling/UniformSampler.java ---
@@ -0,0 +1,91 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.statistics.sampling;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+
+public class UniformSampler implements Sampler {
+  private List reservoir;
+  private int seen = 0;
+  private int size;
+  private Random rng = new Random(0);
+
+  public UniformSampler() {
+    this(DEFAULT_SIZE);
+  }
+
+  public UniformSampler(int size) {
+    this.size = size;
+    reservoir = new ArrayList<>(size);
+  }
+
+  @Override
+  public Iterable get() {
+    return reservoir;
+  }
+
+  /**
+   * Add an object to the reservoir
+   * @param o
+   */
+  public void add(Object o) {
+    if(o == null) {
+      return;
+    }
+    if (reservoir.size() < size) {
+      reservoir.add(o);
+    } else {
+      int rIndex = rng.nextInt(seen + 1);
--- End diff --

This makes me think that we need "namespace" scoped documentation


> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable 
> algorithms that do not necessarily support our existing probabilistic 
> sketches.  We should add a reservoir sampler and utilities to merge and 
> resample. 





[jira] [Updated] (METRON-1352) Integration and e2e test infrastructure

2017-12-13 Thread Ryan Merriman (JIRA)

 [ 
https://issues.apache.org/jira/browse/METRON-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Merriman updated METRON-1352:
--
Description: 
This feature is based on the work done in 
https://issues.apache.org/jira/browse/METRON-1344.  That feature branch serves 
as the base branch for this Jira and includes:

# Dockerfile for Metron REST
# Dockerfile for Metron UIs
# Docker Compose application including Metron images, Elasticsearch, Kafka, 
Zookeeper
# Modified travis file that manages the Docker environment and runs the e2e 
tests as part of the build
# Maven pom.xml that installs all the required assets into the Docker e2e module
# Modified metron-alerts pom.xml that allows e2e tests to be run through Maven
# An example integration test that has been converted to use the new 
infrastructure

The initial requirements are as follows:

# All e2e and integration tests run on common infrastructure.
# All e2e and integration tests are run automatically in the Travis build.
# All e2e and integration tests run repeatably and reliably in the Travis build.
# Debugging options are available and documented.
# The new infra and how to interact with it is documented.
# Old infrastructure removed (anything unused or commented out is deleted, 
instead of staying).

These requirements are being actively 
[discussed|https://lists.apache.org/thread.html/2aaa5a4a66ebdbf41323ea5ad7b059e5acd0a315d57ff871e0a7817e@%3Cdev.metron.apache.org%3E]
 on dev list and are subject to change. 

  was:
This feature is based on the work done in 
https://issues.apache.org/jira/browse/METRON-1344.  The initial requirements 
are as follows:

# All e2e and integration tests run on common infrastructure.
# All e2e and integration tests are run automatically in the Travis build.
# All e2e and integration tests run repeatably and reliably in the Travis build.
# Debugging options are available and documented.
# The new infra and how to interact with it is documented.
# Old infrastructure removed (anything unused or commented out is deleted, 
instead of staying).

These requirements are being actively 
[discussed|https://lists.apache.org/thread.html/2aaa5a4a66ebdbf41323ea5ad7b059e5acd0a315d57ff871e0a7817e@%3Cdev.metron.apache.org%3E]
 on dev list and are subject to change. 


> Integration and e2e test infrastructure
> ---
>
> Key: METRON-1352
> URL: https://issues.apache.org/jira/browse/METRON-1352
> Project: Metron
>  Issue Type: New Feature
>Reporter: Ryan Merriman
>Assignee: Ryan Merriman
>
> This feature is based on the work done in 
> https://issues.apache.org/jira/browse/METRON-1344.  That feature branch 
> serves as the base branch for this Jira and includes:
> # Dockerfile for Metron REST
> # Dockerfile for Metron UIs
> # Docker Compose application including Metron images, Elasticsearch, Kafka, 
> Zookeeper
> # Modified travis file that manages the Docker environment and runs the e2e 
> tests as part of the build
> # Maven pom.xml that installs all the required assets into the Docker e2e 
> module
> # Modified metron-alerts pom.xml that allows e2e tests to be run through Maven
> # An example integration test that has been converted to use the new 
> infrastructure
> The initial requirements are as follows:
> # All e2e and integration tests run on common infrastructure.
> # All e2e and integration tests are run automatically in the Travis build.
> # All e2e and integration tests run repeatably and reliably in the Travis 
> build.
> # Debugging options are available and documented.
> # The new infra and how to interact with it is documented.
> # Old infrastructure removed (anything unused or commented out is deleted, 
> instead of staying).
> These requirements are being actively 
> [discussed|https://lists.apache.org/thread.html/2aaa5a4a66ebdbf41323ea5ad7b059e5acd0a315d57ff871e0a7817e@%3Cdev.metron.apache.org%3E]
>  on dev list and are subject to change. 





[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290009#comment-16290009
 ] 

ASF GitHub Bot commented on METRON-1350:


Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/867#discussion_r156799548
  
--- Diff: metron-analytics/metron-statistics/README.md ---
@@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is 
used.
   * bounds - A list of value bounds (excluding min and max) in sorted 
order.
 * Returns: Which bin N the value falls in such that bound(N-1) < value <= 
bound(N).  No min and max bounds are provided, so values smaller than the 0'th 
bound go in the 0'th bin, and values greater than the last bound go in the M'th 
bin.
 
+### Sampling Functions
+
+ `SAMPLE_ADD`
+* Description: Add a value or collection of values to a sampler.
+* Input:
--- End diff --

Then we'll have a "get sample types" method, like we do with other things 
like this, right?


> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable 
> algorithms that do not necessarily support our existing probabilistic 
> sketches.  We should add a reservoir sampler and utilities to merge and 
> resample. 





[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289990#comment-16289990
 ] 

ASF GitHub Bot commented on METRON-1350:


Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/867#discussion_r156796690
  
--- Diff: metron-analytics/metron-statistics/README.md ---
@@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is 
used.
   * bounds - A list of value bounds (excluding min and max) in sorted 
order.
 * Returns: Which bin N the value falls in such that bound(N-1) < value <= 
bound(N).  No min and max bounds are provided, so values smaller than the 0'th 
bound go in the 0'th bin, and values greater than the last bound go in the M'th 
bin.
 
+### Sampling Functions
+
+ `SAMPLE_ADD`
+* Description: Add a value or collection of values to a sampler.
+* Input:
--- End diff --

They're both needed.  Some use-cases would be fine without bias and some 
would be better with bias.  As a follow-on, I was planning on adding a biased 
sampler, but this is a big enough PR without it.  It'd look something like:
```
samples := SAMPLE_MERGE(PROFILE_GET('samples', ...))
biased_sample := SAMPLE_GET_BIASED(samples, 0.015)
```
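
To make the merge step concrete, here is a hedged Java sketch of one way two uniform reservoir samples can be combined into a single uniform sample. It is not taken from the PR. The class `ReservoirMerge` and its `merge` signature are hypothetical, and the sketch assumes each input sample holds min(capacity, seen) items and that the caller knows how many stream items each sample was drawn from.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Illustrative merge of two uniform samples; names and signature are assumptions. */
final class ReservoirMerge {
  /**
   * seenA / seenB are the number of stream items each sample was drawn from.
   * Each output slot is taken from A with probability seenA / (seenA + seenB),
   * then the chosen side's count is decremented, so the sides are weighted by
   * how much of the combined stream they still represent.
   */
  static <T> List<T> merge(List<T> a, long seenA, List<T> b, long seenB,
                           int capacity, Random rng) {
    List<T> restA = new ArrayList<>(a);
    List<T> restB = new ArrayList<>(b);
    List<T> merged = new ArrayList<>(capacity);
    while (merged.size() < capacity && !(restA.isEmpty() && restB.isEmpty())) {
      boolean fromA;
      if (restA.isEmpty()) {
        fromA = false; // fallback: only B has items left
      } else if (restB.isEmpty()) {
        fromA = true;  // fallback: only A has items left
      } else {
        fromA = rng.nextDouble() * (seenA + seenB) < seenA;
      }
      List<T> source = fromA ? restA : restB;
      // Remove a random remaining element so no item is picked twice.
      merged.add(source.remove(rng.nextInt(source.size())));
      if (fromA) {
        seenA--;
      } else {
        seenB--;
      }
    }
    return merged;
  }
}
```

Weighting by the seen counts rather than by the sample sizes matters when the two profile periods saw very different volumes of data; otherwise the quieter period would be over-represented in the merged sample.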


> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable 
> algorithms that do not necessarily support our existing probabilistic 
> sketches.  We should add a reservoir sampler and utilities to merge and 
> resample. 





[jira] [Created] (METRON-1352) Integration and e2e test infrastructure

2017-12-13 Thread Ryan Merriman (JIRA)
Ryan Merriman created METRON-1352:
-

 Summary: Integration and e2e test infrastructure
 Key: METRON-1352
 URL: https://issues.apache.org/jira/browse/METRON-1352
 Project: Metron
  Issue Type: New Feature
Reporter: Ryan Merriman
Assignee: Ryan Merriman


This feature is based on the work done in 
https://issues.apache.org/jira/browse/METRON-1344.  The initial requirements 
are as follows:

# All e2e and integration tests run on common infrastructure.
# All e2e and integration tests are run automatically in the Travis build.
# All e2e and integration tests run repeatably and reliably in the Travis build.
# Debugging options are available and documented.
# The new infra and how to interact with it is documented.
# Old infrastructure removed (anything unused or commented out is deleted, 
instead of staying).

These requirements are being actively 
[discussed|https://lists.apache.org/thread.html/2aaa5a4a66ebdbf41323ea5ad7b059e5acd0a315d57ff871e0a7817e@%3Cdev.metron.apache.org%3E]
 on dev list and are subject to change. 





[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289978#comment-16289978
 ] 

ASF GitHub Bot commented on METRON-1350:


Github user simonellistonball commented on a diff in the pull request:

https://github.com/apache/metron/pull/867#discussion_r156794990
  
--- Diff: metron-analytics/metron-statistics/README.md ---
@@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is 
used.
   * bounds - A list of value bounds (excluding min and max) in sorted 
order.
 * Returns: Which bin N the value falls in such that bound(N-1) < value <= 
bound(N).  No min and max bounds are provided, so values smaller than the 0'th 
bound go in the 0'th bin, and values greater than the last bound go in the M'th 
bin.
 
+### Sampling Functions
+
+ `SAMPLE_ADD`
+* Description: Add a value or collection of values to a sampler.
+* Input:
--- End diff --

Recency would surely be more relevant for merged resampling in a profile 
context? 


> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable 
> algorithms that do not necessarily support our existing probabilistic 
> sketches.  We should add a reservoir sampler and utilities to merge and 
> resample. 





[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289975#comment-16289975
 ] 

ASF GitHub Bot commented on METRON-1350:


Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/867#discussion_r156794655
  
--- Diff: metron-analytics/metron-statistics/README.md ---
@@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is 
used.
   * bounds - A list of value bounds (excluding min and max) in sorted 
order.
 * Returns: Which bin N the value falls in such that bound(N-1) < value <= 
bound(N).  No min and max bounds are provided, so values smaller than the 0'th 
bound go in the 0'th bin, and values greater than the last bound go in the M'th 
bin.
 
+### Sampling Functions
+
+ `SAMPLE_ADD`
+* Description: Add a value or collection of values to a sampler.
+* Input:
--- End diff --

There are definitely other types of reservoir samplers which we will 
probably want.  Most specifically a sampler that is biased toward recency (so 
non-uniform in that case).
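
As an illustration only, one known way to bias a reservoir toward recent items is an exponential-bias scheme: every new item is always stored, and an old item is evicted with probability equal to the current fill fraction. The sketch below assumes a bias rate `lambda` with a capacity of roughly 1/lambda; the class name `RecencyBiasedReservoir` is hypothetical, and nothing here is claimed to be what the follow-on work would implement.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative recency-biased reservoir; class name and parameters are assumptions. */
final class RecencyBiasedReservoir<T> {
  private final List<T> reservoir;
  private final int capacity;
  private final Random rng;

  RecencyBiasedReservoir(double lambda, long seed) {
    if (lambda <= 0.0 || lambda > 1.0) {
      throw new IllegalArgumentException("lambda must be in (0, 1]");
    }
    // A larger lambda means a smaller reservoir and a stronger bias to recent items.
    this.capacity = (int) Math.ceil(1.0 / lambda);
    this.reservoir = new ArrayList<>(capacity);
    this.rng = new Random(seed);
  }

  /** The new item is always stored; an old one is evicted with probability size / capacity. */
  void add(T item) {
    if (item == null) {
      return;
    }
    double fill = (double) reservoir.size() / capacity;
    if (rng.nextDouble() < fill) {
      // Replace a random existing item. Once full, fill is 1.0, so this branch always runs.
      reservoir.set(rng.nextInt(reservoir.size()), item);
    } else {
      reservoir.add(item);
    }
  }

  List<T> sample() {
    return Collections.unmodifiableList(reservoir);
  }

  int capacity() {
    return capacity;
  }
}
```

Because older items face eviction on every later arrival, their chance of surviving decays with age, which is the non-uniform behaviour described above.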


> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable 
> algorithms that do not necessarily support our existing probabilistic 
> sketches.  We should add a reservoir sampler and utilities to merge and 
> resample. 





[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289955#comment-16289955
 ] 

ASF GitHub Bot commented on METRON-1350:


Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/867#discussion_r156792855
  
--- Diff: metron-analytics/metron-statistics/README.md ---
@@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is 
used.
   * bounds - A list of value bounds (excluding min and max) in sorted 
order.
 * Returns: Which bin N the value falls in such that bound(N-1) < value <= 
bound(N).  No min and max bounds are provided, so values smaller than the 0'th 
bound go in the 0'th bin, and values greater than the last bound go in the M'th 
bin.
 
+### Sampling Functions
+
+ `SAMPLE_ADD`
+* Description: Add a value or collection of values to a sampler.
+* Input:
--- End diff --

OK, it seemed like the Uniform implementation was leaking.


> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable 
> algorithms that do not necessarily support our existing probabilistic 
> sketches.  We should add a reservoir sampler and utilities to merge and 
> resample. 





[jira] [Updated] (METRON-1351) Create Installable Packages for Ubuntu Trusty

2017-12-13 Thread Nick Allen (JIRA)

 [ 
https://issues.apache.org/jira/browse/METRON-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Allen updated METRON-1351:
---
Summary: Create Installable Packages for Ubuntu Trusty  (was: Create Debian 
Packages for Ubuntu Trusty)

> Create Installable Packages for Ubuntu Trusty
> -
>
> Key: METRON-1351
> URL: https://issues.apache.org/jira/browse/METRON-1351
> Project: Metron
>  Issue Type: Improvement
>Affects Versions: 0.4.1
>Reporter: Nick Allen
>Assignee: Nick Allen
>






[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289953#comment-16289953
 ] 

ASF GitHub Bot commented on METRON-1350:


Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/867#discussion_r156792461
  
--- Diff: metron-analytics/metron-statistics/README.md ---
@@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is 
used.
   * bounds - A list of value bounds (excluding min and max) in sorted 
order.
 * Returns: Which bin N the value falls in such that bound(N-1) < value <= 
bound(N).  No min and max bounds are provided, so values smaller than the 0'th 
bound go in the 0'th bin, and values greater than the last bound go in the M'th 
bin.
 
+### Sampling Functions
+
+ `SAMPLE_ADD`
+* Description: Add a value or collection of values to a sampler.
+* Input:
--- End diff --

Couldn't this be simplified to:

```java
if (ret == null) {
  if (obj != null) {
    // Conversion failed for a non-null argument: report what was passed.
    throw new IllegalStateException(argName + " argument (" + obj
        + ") is expected to be an " + expectedClazz.getName()
        + ", but was " + obj);
  }
}
return Optional.ofNullable(ret);
```


> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable 
> algorithms that do not necessarily support our existing probabilistic 
> sketches.  We should add a reservoir sampler and utilities to merge and 
> resample. 





[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289951#comment-16289951
 ] 

ASF GitHub Bot commented on METRON-1350:


Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/867#discussion_r156792227
  
--- Diff: metron-analytics/metron-statistics/README.md ---
@@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is 
used.
   * bounds - A list of value bounds (excluding min and max) in sorted 
order.
 * Returns: Which bin N the value falls in such that bound(N-1) < value <= 
bound(N).  No min and max bounds are provided, so values smaller than the 0'th 
bound go in the 0'th bin, and values greater than the last bound go in the M'th 
bin.
 
+### Sampling Functions
+
+ `SAMPLE_ADD`
+* Description: Add a value or collection of values to a sampler.
+* Input:
--- End diff --

Sorry, `uniform` here is intended to mean that each element has an equal 
probability of being in the sample (e.g. the probability is pulled from a 
[uniform probability 
distribution](https://en.wikipedia.org/wiki/Uniform_distribution_(continuous))).
  I can probably do a better job documenting this. 
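
As a short aside on what "equal probability" means here: the derivation below is a sketch that assumes the sampler follows the textbook reservoir algorithm (Algorithm R) with reservoir size `k` and `n >= k` non-null items seen. The symbols `k`, `n`, `i`, and `m` are introduced only for this illustration and do not come from the PR.

```latex
% Assumption: Algorithm R, reservoir size k, n >= k items seen so far.
% An item i <= k starts in the reservoir. At step m > k it is evicted with
% probability (k/m)(1/k) = 1/m, so it survives that step with probability (m-1)/m.
\[
P(\text{item } i \text{ kept},\ i \le k)
  = \prod_{m=k+1}^{n} \frac{m-1}{m}
  = \frac{k}{n}
\]
% An item i > k enters with probability k/i and must then survive steps i+1..n.
\[
P(\text{item } i \text{ kept},\ i > k)
  = \frac{k}{i} \prod_{m=i+1}^{n} \frac{m-1}{m}
  = \frac{k}{i} \cdot \frac{i}{n}
  = \frac{k}{n}
\]
```

Under those assumptions every item ends up in the sample with the same probability `k/n`, regardless of when it arrived.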


> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable 
> algorithms that do not necessarily support our existing probabilistic 
> sketches.  We should add a reservoir sampler and utilities to merge and 
> resample. 





[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289946#comment-16289946
 ] 

ASF GitHub Bot commented on METRON-1350:


Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/867#discussion_r156788055
  
--- Diff: metron-analytics/metron-statistics/README.md ---
@@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is 
used.
   * bounds - A list of value bounds (excluding min and max) in sorted 
order.
 * Returns: Which bin N the value falls in such that bound(N-1) < value <= 
bound(N).  No min and max bounds are provided, so values smaller than the 0'th 
bound go in the 0'th bin, and values greater than the last bound go in the M'th 
bin.
 
+### Sampling Functions
+
+ `SAMPLE_ADD`
+* Description: Add a value or collection of values to a sampler.
+* Input:
--- End diff --

This makes it seem like the Uniform sampler is a 'known' thing.  But it is 
not, either by explanation or by reference to where it is explained (as we have 
done when referring to algorithms before).
Is there another type of sampler?

Somewhere ( I'm not sure where ) we should say:
"A sampler is a x that is | does | acts as  x for the sample 
functions. The default has these properties, but you can override that in init"

Why even mention the Uniform?





> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable 
> algorithms that do not necessarily support our existing probabilistic 
> sketches.  We should add a reservoir sampler and utilities to merge and 
> resample. 





[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289945#comment-16289945
 ] 

ASF GitHub Bot commented on METRON-1350:


Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/867#discussion_r156790508
  
--- Diff: 
metron-analytics/metron-statistics/src/main/java/org/apache/metron/statistics/sampling/SamplingInitFunctions.java
 ---
@@ -0,0 +1,89 @@
+/*
+ *
+ *  Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 
implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ *
+ */
+package org.apache.metron.statistics.sampling;
+
+import org.apache.metron.stellar.common.utils.ConversionUtils;
+import org.apache.metron.stellar.dsl.Context;
+import org.apache.metron.stellar.dsl.ParseException;
+import org.apache.metron.stellar.dsl.Stellar;
+import org.apache.metron.stellar.dsl.StellarFunction;
+
+import java.util.List;
+import java.util.Optional;
+import java.util.function.Supplier;
+
+public class SamplingInitFunctions {
+
+  @Stellar(namespace="SAMPLE"
+  ,name="INIT"
+  ,description="Create a uniform reservoir sampler of a specific 
size or, if unspecified, size " + Sampler.DEFAULT_SIZE
+  ,params = {
+"size? - The size of the reservoir sampler.  If unspecified, 
the size is " + Sampler.DEFAULT_SIZE
+  }
+  ,returns="The sampler object."
+  )
+
+  public static class UniformSamplerInit implements StellarFunction {
+@Override
+public Object apply(List<Object> args, Context context) throws 
ParseException {
+  if(args.size() == 0) {
+return new UniformSampler();
+  }
+  else {
+Optional<Integer> sizeArg = get(args, 0, "Size", Integer.class);
+if(sizeArg.isPresent() && sizeArg.get() <= 0) {
+  throw new IllegalStateException("Size must be a positive 
integer");
+}
+else {
+  return new UniformSampler(sizeArg.orElse(Sampler.DEFAULT_SIZE));
+}
+  }
+}
+
+@Override
+public void initialize(Context context) {
+}
+
+@Override
+public boolean isInitialized() {
+  return true;
+}
+  }
+
+
+  public static <T> Optional<T> get(List<Object> args, int offset, String 
argName, Class<T> expectedClazz) {
+Object obj = args.get(offset);
+T ret = ConversionUtils.convert(obj, expectedClazz);
--- End diff --

Couldn't this be simplified to:

```java
if (ret == null) {
  if (obj != null) {
    // Conversion failed for a non-null argument: report what was passed.
    throw new IllegalStateException(argName + " argument (" + obj
        + ") is expected to be an " + expectedClazz.getName()
        + ", but was " + obj);
  }
}
return Optional.ofNullable(ret);
```
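
For anyone who wants to try the suggestion in isolation, here is a minimal, self-contained sketch of the helper. It is not the PR's code: `ArgHelperSketch` and its `convert` method are hypothetical stand-ins (the real code calls `ConversionUtils.convert`, whose conversion rules are not reproduced here), and the stand-in only handles exact type matches and `Number` to `Integer`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class ArgHelperSketch {

  // Hypothetical stand-in for ConversionUtils.convert: returns null when
  // the value cannot be converted to the requested class.
  static <T> T convert(Object obj, Class<T> clazz) {
    if (obj == null) {
      return null;
    }
    if (clazz.isInstance(obj)) {
      return clazz.cast(obj);
    }
    if (clazz == Integer.class && obj instanceof Number) {
      return clazz.cast(((Number) obj).intValue());
    }
    return null;
  }

  // Typed, optional access to a positional argument. A null argument yields
  // Optional.empty(); a non-null argument that cannot be converted is an error.
  static <T> Optional<T> get(List<Object> args, int offset, String argName,
                             Class<T> expectedClazz) {
    Object obj = args.get(offset);
    T ret = convert(obj, expectedClazz);
    if (ret == null && obj != null) {
      throw new IllegalStateException(argName + " argument (" + obj
          + ") is expected to be an " + expectedClazz.getName()
          + ", but was " + obj);
    }
    return Optional.ofNullable(ret);
  }

  public static void main(String[] args) {
    System.out.println(get(Arrays.<Object>asList(10L), 0, "Size", Integer.class));
    System.out.println(get(Arrays.asList((Object) null), 0, "Size", Integer.class));
  }
}
```

The single `ret == null && obj != null` check is equivalent to the nested form in the comment; which one reads better is a matter of taste.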


> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable 
> algorithms that do not necessarily support our existing probabilistic 
> sketches.  We should add a reservoir sampler and utilities to merge and 
> resample. 





[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289944#comment-16289944
 ] 

ASF GitHub Bot commented on METRON-1350:


Github user simonellistonball commented on a diff in the pull request:

https://github.com/apache/metron/pull/867#discussion_r156791019
  
--- Diff: 
metron-analytics/metron-statistics/src/main/java/org/apache/metron/statistics/sampling/UniformSampler.java
 ---
@@ -0,0 +1,91 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.statistics.sampling;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+
+public class UniformSampler implements Sampler {
+  private List reservoir;
+  private int seen = 0;
+  private int size;
+  private Random rng = new Random(0);
+
+  public UniformSampler() {
+this(DEFAULT_SIZE);
+  }
+
+  public UniformSampler(int size) {
+this.size = size;
+reservoir = new ArrayList<>(size);
+  }
+
+  @Override
+  public Iterable get() {
+return reservoir;
+  }
+
+  /**
+   * Add an object to the reservoir
+   * @param o
+   */
+  public void add(Object o) {
+if(o == null) {
+  return;
+}
+if (reservoir.size() < size) {
+  reservoir.add(o);
+} else {
+  int rIndex = rng.nextInt(seen + 1);
--- End diff --

You are 100% right; that's what I get for skim reading.


> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable 
> algorithms that do not necessarily support our existing probabilistic 
> sketches.  We should add a reservoir sampler and utilities to merge and 
> resample. 





[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289941#comment-16289941
 ] 

ASF GitHub Bot commented on METRON-1350:


Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/867#discussion_r156790945
  
--- Diff: 
metron-analytics/metron-statistics/src/main/java/org/apache/metron/statistics/sampling/UniformSampler.java
 ---
@@ -0,0 +1,91 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.statistics.sampling;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+
+public class UniformSampler implements Sampler {
+  private List reservoir;
+  private int seen = 0;
+  private int size;
+  private Random rng = new Random(0);
+
+  public UniformSampler() {
+this(DEFAULT_SIZE);
+  }
+
+  public UniformSampler(int size) {
+this.size = size;
+reservoir = new ArrayList<>(size);
+  }
+
+  @Override
+  public Iterable get() {
+return reservoir;
+  }
+
+  /**
+   * Add an object to the reservoir
+   * @param o
+   */
+  public void add(Object o) {
+if(o == null) {
+  return;
+}
+if (reservoir.size() < size) {
+  reservoir.add(o);
+} else {
+  int rIndex = rng.nextInt(seen + 1);
--- End diff --

Just so I'm clear: up to the reservoir size, we add to the reservoir.  Once 
we're past the reservoir size, we do a random replacement as per 
https://en.wikipedia.org/wiki/Reservoir_sampling
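
The behaviour described above can be sketched as follows. This is a minimal illustration of the textbook Algorithm R from the linked article, not the PR's `UniformSampler`: the class name `ReservoirSketch`, the explicit seed parameter, and the `seenCount()` accessor are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class ReservoirSketch {
  private final List<Object> reservoir;
  private final int size;
  private final Random rng;
  private int seen = 0;  // number of non-null items offered so far

  ReservoirSketch(int size, long seed) {
    if (size <= 0) {
      throw new IllegalArgumentException("size must be a positive integer");
    }
    this.size = size;
    this.reservoir = new ArrayList<>(size);
    this.rng = new Random(seed);
  }

  void add(Object o) {
    if (o == null) {
      return;  // nulls are ignored and not counted
    }
    seen++;
    if (reservoir.size() < size) {
      // Still filling: every item is kept.
      reservoir.add(o);
    } else {
      // Past capacity: draw j uniformly from [0, seen) and replace slot j
      // only if it falls inside the reservoir, i.e. with probability size/seen.
      int j = rng.nextInt(seen);
      if (j < size) {
        reservoir.set(j, o);
      }
    }
  }

  List<Object> sample() {
    return Collections.unmodifiableList(reservoir);
  }

  int seenCount() {
    return seen;
  }

  public static void main(String[] args) {
    ReservoirSketch s = new ReservoirSketch(5, 42L);
    for (int i = 1; i <= 11; i++) {
      s.add(i);
    }
    System.out.println(s.sample());
  }
}
```

Note that the count of seen items has to advance on every non-null `add`, including while the reservoir is still filling; otherwise later items are replaced with the wrong probability.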


> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable 
> algorithms that do not necessarily support our existing probabilistic 
> sketches.  We should add a reservoir sampler and utilities to merge and 
> resample. 





[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289937#comment-16289937
 ] 

ASF GitHub Bot commented on METRON-1350:


Github user cestella commented on the issue:

https://github.com/apache/metron/pull/867
  
Sorry, I am not sure I understand; this does random replacement once past 
the size limit.  Am I misunderstanding your question?


> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable 
> algorithms that do not necessarily support our existing probabilistic 
> sketches.  We should add a reservoir sampler and utilities to merge and 
> resample. 





[jira] [Created] (METRON-1351) Create Debian Packages for Ubuntu Trusty

2017-12-13 Thread Nick Allen (JIRA)
Nick Allen created METRON-1351:
--

 Summary: Create Debian Packages for Ubuntu Trusty
 Key: METRON-1351
 URL: https://issues.apache.org/jira/browse/METRON-1351
 Project: Metron
  Issue Type: Improvement
Affects Versions: 0.4.1
Reporter: Nick Allen
Assignee: Nick Allen








[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289926#comment-16289926
 ] 

ASF GitHub Bot commented on METRON-1350:


Github user simonellistonball commented on the issue:

https://github.com/apache/metron/pull/867
  
Should the size limit on the sample really be a cut off? In a likely usage 
scenario, a user would sample over a window in a profile. Simply cutting off at 
the size limit is likely to skew the sample toward the beginning of the window 
rather than being genuinely uniform. Would a random replacement strategy make 
more sense when over the limit? This could be a lot heavier in terms of 
performance, but may be more mathematically sound.


> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable 
> algorithms that do not necessarily support our existing probabilistic 
> sketches.  We should add a reservoir sampler and utilities to merge and 
> resample. 





[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289914#comment-16289914
 ] 

ASF GitHub Bot commented on METRON-1350:


GitHub user cestella opened a pull request:

https://github.com/apache/metron/pull/867

METRON-1350: Add reservoir sampling functions to Stellar

## Contributor Comments
Sampling capabilities would fit very well with the profiler and enable 
algorithms that do not necessarily support our existing probabilistic sketches. 
We should add a reservoir sampler and utilities to merge and resample.

You can play with `SAMPLE_INIT`, `SAMPLE_ADD`, `SAMPLE_MERGE` and 
`SAMPLE_GET` via the REPL:
```
[Stellar]>>> ?SAMPLE_ADD
SAMPLE_ADD
Description: Add to a sample

Arguments:
sampler - Sampler to use.  If null, then a default Uniform sampler is 
created
o - The value to add.  If o is an Iterable, then each item is added.

Returns:
[Stellar]>>> s_10 := SAMPLE_INIT(10)
[Stellar]>>> sample := REDUCE( [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 ], (s, 
x) -> SAMPLE_ADD(s, x), SAMPLE_INIT(5))
[Stellar]>>> SAMPLE_GET(sample)
[6, 8, 11, 4, 5]
[Stellar]>>> SAMPLE_ADD(s_10, [5, 2, 5, 7, 10 ])
org.apache.metron.statistics.sampling.UniformSampler@3d8d06c0
[Stellar]>>> SAMPLE_GET(SAMPLE_ADD(s_10, [5, 2, 5, 7, 10 ]))
[5, 2, 5, 7, 10, 5, 2, 5, 7, 10]
```
## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  


In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
 
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && build_utils/verify_licenses.sh 
  ```

- [x] Have you written or updated unit tests and or integration tests to 
verify your changes?
- [x] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- [x] Have you ensured that format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not then run 
the following commands and the verify changes via 
`site-book/target/site/index.html`:

  ```
  cd site-book
  mvn site
  ```

 Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up 
for your personal repository such that your branches are built there before 
submitting a pull request.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cestella/incubator-metron sampling

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/867.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #867


commit 7e1a19e29f86a23140aa46291f0083409ddac40d
Author: cstella 
Date:   2017-12-13T20:59:40Z

METRON-1350: Add reservoir sampling functions to Stellar




> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues

[jira] [Updated] (METRON-1350) Add reservoir sampling functions to Stellar

2017-12-13 Thread Casey Stella (JIRA)

 [ 
https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Casey Stella updated METRON-1350:
-
Description: Sampling capabilities would fit very well with the profiler 
and enable algorithms that do not necessarily support our existing 
probabilistic sketches.  We should add a reservoir sampler and utilities to 
merge and resample.   (was: Sampling capabilities would fit very well with the 
profiler and enable algorithms that do not necessarily support our existing 
probabalistic sketches.  We should add a reservoir sampler and utilities to 
merge and resample. )

> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
>  Issue Type: Improvement
>Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable 
> algorithms that do not necessarily support our existing probabilistic 
> sketches.  We should add a reservoir sampler and utilities to merge and 
> resample. 





[jira] [Created] (METRON-1350) Add reservoir sampling functions to Stellar

2017-12-13 Thread Casey Stella (JIRA)
Casey Stella created METRON-1350:


 Summary: Add reservoir sampling functions to Stellar
 Key: METRON-1350
 URL: https://issues.apache.org/jira/browse/METRON-1350
 Project: Metron
  Issue Type: Improvement
Reporter: Casey Stella


Sampling capabilities would fit very well with the profiler and enable 
algorithms that do not necessarily support our existing probabalistic sketches. 
 We should add a reservoir sampler and utilities to merge and resample. 





[jira] [Commented] (METRON-1345) Update EC2 README for custom Ansible tags

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289748#comment-16289748
 ] 

ASF GitHub Bot commented on METRON-1345:


Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/859#discussion_r156755792
  
--- Diff: metron-deployment/amazon-ec2/README.md ---
@@ -126,6 +126,10 @@ To provision only subsets of the entire Metron 
deployment, Ansible tags can be s
 ./run.sh --tags="ec2,sensors"
 ```
 
+### Setting REST API Profile
+
--- End diff --

Reading that back, it seems a little stronger than I intended.  Sorry.  I 
don't think everyone following these deployment steps is necessarily going to 
know why they need to start tripping through readme land without some context.  
We have a lot of people trying deployment and having problems who are not as 
expert


> Update EC2 README for custom Ansible tags
> -
>
> Key: METRON-1345
> URL: https://issues.apache.org/jira/browse/METRON-1345
> Project: Metron
>  Issue Type: Improvement
>Reporter: Michael Miklavcic
>Assignee: Michael Miklavcic
>






[jira] [Commented] (METRON-1345) Update EC2 README for custom Ansible tags

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289727#comment-16289727
 ] 

ASF GitHub Bot commented on METRON-1345:


Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/859#discussion_r156751241
  
--- Diff: metron-deployment/amazon-ec2/README.md ---
@@ -126,6 +126,10 @@ To provision only subsets of the entire Metron 
deployment, Ansible tags can be s
 ./run.sh --tags="ec2,sensors"
 ```
 
+### Setting REST API Profile
+
--- End diff --

Just linking to the other doc, without the user knowing even a little of 
why they need to look at it is not much better.  I think my blurb is 
appropriate.


> Update EC2 README for custom Ansible tags
> -
>
> Key: METRON-1345
> URL: https://issues.apache.org/jira/browse/METRON-1345
> Project: Metron
>  Issue Type: Improvement
>Reporter: Michael Miklavcic
>Assignee: Michael Miklavcic
>






[jira] [Commented] (METRON-1345) Update EC2 README for custom Ansible tags

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289704#comment-16289704
 ] 

ASF GitHub Bot commented on METRON-1345:


Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/859#discussion_r156748130
  
--- Diff: metron-deployment/amazon-ec2/README.md ---
@@ -126,6 +126,10 @@ To provision only subsets of the entire Metron 
deployment, Ansible tags can be s
 ./run.sh --tags="ec2,sensors"
 ```
 
+### Setting REST API Profile
+
--- End diff --

That's what I linked to in that README change @merrimanr and 
@ottobackwards. I didn't want to duplicate the REST docs, but agree with Otto 
about having a reference there.


> Update EC2 README for custom Ansible tags
> -
>
> Key: METRON-1345
> URL: https://issues.apache.org/jira/browse/METRON-1345
> Project: Metron
>  Issue Type: Improvement
>Reporter: Michael Miklavcic
>Assignee: Michael Miklavcic
>






[jira] [Commented] (METRON-1349) Full Dev Builds Metron Twice

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289696#comment-16289696
 ] 

ASF GitHub Bot commented on METRON-1349:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/866#discussion_r156746647
  
--- Diff: metron-deployment/playbooks/metron_install.yml ---
@@ -15,13 +15,6 @@
 #  limitations under the License.
 #
 ---
-- hosts: metron
-  become: true
-  roles:
-- { role: ambari_slave }
-- { role: metron-builder, tags: ['build'] }
--- End diff --

Yes


> Full Dev Builds Metron Twice
> 
>
> Key: METRON-1349
> URL: https://issues.apache.org/jira/browse/METRON-1349
> Project: Metron
>  Issue Type: Bug
>Reporter: Nick Allen
>Assignee: Nick Allen
>
> When deploying Metron in Full Dev, the "Build Metron" step gets run twice.





[jira] [Commented] (METRON-1345) Update EC2 README for custom Ansible tags

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289686#comment-16289686
 ] 

ASF GitHub Bot commented on METRON-1345:


Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/859#discussion_r156744095
  
--- Diff: metron-deployment/amazon-ec2/README.md ---
@@ -126,6 +126,10 @@ To provision only subsets of the entire Metron 
deployment, Ansible tags can be s
 ./run.sh --tags="ec2,sensors"
 ```
 
+### Setting REST API Profile
+
--- End diff --

Spring profiles are documented in the REST README:  
https://github.com/apache/metron/tree/master/metron-interface/metron-rest#spring-profiles.
  

Is there something we can do to make the REST README more accessible?  I 
feel like a lot of questions people ask are already answered there but no one 
ever reads it.  What can we do to make it more useful?  Table of contents 
maybe?  I would be happy to take that on in a follow-up PR.


> Update EC2 README for custom Ansible tags
> -
>
> Key: METRON-1345
> URL: https://issues.apache.org/jira/browse/METRON-1345
> Project: Metron
>  Issue Type: Improvement
>Reporter: Michael Miklavcic
>Assignee: Michael Miklavcic
>






[jira] [Commented] (METRON-1349) Full Dev Builds Metron Twice

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289652#comment-16289652
 ] 

ASF GitHub Bot commented on METRON-1349:


Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/866#discussion_r156737918
  
--- Diff: metron-deployment/playbooks/metron_install.yml ---
@@ -15,13 +15,6 @@
 #  limitations under the License.
 #
 ---
-- hosts: metron
-  become: true
-  roles:
-- { role: ambari_slave }
-- { role: metron-builder, tags: ['build'] }
--- End diff --

Will --ansible-skip-tags="build"  still keep it from building at all?


> Full Dev Builds Metron Twice
> 
>
> Key: METRON-1349
> URL: https://issues.apache.org/jira/browse/METRON-1349
> Project: Metron
>  Issue Type: Bug
>Reporter: Nick Allen
>Assignee: Nick Allen
>
> When deploying Metron in Full Dev, the "Build Metron" step gets run twice.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1345) Update EC2 README for custom Ansible tags

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289647#comment-16289647
 ] 

ASF GitHub Bot commented on METRON-1345:


Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/859#discussion_r156737230
  
--- Diff: metron-deployment/amazon-ec2/README.md ---
@@ -126,6 +126,10 @@ To provision only subsets of the entire Metron 
deployment, Ansible tags can be s
 ./run.sh --tags="ec2,sensors"
 ```
 
+### Setting REST API Profile
+
--- End diff --

Can we say something for people like me..  along the lines of
"Spring profiles are used for x,y,z.  By default the dev profile is 
selected.  Change the values of  For more information on setting the 
profiles and their use please see 


> Update EC2 README for custom Ansible tags
> -
>
> Key: METRON-1345
> URL: https://issues.apache.org/jira/browse/METRON-1345
> Project: Metron
>  Issue Type: Improvement
>Reporter: Michael Miklavcic
>Assignee: Michael Miklavcic
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1293) NiFi Indexer(?) capability

2017-12-13 Thread Otto Fowler (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289626#comment-16289626
 ] 

Otto Fowler commented on METRON-1293:
-

Yeah, it is thin.  I create jiras to track ideas for discussions like this, so 
thanks!

All I'll say is, any extension / exit point in the flow would benefit from 
rules run on the data before sending on, either as routing hints or as 
transforms.  The Foreign topology could sit on index, and run rules pre send.

But I agree this is a solution looking for a problem ;)


> NiFi Indexer(?) capability
> --
>
> Key: METRON-1293
> URL: https://issues.apache.org/jira/browse/METRON-1293
> Project: Metron
>  Issue Type: New Feature
>Reporter: Otto Fowler
>Priority: Trivial
>
> Either through an extension ( if and when they come to indexing ) or through 
> a normal module support for indexing to nifi input ports would allow for 
> flexible capabilities for systems where nifi is used.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1349) Full Dev Builds Metron Twice

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289578#comment-16289578
 ] 

ASF GitHub Bot commented on METRON-1349:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/866#discussion_r156726426
  
--- Diff: metron-deployment/playbooks/metron_install.yml ---
@@ -15,13 +15,6 @@
 #  limitations under the License.
 #
 ---
-- hosts: metron
-  become: true
-  roles:
-- { role: ambari_slave }
-- { role: metron-builder, tags: ['build'] }
--- End diff --

This is the cause of the second build.  

The process was a bit more complex when Quick Dev was around because it 
would launch the Quick Dev image, then rebuild and try to push out new bits to 
Quick Dev.  Since we don't need that any longer, this can be simplified.


> Full Dev Builds Metron Twice
> 
>
> Key: METRON-1349
> URL: https://issues.apache.org/jira/browse/METRON-1349
> Project: Metron
>  Issue Type: Bug
>Reporter: Nick Allen
>Assignee: Nick Allen
>
> When deploying Metron in Full Dev, the "Build Metron" step gets run twice.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1349) Full Dev Builds Metron Twice

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289572#comment-16289572
 ] 

ASF GitHub Bot commented on METRON-1349:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/866#discussion_r156725657
  
--- Diff: metron-deployment/roles/ambari_config/tasks/main.yml ---
@@ -26,16 +26,15 @@
   retries: 5
   delay: 10
 
-- name : check if ambari-server is up on {{ ambari_host }}:{{ambari_port}}
+- name : Wait for Ambari to start; http://{{ ambari_host }}:{{ ambari_port }}
   wait_for :
 host: "{{ ambari_host }}"
 port: "{{ ambari_port }}"
-delay: 120
-timeout: 300
+timeout: 600
--- End diff --

There is no need to always wait 2 minutes for Ambari to be ready.  Most 
often it is already up and kicking by the time we get here.  Rather than have a 
forced delay, I just added the delay duration to the overall timeout parameter 
in case there is a delay in getting Ambari going.
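
The difference between a forced delay and a longer timeout can be sketched as below. This is a minimal illustration in Python, not Ansible's `wait_for` implementation; the function name `wait_for_port` and the `poll_interval` parameter are assumptions made for the example.

```python
import socket
import time


def wait_for_port(host, port, timeout=600.0, delay=0.0, poll_interval=1.0):
    """Return True once a TCP connect to host:port succeeds, False on timeout.

    Hypothetical helper for illustration; not Ansible's wait_for module.
    """
    if delay > 0:
        # A forced delay is always paid, even if the service is already up.
        time.sleep(delay)
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Succeeds as soon as something is listening on the port.
            with socket.create_connection((host, port), timeout=poll_interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_interval)
```

With `delay=0` the call returns as soon as the port accepts a connection, so the larger timeout only costs time when the service really is slow to start.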


> Full Dev Builds Metron Twice
> 
>
> Key: METRON-1349
> URL: https://issues.apache.org/jira/browse/METRON-1349
> Project: Metron
>  Issue Type: Bug
>Reporter: Nick Allen
>Assignee: Nick Allen
>
> When deploying Metron in Full Dev, the "Build Metron" step gets run twice.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1349) Full Dev Builds Metron Twice

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289567#comment-16289567
 ] 

ASF GitHub Bot commented on METRON-1349:


Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/866#discussion_r156724938
  
--- Diff: metron-deployment/roles/epel/tasks/main.yml ---
@@ -16,6 +16,4 @@
 #
 ---
 - name: Install EPEL repository
-  yum: name=epel-release update_cache=yes
-
-
+  yum: name=epel-release
--- End diff --

There is no need to force a cache update here.  This role gets run 
repeatedly and forcing a cache update just slows us down.


> Full Dev Builds Metron Twice
> 
>
> Key: METRON-1349
> URL: https://issues.apache.org/jira/browse/METRON-1349
> Project: Metron
>  Issue Type: Bug
>Reporter: Nick Allen
>Assignee: Nick Allen
>
> When deploying Metron in Full Dev, the "Build Metron" step gets run twice.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1349) Full Dev Builds Metron Twice

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289566#comment-16289566
 ] 

ASF GitHub Bot commented on METRON-1349:


GitHub user nickwallen opened a pull request:

https://github.com/apache/metron/pull/866

METRON-1349 Full Dev Builds Metron Twice

Removing the "Quick Dev" environment in #852 had an unintended side effect. 
 It caused Metron to be built twice during the Full Dev deployment process.  
Unless you prefer a double-build for thoroughness, this can be annoying.

## Testing

Deploy Full Dev and ensure that Metron is not built twice.  Once Metron is 
deployed, login to Ambari and run the Metron Service Check.  If the service 
check passes, we've done a solid.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nickwallen/metron METRON-1349

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/866.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #866


commit f152326f4c401129f0b05d03045440e7cf5dda2b
Author: Nick Allen 
Date:   2017-12-13T17:11:59Z

METRON-1349 Full Dev Builds Metron Twice




> Full Dev Builds Metron Twice
> 
>
> Key: METRON-1349
> URL: https://issues.apache.org/jira/browse/METRON-1349
> Project: Metron
>  Issue Type: Bug
>Reporter: Nick Allen
>Assignee: Nick Allen
>
> When deploying Metron in Full Dev, the "Build Metron" step gets run twice.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1345) Update EC2 README for custom Ansible tags

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289552#comment-16289552
 ] 

ASF GitHub Bot commented on METRON-1345:


Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/859#discussion_r156723676
  
--- Diff: metron-deployment/roles/ambari_config/vars/small_cluster.yml ---
@@ -87,6 +87,8 @@ configurations:
   topology.classpath: '{{ topology_classpath }}'
   - kafka-broker:
   log.dirs: '{{ kafka_log_dirs | default("/kafka-log") }}'
--- End diff --

Just pushed a change. How's that @ottobackwards?


> Update EC2 README for custom Ansible tags
> -
>
> Key: METRON-1345
> URL: https://issues.apache.org/jira/browse/METRON-1345
> Project: Metron
>  Issue Type: Improvement
>Reporter: Michael Miklavcic
>Assignee: Michael Miklavcic
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1293) NiFi Indexer(?) capability

2017-12-13 Thread Simon Elliston Ball (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289538#comment-16289538
 ] 

Simon Elliston Ball commented on METRON-1293:
-

So you would imagine something that picks up data from the indexing topic in 
Kafka, and sends it to NiFi site-to-site. Personally I don't see any benefit in 
that vs just having NiFi ConsumeKafka pointed at the indexing topic.

> NiFi Indexer(?) capability
> --
>
> Key: METRON-1293
> URL: https://issues.apache.org/jira/browse/METRON-1293
> Project: Metron
>  Issue Type: New Feature
>Reporter: Otto Fowler
>Priority: Trivial
>
> Either through an extension ( if and when they come to indexing ) or through 
> a normal module support for indexing to nifi input ports would allow for 
> flexible capabilities for systems where nifi is used.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (METRON-1349) Full Dev Builds Metron Twice

2017-12-13 Thread Nick Allen (JIRA)
Nick Allen created METRON-1349:
--

 Summary: Full Dev Builds Metron Twice
 Key: METRON-1349
 URL: https://issues.apache.org/jira/browse/METRON-1349
 Project: Metron
  Issue Type: Bug
Reporter: Nick Allen
Assignee: Nick Allen


When deploying Metron in Full Dev, the "Build Metron" step gets run twice.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1293) NiFi Indexer(?) capability

2017-12-13 Thread Otto Fowler (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289521#comment-16289521
 ] 

Otto Fowler commented on METRON-1293:
-

So, I don't think it is really an 'indexing' thing, other than indexing is the 
end of our pipeline.
If there was a peer to indexing, for sending to other systems -- like 
streamline or nifi etc then that would be what the ticket should be.  If you 
follow.

That would also be 'another' place to plug in a rules engine or stellar.

> NiFi Indexer(?) capability
> --
>
> Key: METRON-1293
> URL: https://issues.apache.org/jira/browse/METRON-1293
> Project: Metron
>  Issue Type: New Feature
>Reporter: Otto Fowler
>Priority: Trivial
>
> Either through an extension ( if and when they come to indexing ) or through 
> a normal module support for indexing to nifi input ports would allow for 
> flexible capabilities for systems where nifi is used.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1293) NiFi Indexer(?) capability

2017-12-13 Thread Simon Elliston Ball (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289486#comment-16289486
 ] 

Simon Elliston Ball commented on METRON-1293:
-

I see the thinking there, and there is some benefit in directly supporting 
site-to-site protocol in terms of load balancing direct from Metron to NiFi 
without the intermediate step of load balancing across kafka consumers, but I'm 
not sure I would recommend writing direct into NiFi from a resilience point of 
view, given that the NiFi content repository is not distributed. 

Still, once we've split out indexing to make it more extensible, this would be 
a good indexing option (that I would never recommend to anyone in production)

> NiFi Indexer(?) capability
> --
>
> Key: METRON-1293
> URL: https://issues.apache.org/jira/browse/METRON-1293
> Project: Metron
>  Issue Type: New Feature
>Reporter: Otto Fowler
>Priority: Trivial
>
> Either through an extension ( if and when they come to indexing ) or through 
> a normal module support for indexing to nifi input ports would allow for 
> flexible capabilities for systems where nifi is used.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1293) NiFi Indexer(?) capability

2017-12-13 Thread Otto Fowler (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289477#comment-16289477
 ] 

Otto Fowler commented on METRON-1293:
-

The idea would be a 'native' nifi connection might offer better integration and 
flexibility.  I do not have something more concrete than that.

> NiFi Indexer(?) capability
> --
>
> Key: METRON-1293
> URL: https://issues.apache.org/jira/browse/METRON-1293
> Project: Metron
>  Issue Type: New Feature
>Reporter: Otto Fowler
>Priority: Trivial
>
> Either through an extension ( if and when they come to indexing ) or through 
> a normal module support for indexing to nifi input ports would allow for 
> flexible capabilities for systems where nifi is used.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (METRON-1293) NiFi Indexer(?) capability

2017-12-13 Thread Otto Fowler (JIRA)

 [ 
https://issues.apache.org/jira/browse/METRON-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Otto Fowler reassigned METRON-1293:
---

Assignee: (was: Otto Fowler)

> NiFi Indexer(?) capability
> --
>
> Key: METRON-1293
> URL: https://issues.apache.org/jira/browse/METRON-1293
> Project: Metron
>  Issue Type: New Feature
>Reporter: Otto Fowler
>Priority: Trivial
>
> Either through an extension ( if and when they come to indexing ) or through 
> a normal module support for indexing to nifi input ports would allow for 
> flexible capabilities for systems where nifi is used.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1343) Swagger UI for User Controller needs request method

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289468#comment-16289468
 ] 

ASF GitHub Bot commented on METRON-1343:


Github user ottobackwards commented on the issue:

https://github.com/apache/metron/pull/862
  
Please take care to mark the jira as done


> Swagger UI for User Controller needs request method
> ---
>
> Key: METRON-1343
> URL: https://issues.apache.org/jira/browse/METRON-1343
> Project: Metron
>  Issue Type: Bug
>Reporter: Mohan
>Assignee: Mohan
>Priority: Minor
> Attachments: Screen Shot 2017-12-05 at 4.35.07 PM (2).png
>
>
> Swagger UI for metron rest endpoints for User Controller has Multiple 
> requestMethods list for the same operations 
> !Screen Shot 2017-12-05 at 4.35.07 PM (2).png|thumbnail!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1343) Swagger UI for User Controller needs request method

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289467#comment-16289467
 ] 

ASF GitHub Bot commented on METRON-1343:


Github user asfgit closed the pull request at:

https://github.com/apache/metron/pull/862


> Swagger UI for User Controller needs request method
> ---
>
> Key: METRON-1343
> URL: https://issues.apache.org/jira/browse/METRON-1343
> Project: Metron
>  Issue Type: Bug
>Reporter: Mohan
>Assignee: Mohan
>Priority: Minor
> Attachments: Screen Shot 2017-12-05 at 4.35.07 PM (2).png
>
>
> Swagger UI for metron rest endpoints for User Controller has Multiple 
> requestMethods list for the same operations 
> !Screen Shot 2017-12-05 at 4.35.07 PM (2).png|thumbnail!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1340) Improve e2e tests for metron alerts

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289442#comment-16289442
 ] 

ASF GitHub Bot commented on METRON-1340:


Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/857
  
Follow-up from the work @merrimanr and I did yesterday. We upped the version of 
Node to 9.2.1. Per the doc, >8 is required to work with async/await. For good 
measure, I also set the NPM version to 5.6.0. We didn't touch Jasmine, but the 
Protractor docs also state that it should be > 2.7. Looks like we are currently 
using 2.5.2 per the package.json file. We may want to consider increasing that 
version as well.

We added `SELENIUM_PROMISE_MANAGER: false` to `protractor.conf.js` and 
immediately got failures due to `Promise` use in the Protractor tests and 
configuration. e.g. `var defer = protractor.promise.defer();`. So we removed 
references to promises in the conf file and were able to get past that first 
batch of errors. Now we were into problems with the tests. I started with the 
`login.e2e-spec.ts` spec file and removed `: Promise`. Running the tests 
again, the login tests were able to succeed.

There are still a large number of failures due to disabling the promise 
manager, but still having code throughout the test suite that leverages the 
older style. It's unclear if this will resolve all stability issues, but I 
think this is moving in the right direction.


> Improve e2e tests for metron alerts
> ---
>
> Key: METRON-1340
> URL: https://issues.apache.org/jira/browse/METRON-1340
> Project: Metron
>  Issue Type: Bug
>Reporter: RaghuMitra
>Assignee: RaghuMitra
>
> Need to improve e2e tests in the following areas:
>  - Tests should not be flaky
>  - Remove the sleep ( This should implicitly make the tests run faster)
>  - Truncate HBase table 'metron_update' before starting the tests
>  - Improve the tests descriptions
>  - Run the tests headless if possible
>  - Check the node version and browser version before launching the tests
> The expected behavior is that there are no intermittent failures. Acceptance 
> criteria: 5 consecutive runs without failures.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1347) Indexing Topology should fail tuples without a source.type

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289427#comment-16289427
 ] 

ASF GitHub Bot commented on METRON-1347:


Github user cestella commented on the issue:

https://github.com/apache/metron/pull/863
  
Actually, I don't think `original_string` is required past the parser 
topology.  For instance, profiler messages into enrichment do not have 
`original_string`.


> Indexing Topology should fail tuples without a source.type
> --
>
> Key: METRON-1347
> URL: https://issues.apache.org/jira/browse/METRON-1347
> Project: Metron
>  Issue Type: Bug
>Reporter: Casey Stella
>
> If you are sending data into metron indexing without a source.type, which 
> would only happen if you're bypassing our previous topologies, we cannot 
> configure how we write to the indices, so the message should be explicitly 
> failed and reported.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1343) Swagger UI for User Controller needs request method

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289405#comment-16289405
 ] 

ASF GitHub Bot commented on METRON-1343:


Github user ottobackwards commented on the issue:

https://github.com/apache/metron/pull/862
  
+1, Thanks for the contribution!


> Swagger UI for User Controller needs request method
> ---
>
> Key: METRON-1343
> URL: https://issues.apache.org/jira/browse/METRON-1343
> Project: Metron
>  Issue Type: Bug
>Reporter: Mohan
>Assignee: Mohan
>Priority: Minor
> Attachments: Screen Shot 2017-12-05 at 4.35.07 PM (2).png
>
>
> Swagger UI for metron rest endpoints for User Controller has Multiple 
> requestMethods list for the same operations 
> !Screen Shot 2017-12-05 at 4.35.07 PM (2).png|thumbnail!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1212) Bundles and Maven Plugin

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289402#comment-16289402
 ] 

ASF GitHub Bot commented on METRON-1212:


GitHub user ottobackwards opened a pull request:

https://github.com/apache/metron/pull/865

METRON-1212 The bundle System and Maven Plugin (Feature Branch)

This PR contains the Bundle system and Maven Plugin.

The bundle system and the plugin are adapted from the Apache NiFi project.  

## bundles-maven-plugin
The bundles-maven-plugin is an adapted version of the jar dependency plugin 
whose function is to bundle a jar of jars based on the dependencies for a 
project.  It also creates metadata attributes.
A project's jar and its non-provided dependency jars are placed in a /lib 
entry in the bundle, with the bundle itself being in jar format.
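
The layout described above can be sketched as follows. This is a rough Python illustration of the idea only, not the plugin's code: the function name `build_bundle`, the `META-INF/MANIFEST.MF` location for the metadata, and the attribute name used in the usage note are assumptions.

```python
import io
import zipfile


def build_bundle(project_jar, dependencies, metadata):
    """Return the bytes of a zip-format (jar-format) bundle.

    project_jar:  (file_name, data) for the project's own jar
    dependencies: iterable of (file_name, scope, data)
    metadata:     dict of attribute name -> value
    All names here are hypothetical and chosen for illustration.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as bundle:
        # Metadata attributes; the manifest location is an assumption.
        manifest = "".join(f"{key}: {value}\n" for key, value in metadata.items())
        bundle.writestr("META-INF/MANIFEST.MF", manifest)
        # The project's jar goes under lib/ ...
        bundle.writestr("lib/" + project_jar[0], project_jar[1])
        # ... along with every dependency that is not 'provided'.
        for file_name, scope, data in dependencies:
            if scope == "provided":
                continue
            bundle.writestr("lib/" + file_name, data)
    return buffer.getvalue()
```

For example, passing one `compile` dependency and one `provided` dependency yields a bundle whose `lib/` entry holds the project jar and only the `compile` jar.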

## bundles-lib 
The bundles-lib contains the functionality required to:
- discover bundles
- inspect bundles for exposed extension types
- load the bundles
- create special class loaders for bundles
- deliver instances of extension types for use

NAR exposed the bundles through many classes.  I have created the 
BundleSystem interface to expose a more usable, simplified api for our use 
cases.

### From the original PR for METRON-777:
Metron Bundle Plugin
- adaptation of the nifi plugin
- more configurable wrt file extension/dependency and metadata naming
bundle-lib
- adaptation of nifi-nar-utils to be used outside of the nifi project
- rudimentary extensibility to allow configuration and injection of service 
types and other things that were hard coded to nifi
- refactored from File based to VFS based
- rebranding to Bundle from Nar ( although the lib and the plugin allow 
that to be configured now )
- added capability to the properties class to write to stream, adapted to 
uri from paths
- added integration tests for hdfs
- changed to be ClassIndex based instead of ServiceLoader. Service loader 
is slower, and Casey's ClassIndex work is great. This also removes the NAR's 
required manual maintenance of the service file.
- refactored to use VFS to load the bundle/nar into the classloader AND to 
use VFS to load the dependency jars -> VFS as a composite filesystem. Thus 
going from NAR's 'working directory', exploded NARS to just loading the 
bundle/nar.

## Previous Review
Please see [@mattf_apache's 
review](https://github.com/apache/metron/pull/530/files/c5f8c34e4de8e6d456b97edd6f8a0d33b4819d69)

## changes from that review
I have changed the InitContext operations to have explicit builders, and 
made it so that creating a context can be done separately from initialization.  
Two contexts can then be 'merged'.  This is to allow for the addition of new 
bundles after initialization.

In preparing this PR I have:
- made checkstyle fixes
- fixed several types
- added a requested set of tests loading and executing simple 
interface/implementation from bundle beyond what is already in the bundle-lib 
tests

## Testing

* `cd bundles-maven-plugin && mvn -q install && cd ..` must be run once to 
install the maven plugin
* This review is code review and test code review and running only
* [Test Project](https://github.com/ottobackwards/test-bundles-plugin) can 
be examined as a simple example of how to create bundles.
* The README.md has getting started and quickstart sections with some 
overview of creating by hand

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
 
- [x] Does your PR title start with METRON- where  is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?
- [x] Have you ensured that the full suite of tests and checks have been 
executed in the root metron \
- [x] Have you written or updated unit tests and or integration tests to 
verify your changes?
### For documentation related changes:
- [x] Have you ensured that format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ottobackwards/metron fifth_bundles

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/865.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #865
 

[jira] [Commented] (METRON-1212) Bundles and Maven Plugin

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289395#comment-16289395
 ] 

ASF GitHub Bot commented on METRON-1212:


Github user ottobackwards closed the pull request at:

https://github.com/apache/metron/pull/774


> Bundles and Maven Plugin
> 
>
> Key: METRON-1212
> URL: https://issues.apache.org/jira/browse/METRON-1212
> Project: Metron
>  Issue Type: Sub-task
>Reporter: Otto Fowler
>Assignee: Otto Fowler
>  Labels: metron-feature-canidate, 
> metron-feature-extensions-parsers
>
> The first effort will be to land the bundle system and supporting maven 
> plugin on master



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1347) Indexing Topology should fail tuples without a source.type

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289389#comment-16289389
 ] 

ASF GitHub Bot commented on METRON-1347:


Github user ottobackwards commented on the issue:

https://github.com/apache/metron/pull/863
  
The minimum required fields, as far as I can see right now are source.type, 
original_string and timestamp.  Given the use case for this is something that 
has skipped the parser topology, we should validate those.

If we think the same can be done for indexing, then we should use the same 
classes/technique there.

Again, this is based on the presented use case
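
A minimal sketch of that kind of check, in Python for brevity rather than the project's Java. The names `REQUIRED_FIELDS`, `missing_fields` and `route` are hypothetical, and the field list is a parameter because which fields are truly required is still being discussed on this PR.

```python
# Fields proposed above as the minimum; treat the list as an assumption.
REQUIRED_FIELDS = ("source.type", "timestamp", "original_string")


def missing_fields(message, required=REQUIRED_FIELDS):
    """Return the required field names that are absent or None in message."""
    return [field for field in required if message.get(field) is None]


def route(message, required=REQUIRED_FIELDS):
    """Fail the message explicitly if a required field is missing."""
    missing = missing_fields(message, required)
    if missing:
        # Report instead of silently writing with a default configuration.
        return ("fail", "missing required field(s): " + ", ".join(missing))
    return ("index", message["source.type"])
```

Passing `required=("source.type", "timestamp")` gives the narrower check for messages that never carried an `original_string`.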


> Indexing Topology should fail tuples without a source.type
> --
>
> Key: METRON-1347
> URL: https://issues.apache.org/jira/browse/METRON-1347
> Project: Metron
>  Issue Type: Bug
>Reporter: Casey Stella
>
> If you are sending data into metron indexing without a source.type, which 
> would only happen if you're bypassing our previous topologies, we cannot 
> configure how we write to the indices, so the message should be explicitly 
> failed and reported.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1347) Indexing Topology should fail tuples without a source.type

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289328#comment-16289328
 ] 

ASF GitHub Bot commented on METRON-1347:


Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/863
  
I would like to hear feedback from @ottobackwards on other required fields 
but this looks good to me otherwise.  


> Indexing Topology should fail tuples without a source.type
> --
>
> Key: METRON-1347
> URL: https://issues.apache.org/jira/browse/METRON-1347
> Project: Metron
>  Issue Type: Bug
>Reporter: Casey Stella
>
> If you are sending data into metron indexing without a source.type, which 
> would only happen if you're bypassing our previous topologies, we cannot 
> configure how we write to the indices, so the message should be explicitly 
> failed and reported.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1347) Indexing Topology should fail tuples without a source.type

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289341#comment-16289341
 ] 

ASF GitHub Bot commented on METRON-1347:


Github user simonellistonball commented on a diff in the pull request:

https://github.com/apache/metron/pull/863#discussion_r156676868
  
--- Diff: 
metron-platform/metron-writer/src/main/java/org/apache/metron/writer/bolt/BulkMessageWriterBolt.java
 ---
@@ -229,17 +239,30 @@ public void execute(Tuple tuple) {
   LOG.trace("Writing enrichment message: {}", message);
   WriterConfiguration writerConfiguration = 
configurationTransformation.apply(
   new IndexingWriterConfiguration(bulkMessageWriter.getName(), 
getConfigurations()));
-  if(writerConfiguration.isDefault(sensorType)) {
-//want to warn, but not fail the tuple
-collector.reportError(new Exception("WARNING: Default and (likely) 
unoptimized writer config used for " + bulkMessageWriter.getName() + " writer 
and sensor " + sensorType));
+  if(sensorType == null) {
--- End diff --

Strictly speaking that's true, but by convention original_string should be 
required. There is a broader topic about what should be required, but that 
certainly doesn't belong in a comment on a PR.


> Indexing Topology should fail tuples without a source.type
> --
>
> Key: METRON-1347
> URL: https://issues.apache.org/jira/browse/METRON-1347
> Project: Metron
>  Issue Type: Bug
>Reporter: Casey Stella
>
> If you are sending data into metron indexing without a source.type, which 
> would only happen if you're bypassing our previous topologies, we cannot 
> configure how we write to the indices, so the message should be explicitly 
> failed and reported.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (METRON-1347) Indexing Topology should fail tuples without a source.type

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289338#comment-16289338
 ] 

ASF GitHub Bot commented on METRON-1347:


Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/863#discussion_r156676155
  
--- Diff: metron-platform/metron-indexing/README.md ---
@@ -15,6 +15,12 @@ Indices are written in batch and the batch size and 
batch timeout are specified
 [Sensor Indexing Configuration](#sensor-indexing-configuration) via the 
`batchSize` and `batchTimeout` parameters.
 These configs are variable by sensor type.
 
--- End diff --

So, strictly speaking messages really only require `source.type` (which I 
typo'd) and `timestamp` (which I should add).  I'll fix that, but did I miss 
anything?







[jira] [Commented] (METRON-1347) Indexing Topology should fail tuples without a source.type

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289332#comment-16289332
 ] 

ASF GitHub Bot commented on METRON-1347:


Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/863#discussion_r156675356
  
--- Diff: 
metron-platform/metron-writer/src/main/java/org/apache/metron/writer/bolt/BulkMessageWriterBolt.java
 ---
@@ -229,17 +239,30 @@ public void execute(Tuple tuple) {
   LOG.trace("Writing enrichment message: {}", message);
   WriterConfiguration writerConfiguration = 
configurationTransformation.apply(
   new IndexingWriterConfiguration(bulkMessageWriter.getName(), 
getConfigurations()));
-  if(writerConfiguration.isDefault(sensorType)) {
-//want to warn, but not fail the tuple
-collector.reportError(new Exception("WARNING: Default and (likely) 
unoptimized writer config used for " + bulkMessageWriter.getName() + " writer 
and sensor " + sensorType));
+  if(sensorType == null) {
--- End diff --

Sure thing.  Really the only two required are `timestamp` and 
`source.type`.  Did I miss any?
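
A check for the two fields named above could look roughly like this. It is only a sketch: the class name `RequiredFieldCheck`, the `missingFields` helper, and the use of a plain `Map` for the message are assumptions for illustration, not code from the PR.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RequiredFieldCheck {

    // The two fields named as required in this thread.
    static final String[] REQUIRED_FIELDS = {"source.type", "timestamp"};

    /** Returns the required fields that are absent (or null) in the message. */
    static List<String> missingFields(Map<String, Object> message) {
        List<String> missing = new ArrayList<>();
        for (String field : REQUIRED_FIELDS) {
            if (message.get(field) == null) {
                missing.add(field);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, Object> message = new HashMap<>();
        message.put("timestamp", 1513123200000L);
        // source.type is deliberately left out of this message.
        System.out.println(missingFields(message)); // prints [source.type]
    }
}
```

Returning the list of missing names, rather than a boolean, lets the caller put the specific field into the reported error.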







[jira] [Commented] (METRON-1293) NiFi Indexer(?) capability

2017-12-13 Thread Simon Elliston Ball (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289330#comment-16289330
 ] 

Simon Elliston Ball commented on METRON-1293:
-

[~ottobackwards] I'm curious about what you would get from this that would not 
be achieved by just hooking NiFi's ConsumeKafka processor up to the indexing 
topic.

> NiFi Indexer(?) capability
> --
>
> Key: METRON-1293
> URL: https://issues.apache.org/jira/browse/METRON-1293
> Project: Metron
>  Issue Type: New Feature
>Reporter: Otto Fowler
>Assignee: Otto Fowler
>Priority: Trivial
>
> Support for indexing to NiFi input ports, either through an extension (if and 
> when extensions come to indexing) or through a normal module, would allow 
> flexible capabilities for systems where NiFi is used.





[jira] [Commented] (METRON-1347) Indexing Topology should fail tuples without a source.type

2017-12-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289326#comment-16289326
 ] 

ASF GitHub Bot commented on METRON-1347:


Github user merrimanr commented on a diff in the pull request:

https://github.com/apache/metron/pull/863#discussion_r156674159
  
--- Diff: 
metron-platform/metron-writer/src/main/java/org/apache/metron/writer/bolt/BulkMessageWriterBolt.java
 ---
@@ -229,17 +239,30 @@ public void execute(Tuple tuple) {
   LOG.trace("Writing enrichment message: {}", message);
   WriterConfiguration writerConfiguration = 
configurationTransformation.apply(
   new IndexingWriterConfiguration(bulkMessageWriter.getName(), 
getConfigurations()));
-  if(writerConfiguration.isDefault(sensorType)) {
-//want to warn, but not fail the tuple
-collector.reportError(new Exception("WARNING: Default and (likely) 
unoptimized writer config used for " + bulkMessageWriter.getName() + " writer 
and sensor " + sensorType));
+  if(sensorType == null) {
--- End diff --

@ottobackwards which fields should we validate here?







[jira] [Commented] (METRON-1244) Metron should support VPN Log Parsing

2017-12-13 Thread Simon Elliston Ball (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289216#comment-16289216
 ] 

Simon Elliston Ball commented on METRON-1244:
-

Absolutely agreed. The challenge is getting hold of good quality sample logs 
that we can incorporate cleanly into integration tests. If anyone can 
contribute logs alone, that would be helpful, and I'm happy to collaborate on 
getting the parsers done too.

> Metron should support VPN Log Parsing
> -
>
> Key: METRON-1244
> URL: https://issues.apache.org/jira/browse/METRON-1244
> Project: Metron
>  Issue Type: New Feature
>Reporter: Otto Fowler
>
> VPN log parsing is very valuable.  Metron should support parsing VPN logs 
> from multiple vendors, and from currently supported devices such as ASA if it 
> does not already.
> Juniper (Pulse Secure)
> OpenVPN
> FortiGate
> F5
> SonicWall
> others
> This support may be by Grok rules or by custom parser.
> We may want to expand this to custom dashboards for VPN-specific fields, 
> extensions to Metron fields for VPN-class logs, etc.


