[jira] [Commented] (METRON-1349) Full Dev Builds Metron Twice
[ https://issues.apache.org/jira/browse/METRON-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290249#comment-16290249 ] ASF GitHub Bot commented on METRON-1349: Github user ottobackwards commented on the issue: https://github.com/apache/metron/pull/866 Do you mean I should see two cowsay building metrons before and only one after? > Full Dev Builds Metron Twice > > > Key: METRON-1349 > URL: https://issues.apache.org/jira/browse/METRON-1349 > Project: Metron > Issue Type: Bug >Reporter: Nick Allen >Assignee: Nick Allen > > When deploying Metron in Full Dev, the "Build Metron" step gets run twice. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1352) Integration and e2e test infrastructure
[ https://issues.apache.org/jira/browse/METRON-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290240#comment-16290240 ] Otto Fowler commented on METRON-1352: - Do we need multiple compose configurations, a "base" one, and then one with rest, and one with the UI's etc? Or some sane set of other composition, that lets you start different combinations of services? Or is there a more "docker" way to do it? > Integration and e2e test infrastructure > --- > > Key: METRON-1352 > URL: https://issues.apache.org/jira/browse/METRON-1352 > Project: Metron > Issue Type: New Feature >Reporter: Ryan Merriman >Assignee: Ryan Merriman > > This feature is based on the work done in > https://issues.apache.org/jira/browse/METRON-1344. That feature branch > serves as the base branch for this Jira and includes: > # Dockerfile for Metron REST > # Dockerfile for Metron UIs > # Docker Compose application including Metron images, Elasticsearch, Kafka, > Zookeeper > # Modified travis file that manages the Docker environment and runs the e2e > tests as part of the build > # Maven pom.xml that installs all the required assets into the Docker e2e > module > # Modified metron-alerts pom.xml that allows e2e tests to be run through Maven > # An example integration test that has been converted to use the new > infrastructure > The initial requirements are as follows: > # All e2e and integration tests run on common infrastructure. > # All e2e and integration tests are run automatically in the Travis build. > # All e2e and integration tests run repeatably and reliably in the Travis > build. > # Debugging options are available and documented. > # The new infra and how to interact with it is documented. > # Old infrastructure removed (anything unused or commented out is deleted, > instead of staying). 
> These requirements are being actively > [discussed|https://lists.apache.org/thread.html/2aaa5a4a66ebdbf41323ea5ad7b059e5acd0a315d57ff871e0a7817e@%3Cdev.metron.apache.org%3E] > on dev list and are subject to change. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1302) Split up Indexing Topology into batch and random access sections
[ https://issues.apache.org/jira/browse/METRON-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290238#comment-16290238 ] ASF GitHub Bot commented on METRON-1302: Github user ottobackwards commented on the issue: https://github.com/apache/metron/pull/831 The batch v. hdfs stuff still confuses me, I thought we decided on a different name? > Split up Indexing Topology into batch and random access sections > > > Key: METRON-1302 > URL: https://issues.apache.org/jira/browse/METRON-1302 > Project: Metron > Issue Type: Improvement >Reporter: Casey Stella > > Currently we have the indexing topology handle writing to both random access > indices (e.g. elasticsearch) as well as batch write indices (e.g. hdfs). We > should split these up and configure them separately. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1302) Split up Indexing Topology into batch and random access sections
[ https://issues.apache.org/jira/browse/METRON-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290232#comment-16290232 ] ASF GitHub Bot commented on METRON-1302: Github user cestella commented on the issue: https://github.com/apache/metron/pull/831 I think I managed to address the issues here. Is there anything else outstanding that I missed? If not, then bump. > Split up Indexing Topology into batch and random access sections > > > Key: METRON-1302 > URL: https://issues.apache.org/jira/browse/METRON-1302 > Project: Metron > Issue Type: Improvement >Reporter: Casey Stella > > Currently we have the indexing topology handle writing to both random access > indices (e.g. elasticsearch) as well as batch write indices (e.g. hdfs). We > should split these up and configure them separately. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (METRON-1359) Create guide for managing and interacting with new test infrastructure
Ryan Merriman created METRON-1359:

Summary: Create guide for managing and interacting with new test infrastructure
Key: METRON-1359
URL: https://issues.apache.org/jira/browse/METRON-1359
Project: Metron
Issue Type: Sub-task
Reporter: Ryan Merriman

A document should be created that clearly explains:
# How to manage the lifecycle of the infrastructure
# How to extend, add to, or modify the infrastructure
# How to learn more about the underlying technologies
# How to interact with the infrastructure at various levels
[jira] [Created] (METRON-1360) Clean up any old infrastructure
Ryan Merriman created METRON-1360: - Summary: Clean up any old infrastructure Key: METRON-1360 URL: https://issues.apache.org/jira/browse/METRON-1360 Project: Metron Issue Type: Sub-task Reporter: Ryan Merriman -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (METRON-1358) Create Debugging guide based on the new test infrastructure
Ryan Merriman created METRON-1358: - Summary: Create Debugging guide based on the new test infrastructure Key: METRON-1358 URL: https://issues.apache.org/jira/browse/METRON-1358 Project: Metron Issue Type: Sub-task Reporter: Ryan Merriman A document should be created that provides guidance on how to debug each Metron module. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (METRON-1357) Optimize travis file
Ryan Merriman created METRON-1357: - Summary: Optimize travis file Key: METRON-1357 URL: https://issues.apache.org/jira/browse/METRON-1357 Project: Metron Issue Type: Sub-task Reporter: Ryan Merriman After all tests have been converted, the Travis config file should be optimized for performance and stability. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (METRON-1356) Add a mechanism in Java for discovering service host/ports
Ryan Merriman created METRON-1356: - Summary: Add a mechanism in Java for discovering service host/ports Key: METRON-1356 URL: https://issues.apache.org/jira/browse/METRON-1356 Project: Metron Issue Type: Sub-task Reporter: Ryan Merriman Integration tests will need to initialize clients with service urls. These may change depending on where and how the infrastructure is run (Docker engine vs Docker for Mac). It would be helpful to have a unified way of retrieving these across all integration tests. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
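One possible shape for the unified lookup described in METRON-1356, as a minimal sketch only. The class name `ServiceEndpoints` and the `<SERVICE>_HOST` / `<SERVICE>_PORT` naming convention are assumptions made for illustration; they are not existing Metron APIs, and the ticket does not prescribe a design.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/**
 * Hypothetical helper (not part of Metron): resolves a service's host and port
 * from environment-style settings, falling back to caller-supplied defaults.
 */
final class ServiceEndpoints {
  private final Map<String, String> settings;

  ServiceEndpoints(Map<String, String> settings) {
    // Defensive copy so later changes to the caller's map do not leak in.
    this.settings = new HashMap<>(settings);
  }

  /** Builds an instance backed by the process environment. */
  static ServiceEndpoints fromEnvironment() {
    return new ServiceEndpoints(System.getenv());
  }

  /**
   * Looks up <SERVICE>_HOST and <SERVICE>_PORT (an assumed convention),
   * e.g. KAFKA_HOST and KAFKA_PORT, and returns "host:port".
   */
  String hostPort(String service, String defaultHost, int defaultPort) {
    String prefix = service.toUpperCase(Locale.ROOT);
    String host = settings.getOrDefault(prefix + "_HOST", defaultHost);
    String port = settings.get(prefix + "_PORT");
    int resolvedPort = (port == null) ? defaultPort : Integer.parseInt(port.trim());
    return host + ":" + resolvedPort;
  }
}
```

An integration test could then call `ServiceEndpoints.fromEnvironment().hostPort("kafka", "localhost", 9092)` and get the same answer whether the containers are reached on localhost or on a separate Docker host address, provided the environment is set accordingly.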
[jira] [Assigned] (METRON-1353) Create HBase Docker service
[ https://issues.apache.org/jira/browse/METRON-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Merriman reassigned METRON-1353: - Assignee: (was: Ryan Merriman) > Create HBase Docker service > --- > > Key: METRON-1353 > URL: https://issues.apache.org/jira/browse/METRON-1353 > Project: Metron > Issue Type: Sub-task >Reporter: Ryan Merriman > > This task includes adding an HBase service to the Docker compose file either > as an image from DockerHub or a Dockerfile. Also includes any relevant > configuration changes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (METRON-1355) Convert metron-elasticsearch to new infrastructure
Ryan Merriman created METRON-1355: - Summary: Convert metron-elasticsearch to new infrastructure Key: METRON-1355 URL: https://issues.apache.org/jira/browse/METRON-1355 Project: Metron Issue Type: Sub-task Reporter: Ryan Merriman Assignee: Ryan Merriman Integration tests need to be converted to use the new infrastructure. This includes: # Updating clients with new infrastructure urls # Adding a namespace to any assets the tests depend on # Cleaning up interactions with old in memory infrastructure -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (METRON-1354) Add all e2e tests back to protractor
Ryan Merriman created METRON-1354:

Summary: Add all e2e tests back to protractor
Key: METRON-1354
URL: https://issues.apache.org/jira/browse/METRON-1354
Project: Metron
Issue Type: Sub-task
Reporter: Ryan Merriman
Assignee: Ryan Merriman

In the base feature branch, all but one reliable e2e test have been temporarily removed until they have all been stabilized. Once that work has been completed they will need to be merged into this feature branch and updated to work with the new test infrastructure.
[jira] [Created] (METRON-1353) Create HBase Docker service
Ryan Merriman created METRON-1353: - Summary: Create HBase Docker service Key: METRON-1353 URL: https://issues.apache.org/jira/browse/METRON-1353 Project: Metron Issue Type: Sub-task Reporter: Ryan Merriman Assignee: Ryan Merriman This task includes adding an HBase service to the Docker compose file either as an image from DockerHub or a Dockerfile. Also includes any relevant configuration changes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar
[ https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290018#comment-16290018 ]

ASF GitHub Bot commented on METRON-1350:

Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/867#discussion_r156800428

--- Diff: metron-analytics/metron-statistics/README.md ---
@@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is used.
 * bounds - A list of value bounds (excluding min and max) in sorted order.
 * Returns: Which bin N the value falls in such that bound(N-1) < value <= bound(N). No min and max bounds are provided, so values smaller than the 0'th bound go in the 0'th bin, and values greater than the last bound go in the M'th bin.
+### Sampling Functions
+
+ `SAMPLE_ADD`
+* Description: Add a value or collection of values to a sampler.
+* Input:
--- End diff --

Actually, more than likely it'd be a separate init, since each type is going to take different parameters depending on the algorithm. So a biased sampler would be `SAMPLE_INIT_BIASED(size, ...)`.

> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
> Issue Type: Improvement
> Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable
> algorithms that do not necessarily support our existing probabilistic
> sketches. We should add a reservoir sampler and utilities to merge and
> resample.
[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar
[ https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290013#comment-16290013 ]

ASF GitHub Bot commented on METRON-1350:

Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/867#discussion_r156799854

--- Diff: metron-analytics/metron-statistics/src/main/java/org/apache/metron/statistics/sampling/UniformSampler.java ---
@@ -0,0 +1,91 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.statistics.sampling;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+
+public class UniformSampler implements Sampler {
+  private List reservoir;
+  private int seen = 0;
+  private int size;
+  private Random rng = new Random(0);
+
+  public UniformSampler() {
+    this(DEFAULT_SIZE);
+  }
+
+  public UniformSampler(int size) {
+    this.size = size;
+    reservoir = new ArrayList<>(size);
+  }
+
+  @Override
+  public Iterable get() {
+    return reservoir;
+  }
+
+  /**
+   * Add an object to the reservoir
+   * @param o
+   */
+  public void add(Object o) {
+    if(o == null) {
+      return;
+    }
+    if (reservoir.size() < size) {
+      reservoir.add(o);
+    } else {
+      int rIndex = rng.nextInt(seen + 1);
--- End diff --

Shouldn't we reference Reservoir Sampling in the documentation? Then the use of Universal and other terms would be more in context.

> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
> Issue Type: Improvement
> Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable
> algorithms that do not necessarily support our existing probabilistic
> sketches. We should add a reservoir sampler and utilities to merge and
> resample.
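The quoted diff stops at `rng.nextInt(seen + 1)`, so the rest of `add` is not visible in this thread. For readers unfamiliar with the term, the following is a minimal sketch of textbook uniform reservoir sampling (Algorithm R). It is not the code from the pull request: the class name `ReservoirSketch`, the seed parameter, and the `seen()` accessor are assumptions introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative uniform reservoir sampler (Algorithm R); not the PR's implementation. */
final class ReservoirSketch {
  private final List<Object> reservoir;
  private final int size;
  private final Random rng;
  private int seen = 0; // number of non-null items offered so far (int, so it can overflow on huge streams)

  ReservoirSketch(int size, long seed) {
    if (size <= 0) {
      throw new IllegalArgumentException("size must be positive");
    }
    this.size = size;
    this.reservoir = new ArrayList<>(size);
    this.rng = new Random(seed);
  }

  /** Offers one item; null items are ignored and not counted. */
  void add(Object o) {
    if (o == null) {
      return;
    }
    seen++;
    if (reservoir.size() < size) {
      // Fill phase: the first `size` items are always kept.
      reservoir.add(o);
    } else {
      // Pick j uniformly from [0, seen); the new item replaces slot j only when
      // j falls inside the reservoir, which happens with probability size / seen.
      int j = rng.nextInt(seen);
      if (j < size) {
        reservoir.set(j, o);
      }
    }
  }

  /** Returns a read-only view of the current sample. */
  List<Object> get() {
    return Collections.unmodifiableList(reservoir);
  }

  int seen() {
    return seen;
  }
}
```

With this scheme every item offered so far has the same chance of being in the sample, which is the sense of "uniform" discussed elsewhere in the review; a recency-biased sampler would change the replacement rule.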
[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar
[ https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290014#comment-16290014 ]

ASF GitHub Bot commented on METRON-1350:

Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/867#discussion_r156799950

--- Diff: metron-analytics/metron-statistics/src/main/java/org/apache/metron/statistics/sampling/UniformSampler.java ---
@@ -0,0 +1,91 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.statistics.sampling;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+
+public class UniformSampler implements Sampler {
+  private List reservoir;
+  private int seen = 0;
+  private int size;
+  private Random rng = new Random(0);
+
+  public UniformSampler() {
+    this(DEFAULT_SIZE);
+  }
+
+  public UniformSampler(int size) {
+    this.size = size;
+    reservoir = new ArrayList<>(size);
+  }
+
+  @Override
+  public Iterable get() {
+    return reservoir;
+  }
+
+  /**
+   * Add an object to the reservoir
+   * @param o
+   */
+  public void add(Object o) {
+    if(o == null) {
+      return;
+    }
+    if (reservoir.size() < size) {
+      reservoir.add(o);
+    } else {
+      int rIndex = rng.nextInt(seen + 1);
--- End diff --

This makes me think that we need "namespace" scoped documentation

> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
> Issue Type: Improvement
> Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable
> algorithms that do not necessarily support our existing probabilistic
> sketches. We should add a reservoir sampler and utilities to merge and
> resample.
[jira] [Updated] (METRON-1352) Integration and e2e test infrastructure
[ https://issues.apache.org/jira/browse/METRON-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Merriman updated METRON-1352: -- Description: This feature is based on the work done in https://issues.apache.org/jira/browse/METRON-1344. That feature branch serves as the base branch for this Jira and includes: # Dockerfile for Metron REST # Dockerfile for Metron UIs # Docker Compose application including Metron images, Elasticsearch, Kafka, Zookeeper # Modified travis file that manages the Docker environment and runs the e2e tests as part of the build # Maven pom.xml that installs all the required assets into the Docker e2e module # Modified metron-alerts pom.xml that allows e2e tests to be run through Maven # An example integration test that has been converted to use the new infrastructure The initial requirements are as follows: # All e2e and integration tests run on common infrastructure. # All e2e and integration tests are run automatically in the Travis build. # All e2e and integration tests run repeatably and reliably in the Travis build. # Debugging options are available and documented. # The new infra and how to interact with it is documented. # Old infrastructure removed (anything unused or commented out is deleted, instead of staying). These requirements are being actively [discussed|https://lists.apache.org/thread.html/2aaa5a4a66ebdbf41323ea5ad7b059e5acd0a315d57ff871e0a7817e@%3Cdev.metron.apache.org%3E] on dev list and are subject to change. was: This feature is based on the work done in https://issues.apache.org/jira/browse/METRON-1344. The initial requirements are as follows: # All e2e and integration tests run on common infrastructure. # All e2e and integration tests are run automatically in the Travis build. # All e2e and integration tests run repeatably and reliably in the Travis build. # Debugging options are available and documented. # The new infra and how to interact with it is documented. 
# Old infrastructure removed (anything unused or commented out is deleted, instead of staying). These requirements are being actively [discussed|https://lists.apache.org/thread.html/2aaa5a4a66ebdbf41323ea5ad7b059e5acd0a315d57ff871e0a7817e@%3Cdev.metron.apache.org%3E] on dev list and are subject to change. > Integration and e2e test infrastructure > --- > > Key: METRON-1352 > URL: https://issues.apache.org/jira/browse/METRON-1352 > Project: Metron > Issue Type: New Feature >Reporter: Ryan Merriman >Assignee: Ryan Merriman > > This feature is based on the work done in > https://issues.apache.org/jira/browse/METRON-1344. That feature branch > serves as the base branch for this Jira and includes: > # Dockerfile for Metron REST > # Dockerfile for Metron UIs > # Docker Compose application including Metron images, Elasticsearch, Kafka, > Zookeeper > # Modified travis file that manages the Docker environment and runs the e2e > tests as part of the build > # Maven pom.xml that installs all the required assets into the Docker e2e > module > # Modified metron-alerts pom.xml that allows e2e tests to be run through Maven > # An example integration test that has been converted to use the new > infrastructure > The initial requirements are as follows: > # All e2e and integration tests run on common infrastructure. > # All e2e and integration tests are run automatically in the Travis build. > # All e2e and integration tests run repeatably and reliably in the Travis > build. > # Debugging options are available and documented. > # The new infra and how to interact with it is documented. > # Old infrastructure removed (anything unused or commented out is deleted, > instead of staying). > These requirements are being actively > [discussed|https://lists.apache.org/thread.html/2aaa5a4a66ebdbf41323ea5ad7b059e5acd0a315d57ff871e0a7817e@%3Cdev.metron.apache.org%3E] > on dev list and are subject to change. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar
[ https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290009#comment-16290009 ] ASF GitHub Bot commented on METRON-1350: Github user ottobackwards commented on a diff in the pull request: https://github.com/apache/metron/pull/867#discussion_r156799548 --- Diff: metron-analytics/metron-statistics/README.md --- @@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is used. * bounds - A list of value bounds (excluding min and max) in sorted order. * Returns: Which bin N the value falls in such that bound(N-1) < value <= bound(N). No min and max bounds are provided, so values smaller than the 0'th bound go in the 0'th bin, and values greater than the last bound go in the M'th bin. +### Sampling Functions + + `SAMPLE_ADD` +* Description: Add a value or collection of values to a sampler. +* Input: --- End diff -- Then we'll have a get sample types method, like we do with other things like this right? > Add reservoir sampling functions to Stellar > --- > > Key: METRON-1350 > URL: https://issues.apache.org/jira/browse/METRON-1350 > Project: Metron > Issue Type: Improvement >Reporter: Casey Stella > > Sampling capabilities would fit very well with the profiler and enable > algorithms that do not necessarily support our existing probabilistic > sketches. We should add a reservoir sampler and utilities to merge and > resample. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar
[ https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289990#comment-16289990 ]

ASF GitHub Bot commented on METRON-1350:

Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/867#discussion_r156796690

--- Diff: metron-analytics/metron-statistics/README.md ---
@@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is used.
 * bounds - A list of value bounds (excluding min and max) in sorted order.
 * Returns: Which bin N the value falls in such that bound(N-1) < value <= bound(N). No min and max bounds are provided, so values smaller than the 0'th bound go in the 0'th bin, and values greater than the last bound go in the M'th bin.
+### Sampling Functions
+
+ `SAMPLE_ADD`
+* Description: Add a value or collection of values to a sampler.
+* Input:
--- End diff --

They're both needed. Some use-cases would be fine without bias and some would be better with bias. As a follow-on, I was planning on adding a biased sampler, but this is a big enough PR without it. It'd look something like:

```
samples := SAMPLE_MERGE(PROFILE_GET('samples', ...))
biased_sample := SAMPLE_GET_BIASED(samples, 0.015)
```

> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
> Issue Type: Improvement
> Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable
> algorithms that do not necessarily support our existing probabilistic
> sketches. We should add a reservoir sampler and utilities to merge and
> resample.
[jira] [Created] (METRON-1352) Integration and e2e test infrastructure
Ryan Merriman created METRON-1352: - Summary: Integration and e2e test infrastructure Key: METRON-1352 URL: https://issues.apache.org/jira/browse/METRON-1352 Project: Metron Issue Type: New Feature Reporter: Ryan Merriman Assignee: Ryan Merriman This feature is based on the work done in https://issues.apache.org/jira/browse/METRON-1344. The initial requirements are as follows: # All e2e and integration tests run on common infrastructure. # All e2e and integration tests are run automatically in the Travis build. # All e2e and integration tests run repeatably and reliably in the Travis build. # Debugging options are available and documented. # The new infra and how to interact with it is documented. # Old infrastructure removed (anything unused or commented out is deleted, instead of staying). These requirements are being actively [discussed|https://lists.apache.org/thread.html/2aaa5a4a66ebdbf41323ea5ad7b059e5acd0a315d57ff871e0a7817e@%3Cdev.metron.apache.org%3E] on dev list and are subject to change. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar
[ https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289978#comment-16289978 ] ASF GitHub Bot commented on METRON-1350: Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/867#discussion_r156794990 --- Diff: metron-analytics/metron-statistics/README.md --- @@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is used. * bounds - A list of value bounds (excluding min and max) in sorted order. * Returns: Which bin N the value falls in such that bound(N-1) < value <= bound(N). No min and max bounds are provided, so values smaller than the 0'th bound go in the 0'th bin, and values greater than the last bound go in the M'th bin. +### Sampling Functions + + `SAMPLE_ADD` +* Description: Add a value or collection of values to a sampler. +* Input: --- End diff -- Recency would surely be more relevant for merged resampling in a profile context? > Add reservoir sampling functions to Stellar > --- > > Key: METRON-1350 > URL: https://issues.apache.org/jira/browse/METRON-1350 > Project: Metron > Issue Type: Improvement >Reporter: Casey Stella > > Sampling capabilities would fit very well with the profiler and enable > algorithms that do not necessarily support our existing probabilistic > sketches. We should add a reservoir sampler and utilities to merge and > resample. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar
[ https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289975#comment-16289975 ] ASF GitHub Bot commented on METRON-1350: Github user cestella commented on a diff in the pull request: https://github.com/apache/metron/pull/867#discussion_r156794655 --- Diff: metron-analytics/metron-statistics/README.md --- @@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is used. * bounds - A list of value bounds (excluding min and max) in sorted order. * Returns: Which bin N the value falls in such that bound(N-1) < value <= bound(N). No min and max bounds are provided, so values smaller than the 0'th bound go in the 0'th bin, and values greater than the last bound go in the M'th bin. +### Sampling Functions + + `SAMPLE_ADD` +* Description: Add a value or collection of values to a sampler. +* Input: --- End diff -- There are definitely other types of reservoir samplers which we will probably want. Most specifically a sampler that is biased toward recency (so non-uniform in that case). > Add reservoir sampling functions to Stellar > --- > > Key: METRON-1350 > URL: https://issues.apache.org/jira/browse/METRON-1350 > Project: Metron > Issue Type: Improvement >Reporter: Casey Stella > > Sampling capabilities would fit very well with the profiler and enable > algorithms that do not necessarily support our existing probabilistic > sketches. We should add a reservoir sampler and utilities to merge and > resample. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar
[ https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289955#comment-16289955 ] ASF GitHub Bot commented on METRON-1350: Github user ottobackwards commented on a diff in the pull request: https://github.com/apache/metron/pull/867#discussion_r156792855 --- Diff: metron-analytics/metron-statistics/README.md --- @@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is used. * bounds - A list of value bounds (excluding min and max) in sorted order. * Returns: Which bin N the value falls in such that bound(N-1) < value <= bound(N). No min and max bounds are provided, so values smaller than the 0'th bound go in the 0'th bin, and values greater than the last bound go in the M'th bin. +### Sampling Functions + + `SAMPLE_ADD` +* Description: Add a value or collection of values to a sampler. +* Input: --- End diff -- Ok, It seemed like the Uniform implementation was leaking > Add reservoir sampling functions to Stellar > --- > > Key: METRON-1350 > URL: https://issues.apache.org/jira/browse/METRON-1350 > Project: Metron > Issue Type: Improvement >Reporter: Casey Stella > > Sampling capabilities would fit very well with the profiler and enable > algorithms that do not necessarily support our existing probabilistic > sketches. We should add a reservoir sampler and utilities to merge and > resample. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (METRON-1351) Create Installable Packages for Ubuntu Trusty
[ https://issues.apache.org/jira/browse/METRON-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Allen updated METRON-1351: --- Summary: Create Installable Packages for Ubuntu Trusty (was: Create Debian Packages for Ubuntu Trusty) > Create Installable Packages for Ubuntu Trusty > - > > Key: METRON-1351 > URL: https://issues.apache.org/jira/browse/METRON-1351 > Project: Metron > Issue Type: Improvement >Affects Versions: 0.4.1 >Reporter: Nick Allen >Assignee: Nick Allen > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar
[ https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289953#comment-16289953 ]

ASF GitHub Bot commented on METRON-1350:

Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/867#discussion_r156792461

--- Diff: metron-analytics/metron-statistics/README.md ---
@@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is used.
 * bounds - A list of value bounds (excluding min and max) in sorted order.
 * Returns: Which bin N the value falls in such that bound(N-1) < value <= bound(N). No min and max bounds are provided, so values smaller than the 0'th bound go in the 0'th bin, and values greater than the last bound go in the M'th bin.
+### Sampling Functions
+
+ `SAMPLE_ADD`
+* Description: Add a value or collection of values to a sampler.
+* Input:
--- End diff --

Couldn't this be simplified to

```java
if (ret == null) {
  if (obj != null) {
    throw new IllegalStateException(argName + " argument (" + obj + ") is expected to be an "
        + expectedClazz.getName() + ", but was " + obj);
  }
}
return Optional.ofNullable(ret);
```

> Add reservoir sampling functions to Stellar
> ---
>
> Key: METRON-1350
> URL: https://issues.apache.org/jira/browse/METRON-1350
> Project: Metron
> Issue Type: Improvement
> Reporter: Casey Stella
>
> Sampling capabilities would fit very well with the profiler and enable
> algorithms that do not necessarily support our existing probabilistic
> sketches. We should add a reservoir sampler and utilities to merge and
> resample.
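The null handling in the suggestion above can be sketched as a self-contained helper. The surrounding method is not shown in this thread, so the class name `ArgConversion`, the method name `convert`, and the use of `Class.isInstance` to produce `ret` are assumptions for illustration only; just the `ret` / `obj` / `argName` / `expectedClazz` names come from the quoted snippet.

```java
import java.util.Optional;

/** Hypothetical wrapper around the suggested null-handling logic. */
final class ArgConversion {
  private ArgConversion() {}

  /**
   * Returns the argument as the expected type, an empty Optional when the
   * argument itself is null, and fails loudly when it has the wrong type.
   */
  static <T> Optional<T> convert(Object obj, Class<T> expectedClazz, String argName) {
    // Assumed conversion step: a plain type check stands in for whatever the PR does.
    T ret = expectedClazz.isInstance(obj) ? expectedClazz.cast(obj) : null;
    if (ret == null) {
      if (obj != null) {
        // A non-null input that could not be converted is a caller error.
        throw new IllegalStateException(argName + " argument (" + obj + ") is expected to be an "
            + expectedClazz.getName() + ", but was " + obj);
      }
    }
    return Optional.ofNullable(ret);
  }
}
```

The point of the simplification is that `Optional.ofNullable` already covers the "input was null" case, so only the "non-null but wrong type" case needs an explicit branch.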
[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar
[ https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289951#comment-16289951 ] ASF GitHub Bot commented on METRON-1350: Github user cestella commented on a diff in the pull request: https://github.com/apache/metron/pull/867#discussion_r156792227 --- Diff: metron-analytics/metron-statistics/README.md --- @@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is used. * bounds - A list of value bounds (excluding min and max) in sorted order. * Returns: Which bin N the value falls in such that bound(N-1) < value <= bound(N). No min and max bounds are provided, so values smaller than the 0'th bound go in the 0'th bin, and values greater than the last bound go in the M'th bin. +### Sampling Functions + + `SAMPLE_ADD` +* Description: Add a value or collection of values to a sampler. +* Input: --- End diff -- Sorry, `uniform` here is intended to mean that each element has an equal probability of being in the sample (e.g. the probability is pulled from a [uniform probability distribution](https://en.wikipedia.org/wiki/Uniform_distribution_(continuous))). I can probably do a better job documenting. > Add reservoir sampling functions to Stellar > --- > > Key: METRON-1350 > URL: https://issues.apache.org/jira/browse/METRON-1350 > Project: Metron > Issue Type: Improvement >Reporter: Casey Stella > > Sampling capabilities would fit very well with the profiler and enable > algorithms that do not necessarily support our existing probabilistic > sketches. We should add a reservoir sampler and utilities to merge and > resample. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
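The "equal probability" claim in the comment above can be made concrete with a short worked derivation. This is a sketch for the textbook reservoir algorithm (often called Algorithm R), not a statement about the exact code in the pull request; the symbols k (reservoir size), n (items seen so far) and i (position of an item in the stream) are introduced here for illustration only.

```latex
% Assumption: textbook Algorithm R with reservoir size k and n >= k items seen.
% The t-th item (t > k) enters the reservoir with probability k/t and, if it
% enters, evicts one of the k slots chosen uniformly at random.

% An item already in the reservoir survives step t with probability
\[
  1 - \frac{k}{t}\cdot\frac{1}{k} = \frac{t-1}{t}.
\]

% Case 1: item i <= k starts in the reservoir and must survive steps k+1..n:
\[
  P(i \in S_n) = \prod_{t=k+1}^{n} \frac{t-1}{t} = \frac{k}{n}.
\]

% Case 2: item i > k enters with probability k/i, then survives steps i+1..n:
\[
  P(i \in S_n) = \frac{k}{i}\prod_{t=i+1}^{n} \frac{t-1}{t}
               = \frac{k}{i}\cdot\frac{i}{n} = \frac{k}{n}.
\]
```

In both cases the inclusion probability is k/n, independent of where the item appeared in the stream, which is the sense in which the sample is uniform.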
[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar
[ https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289946#comment-16289946 ] ASF GitHub Bot commented on METRON-1350: Github user ottobackwards commented on a diff in the pull request: https://github.com/apache/metron/pull/867#discussion_r156788055 --- Diff: metron-analytics/metron-statistics/README.md --- @@ -53,6 +53,32 @@ functions can be used from everywhere where Stellar is used. * bounds - A list of value bounds (excluding min and max) in sorted order. * Returns: Which bin N the value falls in such that bound(N-1) < value <= bound(N). No min and max bounds are provided, so values smaller than the 0'th bound go in the 0'th bin, and values greater than the last bound go in the M'th bin. +### Sampling Functions + + `SAMPLE_ADD` +* Description: Add a value or collection of values to a sampler. +* Input: --- End diff -- This makes it seem like Uniform sampler is a 'known' thing. But it is not, either by explanation or reference to where it is explained ( as we have done referring to algorithms before ). Is there another type of sampler? Somewhere ( I'm not sure where ) we should say: "A sampler is a x that is | does | acts as x for the sample functions. The default has these properties, but you can override that in init" Why even mention the Universal? > Add reservoir sampling functions to Stellar > --- > > Key: METRON-1350 > URL: https://issues.apache.org/jira/browse/METRON-1350 > Project: Metron > Issue Type: Improvement >Reporter: Casey Stella > > Sampling capabilities would fit very well with the profiler and enable > algorithms that do not necessarily support our existing probabilistic > sketches. We should add a reservoir sampler and utilities to merge and > resample. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar
[ https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289945#comment-16289945 ] ASF GitHub Bot commented on METRON-1350: Github user ottobackwards commented on a diff in the pull request: https://github.com/apache/metron/pull/867#discussion_r156790508 --- Diff: metron-analytics/metron-statistics/src/main/java/org/apache/metron/statistics/sampling/SamplingInitFunctions.java --- @@ -0,0 +1,89 @@ +/* + * + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * + */ +package org.apache.metron.statistics.sampling; + +import org.apache.metron.stellar.common.utils.ConversionUtils; +import org.apache.metron.stellar.dsl.Context; +import org.apache.metron.stellar.dsl.ParseException; +import org.apache.metron.stellar.dsl.Stellar; +import org.apache.metron.stellar.dsl.StellarFunction; + +import java.util.List; +import java.util.Optional; +import java.util.function.Supplier; + +public class SamplingInitFunctions { + + @Stellar(namespace="SAMPLE" + ,name="INIT" + ,description="Create a uniform reservoir sampler of a specific size or, if unspecified, size " + Sampler.DEFAULT_SIZE + ,params = { +"size? - The size of the reservoir sampler. 
If unspecified, the size is " + Sampler.DEFAULT_SIZE + } + ,returns="The sampler object." + ) + + public static class UniformSamplerInit implements StellarFunction { +@Override +public Object apply(List args, Context context) throws ParseException { + if(args.size() == 0) { +return new UniformSampler(); + } + else { +Optional sizeArg = get(args, 0, "Size", Integer.class); +if(sizeArg.isPresent() && sizeArg.get() <= 0) { + throw new IllegalStateException("Size must be a positive integer"); +} +else { + return new UniformSampler(sizeArg.orElse(Sampler.DEFAULT_SIZE)); +} + } +} + +@Override +public void initialize(Context context) { +} + +@Override +public boolean isInitialized() { + return true; +} + } + + + public static Optional get(List args, int offset, String argName, Class expectedClazz) { +Object obj = args.get(offset); +T ret = ConversionUtils.convert(obj, expectedClazz); --- End diff -- Couldn't this be simplified to : ```java if(ret == null ) { if(obj != null) { throw new IllegalStateException(argName + "argument(" + obj + " is expected to be an " + expectedClazz.getName() + ", but was " + obj ); } } return Optional.ofNullable(ret); } ``` > Add reservoir sampling functions to Stellar > --- > > Key: METRON-1350 > URL: https://issues.apache.org/jira/browse/METRON-1350 > Project: Metron > Issue Type: Improvement >Reporter: Casey Stella > > Sampling capabilities would fit very well with the profiler and enable > algorithms that do not necessarily support our existing probabilistic > sketches. We should add a reservoir sampler and utilities to merge and > resample. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
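The simplification suggested in the review comment above can be sketched as a self-contained example. This is an illustration under stated assumptions, not the code that was merged: `ArgSketch` and its local `convert` method are hypothetical stand-ins (the real helper calls Metron's `ConversionUtils.convert`, whose behaviour is only approximated here by a plain type check), and the error message wording is adjusted slightly for readability.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of the simplified argument helper discussed in the review.
class ArgSketch {

  // Stand-in for ConversionUtils.convert (an assumption): returns the value
  // cast to the expected type, or null when it is not an instance of it.
  static <T> T convert(Object obj, Class<T> expectedClazz) {
    return expectedClazz.isInstance(obj) ? expectedClazz.cast(obj) : null;
  }

  // Returns the argument at the given offset as an Optional of the expected type.
  // A null argument yields Optional.empty(); a non-null argument that cannot be
  // converted raises IllegalStateException.
  static <T> Optional<T> get(List<Object> args, int offset, String argName, Class<T> expectedClazz) {
    Object obj = args.get(offset);
    T ret = convert(obj, expectedClazz);
    if (ret == null && obj != null) {
      throw new IllegalStateException(argName + " argument (" + obj + ") is expected to be an "
          + expectedClazz.getName() + ", but was " + obj.getClass().getName());
    }
    // ofNullable covers both the converted value and the null-argument case.
    return Optional.ofNullable(ret);
  }
}
```

The point of the suggestion is that `Optional.ofNullable(ret)` handles the present and absent cases in one return, so the only branch left is the conversion failure.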
[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar
[ https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289944#comment-16289944 ] ASF GitHub Bot commented on METRON-1350: Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/867#discussion_r156791019 --- Diff: metron-analytics/metron-statistics/src/main/java/org/apache/metron/statistics/sampling/UniformSampler.java --- @@ -0,0 +1,91 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.metron.statistics.sampling; + +import java.util.ArrayList; +import java.util.List; +import java.util.Random; + +public class UniformSampler implements Sampler { + private List reservoir; + private int seen = 0; + private int size; + private Random rng = new Random(0); + + public UniformSampler() { +this(DEFAULT_SIZE); + } + + public UniformSampler(int size) { +this.size = size; +reservoir = new ArrayList<>(size); + } + + @Override + public Iterable get() { +return reservoir; + } + + /** + * Add an object to the reservoir + * @param o + */ + public void add(Object o) { +if(o == null) { + return; +} +if (reservoir.size() < size) { + reservoir.add(o); +} else { + int rIndex = rng.nextInt(seen + 1); --- End diff -- you are 100% right, that's what I get for skim reading. > Add reservoir sampling functions to Stellar > --- > > Key: METRON-1350 > URL: https://issues.apache.org/jira/browse/METRON-1350 > Project: Metron > Issue Type: Improvement >Reporter: Casey Stella > > Sampling capabilities would fit very well with the profiler and enable > algorithms that do not necessarily support our existing probabilistic > sketches. We should add a reservoir sampler and utilities to merge and > resample. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar
[ https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289941#comment-16289941 ] ASF GitHub Bot commented on METRON-1350: Github user cestella commented on a diff in the pull request: https://github.com/apache/metron/pull/867#discussion_r156790945 --- Diff: metron-analytics/metron-statistics/src/main/java/org/apache/metron/statistics/sampling/UniformSampler.java --- @@ -0,0 +1,91 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.metron.statistics.sampling; + +import java.util.ArrayList; +import java.util.List; +import java.util.Random; + +public class UniformSampler implements Sampler { + private List reservoir; + private int seen = 0; + private int size; + private Random rng = new Random(0); + + public UniformSampler() { +this(DEFAULT_SIZE); + } + + public UniformSampler(int size) { +this.size = size; +reservoir = new ArrayList<>(size); + } + + @Override + public Iterable get() { +return reservoir; + } + + /** + * Add an object to the reservoir + * @param o + */ + public void add(Object o) { +if(o == null) { + return; +} +if (reservoir.size() < size) { + reservoir.add(o); +} else { + int rIndex = rng.nextInt(seen + 1); --- End diff -- Just so I'm clear, up to the reservoir size, we add to the reservoir. When we're past the reservoir, we do a random replacement as per https://en.wikipedia.org/wiki/Reservoir_sampling > Add reservoir sampling functions to Stellar > --- > > Key: METRON-1350 > URL: https://issues.apache.org/jira/browse/METRON-1350 > Project: Metron > Issue Type: Improvement >Reporter: Casey Stella > > Sampling capabilities would fit very well with the profiler and enable > algorithms that do not necessarily support our existing probabilistic > sketches. We should add a reservoir sampler and utilities to merge and > resample. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
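The behaviour described in the comment above (fill the reservoir up to its size, then replace at random) can be sketched as follows. This is a minimal version of the textbook algorithm from the linked Wikipedia article, assuming a seeded `java.util.Random`; the class name `ReservoirSketch` and its `seen()` accessor are hypothetical and it is not the `UniformSampler` from the pull request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of a uniform reservoir sampler (textbook Algorithm R).
class ReservoirSketch {
  private final List<Object> reservoir;
  private final int size;
  private final Random rng;
  private int seen = 0; // number of non-null items offered so far

  ReservoirSketch(int size, long seed) {
    if (size <= 0) {
      throw new IllegalArgumentException("Size must be a positive integer");
    }
    this.size = size;
    this.reservoir = new ArrayList<>(size);
    this.rng = new Random(seed);
  }

  // Offer one item to the sampler; nulls are ignored.
  void add(Object o) {
    if (o == null) {
      return;
    }
    seen++;
    if (reservoir.size() < size) {
      // Still filling: every item is kept.
      reservoir.add(o);
    } else {
      // Past the limit: draw an index uniformly from [0, seen) and
      // replace only if it lands inside the reservoir, i.e. with
      // probability size / seen.
      int idx = rng.nextInt(seen);
      if (idx < size) {
        reservoir.set(idx, o);
      }
    }
  }

  List<Object> get() {
    return reservoir;
  }

  int seen() {
    return seen;
  }
}
```

Because later items can displace earlier ones, the sample is not skewed toward the start of a window, which addresses the cut-off concern raised earlier in the thread. Each `add` costs a single random draw, so the replacement strategy adds little overhead.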
[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar
[ https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289937#comment-16289937 ] ASF GitHub Bot commented on METRON-1350: Github user cestella commented on the issue: https://github.com/apache/metron/pull/867 Sorry, I am not sure I understand; this already does random replacement once past the size limit. Am I mistaking your question? > Add reservoir sampling functions to Stellar > --- > > Key: METRON-1350 > URL: https://issues.apache.org/jira/browse/METRON-1350 > Project: Metron > Issue Type: Improvement >Reporter: Casey Stella > > Sampling capabilities would fit very well with the profiler and enable > algorithms that do not necessarily support our existing probabilistic > sketches. We should add a reservoir sampler and utilities to merge and > resample. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (METRON-1351) Create Debian Packages for Ubuntu Trusty
Nick Allen created METRON-1351: -- Summary: Create Debian Packages for Ubuntu Trusty Key: METRON-1351 URL: https://issues.apache.org/jira/browse/METRON-1351 Project: Metron Issue Type: Improvement Affects Versions: 0.4.1 Reporter: Nick Allen Assignee: Nick Allen -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar
[ https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289926#comment-16289926 ] ASF GitHub Bot commented on METRON-1350: Github user simonellistonball commented on the issue: https://github.com/apache/metron/pull/867 Should the size limit on the sample really be a cut off? In a likely usage scenario a user would sample over a window in a profile. Limiting the size is likely to skew the sample toward the beginning of the window rather than being genuinely uniform. Would a random replacement strategy make more sense when over the limit? This could be a lot heavier in terms of performance, but may be more mathematically sound. > Add reservoir sampling functions to Stellar > --- > > Key: METRON-1350 > URL: https://issues.apache.org/jira/browse/METRON-1350 > Project: Metron > Issue Type: Improvement >Reporter: Casey Stella > > Sampling capabilities would fit very well with the profiler and enable > algorithms that do not necessarily support our existing probabilistic > sketches. We should add a reservoir sampler and utilities to merge and > resample. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1350) Add reservoir sampling functions to Stellar
[ https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289914#comment-16289914 ] ASF GitHub Bot commented on METRON-1350: GitHub user cestella opened a pull request: https://github.com/apache/metron/pull/867 METRON-1350: Add reservoir sampling functions to Stellar ## Contributor Comments Sampling capabilities would fit very well with the profiler and enable algorithms that do not necessarily support our existing probabilistic sketches. We should add a reservoir sampler and utilities to merge and resample. You can play with `SAMPLE_INIT`, `SAMPLE_ADD`, `SAMPLE_MERGE` and `SAMPLE_GET` via the REPL: ``` [Stellar]>>> ?SAMPLE_ADD SAMPLE_ADD Description: Add to a sample Arguments: sampler - Sampler to use. If null, then a default Uniform sampler is created o - The value to add. If o is an Iterable, then each item is added. Returns: [Stellar]>>> s_10 := SAMPLE_INIT(10) [Stellar]>>> sample := REDUCE( [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 ], (s, x) -> SAMPLE_ADD(s, x), SAMPLE_INIT(5)) [Stellar]>>> SAMPLE_GET(sample) [6, 8, 11, 4, 5] [Stellar]>>> SAMPLE_ADD(s_10, [5, 2, 5, 7, 10 ]) org.apache.metron.statistics.sampling.UniformSampler@3d8d06c0 [Stellar]>>> SAMPLE_GET(SAMPLE_ADD(s_10, [5, 2, 5, 7, 10 ])) [5, 2, 5, 7, 10, 5, 2, 5, 7, 10] ``` ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? 
If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [x] Have you included steps or a guide to how the change may be verified and tested manually? - [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via: ``` mvn -q clean integration-test install && build_utils/verify_licenses.sh ``` - [x] Have you written or updated unit tests and or integration tests to verify your changes? - [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? ### For documentation related changes: - [x] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via `site-book/target/site/index.html`: ``` cd site-book mvn site ``` Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request. 
You can merge this pull request into a Git repository by running: $ git pull https://github.com/cestella/incubator-metron sampling Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/867.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #867 commit 7e1a19e29f86a23140aa46291f0083409ddac40d Author: cstella Date: 2017-12-13T20:59:40Z METRON-1350: Add reservoir sampling functions to Stellar > Add reservoir sampling functions to Stellar > --- > > Key: METRON-1350 > URL: https://issues
[jira] [Updated] (METRON-1350) Add reservoir sampling functions to Stellar
[ https://issues.apache.org/jira/browse/METRON-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Casey Stella updated METRON-1350: - Description: Sampling capabilities would fit very well with the profiler and enable algorithms that do not necessarily support our existing probabilistic sketches. We should add a reservoir sampler and utilities to merge and resample. (was: Sampling capabilities would fit very well with the profiler and enable algorithms that do not necessarily support our existing probabalistic sketches. We should add a reservoir sampler and utilities to merge and resample. ) > Add reservoir sampling functions to Stellar > --- > > Key: METRON-1350 > URL: https://issues.apache.org/jira/browse/METRON-1350 > Project: Metron > Issue Type: Improvement >Reporter: Casey Stella > > Sampling capabilities would fit very well with the profiler and enable > algorithms that do not necessarily support our existing probabilistic > sketches. We should add a reservoir sampler and utilities to merge and > resample. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (METRON-1350) Add reservoir sampling functions to Stellar
Casey Stella created METRON-1350: Summary: Add reservoir sampling functions to Stellar Key: METRON-1350 URL: https://issues.apache.org/jira/browse/METRON-1350 Project: Metron Issue Type: Improvement Reporter: Casey Stella Sampling capabilities would fit very well with the profiler and enable algorithms that do not necessarily support our existing probabalistic sketches. We should add a reservoir sampler and utilities to merge and resample. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1345) Update EC2 README for custom Ansible tags
[ https://issues.apache.org/jira/browse/METRON-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289748#comment-16289748 ] ASF GitHub Bot commented on METRON-1345: Github user ottobackwards commented on a diff in the pull request: https://github.com/apache/metron/pull/859#discussion_r156755792 --- Diff: metron-deployment/amazon-ec2/README.md --- @@ -126,6 +126,10 @@ To provision only subsets of the entire Metron deployment, Ansible tags can be s ./run.sh --tags="ec2,sensors" ``` +### Setting REST API Profile + --- End diff -- Reading that back, it seems a little stronger than I intended. Sorry. I don't think everyone following these deployment steps is necessarily going to know why they need to start tripping through readme land without some context. We have a lot of people trying deployment and having problems who are not as expert > Update EC2 README for custom Ansible tags > - > > Key: METRON-1345 > URL: https://issues.apache.org/jira/browse/METRON-1345 > Project: Metron > Issue Type: Improvement >Reporter: Michael Miklavcic >Assignee: Michael Miklavcic > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1345) Update EC2 README for custom Ansible tags
[ https://issues.apache.org/jira/browse/METRON-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289727#comment-16289727 ] ASF GitHub Bot commented on METRON-1345: Github user ottobackwards commented on a diff in the pull request: https://github.com/apache/metron/pull/859#discussion_r156751241 --- Diff: metron-deployment/amazon-ec2/README.md --- @@ -126,6 +126,10 @@ To provision only subsets of the entire Metron deployment, Ansible tags can be s ./run.sh --tags="ec2,sensors" ``` +### Setting REST API Profile + --- End diff -- Just linking to the other doc, without the user knowing even a little of why they need to look at it is not much better. I think my blurb is appropriate. > Update EC2 README for custom Ansible tags > - > > Key: METRON-1345 > URL: https://issues.apache.org/jira/browse/METRON-1345 > Project: Metron > Issue Type: Improvement >Reporter: Michael Miklavcic >Assignee: Michael Miklavcic > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1345) Update EC2 README for custom Ansible tags
[ https://issues.apache.org/jira/browse/METRON-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289704#comment-16289704 ] ASF GitHub Bot commented on METRON-1345: Github user mmiklavc commented on a diff in the pull request: https://github.com/apache/metron/pull/859#discussion_r156748130 --- Diff: metron-deployment/amazon-ec2/README.md --- @@ -126,6 +126,10 @@ To provision only subsets of the entire Metron deployment, Ansible tags can be s ./run.sh --tags="ec2,sensors" ``` +### Setting REST API Profile + --- End diff -- That's what I linked to in that README change @merrimanr and @ottobackwards. I didn't want to duplicate the REST docs, but agree with Otto about having a reference there. > Update EC2 README for custom Ansible tags > - > > Key: METRON-1345 > URL: https://issues.apache.org/jira/browse/METRON-1345 > Project: Metron > Issue Type: Improvement >Reporter: Michael Miklavcic >Assignee: Michael Miklavcic > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1349) Full Dev Builds Metron Twice
[ https://issues.apache.org/jira/browse/METRON-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289696#comment-16289696 ] ASF GitHub Bot commented on METRON-1349: Github user nickwallen commented on a diff in the pull request: https://github.com/apache/metron/pull/866#discussion_r156746647 --- Diff: metron-deployment/playbooks/metron_install.yml --- @@ -15,13 +15,6 @@ # limitations under the License. # --- -- hosts: metron - become: true - roles: -- { role: ambari_slave } -- { role: metron-builder, tags: ['build'] } --- End diff -- Yes > Full Dev Builds Metron Twice > > > Key: METRON-1349 > URL: https://issues.apache.org/jira/browse/METRON-1349 > Project: Metron > Issue Type: Bug >Reporter: Nick Allen >Assignee: Nick Allen > > When deploying Metron in Full Dev, the "Build Metron" step gets run twice. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1345) Update EC2 README for custom Ansible tags
[ https://issues.apache.org/jira/browse/METRON-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289686#comment-16289686 ] ASF GitHub Bot commented on METRON-1345: Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/859#discussion_r156744095 --- Diff: metron-deployment/amazon-ec2/README.md --- @@ -126,6 +126,10 @@ To provision only subsets of the entire Metron deployment, Ansible tags can be s ./run.sh --tags="ec2,sensors" ``` +### Setting REST API Profile + --- End diff -- Spring profiles are documented in the REST README: https://github.com/apache/metron/tree/master/metron-interface/metron-rest#spring-profiles. Is there something we can do to make the REST README more accessible? I feel like a lot of questions people ask are already answered there but no one ever reads it. What can we do to make it more useful? Table of contents maybe? I would be happy to take that on in a follow-up PR. > Update EC2 README for custom Ansible tags > - > > Key: METRON-1345 > URL: https://issues.apache.org/jira/browse/METRON-1345 > Project: Metron > Issue Type: Improvement >Reporter: Michael Miklavcic >Assignee: Michael Miklavcic > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1349) Full Dev Builds Metron Twice
[ https://issues.apache.org/jira/browse/METRON-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289652#comment-16289652 ] ASF GitHub Bot commented on METRON-1349: Github user ottobackwards commented on a diff in the pull request: https://github.com/apache/metron/pull/866#discussion_r156737918 --- Diff: metron-deployment/playbooks/metron_install.yml --- @@ -15,13 +15,6 @@ # limitations under the License. # --- -- hosts: metron - become: true - roles: -- { role: ambari_slave } -- { role: metron-builder, tags: ['build'] } --- End diff -- Will --ansible-skip-tags="build" still keep it from building at all? > Full Dev Builds Metron Twice > > > Key: METRON-1349 > URL: https://issues.apache.org/jira/browse/METRON-1349 > Project: Metron > Issue Type: Bug >Reporter: Nick Allen >Assignee: Nick Allen > > When deploying Metron in Full Dev, the "Build Metron" step gets run twice. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1345) Update EC2 README for custom Ansible tags
[ https://issues.apache.org/jira/browse/METRON-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289647#comment-16289647 ] ASF GitHub Bot commented on METRON-1345: Github user ottobackwards commented on a diff in the pull request: https://github.com/apache/metron/pull/859#discussion_r156737230 --- Diff: metron-deployment/amazon-ec2/README.md --- @@ -126,6 +126,10 @@ To provision only subsets of the entire Metron deployment, Ansible tags can be s ./run.sh --tags="ec2,sensors" ``` +### Setting REST API Profile + --- End diff -- Can we say something for people like me.. along the lines of "Spring profiles are used for x,y,z. By default the dev profile is selected. Change the values of For more information on setting the profiles and their use please see > Update EC2 README for custom Ansible tags > - > > Key: METRON-1345 > URL: https://issues.apache.org/jira/browse/METRON-1345 > Project: Metron > Issue Type: Improvement >Reporter: Michael Miklavcic >Assignee: Michael Miklavcic > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1293) NiFi Indexer(?) capability
[ https://issues.apache.org/jira/browse/METRON-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289626#comment-16289626 ] Otto Fowler commented on METRON-1293: - Yeah, it is thin. I create jiras to track ideas for discussions like this, so thanks! All I'll say is, any extension / exit point in the flow would benefit from rules run on the data before sending on, either as routing hints or as transforms. The Foreign topology could sit on index, and run rules pre send. But I agree this is a solution looking for a problem ;) > NiFi Indexer(?) capability > -- > > Key: METRON-1293 > URL: https://issues.apache.org/jira/browse/METRON-1293 > Project: Metron > Issue Type: New Feature >Reporter: Otto Fowler >Priority: Trivial > > Either through an extension ( if and when they come to indexing ) or through > a normal module support for indexing to nifi input ports would allow for > flexible capabilities for systems where nifi is used. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1349) Full Dev Builds Metron Twice
[ https://issues.apache.org/jira/browse/METRON-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289578#comment-16289578 ] ASF GitHub Bot commented on METRON-1349: Github user nickwallen commented on a diff in the pull request: https://github.com/apache/metron/pull/866#discussion_r156726426 --- Diff: metron-deployment/playbooks/metron_install.yml --- @@ -15,13 +15,6 @@ # limitations under the License. # --- -- hosts: metron - become: true - roles: -- { role: ambari_slave } -- { role: metron-builder, tags: ['build'] } --- End diff -- This is the cause of the second build. The process was a bit more complex when Quick Dev was around because it would launch the Quick Dev image, then rebuild and try to push out new bits to Quick Dev. Since we don't need that any longer, this can be simplified. > Full Dev Builds Metron Twice > > > Key: METRON-1349 > URL: https://issues.apache.org/jira/browse/METRON-1349 > Project: Metron > Issue Type: Bug >Reporter: Nick Allen >Assignee: Nick Allen > > When deploying Metron in Full Dev, the "Build Metron" step gets run twice. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1349) Full Dev Builds Metron Twice
[ https://issues.apache.org/jira/browse/METRON-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289572#comment-16289572 ] ASF GitHub Bot commented on METRON-1349: Github user nickwallen commented on a diff in the pull request: https://github.com/apache/metron/pull/866#discussion_r156725657 --- Diff: metron-deployment/roles/ambari_config/tasks/main.yml --- @@ -26,16 +26,15 @@ retries: 5 delay: 10 -- name : check if ambari-server is up on {{ ambari_host }}:{{ambari_port}} +- name : Wait for Ambari to start; http://{{ ambari_host }}:{{ ambari_port }} wait_for : host: "{{ ambari_host }}" port: "{{ ambari_port }}" -delay: 120 -timeout: 300 +timeout: 600 --- End diff -- There is no need to always wait 2 minutes for Ambari to be ready. Most often it is already up and kicking by the time we get here. Rather than have a forced delay, I just added the delay duration to the overall timeout parameter in case there is a delay in getting Ambari going. > Full Dev Builds Metron Twice > > > Key: METRON-1349 > URL: https://issues.apache.org/jira/browse/METRON-1349 > Project: Metron > Issue Type: Bug >Reporter: Nick Allen >Assignee: Nick Allen > > When deploying Metron in Full Dev, the "Build Metron" step gets run twice. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
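The trade-off described in the comment above (poll immediately and allow a longer overall timeout, rather than always sleeping first) can be sketched in a few lines. This is only an illustration of the idea, not Ansible's `wait_for` implementation: the function name `wait_for_port`, the `poll_interval` parameter, and the choice to count any initial delay against the overall deadline are all assumptions made for this sketch.

```python
import socket
import time


def wait_for_port(host, port, timeout=600.0, delay=0.0, poll_interval=1.0):
    """Return True once host:port accepts a TCP connection, False on timeout.

    Hypothetical helper: `delay` is an optional forced sleep before the first
    attempt; in this sketch it is counted against the overall deadline.
    """
    deadline = time.monotonic() + timeout
    if delay > 0:
        time.sleep(delay)  # the forced wait that the change removes
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        try:
            # Each attempt is capped so it cannot run past the deadline.
            with socket.create_connection((host, port), timeout=min(poll_interval, remaining)):
                return True
        except OSError:
            # Not up yet: back off briefly, then try again.
            time.sleep(min(poll_interval, max(0.0, deadline - time.monotonic())))
```

With `delay=0`, a service that is already listening is detected on the first attempt, while a slow start still has the full `timeout` to come up. That is the behaviour the comment argues for.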
[jira] [Commented] (METRON-1349) Full Dev Builds Metron Twice
[ https://issues.apache.org/jira/browse/METRON-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289567#comment-16289567 ] ASF GitHub Bot commented on METRON-1349: Github user nickwallen commented on a diff in the pull request: https://github.com/apache/metron/pull/866#discussion_r156724938 --- Diff: metron-deployment/roles/epel/tasks/main.yml --- @@ -16,6 +16,4 @@ # --- - name: Install EPEL repository - yum: name=epel-release update_cache=yes - - + yum: name=epel-release --- End diff -- There is no need to force a cache update here. This role gets run repetitively and forcing a cache update just slows us down. > Full Dev Builds Metron Twice > > > Key: METRON-1349 > URL: https://issues.apache.org/jira/browse/METRON-1349 > Project: Metron > Issue Type: Bug >Reporter: Nick Allen >Assignee: Nick Allen > > When deploying Metron in Full Dev, the "Build Metron" step gets run twice. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1349) Full Dev Builds Metron Twice
[ https://issues.apache.org/jira/browse/METRON-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289566#comment-16289566 ] ASF GitHub Bot commented on METRON-1349: GitHub user nickwallen opened a pull request: https://github.com/apache/metron/pull/866 METRON-1349 Full Dev Builds Metron Twice Removing the "Quick Dev" environment in #852 had an unintended side effect. It caused Metron to be built twice during the Full Dev deployment process. Unless you prefer a double-build for thoroughness, this can be annoying. ## Testing Deploy Full Dev and ensure that Metron is not built twice. Once Metron is deployed, log in to Ambari and run the Metron Service Check. If the service check passes, we've done a solid. You can merge this pull request into a Git repository by running: $ git pull https://github.com/nickwallen/metron METRON-1349 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/866.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #866 commit f152326f4c401129f0b05d03045440e7cf5dda2b Author: Nick Allen Date: 2017-12-13T17:11:59Z METRON-1349 Full Dev Builds Metron Twice > Full Dev Builds Metron Twice > > > Key: METRON-1349 > URL: https://issues.apache.org/jira/browse/METRON-1349 > Project: Metron > Issue Type: Bug >Reporter: Nick Allen >Assignee: Nick Allen > > When deploying Metron in Full Dev, the "Build Metron" step gets run twice. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1345) Update EC2 README for custom Ansible tags
[ https://issues.apache.org/jira/browse/METRON-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289552#comment-16289552 ] ASF GitHub Bot commented on METRON-1345: Github user mmiklavc commented on a diff in the pull request: https://github.com/apache/metron/pull/859#discussion_r156723676 --- Diff: metron-deployment/roles/ambari_config/vars/small_cluster.yml --- @@ -87,6 +87,8 @@ configurations: topology.classpath: '{{ topology_classpath }}' - kafka-broker: log.dirs: '{{ kafka_log_dirs | default("/kafka-log") }}' --- End diff -- Just pushed a change. How's that @ottobackwards? > Update EC2 README for custom Ansible tags > - > > Key: METRON-1345 > URL: https://issues.apache.org/jira/browse/METRON-1345 > Project: Metron > Issue Type: Improvement >Reporter: Michael Miklavcic >Assignee: Michael Miklavcic > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1293) NiFi Indexer(?) capability
[ https://issues.apache.org/jira/browse/METRON-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289538#comment-16289538 ] Simon Elliston Ball commented on METRON-1293: - So you would imagine something that picks up data from the indexing topic in Kafka, and sends it to NiFi site-to-site. Personally I don't see any benefit in that vs just having NiFi ConsumeKafka pointed at the indexing topic. > NiFi Indexer(?) capability > -- > > Key: METRON-1293 > URL: https://issues.apache.org/jira/browse/METRON-1293 > Project: Metron > Issue Type: New Feature >Reporter: Otto Fowler >Priority: Trivial > > Either through an extension ( if and when they come to indexing ) or through > a normal module support for indexing to nifi input ports would allow for > flexible capabilities for systems where nifi is used. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (METRON-1349) Full Dev Builds Metron Twice
Nick Allen created METRON-1349: -- Summary: Full Dev Builds Metron Twice Key: METRON-1349 URL: https://issues.apache.org/jira/browse/METRON-1349 Project: Metron Issue Type: Bug Reporter: Nick Allen Assignee: Nick Allen When deploying Metron in Full Dev, the "Build Metron" step gets run twice. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1293) NiFi Indexer(?) capability
[ https://issues.apache.org/jira/browse/METRON-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289521#comment-16289521 ] Otto Fowler commented on METRON-1293: - So, I don't think it is really an 'indexing' thing, other than that indexing is the end of our pipeline. If there was a peer to indexing, for sending to other systems -- like streamline or nifi etc. -- then that would be what the ticket should be. If you follow. That would also be 'another' place to plug in a rules engine or stellar. > NiFi Indexer(?) capability > -- > > Key: METRON-1293 > URL: https://issues.apache.org/jira/browse/METRON-1293 > Project: Metron > Issue Type: New Feature >Reporter: Otto Fowler >Priority: Trivial > > Either through an extension ( if and when they come to indexing ) or through > a normal module support for indexing to nifi input ports would allow for > flexible capabilities for systems where nifi is used. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1293) NiFi Indexer(?) capability
[ https://issues.apache.org/jira/browse/METRON-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289486#comment-16289486 ] Simon Elliston Ball commented on METRON-1293: - I see the thinking there, and there is some benefit in directly supporting site-to-site protocol in terms of load balancing direct from Metron to NiFi without the intermediate step of load balancing across kafka consumers, but I'm not sure I would recommend writing direct into NiFi from a resilience point of view, given that the NiFi content repository is not distributed. Still, once we've split out indexing to make it more extensible, this would be a good indexing option (that I would never recommend to anyone in production) > NiFi Indexer(?) capability > -- > > Key: METRON-1293 > URL: https://issues.apache.org/jira/browse/METRON-1293 > Project: Metron > Issue Type: New Feature >Reporter: Otto Fowler >Priority: Trivial > > Either through an extension ( if and when they come to indexing ) or through > a normal module support for indexing to nifi input ports would allow for > flexible capabilities for systems where nifi is used. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1293) NiFi Indexer(?) capability
[ https://issues.apache.org/jira/browse/METRON-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289477#comment-16289477 ] Otto Fowler commented on METRON-1293: - The idea would be a 'native' nifi connection might offer better integration and flexibility. I do not have something more concrete than that. > NiFi Indexer(?) capability > -- > > Key: METRON-1293 > URL: https://issues.apache.org/jira/browse/METRON-1293 > Project: Metron > Issue Type: New Feature >Reporter: Otto Fowler >Priority: Trivial > > Either through an extension ( if and when they come to indexing ) or through > a normal module support for indexing to nifi input ports would allow for > flexible capabilities for systems where nifi is used. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (METRON-1293) NiFi Indexer(?) capability
[ https://issues.apache.org/jira/browse/METRON-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Otto Fowler reassigned METRON-1293: --- Assignee: (was: Otto Fowler) > NiFi Indexer(?) capability > -- > > Key: METRON-1293 > URL: https://issues.apache.org/jira/browse/METRON-1293 > Project: Metron > Issue Type: New Feature >Reporter: Otto Fowler >Priority: Trivial > > Either through an extension ( if and when they come to indexing ) or through > a normal module support for indexing to nifi input ports would allow for > flexible capabilities for systems where nifi is used. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1343) Swagger UI for User Controller needs request method
[ https://issues.apache.org/jira/browse/METRON-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289468#comment-16289468 ] ASF GitHub Bot commented on METRON-1343: Github user ottobackwards commented on the issue: https://github.com/apache/metron/pull/862 Please take care to mark the jira as done > Swagger UI for User Controller needs request method > --- > > Key: METRON-1343 > URL: https://issues.apache.org/jira/browse/METRON-1343 > Project: Metron > Issue Type: Bug >Reporter: Mohan >Assignee: Mohan >Priority: Minor > Attachments: Screen Shot 2017-12-05 at 4.35.07 PM (2).png > > > Swagger UI for metron rest endpoints for User Controller has Multiple > requestMethods list for the same operations > !Screen Shot 2017-12-05 at 4.35.07 PM (2).png|thumbnail! -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1343) Swagger UI for User Controller needs request method
[ https://issues.apache.org/jira/browse/METRON-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289467#comment-16289467 ] ASF GitHub Bot commented on METRON-1343: Github user asfgit closed the pull request at: https://github.com/apache/metron/pull/862 > Swagger UI for User Controller needs request method > --- > > Key: METRON-1343 > URL: https://issues.apache.org/jira/browse/METRON-1343 > Project: Metron > Issue Type: Bug >Reporter: Mohan >Assignee: Mohan >Priority: Minor > Attachments: Screen Shot 2017-12-05 at 4.35.07 PM (2).png > > > Swagger UI for metron rest endpoints for User Controller has Multiple > requestMethods list for the same operations > !Screen Shot 2017-12-05 at 4.35.07 PM (2).png|thumbnail! -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1340) Improve e2e tests for metron alerts
[ https://issues.apache.org/jira/browse/METRON-1340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289442#comment-16289442 ] ASF GitHub Bot commented on METRON-1340: Github user mmiklavc commented on the issue: https://github.com/apache/metron/pull/857 Follow up from @merrimanr and my work yesterday. We upped the versions of Node to 9.2.1. Per the doc, >8 is required to work with async/await. For good measure, I also set the NPM version to 5.6.0. We didn't touch Jasmine, but the Protractor docs also state that it should be > 2.7. Looks like we are currently using 2.5.2 per the package.json file. We may want to consider increasing that version as well. We added `SELENIUM_PROMISE_MANAGER: false` to `protractor.conf.js` and immediately got failures due to `Promise` use in the Protractor tests and configuration. e.g. `var defer = protractor.promise.defer();`. So we removed references to promises in the conf file and were able to get past that first batch of errors. Now we were into problems with the tests. I started with the `login.e2e-spec.ts` spec file and removed `: Promise`. Running the tests again, the login tests were able to succeed. There are still a large number of failures due to disabling the promise manager, but still having code throughout the test suite that leverages the older style. It's unclear if this will resolve all stability issues, but I think this is moving in the right direction. 
> Improve e2e tests for metron alerts > --- > > Key: METRON-1340 > URL: https://issues.apache.org/jira/browse/METRON-1340 > Project: Metron > Issue Type: Bug >Reporter: RaghuMitra >Assignee: RaghuMitra > > Need to improve e2e tests in the following areas: > - Tests should not be flaky > - Remove the sleep ( This should implicitly make the tests run faster) > - Truncate HBase table 'metron_update' before starting the tests > - Improve the tests descriptions > - Run the tests headless if possible > - Check the node version and browser version before launching the tests > The expected behavior is that there are no intermittent failures. Acceptance > criteria: 5 consecutive runs without failures. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1347) Indexing Topology should fail tuples without a source.type
[ https://issues.apache.org/jira/browse/METRON-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289427#comment-16289427 ] ASF GitHub Bot commented on METRON-1347: Github user cestella commented on the issue: https://github.com/apache/metron/pull/863 Actually, I don't think `original_string` is required past the parser topology. For instance, profiler messages into enrichment do not have `original_string`. > Indexing Topology should fail tuples without a source.type > -- > > Key: METRON-1347 > URL: https://issues.apache.org/jira/browse/METRON-1347 > Project: Metron > Issue Type: Bug >Reporter: Casey Stella > > If you are sending data into metron indexing without a source.type, which > would only happen if you're bypassing our previous topologies, we cannot > configure how we write to the indices, so the message should be explicitly > failed and reported. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1343) Swagger UI for User Controller needs request method
[ https://issues.apache.org/jira/browse/METRON-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289405#comment-16289405 ] ASF GitHub Bot commented on METRON-1343: Github user ottobackwards commented on the issue: https://github.com/apache/metron/pull/862 +1, Thanks for the contribution! > Swagger UI for User Controller needs request method > --- > > Key: METRON-1343 > URL: https://issues.apache.org/jira/browse/METRON-1343 > Project: Metron > Issue Type: Bug >Reporter: Mohan >Assignee: Mohan >Priority: Minor > Attachments: Screen Shot 2017-12-05 at 4.35.07 PM (2).png > > > Swagger UI for metron rest endpoints for User Controller has Multiple > requestMethods list for the same operations > !Screen Shot 2017-12-05 at 4.35.07 PM (2).png|thumbnail! -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1212) Bundles and Maven Plugin
[ https://issues.apache.org/jira/browse/METRON-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289402#comment-16289402 ] ASF GitHub Bot commented on METRON-1212: GitHub user ottobackwards opened a pull request: https://github.com/apache/metron/pull/865 METRON-1212 The bundle System and Maven Plugin (Feature Branch) This PR contains the Bundle system and Maven Plugin. The bundle system and the plugin are adapted from the Apache NiFi project. ## bundles-maven-plugin The bundles-maven-plugin is an adapted version of the jar dependency plugin whose function is to bundle a jar of jars based on the dependencies for a project. It also creates metadata attributes. A project's jar and its non-provided dependency jars are placed in a /lib entry in the bundle, with the bundle itself being in jar format. ## bundles-lib The bundles-lib contains the functionality required to: - discover bundles - inspect bundles for exposed extension types - load the bundles - create special class loaders for bundles - deliver instances of extension types for use NAR exposed the bundles through many classes. I have created the BundleSystem interface to expose a more usable, simplified api for our use cases. ### From the original PR for METRON-777: Metron Bundle Plugin - adaptation of the nifi plugin - more configurable wrt file extension/dependency and metadata naming bundle-lib - adaptation of nifi-nar-utils to be used outside of the nifi project - rudimentary extensibility to allow configuration and injection of service types and other things that were hard coded to nifi - refactored from File based to VFS based - rebranding to Bundle from Nar ( although the lib and the plugin allow that to be configured now ) - added capability to the properties class to write to stream, adapted to uri from paths - added integration tests for hdfs - changed to be ClassIndex based instead of ServiceLoader. 
Service loader is slower, and Casey's ClassIndex work is great. This also removes the NAR's required manual maintenance of the service file. - refactored to use VFS to load the bundle/nar into the classloader AND to use VFS to load the dependency jars -> VFS as a composite filesystem. Thus going from NAR's 'working directory', exploded NARS to just loading the bundle/nar. ## Previous Review Please see [@mattf_apache's review](https://github.com/apache/metron/pull/530/files/c5f8c34e4de8e6d456b97edd6f8a0d33b4819d69) ## changes from that review I have changed the InitContext operations to have explicit builders, and made it so that creating a context can be done separately from initialization. Two contexts can then be 'merged'. This is to allow for the addition of new bundles after initialization. In preparing this PR I have: - made checkstyle fixes - fixed several types - added a requested set of tests loading and executing simple interface/implementation from bundle beyond what is already in the bundle-lib tests ## Testing *` cd bundles-maven-plugin && mvn -q install && cd .. ` must be run once to install the maven plugin * This review is code review and test code review and running only * [Test Project](https://github.com/ottobackwards/test-bundles-plugin) can be examined as a simple example of how to create bundles. * The README.md has getting started and quickstart sections with some overview of creating by hand ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? 
- [x] Have you ensured that the full suite of tests and checks have been executed in the root metron \ - [x] Have you written or updated unit tests and or integration tests to verify your changes? ### For documentation related changes: - [x] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? You can merge this pull request into a Git repository by running: $ git pull https://github.com/ottobackwards/metron fifth_bundles Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/865.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #865
[jira] [Commented] (METRON-1212) Bundles and Maven Plugin
[ https://issues.apache.org/jira/browse/METRON-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289395#comment-16289395 ] ASF GitHub Bot commented on METRON-1212: Github user ottobackwards closed the pull request at: https://github.com/apache/metron/pull/774 > Bundles and Maven Plugin > > > Key: METRON-1212 > URL: https://issues.apache.org/jira/browse/METRON-1212 > Project: Metron > Issue Type: Sub-task >Reporter: Otto Fowler >Assignee: Otto Fowler > Labels: metron-feature-canidate, > metron-feature-extensions-parsers > > The first effort will be to land the bundle system and supporting maven > plugin on master -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1347) Indexing Topology should fail tuples without a source.type
[ https://issues.apache.org/jira/browse/METRON-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289389#comment-16289389 ] ASF GitHub Bot commented on METRON-1347: Github user ottobackwards commented on the issue: https://github.com/apache/metron/pull/863 The minimum required fields, as far as I can see right now are source.type, original_string and timestamp. Given the use case for this is something that has skipped the parser topology, we should validate those. If we think the same can be done for indexing, then we should use the same classes/technique there. Again, this is based on the presented use case > Indexing Topology should fail tuples without a source.type > -- > > Key: METRON-1347 > URL: https://issues.apache.org/jira/browse/METRON-1347 > Project: Metron > Issue Type: Bug >Reporter: Casey Stella > > If you are sending data into metron indexing without a source.type, which > would only happen if you're bypassing our previous topologies, we cannot > configure how we write to the indices, so the message should be explicitly > failed and reported. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1347) Indexing Topology should fail tuples without a source.type
[ https://issues.apache.org/jira/browse/METRON-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289328#comment-16289328 ] ASF GitHub Bot commented on METRON-1347: Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/863 I would like to hear feedback from @ottobackwards on other required fields but this looks good to me otherwise. > Indexing Topology should fail tuples without a source.type > -- > > Key: METRON-1347 > URL: https://issues.apache.org/jira/browse/METRON-1347 > Project: Metron > Issue Type: Bug >Reporter: Casey Stella > > If you are sending data into metron indexing without a source.type, which > would only happen if you're bypassing our previous topologies, we cannot > configure how we write to the indices, so the message should be explicitly > failed and reported. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1347) Indexing Topology should fail tuples without a source.type
[ https://issues.apache.org/jira/browse/METRON-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289341#comment-16289341 ] ASF GitHub Bot commented on METRON-1347: Github user simonellistonball commented on a diff in the pull request: https://github.com/apache/metron/pull/863#discussion_r156676868 --- Diff: metron-platform/metron-writer/src/main/java/org/apache/metron/writer/bolt/BulkMessageWriterBolt.java --- @@ -229,17 +239,30 @@ public void execute(Tuple tuple) { LOG.trace("Writing enrichment message: {}", message); WriterConfiguration writerConfiguration = configurationTransformation.apply( new IndexingWriterConfiguration(bulkMessageWriter.getName(), getConfigurations())); - if(writerConfiguration.isDefault(sensorType)) { -//want to warn, but not fail the tuple -collector.reportError(new Exception("WARNING: Default and (likely) unoptimized writer config used for " + bulkMessageWriter.getName() + " writer and sensor " + sensorType)); + if(sensorType == null) { --- End diff -- Strictly speaking that's true, but by convention original_string should be required. There is a broader topic about what should be required, but that certainly doesn't belong in a comment on a PR. > Indexing Topology should fail tuples without a source.type > -- > > Key: METRON-1347 > URL: https://issues.apache.org/jira/browse/METRON-1347 > Project: Metron > Issue Type: Bug >Reporter: Casey Stella > > If you are sending data into metron indexing without a source.type, which > would only happen if you're bypassing our previous topologies, we cannot > configure how we write to the indices, so the message should be explicitly > failed and reported. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1347) Indexing Topology should fail tuples without a source.type
[ https://issues.apache.org/jira/browse/METRON-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289338#comment-16289338 ] ASF GitHub Bot commented on METRON-1347: Github user cestella commented on a diff in the pull request: https://github.com/apache/metron/pull/863#discussion_r156676155 --- Diff: metron-platform/metron-indexing/README.md --- @@ -15,6 +15,12 @@ Indices are written in batch and the batch size and batch timeout are specified [Sensor Indexing Configuration](#sensor-indexing-configuration) via the `batchSize` and `batchTimeout` parameters. These configs are variable by sensor type. --- End diff -- So, strictly speaking messages really only require `source.type` (which I typo'd) and `timestamp` (which I should add). I'll fix that, but did I miss anything? > Indexing Topology should fail tuples without a source.type > -- > > Key: METRON-1347 > URL: https://issues.apache.org/jira/browse/METRON-1347 > Project: Metron > Issue Type: Bug >Reporter: Casey Stella > > If you are sending data into metron indexing without a source.type, which > would only happen if you're bypassing our previous topologies, we cannot > configure how we write to the indices, so the message should be explicitly > failed and reported. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
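The rule being discussed in this thread (fail a message explicitly when a required field is absent, instead of writing it with a default configuration) can be sketched as follows. This is a language-neutral illustration in Python, not the Java change made to `BulkMessageWriterBolt`; the names `REQUIRED_FIELDS`, `missing_required_fields`, and `route` are hypothetical. The required set here follows the comment above (`source.type` and `timestamp`); whether `original_string` also belongs in it is still being debated in the thread.

```python
# Assumption for this sketch: only the two fields named in the comment above.
REQUIRED_FIELDS = ("source.type", "timestamp")


def missing_required_fields(message, required=REQUIRED_FIELDS):
    """Return the required field names that are absent (or null) in message."""
    return [field for field in required if message.get(field) is None]


def route(message):
    """Decide what to do with a message arriving at the indexing step.

    Returns ("error", reason) when a required field is missing, so the
    message can be failed and reported; otherwise ("index", sensor_type).
    """
    missing = missing_required_fields(message)
    if missing:
        return ("error", "missing required field(s): " + ", ".join(missing))
    return ("index", message["source.type"])
```

For example, `route({"timestamp": 1513185119000})` yields an error naming `source.type`, while a message carrying both fields is routed by its sensor type. Keeping the required set in one tuple makes it easy to extend if the project later decides that further fields must be validated.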
[jira] [Commented] (METRON-1347) Indexing Topology should fail tuples without a source.type
[ https://issues.apache.org/jira/browse/METRON-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289332#comment-16289332 ] ASF GitHub Bot commented on METRON-1347: Github user cestella commented on a diff in the pull request: https://github.com/apache/metron/pull/863#discussion_r156675356 --- Diff: metron-platform/metron-writer/src/main/java/org/apache/metron/writer/bolt/BulkMessageWriterBolt.java --- @@ -229,17 +239,30 @@ public void execute(Tuple tuple) { LOG.trace("Writing enrichment message: {}", message); WriterConfiguration writerConfiguration = configurationTransformation.apply( new IndexingWriterConfiguration(bulkMessageWriter.getName(), getConfigurations())); - if(writerConfiguration.isDefault(sensorType)) { -//want to warn, but not fail the tuple -collector.reportError(new Exception("WARNING: Default and (likely) unoptimized writer config used for " + bulkMessageWriter.getName() + " writer and sensor " + sensorType)); + if(sensorType == null) { --- End diff -- Sure thing. Really the only two required are `timestamp` and `source.type`. Did I miss any? > Indexing Topology should fail tuples without a source.type > -- > > Key: METRON-1347 > URL: https://issues.apache.org/jira/browse/METRON-1347 > Project: Metron > Issue Type: Bug >Reporter: Casey Stella > > If you are sending data into metron indexing without a source.type, which > would only happen if you're bypassing our previous topologies, we cannot > configure how we write to the indices, so the message should be explicitly > failed and reported. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1293) NiFi Indexer(?) capability
[ https://issues.apache.org/jira/browse/METRON-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289330#comment-16289330 ] Simon Elliston Ball commented on METRON-1293: - [~ottobackwards] I'm curious about what you would get from this that would not be achieved by just hooking NiFi's ConsumeKafka process up to the indexing topic. > NiFi Indexer(?) capability > -- > > Key: METRON-1293 > URL: https://issues.apache.org/jira/browse/METRON-1293 > Project: Metron > Issue Type: New Feature >Reporter: Otto Fowler >Assignee: Otto Fowler >Priority: Trivial > > Either through an extension ( if and when they come to indexing ) or through > a normal module support for indexing to nifi input ports would allow for > flexible capabilities for systems where nifi is used. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1347) Indexing Topology should fail tuples without a source.type
[ https://issues.apache.org/jira/browse/METRON-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289326#comment-16289326 ] ASF GitHub Bot commented on METRON-1347: Github user merrimanr commented on a diff in the pull request: https://github.com/apache/metron/pull/863#discussion_r156674159 --- Diff: metron-platform/metron-writer/src/main/java/org/apache/metron/writer/bolt/BulkMessageWriterBolt.java --- @@ -229,17 +239,30 @@ public void execute(Tuple tuple) { LOG.trace("Writing enrichment message: {}", message); WriterConfiguration writerConfiguration = configurationTransformation.apply( new IndexingWriterConfiguration(bulkMessageWriter.getName(), getConfigurations())); - if(writerConfiguration.isDefault(sensorType)) { -//want to warn, but not fail the tuple -collector.reportError(new Exception("WARNING: Default and (likely) unoptimized writer config used for " + bulkMessageWriter.getName() + " writer and sensor " + sensorType)); + if(sensorType == null) { --- End diff -- @ottobackwards which fields should we validate here? > Indexing Topology should fail tuples without a source.type > -- > > Key: METRON-1347 > URL: https://issues.apache.org/jira/browse/METRON-1347 > Project: Metron > Issue Type: Bug >Reporter: Casey Stella > > If you are sending data into metron indexing without a source.type, which > would only happen if you're bypassing our previous topologies, we cannot > configure how we write to the indices, so the message should be explicitly > failed and reported. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (METRON-1244) Metron should support VPN Log Parsing
[ https://issues.apache.org/jira/browse/METRON-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16289216#comment-16289216 ] Simon Elliston Ball commented on METRON-1244: - Absolutely agreed. The challenge is getting hold of good quality sample logs that we can incorporate cleanly into integration tests. If anyone can contribute logs alone, that would be helpful, and happy to collaborate on getting the parsers done too. > Metron should support VPN Log Parsing > - > > Key: METRON-1244 > URL: https://issues.apache.org/jira/browse/METRON-1244 > Project: Metron > Issue Type: New Feature >Reporter: Otto Fowler > > VPN Log parsing is very valuable. Metron should support parsing VPN logs > from multiple vendors, and currently supported devices such as ASA if not > already. > Juniper (Pulse Secure) > openVPN > fortigate > F5 > Sonicwall > others > This support may be by grok rules or by custom parser. > We may want to expand this to custom dashboards for VPN specific fields, > extensions to metron fields for vpn class logs etc. -- This message was sent by Atlassian JIRA (v6.4.14#64029)