[GitHub] metron issue #923: METRON-1442: Split rest end points for indexing topology ...

2018-02-02 Thread MohanDV
Github user MohanDV commented on the issue:

https://github.com/apache/metron/pull/923
  
@cestella I merged your PR


---


[GitHub] metron issue #923: METRON-1442: Split rest end points for indexing topology ...

2018-02-02 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/metron/pull/923
  
Looks like you didn't quite get all of the mock test infrastructure for 
`metron-rest` set up properly.  I went ahead and submitted a PR against your 
branch to help out. :) If you can merge 
https://github.com/MohanDV/metron/pull/1 into your branch, it should fix the 
tests.
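
For anyone following along, the pattern being referred to is the standard 
Mockito-style mocking of the service layer; a minimal self-contained sketch 
(class names here are assumptions, not the actual `metron-rest` test code):

```java
import static org.junit.Assert.assertTrue;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import org.junit.Test;

public class MockedServiceTest {

  // Minimal stand-in for a metron-rest service interface (assumed name).
  interface StormAdminService {
    boolean startBatchIndexingTopology();
  }

  @Test
  public void controllerWouldDelegateToMockedService() {
    StormAdminService service = mock(StormAdminService.class);
    when(service.startBatchIndexingTopology()).thenReturn(true);

    // A controller under test calls the mock instead of a live Storm cluster.
    assertTrue(service.startBatchIndexingTopology());
  }
}
```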


---


[GitHub] metron issue #922: METRON-1441: Create complementary Solr schemas for the ma...

2018-02-02 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/922
  
I tested this in full dev using the install script in 
https://github.com/apache/metron/pull/918.  I was able to create collections 
for each schema except for "error".  For that to work properly, I had to: 

- remove `docValues="true"` from the "bytes" field type
- add the "guid" field used in other schemas

Still working on indexing data into these collections but so far so good.
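
For readers following along, the two fixes look roughly like this in the 
Solr managed schema (a sketch with assumed attributes, not the exact diff):

```xml
<!-- Sketch only: the real class/attributes in the schema may differ. -->
<!-- 1. The "bytes" field type with docValues="true" removed -->
<fieldType name="bytes" class="solr.BinaryField"/>

<!-- 2. The "guid" field added to match the other schemas -->
<field name="guid" type="string" indexed="true" stored="true" required="true"/>
```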


---


Re: [DISCUSS] Profiler Enhancement

2018-02-02 Thread Otto Fowler
You know, I am going to back this up.
I usually think of replay as replay, profiler or not, but that is not true.
Replay of data through the full pipeline (parsers/enrichment) has more
consequences and concerns, so we can drop this.
I don’t want to expand the scope of your idea.  We can reuse/refactor for
the other case (parser + enrichment) later.
Sorry.


——

So, about re-writing.
If we replay a set of data with a new version of a profile, I think it will
always have to be a new profile and not ‘replace’
the old one.   Series1, Series2, etc.?




On February 2, 2018 at 17:24:46, Nick Allen (n...@nickallen.org) wrote:

I think that is definitely a reasonable extension.

In this case would we need any additional actions to indicate that data
will be overwritten?

I am trying to think of other additional needs that this use case has over
the others.

On Feb 2, 2018 12:38 PM, "Otto Fowler"  wrote:

> Scenario 3:
> As a Security ?  I have modified a profile or parser configuration (
> replay is replay ), and I want to run the new version
> against my old data.
>
>
>
> On February 2, 2018 at 12:19:54, Nick Allen (n...@nickallen.org) wrote:
>
> I have been thinking about an enhancement to the Profiler for quite some
> time. Actually, my first pass at defining this was called "Replay
> Telemetry through Profiler" back in METRON-594 [1].
>
> I'd like to first discuss the use case to make sure we start out on the
> right foot. Here is how I would define the use cases for this
> functionality.
>
> *> Scenario 1: Model Development*
>
> As a Security Data Scientist, I want to understand the historical behaviors
> and trends of a profile that I have created so that I can understand if it
> is valuable for model building.
>
> There are two possible negative outcomes that the Security Data Scientist
> must be aware of when creating profiles.
>
>
> - The profile might have been defined incorrectly resulting in a feature
> set that does not match reality (a bug in the profile definition).
>
>
> - The profile might have been defined correctly, but the feature set
> itself has no predictive value.
>
> Analyzing the profile over archived, historical telemetry allows the
> Security Data Scientist to better mitigate both of these negative
> outcomes.
>
>
> *> Scenario 2: Model Deployment*
>
> As a Security Platform Engineer, I want to generate a profile using
> archived telemetry when I deploy a new model to production so that models
> depending on that profile can begin to function on day 1.
>
>
>
> (Q) Do these make sense? Am I missing anything? Too broad or too narrow?
>
> Once we nail down the use case(s), I'll delete the old JIRA and create a
> new JIRA with the use cases. That would give us a place to start on the
> technical details of the implementation.
>
> [1] https://issues.apache.org/jira/browse/METRON-594
>
>


Re: [DISCUSS] Profiler Enhancement

2018-02-02 Thread Nick Allen
I think that is definitely a reasonable extension.

In this case would we need any additional actions to indicate that data
will be overwritten?

I am trying to think of other additional needs that this use case has over
the others.

On Feb 2, 2018 12:38 PM, "Otto Fowler"  wrote:

> Scenario 3:
> As a Security ?  I have modified a profile or parser configuration (
> replay is replay ), and I want to run the new version
> against my old data.
>
>
>
> On February 2, 2018 at 12:19:54, Nick Allen (n...@nickallen.org) wrote:
>
> I have been thinking about an enhancement to the Profiler for quite some
> time. Actually, my first pass at defining this was called "Replay
> Telemetry through Profiler" back in METRON-594 [1].
>
> I'd like to first discuss the use case to make sure we start out on the
> right foot. Here is how I would define the use cases for this
> functionality.
>
> *> Scenario 1: Model Development*
>
> As a Security Data Scientist, I want to understand the historical
> behaviors
> and trends of a profile that I have created so that I can understand if it
> is valuable for model building.
>
> There are two possible negative outcomes that the Security Data Scientist
> must be aware of when creating profiles.
>
>
> - The profile might have been defined incorrectly resulting in a feature
> set that does not match reality (a bug in the profile definition).
>
>
> - The profile might have been defined correctly, but the feature set
> itself has no predictive value.
>
> Analyzing the profile over archived, historical telemetry allows the
> Security Data Scientist to better mitigate both of these negative
> outcomes.
>
>
> *> Scenario 2: Model Deployment*
>
> As a Security Platform Engineer, I want to generate a profile using
> archived telemetry when I deploy a new model to production so that models
> depending on that profile can begin to function on day 1.
>
>
>
> (Q) Do these make sense? Am I missing anything? Too broad or too narrow?
>
> Once we nail down the use case(s), I'll delete the old JIRA and create a
> new JIRA with the use cases. That would give us a place to start on the
> technical details of the implementation.
>
> [1] https://issues.apache.org/jira/browse/METRON-594
>
>


[GitHub] metron pull request #925: METRON-1443 Missing Critical MPack Install Instruc...

2018-02-02 Thread nickwallen
GitHub user nickwallen opened a pull request:

https://github.com/apache/metron/pull/925

METRON-1443 Missing Critical MPack Install Instruction for Ubuntu

When installing Elasticsearch with the MPack on Ubuntu, you must manually 
install the Elasticsearch repositories.  Unlike on CentOS, the MPack itself 
does not do this. 

When the development environment on Ubuntu is spun up, this step is 
performed within Ansible as a prerequisite to the MPack install.  Until this 
can be fixed to match what happens on CentOS, it needs at least to be 
documented.

I should have documented this in #903, but did not do so.  Oops.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nickwallen/metron METRON-1443

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/925.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #925


commit 0c40178494d2a12e8e4abbd43ba4f85338aa05da
Author: Nick Allen 
Date:   2018-02-02T22:12:23Z

METRON-1443 Missing Critical MPack Install Instruction for Ubuntu




---


[GitHub] metron pull request #924: METRON-1299 In MetronError tests, don't test for H...

2018-02-02 Thread ottobackwards
GitHub user ottobackwards opened a pull request:

https://github.com/apache/metron/pull/924

METRON-1299 In MetronError tests, don't test for HostName if getHostName 
wouldn't work

MetronError ignores exceptions from 
InetAddress.getLocalHost().getHostName() and leaves the field unset.

The unit test, however, assumes it would be set, and someone has logged a 
JIRA on this, since it makes the build fail.

Changed the test so that it only verifies the hostname if the lookup would 
have worked.

### Testing
- Code review
- Tests Pass

> no non-test changes in pr

```java
private void addHostname(JSONObject errorMessage) {
  try {
    errorMessage.put(ErrorFields.HOSTNAME.getName(),
        InetAddress.getLocalHost().getHostName());
  } catch (UnknownHostException ex) {
    // Leave the hostname field off if it cannot be found
  }
}
```
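
The corresponding guard in the test mirrors that lookup; roughly (a sketch 
of the approach, not the exact patch; the `"hostname"` key is the assumed 
value of `ErrorFields.HOSTNAME.getName()`):

```java
import static org.junit.Assert.assertEquals;

import java.net.InetAddress;
import java.net.UnknownHostException;
import org.json.simple.JSONObject;

// Sketch: resolve the hostname the same way MetronError does, and only
// assert on the field when that lookup can actually succeed.
private void assertHostnameIfResolvable(JSONObject errorMessage) {
  String expected = null;
  try {
    expected = InetAddress.getLocalHost().getHostName();
  } catch (UnknownHostException e) {
    // getHostName() fails here too, so MetronError leaves the field unset.
  }
  if (expected != null) {
    assertEquals(expected, errorMessage.get("hostname"));
  }
}
```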

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ottobackwards/metron error_addHost

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/924.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #924


commit cee3acba914f97ca7d2faf6e7822c97928a4e242
Author: Otto Fowler 
Date:   2018-02-02T21:57:47Z

do not test for hostName if calling hostName throws, since it will be null




---


[GitHub] metron issue #922: METRON-1441: Create complementary Solr schemas for the ma...

2018-02-02 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/metron/pull/922
  
Ok, I did the following:
* Augmented the README to point to the Solr documentation around schemas.  
Keep in mind, this is intermediate work that will feed into the "install Solr" 
work
* Added yaf and error schemas
* Renamed the test to an integration test
* Moved the data from multiline to separate files


---


[GitHub] metron issue #865: METRON-1212 The bundle System and Maven Plugin (Feature B...

2018-02-02 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/metron/pull/865
  
Hey @JonZeolla how is it going?



---


[GitHub] metron pull request #923: METRON-1442: Split rest end points for indexing to...

2018-02-02 Thread MohanDV
GitHub user MohanDV opened a pull request:

https://github.com/apache/metron/pull/923

METRON-1442: Split rest end points for indexing topology into random access 
indexing and batch indexing


## Contributor Comments
Split rest end points for indexing topology into random access indexing 
topology and batch indexing topology to support the 
start/stop/activate/deactivate/status operations on the respective topologies.
**Steps to Verify**
1. Spin up Full Dev
2. Go to Swagger at http://node1:8082/swagger-ui.html#!/storm-controller/
3. You should see the newly added REST endpoints below
(screenshot: https://user-images.githubusercontent.com/12934693/35753803-1f96b6c4-0887-11e8-9d55-6e185e34cc52.png)
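
For reviewers skimming the thread, a rough sketch of the shape such split 
endpoints take in a Spring controller (paths and class/method names here are 
illustrative assumptions, not the actual metron-rest code):

```java
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

// Illustrative only: one set of operations per topology, instead of a
// single endpoint for the combined indexing topology.
@RestController
@RequestMapping("/api/v1/storm")
public class IndexingTopologyController {

  @GetMapping("/indexing/randomaccess/start")
  public ResponseEntity<String> startRandomAccessIndexing() {
    return ResponseEntity.ok("STARTED"); // would delegate to a Storm service
  }

  @GetMapping("/indexing/batch/start")
  public ResponseEntity<String> startBatchIndexing() {
    return ResponseEntity.ok("STARTED");
  }

  // ...plus stop/activate/deactivate/status for each of the two topologies
}
```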


## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  


In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [ ] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && 
dev-utilities/build-utils/verify_licenses.sh 
  ```

- [x] Have you written or updated unit tests and or integration tests to 
verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not then run 
the following commands and the verify changes via 
`site-book/target/site/index.html`:

  ```
  cd site-book
  mvn site
  ```

 Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up 
for your personal repository such that your branches are built there before 
submitting a pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/MohanDV/metron METRON-1442

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/923.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #923


commit 59a5f26195032c33e0cf56a0abf0153cda09d71d
Author: Mohan Venkateshaiah 
Date:   2018-02-02T20:24:34Z

Split rest end points for indexing topology to support the 
start/stop/activate/deactivate/status operations on the random access and batch 
indexing topologies




---


[GitHub] metron pull request #922: METRON-1441: Create complementary Solr schemas for...

2018-02-02 Thread cestella
Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/922#discussion_r165711148
  
--- Diff: 
metron-platform/metron-solr/src/test/java/org/apache/metron/solr/schema/SchemaTranslatorTest.java
 ---
@@ -0,0 +1,188 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.solr.schema;
+
+import com.google.common.base.Splitter;
+import com.google.common.collect.Iterables;
+import org.adrianwalker.multilinestring.Multiline;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.utils.JSONUtils;
+import org.apache.metron.integration.UnableToStartException;
+import org.apache.metron.solr.integration.components.SolrComponent;
+import org.apache.metron.solr.writer.SolrWriter;
+import org.json.simple.JSONObject;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.io.PrintWriter;
+import java.io.StringWriter;
+import java.util.*;
+
+public class SchemaTranslatorTest {
+
+  /**

+{"adapter.threatinteladapter.end.ts":"1517499201357","bro_timestamp":"1517499194.7338","ip_dst_port":8080,"enrichmentsplitterbolt.splitter.end.ts":"1517499201202","enrichmentsplitterbolt.splitter.begin.ts":"1517499201200","adapter.hostfromjsonlistadapter.end.ts":"1517499201207","adapter.geoadapter.begin.ts":"1517499201209","uid":"CUrRne3iLIxXavQtci","trans_depth":143,"protocol":"http","original_string":"HTTP
 | id.orig_p:50451 method:GET request_body_len:0 id.resp_p:8080 
uri:\/api\/v1\/clusters\/metron_cluster\/services\/KAFKA\/components\/KAFKA_BROKER?fields=metrics\/kafka\/server\/BrokerTopicMetrics\/AllTopicsBytesInPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/server\/BrokerTopicMetrics\/AllTopicsBytesOutPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/server\/BrokerTopicMetrics\/AllTopicsMessagesInPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/controller\/KafkaController\/ActiveControllerCount[1484165330,1484168930,15],metrics\/kafk
 
a\/controller\/ControllerStats\/LeaderElectionRateAndTimeMs\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/controller\/ControllerStats\/UncleanLeaderElectionsPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/server\/ReplicaFetcherManager\/Replica-MaxLag[1484165330,1484168930,15],metrics\/kafka\/server\/ReplicaManager\/PartitionCount[1484165330,1484168930,15],metrics\/kafka\/server\/ReplicaManager\/UnderReplicatedPartitions[1484165330,1484168930,15],metrics\/kafka\/server\/ReplicaManager\/LeaderCount[1484165330,1484168930,15]=null_padding&_=1484168930776
 tags:[] uid:CUrRne3iLIxXavQtci referrer:http:\/\/node1:8080\/ trans_depth:143 
host:node1 id.orig_h:192.168.66.1 response_body_len:0 user_agent:Mozilla\/5.0 
(Macintosh; Intel Mac OS X 10_12_2) AppleWebKit\/537.36 (KHTML, like Gecko) 
Chrome\/55.0.2883.95 Safari\/537.36 ts:1517499194.7338 
id.resp_h:192.168.66.121","ip_dst_addr":"192.168.66.121","threatinteljoinbolt.joiner.ts":"1517499201359","host":"node1","en
 
richmentjoinbolt.joiner.ts":"1517499201212","adapter.hostfromjsonlistadapter.begin.ts":"1517499201206","threatintelsplitterbolt.splitter.begin.ts":"1517499201215","ip_src_addr":"192.168.66.1","user_agent":"Mozilla\/5.0
 (Macintosh; Intel Mac OS X 10_12_2) AppleWebKit\/537.36 (KHTML, like Gecko) 
Chrome\/55.0.2883.95 
Safari\/537.36","timestamp":1517499194733,"method":"GET","request_body_len":0,"uri":"\/api\/v1\/clusters\/metron_cluster\/services\/KAFKA\/components\/KAFKA_BROKER?fields=metrics\/kafka\/server\/BrokerTopicMetrics\/AllTopicsBytesInPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/server\/BrokerTopicMetrics\/AllTopicsBytesOutPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/server\/BrokerTopicMetrics\/AllTopicsMessagesInPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/controller\/KafkaController\/ActiveControllerCount[1484165330,1484168930,15],metrics\/kafka\/controller\/ControllerStats\/LeaderElectionRateAndTimeMs\/1MinuteRate[148416533
 

Re: [DISCUSS] Profiler Enhancement

2018-02-02 Thread Otto Fowler
Scenario 3:
As a Security ?  I have modified a profile or parser configuration (replay
is replay), and I want to run the new version
against my old data.



On February 2, 2018 at 12:19:54, Nick Allen (n...@nickallen.org) wrote:

I have been thinking about an enhancement to the Profiler for quite some
time. Actually, my first pass at defining this was called "Replay
Telemetry through Profiler" back in METRON-594 [1].

I'd like to first discuss the use case to make sure we start out on the
right foot. Here is how I would define the use cases for this
functionality.

*> Scenario 1: Model Development*

As a Security Data Scientist, I want to understand the historical behaviors
and trends of a profile that I have created so that I can understand if it
is valuable for model building.

There are two possible negative outcomes that the Security Data Scientist
must be aware of when creating profiles.


- The profile might have been defined incorrectly resulting in a feature
set that does not match reality (a bug in the profile definition).


- The profile might have been defined correctly, but the feature set
itself has no predictive value.

Analyzing the profile over archived, historical telemetry allows the
Security Data Scientist to better mitigate both of these negative
outcomes.


*> Scenario 2: Model Deployment*

As a Security Platform Engineer, I want to generate a profile using
archived telemetry when I deploy a new model to production so that models
depending on that profile can begin to function on day 1.



(Q) Do these make sense? Am I missing anything? Too broad or too narrow?

Once we nail down the use case(s), I'll delete the old JIRA and create a
new JIRA with the use cases. That would give us a place to start on the
technical details of the implementation.

[1] https://issues.apache.org/jira/browse/METRON-594


[DISCUSS] Profiler Enhancement

2018-02-02 Thread Nick Allen
I have been thinking about an enhancement to the Profiler for quite some
time.  Actually, my first pass at defining this was called "Replay
Telemetry through Profiler" back in METRON-594 [1].

I'd like to first discuss the use case to make sure we start out on the
right foot.  Here is how I would define the use cases for this
functionality.

*> Scenario 1:  Model Development*

As a Security Data Scientist, I want to understand the historical behaviors
and trends of a profile that I have created so that I can understand if it
is valuable for model building.

There are two possible negative outcomes that the Security Data Scientist
must be aware of when creating profiles.


   - The profile might have been defined incorrectly resulting in a feature
  set that does not match reality (a bug in the profile definition).


   - The profile might have been defined correctly, but the feature set
  itself has no predictive value.

Analyzing the profile over archived, historical telemetry allows the
Security Data Scientist to better mitigate both of these negative
outcomes.
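
To make the first failure mode concrete, here is a minimal profile 
definition of the kind in question (names and Stellar expressions are 
illustrative only):

```json
{
  "profiles": [
    {
      "profile": "http_requests_per_host",
      "foreach": "ip_src_addr",
      "onlyif": "protocol == 'http'",
      "init":   { "count": "0" },
      "update": { "count": "count + 1" },
      "result": "count"
    }
  ]
}
```

A bug of the first kind would be, say, an `onlyif` that never matches the 
intended traffic; the second kind is a profile that computes exactly what was 
intended but turns out to carry no predictive signal.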


*> Scenario 2:  Model Deployment*

As a  Security Platform Engineer, I want to generate a profile using
archived telemetry when I deploy a new model to production so that models
depending on that profile can begin to function on day 1.



(Q) Do these make sense?  Am I missing anything?  Too broad or too narrow?

Once we nail down the use case(s), I'll delete the old JIRA and create a
new JIRA with the use cases.  That would give us a place to start on the
technical details of the implementation.

[1] https://issues.apache.org/jira/browse/METRON-594


[GitHub] metron pull request #920: METRON-1438 Move SHELL functions from metron-manag...

2018-02-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/metron/pull/920


---


[GitHub] metron pull request #918: METRON-1436: Manually Install Solr Cloud in Full D...

2018-02-02 Thread mmiklavc
Github user mmiklavc closed the pull request at:

https://github.com/apache/metron/pull/918


---


[GitHub] metron issue #920: METRON-1438 Move SHELL functions from metron-management t...

2018-02-02 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/metron/pull/920
  
+1 Thanks @ottobackwards 


---


[GitHub] metron pull request #920: METRON-1438 Move SHELL functions from metron-manag...

2018-02-02 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/920#discussion_r165663581
  
--- Diff: 
metron-stellar/stellar-common/src/test/java/org/apache/metron/stellar/dsl/functions/ShellFunctionsTest.java
 ---
@@ -40,8 +45,8 @@
   );
 
   Context context = new Context.Builder()
-.with(Context.Capabilities.SHELL_VARIABLES , () -> variables)
-.build();
+.with(Context.Capabilities.SHELL_VARIABLES , () -> 
variables).build();
--- End diff --

Right now, I never let my IDE reformat for me.  Like you said, if we get 
the code base matching checkstyle and I can load that style into my IDE, then 
I'd gladly let it do most of the work for me.

Maybe I'll open a discuss thread.  I don't know how to handle this kind of 
thing and it happens all the time.

But for this specific scenario in your PR, it really doesn't matter; I think 
you're good to go either way.


---


[GitHub] metron pull request #922: METRON-1441: Create complementary Solr schemas for...

2018-02-02 Thread ottobackwards
Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/922#discussion_r165662614
  
--- Diff: 
metron-platform/metron-solr/src/test/java/org/apache/metron/solr/schema/SchemaTranslatorTest.java
 ---
@@ -0,0 +1,188 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.solr.schema;
+
+import com.google.common.base.Splitter;
+import com.google.common.collect.Iterables;
+import org.adrianwalker.multilinestring.Multiline;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.utils.JSONUtils;
+import org.apache.metron.integration.UnableToStartException;
+import org.apache.metron.solr.integration.components.SolrComponent;
+import org.apache.metron.solr.writer.SolrWriter;
+import org.json.simple.JSONObject;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.io.PrintWriter;
+import java.io.StringWriter;
+import java.util.*;
+
+public class SchemaTranslatorTest {
+
+  /**

+{"adapter.threatinteladapter.end.ts":"1517499201357","bro_timestamp":"1517499194.7338","ip_dst_port":8080,"enrichmentsplitterbolt.splitter.end.ts":"1517499201202","enrichmentsplitterbolt.splitter.begin.ts":"1517499201200","adapter.hostfromjsonlistadapter.end.ts":"1517499201207","adapter.geoadapter.begin.ts":"1517499201209","uid":"CUrRne3iLIxXavQtci","trans_depth":143,"protocol":"http","original_string":"HTTP
 | id.orig_p:50451 method:GET request_body_len:0 id.resp_p:8080 
uri:\/api\/v1\/clusters\/metron_cluster\/services\/KAFKA\/components\/KAFKA_BROKER?fields=metrics\/kafka\/server\/BrokerTopicMetrics\/AllTopicsBytesInPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/server\/BrokerTopicMetrics\/AllTopicsBytesOutPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/server\/BrokerTopicMetrics\/AllTopicsMessagesInPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/controller\/KafkaController\/ActiveControllerCount[1484165330,1484168930,15],metrics\/kafk
 
a\/controller\/ControllerStats\/LeaderElectionRateAndTimeMs\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/controller\/ControllerStats\/UncleanLeaderElectionsPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/server\/ReplicaFetcherManager\/Replica-MaxLag[1484165330,1484168930,15],metrics\/kafka\/server\/ReplicaManager\/PartitionCount[1484165330,1484168930,15],metrics\/kafka\/server\/ReplicaManager\/UnderReplicatedPartitions[1484165330,1484168930,15],metrics\/kafka\/server\/ReplicaManager\/LeaderCount[1484165330,1484168930,15]=null_padding&_=1484168930776
 tags:[] uid:CUrRne3iLIxXavQtci referrer:http:\/\/node1:8080\/ trans_depth:143 
host:node1 id.orig_h:192.168.66.1 response_body_len:0 user_agent:Mozilla\/5.0 
(Macintosh; Intel Mac OS X 10_12_2) AppleWebKit\/537.36 (KHTML, like Gecko) 
Chrome\/55.0.2883.95 Safari\/537.36 ts:1517499194.7338 
id.resp_h:192.168.66.121","ip_dst_addr":"192.168.66.121","threatinteljoinbolt.joiner.ts":"1517499201359","host":"node1","en
 
richmentjoinbolt.joiner.ts":"1517499201212","adapter.hostfromjsonlistadapter.begin.ts":"1517499201206","threatintelsplitterbolt.splitter.begin.ts":"1517499201215","ip_src_addr":"192.168.66.1","user_agent":"Mozilla\/5.0
 (Macintosh; Intel Mac OS X 10_12_2) AppleWebKit\/537.36 (KHTML, like Gecko) 
Chrome\/55.0.2883.95 
Safari\/537.36","timestamp":1517499194733,"method":"GET","request_body_len":0,"uri":"\/api\/v1\/clusters\/metron_cluster\/services\/KAFKA\/components\/KAFKA_BROKER?fields=metrics\/kafka\/server\/BrokerTopicMetrics\/AllTopicsBytesInPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/server\/BrokerTopicMetrics\/AllTopicsBytesOutPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/server\/BrokerTopicMetrics\/AllTopicsMessagesInPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/controller\/KafkaController\/ActiveControllerCount[1484165330,1484168930,15],metrics\/kafka\/controller\/ControllerStats\/LeaderElectionRateAndTimeMs\/1MinuteRate[148416533
 

[GitHub] metron pull request #920: METRON-1438 Move SHELL functions from metron-manag...

2018-02-02 Thread ottobackwards
Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/920#discussion_r165662017
  
--- Diff: 
metron-stellar/stellar-common/src/test/java/org/apache/metron/stellar/dsl/functions/ShellFunctionsTest.java
 ---
@@ -40,8 +45,8 @@
   );
 
   Context context = new Context.Builder()
-.with(Context.Capabilities.SHELL_VARIABLES , () -> variables)
-.build();
+.with(Context.Capabilities.SHELL_VARIABLES , () -> 
variables).build();
--- End diff --

How do you have your formatting preferences set to get the above?


---


[GitHub] metron pull request #920: METRON-1438 Move SHELL functions from metron-manag...

2018-02-02 Thread ottobackwards
Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/metron/pull/920#discussion_r165661676
  
--- Diff: 
metron-stellar/stellar-common/src/test/java/org/apache/metron/stellar/dsl/functions/ShellFunctionsTest.java
 ---
@@ -40,8 +45,8 @@
   );
 
   Context context = new Context.Builder()
-.with(Context.Capabilities.SHELL_VARIABLES , () -> variables)
-.build();
+.with(Context.Capabilities.SHELL_VARIABLES , () -> 
variables).build();
--- End diff --

I didn't reformat that like that on purpose.  I had a period where I was 
setting the CONSOLE capability.  When I removed it, it just worked out like 
this.

Even if I select and format in IntelliJ, it doesn't change the .build() to 
the next line.

I think what you have above is fine.  I would like it to 'just' happen if I 
format code though, since it is easy for things to slip through.

It is tough right now, because so much of the codebase isn't formatted to 
checkstyle, and I don't think we want every PR to include a lot of formatting 
changes.



---


[GitHub] metron pull request #917: METRON-1435: Management UI cannot save json object...

2018-02-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/metron/pull/917


---


[GitHub] metron issue #918: METRON-1436: Manually Install Solr Cloud in Full Dev

2018-02-02 Thread justinleet
Github user justinleet commented on the issue:

https://github.com/apache/metron/pull/918
  
Thanks for the updates.  I'm +1 on including this in the feature branch.


---


Re: [DISCUSS] Persistence store for user profile settings

2018-02-02 Thread Nick Allen
> Glad you agree with me that this isn’t HBase scale… it’s clearly not. I
would never suggest introducing HBase for something like this, but since
it’s there.

Ah, gotcha.  Misunderstood your statement.



On Fri, Feb 2, 2018 at 9:01 AM Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> Glad you agree with me that this isn’t HBase scale… it’s clearly not. I
> would never suggest introducing HBase for something like this, but since
> it’s there.
>
> On the idea of using the Ambari RDBMS for the same basis of it being
> there, I see your point. That said, it can be postgres, sql server, mysql,
> maria, oracle… various. Yes we have an ORM, but those are not nearly as
> magic as they claim, and upgrade / schema evolution of an RDBMS often
> involves some sort of platform dependent SQL migration in my experience. I
> would suggest that supporting that range of options is not a good idea for
> us. The Ambari project also pretty much reserve the right to blow away that
> infrastructure in upgrades (which is fair enough). So relying on there
> being an RDBMS owned by another component is not something I would
> necessarily say was a clean choice.
>
> Simon
>
> > On 2 Feb 2018, at 13:50, Nick Allen  wrote:
> >
> > I fall marginally on the side of an RDBMS.  There is definitely a case to
> > be made on both sides, but I'll point out a few things for the RDBMS.
> >
> >
> > (1) Flexibility.  Using an RDBMS is going to provide us with much greater
> > flexibility going forward.  We really don't know what the specific use
> > cases will be, but I am willing to bet they are user-focused
> (preferences,
> > etc).  The type of use cases that most web applications use an RDBMS for.
> >
> >
> >> If anything I would like to see the current RDBMS dependency come out...
> >
> > (2) Don't we already have an RDBMS requirement for Ambari?  That's a
> > dependency that we do not control.
> >
> >
> >> ... hbase seems a good option (because we already have it there, it
> would
> > be kinda crazy at this scale if we didn’t already have it)
> >
> > (3) In this scenario, the RDBMS would not scale proportionally with the
> > amount of telemetry, it would scale based on usage; primarily the number
> of
> > users.  This is not "big data" scale.  I don't think we can make the case
> > for HBase based on scale here.
> >
> >
> >> We would also end up with, as Mike points out, a whole new set of disk
> > deployment patterns and a bunch of additional DBA ops process
> requirements
> > for every install.
> >
> > (4) Most users that need HA/DR (and other 'advanced stuff'), are
> > enterprises and organizations that are already very familiar with RDBMS
> > solutions and have the infrastructure in place to manage those.  For
> users
> > that don't need HA/DR, just use the DB that gets spun-up with Ambari.
> >
> >
> >
> >
> >
> > On Fri, Feb 2, 2018 at 7:17 AM Simon Elliston Ball <
> > si...@simonellistonball.com> wrote:
> >
> >> Introducing a RDBMS to the stack seems unnecessary for this.
> >>
> >> If we consider the data access patterns for user profiles, we are
> unlikely
> >> to query into them, or indeed do anything other than look them up, or
> write
> >> them out by a username key. To that end, using an ORM to translate a
> >> nested config object into a load of tables seems to introduce complexity
> >> and brittleness we then have to take away through relying on relational
> >> consistency models. We would also end up with, as Mike points out, a
> whole
> >> new set of disk deployment patterns and a bunch of additional DBA ops process
> >> requirements for every install.
> >>
> >> Since the access pattern is almost entirely key => value, hbase seems a
> >> good option (because we already have it there, it would be kinda crazy
> at
> >> this scale if we didn’t already have it) or arguably zookeeper, but that
> >> might be at the other end of the scale argument. I’d even go as far as
> to
> >> suggest files on HDFS to keep it simple.
> >>
> >> Simon
> >>
> >>> On 1 Feb 2018, at 23:24, Michael Miklavcic <
> michael.miklav...@gmail.com>
> >> wrote:
> >>>
> >>> Personally, I'd be in favor of something like Maria DB as an open
> source
> >>> repo. Or any other ansi sql store. On the positive side, it should mesh
> >>> seamlessly with ORM tools. And the schema for this should be pretty
> >>> vanilla, I'd imagine. I might even consider skipping ORM for straight
> >> JDBC
> >>> and simple command scripts in Java for something this small. I'm not
> >>> worried so much about migrations of this sort. Large scale DBs can get
> >>> involved with major schema changes, but that's usually when the
> datastore
> >> is
> >>> a massive set of tables with complex relationships, at least in my
> >>> experience.
> >>>
> >>> We could also use hbase, which probably wouldn't be that hard either,
> but
> >>> there may be more boilerplate to write for the client as compared to
> >>> standard SQL. But I'm assuming we could reuse a fair amount of 

[GitHub] metron issue #922: METRON-1441: Create complementary Solr schemas for the ma...

2018-02-02 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/metron/pull/922
  
@ottobackwards Very likely these schema files won't stay in this spot, but 
the final resting spot won't be apparent until we figure out how to 
automatically apply the schemas.  Treat this PR as just unlocking progress for 
downstream PRs (like correcting SolrWriter to write to Solr again).


---


[GitHub] metron pull request #922: METRON-1441: Create complementary Solr schemas for...

2018-02-02 Thread cestella
Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/922#discussion_r165656636
  
--- Diff: 
metron-platform/metron-solr/src/test/java/org/apache/metron/solr/schema/SchemaTranslatorTest.java
 ---
@@ -0,0 +1,188 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.solr.schema;
+
+import com.google.common.base.Splitter;
+import com.google.common.collect.Iterables;
+import org.adrianwalker.multilinestring.Multiline;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.utils.JSONUtils;
+import org.apache.metron.integration.UnableToStartException;
+import org.apache.metron.solr.integration.components.SolrComponent;
+import org.apache.metron.solr.writer.SolrWriter;
+import org.json.simple.JSONObject;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.io.PrintWriter;
+import java.io.StringWriter;
+import java.util.*;
+
+public class SchemaTranslatorTest {
+
+  /**

+{"adapter.threatinteladapter.end.ts":"1517499201357","bro_timestamp":"1517499194.7338","ip_dst_port":8080,"enrichmentsplitterbolt.splitter.end.ts":"1517499201202","enrichmentsplitterbolt.splitter.begin.ts":"1517499201200","adapter.hostfromjsonlistadapter.end.ts":"1517499201207","adapter.geoadapter.begin.ts":"1517499201209","uid":"CUrRne3iLIxXavQtci","trans_depth":143,"protocol":"http","original_string":"HTTP
 | id.orig_p:50451 method:GET request_body_len:0 id.resp_p:8080 
uri:\/api\/v1\/clusters\/metron_cluster\/services\/KAFKA\/components\/KAFKA_BROKER?fields=metrics\/kafka\/server\/BrokerTopicMetrics\/AllTopicsBytesInPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/server\/BrokerTopicMetrics\/AllTopicsBytesOutPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/server\/BrokerTopicMetrics\/AllTopicsMessagesInPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/controller\/KafkaController\/ActiveControllerCount[1484165330,1484168930,15],metrics\/kafk
 
a\/controller\/ControllerStats\/LeaderElectionRateAndTimeMs\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/controller\/ControllerStats\/UncleanLeaderElectionsPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/server\/ReplicaFetcherManager\/Replica-MaxLag[1484165330,1484168930,15],metrics\/kafka\/server\/ReplicaManager\/PartitionCount[1484165330,1484168930,15],metrics\/kafka\/server\/ReplicaManager\/UnderReplicatedPartitions[1484165330,1484168930,15],metrics\/kafka\/server\/ReplicaManager\/LeaderCount[1484165330,1484168930,15]=null_padding&_=1484168930776
 tags:[] uid:CUrRne3iLIxXavQtci referrer:http:\/\/node1:8080\/ trans_depth:143 
host:node1 id.orig_h:192.168.66.1 response_body_len:0 user_agent:Mozilla\/5.0 
(Macintosh; Intel Mac OS X 10_12_2) AppleWebKit\/537.36 (KHTML, like Gecko) 
Chrome\/55.0.2883.95 Safari\/537.36 ts:1517499194.7338 
id.resp_h:192.168.66.121","ip_dst_addr":"192.168.66.121","threatinteljoinbolt.joiner.ts":"1517499201359","host":"node1","en
 
richmentjoinbolt.joiner.ts":"1517499201212","adapter.hostfromjsonlistadapter.begin.ts":"1517499201206","threatintelsplitterbolt.splitter.begin.ts":"1517499201215","ip_src_addr":"192.168.66.1","user_agent":"Mozilla\/5.0
 (Macintosh; Intel Mac OS X 10_12_2) AppleWebKit\/537.36 (KHTML, like Gecko) 
Chrome\/55.0.2883.95 
Safari\/537.36","timestamp":1517499194733,"method":"GET","request_body_len":0,"uri":"\/api\/v1\/clusters\/metron_cluster\/services\/KAFKA\/components\/KAFKA_BROKER?fields=metrics\/kafka\/server\/BrokerTopicMetrics\/AllTopicsBytesInPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/server\/BrokerTopicMetrics\/AllTopicsBytesOutPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/server\/BrokerTopicMetrics\/AllTopicsMessagesInPerSec\/1MinuteRate[1484165330,1484168930,15],metrics\/kafka\/controller\/KafkaController\/ActiveControllerCount[1484165330,1484168930,15],metrics\/kafka\/controller\/ControllerStats\/LeaderElectionRateAndTimeMs\/1MinuteRate[148416533
 

[GitHub] metron pull request #922: METRON-1441: Create complementary Solr schemas for...

2018-02-02 Thread cestella
Github user cestella commented on a diff in the pull request:

https://github.com/apache/metron/pull/922#discussion_r165656511
  
--- Diff: 
metron-platform/metron-solr/src/test/java/org/apache/metron/solr/schema/SchemaTranslatorTest.java
 ---
@@ -0,0 +1,188 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.solr.schema;
+
+import com.google.common.base.Splitter;
+import com.google.common.collect.Iterables;
+import org.adrianwalker.multilinestring.Multiline;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.utils.JSONUtils;
+import org.apache.metron.integration.UnableToStartException;
+import org.apache.metron.solr.integration.components.SolrComponent;
+import org.apache.metron.solr.writer.SolrWriter;
+import org.json.simple.JSONObject;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.io.PrintWriter;
+import java.io.StringWriter;
+import java.util.*;
+
--- End diff --

Yes, I absolutely can.


---


[GitHub] metron pull request #920: METRON-1438 Move SHELL functions from metron-manag...

2018-02-02 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/920#discussion_r165654991
  
--- Diff: 
metron-stellar/stellar-common/src/main/java/org/apache/metron/stellar/common/shell/cli/PausableInput.java
 ---
@@ -36,8 +37,8 @@
  *
  */
 public class PausableInput extends InputStream {
-  InputStream in = System.in;
-  boolean paused = false;
+  private InputStream in = System.in;
+  private AtomicBoolean paused = new AtomicBoolean(false);
--- End diff --

What problem were you solving here @ottobackwards?  Is this bit accessed by 
multiple threads?
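
For context, a minimal sketch of the visibility problem an AtomicBoolean (or 
a volatile flag) addresses when one thread pauses a stream that another 
thread is reading (illustrative only, not the PausableInput code):

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class PauseFlagDemo {
  // A plain, non-volatile boolean written by one thread may never become
  // visible to another; AtomicBoolean guarantees cross-thread visibility
  // and atomic read-modify-write operations.
  private final AtomicBoolean paused = new AtomicBoolean(false);

  public void pause()  { paused.set(true); }
  public void resume() { paused.set(false); }

  public int read() {
    // The reader thread reliably observes the controller thread's latest set().
    return paused.get() ? -1 : 0; // -1 signals "paused" in this sketch
  }
}
```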


---


[GitHub] metron pull request #920: METRON-1438 Move SHELL functions from metron-manag...

2018-02-02 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/metron/pull/920#discussion_r165653517
  
--- Diff: 
metron-stellar/stellar-common/src/test/java/org/apache/metron/stellar/dsl/functions/ShellFunctionsTest.java
 ---
@@ -40,8 +45,8 @@
   );
 
   Context context = new Context.Builder()
-.with(Context.Capabilities.SHELL_VARIABLES , () -> variables)
-.build();
+.with(Context.Capabilities.SHELL_VARIABLES , () -> 
variables).build();
--- End diff --

I am actually interested in what direction we as a project should be taking 
with these types of fluent, chained statements.  I run across this all the time 
and I want to know the 'right' way that I should be doing it for the project.

IMHO, the way it was (separated by a line break) is more readable.  
Meaning, a long set of chained statements should be separated by line breaks.  
For example...
```
  result = new ProfileMeasurement()
  .withProfileName(profileName)
  .withEntity(entity)
  .withGroups(groups)
  .withPeriod(period)
  .withProfileValue(profileValue)
  .withTriageValues(triageValues)
  .withDefinition(definition);
```

But, of course, in terms of code style my opinion doesn't matter.  It is 
all about our style guidelines. 
 What do the Google code style guidelines say?   

Doesn't 
[this](https://google.github.io/styleguide/javaguide.html#s4.5.1-line-wrapping-where-to-break)
 support what I have said above about line breaks in this case?  




---


Re: [DISCUSS] Persistence store for user profile settings

2018-02-02 Thread Simon Elliston Ball
Glad you agree with me that this isn’t HBase scale… it’s clearly not. I would 
never suggest introducing HBase for something like this, but since it’s there.

On the idea of using the Ambari RDBMS for the same basis of it being there, I 
see your point. That said, it can be postgres, sql server, mysql, maria, 
oracle… various. Yes we have an ORM, but those are not nearly as magic as they 
claim, and upgrade / schema evolution of an RDBMS often involves some sort of 
platform dependent SQL migration in my experience. I would suggest that 
supporting that range of options is not a good idea for us. The Ambari project 
also pretty much reserve the right to blow away that infrastructure in upgrades 
(which is fair enough). So relying on there being an RDBMS owned by another 
component is not something I would necessarily say was a clean choice. 

Simon

> On 2 Feb 2018, at 13:50, Nick Allen  wrote:
> 
> I fall marginally on the side of an RDBMS.  There is definitely a case to
> be made on both sides, but I'll point out a few things for the RDBMS.
> 
> 
> (1) Flexibility.  Using an RDBMS is going to provide us with much greater
> flexibility going forward.  We really don't know what the specific use
> cases will be, but I am willing to bet they are user-focused (preferences,
> etc).  The type of use cases that most web applications use an RDBMS for.
> 
> 
>> If anything I would like to see the current RDBMS dependency come out...
> 
> (2) Don't we already have an RDBMS requirement for Ambari?  That's a
> dependency that we do not control.
> 
> 
>> ... hbase seems a good option (because we already have it there, it would
> be kinda crazy at this scale if we didn’t already have it)
> 
> (3) In this scenario, the RDBMS would not scale proportionally with the
> amount of telemetry, it would scale based on usage; primarily the number of
> users.  This is not "big data" scale.  I don't think we can make the case
> for HBase based on scale here.
> 
> 
>> We would also end up with, as Mike points out, a whole new set of disk
> deployment patterns and a bunch of additional DBA ops process requirements
> for every install.
> 
> (4) Most users that need HA/DR (and other 'advanced stuff'), are
> enterprises and organizations that are already very familiar with RDBMS
> solutions and have the infrastructure in place to manage those.  For users
> that don't need HA/DR, just use the DB that gets spun-up with Ambari.
> 
> 
> 
> 
> 
> On Fri, Feb 2, 2018 at 7:17 AM Simon Elliston Ball <
> si...@simonellistonball.com> wrote:
> 
>> Introducing a RDBMS to the stack seems unnecessary for this.
>> 
>> If we consider the data access patterns for user profiles, we are unlikely
>> to query into them, or indeed do anything other than look them up, or write
>> them out by a username key. To that end, using an ORM to translate a
>> nested config object into a load of tables seems to introduce complexity
>> and brittleness we then have to take away through relying on relational
>> consistency models. We would also end up with, as Mike points out, a whole
>> new set of disk deployment patterns and a bunch of additional DBA ops process
>> requirements for every install.
>> 
>> Since the access pattern is almost entirely key => value, hbase seems a
>> good option (because we already have it there, it would be kinda crazy at
>> this scale if we didn’t already have it) or arguably zookeeper, but that
>> might be at the other end of the scale argument. I’d even go as far as to
>> suggest files on HDFS to keep it simple.
>> 
>> Simon
>> 
>>> On 1 Feb 2018, at 23:24, Michael Miklavcic 
>> wrote:
>>> 
>>> Personally, I'd be in favor of something like Maria DB as an open source
>>> repo. Or any other ansi sql store. On the positive side, it should mesh
>>> seamlessly with ORM tools. And the schema for this should be pretty
>>> vanilla, I'd imagine. I might even consider skipping ORM for straight
>> JDBC
>>> and simple command scripts in Java for something this small. I'm not
>>> worried so much about migrations of this sort. Large scale DBs can get
>>> involved with major schema changes, but that's usually when the datastore
>> is
>>> a massive set of tables with complex relationships, at least in my
>>> experience.
>>> 
>>> We could also use hbase, which probably wouldn't be that hard either, but
>>> there may be more boilerplate to write for the client as compared to
>>> standard SQL. But I'm assuming we could reuse a fair amount of existing
>>> code from our enrichments. One additional reason in favor of hbase might
>> be
>>> data replication. For a SQL instance we'd probably recommend a RAID store
>>> or backup procedure, but we get that pretty easy with hbase too.
>>> 
>>> On Feb 1, 2018 2:45 PM, "Casey Stella"  wrote:
>>> 
 So, I'll answer your question with some questions:
 
  - No matter the data store we use upgrading will take some care,
>> right?
  - Do we currently 

Re: [DISCUSS] Persistence store for user profile settings

2018-02-02 Thread Nick Allen
I fall marginally on the side of an RDBMS.  There is definitely a case to
be made on both sides, but I'll point out a few things for the RDBMS.


(1) Flexibility.  Using an RDBMS is going to provide us with much greater
flexibility going forward.  We really don't know what the specific use
cases will be, but I am willing to bet they are user-focused (preferences,
etc).  The type of use cases that most web applications use an RDBMS for.


> If anything I would like to see the current RDBMS dependency come out...

(2) Don't we already have an RDBMS requirement for Ambari?  That's a
dependency that we do not control.


> ... hbase seems a good option (because we already have it there, it would
be kinda crazy at this scale if we didn’t already have it)

(3) In this scenario, the RDBMS would not scale proportionally with the
amount of telemetry, it would scale based on usage; primarily the number of
users.  This is not "big data" scale.  I don't think we can make the case
for HBase based on scale here.


> We would also end up with, as Mike points out, a whole new set of disk
deployment patterns and a bunch of additional DBA ops process requirements
for every install.

(4) Most users that need HA/DR (and other 'advanced stuff'), are
enterprises and organizations that are already very familiar with RDBMS
solutions and have the infrastructure in place to manage those.  For users
that don't need HA/DR, just use the DB that gets spun-up with Ambari.
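
For point (1), the existing ORM support makes user-scoped storage fairly 
mechanical; a minimal sketch (entity and field names are assumptions for 
illustration, not an actual Metron schema):

```java
import javax.persistence.Entity;
import javax.persistence.Id;

// Hypothetical JPA entity for per-user UI preferences.
@Entity
public class UserSettings {

  @Id
  private String username;    // natural key: one row per user

  private String facetFields; // e.g. a JSON-encoded list of facet fields

  public String getUsername() { return username; }
  public void setUsername(String username) { this.username = username; }

  public String getFacetFields() { return facetFields; }
  public void setFacetFields(String facetFields) { this.facetFields = facetFields; }
}
```

Per-user rows like this grow with the number of users, not with telemetry 
volume, which is the scale point above.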





On Fri, Feb 2, 2018 at 7:17 AM Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> Introducing a RDBMS to the stack seems unnecessary for this.
>
> If we consider the data access patterns for user profiles, we are unlikely
> to query into them, or indeed do anything other than look them up, or write
> them out by a username key. To that end, using an ORM to translate a
> nested config object into a load of tables seems to introduce complexity
> and brittleness we then have to take away through relying on relational
> consistency models. We would also end up with, as Mike points out, a whole
> new set of disk deployment patterns and a bunch of additional DBA ops process
> requirements for every install.
>
> Since the access pattern is almost entirely key => value, hbase seems a
> good option (because we already have it there, it would be kinda crazy at
> this scale if we didn’t already have it) or arguably zookeeper, but that
> might be at the other end of the scale argument. I’d even go as far as to
> suggest files on HDFS to keep it simple.
>
> Simon
>
> > On 1 Feb 2018, at 23:24, Michael Miklavcic 
> wrote:
> >
> > Personally, I'd be in favor of something like Maria DB as an open source
> > repo. Or any other ansi sql store. On the positive side, it should mesh
> > seamlessly with ORM tools. And the schema for this should be pretty
> > vanilla, I'd imagine. I might even consider skipping ORM for straight
> JDBC
> > and simple command scripts in Java for something this small. I'm not
> > worried so much about migrations of this sort. Large scale DBs can get
> > involved with major schema changes, but that's usually when the datastore
> is
> > a massive set of tables with complex relationships, at least in my
> > experience.
> >
> > We could also use hbase, which probably wouldn't be that hard either, but
> > there may be more boilerplate to write for the client as compared to
> > standard SQL. But I'm assuming we could reuse a fair amount of existing
> > code from our enrichments. One additional reason in favor of hbase might
> be
> > data replication. For a SQL instance we'd probably recommend a RAID store
> > or backup procedure, but we get that pretty easy with hbase too.
> >
> > On Feb 1, 2018 2:45 PM, "Casey Stella"  wrote:
> >
> >> So, I'll answer your question with some questions:
> >>
> >>   - No matter the data store we use upgrading will take some care,
> right?
> >>   - Do we currently depend on a RDBMS anywhere?  I want to say that we
> do
> >>   in the REST layer already, right?
> >>   - If we don't use a RDBMs, what's the other option?  What are the pros
> >>   and cons?
> >>   - Have we considered non-server offline persistent solutions (e.g.
> >>   https://www.html5rocks.com/en/features/storage)?
> >>
> >>
> >>
> >> On Thu, Feb 1, 2018 at 9:11 AM, Ryan Merriman 
> wrote:
> >>
> >>> There is currently a PR up for review that allows a user to configure and
> >>> save the list of facet fields that appear in the left column of the Alerts
> >>> UI:  https://github.com/apache/metron/pull/853.  The REST layer has ORM
> >>> support which means we can store those in a relational database.
> >>>
> >>> However I'm not 100% sure this is the best place to keep this.  As we add
> >>> more use cases like this the backing tables in the RDBMS will need to be
> >>> managed.  This could make upgrading more tedious and error-prone.  Is there
> >>> a better way to store this, assuming we can leverage a component that's
> >>> already included in our stack?

Re: [DISCUSS] Persistence store for user profile settings

2018-02-02 Thread Simon Elliston Ball
Couldn’t agree with you more, Otto! On the perms / ACLs / AXOs / groups / users
concerns, though, there are other Apache projects (such as Ranger) which have
already done a lot of the hard thinking and the architecture / data structure /
admin UI and persistence pieces for us, so I’d say we lean on them before
designing our own approach to IAM.

Simon

> On 2 Feb 2018, at 13:22, Otto Fowler  wrote:
> 
> Fair enough, I don’t have a preference.  I think my point is that we need to
> understand the use cases we can think of more, especially if we are going to
> be having permissions, grouping, and CRUD around that, and preloading, before
> just throwing everything in an RDBMS -or- HBase.
> 
> 
> 
> On February 2, 2018 at 08:08:24, Simon Elliston Ball
> (si...@simonellistonball.com) wrote:
> 
>> True, and that is a requirement I’ve heard a lot (standard views or field 
>> sets in shared sets of saved search for example). That would definitely rule 
>> out sticking with the current approach (browser local storage, per Casey’s 
>> suggestion below). 
>> 
>> That said, I’m not sure that changes my views on RDBMS. There is an argument
>> that a single query from RDBMS could return a set of group prefs with a user
>> overlay, but that’s not that much better than pulling groups and overwriting
>> the maps client-side with the user’s, from the key value store. We’re not
>> talking about huge amounts of preference data here. I could be swayed the
>> other way if we were to use the RDBMS as a canonical store for user and group
>> information (we use it for users right now, in a really not great way) but I
>> would much rather see us plug in to the Hadoop ecosystem and use something
>> like Ranger to sync users, or an LDAP source directly for user and group
>> data, because I suspect no one wants to have to administer a separate user
>> database for Metron and open up the resulting IAM security hole we currently
>> have (on that, let’s at least stop storing plain text passwords!) /rant.
>> 
>> If anything I would like to see the current RDBMS dependency come out to 
>> reduce the overall complexity, unless we have a use case that genuinely 
>> benefits from a normalised data structure, or from SQL access patterns. 
>> 
>> In short, I would still go with LDAP / Ranger for users and groups, and
>> instead of adding an RDBMS, using group prefs and user prefs in the existing
>> KV store (HBase) to reduce the operational maintenance burden on the
>> platform.
>> 
>> Simon
>> 
>>> On 2 Feb 2018, at 12:50, Otto Fowler  wrote:
>>> 
>>> It is not uncommon to want to have ‘shared’ preferences or setups.  Think
>>> of shared dashboards or queries vs. personal versions in Jira.  Would RDBMS
>>> help with that?
>>> 
>>> 
>>> 
>>> On February 2, 2018 at 07:17:04, Simon Elliston Ball
>>> (si...@simonellistonball.com) wrote:
>>> 
Introducing an RDBMS to the stack seems unnecessary for this.

If we consider the data access patterns for user profiles, we are unlikely
to query into them, or indeed do anything other than look them up, or
write them out by a username key. To that end, using an ORM to translate a
nested config object into a load of tables seems to introduce complexity
and brittleness we then have to take away through relying on relational
consistency models. We would also end up with, as Mike points out, a whole
new set of disk deployment patterns and a bunch of additional DBA ops
process requirements for every install.

Since the access pattern is almost entirely key => value, HBase seems a
good option (because we already have it there, it would be kinda crazy at
this scale if we didn’t already have it) or arguably ZooKeeper, but that
might be at the other end of the scale argument. I’d even go as far as to
suggest files on HDFS to keep it simple.
 
 Simon 
 
> On 1 Feb 2018, at 23:24, Michael Miklavcic  wrote:
 >  
> Personally, I'd be in favor of something like MariaDB as an open source
> repo. Or any other ANSI SQL store. On the positive side, it should mesh
> seamlessly with ORM tools. And the schema for this should be pretty
> vanilla, I'd imagine. I might even consider skipping ORM for straight JDBC
> and simple command scripts in Java for something this small. I'm not
> worried so much about migrations of this sort. Large scale DBs can get
> involved with major schema changes, but that's usually when the datastore
> is a massive set of tables with complex relationships, at least in my
> experience.
 >  

Re: [DISCUSS] Persistence store for user profile settings

2018-02-02 Thread Otto Fowler
Fair enough, I don’t have a preference.  I think my point is that we need
to understand the use cases we can think of more, especially if we are
going to be having permissions, grouping, and CRUD around that, and
preloading, before just throwing everything in an RDBMS -or- HBase.



On February 2, 2018 at 08:08:24, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

True, and that is a requirement I’ve heard a lot (standard views or field
sets in shared sets of saved search for example). That would definitely
rule out sticking with the current approach (browser local storage, per
Casey’s suggestion below).

That said, I’m not sure that changes my views on RDBMS. There is an
argument that a single query from RDBMS could return a set of group prefs
with a user overlay, but that’s not that much better than pulling groups
and overwriting the maps client-side with the user’s, from the key value
store. We’re not talking about huge amounts of preference data here. I
could be swayed the other way if we were to use the RDBMS as a canonical
store for user and group information (we use it for users right now, in a
really not great way) but I would much rather see us plug in to the Hadoop
ecosystem and use something like Ranger to sync users, or an LDAP source
directly for user and group data, because I suspect no one wants to have to
administer a separate user database for Metron and open up the resulting
IAM security hole we currently have (on that, let’s at least stop storing
plain text passwords!) /rant.

If anything I would like to see the current RDBMS dependency come out to
reduce the overall complexity, unless we have a use case that genuinely
benefits from a normalised data structure, or from SQL access patterns.

In short, I would still go with LDAP / Ranger for users and groups, and
instead of adding an RDBMS, using group prefs and user prefs in the
existing KV store (HBase) to reduce the operational maintenance burden on
the platform.

Simon

On 2 Feb 2018, at 12:50, Otto Fowler  wrote:

It is not uncommon to want to have ‘shared’ preferences or setups.  Think
of shared dashboards or queries vs. personal versions in Jira.  Would RDBMS
help with that?



On February 2, 2018 at 07:17:04, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

Introducing an RDBMS to the stack seems unnecessary for this.

If we consider the data access patterns for user profiles, we are unlikely
to query into them, or indeed do anything other than look them up, or write
them out by a username key. To that end, using an ORM to translate a
nested config object into a load of tables seems to introduce complexity
and brittleness we then have to take away through relying on relational
consistency models. We would also end up with, as Mike points out, a whole
new set of disk deployment patterns and a bunch of additional DBA ops
process requirements for every install.

Since the access pattern is almost entirely key => value, HBase seems a
good option (because we already have it there, it would be kinda crazy at
this scale if we didn’t already have it) or arguably ZooKeeper, but that
might be at the other end of the scale argument. I’d even go as far as to
suggest files on HDFS to keep it simple.

Simon

> On 1 Feb 2018, at 23:24, Michael Miklavcic  wrote:
>
> Personally, I'd be in favor of something like MariaDB as an open source
> repo. Or any other ANSI SQL store. On the positive side, it should mesh
> seamlessly with ORM tools. And the schema for this should be pretty
> vanilla, I'd imagine. I might even consider skipping ORM for straight JDBC
> and simple command scripts in Java for something this small. I'm not
> worried so much about migrations of this sort. Large scale DBs can get
> involved with major schema changes, but that's usually when the datastore
> is a massive set of tables with complex relationships, at least in my
> experience.
>
> We could also use HBase, which probably wouldn't be that hard either, but
> there may be more boilerplate to write for the client as compared to
> standard SQL. But I'm assuming we could reuse a fair amount of existing
> code from our enrichments. One additional reason in favor of HBase might
> be data replication. For a SQL instance we'd probably recommend a RAID store
> or backup procedure, but we get that pretty easy with HBase too.
>
> On Feb 1, 2018 2:45 PM, "Casey Stella"  wrote:
>
>> So, I'll answer your question with some questions:
>>
>> - No matter the data store we use upgrading will take some care, right?
>> - Do we currently depend on a RDBMS anywhere? I want to say that we do
>> in the REST layer already, right?
>> - If we don't use an RDBMS, what's the other option? What are the pros
>> and cons?
>> - Have we considered non-server offline persistent solutions (e.g.
>>  https://www.html5rocks.com/en/features/storage)?
>>
>>
>>

Re: [DISCUSS] Persistence store for user profile settings

2018-02-02 Thread Simon Elliston Ball
True, and that is a requirement I’ve heard a lot (standard views or field sets 
in shared sets of saved search for example). That would definitely rule out 
sticking with the current approach (browser local storage, per Casey’s 
suggestion below). 

That said, I’m not sure that changes my views on RDBMS. There is an argument
that a single query from RDBMS could return a set of group prefs with a user
overlay, but that’s not that much better than pulling groups and overwriting
the maps client-side with the user’s, from the key value store. We’re not
talking about huge amounts of preference data here. I could be swayed the other
way if we were to use the RDBMS as a canonical store for user and group
information (we use it for users right now, in a really not great way) but I
would much rather see us plug in to the Hadoop ecosystem and use something like
Ranger to sync users, or an LDAP source directly for user and group data,
because I suspect no one wants to have to administer a separate user database
for Metron and open up the resulting IAM security hole we currently have (on
that, let’s at least stop storing plain text passwords!) /rant.

If anything I would like to see the current RDBMS dependency come out to reduce 
the overall complexity, unless we have a use case that genuinely benefits from 
a normalised data structure, or from SQL access patterns. 

In short, I would still go with LDAP / Ranger for users and groups, and instead
of adding an RDBMS, using group prefs and user prefs in the existing KV store
(HBase) to reduce the operational maintenance burden on the platform.

Simon
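
To make the group-plus-user overlay above concrete, here is a minimal sketch
in Java; the class and method names are invented for illustration and are not
existing Metron code:

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Illustrative only: resolve effective preferences by layering group
    // preference maps in precedence order, then letting the user's own
    // settings override everything.
    public class PreferenceOverlay {

      public static Map<String, String> resolve(List<Map<String, String>> groupPrefs,
                                                Map<String, String> userPrefs) {
        Map<String, String> resolved = new HashMap<>();
        for (Map<String, String> prefs : groupPrefs) {
          resolved.putAll(prefs);   // later groups win over earlier ones
        }
        resolved.putAll(userPrefs); // user settings win over all group defaults
        return resolved;
      }
    }

The same merge works client-side if the UI pulls the group maps and the user
map as separate key => value lookups.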

> On 2 Feb 2018, at 12:50, Otto Fowler  wrote:
> 
> It is not uncommon to want to have ‘shared’ preferences or setups.  Think of
> shared dashboards or queries vs. personal versions in Jira.  Would RDBMS help
> with that?
> 
> 
> 
> On February 2, 2018 at 07:17:04, Simon Elliston Ball
> (si...@simonellistonball.com) wrote:
> 
>> Introducing an RDBMS to the stack seems unnecessary for this.
>>
>> If we consider the data access patterns for user profiles, we are unlikely
>> to query into them, or indeed do anything other than look them up, or write
>> them out by a username key. To that end, using an ORM to translate a
>> nested config object into a load of tables seems to introduce complexity and
>> brittleness we then have to take away through relying on relational
>> consistency models. We would also end up with, as Mike points out, a whole
>> new set of disk deployment patterns and a bunch of additional DBA ops
>> process requirements for every install.
>>
>> Since the access pattern is almost entirely key => value, HBase seems a good
>> option (because we already have it there, it would be kinda crazy at this
>> scale if we didn’t already have it) or arguably ZooKeeper, but that might be
>> at the other end of the scale argument. I’d even go as far as to suggest
>> files on HDFS to keep it simple.
>> 
>> Simon 
>> 
>> > On 1 Feb 2018, at 23:24, Michael Miklavcic  wrote:
>> >  
>> > Personally, I'd be in favor of something like MariaDB as an open source
>> > repo. Or any other ANSI SQL store. On the positive side, it should mesh
>> > seamlessly with ORM tools. And the schema for this should be pretty
>> > vanilla, I'd imagine. I might even consider skipping ORM for straight JDBC
>> > and simple command scripts in Java for something this small. I'm not
>> > worried so much about migrations of this sort. Large scale DBs can get
>> > involved with major schema changes, but that's usually when the datastore
>> > is a massive set of tables with complex relationships, at least in my
>> > experience.
>> >  
>> > We could also use HBase, which probably wouldn't be that hard either, but
>> > there may be more boilerplate to write for the client as compared to
>> > standard SQL. But I'm assuming we could reuse a fair amount of existing
>> > code from our enrichments. One additional reason in favor of HBase might
>> > be data replication. For a SQL instance we'd probably recommend a RAID
>> > store or backup procedure, but we get that pretty easy with HBase too.
>> >  
>> > On Feb 1, 2018 2:45 PM, "Casey Stella"  wrote:
>> >  
>> >> So, I'll answer your question with some questions: 
>> >>  
>> >> - No matter the data store we use upgrading will take some care, right? 
>> >> - Do we currently depend on a RDBMS anywhere? I want to say that we do 
>> >> in the REST layer already, right? 
>> >> - If we don't use an RDBMS, what's the other option? What are the pros
>> >> and cons? 
>> >> - Have we considered non-server offline persistent solutions (e.g. 
>> >> https://www.html5rocks.com/en/features/storage)?
>> >>  
>> >>  
>> >>  

Re: [DISCUSS] Persistence store for user profile settings

2018-02-02 Thread Otto Fowler
It is not uncommon to want to have ‘shared’ preferences or setups.  Think
of shared dashboards or queries vs. personal versions in Jira.  Would RDBMS
help with that?



On February 2, 2018 at 07:17:04, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

Introducing an RDBMS to the stack seems unnecessary for this.

If we consider the data access patterns for user profiles, we are unlikely
to query into them, or indeed do anything other than look them up, or write
them out by a username key. To that end, using an ORM to translate a
nested config object into a load of tables seems to introduce complexity
and brittleness we then have to take away through relying on relational
consistency models. We would also end up with, as Mike points out, a whole
new set of disk deployment patterns and a bunch of additional DBA ops
process requirements for every install.

Since the access pattern is almost entirely key => value, HBase seems a
good option (because we already have it there, it would be kinda crazy at
this scale if we didn’t already have it) or arguably ZooKeeper, but that
might be at the other end of the scale argument. I’d even go as far as to
suggest files on HDFS to keep it simple.

Simon

> On 1 Feb 2018, at 23:24, Michael Miklavcic  wrote:
>
> Personally, I'd be in favor of something like MariaDB as an open source
> repo. Or any other ANSI SQL store. On the positive side, it should mesh
> seamlessly with ORM tools. And the schema for this should be pretty
> vanilla, I'd imagine. I might even consider skipping ORM for straight JDBC
> and simple command scripts in Java for something this small. I'm not
> worried so much about migrations of this sort. Large scale DBs can get
> involved with major schema changes, but that's usually when the datastore
> is a massive set of tables with complex relationships, at least in my
> experience.
>
> We could also use HBase, which probably wouldn't be that hard either, but
> there may be more boilerplate to write for the client as compared to
> standard SQL. But I'm assuming we could reuse a fair amount of existing
> code from our enrichments. One additional reason in favor of HBase might
> be data replication. For a SQL instance we'd probably recommend a RAID store
> or backup procedure, but we get that pretty easy with HBase too.
>
> On Feb 1, 2018 2:45 PM, "Casey Stella"  wrote:
>
>> So, I'll answer your question with some questions:
>>
>> - No matter the data store we use upgrading will take some care, right?
>> - Do we currently depend on a RDBMS anywhere? I want to say that we do
>> in the REST layer already, right?
>> - If we don't use an RDBMS, what's the other option? What are the pros
>> and cons?
>> - Have we considered non-server offline persistent solutions (e.g.
>> https://www.html5rocks.com/en/features/storage)?
>>
>>
>>
>> On Thu, Feb 1, 2018 at 9:11 AM, Ryan Merriman  wrote:
>>
>>> There is currently a PR up for review that allows a user to configure and
>>> save the list of facet fields that appear in the left column of the Alerts
>>> UI: https://github.com/apache/metron/pull/853. The REST layer has ORM
>>> support which means we can store those in a relational database.
>>>
>>> However I'm not 100% sure this is the best place to keep this. As we add
>>> more use cases like this the backing tables in the RDBMS will need to be
>>> managed. This could make upgrading more tedious and error-prone. Is there
>>> a better way to store this, assuming we can leverage a component that's
>>> already included in our stack?
>>>
>>> Ryan
>>>
>>


Re: [DISCUSS] Persistence store for user profile settings

2018-02-02 Thread Simon Elliston Ball
Introducing an RDBMS to the stack seems unnecessary for this.

If we consider the data access patterns for user profiles, we are unlikely to
query into them, or indeed do anything other than look them up, or write them
out by a username key. To that end, using an ORM to translate a nested config
object into a load of tables seems to introduce complexity and brittleness we
then have to take away through relying on relational consistency models. We
would also end up with, as Mike points out, a whole new set of disk deployment
patterns and a bunch of additional DBA ops process requirements for every
install.

Since the access pattern is almost entirely key => value, HBase seems a good
option (because we already have it there, it would be kinda crazy at this scale
if we didn’t already have it) or arguably ZooKeeper, but that might be at the
other end of the scale argument. I’d even go as far as to suggest files on HDFS
to keep it simple.

Simon
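
As a concrete sketch of the key => value option, user settings could be stored
as a JSON blob keyed by username with the stock HBase client. The table, column
family, and qualifier names below are invented for illustration; this is not
Metron's actual schema:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    // Illustrative only: one row per user, one cell holding the settings JSON.
    public class UserSettingsStore {
      private static final TableName TABLE = TableName.valueOf("user_settings");
      private static final byte[] CF = Bytes.toBytes("s");
      private static final byte[] COL = Bytes.toBytes("json");

      public static void save(Connection conn, String user, String settingsJson)
          throws Exception {
        try (Table table = conn.getTable(TABLE)) {
          Put put = new Put(Bytes.toBytes(user)); // row key is the username
          put.addColumn(CF, COL, Bytes.toBytes(settingsJson));
          table.put(put);
        }
      }

      public static String load(Connection conn, String user) throws Exception {
        try (Table table = conn.getTable(TABLE)) {
          Result result = table.get(new Get(Bytes.toBytes(user)));
          byte[] value = result.getValue(CF, COL);
          return value == null ? null : Bytes.toString(value);
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf)) {
          save(conn, "alice", "{\"facetFields\":[\"source:type\",\"ip_src_addr\"]}");
          System.out.println(load(conn, "alice"));
        }
      }
    }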

> On 1 Feb 2018, at 23:24, Michael Miklavcic  wrote:
> 
> Personally, I'd be in favor of something like MariaDB as an open source
> repo. Or any other ANSI SQL store. On the positive side, it should mesh
> seamlessly with ORM tools. And the schema for this should be pretty
> vanilla, I'd imagine. I might even consider skipping ORM for straight JDBC
> and simple command scripts in Java for something this small. I'm not
> worried so much about migrations of this sort. Large scale DBs can get
> involved with major schema changes, but that's usually when the datastore is
> a massive set of tables with complex relationships, at least in my
> experience.
> 
> We could also use HBase, which probably wouldn't be that hard either, but
> there may be more boilerplate to write for the client as compared to
> standard SQL. But I'm assuming we could reuse a fair amount of existing
> code from our enrichments. One additional reason in favor of HBase might be
> data replication. For a SQL instance we'd probably recommend a RAID store
> or backup procedure, but we get that pretty easy with HBase too.
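
And a rough sketch of the straight-JDBC route Mike describes above: one small
table and no ORM. The JDBC URL, credentials, and schema are invented, and the
REPLACE INTO upsert is MariaDB/MySQL syntax rather than ANSI:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    // Illustrative only: save and load a user's settings JSON over plain JDBC.
    public class UserSettingsJdbc {
      public static void main(String[] args) throws Exception {
        String url = "jdbc:mariadb://localhost:3306/metron"; // hypothetical
        try (Connection conn = DriverManager.getConnection(url, "user", "pass")) {
          try (PreparedStatement ddl = conn.prepareStatement(
              "CREATE TABLE IF NOT EXISTS user_settings ("
                  + "username VARCHAR(64) PRIMARY KEY, settings_json TEXT)")) {
            ddl.execute();
          }
          // Upsert the settings blob for one user.
          try (PreparedStatement upsert = conn.prepareStatement(
              "REPLACE INTO user_settings (username, settings_json) VALUES (?, ?)")) {
            upsert.setString(1, "alice");
            upsert.setString(2, "{\"facetFields\":[\"source:type\"]}");
            upsert.executeUpdate();
          }
          // Read it back by username.
          try (PreparedStatement query = conn.prepareStatement(
              "SELECT settings_json FROM user_settings WHERE username = ?")) {
            query.setString(1, "alice");
            try (ResultSet rs = query.executeQuery()) {
              if (rs.next()) {
                System.out.println(rs.getString(1));
              }
            }
          }
        }
      }
    }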
> 
> On Feb 1, 2018 2:45 PM, "Casey Stella"  wrote:
> 
>> So, I'll answer your question with some questions:
>> 
>>   - No matter the data store we use upgrading will take some care, right?
>>   - Do we currently depend on a RDBMS anywhere?  I want to say that we do
>>   in the REST layer already, right?
>>   - If we don't use an RDBMS, what's the other option?  What are the pros
>>   and cons?
>>   - Have we considered non-server offline persistent solutions (e.g.
>>   https://www.html5rocks.com/en/features/storage)?
>> 
>> 
>> 
>> On Thu, Feb 1, 2018 at 9:11 AM, Ryan Merriman  wrote:
>> 
>>> There is currently a PR up for review that allows a user to configure and
>>> save the list of facet fields that appear in the left column of the Alerts
>>> UI:  https://github.com/apache/metron/pull/853.  The REST layer has ORM
>>> support which means we can store those in a relational database.
>>> 
>>> However I'm not 100% sure this is the best place to keep this.  As we add
>>> more use cases like this the backing tables in the RDBMS will need to be
>>> managed.  This could make upgrading more tedious and error-prone.  Is there
>>> a better way to store this, assuming we can leverage a component that's
>>> already included in our stack?
>>> 
>>> Ryan
>>> 
>> 
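
On the ORM support Ryan mentions: one way to sidestep mapping a nested config
object into a load of tables is to persist it as a single serialized column. A
hedged JPA sketch, with invented entity and column names rather than Metron's
actual model:

    import javax.persistence.Column;
    import javax.persistence.Entity;
    import javax.persistence.Id;
    import javax.persistence.Lob;
    import javax.persistence.Table;

    // Illustrative only: the whole nested config lives in one JSON column,
    // so schema migrations stay trivial as new preference types are added.
    @Entity
    @Table(name = "user_settings")
    public class UserSettings {

      @Id
      @Column(name = "username")
      private String username;

      @Lob
      @Column(name = "settings_json")
      private String settingsJson; // e.g. the saved facet field list, as JSON

      public String getUsername() { return username; }
      public void setUsername(String username) { this.username = username; }
      public String getSettingsJson() { return settingsJson; }
      public void setSettingsJson(String settingsJson) { this.settingsJson = settingsJson; }
    }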



Re: Disable Metron parser output writer entirely

2018-02-02 Thread Otto Fowler
You cannot.



On February 1, 2018 at 23:51:28, Ali Nazemian (alinazem...@gmail.com) wrote:

Hi All,

I am trying to investigate whether we can disable a Metron parser output
writer entirely and manage it via the KAFKA_PUT Stellar function instead.
First, is it possible via configuration? Second, will there be any performance
difference between the normal Kafka writer and the Stellar version of it
(KAFKA_PUT)?

Regards,
Ali
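
For reference, KAFKA_PUT is the Stellar function shipped with
metron-management that takes a topic and a list of messages, and it is aimed
primarily at the Stellar REPL; the topic and message below are invented, and
the exact signature and availability outside the REPL should be checked
against the metron-management docs:

    [Stellar]>>> KAFKA_PUT('enrichments', ['{"source.type":"squid"}'])

The parser topology's own Kafka writer batches its sends, while a
per-expression KAFKA_PUT presumably would not, so even if it could be wired
in, a performance gap would be expected.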