[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-03-06 Thread nickwallen
Github user nickwallen commented on the issue: https://github.com/apache/incubator-metron/pull/453 I will create an issue to track the possibility for looping. Other than that, this looks good. +1 --- If your project is set up for it, you can reply to this email and have

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-03-06 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 Does anyone have any outstanding comments that they need addressed? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-03-02 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 Just pushed out a commit to address recent comments. Commit includes: - unit test and javadoc for MessageGetters - error index template with the "ignore_above": 8191" setting -

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-03-02 Thread james-sirota
Github user james-sirota commented on the issue: https://github.com/apache/incubator-metron/pull/453 Hi guys, this PR is built on one fundamental assumption: kafka is always available. The source of truth for errors, therefore, is a kafka topic. In a production setting errors should

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-03-02 Thread JonZeolla
Github user JonZeolla commented on the issue: https://github.com/apache/incubator-metron/pull/453 I agree, I [recommended the same thing](https://lists.apache.org/thread.html/2a673bc97d975bc7e8e160228742c07a8cf45e43c1b1efb3fea83579@%3Cdev.metron.apache.org%3E) on the dev list and in I

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-03-02 Thread justinleet
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/453 I tried running this up and discovered that there's at least one error that doesn't get caught. Json parsing errors, e.g. if someone gives outright badly formatted messages to indexing

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-03-02 Thread nickwallen
Github user nickwallen commented on the issue: https://github.com/apache/incubator-metron/pull/453 I think focusing on the one specific error that we've seen is not the right way to think about this. Many different types of errors would cause unexpected looping, no? When unexpected

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-03-01 Thread JonZeolla
Github user JonZeolla commented on the issue: https://github.com/apache/incubator-metron/pull/453 Yeah. So, I edited my prior message but for clarity, I think we should just set "ignore_above": 8191 ([details](https://www.elastic.co/guide/en/elasticsearch/reference/current/ignore-abo

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-03-01 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 I think you're right, that could happen (didn't this happen to you at one point?). So what is the correct approach then? Do we leave off the raw message field if the error happens in th

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-03-01 Thread JonZeolla
Github user JonZeolla commented on the issue: https://github.com/apache/incubator-metron/pull/453 I believe you would still have the issue in some cases. The limitation is that the raw_message field could be a long set of characters, processed as a single token. I don't know of a wa

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-03-01 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 Technically generating an error message and sending it back through the indexing topology should be ok now because the original message is serialized into a string and shouldn't cause ano

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-03-01 Thread JonZeolla
Github user JonZeolla commented on the issue: https://github.com/apache/incubator-metron/pull/453 > For the concern about indexing errors getting in a loop, I thought of a simple solution that required no extra work. If an error happens in the indexing topology, simply send it to a di

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-03-01 Thread nickwallen
Github user nickwallen commented on the issue: https://github.com/apache/incubator-metron/pull/453 I think how you have it as a fine approach. Just need to doc it as such. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well.

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-03-01 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 Default in this case is JSON_FROM_FIELD without specifying the field (defaults to "message"). I couldn't think of an ideal way to do this so open to ideas. We could just treat a null fi

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-03-01 Thread nickwallen
Github user nickwallen commented on the issue: https://github.com/apache/incubator-metron/pull/453 I like the flexibility of the new MessageGetter stuff. I don't understand when I would use a `DEFAULT_JSON_FROM_FIELD` versus a `JSON_FROM_FIELD`. Some basic javadocs on those new clas

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-03-01 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 The latest commit includes the changes discussed. Most error messages now go to the indexing topic by default so there is no need for another error indexing topology. The error topic se

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-23 Thread JonZeolla
Github user JonZeolla commented on the issue: https://github.com/apache/incubator-metron/pull/453 I'm sorry if this was already covered and I missed it, I'm writing this from my phone. How does this handle errors during indexing itself? For instance, if the failure is due to the con

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-22 Thread justinleet
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/453 @merrimanr Thanks for pointing me there. That looks good. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-22 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 You need a source.type to look up a parser config right? I don't think you can parse a message without it. I'm relying on the bolts to supply the source.type and I think that's where an

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-22 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 That wasn't an exact structure. Take a look at the ErrorFields enum in https://github.com/apache/incubator-metron/pull/453/files#diff-19fcef3f36d5353a8ad399a128f40f3e. It's very close t

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-22 Thread nickwallen
Github user nickwallen commented on the issue: https://github.com/apache/incubator-metron/pull/453 Thanks @merrimanr . Error message has tons of useful fields. I like it. I think your option 2 above ('source.type=error' with a separate field for original source type) would b

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-22 Thread justinleet
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/453 @merrimanr Can we make rawMessage, rawMessage_bytes, and rawMessage_hash to raw_message, raw_message_bytes, and raw_message_hash for consistency with error_type and failed_sensor_type?

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-22 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 The error message will look something like this: { "exception": "java.lang.Exception: there was an error", "hostname": "host", "stack": "java.lang.Exception: ...",

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-22 Thread nickwallen
Github user nickwallen commented on the issue: https://github.com/apache/incubator-metron/pull/453 In regards to `source.type`, what does the error message look-like? Is it transformed in any way? What fields are in the error message? If we set the `source.type` to `error`

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-21 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 So the consensus is to use a single indexing topology for both error and normal messages by default. I will remove the scripts and configuration (flux and properties) files and remove th

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-21 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 Those scripts and configurations have already been created and are part of this PR so the question is should we remove them. --- If your project is set up for it, you can reply to this e

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-21 Thread nickwallen
Github user nickwallen commented on the issue: https://github.com/apache/incubator-metron/pull/453 > After thinking about it a few minutes more, maybe the answer to the deployment question is to deploy all the necessary scripts, configurations, etc but just not start it by default?

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-21 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 I would prefer a separate error topology but you and others have provided good reasons for not doing that. Single indexing approach by default is a good compromise and fine with me.

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-21 Thread nickwallen
Github user nickwallen commented on the issue: https://github.com/apache/incubator-metron/pull/453 > I think a configurable error topic is a reasonable request... Hey @merrimanr - Are you saying that you want to take the approach of using a single indexing topology by default,

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-21 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 After thinking about it a few minutes more, maybe the answer to the deployment question is to deploy all the necessary scripts, configurations, etc but just not start it by default? ---

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-21 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 I think a configurable error topic is a reasonable request. The requirements need a little clarification though. How granular should this configuration property be? Should it

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-20 Thread nickwallen
Github user nickwallen commented on the issue: https://github.com/apache/incubator-metron/pull/453 > I guess I'm ok with adding a new topology, but it's added a fair amount of complexity to this JIRA and I'm not sure I buy that indexing errors (what should be a rare event, IMO, in a h

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-18 Thread cestella
Github user cestella commented on the issue: https://github.com/apache/incubator-metron/pull/453 Right, I am less concerned with the act of finding errors in single node vagrant. I'm more interested in mimicking the mechanisms used in a multi node deployment in our acceptance testing

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-18 Thread james-sirota
Github user james-sirota commented on the issue: https://github.com/apache/incubator-metron/pull/453 I think for a single node this is not an issue because all your storm logs are in the same place so you can just pipe to grep. This feature is meant to address gathering errors from d

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-18 Thread cestella
Github user cestella commented on the issue: https://github.com/apache/incubator-metron/pull/453 Can we make the Kafka queue used for errors parameterized in the various topologies? For those who want one indexing topology, we can use the "indexing" queue and for those who want a sep

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-18 Thread james-sirota
Github user james-sirota commented on the issue: https://github.com/apache/incubator-metron/pull/453 I lean toward having a separate kafka queue for errors. Errors are lower in volume, but are a lot more verbose. You are essentially indexing large text document when you do this and

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-17 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 @ottobackwards to answer some of your questions (sorry was at RSA all week): - An extra topology will take up a couple worker slots but the performance implications should be mini

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-17 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 I think removing the error indexing topology is fine as long as we're careful to avoid error messages getting stuck in a loop in the case of a failed index write. I agree that it will re

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-17 Thread cestella
Github user cestella commented on the issue: https://github.com/apache/incubator-metron/pull/453 Ok, catching up on this: Regarding invalid vs errors, I'm ok with the notion of treating them the same, but it's different from how we've done this in the past. Prior to this, in

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-13 Thread ottobackwards
Github user ottobackwards commented on the issue: https://github.com/apache/incubator-metron/pull/453 @cestella - how can we utilize the error indexing if we were going to say - output errors or warnings that there were deprecated stellar statements? --- If your project is set up for

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-13 Thread ottobackwards
Github user ottobackwards commented on the issue: https://github.com/apache/incubator-metron/pull/453 What comes to mind is the 'source' of an error. Is this error wrong because METRON thinks it is invalid, or it it wrong because of some other configuration specific evaluation. I do

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-13 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 On the topic of invalid messages, they are now treated as error messages. They can still be distinguished as invalid message though. Is there any reason they should be treated different

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-13 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 In response to "Is there any reason we didn't just use the normal indexing topology". Here are the issues I see with doing that. First, I think we should be careful about putting additi

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-13 Thread ottobackwards
Github user ottobackwards commented on the issue: https://github.com/apache/incubator-metron/pull/453 That makes complete sense, we should call that stuff out pre-review. I think what we are seeing is that folks have some really good ideas and are willing to contribute to documentati

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-13 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 Sorry I should have included this in the original description. I still need to update the various READMEs, that task is outstanding and this should not be merged until that is done. Thi

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-13 Thread ottobackwards
Github user ottobackwards commented on the issue: https://github.com/apache/incubator-metron/pull/453 Just real quick before I start to review: - What are the performance implications of this? How can we measure that? - What is the effect of running this on quick dev? is

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-13 Thread cestella
Github user cestella commented on the issue: https://github.com/apache/incubator-metron/pull/453 Based on this PR, what I'd probably do is: * Walk through setting up the CSV parser * Set up a field validation * Send valid data through, make sure it works and is in the index

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-13 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 No you are correct, we need a more comprehensive test plan. I'm still thinking about it. Triggering errors at each point in the topologies is not straightforward. Sending in a message

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-13 Thread cestella
Github user cestella commented on the issue: https://github.com/apache/incubator-metron/pull/453 So from what I see we have a new topology and every error message has a source type of "error", correct? Is there any reason we didn't just use the normal indexing topology and index erro

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-13 Thread cestella
Github user cestella commented on the issue: https://github.com/apache/incubator-metron/pull/453 Whoops sorry I see that you did mention a way to test it out. Sorry, on a phone ;) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-13 Thread cestella
Github user cestella commented on the issue: https://github.com/apache/incubator-metron/pull/453 Can you provide an acceptance test plan for validation on vagrant or a cluster? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as w

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-13 Thread merrimanr
Github user merrimanr commented on the issue: https://github.com/apache/incubator-metron/pull/453 METRON-694 includes both the topology changes and the Ambari MPack changes. I started on METRON-695 but decided to include both in a single PR, hence the branch being named METRON-695 bu

[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies

2017-02-13 Thread cestella
Github user cestella commented on the issue: https://github.com/apache/incubator-metron/pull/453 Did you name this PR correctly? Should it be METRON-695? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project do