Re: [DISCUSS] Error Indexing

James Sirota Mon, 23 Jan 2017 09:43:34 -0800

Thanks for your feedback, Jon.  Anyone else has interest/feedback for this 
feature?


20.01.2017, 14:01, "zeo...@gmail.com" <zeo...@gmail.com>:
> As someone who currently fills the platform engineer role, I can give this
> idea a huge +1. My thoughts:
>
> 1. I think it depends on exactly what data is pushed into the index (#3).
> However, assuming the errors you proposed recording, I can't see huge
> benefits to having more than one dashboard. I would be happy to be
> persuaded otherwise.
>
> 2. I would say yes, storing the errors in HDFS in addition to indexing is
> a good thing. Using METRON-510
> <https://issues.apache.org/jira/browse/METRON-510> as a case study, there
> is the potential in this environment for attacker-controlled data to result
> in processing errors which could be a method of evading security
> monitoring. Once an attack is identified, the long term HDFS storage would
> allow better historical analysis for low-and-slow/persistent attacks (I'm
> thinking of a method of data exfil that also won't successfully get stored
> in Lucene, but is hard to identify over a short period of time).
>  - Along this line, I think that there are various parts of Metron (this
> included) which could benefit from having method of configuring data aging
> by bucket in HDFS (Following Nick's comments here
> <https://issues.apache.org/jira/browse/METRON-477>).
>
> 3. I would potentially add a hash of the content that failed validation to
> help identify repeats over time with less of a concern that you'd have back
> to back failures (i.e. instead of storing the value itself). Additionally,
> I think it's helpful to be able to search all times there was an indexing
> error (instead of it hitting the catch-all).
>
> Jon
>
> On Fri, Jan 20, 2017 at 1:17 PM James Sirota <jsir...@apache.org> wrote:
>
> We already have a capability to capture bolt errors and validation errors
> and pipe them into a Kafka topic. I want to propose that we attach a
> writer topology to the error and validation failed kafka topics so that we
> can (a) create a new ES index for these errors and (b) create a new Kibana
> dashboard to visualize them. The benefit would be that errors and
> validation failures would be easier to see and analyze.
>
> I am seeking feedback on the following:
>
> - How granular would we want this feature to be? Think we would want one
> index/dashboard per source? Or would it be better to collapse everything
> into the same index?
> - Do we care about storing these errors in HDFS as well? Or is indexing
> them enough?
> - What types of errors should we record? I am proposing:
>
> For error reporting:
> --Message failed to parse
> --Enrichment failed to enrich
> --Threat intel feed failures
> --Generic catch-all for all other errors
>
> For validation reporting:
> --What part of message failed validation
> --What stellar validator caused the failure
>
> -------------------
> Thank you,
>
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org
>
> --
>
> Jon
>
> Sent from my mobile device

------------------- 
Thank you,

James Sirota
PPMC- Apache Metron (Incubating)
jsirota AT apache DOT org

Re: [DISCUSS] Error Indexing

Reply via email to