For 1 - I'm good with that.

I'm talking about hashing the relevant content itself, not the error.  Some
benefits are: (1) it minimizes load on the search index (there's minimal
benefit in spending the CPU and disk to keep the content at full fidelity,
i.e. tokenized and stored); (2) it provides something to key on for
dashboards (assuming a good hash algorithm that avoids collisions and is
second-preimage resistant); and (3) specific to errors, if the issue is that
the content failed to index, storing a hash instead of the content gives us
some protection against the same failure occurring twice.

Jon

On Mon, Jan 23, 2017, 2:47 PM James Sirota <jsir...@apache.org> wrote:

Jon,

With regards to 1, collapsing to a single dashboard for each would be
fine.  So we would have one error index and one "failed to validate"
index.  The distinction is that errors would be things that went wrong
during stream processing (failed to parse, etc...), while validation
failures are messages that explicitly failed Stellar validation/schema
enforcement.  There should be relatively few of the second type.


With respect to 3, why do you want the error hashed?  Why not just search
for the error text?

Thanks,
James


20.01.2017, 14:01, "zeo...@gmail.com" <zeo...@gmail.com>:
> As someone who currently fills the platform engineer role, I can give this
> idea a huge +1. My thoughts:
>
> 1. I think it depends on exactly what data is pushed into the index (#3).
> However, assuming the errors you proposed recording, I can't see huge
> benefits to having more than one dashboard. I would be happy to be
> persuaded otherwise.
>
> 2. I would say yes, storing the errors in HDFS in addition to indexing is
> a good thing. Using METRON-510
> <https://issues.apache.org/jira/browse/METRON-510> as a case study, there
> is the potential in this environment for attacker-controlled data to result
> in processing errors which could be a method of evading security
> monitoring. Once an attack is identified, the long term HDFS storage would
> allow better historical analysis for low-and-slow/persistent attacks (I'm
> thinking of a method of data exfil that also won't successfully get stored
> in Lucene, but is hard to identify over a short period of time).
>  - Along this line, I think that there are various parts of Metron (this
> included) which could benefit from having a method of configuring data aging
> by bucket in HDFS (Following Nick's comments here
> <https://issues.apache.org/jira/browse/METRON-477>).
>
> 3. I would potentially add a hash of the content that failed validation to
> help identify repeats over time with less of a concern that you'd have back
> to back failures (i.e. instead of storing the value itself). Additionally,
> I think it's helpful to be able to search all times there was an indexing
> error (instead of it hitting the catch-all).
>
> Jon
>
> On Fri, Jan 20, 2017 at 1:17 PM James Sirota <jsir...@apache.org> wrote:
>
> We already have a capability to capture bolt errors and validation errors
> and pipe them into a Kafka topic. I want to propose that we attach a
> writer topology to the error and validation failed kafka topics so that we
> can (a) create a new ES index for these errors and (b) create a new Kibana
> dashboard to visualize them. The benefit would be that errors and
> validation failures would be easier to see and analyze.
>
> I am seeking feedback on the following:
>
> - How granular would we want this feature to be? Think we would want one
> index/dashboard per source? Or would it be better to collapse everything
> into the same index?
> - Do we care about storing these errors in HDFS as well? Or is indexing
> them enough?
> - What types of errors should we record? I am proposing:
>
> For error reporting:
> --Message failed to parse
> --Enrichment failed to enrich
> --Threat intel feed failures
> --Generic catch-all for all other errors
>
> For validation reporting:
> --What part of message failed validation
> --What stellar validator caused the failure
>
> -------------------
> Thank you,
>
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org
>
> --
>
> Jon
>
> Sent from my mobile device

