Thanks for the direction :) I will work on this and update the thread. -Raghu
On 10/03/17, 10:47 PM, "Casey Stella" <ceste...@gmail.com> wrote:

>Yes, we do use a UUID in the enrichment topology; this is our message join
>key on the join portion of the split/join enrichment. The logic being used
>is EnrichmentSplitterBolt.java line 63.
>
>We might bring that out and make it part of the message IMO and be able to
>reuse that unique identifier in the enrichment topology.
>
>On Fri, Mar 10, 2017 at 10:51 AM, zeo...@gmail.com <zeo...@gmail.com> wrote:
>
>> I definitely think that this is a valuable discussion. I seem to recall
>> cstella mentioning at some point in the past that there is a UUID already
>> used in storm that we might be able to expose into the message itself, but
>> I could be wrong.
>>
>> For additional context regarding prior discussions, this was also briefly
>> discussed in another topic here
>> <https://lists.apache.org/thread.html/b039f0f0a5e6cfaf30944dc768088e1e1bd5dae4b2247dda12698805@%3Cdev.metron.apache.org%3E>.
>> In that context I was hoping to be able to link messages across all
>> indexing destinations (HDFS, ES, Solr, etc.).
>>
>> On Fri, Mar 10, 2017 at 9:26 AM Raghu Mitra Kandikonda <r...@hortonworks.com>
>> wrote:
>>
>> > Hi All,
>> >
>> > I would like to start a discussion around adding a unique id to all the
>> > parsed messages. I feel there was a discussion around a similar topic
>> > but I am not sure as a community we agreed on a proposal.
>> >
>> > We could
>> > -use a random number generator like UUID but this might have performance
>> > implications
>> > -use a kafka topic name + systemtime + Kafka message offset to generate a
>> > unique identifier
>> > -use the input message to generate a hashcode
>> >
>> > Any thoughts ?
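
For readers comparing the three candidate schemes listed just above, here is a minimal Python sketch of each one. It is not Metron code: the function names, the `-` separator, and the choice of SHA-256 for the "hashcode" option are all assumptions made for illustration only.

```python
import hashlib
import time
import uuid


def id_from_uuid() -> str:
    """Option 1: a random (version 4) UUID; needs no coordination."""
    return str(uuid.uuid4())


def id_from_kafka_coordinates(topic: str, offset: int, timestamp_ms: int) -> str:
    """Option 2: Kafka topic name + system time + Kafka message offset.

    The '-' separator and field order are hypothetical.
    """
    return f"{topic}-{timestamp_ms}-{offset}"


def id_from_message(raw_message: bytes) -> str:
    """Option 3: a hash of the input message (SHA-256 chosen as an assumption).

    Identical inputs always map to the same id.
    """
    return hashlib.sha256(raw_message).hexdigest()


if __name__ == "__main__":
    now_ms = int(time.time() * 1000)
    print(id_from_uuid())
    print(id_from_kafka_coordinates("bro", 42, now_ms))
    print(id_from_message(b"raw message"))
```

One design point worth noting: Kafka offsets are assigned per partition, so a real version of option 2 would probably also need the partition number to stay unique. Option 3 is the only one of the three that gives replayed messages the same id, which may or may not be what is wanted.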
>> >
>> > (Attached email that had similar discussion for error indexing)
>> >
>> > Regards,
>> > RaghuM
>> >
>> > ---------- Forwarded message ----------
>> > From: "zeo...@gmail.com" <zeo...@gmail.com>
>> > To: "dev@metron.incubator.apache.org" <dev@metron.incubator.apache.org>
>> > Cc:
>> > Bcc:
>> > Date: Wed, 1 Feb 2017 22:18:12 +0000
>> > Subject: Re: [DISCUSS] Error Indexing
>> >
>> > Simply as a unique identifier of the original information which is failing
>> > some step, and thus giving you something to key in on and create a count of
>> > unique events and prioritize issues without the concern of cyclical issues
>> > (if the issue is with indexing a specific message, and you try to index it
>> > again, it will just fail in a loop).
>> >
>> > Jon
>> >
>> > On Wed, Feb 1, 2017 at 6:59 AM Dima Kovalyov <dima.koval...@sstech.us>
>> > wrote:
>> >
>> > > That's a great topic of discussion.
>> > >
>> > > Throughout the thread the idea of having hash of the message that failed
>> > > is changed, can someone please explain why do you plan to use this hash
>> > > and how?
>> > >
>> > > - Dima
>> > >
>> > > On 02/01/2017 06:23 AM, zeo...@gmail.com wrote:
>> > > > After thinking on this for a few days I recant my previous suggestion of
>> > > > TupleHash256. It's still a bit early for SHA-3 - no good reference
>> > > > implementations/libraries exist (I did some searching and emailing), it is
>> > > > optimized for hardware but no hardware implementation is widely accessible,
>> > > > FIPS 140-3 is still not close to finalized, etc.
>> > > >
>> > > > I think we could simulate the benefits of tuplehash by sorting the tuples,
>> > > > then doing SHA-256(len(tuple1) | tuple1 | ... | len(tuplen) | tuplen).
>> > > > Happy to entertain opposing thoughts, such as BLAKE2, etc. but with the
>> > > > likely users of Metron, I think sticking with FIPS 140-2 is a solid choice.
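
The "sort, then length-prefix, then SHA-256" idea described just above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything from the Metron code base: the function name `tuple_hash_sha256`, the treatment of each tuple as a byte string, and the 8-byte big-endian length encoding are all hypothetical choices.

```python
import hashlib
import struct


def tuple_hash_sha256(tuples):
    """Return SHA-256(len(t1) | t1 | ... | len(tn) | tn) over the sorted tuples.

    `tuples` is an iterable of byte strings (an assumption for this sketch).
    """
    digest = hashlib.sha256()
    for item in sorted(tuples):
        # The length prefix keeps (b"ab", b"c") distinct from (b"a", b"bc"),
        # which plain concatenation would not.
        digest.update(struct.pack(">Q", len(item)))
        digest.update(item)
    return digest.hexdigest()


if __name__ == "__main__":
    # Field names and values here are illustrative only.
    fields = [b"ip_src_addr=8.8.8.8", b"ip_dst_addr=10.0.0.1"]
    print(tuple_hash_sha256(fields))
    # Sorting first makes the result independent of input order.
    print(tuple_hash_sha256(reversed(fields)))
```

Sorting gives order independence and the length prefix gives unambiguous boundaries, which are the two TupleHash properties the message above wants to approximate while staying on SHA-256.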
>> > > > >> > > > Jon >> > > > >> > > > On Thu, Jan 26, 2017, 11:23 AM zeo...@gmail.com <zeo...@gmail.com> >> > > wrote: >> > > > >> > > > So one more thing regarding why I think we should throw an exception >> > on a >> > > > failed enrichment. If we do make something like username a constant >> > > field, >> > > > in cases where that is used to calculate rawMessage_hash, if it fails >> > to >> > > > enrich, the hash would be different compared to when it succeeds. Of >> > > > course I think the initial intent of adding username as a constant >> > field >> > > > would be to handle it in the parsers, where that information is >> > provided >> > > in >> > > > the messages themselves, but how would Threat Intel know the >> > difference? >> > > > In my environment I am looking forward to a streaming enrichment that >> > > adds >> > > > the username, where applicable, anywhere I have an IP. >> > > > >> > > > My hesitant suggestion for a hashing algorithm would be to use >> > > > TupleHash256, as it is a NIST-provided implementation of SHA-3 (using >> > > > cSHAKE) for this use case. Details here >> > > > < >> > > http://nvlpubs.nist.gov/nistpubs/specialpublications/ >> nist.sp.800-185.pdf >> > >. >> > > > However, I haven't been able to find a reference implementation of >> this >> > > in >> > > > any language, so that's a bit of a downside. A more general SHA3-256 >> > > > implementation where we handle ordering could work as well, but would >> > be >> > > > significantly less optimal. >> > > > >> > > > Jon >> > > > >> > > > On Thu, Jan 26, 2017 at 10:20 AM Ryan Merriman <merrim...@gmail.com> >> > > wrote: >> > > > >> > > > Jon, I misread the code in the GenericEnrichmentBolt. The error is >> > > > forwarded on so no issues there. >> > > > >> > > > Defaulting to the common fields makes sense. I will dig into the >> > > > GenericEnrichmentBolt more, maybe there is a way to get the error >> > fields >> > > > without having to significantly change things. 
Any opinion on a >> > hashing >> > > > algorithm? >> > > > >> > > > On Wed, Jan 25, 2017 at 9:37 PM, zeo...@gmail.com <zeo...@gmail.com> >> > > wrote: >> > > > >> > > >> Although hashing the whole message is better than nothing, it >> misses a >> > > lot >> > > >> of the benefits we could get. >> > > >> >> > > >> While I'd love to have consistency for this field across all of the >> > > >> different error.types, it appears that may not be reasonably >> possible >> > > >> because of the parsers. So, how about something like hash all of >> the >> > > >> constant >> > > >> fields >> > > >> <https://github.com/apache/incubator-metron/blob/master/ >> > > >> metron-platform/metron-common/src/main/java/org/apache/ >> > > >> metron/common/Constants.java> >> > > >> excluding >> > > >> timestamp and original_string unless it is a parser, in which case >> > hash >> > > > the >> > > >> entire message? This gives us some measure of event uniqueness and >> it >> > > can >> > > >> grow as we define additional constant fields (I recall discussing >> with >> > > >> someone else on the list regarding expanding those standard fields >> to >> > > >> include things like usernames but I can't find the specific email >> > > >> exchange). >> > > >> >> > > >> Because some enrichments can be heavily relied on, I think it makes >> > > sense >> > > >> to put a message onto the error queue when it throws an exception. >> > Not >> > > >> only does this help troubleshoot edge cases, but it makes issues >> more >> > > >> obvious when assembling a new enrichment in dev/test. I can't think >> > of >> > > a >> > > >> scenario currently where an enrichment would only be "best effort" >> and >> > > > that >> > > >> I wouldn't want that error indexed and retrievable. However, this >> > gets >> > > >> interesting when talking about the various options to solve the >> > "Enrich >> > > >> enrichment" discussion from earlier in the month. 
We can keep that part of this separate though, as I don't think that's being actively pursued right now.
>> > > >>
>> > > >> Jon
>> > > >>
>> > > >> On Wed, Jan 25, 2017 at 10:49 AM David Lyle <dlyle65...@gmail.com> wrote:
>> > > >>
>> > > >> RE: separate JIRA for MPack/Ansible. No objection to tracking them
>> > > >> separately, but for this item to be complete, you'll need both the feature
>> > > >> and the ability to install it.
>> > > >>
>> > > >> -D...
>> > > >>
>> > > >> On Tue, Jan 24, 2017 at 5:33 PM, Ryan Merriman <merrim...@gmail.com>
>> > > >> wrote:
>> > > >>
>> > > >>> Assuming we're going to write all errors to a single error topic, I think
>> > > >>> it makes sense to agree on an error message schema and handle errors across
>> > > >>> the 3 different topologies in the same way with a single implementation.
>> > > >>> The implementation in ParserBolt (ErrorUtils.handleError) produces the most
>> > > >>> verbose error object so I think it's a good candidate for the single
>> > > >>> implementation. Here is the message structure it currently produces:
>> > > >>>
>> > > >>> {
>> > > >>>   "exception": "java.lang.Exception: there was an error",
>> > > >>>   "hostname": "host",
>> > > >>>   "stack": "java.lang.Exception: ...",
>> > > >>>   "time": 1485295416563,
>> > > >>>   "message": "there was an error",
>> > > >>>   "rawMessage": "raw message",
>> > > >>>   "rawMessage_bytes": [],
>> > > >>>   "source.type": "bro_error"
>> > > >>> }
>> > > >>>
>> > > >>> From our discussion so far we need to add a couple fields: an error type
>> > > >>> and hash id. Adding these to the message looks like:
>> > > >>>
>> > > >>> {
>> > > >>>   "exception": "java.lang.Exception: there was an error",
>> > > >>>   "hostname": "host",
>> > > >>>   "stack": "java.lang.Exception: ...",
>> > > >>>   "time": 1485295416563,
>> > > >>>   "message": "there was an error",
>> > > >>>   "rawMessage": "raw message",
>> > > >>>   "rawMessage_bytes": [],
>> > > >>>   "source.type": "bro_error",
>> > > >>>   "error.type": "parser_error",
>> > > >>>   "rawMessage_hash": "dde41b9920954f94066daf6291fb58a9"
>> > > >>> }
>> > > >>>
>> > > >>> We should also consider expanding the error types I listed earlier.
>> > > >>> Instead of just having "indexing_error" we could have
>> > > >>> "elasticsearch_indexing_error", "hdfs_indexing_error" and so on.
>> > > >>>
>> > > >>> Jon, if an exception happens in an enrichment or threat intel bolt the
>> > > >>> message is passed along with no error thrown (only logged). Everywhere
>> > > >>> else I'm having trouble identifying specific fields that should be hashed.
>> > > >>> Would hashing the message in every case be acceptable? Do you know of a
>> > > >>> place where we could hash a field instead? On the topic of exceptions in
>> > > >>> enrichments, are we ok with an error only being logged and not added to the
>> > > >>> message or emitted to the error queue?
>> > > >>>
>> > > >>> On Tue, Jan 24, 2017 at 3:10 PM, Ryan Merriman <merrim...@gmail.com>
>> > > >>> wrote:
>> > > >>>
>> > > >>>> That use case makes sense to me. I don't think it will require that much
>> > > >>>> additional effort either.
>> > > >>>>
>> > > >>>> On Tue, Jan 24, 2017 at 1:02 PM, zeo...@gmail.com <zeo...@gmail.com>
>> > > >>>> wrote:
>> > > >>>>
>> > > >>>>> Regarding error vs validation - Either way I'm not very concerned.
I >> > > >>>>> initially assumed they would be combined and agree with that >> > > > approach, >> > > >>> but >> > > >>>>> splitting them out isn't a very big deal to me either. >> > > >>>>> >> > > >>>>> Re: Ryan. Yes, exactly. In the case of a parser issue (or >> > anywhere >> > > >>> else >> > > >>>>> where it's not possible to pick out the exact thing causing the >> > > > issue) >> > > >>> it >> > > >>>>> would be a hash of the complete message. >> > > >>>>> >> > > >>>>> Regarding the architecture, I mostly agree with James except >> that I >> > > >>> think >> > > >>>>> step 3 needs to also be able to somehow group errors via the >> > original >> > > >>>>> data (identify >> > > >>>>> replays, identify repeat issues with data in a specific field, >> > issues >> > > >>> with >> > > >>>>> consistently different data, etc.). This is essentially the >> first >> > > >> step >> > > >>> of >> > > >>>>> troubleshooting, which I assume you are doing if you're looking >> at >> > > > the >> > > >>>>> error dashboard. >> > > >>>>> >> > > >>>>> If the hash gets moved out of the initial implementation, I'm >> > fairly >> > > >>>>> certain you lose this ability. The point here isn't to handle >> long >> > > >>> fields >> > > >>>>> (although that's a benefit of this approach), it's to attach a >> > unique >> > > >>>>> identifier to the error/validation issue message that links it to >> > the >> > > >>>>> original problem. I'd be happy to consider alternative solutions >> > to >> > > >>> this >> > > >>>>> problem (for instance, actually sending across the data itself) I >> > > > just >> > > >>>>> haven't been able to think of another way to do this that I like >> > > >> better. >> > > >>>>> Jon >> > > >>>>> >> > > >>>>> On Tue, Jan 24, 2017 at 1:13 PM Ryan Merriman < >> merrim...@gmail.com >> > > >> > > >>>>> wrote: >> > > >>>>> >> > > >>>>>> We also need a JIRA for any install/Ansible/MPack work needed. 
>> > > >>>>>> >> > > >>>>>> On Tue, Jan 24, 2017 at 12:06 PM, James Sirota < >> > jsir...@apache.org> >> > > >>>>> wrote: >> > > >>>>>>> Now that I had some time to think about it I would collapse all >> > > >>> error >> > > >>>>> and >> > > >>>>>>> validation topics into one. We can differentiate between >> > > >> different >> > > >>>>> views >> > > >>>>>>> of the data (split by error source etc) via Kibana >> dashboards. I >> > > >>>>> would >> > > >>>>>>> implement this feature incrementally. First I would modify all >> > > >> the >> > > >>>>> bolts >> > > >>>>>>> to log to a single topic. Second, I would get the error >> indexing >> > > >>>>> done by >> > > >>>>>>> attaching the indexing topology to the error topic. Third I >> would >> > > >>>>> create >> > > >>>>>>> the necessary dashboards to view errors and validation failures >> > > > by >> > > >>>>>> source. >> > > >>>>>>> Lastly, I would file a follow-on JIRA to introduce hashing of >> > > >> errors >> > > >>>>> or >> > > >>>>>>> fields that are too long. It seems like a separate feature >> that >> > > >> we >> > > >>>>> need >> > > >>>>>> to >> > > >>>>>>> think through. We may need a stellar function around that. >> > > >>>>>>> >> > > >>>>>>> Thanks, >> > > >>>>>>> James >> > > >>>>>>> >> > > >>>>>>> 24.01.2017, 10:25, "Ryan Merriman" <merrim...@gmail.com>: >> > > >>>>>>>> I understand what Jon is talking about. He's proposing we hash >> > > >> the >> > > >>>>>> value >> > > >>>>>>>> that caused the error, not necessarily the error message >> > > > itself. >> > > >>>>> For an >> > > >>>>>>>> enrichment this is easy. Just pass along the field value that >> > > >>> failed >> > > >>>>>>>> enrichment. For other cases the field that caused the error >> may >> > > >>> not >> > > >>>>> be >> > > >>>>>> so >> > > >>>>>>>> obvious. Take parser validation for example. 
The message is validated as a whole and it may not be easy to determine which field is the cause. In that case would a hash of the whole message work?
>> > > >>>>>>>>
>> > > >>>>>>>> There is a broader architectural discussion that needs to happen before we
>> > > >>>>>>>> can implement this. Currently we have an indexing topology that reads from
>> > > >>>>>>>> 1 topic and writes messages to ES but errors are written to several
>> > > >>>>>>>> different topics:
>> > > >>>>>>>>
>> > > >>>>>>>> - parser_error
>> > > >>>>>>>> - parser_invalid
>> > > >>>>>>>> - enrichments_error
>> > > >>>>>>>> - threatintel_error
>> > > >>>>>>>> - indexing_error
>> > > >>>>>>>>
>> > > >>>>>>>> I can see 4 possible approaches to implementing this:
>> > > >>>>>>>>
>> > > >>>>>>>> 1. Create an index topology for each error topic
>> > > >>>>>>>>    1. Good because we can easily reuse the indexing topology and would
>> > > >>>>>>>>       require the least development effort
>> > > >>>>>>>>    2. Bad because it would consume a lot of extra worker slots
>> > > >>>>>>>> 2. Move the topic name into the error JSON message as a new "error_type"
>> > > >>>>>>>>    field and write all messages to the indexing topic
>> > > >>>>>>>>    1. Good because we don't need to create a new topology
>> > > >>>>>>>>    2. Bad because we would be flowing data and errors through the same
>> > > >>>>>>>>       topology. A spike in errors could affect message indexing.
>> > > >>>>>>>> 3. Compromise between 1 and 2. Create another indexing topology that is
>> > > >>>>>>>>    dedicated to indexing errors. Move the topic name into the error JSON
>> > > >>>>>>>>    message as a new "error_type" field and write all errors to a single
>> > > >>>>>>>>    error topic.
>> > > >>>>>>>> 4. Write a completely new topology with multiple spouts (1 for each
>> > > >>>>>>>>    error type listed above) that all feed into a single
>> > > >>>>>>>>    BulkMessageWriterBolt.
>> > > >>>>>>>>    1. Good because the current topologies would not need to change
>> > > >>>>>>>>    2. Bad because it would require the most development effort, would
>> > > >>>>>>>>       not reuse existing topologies and takes up more worker slots than 3
>> > > >>>>>>>>
>> > > >>>>>>>> Are there other approaches I haven't thought of? I think 1 and 2 are off
>> > > >>>>>>>> the table because they are shortcuts and not good long-term solutions. 3
>> > > >>>>>>>> would be my choice because it introduces less complexity than 4.
>> > > >>>>>>>> Thoughts?
>> > > >>>>>>>>
>> > > >>>>>>>> Ryan
>> > > >>>>>>>>
>> > > >>>>>>>> On Mon, Jan 23, 2017 at 5:44 PM, zeo...@gmail.com <zeo...@gmail.com>
>> > > >>>>>>>> wrote:
>> > > >>>>>>>>
>> > > >>>>>>>>> In that case the hash would be of the value in the IP field, such as
>> > > >>>>>>>>> sha3(8.8.8.8).
>> > > >>>>>>>>>
>> > > >>>>>>>>> Jon
>> > > >>>>>>>>>
>> > > >>>>>>>>> On Mon, Jan 23, 2017, 6:41 PM James Sirota <jsir...@apache.org> wrote:
>> > > >>>>>>>>>
>> > > >>>>>>>>> > Jon,
>> > > >>>>>>>>> >
>> > > >>>>>>>>> > I am still not entirely following why we would want to use hashing. For
>> > > >>>>>>>>> > example if my error is "Your IP field is invalid and failed validation"
>> > > >>>>>>>>> > hashing this error string will always result in the same hash.
>> > > >>>>> Why >> > > >>>>>>> not >> > > >>>>>>>>> > just use the actual error string? Can you provide an >> > > > example >> > > >>>>> where >> > > >>>>>>> you >> > > >>>>>>>>> > would use it? >> > > >>>>>>>>> > >> > > >>>>>>>>> > Thanks, >> > > >>>>>>>>> > James >> > > >>>>>>>>> > >> > > >>>>>>>>> > 23.01.2017, 16:29, "zeo...@gmail.com" <zeo...@gmail.com>: >> > > >>>>>>>>> > > For 1 - I'm good with that. >> > > >>>>>>>>> > > >> > > >>>>>>>>> > > I'm talking about hashing the relevant content itself >> not >> > > >>> the >> > > >>>>>>> error. >> > > >>>>>>>>> Some >> > > >>>>>>>>> > > benefits are (1) minimize load on search index (there's >> > > >>>>> minimal >> > > >>>>>>> benefit >> > > >>>>>>>>> > in >> > > >>>>>>>>> > > spending the CPU and disk to keep it at full fidelity >> > > >>>>> (tokenize >> > > >>>>>> and >> > > >>>>>>>>> > store)) >> > > >>>>>>>>> > > (2) provide something to key on for dashboards (assuming >> > > > a >> > > >>>>> good >> > > >>>>>>> hash >> > > >>>>>>>>> > > algorithm that avoids collisions and is second preimage >> > > >>>>>> resistant) >> > > >>>>>>> and >> > > >>>>>>>>> > (3) >> > > >>>>>>>>> > > specific to errors, if the issue is that it failed to >> > > >>> index, a >> > > >>>>>> hash >> > > >>>>>>>>> gives >> > > >>>>>>>>> > > us some protection that the issue will not occur twice. >> > > >>>>>>>>> > > >> > > >>>>>>>>> > > Jon >> > > >>>>>>>>> > > >> > > >>>>>>>>> > > On Mon, Jan 23, 2017, 2:47 PM James Sirota < >> > > >>>>> jsir...@apache.org> >> > > >>>>>>> wrote: >> > > >>>>>>>>> > > >> > > >>>>>>>>> > > Jon, >> > > >>>>>>>>> > > >> > > >>>>>>>>> > > With regards to 1, collapsing to a single dashboard for >> > > >> each >> > > >>>>>> would >> > > >>>>>>> be >> > > >>>>>>>>> > > fine. So we would have one error index and one "failed >> to >> > > >>>>>> validate" >> > > >>>>>>>>> > > index. 
The distinction is that errors would be things >> > > > that >> > > >>>>> went >> > > >>>>>>> wrong >> > > >>>>>>>>> > > during stream processing (failed to parse, etc...), >> while >> > > >>>>>>> validation >> > > >>>>>>>>> > > failures are messages that explicitly failed stellar >> > > >>>>>>> validation/schema >> > > >>>>>>>>> > > enforcement. There should be relatively few of the >> second >> > > >>>>> type. >> > > >>>>>>>>> > > >> > > >>>>>>>>> > > With respect to 3, why do you want the error hashed? Why >> > > >> not >> > > >>>>> just >> > > >>>>>>>>> search >> > > >>>>>>>>> > > for the error text? >> > > >>>>>>>>> > > >> > > >>>>>>>>> > > Thanks, >> > > >>>>>>>>> > > James >> > > >>>>>>>>> > > >> > > >>>>>>>>> > > 20.01.2017, 14:01, "zeo...@gmail.com" <zeo...@gmail.com >> >: >> > > >>>>>>>>> > >> As someone who currently fills the platform engineer >> > > >> role, >> > > >>> I >> > > >>>>> can >> > > >>>>>>> give >> > > >>>>>>>>> > this >> > > >>>>>>>>> > >> idea a huge +1. My thoughts: >> > > >>>>>>>>> > >> >> > > >>>>>>>>> > >> 1. I think it depends on exactly what data is pushed >> > > > into >> > > >>> the >> > > >>>>>>> index >> > > >>>>>>>>> > (#3). >> > > >>>>>>>>> > >> However, assuming the errors you proposed recording, I >> > > >>> can't >> > > >>>>> see >> > > >>>>>>> huge >> > > >>>>>>>>> > >> benefits to having more than one dashboard. I would be >> > > >>> happy >> > > >>>>> to >> > > >>>>>> be >> > > >>>>>>>>> > >> persuaded otherwise. >> > > >>>>>>>>> > >> >> > > >>>>>>>>> > >> 2. I would say yes, storing the errors in HDFS in >> > > >> addition >> > > >>> to >> > > >>>>>>>>> indexing >> > > >>>>>>>>> > is >> > > >>>>>>>>> > >> a good thing. 
Using METRON-510 >> > > >>>>>>>>> > >> <https://issues.apache.org/jira/browse/METRON-510> as >> a >> > > >>> case >> > > >>>>>>> study, >> > > >>>>>>>>> > there >> > > >>>>>>>>> > >> is the potential in this environment for >> > > >>> attacker-controlled >> > > >>>>>> data >> > > >>>>>>> to >> > > >>>>>>>>> > > >> > > >>>>>>>>> > > result >> > > >>>>>>>>> > >> in processing errors which could be a method of evading >> > > >>>>> security >> > > >>>>>>>>> > >> monitoring. Once an attack is identified, the long term >> > > >>> HDFS >> > > >>>>>>> storage >> > > >>>>>>>>> > would >> > > >>>>>>>>> > >> allow better historical analysis for >> > > >>> low-and-slow/persistent >> > > >>>>>>> attacks >> > > >>>>>>>>> > (I'm >> > > >>>>>>>>> > >> thinking of a method of data exfil that also won't >> > > >>>>> successfully >> > > >>>>>>> get >> > > >>>>>>>>> > stored >> > > >>>>>>>>> > >> in Lucene, but is hard to identify over a short period >> > > > of >> > > >>>>> time). >> > > >>>>>>>>> > >> - Along this line, I think that there are various parts >> > > >> of >> > > >>>>>> Metron >> > > >>>>>>>>> > (this >> > > >>>>>>>>> > >> included) which could benefit from having method of >> > > >>>>> configuring >> > > >>>>>>> data >> > > >>>>>>>>> > aging >> > > >>>>>>>>> > >> by bucket in HDFS (Following Nick's comments here >> > > >>>>>>>>> > >> <https://issues.apache.org/jira/browse/METRON-477>). >> > > >>>>>>>>> > >> >> > > >>>>>>>>> > >> 3. I would potentially add a hash of the content that >> > > >>> failed >> > > >>>>>>>>> > validation to >> > > >>>>>>>>> > >> help identify repeats over time with less of a concern >> > > >> that >> > > >>>>>> you'd >> > > >>>>>>>>> have >> > > >>>>>>>>> > > >> > > >>>>>>>>> > > back >> > > >>>>>>>>> > >> to back failures (i.e. instead of storing the value >> > > >>> itself). 
>> > > >>>>>>>>> > Additionally, >> > > >>>>>>>>> > >> I think it's helpful to be able to search all times >> > > > there >> > > >>>>> was an >> > > >>>>>>>>> > indexing >> > > >>>>>>>>> > >> error (instead of it hitting the catch-all). >> > > >>>>>>>>> > >> >> > > >>>>>>>>> > >> Jon >> > > >>>>>>>>> > >> >> > > >>>>>>>>> > >> On Fri, Jan 20, 2017 at 1:17 PM James Sirota < >> > > >>>>>> jsir...@apache.org> >> > > >>>>>>>>> > wrote: >> > > >>>>>>>>> > >> >> > > >>>>>>>>> > >> We already have a capability to capture bolt errors and >> > > >>>>>> validation >> > > >>>>>>>>> > errors >> > > >>>>>>>>> > >> and pipe them into a Kafka topic. I want to propose >> that >> > > >> we >> > > >>>>>>> attach a >> > > >>>>>>>>> > >> writer topology to the error and validation failed >> kafka >> > > >>>>> topics >> > > >>>>>> so >> > > >>>>>>>>> > that we >> > > >>>>>>>>> > >> can (a) create a new ES index for these errors and (b) >> > > >>>>> create a >> > > >>>>>>> new >> > > >>>>>>>>> > Kibana >> > > >>>>>>>>> > >> dashboard to visualize them. The benefit would be that >> > > >>> errors >> > > >>>>>> and >> > > >>>>>>>>> > >> validation failures would be easier to see and analyze. >> > > >>>>>>>>> > >> >> > > >>>>>>>>> > >> I am seeking feedback on the following: >> > > >>>>>>>>> > >> >> > > >>>>>>>>> > >> - How granular would we want this feature to be? Think >> > > > we >> > > >>>>> would >> > > >>>>>>> want >> > > >>>>>>>>> > one >> > > >>>>>>>>> > >> index/dashboard per source? Or would it be better to >> > > >>> collapse >> > > >>>>>>>>> > everything >> > > >>>>>>>>> > >> into the same index? >> > > >>>>>>>>> > >> - Do we care about storing these errors in HDFS as >> well? >> > > >> Or >> > > >>>>> is >> > > >>>>>>>>> indexing >> > > >>>>>>>>> > >> them enough? >> > > >>>>>>>>> > >> - What types of errors should we record? 
I am >> proposing: >> > > >>>>>>>>> > >> >> > > >>>>>>>>> > >> For error reporting: >> > > >>>>>>>>> > >> --Message failed to parse >> > > >>>>>>>>> > >> --Enrichment failed to enrich >> > > >>>>>>>>> > >> --Threat intel feed failures >> > > >>>>>>>>> > >> --Generic catch-all for all other errors >> > > >>>>>>>>> > >> >> > > >>>>>>>>> > >> For validation reporting: >> > > >>>>>>>>> > >> --What part of message failed validation >> > > >>>>>>>>> > >> --What stellar validator caused the failure >> > > >>>>>>>>> > >> >> > > >>>>>>>>> > >> ------------------- >> > > >>>>>>>>> > >> Thank you, >> > > >>>>>>>>> > >> >> > > >>>>>>>>> > >> James Sirota >> > > >>>>>>>>> > >> PPMC- Apache Metron (Incubating) >> > > >>>>>>>>> > >> jsirota AT apache DOT org >> > > >>>>>>>>> > >> >> > > >>>>>>>>> > >> -- >> > > >>>>>>>>> > >> >> > > >>>>>>>>> > >> Jon >> > > >>>>>>>>> > >> >> > > >>>>>>>>> > >> Sent from my mobile device >> > > >>>>>>>>> > > >> > > >>>>>>>>> > > ------------------- >> > > >>>>>>>>> > > Thank you, >> > > >>>>>>>>> > > >> > > >>>>>>>>> > > James Sirota >> > > >>>>>>>>> > > PPMC- Apache Metron (Incubating) >> > > >>>>>>>>> > > jsirota AT apache DOT org >> > > >>>>>>>>> > > >> > > >>>>>>>>> > > -- >> > > >>>>>>>>> > > >> > > >>>>>>>>> > > Jon >> > > >>>>>>>>> > > >> > > >>>>>>>>> > > Sent from my mobile device >> > > >>>>>>>>> > >> > > >>>>>>>>> > ------------------- >> > > >>>>>>>>> > Thank you, >> > > >>>>>>>>> > >> > > >>>>>>>>> > James Sirota >> > > >>>>>>>>> > PPMC- Apache Metron (Incubating) >> > > >>>>>>>>> > jsirota AT apache DOT org >> > > >>>>>>>>> > >> > > >>>>>>>>> -- >> > > >>>>>>>>> >> > > >>>>>>>>> Jon >> > > >>>>>>>>> >> > > >>>>>>>>> Sent from my mobile device >> > > >>>>>>> ------------------- >> > > >>>>>>> Thank you, >> > > >>>>>>> >> > > >>>>>>> James Sirota >> > > >>>>>>> PPMC- Apache Metron (Incubating) >> > > >>>>>>> jsirota AT apache DOT org >> > > >>>>>>> >> > > >>>>> -- >> > > >>>>> >> > > >>>>> Jon >> > > >>>>> >> > > >>>>> Sent from my 
mobile device