Re: [DISCUSS] Unique id for messages

Raghu Mitra Kandikonda Mon, 13 Mar 2017 07:28:11 -0700

Thanks for the direction :) will work on this and update the thread.

-Raghu





On 10/03/17, 10:47 PM, "Casey Stella" <ceste...@gmail.com> wrote:

>Yes, we do use a UUID in the enrichment topology; it is the message join
>key for the join portion of the split/join enrichment.  The logic is in
>EnrichmentSplitterBolt.java, line 63.
>
>IMO we could pull that out, make it part of the message, and then reuse
>that unique identifier in the enrichment topology.
>
>On Fri, Mar 10, 2017 at 10:51 AM, zeo...@gmail.com <zeo...@gmail.com> wrote:
>
>> I definitely think that this is a valuable discussion.  I seem to recall
>> cstella mentioning at some point in the past that there is a UUID already
>> used in storm that we might be able to expose into the message itself, but
>> I could be wrong.
>>
>> For additional context regarding prior discussions, this was also briefly
>> discussed in another topic here
>> <https://lists.apache.org/thread.html/b039f0f0a5e6cfaf30944dc768088e1e1bd5dae4b2247dda12698805@%3Cdev.metron.apache.org%3E>.
>> In that context I was hoping to be able to link messages across all
>> indexing destinations (HDFS, ES, Solr, etc.).
>>
>> On Fri, Mar 10, 2017 at 9:26 AM Raghu Mitra Kandikonda <r...@hortonworks.com> wrote:
>>
>> > Hi All,
>> >
>> > I would like to start a discussion around adding a unique id to all
>> > parsed messages.  I believe there was an earlier discussion on a similar
>> > topic, but I am not sure that we as a community agreed on a proposal.
>> >
>> > We could
>> > - use a randomly generated identifier such as a UUID, but this might
>> >   have performance implications
>> > - use the Kafka topic name + system time + Kafka message offset to
>> >   generate a unique identifier
>> > - use the input message to generate a hash code
>> >
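A rough sketch of the three options listed above, in Java since that is what the topologies are written in. The class and method names (MessageIds, randomId, kafkaId, contentId) are hypothetical and not from the Metron code base. The partition argument is an added assumption (Kafka offsets are only unique within a partition), and SHA-256 is just one possible choice for the hash option.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

// Hypothetical helper for illustration only; not part of Metron.
class MessageIds {

    // Option 1: random (version 4) UUID. Independent of the message content,
    // so a replayed message receives a different id each time.
    static String randomId() {
        return UUID.randomUUID().toString();
    }

    // Option 2: Kafka coordinates plus system time.
    // Assumption: the partition is included because offsets repeat across partitions.
    static String kafkaId(String topic, int partition, long offset, long systemTimeMillis) {
        return topic + "-" + partition + "-" + offset + "-" + systemTimeMillis;
    }

    // Option 3: hash of the raw input message, rendered as lowercase hex.
    // Assumption: SHA-256; the thread does not fix an algorithm for this option.
    static String contentId(byte[] rawMessage) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(rawMessage)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every conforming Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] raw = "raw message".getBytes(StandardCharsets.UTF_8);
        System.out.println(randomId());
        System.out.println(kafkaId("bro", 0, 42L, System.currentTimeMillis()));
        System.out.println(contentId(raw));
    }
}
```

Only the content hash yields the same id when an identical raw message is seen again; the other two produce a fresh id per message.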
>> > Any thoughts?
>> >
>> > (Attached email that had similar discussion for error indexing)
>> >
>> > Regards,
>> > RaghuM
>> >
>> >
>> >
>> > ---------- Forwarded message ----------
>> > From: "zeo...@gmail.com" <zeo...@gmail.com>
>> > To: "dev@metron.incubator.apache.org" <dev@metron.incubator.apache.org>
>> > Cc:
>> > Bcc:
>> > Date: Wed, 1 Feb 2017 22:18:12 +0000
>> > Subject: Re: [DISCUSS] Error Indexing
>> > Simply as a unique identifier of the original information that is failing
>> > some step, which gives you something to key on so you can count unique
>> > events and prioritize issues without the concern of cyclical issues (if
>> > the issue is with indexing a specific message, and you try to index it
>> > again, it will just fail in a loop).
>> >
>> > Jon
>> >
>> > On Wed, Feb 1, 2017 at 6:59 AM Dima Kovalyov <dima.koval...@sstech.us>
>> > wrote:
>> >
>> > > That's a great topic of discussion.
>> > >
>> > > Throughout the thread, the idea of having a hash of the message that
>> > > failed has changed. Can someone please explain why you plan to use this
>> > > hash, and how?
>> > >
>> > > - Dima
>> > >
>> > > On 02/01/2017 06:23 AM, zeo...@gmail.com wrote:
>> > > > After thinking on this for a few days I recant my previous suggestion
>> > > > of TupleHash256.  It's still a bit early for SHA-3 - no good reference
>> > > > implementations/libraries exist (I did some searching and emailing), it
>> > > > is optimized for hardware but no hardware implementation is widely
>> > > > accessible, FIPS 140-3 is still not close to finalized, etc.
>> > > >
>> > > > I think we could simulate the benefits of TupleHash by sorting the
>> > > > tuples, then doing SHA-256(len(tuple1) | tuple1 | ... | len(tupleN) |
>> > > > tupleN).  Happy to entertain opposing thoughts, such as BLAKE2, etc., but
>> > > > given the likely users of Metron, I think sticking with FIPS 140-2 is a
>> > > > solid choice.
>> > > >
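A minimal sketch of the sorted, length-prefixed SHA-256 idea described above. SortedFieldHash is a hypothetical name, and the 4-byte big-endian length prefix and the UTF-8 encoding are assumptions; the thread does not specify either.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; not part of Metron.
class SortedFieldHash {

    // Computes SHA-256(len(t1) | t1 | ... | len(tN) | tN) over the sorted tuples.
    static String hash(List<String> tuples) {
        // Sort a copy so the caller's list is untouched and input order is irrelevant.
        List<String> sorted = new ArrayList<>(tuples);
        Collections.sort(sorted);
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (String tuple : sorted) {
                byte[] bytes = tuple.getBytes(StandardCharsets.UTF_8);
                // Assumption: length is encoded as a 4-byte big-endian integer.
                digest.update(ByteBuffer.allocate(4).putInt(bytes.length).array());
                digest.update(bytes);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every conforming Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hash(Arrays.asList("ip_src_addr=8.8.8.8", "ip_dst_port=53")));
    }
}
```

The length prefix is what keeps ("ab", "c") and ("a", "bc") from hashing to the same value, and the sort is what makes the result independent of field order.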
>> > > > Jon
>> > > >
>> > > > On Thu, Jan 26, 2017, 11:23 AM zeo...@gmail.com <zeo...@gmail.com> wrote:
>> > > >
>> > > > So one more thing regarding why I think we should throw an exception
>> > > > on a failed enrichment.  If we do make something like username a
>> > > > constant field, in cases where that is used to calculate
>> > > > rawMessage_hash, if it fails to enrich, the hash would be different
>> > > > compared to when it succeeds.  Of course I think the initial intent of
>> > > > adding username as a constant field would be to handle it in the
>> > > > parsers, where that information is provided in the messages themselves,
>> > > > but how would Threat Intel know the difference?  In my environment I am
>> > > > looking forward to a streaming enrichment that adds the username, where
>> > > > applicable, anywhere I have an IP.
>> > > >
>> > > > My hesitant suggestion for a hashing algorithm would be to use
>> > > > TupleHash256, as it is a NIST-specified SHA-3 derived function (built
>> > > > on cSHAKE) for this use case.  Details here
>> > > > <http://nvlpubs.nist.gov/nistpubs/specialpublications/nist.sp.800-185.pdf>.
>> > > > However, I haven't been able to find a reference implementation of this
>> > > > in any language, so that's a bit of a downside.  A more general SHA3-256
>> > > > implementation where we handle ordering could work as well, but would
>> > > > be significantly less optimal.
>> > > >
>> > > > Jon
>> > > >
>> > > > On Thu, Jan 26, 2017 at 10:20 AM Ryan Merriman <merrim...@gmail.com> wrote:
>> > > >
>> > > > Jon, I misread the code in the GenericEnrichmentBolt.  The error is
>> > > > forwarded on so no issues there.
>> > > >
>> > > > Defaulting to the common fields makes sense.  I will dig into the
>> > > > GenericEnrichmentBolt more, maybe there is a way to get the error
>> > > > fields without having to significantly change things.  Any opinion on a
>> > > > hashing algorithm?
>> > > >
>> > > > On Wed, Jan 25, 2017 at 9:37 PM, zeo...@gmail.com <zeo...@gmail.com> wrote:
>> > > >
>> > > >> Although hashing the whole message is better than nothing, it misses
>> > > >> a lot of the benefits we could get.
>> > > >>
>> > > >> While I'd love to have consistency for this field across all of the
>> > > >> different error.types, it appears that may not be reasonably possible
>> > > >> because of the parsers.  So, how about something like hash all of the
>> > > >> constant fields
>> > > >> <https://github.com/apache/incubator-metron/blob/master/metron-platform/metron-common/src/main/java/org/apache/metron/common/Constants.java>
>> > > >> excluding timestamp and original_string unless it is a parser, in
>> > > >> which case hash the entire message?  This gives us some measure of
>> > > >> event uniqueness and it can grow as we define additional constant
>> > > >> fields (I recall discussing with someone else on the list regarding
>> > > >> expanding those standard fields to include things like usernames but I
>> > > >> can't find the specific email exchange).
>> > > >>
>> > > >> Because some enrichments can be heavily relied on, I think it makes
>> > > >> sense to put a message onto the error queue when it throws an
>> > > >> exception.  Not only does this help troubleshoot edge cases, but it
>> > > >> makes issues more obvious when assembling a new enrichment in dev/test.
>> > > >> I can't think of a scenario currently where an enrichment would only be
>> > > >> "best effort" and where I wouldn't want that error indexed and
>> > > >> retrievable.  However, this gets interesting when talking about the
>> > > >> various options to solve the "Enrich enrichment" discussion from
>> > > >> earlier in the month.  We can keep that part of this separate though,
>> > > >> as I don't think that's being actively pursued right now.
>> > > >>
>> > > >> Jon
>> > > >>
>> > > >> On Wed, Jan 25, 2017 at 10:49 AM David Lyle <dlyle65...@gmail.com> wrote:
>> > > >>
>> > > >> RE: separate JIRA for MPack/Ansible. No objection to tracking them
>> > > >> separately, but for this item to be complete, you'll need both the
>> > > >> feature and the ability to install it.
>> > > >>
>> > > >> -D...
>> > > >>
>> > > >>
>> > > >> On Tue, Jan 24, 2017 at 5:33 PM, Ryan Merriman <merrim...@gmail.com> wrote:
>> > > >>
>> > > >>> Assuming we're going to write all errors to a single error topic, I
>> > > >>> think it makes sense to agree on an error message schema and handle
>> > > >>> errors across the 3 different topologies in the same way with a single
>> > > >>> implementation.  The implementation in ParserBolt
>> > > >>> (ErrorUtils.handleError) produces the most verbose error object so I
>> > > >>> think it's a good candidate for the single implementation.  Here is the
>> > > >>> message structure it currently produces:
>> > > >>>
>> > > >>> {
>> > > >>>   "exception": "java.lang.Exception: there was an error",
>> > > >>>   "hostname": "host",
>> > > >>>   "stack": "java.lang.Exception: ...",
>> > > >>>   "time": 1485295416563,
>> > > >>>   "message": "there was an error",
>> > > >>>   "rawMessage": "raw message",
>> > > >>>   "rawMessage_bytes": [],
>> > > >>>   "source.type": "bro_error"
>> > > >>> }
>> > > >>>
>> > > >>> From our discussion so far we need to add a couple of fields: an
>> > > >>> error type and a hash id.  Adding these to the message looks like:
>> > > >>>
>> > > >>> {
>> > > >>>   "exception": "java.lang.Exception: there was an error",
>> > > >>>   "hostname": "host",
>> > > >>>   "stack": "java.lang.Exception: ...",
>> > > >>>   "time": 1485295416563,
>> > > >>>   "message": "there was an error",
>> > > >>>   "rawMessage": "raw message",
>> > > >>>   "rawMessage_bytes": [],
>> > > >>>   "source.type": "bro_error",
>> > > >>>   "error.type": "parser_error",
>> > > >>>   "rawMessage_hash": "dde41b9920954f94066daf6291fb58a9"
>> > > >>> }
>> > > >>>
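A small sketch of how the two extra fields could be attached to an existing error object. ErrorMessages.withTypeAndHash is a hypothetical helper, not the existing ErrorUtils.handleError, and SHA-256 is an assumption: the thread has not settled on an algorithm, and the example hash value above is shorter than a SHA-256 digest.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Metron.
class ErrorMessages {

    // Returns a copy of the error object with "error.type" and
    // "rawMessage_hash" added; the input map is left unchanged.
    static Map<String, Object> withTypeAndHash(Map<String, Object> error,
                                               String errorType,
                                               byte[] rawMessage) {
        Map<String, Object> out = new LinkedHashMap<>(error);
        out.put("error.type", errorType);
        out.put("rawMessage_hash", sha256Hex(rawMessage));
        return out;
    }

    // Assumption: SHA-256 rendered as lowercase hex.
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every conforming Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> error = new LinkedHashMap<>();
        error.put("message", "there was an error");
        error.put("rawMessage", "raw message");
        error.put("source.type", "bro_error");
        byte[] raw = "raw message".getBytes(StandardCharsets.UTF_8);
        System.out.println(withTypeAndHash(error, "parser_error", raw));
    }
}
```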
>> > > >>> We should also consider expanding the error types I listed earlier.
>> > > >>> Instead of just having "indexing_error" we could have
>> > > >>> "elasticsearch_indexing_error", "hdfs_indexing_error" and so on.
>> > > >>>
>> > > >>> Jon, if an exception happens in an enrichment or threat intel bolt,
>> > > >>> the message is passed along with no error thrown (only logged).
>> > > >>> Everywhere else I'm having trouble identifying specific fields that
>> > > >>> should be hashed.  Would hashing the message in every case be
>> > > >>> acceptable?  Do you know of a place where we could hash a field
>> > > >>> instead?  On the topic of exceptions in enrichments, are we ok with an
>> > > >>> error only being logged and not added to the message or emitted to the
>> > > >>> error queue?
>> > > >>>
>> > > >>>
>> > > >>>
>> > > >>> On Tue, Jan 24, 2017 at 3:10 PM, Ryan Merriman <merrim...@gmail.com> wrote:
>> > > >>>
>> > > >>>> That use case makes sense to me.  I don't think it will require that
>> > > >>>> much additional effort either.
>> > > >>>>
>> > > >>>> On Tue, Jan 24, 2017 at 1:02 PM, zeo...@gmail.com <zeo...@gmail.com> wrote:
>> > > >>>>
>> > > >>>>> Regarding error vs validation - Either way I'm not very concerned.
>> > > >>>>> I initially assumed they would be combined and agree with that
>> > > >>>>> approach, but splitting them out isn't a very big deal to me either.
>> > > >>>>>
>> > > >>>>> Re: Ryan.  Yes, exactly.  In the case of a parser issue (or anywhere
>> > > >>>>> else where it's not possible to pick out the exact thing causing the
>> > > >>>>> issue) it would be a hash of the complete message.
>> > > >>>>>
>> > > >>>>> Regarding the architecture, I mostly agree with James except that I
>> > > >>>>> think step 3 needs to also be able to somehow group errors via the
>> > > >>>>> original data (identify replays, identify repeat issues with data in
>> > > >>>>> a specific field, issues with consistently different data, etc.).
>> > > >>>>> This is essentially the first step of troubleshooting, which I assume
>> > > >>>>> you are doing if you're looking at the error dashboard.
>> > > >>>>>
>> > > >>>>> If the hash gets moved out of the initial implementation, I'm fairly
>> > > >>>>> certain you lose this ability.  The point here isn't to handle long
>> > > >>>>> fields (although that's a benefit of this approach), it's to attach a
>> > > >>>>> unique identifier to the error/validation issue message that links it
>> > > >>>>> to the original problem.  I'd be happy to consider alternative
>> > > >>>>> solutions to this problem (for instance, actually sending across the
>> > > >>>>> data itself), I just haven't been able to think of another way to do
>> > > >>>>> this that I like better.
>> > > >>>>>
>> > > >>>>> Jon
>> > > >>>>>
>> > > >>>>> On Tue, Jan 24, 2017 at 1:13 PM Ryan Merriman <merrim...@gmail.com> wrote:
>> > > >>>>>
>> > > >>>>>> We also need a JIRA for any install/Ansible/MPack work needed.
>> > > >>>>>>
>> > > >>>>>> On Tue, Jan 24, 2017 at 12:06 PM, James Sirota <jsir...@apache.org> wrote:
>> > > >>>>>>
>> > > >>>>>>> Now that I had some time to think about it I would collapse all
>> > > >>>>>>> error and validation topics into one.  We can differentiate between
>> > > >>>>>>> different views of the data (split by error source etc) via Kibana
>> > > >>>>>>> dashboards.  I would implement this feature incrementally.  First I
>> > > >>>>>>> would modify all the bolts to log to a single topic.  Second, I
>> > > >>>>>>> would get the error indexing done by attaching the indexing topology
>> > > >>>>>>> to the error topic.  Third, I would create the necessary dashboards
>> > > >>>>>>> to view errors and validation failures by source.  Lastly, I would
>> > > >>>>>>> file a follow-on JIRA to introduce hashing of errors or fields that
>> > > >>>>>>> are too long.  It seems like a separate feature that we need to
>> > > >>>>>>> think through.  We may need a Stellar function around that.
>> > > >>>>>>>
>> > > >>>>>>> Thanks,
>> > > >>>>>>> James
>> > > >>>>>>>
>> > > >>>>>>> 24.01.2017, 10:25, "Ryan Merriman" <merrim...@gmail.com>:
>> > > >>>>>>>> I understand what Jon is talking about. He's proposing we hash the
>> > > >>>>>>>> value that caused the error, not necessarily the error message
>> > > >>>>>>>> itself. For an enrichment this is easy. Just pass along the field
>> > > >>>>>>>> value that failed enrichment. For other cases the field that caused
>> > > >>>>>>>> the error may not be so obvious. Take parser validation for example.
>> > > >>>>>>>> The message is validated as a whole and it may not be easy to
>> > > >>>>>>>> determine which field is the cause. In that case would a hash of the
>> > > >>>>>>>> whole message work?
>> > > >>>>>>>>
>> > > >>>>>>>> There is a broader architectural discussion that needs to happen
>> > > >>>>>>>> before we can implement this. Currently we have an indexing topology
>> > > >>>>>>>> that reads from 1 topic and writes messages to ES but errors are
>> > > >>>>>>>> written to several different topics:
>> > > >>>>>>>>
>> > > >>>>>>>>    - parser_error
>> > > >>>>>>>>    - parser_invalid
>> > > >>>>>>>>    - enrichments_error
>> > > >>>>>>>>    - threatintel_error
>> > > >>>>>>>>    - indexing_error
>> > > >>>>>>>>
>> > > >>>>>>>> I can see 4 possible approaches to implementing this:
>> > > >>>>>>>>
>> > > >>>>>>>>    1. Create an index topology for each error topic
>> > > >>>>>>>>       1. Good because we can easily reuse the indexing topology
>> > > >>>>>>>>       and it would require the least development effort
>> > > >>>>>>>>       2. Bad because it would consume a lot of extra worker slots
>> > > >>>>>>>>    2. Move the topic name into the error JSON message as a new
>> > > >>>>>>>>    "error_type" field and write all messages to the indexing topic
>> > > >>>>>>>>       1. Good because we don't need to create a new topology
>> > > >>>>>>>>       2. Bad because we would be flowing data and errors through
>> > > >>>>>>>>       the same topology. A spike in errors could affect message
>> > > >>>>>>>>       indexing.
>> > > >>>>>>>>    3. Compromise between 1 and 2. Create another indexing topology
>> > > >>>>>>>>    that is dedicated to indexing errors. Move the topic name into
>> > > >>>>>>>>    the error JSON message as a new "error_type" field and write all
>> > > >>>>>>>>    errors to a single error topic.
>> > > >>>>>>>>    4. Write a completely new topology with multiple spouts (1 for
>> > > >>>>>>>>    each error type listed above) that all feed into a single
>> > > >>>>>>>>    BulkMessageWriterBolt.
>> > > >>>>>>>>       1. Good because the current topologies would not need to
>> > > >>>>>>>>       change
>> > > >>>>>>>>       2. Bad because it would require the most development effort,
>> > > >>>>>>>>       would not reuse existing topologies and takes up more worker
>> > > >>>>>>>>       slots than 3
>> > > >>>>>>>>
>> > > >>>>>>>> Are there other approaches I haven't thought of? I think 1 and 2
>> > > >>>>>>>> are off the table because they are shortcuts and not good long-term
>> > > >>>>>>>> solutions. 3 would be my choice because it introduces less
>> > > >>>>>>>> complexity than 4. Thoughts?
>> > > >>>>>>>>
>> > > >>>>>>>> Ryan
>> > > >>>>>>>>
>> > > >>>>>>>> On Mon, Jan 23, 2017 at 5:44 PM, zeo...@gmail.com <zeo...@gmail.com> wrote:
>> > > >>>>>>>>>  In that case the hash would be of the value in the IP field, such
>> > > >>>>>>>>>  as sha3(8.8.8.8).
>> > > >>>>>>>>>
>> > > >>>>>>>>>  Jon
>> > > >>>>>>>>>
>> > > >>>>>>>>>  On Mon, Jan 23, 2017, 6:41 PM James Sirota <jsir...@apache.org> wrote:
>> > > >>>>>>>>>  > Jon,
>> > > >>>>>>>>>  >
>> > > >>>>>>>>>  > I am still not entirely following why we would want to use
>> > > >>>>>>>>>  > hashing. For example if my error is "Your IP field is invalid and
>> > > >>>>>>>>>  > failed validation" hashing this error string will always result in
>> > > >>>>>>>>>  > the same hash. Why not just use the actual error string? Can you
>> > > >>>>>>>>>  > provide an example where you would use it?
>> > > >>>>>>>>>  >
>> > > >>>>>>>>>  > Thanks,
>> > > >>>>>>>>>  > James
>> > > >>>>>>>>>  >
>> > > >>>>>>>>>  > 23.01.2017, 16:29, "zeo...@gmail.com" <zeo...@gmail.com>:
>> > > >>>>>>>>>  > > For 1 - I'm good with that.
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > I'm talking about hashing the relevant content itself, not the
>> > > >>>>>>>>>  > > error. Some benefits are (1) minimize load on search index
>> > > >>>>>>>>>  > > (there's minimal benefit in spending the CPU and disk to keep it
>> > > >>>>>>>>>  > > at full fidelity (tokenize and store)), (2) provide something to
>> > > >>>>>>>>>  > > key on for dashboards (assuming a good hash algorithm that avoids
>> > > >>>>>>>>>  > > collisions and is second preimage resistant) and (3) specific to
>> > > >>>>>>>>>  > > errors, if the issue is that it failed to index, a hash gives us
>> > > >>>>>>>>>  > > some protection that the issue will not occur twice.
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > Jon
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > On Mon, Jan 23, 2017, 2:47 PM James Sirota <jsir...@apache.org> wrote:
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > Jon,
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > With regards to 1, collapsing to a single dashboard for each
>> > > >>>>>>>>>  > > would be fine. So we would have one error index and one "failed
>> > > >>>>>>>>>  > > to validate" index. The distinction is that errors would be
>> > > >>>>>>>>>  > > things that went wrong during stream processing (failed to parse,
>> > > >>>>>>>>>  > > etc...), while validation failures are messages that explicitly
>> > > >>>>>>>>>  > > failed Stellar validation/schema enforcement. There should be
>> > > >>>>>>>>>  > > relatively few of the second type.
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > With respect to 3, why do you want the error hashed? Why not just
>> > > >>>>>>>>>  > > search for the error text?
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > Thanks,
>> > > >>>>>>>>>  > > James
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > 20.01.2017, 14:01, "zeo...@gmail.com" <zeo...@gmail.com>:
>> > > >>>>>>>>>  > >> As someone who currently fills the platform engineer role, I can
>> > > >>>>>>>>>  > >> give this idea a huge +1. My thoughts:
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> 1. I think it depends on exactly what data is pushed into the
>> > > >>>>>>>>>  > >> index (#3). However, assuming the errors you proposed recording,
>> > > >>>>>>>>>  > >> I can't see huge benefits to having more than one dashboard. I
>> > > >>>>>>>>>  > >> would be happy to be persuaded otherwise.
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> 2. I would say yes, storing the errors in HDFS in addition to
>> > > >>>>>>>>>  > >> indexing is a good thing. Using METRON-510
>> > > >>>>>>>>>  > >> <https://issues.apache.org/jira/browse/METRON-510> as a case
>> > > >>>>>>>>>  > >> study, there is the potential in this environment for
>> > > >>>>>>>>>  > >> attacker-controlled data to result in processing errors which
>> > > >>>>>>>>>  > >> could be a method of evading security monitoring. Once an attack
>> > > >>>>>>>>>  > >> is identified, the long term HDFS storage would allow better
>> > > >>>>>>>>>  > >> historical analysis for low-and-slow/persistent attacks (I'm
>> > > >>>>>>>>>  > >> thinking of a method of data exfil that also won't successfully
>> > > >>>>>>>>>  > >> get stored in Lucene, but is hard to identify over a short period
>> > > >>>>>>>>>  > >> of time).
>> > > >>>>>>>>>  > >> - Along this line, I think that there are various parts of Metron
>> > > >>>>>>>>>  > >> (this included) which could benefit from having a method of
>> > > >>>>>>>>>  > >> configuring data aging by bucket in HDFS (Following Nick's
>> > > >>>>>>>>>  > >> comments here
>> > > >>>>>>>>>  > >> <https://issues.apache.org/jira/browse/METRON-477>).
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> 3. I would potentially add a hash of the content that failed
>> > > >>>>>>>>>  > >> validation to help identify repeats over time with less of a
>> > > >>>>>>>>>  > >> concern that you'd have back to back failures (i.e. instead of
>> > > >>>>>>>>>  > >> storing the value itself). Additionally, I think it's helpful to
>> > > >>>>>>>>>  > >> be able to search all times there was an indexing error (instead
>> > > >>>>>>>>>  > >> of it hitting the catch-all).
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> Jon
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> On Fri, Jan 20, 2017 at 1:17 PM James Sirota <jsir...@apache.org> wrote:
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> We already have a capability to capture bolt errors and
>> > > >>>>>>>>>  > >> validation errors and pipe them into a Kafka topic. I want to
>> > > >>>>>>>>>  > >> propose that we attach a writer topology to the error and
>> > > >>>>>>>>>  > >> validation failed Kafka topics so that we can (a) create a new ES
>> > > >>>>>>>>>  > >> index for these errors and (b) create a new Kibana dashboard to
>> > > >>>>>>>>>  > >> visualize them. The benefit would be that errors and validation
>> > > >>>>>>>>>  > >> failures would be easier to see and analyze.
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> I am seeking feedback on the following:
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> - How granular would we want this feature to be? Think we would
>> > > >>>>>>>>>  > >> want one index/dashboard per source? Or would it be better to
>> > > >>>>>>>>>  > >> collapse everything into the same index?
>> > > >>>>>>>>>  > >> - Do we care about storing these errors in HDFS as well? Or is
>> > > >>>>>>>>>  > >> indexing them enough?
>> > > >>>>>>>>>  > >> - What types of errors should we record? I am proposing:
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> For error reporting:
>> > > >>>>>>>>>  > >> --Message failed to parse
>> > > >>>>>>>>>  > >> --Enrichment failed to enrich
>> > > >>>>>>>>>  > >> --Threat intel feed failures
>> > > >>>>>>>>>  > >> --Generic catch-all for all other errors
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> For validation reporting:
>> > > >>>>>>>>>  > >> --What part of message failed validation
>> > > >>>>>>>>>  > >> --What stellar validator caused the failure
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> -------------------
>> > > >>>>>>>>>  > >> Thank you,
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> James Sirota
>> > > >>>>>>>>>  > >> PPMC- Apache Metron (Incubating)
>> > > >>>>>>>>>  > >> jsirota AT apache DOT org
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> --
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> Jon
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> Sent from my mobile device
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > -------------------
>> > > >>>>>>>>>  > > Thank you,
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > James Sirota
>> > > >>>>>>>>>  > > PPMC- Apache Metron (Incubating)
>> > > >>>>>>>>>  > > jsirota AT apache DOT org
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > --
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > Jon
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > Sent from my mobile device
>> > > >>>>>>>>>  >
>> > > >>>>>>>>>  > -------------------
>> > > >>>>>>>>>  > Thank you,
>> > > >>>>>>>>>  >
>> > > >>>>>>>>>  > James Sirota
>> > > >>>>>>>>>  > PPMC- Apache Metron (Incubating)
>> > > >>>>>>>>>  > jsirota AT apache DOT org
>> > > >>>>>>>>>  >
>> > > >>>>>>>>>  --
>> > > >>>>>>>>>
>> > > >>>>>>>>>  Jon
>> > > >>>>>>>>>
>> > > >>>>>>>>>  Sent from my mobile device
>> > > >>>>>>> -------------------
>> > > >>>>>>> Thank you,
>> > > >>>>>>>
>> > > >>>>>>> James Sirota
>> > > >>>>>>> PPMC- Apache Metron (Incubating)
>> > > >>>>>>> jsirota AT apache DOT org
>> > > >>>>>>>
>> > > >>>>> --
>> > > >>>>>
>> > > >>>>> Jon
>> > > >>>>>
>> > > >>>>> Sent from my mobile device
>> > > >>>>>
>> > > >>>>
>> > > >> --
>> > > >>
>> > > >> Jon
>> > > >>
>> > > >> Sent from my mobile device
>> > > >>
>> > >
>> > > --
>> >
>> > Jon
>> >
>> > Sent from my mobile device
>> >
>> > --
>>
>> Jon
>>
