PutMetronEnrichementRecords*  ;)

On June 5, 2018 at 10:32:43, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

Do we, the community, think it would be a good idea to create a
PutMetronEnrichment NiFi processor for this use case? It seems a number of
people want to use NiFi to manage and schedule loading of enrichments for
example.

Simon

On 5 June 2018 at 06:56, Casey Stella <ceste...@gmail.com> wrote:

> The problem, as you correctly diagnosed, is the key in HBase. We
construct
> the key very specifically in Metron, so it's unlikely to work out of the
> box with the NiFi processor unfortunately. The key that we use is formed
> here in the codebase:
> https://github.com/cestella/incubator-metron/blob/master/
> metron-platform/metron-enrichment/src/main/java/org/
> apache/metron/enrichment/converter/EnrichmentKey.java#L51
>
> To put that in english, consider the following:
>
> - type - The enrichment type
> - indicator - the indicator to use
> - hash(*) - A murmur 3 128bit hash function
>
> the key is hash(indicator) + type + indicator
>
> This hash prefixing is a standard practice in hbase key design that
allows
> the keys to be uniformly distributed among the regions and prevents
> hotspotting. Depending on how the PutHBaseJSON processor works, if you
can
> construct the key and pass it in, then you might be able to either
> construct the key in NiFi or write a processor to construct the key.
> Ultimately though, what Carolyn said is true..the easiest approach is
> probably using the flatfile loader.
> If you do get this working in NiFi, however, do please let us know and/or
> consider contributing it back to the project as a PR :)
>
>
>
> On Fri, Jun 1, 2018 at 6:26 AM Charles Joynt <
> charles.jo...@gresearch.co.uk>
> wrote:
>
> > Hello,
> >
> > I work as a Dev/Ops Data Engineer within the security team at a company
> in
> > London where we are in the process of implementing Metron. I have been
> > tasked with implementing feeds of network environment data into HBase
so
> > that this data can be used as enrichment sources for our security
events.
> > First-off I wanted to pull in DNS data for an internal domain.
> >
> > I am assuming that I need to write data into HBase in such a way that
it
> > exactly matches what I would get from the flatfile_loader.sh script. A
> > colleague of mine has already loaded some DNS data using that script,
so
> I
> > am using that as a reference.
> >
> > I have implemented a flow in NiFi which takes JSON data from a HTTP
> > listener and routes it to a PutHBaseJSON processor. The flow is
working,
> in
> > the sense that data is successfully written to HBase, but despite
> (naively)
> > specifying "Row Identifier Encoding Strategy = Binary", the results in
> > HBase don't look correct. Comparing the output from HBase scan commands
I
> > see:
> >
> > flatfile_loader.sh produced:
> >
> > ROW:
> > \xFF\xFE\xCB\xB8\xEF\x92\xA3\xD9#xC\xF9\xAC\x0Ap\x1E\x00\
> x05whois\x00\x0E192.168.0.198
> > CELL: column=data:v, timestamp=1516896203840,
> > value={"clientname":"server.domain.local","clientip":"192.168.0.198"}
> >
> > PutHBaseJSON produced:
> >
> > ROW: server.domain.local
> > CELL: column=dns:v, timestamp=1527778603783,
> > value={"name":"server.domain.local","type":"A","data":"192.168.0.198"}
> >
> > From source JSON:
> >
> >
> > {"k":"server.domain.local","v":{"name":"server.domain.local"
> ,"type":"A","data":"192.168.0.198"}}
> >
> > I know that there are some differences in column family / field names,
> but
> > my worry is the ROW id. Presumably I need to encode my row key, "k" in
> the
> > JSON data, in a way that matches how the flatfile_loader.sh script did
> it.
> >
> > Can anyone explain how I might convert my Id to the correct format?
> > -or-
> > Does this matter-can Metron use the human-readable ROW ids?
> >
> > Charlie Joynt
> >
> > --------------
> > G-RESEARCH believes the information provided herein is reliable. While
> > every care has been taken to ensure accuracy, the information is
> furnished
> > to the recipients with no warranty as to the completeness and accuracy
of
> > its contents and on condition that any errors or omissions shall not be
> > made the basis of any claim, demand or cause of action.
> > The information in this email is intended only for the named recipient.
> > If you are not the intended recipient please notify us immediately and
do
> > not copy, distribute or take action based on this e-mail.
> > All messages sent to and from this e-mail address will be logged by
> > G-RESEARCH and are subject to archival storage, monitoring, review and
> > disclosure.
> > G-RESEARCH is the trading name of Trenchant Limited, 5th Floor,
> > Whittington House, 19-30 Alfred Place, London WC1E 7EA.
> > Trenchant Limited is a company registered in England with company
number
> > 08127121.
> > --------------
> >
>



-- 
-- 
simon elliston ball
@sireb

Reply via email to