On the sharing and securing points, it seems like having the result of a
query run persist in HDFS at a given URL would work. We can then use
Ranger policies (RBAC, or even ABAC later) to control access. This also
solves the page storage problem, and gives us a kind of two-step approach:
the initial search over everything, and the subsequent sort / page through
the results, either of which could (maybe, possibly, but probably not) be
large enough to need distribution. Does anyone imagine sorting being
needed? Maybe sub-filtering, but PCAP is heavily time based, so is a
timestamp sort OK?
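
To make the Ranger idea concrete, here is a minimal sketch of the kind of
layout I have in mind (paths and names are purely illustrative, not a
proposal for the actual structure):

  /apps/metron/pcap/results/<username>/<jobId>/page-00001.pcap
  /apps/metron/pcap/results/<username>/<jobId>/page-00002.pcap
  /apps/metron/pcap/results/<username>/<jobId>/_metadata.json

A Ranger HDFS resource policy scoped to the per-user prefix (Ranger's
{USER} macro in resource paths should cover this, if memory serves) would
give each analyst access to their own result sets and an admin role access
to all of them, without the REST layer having to reimplement ACLs itself.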

I also suspect it's worth considering making the lifecycle of our stored
result sets metadata driven. If I'm doing a speculative search, I don't
really care if an admin cleans it up at the end of the day / week / when we
hit a disk space nervousness limit. However, if I find something good, I
might want to mark the result set as immune from automatic deletion.
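
As a rough illustration of the kind of metadata I mean (all field names
and values here are made up):

  {
    "jobId": "job_12345",
    "owner": "analyst1",
    "createdAt": "2018-05-11T15:40:00Z",
    "expiresAt": "2018-05-18T15:40:00Z",
    "protected": false
  }

where "protected" is the flag a user flips to opt a result set out of
automatic cleanup, and the cleanup process simply skips anything that has
it set and removes everything else past its expiry.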

The other issue I would raise, which has implications for our PCAP capture,
and also impacts Otto's suggestion of 'self-uploaded' PCAPs, is how we
namespace PCAP collection and retrieval. The problem here is that I might
have PCAPs from multiple locations with conflicting private IP ranges, so I
can't logically dump them all in the same repository. Solving the
collection end of that is probably a separate unit of effort, but this
retrieval architecture should support multiple file system locations.
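
In practice that probably just means treating the basePath that Ryan's
metadata endpoint already accepts as a first-class parameter rather than a
single configured default, so a query is always scoped to one namespace,
e.g. (purely illustrative paths):

  /apps/metron/pcap/site-london
  /apps/metron/pcap/site-newyork
  /apps/metron/pcap/uploads/<username>

and PCAPs with overlapping RFC 1918 ranges never end up mixed into a single
result set.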

If we wanted to get fancy about it, we could look at using the stored
result sets as a kind of cache for other queries. As people refine and
narrow down their queries, it may make sense to be more sophisticated about
where our query jobs pull from (i.e. filter the subset from a previous
result set, rather than scanning petabytes of source data). This may imply
some kind of TOC for the cache. The underlying immutability of the PCAP
store should make this fairly tractable.
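
A minimal sketch of what a TOC entry for that cache might hold (field names
here are only indicative, loosely based on the fixed query filter options):

  {
    "jobId": "job_12345",
    "filter": { "ip_src_addr": "192.168.1.1", "protocol": "TCP" },
    "startTime": 1525737600000,
    "endTime": 1525824000000,
    "resultPath": "/apps/metron/pcap/results/analyst1/job_12345"
  }

A new query whose filter strictly narrows an existing entry, and whose time
window falls inside it, could then run against resultPath instead of the
raw store.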

FYI, I've been doing a lot of thinking recently around data security, API
and configuration security, and auditing, but I suspect that belongs in a
different DISCUSS thread. I'll kick something off shortly with a few
thoughts.

I see a lot of this as long-term goals, to be honest, so as Jon says, we
can definitely take a few baby steps to start.

Simon

On 11 May 2018 at 15:40, Otto Fowler <ottobackwa...@gmail.com> wrote:

> Don’t lose the use case for manually uploading PCAPS for analysis Jon.
>
>
> On May 11, 2018 at 10:14:02, zeo...@gmail.com (zeo...@gmail.com) wrote:
>
> I think baby steps are fine - admin gets access to all, otherwise you only
> see your own pcaps, but we file a jira for a future add of API security,
> which more mature SOCs that align with the Metron personas will need.
>
> Jon
>
> On Fri, May 11, 2018, 09:27 Ryan Merriman <merrim...@gmail.com> wrote:
>
> > That's a good point Jon. There are different levels of effort associated
> > with different options. If we want to allow pcaps to be shared with
> > specific users, we will need to introduce ACL security in our REST
> > application using something like the ACL capability that comes with
> Spring
> > Security or Ranger. This would be more complex to design and implement.
> > If we want something more broad like admin roles that can see all or
> > allowing pcap files to become public, this would be less work. Do you
> > think ACL security is required or would the other options be acceptable?
> >
> > On Thu, May 10, 2018 at 2:47 PM, zeo...@gmail.com <zeo...@gmail.com>
> > wrote:
> >
> > > At the very least there needs to be the ability to share downloaded
> PCAPs
> > > with other users and/or have roles that can see all pcaps. A platform
> > > engineer may want to clean up old pcaps after x time, or a manger may
> ask
> > > an analyst to find all of the traffic that exhibits xyz behavior, dump
> a
> > > pcap, and then point him to it so the manager can review. Since the
> > > pcap may be huge, we wouldn't want to try to push people to sending it
> > via
> > > email, uploading to a file server, finding an external hard drive, etc.
> > >
> > > Jon
> > >
> > > On Thu, May 10, 2018 at 10:16 AM Ryan Merriman <merrim...@gmail.com>
> > > wrote:
> > >
> > > > Mike, I believe the /pcapGetter/getPcapsByIdentifiers endpoint
> exposes
> > > the
> > > > fixed query option which we have covered. I agree with you that
> > > > deprecating the metron-api module should be a goal of this feature.
> > > >
> > > > On Wed, May 9, 2018 at 1:36 PM, Michael Miklavcic <
> > > > michael.miklav...@gmail.com> wrote:
> > > >
> > > > > This looks like a pretty good start Ryan. Does the metadata
> endpoint
> > > > cover
> > > > > this https://github.com/apache/metron/tree/master/
> > > > > metron-platform/metron-api#the-pcapgettergetpcapsbyidentifier
> > > s-endpoint
> > > > > from the original metron-api? If so, then we would be able to
> > deprecate
> > > > the
> > > > > existing metron-api project. If we later go to micro-services, a
> pcap
> > > > > module would spin back into the fold, but it would probably look
> > > > different
> > > > > from metron-api.
> > > > >
> > > > > I commented on the UI thread, but to reiterate for the purpose of
> > > backend
> > > > > functionality here I don't believe there is a way to "PAUSE" or
> > > "SUSPEND"
> > > > > jobs. That said, I think GET /api/v1/pcap/stop/<jobId> is
> sufficient
> > > for
> > > > > the job management operations.
> > > > >
> > > > > On Wed, May 9, 2018 at 11:00 AM, Ryan Merriman <
> merrim...@gmail.com>
>
> > > > > wrote:
> > > > >
> > > > > > Now that we are confident we can run submit a MR job from our
> > current
> > > > > REST
> > > > > > application, is this the desired approach? Just want to confirm.
> > > > > >
> > > > > > Next I think we should map out what the REST interface will look
> > > like.
> > > > > > Here are the endpoints I'm thinking about:
> > > > > >
> > > > > > GET /api/v1/pcap/metadata?basePath
> > > > > >
> > > > > > This endpoint will return metadata of pcap data stored in HDFS.
> > This
> > > > > would
> > > > > > include pcap size, date ranges (how far back can I go), etc. It
> > > would
> > > > > > accept an optional HDFS basePath parameter for cases where pcap
> > data
> > > is
> > > > > > stored in multiple places and/or different from the default
> > location.
> > > > > >
> > > > > > POST /api/v1/pcap/query
> > > > > >
> > > > > > This endpoint would accept a pcap request, submit a pcap query
> job,
> > > and
> > > > > > return a job id. The request would be an object containing the
> > > > > parameters
> > > > > > documented here: https://github.com/apache/metron/tree/master/
> > > > > > metron-platform/metron-pcap-backend#query-filter-utility. A
> > > query/job
> > > > > > would be associated with a user that submits it. An exception
> will
> > > be
> > > > > > returned for violating constraints like too many queries
> submitted,
> > > > query
> > > > > > parameters out of limits, etc.
> > > > > >
> > > > > > GET /api/v1/pcap/status/<jobId>
> > > > > >
> > > > > > This endpoint will return the status of a running job. I imagine
> > > this
> > > > is
> > > > > > just a proxy to the YARN REST api. We can discuss the
> > implementation
> > > > > > behind these endpoints later.
> > > > > >
> > > > > > GET /api/v1/pcap/stop/<jobId>
> > > > > >
> > > > > > This endpoint would kill a running pcap job. If the job has
> > already
> > > > > > completed this is a noop.
> > > > > >
> > > > > > GET /api/v1/pcap/list
> > > > > >
> > > > > > This endpoint will list a user's submitted pcap queries. Items in
> > > the
> > > > > list
> > > > > > would contain job id, status (is it finished?), start/end time,
> and
> > > > > number
> > > > > > of pages. Maybe there is some overlap with the status endpoint
> > above
> > > > and
> > > > > > the status endpoint is not needed?
> > > > > >
> > > > > > GET /api/v1/pcap/pdml/<jobId>/<pageNumber>
> > > > > >
> > > > > > This endpoint will return pcap results for the given page in pdml
> > > > format
> > > > > (
> > > > > > https://wiki.wireshark.org/PDML). Are there other formats we
> want
> > > to
> > > > > > support?
> > > > > >
> > > > > > GET /api/v1/pcap/raw/<jobId>/<pageNumber>
> > > > > >
> > > > > > This endpoint will allow a user to download raw pcap results for
> > the
> > > > > given
> > > > > > page.
> > > > > >
> > > > > > DELETE /api/v1/pcap/<jobId>
> > > > > >
> > > > > > This endpoint will delete pcap query results. Not sure yet how
> > this
> > > > fits
> > > > > > in with our broader cleanup strategy.
> > > > > >
> > > > > > This should get us started. What did I miss and what would you
> > > change
> > > > > > about these? I did not include much detail related to security,
> > > > cleanup
> > > > > > strategy, or underlying implementation details but these are
> items
> > we
> > > > > > should discuss at some point.
> > > > > >
> > > > > > On Tue, May 8, 2018 at 5:38 PM, Michael Miklavcic <
> > > > > > michael.miklav...@gmail.com> wrote:
> > > > > >
> > > > > > > Sweet! That's great news. The pom changes are a lot simpler
> than
> > I
> > > > > > > expected. Very nice.
> > > > > > >
> > > > > > > On Tue, May 8, 2018 at 4:35 PM, Ryan Merriman <
> > merrim...@gmail.com
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > Finally figured it out. Commit is here:
> > > > > > > > https://github.com/merrimanr/incubator-metron/commit/
> > > > > > > > 22fe5e9ff3c167b42ebeb7a9f1000753a409aff1
> > > > > > > >
> > > > > > > > It came down to figuring out the right combination of maven
> > > > > > dependencies
> > > > > > > > and passing in the HDP version to REST as a Java system
> > property.
> > > > I
> > > > > > also
> > > > > > > > included some HDFS setup tasks. I tested this in full dev and
> > > can
> > > > > now
> > > > > > > > successfully run a pcap query and get results. All you should
> > > have
> > > > > to
> > > > > > do
> > > > > > > > is generate some pcap data first.
> > > > > > > >
> > > > > > > > On Tue, May 8, 2018 at 4:17 PM, Michael Miklavcic <
> > > > > > > > michael.miklav...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > @Ryan - pulled your branch and experimented with a few
> > things.
> > > In
> > > > > > doing
> > > > > > > > so,
> > > > > > > > > it dawned on me that by adding the yarn and hadoop
> classpath,
> > > you
> > > > > > > > probably
> > > > > > > > > didn't introduce a new classpath issue, rather you probably
> > > just
> > > > > > moved
> > > > > > > > onto
> > > > > > > > > the next classpath issue, ie hbase per your exception about
> > > hbase
> > > > > > jaxb.
> > > > > > > > > Anyhow, I put up a branch with some pom changes worth
> trying
> > in
> > > > > > > > conjunction
> > > > > > > > > with invoking the rest app startup via "/usr/bin/yarn jar"
> > > > > > > > >
> > > > > > > > > https://github.com/mmiklavc/metron/tree/ryan-rest-test
> > > > > > > > >
> > > > > > > > > https://github.com/mmiklavc/metron/commit/
> > > > > > > 5ca23580fc6e043fafae2327c80b65
> > > > > > > > > b20ca1c0c9
> > > > > > > > >
> > > > > > > > > Mike
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Tue, May 8, 2018 at 7:44 AM, Simon Elliston Ball <
> > > > > > > > > si...@simonellistonball.com> wrote:
> > > > > > > > >
> > > > > > > > > > That would be a step closer to something more like a
> > > > > micro-service
> > > > > > > > > > architecture. However, I would want to make sure we think
> > > about
> > > > > the
> > > > > > > > > > operational complexity, and mpack implications of having
> > > > another
> > > > > > > server
> > > > > > > > > > installed and running somewhere on the cluster (also,
> ssl,
> > > > > > kerberos,
> > > > > > > > etc
> > > > > > > > > > etc requirements for that service).
> > > > > > > > > >
> > > > > > > > > > On 8 May 2018 at 14:27, Ryan Merriman <
> merrim...@gmail.com
> > >
> > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > +1 to having metron-api as it's own service and using a
> > > > gateway
> > > > > > > type
> > > > > > > > > > > pattern.
> > > > > > > > > > >
> > > > > > > > > > > On Tue, May 8, 2018 at 8:13 AM, Otto Fowler <
> > > > > > > ottobackwa...@gmail.com
> > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Why not have metron-api as it’s own service and use a
> > > > > ‘gateway’
> > > > > > > > type
> > > > > > > > > > > > pattern in rest?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On May 8, 2018 at 08:45:33, Ryan Merriman (
> > > > > merrim...@gmail.com
> > > > > > )
> > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Moving the yarn classpath command earlier in the
> > > classpath
> > > > > now
> > > > > > > > gives
> > > > > > > > > > this
> > > > > > > > > > > > error:
> > > > > > > > > > > >
> > > > > > > > > > > > Caused by: java.lang.NoSuchMethodError:
> > > > > > > > > > > > javax.servlet.ServletContext.
> > > getVirtualServerName()Ljava/
> > > > > > > > > lang/String;
> > > > > > > > > > > >
> > > > > > > > > > > > I will experiment with other combinations, I suspect
> we
> > > > will
> > > > > > need
> > > > > > > > > > > > finer-grain control over the order.
> > > > > > > > > > > >
> > > > > > > > > > > > The grep matches class names inside jar files. I use
> > this
> > > > all
> > > > > > the
> > > > > > > > > time
> > > > > > > > > > > and
> > > > > > > > > > > > it's really useful.
> > > > > > > > > > > >
> > > > > > > > > > > > The metron-rest jar is already shaded.
> > > > > > > > > > > >
> > > > > > > > > > > > Reverse engineering the yarn jar command was the next
> > > > thing I
> > > > > > was
> > > > > > > > > going
> > > > > > > > > > > to
> > > > > > > > > > > > try. Will let you know how it goes.
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, May 8, 2018 at 12:36 AM, Michael Miklavcic <
> > > > > > > > > > > > michael.miklav...@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > What order did you add the hadoop or yarn
> classpath?
> > > The
> > > > > > > "shaded"
> > > > > > > > > > > > package
> > > > > > > > > > > > > stands out to me in this name
> > > "org.apache.hadoop.hbase.*
> > > > > > > shaded*
> > > > > > > > > > > > >
> .org.codehaus.jackson.jaxrs.JacksonJaxbJsonProvider."
> > > > > Maybe
> > > > > > > try
> > > > > > > > > > adding
> > > > > > > > > > > > > those packages earlier on the classpath.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think that find command needs a "jar tvf",
> > otherwise
> > > > > you're
> > > > > > > > > looking
> > > > > > > > > > > > for a
> > > > > > > > > > > > > class name in jar file names.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Have you tried shading the rest jar?
> > > > > > > > > > > > >
> > > > > > > > > > > > > I'd also look at the classpath you get when running
> > > "yarn
> > > > > > jar"
> > > > > > > to
> > > > > > > > > > start
> > > > > > > > > > > > the
> > > > > > > > > > > > > existing pcap service, per the instructions in
> > > > > > > > > metron-api/README.md.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, May 7, 2018 at 3:28 PM, Ryan Merriman <
> > > > > > > > merrim...@gmail.com
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > To explore the idea of merging metron-api into
> > > > > metron-rest
> > > > > > > and
> > > > > > > > > > > running
> > > > > > > > > > > > > pcap
> > > > > > > > > > > > > > queries inside our REST application, I created a
> > > simple
> > > > > > test
> > > > > > > > > here:
> > > > > > > > > > > > > >
> > > > https://github.com/merrimanr/incubator-metron/tree/pcap-
> > > > > > > > > rest-test.
> > > > > > > > > > A
> > > > > > > > > > > > > > summary of what's included:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - Added pcap as a dependency in the metron-rest
> > > pom.xml
> > > > > > > > > > > > > > - Added a pcap query controller endpoint at
> > > > > > > > > > > > > > http://node1:8082/swagger-ui.
> > > > > html#!/pcap-query-controller/
> > > > > > > > > > > > > queryUsingGET
> > > > > > > > > > > > > > - Added a pcap query service that runs a simple,
> > > > > hardcoded
> > > > > > > > query
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Generate some pcap data using pycapa (
> > > > > > > > > > > > > >
> > https://github.com/apache/metron/tree/master/metron-
> > > > > > > > > sensors/pycapa
> > > > > > > > > > )
> > > > > > > > > > > > and
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > pcap topology (
> > > > > > > > > > > > > >
> > https://github.com/apache/metron/tree/master/metron-
> > > > > > > > > > > > > >
> > platform/metron-pcap-backend#starting-the-topology).
> > > > > > > > > > > > > > After this initial setup there should be data in
> > HDFS
> > > > at
> > > > > > > > > > > > > > "/apps/metron/pcap". I believe this should be
> > enough
> > > to
> > > > > > > > exercise
> > > > > > > > > > the
> > > > > > > > > > > > > > issue. Just hit the endpoint referenced above. I
> > > tested
> > > > > > this
> > > > > > > in
> > > > > > > > > an
> > > > > > > > > > > > > > already running full dev by building and
> deploying
> > > the
> > > > > > > > > metron-rest
> > > > > > > > > > > > jar.
> > > > > > > > > > > > > I
> > > > > > > > > > > > > > did not rebuild full dev with this change but I
> > would
> > > > > still
> > > > > > > > > expect
> > > > > > > > > > it
> > > > > > > > > > > > to
> > > > > > > > > > > > > > work. Let me know if it doesn't.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The first error I see when I hit this endpoint
> is:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > java.lang.NoClassDefFoundError:
> > > > > > > > > > > > > > org/apache/hadoop/yarn/webapp/
> > > > > YarnJacksonJaxbJsonProvider.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Here are the things I've tried so far:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - Run the REST application with the YARN jar
> > command
> > > > > since
> > > > > > > this
> > > > > > > > > is
> > > > > > > > > > > how
> > > > > > > > > > > > > > all our other YARN/MR-related applications are
> > > started
> > > > > > > > > (metron-api,
> > > > > > > > > > > > > > MAAS,
> > > > > > > > > > > > > > pcap query, etc). I wouldn't expect this to work
> > > since
> > > > we
> > > > > > > have
> > > > > > > > > > > > > runtime
> > > > > > > > > > > > > > dependencies on our shaded elasticsearch and
> parser
> > > > jars
> > > > > > and
> > > > > > > > I'm
> > > > > > > > > > not
> > > > > > > > > > > > > > aware
> > > > > > > > > > > > > > of a way to add additional jars to the classpath
> > with
> > > > the
> > > > > > > YARN
> > > > > > > > > jar
> > > > > > > > > > > > > > command
> > > > > > > > > > > > > > (is there a way?). Either way I get this error:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 18/05/04 19:49:56 WARN reflections.Reflections:
> > could
> > > > not
> > > > > > > > create
> > > > > > > > > > Dir
> > > > > > > > > > > > > using
> > > > > > > > > > > > > > jarFile from url file:/usr/hdp/2.6.4.0-91/
> > > > > > > > hadoop/lib/ojdbc6.jar.
> > > > > > > > > > > > > skipping.
> > > > > > > > > > > > > > java.lang.NullPointerException
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - I tried adding `yarn classpath` and `hadoop
> > > > classpath`
> > > > > to
> > > > > > > the
> > > > > > > > > > > > > > classpath in /usr/metron/0.4.3/bin/metron-
> rest.sh
> > > (REST
> > > > > > > start
> > > > > > > > > > > > > > script). I
> > > > > > > > > > > > > > get this error:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > java.lang.ClassNotFoundException:
> > > > > > > > > > > > > >
> > org.apache.hadoop.hbase.shaded.org.codehaus.jackson.
> > > > > > > > > > > > > > jaxrs.JacksonJaxbJsonProvider
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - I searched for the class in the previous
> attempt
> > > but
> > > > > > could
> > > > > > > > not
> > > > > > > > > > find
> > > > > > > > > > > > > it
> > > > > > > > > > > > > > in full dev:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > find / -name "*.jar" 2>/dev/null | xargs grep
> > > > > > > > > > > > > >
> > org/apache/hadoop/hbase/shaded/org/codehaus/jackson/
> > > > > > > > > > > > > > jaxrs/JacksonJaxbJsonProvider
> > > > > > > > > > > > > > 2>/dev/null
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - Further up in the stack trace I see the error
> > > happens
> > > > > > when
> > > > > > > > > > > > > initiating
> > > > > > > > > > > > > > the org.apache.hadoop.yarn.util.
> > > timeline.TimelineUtils
> > > > > > > class.
> > > > > > > > I
> > > > > > > > > > > > > tried
> > > > > > > > > > > > > > setting "yarn.timeline-service.enabled" in
> Ambari
> > to
> > > > > false
> > > > > > > and
> > > > > > > > > > then
> > > > > > > > > > > I
> > > > > > > > > > > > > > get
> > > > > > > > > > > > > > this error:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Unable to parse
> > > > > > > > > > > > > >
> > > > '/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-
> > > > > > > > > framework'
> > > > > > > > > > > as
> > > > > > > > > > > > a
> > > > > > > > > > > > > > URI, check the setting for mapreduce.application.
> > > > > > > > framework.path
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - I've tried adding different hadoop, hbase, yarn
> > and
> > > > > > > mapreduce
> > > > > > > > > > Maven
> > > > > > > > > > > > > > dependencies without any success
> > > > > > > > > > > > > > - hadoop-yarn-client
> > > > > > > > > > > > > > - hadoop-yarn-common
> > > > > > > > > > > > > > - hadoop-mapreduce-client-core
> > > > > > > > > > > > > > - hadoop-yarn-server-common
> > > > > > > > > > > > > > - hadoop-yarn-api
> > > > > > > > > > > > > > - hbase-server
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I will keep exploring other possible solutions.
> Let
> > > me
> > > > > know
> > > > > > > if
> > > > > > > > > > anyone
> > > > > > > > > > > > > has
> > > > > > > > > > > > > > any ideas.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, May 7, 2018 at 9:02 AM, Otto Fowler <
> > > > > > > > > > ottobackwa...@gmail.com
> > > > > > > > > > > >
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I can imagine a new generic service(s)
> capability
> > > > whose
> > > > > > > job (
> > > > > > > > > pun
> > > > > > > > > > > > > > intended
> > > > > > > > > > > > > > > ) is to
> > > > > > > > > > > > > > > abstract the submittal, tracking, and storage
> of
> > > > > results
> > > > > > to
> > > > > > > > > yarn.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > It would be extended with storage providers,
> > queue
> > > > > > > provider,
> > > > > > > > > > > > possibly
> > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > set of policies or rather strategies.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The pcap ‘report’ would be a client to that
> > > service,
> > > > > the
> > > > > > > > > > > specializes
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > service operation for the way we want pcap to
> > work.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > We can then re-use the generic service for
> other
> > > long
> > > > > > > running
> > > > > > > > > > yarn
> > > > > > > > > > > > > > > things…..
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On May 7, 2018 at 09:56:51, Otto Fowler (
> > > > > > > > > ottobackwa...@gmail.com
> > > > > > > > > > )
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > RE: Tracking v. users
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The submittal and tracking can associate the
> > > > submitter
> > > > > > with
> > > > > > > > the
> > > > > > > > > > > yarn
> > > > > > > > > > > > > job
> > > > > > > > > > > > > > > and track that,
> > > > > > > > > > > > > > > regardless of the yarn credentials.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > IE> if all submittals and monitoring are by the
> > > same
> > > > > yarn
> > > > > > > > user
> > > > > > > > > (
> > > > > > > > > > > > > Metron )
> > > > > > > > > > > > > > > from a single or
> > > > > > > > > > > > > > > co-operative set of services, that service can
> > > > maintain
> > > > > > the
> > > > > > > > > > > mapping.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On May 7, 2018 at 09:39:52, Ryan Merriman (
> > > > > > > > merrim...@gmail.com
> > > > > > > > > )
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Otto, your use case makes sense to me. We'll
> have
> > > to
> > > > > > think
> > > > > > > > > about
> > > > > > > > > > > how
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > manage the user to job relationships. I'm
> > assuming
> > > > YARN
> > > > > > > jobs
> > > > > > > > > will
> > > > > > > > > > > be
> > > > > > > > > > > > > > > submitted as the metron service user so YARN
> > won't
> > > > keep
> > > > > > > track
> > > > > > > > > of
> > > > > > > > > > > > this
> > > > > > > > > > > > > for
> > > > > > > > > > > > > > > us. Is that assumption correct? Do you have any
> > > ideas
> > > > > for
> > > > > > > > doing
> > > > > > > > > > > > that?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Mike, I can start a feature branch and
> experiment
> > > > with
> > > > > > > > merging
> > > > > > > > > > > > > metron-api
> > > > > > > > > > > > > > > into metron-rest. That should allow us to
> > > collaborate
> > > > > on
> > > > > > > any
> > > > > > > > > > issues
> > > > > > > > > > > > or
> > > > > > > > > > > > > > > challenges. Also, can you expand on your idea
> to
> > > > manage
> > > > > > > > > external
> > > > > > > > > > > > > > > dependencies as a special module? That seems
> > like a
> > > > > very
> > > > > > > > > > attractive
> > > > > > > > > > > > > > option
> > > > > > > > > > > > > > > to me.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Fri, May 4, 2018 at 8:39 AM, Otto Fowler <
> > > > > > > > > > > ottobackwa...@gmail.com>
> > > > > > > > > > > >
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > From my response on the other thread, but
> > > > applicable
> > > > > to
> > > > > > > the
> > > > > > > > > > > > backend
> > > > > > > > > > > > > > > stuff:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > "The PCAP Query seems more like PCAP Report
> to
> > > me.
> > > > > You
> > > > > > > are
> > > > > > > > > > > > > generating a
> > > > > > > > > > > > > > > > report based on parameters.
> > > > > > > > > > > > > > > > That report is something that takes some time
> > and
> > > > > > > external
> > > > > > > > > > > process
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > > generate… ie you have to wait for it.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I can almost imagine a flow where you:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > * Are in the AlertUI
> > > > > > > > > > > > > > > > * Ask to generate a PCAP report based on some
> > > > > selected
> > > > > > > > > > > > > > alerts/meta-alert,
> > > > > > > > > > > > > > > > possibly picking from on or more report
> > > ‘templates’
> > > > > > > > > > > > > > > > that have query options etc
> > > > > > > > > > > > > > > > * The report request is ‘queued’, that is
> > > > dispatched
> > > > > to
> > > > > > > be
> > > > > > > > be
> > > > > > > > > > > > > > > > executed/generated
> > > > > > > > > > > > > > > > * You as a user have a ‘queue’ of your report
> > > > > results,
> > > > > > > and
> > > > > > > > > when
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > report
> > > > > > > > > > > > > > > > is done it is queued there
> > > > > > > > > > > > > > > > * We ‘monitor’ the report/queue press through
> > the
> > > > > yarn
> > > > > > > > rest (
> > > > > > > > > > > > report
> > > > > > > > > > > > > > > > info/meta has the yarn details )
> > > > > > > > > > > > > > > > * You can select the report from your queue
> and
> > > > view
> > > > > it
> > > > > > > > > either
> > > > > > > > > > in
> > > > > > > > > > > > a
> > > > > > > > > > > > > new
> > > > > > > > > > > > > > > UI
> > > > > > > > > > > > > > > > or custom component
> > > > > > > > > > > > > > > > * You can then apply a different ‘view’ to
> the
> > > > report
> > > > > > or
> > > > > > > > work
> > > > > > > > > > > with
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > report data
> > > > > > > > > > > > > > > > * You can print / save etc
> > > > > > > > > > > > > > > > * You can associate the report with the
> alerts
> > (
> > > > > again
> > > > > > in
> > > > > > > > the
> > > > > > > > > > > > report
> > > > > > > > > > > > > > info
> > > > > > > > > > > > > > > > ) with…. a ‘case’ or ‘ticket’ or
> investigation
> > > > > > something
> > > > > > > or
> > > > > > > > > > other
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > We can introduce extensibility into the
> report
> > > > > > templates,
> > > > > > > > > > report
> > > > > > > > > > > > > views
> > > > > > > > > > > > > > (
> > > > > > > > > > > > > > > > thinks that work with the json data of the
> > > report )
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Something like that.”
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Maybe we can do :
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > template -> query parameters -> script =>
> yarn
> > > info
> > > > > > > > > > > > > > > > yarn info + query info + alert context + yarn
> > > > status
> > > > > =>
> > > > > > > > > report
> > > > > > > > > > > > info
> > > > > > > > > > > > > ->
> > > > > > > > > > > > > > > > stored in a user’s ‘report queue’
> > > > > > > > > > > > > > > > report persistence added to report info
> > > > > > > > > > > > > > > > metron-rest -> api to monitor the queue, read
> > > > > results (
> > > > > > > > page
> > > > > > > > > ),
> > > > > > > > > > > > etc
> > > > > > > > > > > > > etc
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On May 4, 2018 at 09:23:39, Ryan Merriman (
> > > > > > > > > merrim...@gmail.com
> > > > > > > > > > )
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I started a separate thread on Pcap UI
> > > > considerations
> > > > > > and
> > > > > > > > > user
> > > > > > > > > > > > > > > > requirements
> > > > > > > > > > > > > > > > at Otto's request. This should help us keep
> > these
> > > > two
> > > > > > > > related
> > > > > > > > > > but
> > > > > > > > > > > > > > > separate
> > > > > > > > > > > > > > > > discussions focused.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Fri, May 4, 2018 at 7:19 AM, Michel Sumbul
> <
> > > > > > > > > > > > > michelsum...@gmail.com>
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hello,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > (Youhouuu my first reply on this kind of
> mail
> > > > > > chain^^)
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > If I may, I would like to share my view on
> > the
> > > > > > > following
> > > > > > > > 3
> > > > > > > > > > > > points.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > - Backend:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > The current metron-api is totally seperate,
> > it
> > > > will
> > > > > > be
> > > > > > > > > logic
> > > > > > > > > > > for
> > > > > > > > > > > > me
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > have
> > > > > > > > > > > > > > > > > it at the same place as the others rest
> api.
> > > > > > Especially
> > > > > > > > > when
> > > > > > > > > > > > more
> > > > > > > > > > > > > > > > security
> > > > > > > > > > > > > > > > > will be added, it will not be needed to do
> > the
> > > > job
> > > > > > > twice.
> > > > > > > > > > > > > > > > > The current implementation send back a pcap
> > > > object
> > > > > > > which
> > > > > > > > > > still
> > > > > > > > > > > > need
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > decoded. In the opensoc, the decoding was
> > done
> > > > with
> > > > > > > > tshard
> > > > > > > > > on
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > frontend.
> > > > > > > > > > > > > > > > > It will be good to have this decoding
> > happening
> > > > > > > directly
> > > > > > > > on
> > > > > > > > > > the
> > > > > > > > > > > > > > backend
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > not create a load on frontend. An option
> will
> > > be
> > > > to
> > > > > > > > install
> > > > > > > > > > > > tshark
> > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > rest server and to use to convert the pcap
> to
> > > xml
> > > > > and
> > > > > > > > then
> > > > > > > > > > to a
> > > > > > > > > > > > > json
> > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > will be send to the frontend.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I tried to start directly the map/reduce
> job
> > to
> > > > > > search
> > > > > > > > over
> > > > > > > > > > all
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > pcap
> > > > > > > > > > > > > > > > > data from the rest server and as Ryan
> mention
> > > it,
> > > > > we
> > > > > > > had
> > > > > > > > > > > > trouble. I
> > > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > > try to find back the error.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Then in the POC, what we tried is to use
> the
> > > > > > pcap_query
> > > > > > > > > > script
> > > > > > > > > > > > and
> > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > work fine. I just modified it that he sends
> > > back
> > > > > > > directly
> > > > > > > > > the
> > > > > > > > > > > > > job_id
> > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > yarn and not waiting that the job is
> > finished.
> > > > Then
> > > > > > it
> > > > > > > > will
> > > > > > > > > > > > allow
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > UI
> > > > > > > > > > > > > > > > > and the rest server to know what the status
> > of
> > > > the
> > > > > > > > research
> > > > > > > > > > by
> > > > > > > > > > > > > > querying
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > yarn rest api. This will allow the UI and
> the
> > > > rest
> > > > > > > server
> > > > > > > > > to
> > > > > > > > > > be
> > > > > > > > > > > > > async
> > > > > > > > > > > > > > > > > without any blocking phase. What do you
> think
> > > > about
> > > > > > > that?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Having the job submitted directly from the
> > code
> > > > of
> > > > > > the
> > > > > > > > rest
> > > > > > > > > > > > server
> > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > perfect, but it will need a lot of
> > > investigation
> > > > I
> > > > > > > think
> > > > > > > > > (but
> > > > > > > > > > > > I'm
> > > > > > > > > > > > > not
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > expert so I might be completely wrong ^^).
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > We know that the pcap_query scritp work
> fine
> > so
> > > > why
> > > > > > not
> > > > > > > > > > calling
> > > > > > > > > > > > it?
> > > > > > > > > > > > > > Is
> > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > that bad? (maybe stupid question, but I
> > really
> > > > > don’t
> > > > > > > see
> > > > > > > > a
> > > > > > > > > > lot
> > > > > > > > > > > > of
> > > > > > > > > > > > > > > > drawback)
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > - Front end:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Adding the the pcap search to the alert UI
> > is,
> > > I
> > > > > > think,
> > > > > > > > the
> > > > > > > > > > > > easiest
> > > > > > > > > > > > > > way
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > move forward. But indeed, it will then be
> the
> > > > > “Alert
> > > > > > UI
> > > > > > > > and
> > > > > > > > > > > > > > pcapquery”.
> > > > > > > > > > > > > > > > > Maybe the name of the UI should just change
> > to
> > > > > > > something
> > > > > > > > > like
> > > > > > > > > > > > > > > > “Monitoring &
> > > > > > > > > > > > > > > > > Investigation UI” ?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Is there any roadmap or plan for the
> > different
> > > > UI?
> > > > > I
> > > > > > > mean
> > > > > > > > > did
> > > > > > > > > > > > you
> > > > > > > > > > > > > > > > already
> > > > > > > > > > > > > > > > > had discussion on how you see the ui
> evolving
> > > > with
> > > > > > the
> > > > > > > > new
> > > > > > > > > > > > feature
> > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > will come in the future?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > - Microservices:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > What do you mean exactly by microservices?
> Is
> > > it
> > > > to
> > > > > > > > > separate
> > > > > > > > > > > all
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > features in different projects? Or
> something
> > > like
> > > > > > > having
> > > > > > > > > the
> > > > > > > > > > > > > > different
> > > > > > > > > > > > > > > > > components in container like kubernet?
> (again
> > > > maybe
> > > > > > > > stupid
> > > > > > > > > > > > > question,
> > > > > > > > > > > > > > > but
> > > > > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > > don’t clearly understand what you mean J )
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Michel
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > --
> > > > > > > > > > simon elliston ball
> > > > > > > > > > @sireb
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > --
> > >
> > > Jon
> > >
> >
> --
>
> Jon
>



-- 
--
simon elliston ball
@sireb
