Re: [DISCUSS] Opinionated Data Flows

2016-10-06 Thread zeo...@gmail.com
In this case I would initially lean toward implicit, to simplify the configs.
It doesn't seem overly complicated to implement, though I may be missing
something about the current state or the future roadmap.

Jon

On Thu, Oct 6, 2016, 18:25 Matt Foley  wrote:

> Would splitting and joining be implicit or explicit, for multi-path
> topologies?
> 
> From: zeo...@gmail.com 
> Sent: Thursday, October 06, 2016 11:03 AM
> To: dev@metron.incubator.apache.org
> Subject: Re: [DISCUSS] Opinionated Data Flows
>
> It should also be smart enough to handle an order like:
>
> source("bro")
>   -> parser("BasicBroParser")
>   -> exists("ip_src_addr")
>   -> geo_ip_src = geo["ip_src_addr"]
>   -> application = assets["ip_src_addr"].application
>   -> owner = assets["ip_src_addr"].owner
>   -> exists("ip_dst_addr")
>   -> geo_ip_dst = geo["ip_dst_addr"]
>   -> elasticsearch("bro-index")
>
> Without duplicate hits of the topologies.
>
> Jon
>
> On Thu, Oct 6, 2016 at 1:55 PM Nick Allen  wrote:
>
> > Here is a quick example with some hypothetical syntax.  Whatever that
> > syntax might be, it would be very simple, easy to understand, and leverage
> > high-level concepts specific to Metron.
> >
> > This flow consumes Bro data, ensures there are valid source/destination
> > IPs, performs geo-enrichment, asset enrichment and finally persists the
> > data in Elasticsearch.
> >
> >
> > source("bro")
> >   -> parser("BasicBroParser")
> >   -> exists("ip_src_addr")
> >   -> exists("ip_dst_addr")
> >   -> geo_ip_src = geo["ip_src_addr"]
> >   -> geo_ip_dst = geo["ip_dst_addr"]
> >   -> application = assets["ip_src_addr"].application
> >   -> owner = assets["ip_src_addr"].owner
> >   -> elasticsearch("bro-index")
> >
> >
> >
> >
> > On Thu, Oct 6, 2016 at 12:58 PM, Nick Allen  wrote:
> >
> > > Chasing this bad idea down even further leads me to something even
> > > crazier.
> > >
> > > Stellar 1.0 can only operate within a single topology and in most cases
> > > only on a single message.  Stellar 2.0 could be the mechanism that
> > > allows users to define their own data flows and what "useful bits of
> > > Metron functionality" get plugged in.
> > >
> > > Once you have a DSL that allows users to define what they want Metron
> > > to do, the underlying implementation mechanism (which is currently
> > > Storm) can also be swapped out.  If we have an even faster Storm
> > > implementation, then we swap in the Storm NG engine.  Maybe we want
> > > Metron to also run in Flink; then we just swap in a Flink engine.
> > >
> > >
> > >
> > >
> > > On Thu, Oct 6, 2016 at 12:52 PM, Nick Allen wrote:
> > >
> > >> I totally "bird dogged the previous thread" as Casey likes to call it.
> > >> :)  I am extracting this thought into a separate thread before I start
> > >> throwing out even more, crazier ideas.
> > >>
> > >>> In general, Metron is very opinionated about data flows right now.
> > >>> We have Parser topologies that feed an Enrichment topology, which
> > >>> then feeds an Indexing topology.  We have useful bits of
> > >>> functionality (think Stellar transforms, Geo enrichment, etc.) that
> > >>> are closely coupled with these topologies (aka data flows).
> > >>>
> > >>> When a user wants to parse heterogeneous data from a single topic,
> > >>> that's not easy.  When a user wants enriched output to land in
> > >>> unique topics by sensor type, well, that's also not easy.  When a
> > >>> user wanted to skip enrichment of data sources, we actually
> > >>> re-architected the data flow to add the Indexing topology.
> > >>>
> > >>> In an ideal world, a user should be responsible for defining the
> > >>> data flow, not Metron.  Metron should provide the "useful bits of
> > >>> functionality" that a user can "plug in" wherever they like.  Metron
> > >>> itself should not care how the data is moving or what step in the
> > >>> process it is at.
> > >>
> > >>
> > >>
> > >>
> > >> --
> > >> Nick Allen 
> > >>
> > >
> > >
> > >
> > > --
> > > Nick Allen 
> > >
> >
> >
> >
> > --
> > Nick Allen 
> >
> --
>
> Jon
>
-- 

Jon


Re: community demo tomorrow

2016-10-06 Thread James Sirota
Hi Taylor,

Last month we set up a recurring demo meeting, to run twice a month, per this 
thread:
http://mail-archives.apache.org/mod_mbox/incubator-metron-dev/201609.mbox/raw/%3C1139681474481923%40web32j.yandex.ru%3E/

and then announced the first one in the series 2 weeks ago here:
http://mail-archives.apache.org/mod_mbox/incubator-metron-dev/201609.mbox/raw/%3C508251474574231%40web9m.yandex.ru%3E/

We then had the meeting and posted a summary and a video here:
http://mail-archives.apache.org/mod_mbox/incubator-metron-dev/201609.mbox/raw/%3C182111474664208%40web22m.yandex.ru%3E

That meeting was held over Zoom at 11 AM PST on Sept. 23.  Per that thread, the 
next demo meeting should be held tomorrow at 11 AM PST.  However, we still have 
a lot of pull requests outstanding, so I was asking whether it makes sense to 
hold the meeting next week instead so we can get all the pull requests in.

Per your point, I will send reminders about the recurring meeting 72 hours prior 
to the meeting, with the agenda, and will solicit feedback if anyone would like 
to change the time or the tool for the virtual meeting.  If there are no 
objections, we will skip the meeting on Friday and hold it midweek next week, 
once we get most of the pull requests merged.

Thanks,
James 

06.10.2016, 17:09, "P. Taylor Goetz" :
> I'd love to see a demo, but with my mentor hat on I would say wait. Try not 
> to schedule things too soon so people in other time zones have a chance to 
> participate. This is what's behind the 72 hr. wait period for votes at the 
> ASF.
>
> I haven't checked, but I think the Metron community is largely based in North 
> America. As the community grows, this will undoubtedly change to include all 
> parts (and time zones) of the world.
>
> Solicit times and dates from the community, then collectively choose one. 
> Also decide on a medium (ghangouts, zoom, webex, etc.). Afterwards, summarize 
> what was discussed and send it to the mailing list so those who couldn't 
> attend know what happened.
>
> -Taylor
>
>>  On Oct 6, 2016, at 7:41 PM, James Sirota  wrote:
>>
>>  Does anyone want to do a demo tomorrow? Or should we push it off till next 
>> week so that we have a chance to review and commit a few more pull requests? 
>> Looks like we have 19 open right now. I think that's the most we've ever 
>> gotten in a 2-week time frame. Great job, community!
>>
>>  If we do want to demo tomorrow, can you respond to this thread with what 
>> you want to demo?
>>
>>  ---
>>  Thank you,
>>
>>  James Sirota
>>  PPMC- Apache Metron (Incubating)
>>  jsirota AT apache DOT org

--- 
Thank you,

James Sirota
PPMC- Apache Metron (Incubating)
jsirota AT apache DOT org


[GitHub] incubator-metron issue #299: METRON-425 Stellar transformation fails to hand...

2016-10-06 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/incubator-metron/pull/299
  
So - based on 
`identifier_operand : (TRUE | FALSE) # LogicalConst
   | arithmetic_expr #ArithmeticOperands
   | STRING_LITERAL # StringLiteral
   | list_entity #List
   | map_entity #MapConst
   | NULL #NullConst
   | EXISTS LPAREN IDENTIFIER RPAREN #ExistsFunc
   | LPAREN conditional_expr RPAREN#condExpr_paren
   ;`

Anything that can be completely expressed from this list will work.  TRUE 
and FALSE can, but NOT cannot.

It is not random.
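
To illustrate the point, here is a toy Python sketch of why a bare constant parses while a bare operator does not. It is not the Stellar grammar: the `parse_operand` and `evaluate` helpers are hypothetical, and treating single quotes as the way to escape a keyword is an assumption made for this sketch only.

```python
# Toy operand parser. Constants are complete expressions on their own;
# "not" is an operator and must be followed by another operand.
CONSTANTS = {"true": True, "false": False, "null": None}


def parse_operand(tokens):
    """Parse one operand; return (value, remaining_tokens)."""
    if not tokens:
        raise SyntaxError("expected an operand, found end of input")
    head, rest = tokens[0], tokens[1:]
    if head == "not":
        # The operator consumes the operand that follows it.
        value, rest = parse_operand(rest)
        return (not value), rest
    if head in CONSTANTS:
        return CONSTANTS[head], rest
    # Assumption for this sketch: single quotes turn any word into a literal.
    if len(head) >= 2 and head[0] == head[-1] == "'":
        return head[1:-1], rest
    raise SyntaxError(f"unexpected token {head!r}")


def evaluate(expression):
    """Evaluate a whitespace-separated expression made of the tokens above."""
    value, rest = parse_operand(expression.split())
    if rest:
        raise SyntaxError(f"unexpected trailing tokens {rest!r}")
    return value
```

In this sketch `evaluate("false")` succeeds because `false` is a complete operand, `evaluate("not")` raises `SyntaxError` because the operator has nothing to apply to, and `evaluate("'not'")` succeeds because the quoted word is read as a string.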


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Podling Report Reminder - October 2016

2016-10-06 Thread johndament
Dear podling,

This email was sent by an automated system on behalf of the Apache
Incubator PMC. It is an initial reminder to give you plenty of time to
prepare your quarterly board report.

The board meeting is scheduled for Wed, 19 October 2016, 10:30 am PDT.
The report for your podling will form a part of the Incubator PMC
report. The Incubator PMC requires your report to be submitted 2 weeks
before the board meeting, to allow sufficient time for review and
submission (Wed, October 05).

Please submit your report with sufficient time to allow the Incubator
PMC, and subsequently board members to review and digest. Again, the
very latest you should submit your report is 2 weeks prior to the board
meeting.

Thanks,

The Apache Incubator PMC

Submitting your Report

--

Your report should contain the following:

*   Your project name
*   A brief description of your project, which assumes no knowledge of
the project or necessarily of its field
*   A list of the three most important issues to address in the move
towards graduation.
*   Any issues that the Incubator PMC or ASF Board might wish/need to be
aware of
*   How has the community developed since the last report
*   How has the project developed since the last report.

This should be appended to the Incubator Wiki page at:

http://wiki.apache.org/incubator/October2016

Note: This is manually populated. You may need to wait a little before
this page is created from a template.

Mentors
---

Mentors should review reports for their project(s) and sign them off on
the Incubator wiki page. Signing off reports shows that you are
following the project - projects that are not signed may raise alarms
for the Incubator PMC.

Incubator PMC


Re: community demo tomorrow

2016-10-06 Thread P. Taylor Goetz
I'd love to see a demo, but with my mentor hat on I would say wait.  Try not to 
schedule things too soon so people in other time zones have a chance to 
participate. This is what's behind the 72 hr. wait period for votes at the ASF.

I haven't checked, but I think the Metron community is largely based in North 
America. As the community grows, this will undoubtedly change to include all 
parts (and time zones) of the world.

Solicit times and dates from the community, then collectively choose one. Also 
decide on a medium (ghangouts, zoom, webex, etc.). Afterwards, summarize what 
was discussed and send it to the mailing list so those who couldn't attend know 
what happened.

-Taylor

> On Oct 6, 2016, at 7:41 PM, James Sirota  wrote:
> 
> Does anyone want to do a demo tomorrow?  Or should we push it off till next 
> week so that we have a chance to review and commit a few more pull requests?  
> Looks like we have 19 open right now.  I think that's the most we've ever 
> gotten in a 2-week time frame.  Great job, community!
> 
> If we do want to demo tomorrow, can you respond to this thread with what you 
> want to demo?
> 
> --- 
> Thank you,
> 
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org


[GitHub] incubator-metron issue #299: METRON-425 Stellar transformation fails to hand...

2016-10-06 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/incubator-metron/pull/299
  
I think that we need to expand the readme properly, or add a separate 
language guide.  The `not` vs. `false` inconsistency is troubling.  I'd have to 
go through the grammar more closely.





community demo tomorrow

2016-10-06 Thread James Sirota
Does anyone want to do a demo tomorrow?  Or should we push it off till next 
week so that we have a chance to review and commit a few more pull requests?  
Looks like we have 19 open right now.  I think that's the most we've ever 
gotten in a 2-week time frame.  Great job, community!

If we do want to demo tomorrow, can you respond to this thread with what you 
want to demo?

--- 
Thank you,

James Sirota
PPMC- Apache Metron (Incubating)
jsirota AT apache DOT org


Re: [GitHub] incubator-metron issue #299: METRON-425 Stellar transformation fails to hand...

2016-10-06 Thread Otto Fowler
I think the readme needs a new section that lists the keywords and
explicitly calls out the need to escape them.

-- 

Sent with Airmail

On October 6, 2016 at 18:56:08, justinleet (g...@git.apache.org) wrote:

> Github user justinleet commented on the issue:
>
> https://github.com/apache/incubator-metron/pull/299
>
> Good catch. I've been looking at this, and ended up looking into a red
> herring. A further note on top of this is that (unsurprisingly in light of
> the cause) other reserved keywords tend to cause problems. Somewhat
> inconsistently, too. E.g.
> `"newStellarField" : "not"` fails, but
> `"newStellarField" : "false"` does not.
>
> I'd have to dig into it more, but I assume it's because `not` and the other
> comparison operators in the grammar expect to be followed by more things,
> and don't end up being a string here.
>
> I'm open to suggestions on what the appropriate behavior is, but I really
> don't like that it's inconsistent on what gets rejected.
>
>


[GitHub] incubator-metron issue #299: METRON-425 Stellar transformation fails to hand...

2016-10-06 Thread justinleet
Github user justinleet commented on the issue:

https://github.com/apache/incubator-metron/pull/299
  
Good catch.  I've been looking at this, and ended up looking into a red 
herring.  A further note on top of this is that (unsurprisingly in light of the 
cause) other reserved keywords tend to cause problems.  Somewhat 
inconsistently, too. E.g. 
`"newStellarField" : "not"` fails, but
`"newStellarField" : "false"` does not.

I'd have to dig into it more, but I assume it's because `not` and the other 
comparison operators in the grammar expect to be followed by more things, and 
don't end up being a string here.

I'm open to suggestions on what the appropriate behavior is, but I really 
don't like that it's inconsistent on what gets rejected.




[GitHub] incubator-metron pull request #299: METRON-425 Stellar transformation fails ...

2016-10-06 Thread ottobackwards
GitHub user ottobackwards opened a pull request:

https://github.com/apache/incubator-metron/pull/299

METRON-425 Stellar transformation fails to handle special characters

The issue is that per our grammar the arithmetic and comparison operators 
need to be escaped to be used as literals.

I have added a unit test to the system for this case (for field 
transformations).
I have also taken a stab at updating the readme.md.  I do not think the 
readme is correct though.  It may need to be written out more better by someone 
who writes good.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ottobackwards/incubator-metron METRON-425

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-metron/pull/299.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #299


commit ac2ffc4d4c5c9f233e3629b5884b0e0073646823
Author: Otto Fowler 
Date:   2016-10-06T22:31:04Z

METRON-425 add unit test and readme change about escaping literals in 
stellar statements






Re: [DISCUSS] Opinionated Data Flows

2016-10-06 Thread Matt Foley
Would splitting and joining be implicit or explicit, for multi-path topologies?

From: zeo...@gmail.com 
Sent: Thursday, October 06, 2016 11:03 AM
To: dev@metron.incubator.apache.org
Subject: Re: [DISCUSS] Opinionated Data Flows

It should also be smart enough to handle an order like:

source("bro")
  -> parser("BasicBroParser")
  -> exists("ip_src_addr")
  -> geo_ip_src = geo["ip_src_addr"]
  -> application = assets["ip_src_addr"].application
  -> owner = assets["ip_src_addr"].owner
  -> exists("ip_dst_addr")
  -> geo_ip_dst = geo["ip_dst_addr"]
  -> elasticsearch("bro-index")

Without duplicate hits of the topologies.
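
To make the ordering point concrete, here is a minimal Python sketch. It assumes a flow is just a list of steps, each applied once in a single pass. The helper names (`exists`, `geo`, `asset`, `run`) and the lookup tables are hypothetical stand-ins, not Metron APIs, and the source, parser and Elasticsearch stages are left out.

```python
from typing import Callable, Dict, List, Optional

Message = Dict[str, object]
Step = Callable[[Message], Optional[Message]]

# Hypothetical lookup tables standing in for the geo and asset stores.
GEO = {"10.0.0.1": "US", "10.0.0.2": "DE"}
ASSETS = {"10.0.0.1": {"application": "web", "owner": "ops"}}


def exists(field: str) -> Step:
    """Filter step: drop the message (return None) when the field is missing."""
    def step(msg: Message) -> Optional[Message]:
        return msg if field in msg else None
    return step


def geo(target: str, field: str) -> Step:
    """Enrichment step: store the geo lookup of msg[field] under target."""
    def step(msg: Message) -> Optional[Message]:
        return {**msg, target: GEO.get(msg[field])}
    return step


def asset(target: str, field: str, attr: str) -> Step:
    """Enrichment step: store one attribute of the asset record under target."""
    def step(msg: Message) -> Optional[Message]:
        return {**msg, target: ASSETS.get(msg[field], {}).get(attr)}
    return step


def run(flow: List[Step], msg: Message) -> Optional[Message]:
    """Apply each step exactly once, in order; stop as soon as a filter drops."""
    current: Optional[Message] = msg
    for step in flow:
        current = step(current)
        if current is None:
            return None
    return current


# The flow from the original proposal (filters first, then enrichments).
filters_first = [
    exists("ip_src_addr"), exists("ip_dst_addr"),
    geo("geo_ip_src", "ip_src_addr"), geo("geo_ip_dst", "ip_dst_addr"),
    asset("application", "ip_src_addr", "application"),
    asset("owner", "ip_src_addr", "owner"),
]

# The interleaved order shown above.
interleaved = [
    exists("ip_src_addr"),
    geo("geo_ip_src", "ip_src_addr"),
    asset("application", "ip_src_addr", "application"),
    asset("owner", "ip_src_addr", "owner"),
    exists("ip_dst_addr"),
    geo("geo_ip_dst", "ip_dst_addr"),
]
```

In this sketch both orders yield the same enriched message for input that has both addresses, and both drop a message that lacks `ip_dst_addr`. How a real engine would map such a list onto topologies is not addressed here.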

Jon

On Thu, Oct 6, 2016 at 1:55 PM Nick Allen  wrote:

> Here is a quick example with some hypothetical syntax.  Whatever that syntax
> might be, it would be very simple, easy to understand, and leverage
> high-level concepts specific to Metron.
>
> This flow consumes Bro data, ensures there are valid source/destination
> IPs, performs geo-enrichment, asset enrichment and finally persists the
> data in Elasticsearch.
>
>
> source("bro")
>   -> parser("BasicBroParser")
>   -> exists("ip_src_addr")
>   -> exists("ip_dst_addr")
>   -> geo_ip_src = geo["ip_src_addr"]
>   -> geo_ip_dst = geo["ip_dst_addr"]
>   -> application = assets["ip_src_addr"].application
>   -> owner = assets["ip_src_addr"].owner
>   -> elasticsearch("bro-index")
>
>
>
>
> On Thu, Oct 6, 2016 at 12:58 PM, Nick Allen  wrote:
>
> > Chasing this bad idea down even further leads me to something even
> > crazier.
> >
> > Stellar 1.0 can only operate within a single topology and in most cases
> > only on a single message.  Stellar 2.0 could be the mechanism that allows
> > users to define their own data flows and what "useful bits of Metron
> > functionality" get plugged in.
> >
> > Once you have a DSL that allows users to define what they want Metron to
> > do, the underlying implementation mechanism (which is currently Storm)
> > can also be swapped out.  If we have an even faster Storm implementation,
> > then we swap in the Storm NG engine.  Maybe we want Metron to also run in
> > Flink; then we just swap in a Flink engine.
> >
> >
> >
> >
> > On Thu, Oct 6, 2016 at 12:52 PM, Nick Allen  wrote:
> >
> >> I totally "bird dogged the previous thread" as Casey likes to call it.
> >> :)  I am extracting this thought into a separate thread before I start
> >> throwing out even more, crazier ideas.
> >>
> >>> In general, Metron is very opinionated about data flows right now.  We
> >>> have Parser topologies that feed an Enrichment topology, which then
> >>> feeds an Indexing topology.  We have useful bits of functionality
> >>> (think Stellar transforms, Geo enrichment, etc.) that are closely
> >>> coupled with these topologies (aka data flows).
> >>>
> >>> When a user wants to parse heterogeneous data from a single topic,
> >>> that's not easy.  When a user wants enriched output to land in unique
> >>> topics by sensor type, well, that's also not easy.  When a user wanted
> >>> to skip enrichment of data sources, we actually re-architected the data
> >>> flow to add the Indexing topology.
> >>>
> >>> In an ideal world, a user should be responsible for defining the data
> >>> flow, not Metron.  Metron should provide the "useful bits of
> >>> functionality" that a user can "plug in" wherever they like.  Metron
> >>> itself should not care how the data is moving or what step in the
> >>> process it is at.
> >>
> >>
> >>
> >>
> >> --
> >> Nick Allen 
> >>
> >
> >
> >
> > --
> > Nick Allen 
> >
>
>
>
> --
> Nick Allen 
>
--

Jon


[GitHub] incubator-metron issue #288: METRON-480: Kibana 4.5+ Requires http.cors.enab...

2016-10-06 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/incubator-metron/pull/288
  
+1 This PR did fix the issue that I was seeing.  Tested on a live cluster.




[GitHub] incubator-metron issue #288: METRON-480: Kibana 4.5+ Requires http.cors.enab...

2016-10-06 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/incubator-metron/pull/288
  
If I manually log in to each ES master, remove `http.cors.enabled = true` 
from `/etc/elasticsearch/elasticsearch.yml`, and then restart ES, I am able to 
connect to ES with Kibana.

Of course, once Ambari gets involved it undoes my manual edit.  I have not 
yet figured out how to 'unset' this through Ambari.




Re: [GitHub] incubator-metron issue #288: METRON-480: Kibana 4.5+ Requires http.cors.enab...

2016-10-06 Thread David Lyle
Got it. This will keep that from happening. If you don't want to
reprovision, add a property named http.cors.enabled. The display name and the
actual property name are different.

-D...


On Thu, Oct 6, 2016 at 4:07 PM, nickwallen  wrote:

> Github user nickwallen commented on the issue:
>
> https://github.com/apache/incubator-metron/pull/288
>
> I am seeing this on a cluster that I deployed with the mpack from
> 'master'.  ES is happy and excited to play.  When I hit up Kibana I get
> this error.
>
> ```
> Courier Fetch Error: unhandled courier request error: Authorization
> Exception
> Version: 4.5.1
> Build: 9892
> ```
>
> In researching the error, I ran across this on [Github/Kibana](
> https://github.com/elastic/kibana/issues/6719) which refers to the
> `http.cors.enabled` setting.  I remembered you opening this PR and thought
> it might connect.
>
> Now I'm confused though because in this PR you set a
> `http_cors_enabled` variable which uses underlines versus
> `http.cors.enabled` which is mentioned in the link above.
>
>


[GitHub] incubator-metron issue #288: METRON-480: Kibana 4.5+ Requires http.cors.enab...

2016-10-06 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/incubator-metron/pull/288
  
I am seeing this on a cluster that I deployed with the mpack from 'master'. 
 ES is happy and excited to play.  When I hit up Kibana I get this error.

```
Courier Fetch Error: unhandled courier request error: Authorization 
Exception
Version: 4.5.1
Build: 9892
```

In researching the error, I ran across this on 
[Github/Kibana](https://github.com/elastic/kibana/issues/6719) which refers to 
the `http.cors.enabled` setting.  I remembered you opening this PR and thought 
it might connect.  

Now I'm confused though because in this PR you set a `http_cors_enabled` 
variable which uses underlines versus `http.cors.enabled` which is mentioned in 
the link above.




[GitHub] incubator-metron issue #288: METRON-480: Kibana 4.5+ Requires http.cors.enab...

2016-10-06 Thread dlyle65535
Github user dlyle65535 commented on the issue:

https://github.com/apache/incubator-metron/pull/288
  
Darn. Can't edit on the phone.  Are you seeing this on an existing cluster? 
A default ES instance should default to false. 




[GitHub] incubator-metron issue #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/incubator-metron/pull/293
  
It would make a nice little POC -




[GitHub] incubator-metron issue #288: METRON-480: Kibana 4.5+ Requires http.cors.enab...

2016-10-06 Thread dlyle65535
Github user dlyle65535 commented on the issue:

https://github.com/apache/incubator-metron/pull/288
  
No, it defaults to false, so no user action is required. 




[GitHub] incubator-metron issue #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/incubator-metron/pull/293
  
The annotation would have to be made more complex.  We can accept varargs 
as part of the function, for instance.  I think it's possible, but perhaps we 
would need to iterate a bit on the annotation.




[GitHub] incubator-metron issue #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/incubator-metron/pull/293
  
Well, if you have the annotation, the method object, and the expression, 
you *could* do it all from that.  You may need the annotation to have 
more formal input definitions, though.




[GitHub] incubator-metron issue #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/incubator-metron/pull/293
  
Yeah agreed.  I don't think we should be executing the function as part of 
validate.  Perhaps just ensuring the function resolves.  Maybe expanding the 
interface for stellar function to provide a validate method so it's pluggable 
by function.
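
A rough Python sketch of that idea follows, assuming each function carries its own validate hook and that validation only checks that the function resolves and that its hook passes. The class and function names (`StellarFunction`, `Length`, `validate_call`) and the argument-count check are hypothetical illustrations, not the Metron interface.

```python
class StellarFunction:
    """Hypothetical base class; names are illustrative, not Metron's API."""

    name = ""

    def validate(self, arg_count):
        """Return a list of problems found without running the function."""
        return []

    def apply(self, args):
        raise NotImplementedError


class Length(StellarFunction):
    name = "LENGTH"

    def validate(self, arg_count):
        # Static check only: no arguments are evaluated here.
        if arg_count == 1:
            return []
        return [f"LENGTH expects 1 argument, got {arg_count}"]

    def apply(self, args):
        return len(args[0])


# Registry keyed by function name, used to resolve calls during validation.
REGISTRY = {function.name: function for function in (Length(),)}


def validate_call(name, arg_count):
    """Check that the function resolves and its own hook passes.

    apply() is never called, so a runtime error cannot be mistaken for an
    invalid expression.
    """
    function = REGISTRY.get(name)
    if function is None:
        return [f"unknown function {name}"]
    return function.validate(arg_count)
```

With this shape, `validate_call("LENGTH", 1)` returns an empty list, while an unknown name or a wrong argument count returns a description of the problem.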




[GitHub] incubator-metron issue #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/incubator-metron/pull/293
  
ok, as above, I'm not sure about x->null to do that, but I am the noob here




Re: [GitHub] incubator-metron issue #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread Casey Stella
The purpose of validate is to ensure the statement makes syntactic sense
from a stellar perspective and it's only called when statements change in
the config.
On Thu, Oct 6, 2016 at 21:15 Casey Stella  wrote:

> Oh no validate is not called at runtime for every packet! That's called at
> stellar statement input time (e.g. Config pushes to zookeeper)
> On Thu, Oct 6, 2016 at 21:12 ottobackwards  wrote:
>
> Github user ottobackwards commented on the issue:
>
> https://github.com/apache/incubator-metron/pull/293
>
> I took this jira to get some start of an idea about Stellar, and after
> debugging through it to track down to exitVariable to find the argument
> resolution and then back up to the validate x->null I would say I got what
> I bargained for.
>
> I am not sure how I would answer your question on validation, I don't
> know Stellar well enough.  My experience and intuition tells me that
> running the 'execute' and using an exception or error case as validation is
> not very efficient at runtime, if in fact this is what happens at runtime
> as well.  Better to have real metadata on the function and just validate
> the metadata and the syntax of the query, I would think.   *A
> logical/runtime error does not mean an invalid expression*.  This method is
> in effect equating them which I think is incorrect ( unless I am mistaking
> the intent of validate()).
>
>
>
>


[GitHub] incubator-metron issue #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/incubator-metron/pull/293
  
Sorry was responding on the dev list, not this PR.  The purpose of validate 
is to ensure syntactic correctness from a stellar point of view.  This is not 
called per message, but only when stellar statements change (e.g. At config 
push to zookeeper time)




Re: [GitHub] incubator-metron issue #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread Casey Stella
Oh no validate is not called at runtime for every packet! That's called at
stellar statement input time (e.g. Config pushes to zookeeper)
On Thu, Oct 6, 2016 at 21:12 ottobackwards  wrote:

> Github user ottobackwards commented on the issue:
>
> https://github.com/apache/incubator-metron/pull/293
>
> I took this jira to get some start of an idea about Stellar, and after
> debugging through it to track down to exitVariable to find the argument
> resolution and then back up to the validate x->null I would say I got what
> I bargained for.
>
> I am not sure how I would answer your question on validation; I don't
> know Stellar well enough.  My experience and intuition tell me that
> running the 'execute' and using an exception or error case as validation is
> not very efficient at runtime, if in fact this is what happens at runtime
> as well.  Better to have real metadata on the function and just validate
> the metadata and the syntax of the query, I would think.   *A
> logical/runtime error does not mean an invalid expression*.  This method is
> in effect equating them, which I think is incorrect (unless I am mistaking
> the intent of validate()).
>
>
>
>
>


[GitHub] incubator-metron issue #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/incubator-metron/pull/293
  
@mmiklavc I had seen that.  My point is that the tests in StellarTest also
exercise the compiler, the variable resolution, etc.  It is worth having both.
If both had been done from the start, maybe METRON-439 doesn't get through.




[GitHub] incubator-metron issue #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/incubator-metron/pull/293
  
I took this JIRA to get a first idea of how Stellar works, and after
debugging through it down to exitVariable to find the argument
resolution, and then back up to the validate x -> null resolver, I would say I
got what I bargained for.

I am not sure how I would answer your question on validation; I don't know
Stellar well enough.  My experience and intuition tell me that running the
'execute' and using an exception or error case as validation is not very
efficient at runtime, if in fact this is what happens at runtime as well.
Better to have real metadata on the function and just validate the metadata
and the syntax of the query, I would think.   *A logical/runtime error does not
mean an invalid expression*.  This method is in effect equating them, which I
think is incorrect (unless I am mistaking the intent of validate()).






[GitHub] incubator-metron issue #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/incubator-metron/pull/293
  
@ottobackwards A unit test has been added as part of PR#296 -> 
https://github.com/mmiklavc/incubator-metron/blob/bd6e1a5cc48962af36b131e189ffb461e836cd20/metron-platform/metron-common/src/test/java/org/apache/metron/common/dsl/functions/DataStructureFunctionsTest.java




[GitHub] incubator-metron issue #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/incubator-metron/pull/293
  
So, yes, a couple of things need to be ensured for Stellar functions:
* The output of the call is serializable via Kryo (the profiler stores the
output of Stellar statements in HBase)
* It passes the `validate` call

Validate is intended to check for syntactic errors, and it does so by passing a
VariableResolver that returns null for every variable.  I will admit that I've
been thinking this should be rethought and we should do this another way:
* Have a special implementation of StellarBaseListener that barfs if
function resolution happens, but doesn't actually apply functions
* Use the normal Stellar lexer to catch syntactic errors
This should ensure that the Stellar statement is valid syntactically IMO.

Thoughts?
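
To make the current behavior concrete, here is a minimal sketch of "validate by evaluating against a resolver that returns null for every variable". The `VariableResolver` and `Statement` interfaces below are hypothetical stand-ins written for this illustration, not Metron's actual types, and evaluation is reduced to a lambda rather than a real Stellar parse.

```java
// Hypothetical stand-in, not Metron's actual interface.
interface VariableResolver {
    Object resolve(String variableName);
}

final class ValidateSketch {
    /** Hypothetical stand-in for "parse and execute the statement against a resolver". */
    interface Statement {
        Object evaluate(VariableResolver resolver);
    }

    /** Returns true when the statement evaluates without throwing while every variable is null. */
    static boolean validate(Statement statement) {
        VariableResolver allNull = name -> null;
        try {
            statement.evaluate(allNull);
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A function that tolerates null passes validation.
        Statement lenient = r -> r.resolve("foo") == null;
        // A function that throws on null fails validation even though its syntax is fine.
        Statement strict = r -> {
            Object v = r.resolve("foo");
            if (v == null) {
                throw new IllegalStateException("expects a collection or a string");
            }
            return v.toString().isEmpty();
        };
        System.out.println(validate(lenient)); // prints true
        System.out.println(validate(strict));  // prints false
    }
}
```

Under these assumptions, a syntactically valid statement is rejected purely because its function throws on a null argument, which is the conflation of runtime errors and syntax errors being discussed here.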




[GitHub] incubator-metron pull request #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread ottobackwards
Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/293#discussion_r82259618
  
--- Diff: 
metron-platform/metron-common/src/test/java/org/apache/metron/common/stellar/StellarTest.java
 ---
@@ -30,9 +30,7 @@
 import org.reflections.Reflections;
 import org.reflections.util.ConfigurationBuilder;
 
-import java.util.HashMap;
-import java.util.HashSet;
-import java.util.Map;
+import java.util.*;
--- End diff --

I did not actually change this on purpose, I'll fix




[GitHub] incubator-metron issue #296: METRON-439: Stellar : IS_EMPTY(host) throws exc...

2016-10-06 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/incubator-metron/pull/296
  
@ottobackwards The Stellar validation currently runs by passing an empty 
member to the apply function. The original reason for the validation was to 
help keep bad configs from ending up in zookeeper. Now with the Stellar REPL, 
users should be able to more easily do their own client-side validations.




[GitHub] incubator-metron issue #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/incubator-metron/pull/293
  
OK I have it in.  Here is the thing that I found: I know there is a pull
request about stopping IS_EMPTY from throwing exceptions, and I had to take
that same return 0 approach instead of throwing.  IS_EMPTY did NOT have a test
in StellarTest, so it was never being tested through the Stellar variable
resolver, validation, parser etc.  If it had had a test there (using the run()
as implemented in the test) it never would have been committed, because it
NEVER works, no matter what you pass it, as originally implemented with the
IllegalStateExceptions.  This is because validate calls down to apply
with a specifically null argument, which triggers the exception.
So, a couple of things:

- every Stellar function needs to be tested through the whole Stellar
framework and not just on top of StellarFunction.apply()
- why would validate call parse, which calls apply?  Is that the way it is
at runtime?  Does apply always get called twice?  Is the test harness wrong?




[GitHub] incubator-metron issue #288: METRON-480: Kibana 4.5+ Requires http.cors.enab...

2016-10-06 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/incubator-metron/pull/288
  
To fix this using Ambari on an existing ES cluster would I just go to...
  Elasticsearch -> Configs -> Custom elastic_site -> Add Property -> 
key=http_cors_enabled, value=false?




[GitHub] incubator-metron issue #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/incubator-metron/pull/293
  
@ottobackwards I'd vote for it to be in DataStructureFunctions and it's an 
interesting question about that.  Should the length of `null` be an exception, 
null or `0`?

I, personally, vote for `0`, but I might be persuaded otherwise.
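
For illustration only, the `0` option could look roughly like the sketch below. The class and method names are made up for this example and are not the code in the PR; returning 0 for unsupported types is likewise an assumption, not something decided in this thread.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Map;

final class LengthSketch {
    /** Hypothetical null-safe length: null counts as length 0 instead of throwing. */
    static int length(Object value) {
        if (value == null) {
            return 0;
        }
        if (value instanceof CharSequence) {
            return ((CharSequence) value).length();
        }
        if (value instanceof Collection) {
            return ((Collection<?>) value).size();
        }
        if (value instanceof Map) {
            return ((Map<?, ?>) value).size();
        }
        return 0; // assumption: unsupported types are treated like null
    }

    public static void main(String[] args) {
        System.out.println(length(null));                    // prints 0
        System.out.println(length("abc"));                   // prints 3
        System.out.println(length(Arrays.asList(1, 2)));     // prints 2
    }
}
```

Returning `0` also means the function survives a validation pass that resolves every variable to null; throwing would not.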




[GitHub] incubator-metron issue #296: METRON-439: Stellar : IS_EMPTY(host) throws exc...

2016-10-06 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/incubator-metron/pull/296
  
OK, if I change my functions to not throw then it works.  This doesn't seem 
right to me, but it may be something to do with why the tests are run that way.




[GitHub] incubator-metron pull request #298: METRON-432: Fix pcap field resolver to r...

2016-10-06 Thread mmiklavc
GitHub user mmiklavc opened a pull request:

https://github.com/apache/incubator-metron/pull/298

METRON-432: Fix pcap field resolver to return object instead of string value

Addresses https://issues.apache.org/jira/browse/METRON-432

Passes PcapTopologyIntegrationTest. Running manual tests now in quick-dev, 
but wanted to get this up for review asap.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mmiklavc/incubator-metron METRON-432

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-metron/pull/298.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #298


commit 078cd25f2f2119f388aa4c4c83c0f60503362bab
Author: Michael Miklavcic 
Date:   2016-09-20T15:20:27Z

METRON-432 Partial commit - fix field resolution to use Objects instead of 
Strings

commit 877e4602e1d0c2a44364b1f46df87c8c282be515
Author: Michael Miklavcic 
Date:   2016-10-06T18:38:43Z

METRON-432: Fix pcap field resolver to return object instead of string value






[GitHub] incubator-metron issue #288: METRON-480: Kibana 4.5+ Requires http.cors.enab...

2016-10-06 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/incubator-metron/pull/288
  
Ah, chased down the JIRA.  Looks like it is.




[GitHub] incubator-metron issue #288: METRON-480: Kibana 4.5+ Requires http.cors.enab...

2016-10-06 Thread nickwallen
Github user nickwallen commented on the issue:

https://github.com/apache/incubator-metron/pull/288
  
Was this intended to fix the following error thrown by Kibana when it tries 
to talk to Elasticsearch?

```
Courier Fetch Error: unhandled courier request error: Authorization 
Exception
```




[GitHub] incubator-metron pull request #297: METRON-488: Snort should use a proper CS...

2016-10-06 Thread cestella
GitHub user cestella opened a pull request:

https://github.com/apache/incubator-metron/pull/297

METRON-488: Snort should use a proper CSV implementation

Right now if you have a custom snort rule (e.g. alert tcp any any -> any 
any (msg:'snort alert message having a ,(comma) to check csv parsing'; 
sid:999158; ) ) the snort parser will fail to parse because it's splitting on 
the comma naively.
It should use the existing CSV parsing infrastructure that we have and that 
is used in the CSVParser.
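
A rough sketch of the difference, assuming the message field arrives double-quoted in the CSV line. The sample line is invented for this example and is not real Snort output, and the hand-rolled splitter only stands in for what a real CSV library such as OpenCSV does.

```java
import java.util.ArrayList;
import java.util.List;

final class CsvSplitSketch {
    /** Splits on commas that are outside double quotes; "" inside quotes is an escaped quote. */
    static List<String> splitQuoted(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // escaped quote inside a quoted field
                    i++;
                } else {
                    inQuotes = !inQuotes; // opening or closing quote
                }
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        String line = "01/01/16,\"snort alert, with comma\",999158"; // made-up sample
        System.out.println(line.split(",").length);   // prints 4: the naive split breaks the message
        System.out.println(splitQuoted(line).size()); // prints 3
    }
}
```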

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cestella/incubator-metron snort_delim_bug

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-metron/pull/297.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #297


commit f0a57334d0d80e298e5ea25f1b114ae0d6db4b11
Author: cstella 
Date:   2016-10-06T18:14:46Z

Updating the snort parser to use the CSVExtractor infrastructure, which is 
a thin layer on top of OpenCSV






[GitHub] incubator-metron issue #296: METRON-439: Stellar : IS_EMPTY(host) throws exc...

2016-10-06 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/incubator-metron/pull/296
  
I'm working on making LENGTH work like IS_EMPTY.  I can't get either one to
work, as both are failing validation.  There is no StellarTest for IS_EMPTY (I
have added one to check whether it is just my code that is failing).  Can you
add one and try it there, and make sure it works all the way through the
Stellar parse and variable resolution?
ParseException: Unable to parse IS_EMPTY(foo): Unable to execute: IS_EMPTY 
expects a collection or a string

This is when calling with
`String query = "IS_EMPTY(foo)";
 Assert.assertEquals(true, run(query, ImmutableMap.of("foo", "")));
`




Re: [DISCUSS] Opinionated Data Flows

2016-10-06 Thread zeo...@gmail.com
It should also be smart enough to handle an order like:

source("bro")
  -> parser("BasicBroParser")
  -> exists("ip_src_addr")
  -> geo_ip_src = geo["ip_src_addr"]
  -> application = assets["ip_src_addr"].application
  -> owner = assets["ip_src_addr"].owner
  -> exists("ip_dst_addr")
  -> geo_ip_dst = geo["ip_dst_addr"]
  -> elasticsearch("bro-index")

Without duplicate hits of the topologies.

Jon

On Thu, Oct 6, 2016 at 1:55 PM Nick Allen  wrote:

> Here is a quick example with some hypothetical syntax.  Whatever that syntax
> might be, it would be very simple, easy to understand, and leverage
> high-level concepts specific to Metron.
>
> This flow consumes Bro data, ensures there are valid source/destination
> IPs, performs geo-enrichment, asset enrichment and finally persists the
> data in Elasticsearch.
>
>
> source("bro")
>   -> parser("BasicBroParser")
>   -> exists("ip_src_addr")
>   -> exists("ip_dst_addr")
>   -> geo_ip_src = geo["ip_src_addr"]
>   -> geo_ip_dst = geo["ip_dst_addr"]
>   -> application = assets["ip_src_addr"].application
>   -> owner = assets["ip_src_addr"].owner
>   -> elasticsearch("bro-index")
>
>
>
>
> On Thu, Oct 6, 2016 at 12:58 PM, Nick Allen  wrote:
>
> > Chasing this bad idea down even further leads me to something even
> > crazier.
> >
> > Stellar 1.0 can only operate within a single topology and in most cases
> > only on a single message.  Stellar 2.0 could be the mechanism that allows
> > users to define their own data flows and what "useful bits of Metron
> > functionality" get plugged-in.
> >
> > Once, you have a DSL that allows users to define what they want Metron to
> > do, then the underlying implementation mechanism (which is currently
> Storm)
> > can also be swapped-out.  If we have an even faster Storm implementation,
> > then we swap in the Storm NG engine.  Maybe we want Metron to also run in
> > Flink, then we just swap-in a Flink engine.
> >
> >
> >
> >
> > On Thu, Oct 6, 2016 at 12:52 PM, Nick Allen  wrote:
> >
> >> I totally "bird dogged the previous thread" as Casey likes to call it.
> :)
> >>  I am extracting this thought into a separate thread before I start
> >> throwing out even more, crazier ideas.
> >>
> >> In general, Metron is very opinionated about data flows right now.  We
> >>> have Parser topologies that feed an Enrichment topology, which then
> feeds
> >>> an Indexing topology.  We have useful bits of functionality (think
> Stellar
> >>> transforms, Geo enrichment, etc) that are closely coupled with these
> >>> topologies (aka data flows).
> >>>
> >>
> >>
> >>> When a user wants to parse heterogeneous data from a single topic, that's
> >>> not easy.  When a user wants enriched output to land in unique topics by
> >>> sensor type, well, that's also not easy.  When a user wanted to skip
> >>> enrichment of data sources, we actually re-architected the data flow to add
> >>> the Indexing topology.
> >>>
> >>
> >>
> >>> In an ideal world, a user should be responsible for defining the data
> >>> flow, not Metron.  Metron should provide the "useful bits of
> functionality"
> >>> that a user can "plugin" wherever they like.  Metron itself should not
> care
> >>> how the data is moving or what step in the process it is at.
> >>
> >>
> >>
> >>
> >> --
> >> Nick Allen 
> >>
> >
> >
> >
> > --
> > Nick Allen 
> >
>
>
>
> --
> Nick Allen 
>
-- 

Jon


Re: [DISCUSS] Opinionated Data Flows

2016-10-06 Thread Nick Allen
Here is a quick example with some hypothetical syntax.  Whatever that syntax
might be, it would be very simple, easy to understand, and leverage
high-level concepts specific to Metron.

This flow consumes Bro data, ensures there are valid source/destination
IPs, performs geo-enrichment, asset enrichment and finally persists the
data in Elasticsearch.


source("bro")
  -> parser("BasicBroParser")
  -> exists("ip_src_addr")
  -> exists("ip_dst_addr")
  -> geo_ip_src = geo["ip_src_addr"]
  -> geo_ip_dst = geo["ip_dst_addr"]
  -> application = assets["ip_src_addr"].application
  -> owner = assets["ip_src_addr"].owner
  -> elasticsearch("bro-index")
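
To show that the idea does not depend on any particular syntax, here is a rough Java sketch of the same kind of flow as an ordered list of steps over a message map. Everything in it (`FlowSketch`, `Step`, `exists`, `enrich`, the fake geo lookup) is hypothetical and invented for this example; it is not an existing Metron or Stellar API, and the source, parser and Elasticsearch stages are left out.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

final class FlowSketch {
    /** A step either passes a (possibly enriched) message on or drops it. */
    interface Step extends Function<Map<String, Object>, Optional<Map<String, Object>>> {}

    /** Drops the message unless the field is present. */
    static Step exists(String field) {
        return msg -> {
            if (msg.containsKey(field)) {
                return Optional.of(msg);
            }
            return Optional.empty();
        };
    }

    /** Adds target = lookup(msg[source]) to a copy of the message. */
    static Step enrich(String target, String source, Function<Object, Object> lookup) {
        return msg -> {
            Map<String, Object> out = new HashMap<>(msg);
            out.put(target, lookup.apply(msg.get(source)));
            return Optional.of(out);
        };
    }

    /** Runs the steps in order, stopping as soon as one drops the message. */
    static Optional<Map<String, Object>> run(List<Step> steps, Map<String, Object> message) {
        Optional<Map<String, Object>> current = Optional.of(message);
        for (Step step : steps) {
            current = current.flatMap(step);
        }
        return current;
    }

    public static void main(String[] args) {
        List<Step> flow = Arrays.asList(
                exists("ip_src_addr"),
                exists("ip_dst_addr"),
                enrich("geo_ip_src", "ip_src_addr", ip -> "US")); // fake geo lookup
        Map<String, Object> msg = new HashMap<>();
        msg.put("ip_src_addr", "10.0.0.1");
        msg.put("ip_dst_addr", "10.0.0.2");
        System.out.println(run(flow, msg).isPresent()); // prints true
    }
}
```

Because each step is just a value in a list, the user decides the order, and the engine underneath is free to decide how the steps are actually executed.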




On Thu, Oct 6, 2016 at 12:58 PM, Nick Allen  wrote:

> Chasing this bad idea down even further leads me to something even
> crazier.
>
> Stellar 1.0 can only operate within a single topology and in most cases
> only on a single message.  Stellar 2.0 could be the mechanism that allows
> users to define their own data flows and what "useful bits of Metron
> functionality" get plugged-in.
>
> Once, you have a DSL that allows users to define what they want Metron to
> do, then the underlying implementation mechanism (which is currently Storm)
> can also be swapped-out.  If we have an even faster Storm implementation,
> then we swap in the Storm NG engine.  Maybe we want Metron to also run in
> Flink, then we just swap-in a Flink engine.
>
>
>
>
> On Thu, Oct 6, 2016 at 12:52 PM, Nick Allen  wrote:
>
>> I totally "bird dogged the previous thread" as Casey likes to call it. :)
>>  I am extracting this thought into a separate thread before I start
>> throwing out even more, crazier ideas.
>>
>> In general, Metron is very opinionated about data flows right now.  We
>>> have Parser topologies that feed an Enrichment topology, which then feeds
>>> an Indexing topology.  We have useful bits of functionality (think Stellar
>>> transforms, Geo enrichment, etc) that are closely coupled with these
>>> topologies (aka data flows).
>>>
>>
>>
>>> When a user wants to parse heterogeneous data from a single topic, that's
>>> not easy.  When a user wants enriched output to land in unique topics by
>>> sensor type, well, that's also not easy.  When a user wanted to skip
>>> enrichment of data sources, we actually re-architected the data flow to add
>>> the Indexing topology.
>>>
>>
>>
>>> In an ideal world, a user should be responsible for defining the data
>>> flow, not Metron.  Metron should provide the "useful bits of functionality"
>>> that a user can "plugin" wherever they like.  Metron itself should not care
>>> how the data is moving or what step in the process it is at.
>>
>>
>>
>>
>> --
>> Nick Allen 
>>
>
>
>
> --
> Nick Allen 
>



-- 
Nick Allen 


Re: [DISCUSS] Opinionated Data Flows

2016-10-06 Thread Nick Allen
Personally, I was seeing METRON-477 as one of those "useful bits of
functionality" that would be orchestrated by Stellar 2.0.  But I can also
see your viewpoint on how it could also be part of the orchestration.  Very
interesting.



On Thu, Oct 6, 2016 at 1:09 PM, zeo...@gmail.com  wrote:

> One of those users gives this a +1.  This also appears related to
> METRON-477, except that 477 is more
> focused on data flow once it hits disk and this is during ingest/stream
> processing.  At the end of the day, not that different IMO.  Would love to
> see it all managed via Stellar/zookeeper.
>
> Jon
>
> On Thu, Oct 6, 2016 at 1:00 PM Nick Allen  wrote:
>
> In reality, the current "engine" is Storm + Kafka + HBase.  Each of these
> could be independently swapped out once Metron is just a DSL with multiple
> underlying engines.
>
> Ok, I'll stop.
>
> On Thu, Oct 6, 2016 at 12:58 PM, Nick Allen  wrote:
>
> > Chasing this bad idea down even further leads me to something even
> > crazier.
> >
> > Stellar 1.0 can only operate within a single topology and in most cases
> > only on a single message.  Stellar 2.0 could be the mechanism that allows
> > users to define their own data flows and what "useful bits of Metron
> > functionality" get plugged-in.
> >
> > Once, you have a DSL that allows users to define what they want Metron to
> > do, then the underlying implementation mechanism (which is currently
> Storm)
> > can also be swapped-out.  If we have an even faster Storm implementation,
> > then we swap in the Storm NG engine.  Maybe we want Metron to also run in
> > Flink, then we just swap-in a Flink engine.
> >
> >
> >
> >
> > On Thu, Oct 6, 2016 at 12:52 PM, Nick Allen  wrote:
> >
> >> I totally "bird dogged the previous thread" as Casey likes to call it.
> :)
> >>  I am extracting this thought into a separate thread before I start
> >> throwing out even more, crazier ideas.
> >>
> >> In general, Metron is very opinionated about data flows right now.  We
> >>> have Parser topologies that feed an Enrichment topology, which then
> feeds
> >>> an Indexing topology.  We have useful bits of functionality (think
> Stellar
> >>> transforms, Geo enrichment, etc) that are closely coupled with these
> >>> topologies (aka data flows).
> >>>
> >>
> >>
> >>> When a user wants to parse heterogeneous data from a single topic, that's
> >>> not easy.  When a user wants enriched output to land in unique topics by
> >>> sensor type, well, that's also not easy.  When a user wanted to skip
> >>> enrichment of data sources, we actually re-architected the data flow to add
> >>> the Indexing topology.
> >>>
> >>
> >>
> >>> In an ideal world, a user should be responsible for defining the data
> >>> flow, not Metron.  Metron should provide the "useful bits of
> functionality"
> >>> that a user can "plugin" wherever they like.  Metron itself should not
> care
> >>> how the data is moving or what step in the process it is at.
> >>
> >>
> >>
> >>
> >> --
> >> Nick Allen 
> >>
> >
> >
> >
> > --
> > Nick Allen 
> >
>
>
>
> --
> Nick Allen 
>
> --
>
> Jon
>



-- 
Nick Allen 


[GitHub] incubator-metron pull request #296: METRON-439: Stellar : IS_EMPTY(host) thr...

2016-10-06 Thread mmiklavc
GitHub user mmiklavc opened a pull request:

https://github.com/apache/incubator-metron/pull/296

METRON-439: Stellar : IS_EMPTY(host) throws exception

Fix Stellar IS_EMPTY validation to handle empty and null

Addresses https://issues.apache.org/jira/browse/METRON-439

Things have changed a bit since the original Jira was filed. Most notably, 
this error appears while dumping config from Zookeeper after it has already 
been loaded. The current functionality/validation would not have allowed the 
load to succeed in the first place. Even so, empty and null checks could be 
handled more gracefully. This PR changes IS_EMPTY to return true on null or 
empty string rather than throw an exception.

Added new unit tests and validated on quick-dev with multiple bro messages 
- normal host, empty host string, and null/non-existent host string.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mmiklavc/incubator-metron METRON-439

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-metron/pull/296.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #296


commit bd6e1a5cc48962af36b131e189ffb461e836cd20
Author: Michael Miklavcic 
Date:   2016-10-06T17:09:14Z

METRON-439: Fix Stellar IS_EMPTY validation to handle empty and null






Re: [DISCUSS] Opinionated Data Flows

2016-10-06 Thread zeo...@gmail.com
One of those users gives this a +1.  This also appears related to METRON-477,
except that 477 is more
focused on data flow once it hits disk and this is during ingest/stream
processing.  At the end of the day, not that different IMO.  Would love to
see it all managed via Stellar/zookeeper.

Jon

On Thu, Oct 6, 2016 at 1:00 PM Nick Allen  wrote:

In reality, the current "engine" is Storm + Kafka + HBase.  Each of these
could be independently swapped out once Metron is just a DSL with multiple
underlying engines.

Ok, I'll stop.

On Thu, Oct 6, 2016 at 12:58 PM, Nick Allen  wrote:

> Chasing this bad idea down even further leads me to something even
> crazier.
>
> Stellar 1.0 can only operate within a single topology and in most cases
> only on a single message.  Stellar 2.0 could be the mechanism that allows
> users to define their own data flows and what "useful bits of Metron
> functionality" get plugged-in.
>
> Once, you have a DSL that allows users to define what they want Metron to
> do, then the underlying implementation mechanism (which is currently
Storm)
> can also be swapped-out.  If we have an even faster Storm implementation,
> then we swap in the Storm NG engine.  Maybe we want Metron to also run in
> Flink, then we just swap-in a Flink engine.
>
>
>
>
> On Thu, Oct 6, 2016 at 12:52 PM, Nick Allen  wrote:
>
>> I totally "bird dogged the previous thread" as Casey likes to call it. :)
>>  I am extracting this thought into a separate thread before I start
>> throwing out even more, crazier ideas.
>>
>> In general, Metron is very opinionated about data flows right now.  We
>>> have Parser topologies that feed an Enrichment topology, which then
feeds
>>> an Indexing topology.  We have useful bits of functionality (think
Stellar
>>> transforms, Geo enrichment, etc) that are closely coupled with these
>>> topologies (aka data flows).
>>>
>>
>>
>>> When a user wants to parse heterogeneous data from a single topic, that's
>>> not easy.  When a user wants enriched output to land in unique topics by
>>> sensor type, well, that's also not easy.  When a user wanted to skip
>>> enrichment of data sources, we actually re-architected the data flow to add
>>> the Indexing topology.
>>>
>>
>>
>>> In an ideal world, a user should be responsible for defining the data
>>> flow, not Metron.  Metron should provide the "useful bits of
functionality"
>>> that a user can "plugin" wherever they like.  Metron itself should not
care
>>> how the data is moving or what step in the process it is at.
>>
>>
>>
>>
>> --
>> Nick Allen 
>>
>
>
>
> --
> Nick Allen 
>



--
Nick Allen 

-- 

Jon


Re: [DISCUSS] Opinionated Data Flows

2016-10-06 Thread Nick Allen
In reality, the current "engine" is Storm + Kafka + HBase.  Each of these
could be independently swapped out once Metron is just a DSL with multiple
underlying engines.

Ok, I'll stop.

On Thu, Oct 6, 2016 at 12:58 PM, Nick Allen  wrote:

> Chasing this bad idea down even further leads me to something even
> crazier.
>
> Stellar 1.0 can only operate within a single topology and in most cases
> only on a single message.  Stellar 2.0 could be the mechanism that allows
> users to define their own data flows and what "useful bits of Metron
> functionality" get plugged-in.
>
> Once, you have a DSL that allows users to define what they want Metron to
> do, then the underlying implementation mechanism (which is currently Storm)
> can also be swapped-out.  If we have an even faster Storm implementation,
> then we swap in the Storm NG engine.  Maybe we want Metron to also run in
> Flink, then we just swap-in a Flink engine.
>
>
>
>
> On Thu, Oct 6, 2016 at 12:52 PM, Nick Allen  wrote:
>
>> I totally "bird dogged the previous thread" as Casey likes to call it. :)
>>  I am extracting this thought into a separate thread before I start
>> throwing out even more, crazier ideas.
>>
>> In general, Metron is very opinionated about data flows right now.  We
>>> have Parser topologies that feed an Enrichment topology, which then feeds
>>> an Indexing topology.  We have useful bits of functionality (think Stellar
>>> transforms, Geo enrichment, etc) that are closely coupled with these
>>> topologies (aka data flows).
>>>
>>
>>
>>> When a user wants to parse heterogeneous data from a single topic, that's
>>> not easy.  When a user wants enriched output to land in unique topics by
>>> sensor type, well, that's also not easy.  When a user wanted to skip
>>> enrichment of data sources, we actually re-architected the data flow to add
>>> the Indexing topology.
>>>
>>
>>
>>> In an ideal world, a user should be responsible for defining the data
>>> flow, not Metron.  Metron should provide the "useful bits of functionality"
>>> that a user can "plugin" wherever they like.  Metron itself should not care
>>> how the data is moving or what step in the process it is at.
>>
>>
>>
>>
>> --
>> Nick Allen 
>>
>
>
>
> --
> Nick Allen 
>



-- 
Nick Allen 


Re: Name conventions for parsers

2016-10-06 Thread Otto Fowler
So extend NiFi to produce Metron topologies.  Gotcha.

On October 6, 2016 at 12:48:52, Nick Allen (n...@nickallen.org) wrote:

I


Re: [DISCUSS] Opinionated Data Flows

2016-10-06 Thread Nick Allen
Chasing this bad idea down even further leads me to something even crazier.

Stellar 1.0 can only operate within a single topology and in most cases
only on a single message.  Stellar 2.0 could be the mechanism that allows
users to define their own data flows and what "useful bits of Metron
functionality" get plugged in.

Once you have a DSL that allows users to define what they want Metron to
do, then the underlying implementation mechanism (which is currently Storm)
can also be swapped out.  If we have an even faster Storm implementation,
then we swap in the Storm NG engine.  If we want Metron to also run in
Flink, then we just swap in a Flink engine.
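
The engine-swapping idea above can be sketched roughly as follows. This is
only an illustration under stated assumptions: `FlowEngine`,
`EngineSwapSketch`, and the step strings are hypothetical names invented for
this sketch, not Metron, Storm, or Flink APIs, and the stand-in engine merely
tags each step with an engine name instead of building a real topology.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical abstraction: an engine turns DSL steps into its own execution plan.
interface FlowEngine {
    List<String> plan(List<String> steps);
}

final class EngineSwapSketch {
    private EngineSwapSketch() {}

    // Builds a stand-in engine that only tags each step with the engine name.
    // A real engine would translate the steps into Storm, Flink, etc.
    static FlowEngine engine(final String name) {
        return new FlowEngine() {
            @Override
            public List<String> plan(List<String> steps) {
                List<String> planned = new ArrayList<>();
                for (String step : steps) {
                    planned.add(name + ":" + step);
                }
                return planned;
            }
        };
    }

    // The flow definition stays the same no matter which engine is passed in.
    static List<String> deploy(List<String> flow, FlowEngine engine) {
        return engine.plan(flow);
    }
}
```

The point of the sketch is that the user-facing flow (a list of steps) never
mentions the engine, so swapping engines would only change the argument
passed to `deploy`.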




On Thu, Oct 6, 2016 at 12:52 PM, Nick Allen  wrote:

> I totally "bird dogged the previous thread" as Casey likes to call it. :)
>  I am extracting this thought into a separate thread before I start
> throwing out even more, crazier ideas.
>
> In general, Metron is very opinionated about data flows right now.  We
>> have Parser topologies that feed an Enrichment topology, which then feeds
>> an Indexing topology.  We have useful bits of functionality (think Stellar
>> transforms, Geo enrichment, etc) that are closely coupled with these
>> topologies (aka data flows).
>>
>
>
>> When a user wants to parse heterogeneous data from a single topic, that's
>> not easy.  When a user wants enriched output to land in unique topics by
>> sensor type, well, that's also not easy.  When a user wanted to skip
>> enrichment of data sources, we actually re-architected the data flow to add
>> the Indexing topology.
>>
>
>
>> In an ideal world, a user should be responsible for defining the data
>> flow, not Metron.  Metron should provide the "useful bits of functionality"
>> that a user can "plugin" wherever they like.  Metron itself should not care
>> how the data is moving or what step in the process it is at.
>
>
>
>
> --
> Nick Allen 
>



-- 
Nick Allen 


[GitHub] incubator-metron issue #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/incubator-metron/pull/293
  
@cestella I can do that.  Should I move LENGTH to DataStructureFunctions
then?
Also, I notice that IS_EMPTY throws an IllegalStateException instead of
returning null.  Is that how a missing variable should be handled?  The
String functions return null.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[DISCUSS] Opinionated Data Flows

2016-10-06 Thread Nick Allen
I totally "bird dogged the previous thread" as Casey likes to call it. :)
 I am extracting this thought into a separate thread before I start
throwing out even more, crazier ideas.

In general, Metron is very opinionated about data flows right now.  We have
> Parser topologies that feed an Enrichment topology, which then feeds an
> Indexing topology.  We have useful bits of functionality (think Stellar
> transforms, Geo enrichment, etc) that are closely coupled with these
> topologies (aka data flows).
>


> When a user wants to parse heterogeneous data from a single topic, that's
> not easy.  When a user wants enriched output to land in unique topics by
> sensor type, well, that's also not easy.  When a user wanted to skip
> enrichment of data sources, we actually re-architected the data flow to add
> the Indexing topology.
>


> In an ideal world, a user should be responsible for defining the data
> flow, not Metron.  Metron should provide the "useful bits of functionality"
> that a user can "plugin" wherever they like.  Metron itself should not care
> how the data is moving or what step in the process it is at.




-- 
Nick Allen 


Re: Name conventions for parsers

2016-10-06 Thread Nick Allen
In general, Metron is very opinionated about data flows right now.  We have
Parser topologies that feed an Enrichment topology, which then feeds an
Indexing topology.  We have useful bits of functionality (think Stellar
transforms, Geo enrichment, etc) that are closely coupled with these
topologies (aka data flows).

When a user wants to parse heterogeneous data from a single topic, that's
not easy.  When a user wants enriched output to land in unique topics by
sensor type, well, that's also not easy.  When a user wanted to skip
enrichment of data sources, we actually re-architected the data flow to add
the Indexing topology.

In an ideal world, a user should be responsible for defining the data flow,
not Metron.  Metron should provide the "useful bits of functionality" that
a user can "plugin" wherever they like.  Metron itself should not care how
the data is moving or what step in the process it is at.






On Thu, Oct 6, 2016 at 12:27 PM, Nick Allen  wrote:

> I think that's a good problem to solve, Jon.  Having some way to handle
> different types of data hitting the same Kafka topic, would be a very
> common problem.  We should make this easy to handle.  And as Simon
> mentioned, it solves the problem of ingesting low-volume data streams where
> the cost of a dedicated topology is overkill.
>
> Syslog is a good example use case.  Another example use case might be
> extracting data out of Splunk.  I worked at an organization that was using
> Splunk as the centralized log store to meet regulatory requirements.  Of
> course, Splunk is expensive so overlaying additional functionality on the
> existing installation was cost prohibitive.  The only efficient way we
> could get data out of Splunk was one big pipe containing heterogeneous data.
> Perhaps there are other ways around it now.  I am no Splunk expert, but
> this seems like a common problem.
>
>
>
> On Thu, Oct 6, 2016 at 11:51 AM, zeo...@gmail.com 
> wrote:
>
>> A storm splitter gateway topology was another path that I considered,
>> especially because it would allow configs like what Yohann mentioned
>> earlier with:
>>
>> > So, it would be really useful that Metron could handle a syslog flow
>> > and automatically apply the right parser for each log. In order to
>> > help Metron, a config could be provided by the "Security Platform
>> > Engineer" to preselect a list of parser per device (as you know what
>> > type of logs a device  should send).  This feature exists in
>> > commercial SIEM.
>>
>> It's just not as easy to get going as an upstream splitter and/or parser in
>> my scenario.
>>
>> Perhaps that should be an enhancement JIRA though?  I really think we need
>> to lower the barrier to getting logs to Metron in the first place, even
>> going as far as having a syslog listener (I looked at embedding rsyslog and
>> syslog-ng and they both unfortunately are GPL licensed, so that's out...).
>>
>> Jon
>>
>> On Thu, Oct 6, 2016 at 9:58 AM Otto Fowler 
>> wrote:
>>
>> Each of these split things would need to end up in their own topology,
>> since they would each have different STELLAR and Enrichment
>> configurations.
>>
>> It would be simpler I think to split them than to have a topology chain
>> that ‘switches’ over a type of field and muddy stellar configs etc.
>>
>> If that is true, then the question is to split as part of the external
>> delivery ( not metron’s problem ) in NiFi or , or to have a ‘gateway -
>> splitter’ topology with only split rules to feed the other typed
>> topologies.
>>
>> Or I’m totally wrong and you can forgive me ;)
>>
>> O
>>
>>
>> On October 6, 2016 at 08:32:51, zeo...@gmail.com (zeo...@gmail.com)
>> wrote:
>>
>> If we don't do it by device I would be concerned that some more
>> appliance-based systems wouldn't allow the flexibility to split things up
>> to different destinations, nor would they allow external additions (NiFi,
>> etc.). This is where I am right now, where I can send from certain appliances
>> into my syslog infrastructure, then either force my syslog architecture to
>> selectively send onto Metron, or parse and then send into a generic JSON
>> parser (I will probably go the latter route). In order to standardize and
>> simplify, I would suggest continuing down the device-based route.
>>
>> Generally, I expect the community to grow and for parsers to just exist,
>> and some users to only do minor updates to them or throw together grok
>> parsers using GROK_PREDICT() where necessary. In fact I would hope that is
>> the case, as it would indicate a broader user base.
>>
>> Jon
>>
>> On Thu, Oct 6, 2016 at 8:02 AM Simon Elliston Ball <
>> si...@simonellistonball.com> wrote:
>>
>> > > On 6 Oct 2016, at 12:22, Yohann Lepage  wrote:
>> > >
>> > > 2016-10-06 12:21 GMT+02:00 zeo...@gmail.com :
>> > >> I would think that instead we work to make each parser able to handle
>> > all
>> > >> the known outputs (and document explicitly what outputs per parser
>> are
>> > >> supported) from a product and go 

Re: Name conventions for parsers

2016-10-06 Thread Casey Stella
Not to birddog the thread, but as an aside, I'd like to see an annotations
based approach for the parser with a namespace, similar to how we do for
stellar functions.  This, I think, would make it easier for specifying them
and we could associate descriptions, possible params for configuring them,
etc.

Thoughts on this?

On Wed, Oct 5, 2016 at 1:32 PM, Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> At present we do not have a formal convention. Many organizations will no
> doubt want to create their own conventions to match existing naming
> methodologies.
>
> However, it seems like an excellent idea to at least produce some
> community driven recommendations for a standard baseline those without
> strong existing methods could adopt.
>
> I like your vendor-product approach, but would consider adding something
> around model / series / version to that. Does anyone have any thoughts on
> how such a taxonomy would work best?
>
> Simon
>
> > On 5 Oct 2016, at 18:22, Vladimir Shlyakhtin <
> vladimir.shlyakh...@sstech.us> wrote:
> >
> > Hi
> >
> > Does Metron have any recommendation for name convention for parsers?
> Like vendor-product.
> >
> > Thanks
> >
> > - Vladimir
>


Re: Name conventions for parsers

2016-10-06 Thread Nick Allen
I think that's a good problem to solve, Jon.  Having some way to handle
different types of data hitting the same Kafka topic, would be a very
common problem.  We should make this easy to handle.  And as Simon
mentioned, it solves the problem of ingesting low-volume data streams where
the cost of a dedicated topology is overkill.

Syslog is a good example use case.  Another example use case might be
extracting data out of Splunk.  I worked at an organization that was using
Splunk as the centralized log store to meet regulatory requirements.  Of
course, Splunk is expensive so overlaying additional functionality on the
existing installation was cost prohibitive.  The only efficient way we
could get data out of Splunk was one big pipe containing heterogeneous data.
Perhaps there are other ways around it now.  I am no Splunk expert, but
this seems like a common problem.



On Thu, Oct 6, 2016 at 11:51 AM, zeo...@gmail.com  wrote:

> A storm splitter gateway topology was another path that I considered,
> especially because it would allow configs like what Yohann mentioned
> earlier with:
>
> > So, it would be really useful that Metron could handle a syslog flow
> > and automatically apply the right parser for each log. In order to
> > help Metron, a config could be provided by the "Security Platform
> > Engineer" to preselect a list of parser per device (as you know what
> > type of logs a device  should send).  This feature exists in
> > commercial SIEM.
>
> It's just not as easy to get going as an upstream splitter and/or parser in
> my scenario.
>
> Perhaps that should be an enhancement JIRA though?  I really think we need
> to lower the barrier to getting logs to Metron in the first place, even
> going as far as having a syslog listener (I looked at embedding rsyslog and
> syslog-ng and they both unfortunately are GPL licensed, so that's out...).
>
> Jon
>
> On Thu, Oct 6, 2016 at 9:58 AM Otto Fowler 
> wrote:
>
> Each of these split things would need to end up in their own topology,
> since they would each have different STELLAR and Enrichment configurations.
>
> It would be simpler I think to split them than to have a topology chain
> that ‘switches’ over a type of field and muddy stellar configs etc.
>
> If that is true, then the question is to split as part of the external
> delivery ( not metron’s problem ) in NiFi or , or to have a ‘gateway -
> splitter’ topology with only split rules to feed the other typed
> topologies.
>
> Or I’m totally wrong and you can forgive me ;)
>
> O
>
>
> On October 6, 2016 at 08:32:51, zeo...@gmail.com (zeo...@gmail.com) wrote:
>
> If we don't do it by device I would be concerned that some more
> appliance-based systems wouldn't allow the flexibility to split things up
> to different destinations, nor would they allow external additions (NiFi,
> etc.). This is where I am right now, where I can send from certain appliances
> into my syslog infrastructure, then either force my syslog architecture to
> selectively send onto Metron, or parse and then send into a generic JSON
> parser (I will probably go the latter route). In order to standardize and
> simplify, I would suggest continuing down the device-based route.
>
> Generally, I expect the community to grow and for parsers to just exist,
> and some users to only do minor updates to them or throw together grok
> parsers using GROK_PREDICT() where necessary. In fact I would hope that is
> the case, as it would indicate a broader user base.
>
> Jon
>
> On Thu, Oct 6, 2016 at 8:02 AM Simon Elliston Ball <
> si...@simonellistonball.com> wrote:
>
> > > On 6 Oct 2016, at 12:22, Yohann Lepage  wrote:
> > >
> > > 2016-10-06 12:21 GMT+02:00 zeo...@gmail.com :
> > >> I would think that instead we work to make each parser able to handle all
> > >> the known outputs (and document explicitly what outputs per parser are
> > >> supported) from a product and go back to vendor_product, with versions of
> > >> the product supported/tested and version of the parser being stored in code
> > >> and documentation only.
> > > +1
> > >
> >
> > +1 - this is similar to the evolving schema problem, and probably belongs
> > in code.
> >
> > >> I'm currently working on mechanisms to get logs into Metron most
> > >> efficiently because all of my syslog comes in one big pipe.
> > > I have a similar use case. Most of the time, admins are ok to forward
> > > logs from rsyslog/syslog-ng to the SIEM as they don't want to install
> > > an agent ( *.* @@siem.intra:514;).
> > >
> > > The result is that you receive a mix of log
> > > (sudo/apache/mysql/audit/etc) from the same device and the SIEM have
> > > to deals with it.
> > >
> > > So, it would be really useful that Metron could handle a syslog flow
> > > and automatically apply the right parser for each log. In order to
> > > help Metron, a config could be provided by the "Security Platform
> > > Engineer" to preselect a list of parser per device (as you know what
> > > type of logs a device shou

[GitHub] incubator-metron issue #295: METRON-371: Changing logging level to INFO when...

2016-10-06 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/incubator-metron/pull/295
  
Testing Steps
1. Ensure squid topology is up.
2. Inject the following message through the kafka-producer so that it is ingested:
```
1461576382.642161 127.0.0.1 TCP_MISS/200 103701 GET http://www.abc.com/ 
- DIRECT/199.27.79.73 text/html
```
3. Wait for the enrichment and index to be generated. 
4. Review the enrichment kafkaspout log file and ensure that the error
no longer appears.




[GitHub] incubator-metron pull request #295: METRON-371: Changing logging level to IN...

2016-10-06 Thread cestella
GitHub user cestella opened a pull request:

https://github.com/apache/incubator-metron/pull/295

METRON-371: Changing logging level to INFO when there's not a config.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cestella/incubator-metron METRON-371

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-metron/pull/295.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #295


commit c5e4703290c44750f5edee95ed0ef81fb3634403
Author: cstella 
Date:   2016-10-06T16:19:44Z

Changing logging level to INFO when there's not a config.






[GitHub] incubator-metron issue #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/incubator-metron/pull/293
  
This is cool, but what do you think about making `LENGTH` handle different 
types other than just string?  For instance, if you pass a `Collection`, it'll 
return the `size` and if a string, the length?

This is currently how `IS_EMPTY` works
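
A minimal sketch of that suggestion, assuming a plain static helper rather
than the actual Stellar function class from the pull request.  The name
`LengthSketch` is hypothetical, and treating a missing (null) input as length
0 is an assumption made for the sketch, not behavior confirmed in this thread.

```java
import java.util.Collection;
import java.util.Map;

// Hypothetical helper, not the code from the pull request.
final class LengthSketch {
    private LengthSketch() {}

    /** Returns the length of a string or the size of a collection or map. */
    static int length(Object input) {
        if (input == null) {
            return 0; // assumption: a missing variable counts as length 0
        }
        if (input instanceof CharSequence) {
            return ((CharSequence) input).length();
        }
        if (input instanceof Collection) {
            return ((Collection<?>) input).size();
        }
        if (input instanceof Map) {
            return ((Map<?, ?>) input).size();
        }
        // Anything else is reported with the function name so the user can act on it.
        throw new IllegalArgumentException(
            "LENGTH: unsupported type " + input.getClass().getName());
    }
}
```

Dispatching on the runtime type keeps a single function name for callers,
at the cost of deciding up front what null and unsupported types should do.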




[GitHub] incubator-metron pull request #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/293#discussion_r82224827
  
--- Diff: 
metron-platform/metron-common/src/test/java/org/apache/metron/common/stellar/StellarTest.java
 ---
@@ -285,6 +285,18 @@ public void testHappyPath() {
   }
 
   @Test
--- End diff --

Yes, empty map.  If a variable does not exist, Stellar will pass in a null 
as the variable.  
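
To illustrate why an empty map is enough for that test: looking up an absent
key in a `java.util.Map` already yields null, so no null value has to be
stored in the map.  The sketch below uses a hypothetical `resolve` method as
a stand-in for Stellar's variable resolution; it is not the real resolver.

```java
import java.util.Map;

// Hypothetical stand-in for variable resolution, for illustration only.
final class MissingVariableSketch {
    private MissingVariableSketch() {}

    // Map.get returns null when the key is absent, so an empty map
    // models "the variable does not exist" without storing a null value.
    static Object resolve(String name, Map<String, ?> variables) {
        return variables.get(name);
    }
}
```

Under that assumption, a test for the missing-variable case can pass an empty
map and assert on whatever the function returns for a null input.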




Re: Name conventions for parsers

2016-10-06 Thread Simon Elliston Ball
> On 6 Oct 2016, at 16:51, zeo...@gmail.com  wrote:
> 
> A storm splitter gateway topology was another path that I considered,
> especially because it would allow configs like what Yohann mentioned
> earlier with:
> 
>> So, it would be really useful that Metron could handle a syslog flow
>> and automatically apply the right parser for each log. In order to
>> help Metron, a config could be provided by the "Security Platform
>> Engineer" to preselect a list of parser per device (as you know what
>> type of logs a device  should send).  This feature exists in
>> commercial SIEM.
> 
> It's just not as easy to get going as an upstream splitter and/or parser in
> my scenario.

That makes sense, and could reduce the overhead of excessive parser topologies
running in Storm, though we could think of that as separate from the desire to
split Kafka topics for the incoming data to maintain a sensible naming
convention and nice fine-grained parallelism.

> Perhaps that should be an enhancement JIRA though?  I really think we need
> to lower the barrier to getting logs to Metron in the first place, even
> going as far as having a syslog listener (I looked at embedding rsyslog and
> syslog-ng and they both unfortunately are GPL licensed, so that's out...).
> 
> Jon

Something I’ve done a couple of times is to use Apache NiFi as a syslog
listener pushing to Kafka. If you use NiFi’s syslog parsing functionality
there’s some ability to split topics beforehand. I’ve also used that to parse
to attributes, then convert to JSON and use the new Metron JSON parser, though
that’s certainly not 100% necessary. As long as you can keep up with the
throughput, that might be an easy-to-use answer to upstream parsing and
routing, as Otto suggests.

That said, such use would definitely call for more efficient sharing of parse 
topologies between log types if you went crazy with the number of target topics 
for example.

Simon

> 
> On Thu, Oct 6, 2016 at 9:58 AM Otto Fowler  wrote:
> 
> Each of these split things would need to end up in their own topology,
> since they would each have different STELLAR and Enrichment configurations.
> 
> It would be simpler I think to split them than to have a topology chain
> that ‘switches’ over a type of field and muddy stellar configs etc.
> 
> If that is true, then the question is to split as part of the external
> delivery ( not metron’s problem ) in NiFi or , or to have a ‘gateway -
> splitter’ topology with only split rules to feed the other typed topologies.
> 
> Or I’m totally wrong and you can forgive me ;)
> 
> O
> 
> 
> On October 6, 2016 at 08:32:51, zeo...@gmail.com (zeo...@gmail.com) wrote:
> 
> If we don't do it by device I would be concerned that some more
> appliance-based systems wouldn't allow the flexibility to split things up
> to different destinations, nor would they allow external additions (NiFi,
> etc.). This is where I am right now, where I can send from certain appliances
> into my syslog infrastructure, then either force my syslog architecture to
> selectively send onto Metron, or parse and then send into a generic JSON
> parser (I will probably go the latter route). In order to standardize and
> simplify, I would suggest continuing down the device-based route.
> 
> Generally, I expect the community to grow and for parsers to just exist,
> and some users to only do minor updates to them or throw together grok
> parsers using GROK_PREDICT() where necessary. In fact I would hope that is
> the case, as it would indicate a broader user base.
> 
> Jon
> 
> On Thu, Oct 6, 2016 at 8:02 AM Simon Elliston Ball <
> si...@simonellistonball.com> wrote:
> 
>>> On 6 Oct 2016, at 12:22, Yohann Lepage  wrote:
>>> 
>>> 2016-10-06 12:21 GMT+02:00 zeo...@gmail.com :
>>>> I would think that instead we work to make each parser able to handle all
>>>> the known outputs (and document explicitly what outputs per parser are
>>>> supported) from a product and go back to vendor_product, with versions of
>>>> the product supported/tested and version of the parser being stored in code
>>>> and documentation only.
>>> +1
>>> 
>> 
>> +1 - this is similar to the evolving schema problem, and probably belongs
>> in code.
>> 
>>>> I'm currently working on mechanisms to get logs into Metron most
>>>> efficiently because all of my syslog comes in one big pipe.
>>> I have a similar use case. Most of the time, admins are ok to forward
>>> logs from rsyslog/syslog-ng to the SIEM as they don't want to install
>>> an agent ( *.* @@siem.intra:514;).
>>> 
>>> The result is that you receive a mix of log
>>> (sudo/apache/mysql/audit/etc) from the same device and the SIEM have
>>> to deals with it.
>>> 
>>> So, it would be really useful that Metron could handle a syslog flow
>>> and automatically apply the right parser for each log. In order to
>>> help Metron, a config could be provided by the "Security Platform
>>> Engineer" to preselect a list of parser per device (as you know what
>>> type of logs a device sho

Re: Name conventions for parsers

2016-10-06 Thread zeo...@gmail.com
A storm splitter gateway topology was another path that I considered,
especially because it would allow configs like what Yohann mentioned
earlier with:

> So, it would be really useful that Metron could handle a syslog flow
> and automatically apply the right parser for each log. In order to
> help Metron, a config could be provided by the "Security Platform
> Engineer" to preselect a list of parser per device (as you know what
> type of logs a device  should send).  This feature exists in
> commercial SIEM.

It's just not as easy to get going as an upstream splitter and/or parser in
my scenario.
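
For reference, the per-device parser preselection that Yohann describes could
look roughly like the sketch below.  Everything in it is an assumption made
for illustration: `ParserRoutingSketch`, the matcher predicates, and the
"unparsed" fallback are hypothetical and are not part of Metron or Storm.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical splitter: picks a parser for each log line from a
// per-device candidate list.
final class ParserRoutingSketch {
    private final Map<String, Predicate<String>> matchers = new LinkedHashMap<>();
    private final Map<String, List<String>> candidatesByDevice = new LinkedHashMap<>();

    // Registers a cheap test that says whether a parser can handle a line.
    void registerParser(String parserName, Predicate<String> matches) {
        matchers.put(parserName, matches);
    }

    // The config: which parsers are worth trying for a device, in order.
    void preselect(String device, String... parserNames) {
        candidatesByDevice.put(device, Arrays.asList(parserNames));
    }

    // Returns the first preselected parser that matches, else "unparsed".
    String route(String device, String logLine) {
        List<String> candidates =
            candidatesByDevice.getOrDefault(device, Collections.<String>emptyList());
        for (String parserName : candidates) {
            Predicate<String> matches = matchers.get(parserName);
            if (matches != null && matches.test(logLine)) {
                return parserName;
            }
        }
        return "unparsed"; // hypothetical fallback destination
    }
}
```

Because each device only tries its preselected parsers, a mixed syslog
stream from one host is matched against a short candidate list rather than
every parser that exists.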

Perhaps that should be an enhancement JIRA though?  I really think we need
to lower the barrier to getting logs to Metron in the first place, even
going as far as having a syslog listener (I looked at embedding rsyslog and
syslog-ng and they both unfortunately are GPL licensed, so that's out...).

Jon

On Thu, Oct 6, 2016 at 9:58 AM Otto Fowler  wrote:

Each of these split things would need to end up in their own topology,
since they would each have different STELLAR and Enrichment configurations.

It would be simpler I think to split them than to have a topology chain
that ‘switches’ over a type of field and muddy stellar configs etc.

If that is true, then the question is to split as part of the external
delivery ( not metron’s problem ) in NiFi or , or to have a ‘gateway -
splitter’ topology with only split rules to feed the other typed topologies.

Or I’m totally wrong and you can forgive me ;)

O


On October 6, 2016 at 08:32:51, zeo...@gmail.com (zeo...@gmail.com) wrote:

If we don't do it by device I would be concerned that some more
appliance-based systems wouldn't allow the flexibility to split things up
to different destinations, nor would they allow external additions (NiFi,
etc.). This is where I am right now, where I can send from certain appliances
into my syslog infrastructure, then either force my syslog architecture to
selectively send on to Metron, or parse and then send into a generic JSON
parser (I will probably go the latter route). In order to standardize and
simplify, I would suggest continuing down the device-based route.

Generally, I expect the community to grow and for parsers to just exist,
and some users to only do minor updates to them or throw together grok
parsers using GROK_PREDICT() where necessary. In fact I would hope that is
the case, as it would indicate a broader user base.

Jon

On Thu, Oct 6, 2016 at 8:02 AM Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> > On 6 Oct 2016, at 12:22, Yohann Lepage  wrote:
> >
> > 2016-10-06 12:21 GMT+02:00 zeo...@gmail.com :
> >> I would think that instead we work to make each parser able to handle all
> >> the known outputs (and document explicitly what outputs per parser are
> >> supported) from a product and go back to vendor_product, with versions of
> >> the product supported/tested and version of the parser being stored in code
> >> and documentation only.
> > +1
> >
>
> +1 - this is similar to the evolving schema problem, and probably belongs
> in code.
>
> >> I'm currently working on mechanisms to get logs into Metron most
> >> efficiently because all of my syslog comes in one big pipe.
> > I have a similar use case. Most of the time, admins are ok to forward
> > logs from rsyslog/syslog-ng to the SIEM as they don't want to install
> > an agent ( *.* @@siem.intra:514;).
> >
> > The result is that you receive a mix of logs
> > (sudo/apache/mysql/audit/etc) from the same device and the SIEM has
> > to deal with it.
> >
> > So, it would be really useful if Metron could handle a syslog flow
> > and automatically apply the right parser for each log. In order to
> > help Metron, a config could be provided by the "Security Platform
> > Engineer" to preselect a list of parsers per device (as you know what
> > type of logs a device should send). This feature exists in
> > commercial SIEMs.
> >
>
> +1 for this too. One question though, do you think it’s viable to do this
> by device. I would expect multiple types of syslog coming from the same
> physical device, especially when dealing with things like server logs.
>
> This could be handled with minimal parse and routing in NiFi potentially,
> but that may make setup more complex than the sort of mapping you’re
> talking about here. Thoughts?
>
> Simon

-- 

Jon

-- 

Jon


[GitHub] incubator-metron pull request #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread ottobackwards
Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/293#discussion_r82218813
  
--- Diff: 
metron-platform/metron-common/src/test/java/org/apache/metron/common/stellar/StellarTest.java
 ---
@@ -285,6 +285,18 @@ public void testHappyPath() {
   }
 
   @Test
+  public void testLength(){
+    String query = "LENGTH(foo)";
+    Assert.assertEquals(5, run(query,ImmutableMap.of("foo","abcde")));
+  }
+
+  @Test
--- End diff --

I will fix - see my comment above about how to test




[GitHub] incubator-metron pull request #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread ottobackwards
Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/293#discussion_r82218691
  
--- Diff: 
metron-platform/metron-common/src/test/java/org/apache/metron/common/stellar/StellarTest.java
 ---
@@ -285,6 +285,18 @@ public void testHappyPath() {
   }
 
   @Test
--- End diff --

So, I was going to make a test passing in NULL, but ImmutableMap will
not take nulls.  Can you clarify how you would test this?




[GitHub] incubator-metron pull request #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread ottobackwards
Github user ottobackwards commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/293#discussion_r82218396
  
--- Diff: 
metron-platform/metron-common/src/main/java/org/apache/metron/common/dsl/functions/StringFunctions.java
 ---
@@ -222,4 +222,16 @@ public Object apply(List args) {
   return null;
 }
   }
+
+  @Stellar( name="LENGTH"
+  , description = "Returns the length of a string"
+  , params = { "input - String" }
+  , returns = "String"
--- End diff --

No - I need to fix that.  I started with the TO_UPPER def. to make sure I 
got it right and need to change that.




[GitHub] incubator-metron pull request #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/293#discussion_r82217916
  
--- Diff: 
metron-platform/metron-common/src/test/java/org/apache/metron/common/stellar/StellarTest.java
 ---
@@ -285,6 +285,18 @@ public void testHappyPath() {
   }
 
   @Test
+  public void testLength(){
+    String query = "LENGTH(foo)";
+    Assert.assertEquals(5, run(query,ImmutableMap.of("foo","abcde")));
+  }
+
+  @Test
--- End diff --

If I mistakenly call `LENGTH()` with no argument, then I get an 
`IndexOutOfBoundsException`.  And this is not unique to the `LENGTH` function.  
There are a number of other Stellar functions that do the same.

This may be confusing to a user of Stellar who is not familiar with the 
underlying implementation.  "IndexOutOfBounds? What the heck does that mean?"  

I wonder if it makes sense to handle this in a standard way within Stellar.  
Ideally we would get something along the lines of "LENGTH: Missing first 
argument 'input'.  Expected 'String'".  

I bring this up only as a discussion point for future work.  This PR does 
not necessarily have to address this situation.
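
As a rough illustration of what such standard handling might look like, here is a minimal sketch. The `requireArg` helper and the `ArgumentCheckSketch` class are hypothetical names invented for this example; nothing here is an existing Stellar API. The idea is simply to check the argument list before indexing into it, so the user sees the function name, argument name, and expected type rather than an `IndexOutOfBoundsException`.

```java
import java.util.List;

final class ArgumentCheckSketch {

  // Returns the argument at `index`, or throws a descriptive error instead
  // of letting List.get() surface an IndexOutOfBoundsException.
  static Object requireArg(String function, List<Object> args, int index,
                           String name, String expectedType) {
    if (args == null || index < 0 || index >= args.size()) {
      throw new IllegalArgumentException(String.format(
          "%s: Missing %s argument '%s'. Expected '%s'",
          function, ordinal(index), name, expectedType));
    }
    return args.get(index);
  }

  // Spells out the first few positions; falls back to "#n" after that.
  private static String ordinal(int index) {
    String[] names = {"first", "second", "third"};
    return (index >= 0 && index < names.length) ? names[index] : "#" + (index + 1);
  }

  // Hypothetical LENGTH built on the helper above. Treating a null
  // input as length 0 is an assumption made for this sketch.
  static int length(List<Object> args) {
    Object input = requireArg("LENGTH", args, 0, "input", "String");
    return input == null ? 0 : input.toString().length();
  }
}
```

Because the check lives in one helper, every function that reads positional arguments could report missing arguments the same way.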




[GitHub] incubator-metron pull request #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/293#discussion_r82215514
  
--- Diff: 
metron-platform/metron-common/src/test/java/org/apache/metron/common/stellar/StellarTest.java
 ---
@@ -285,6 +285,18 @@ public void testHappyPath() {
   }
 
   @Test
--- End diff --

I would add one more test that validates the behavior when you call LENGTH 
on an undefined variable; `LENGTH(foo)` when there is no variable `foo`.




[GitHub] incubator-metron pull request #293: METRON-473 Add LENGTH() To Stellar

2016-10-06 Thread nickwallen
Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/293#discussion_r82214084
  
--- Diff: 
metron-platform/metron-common/src/main/java/org/apache/metron/common/dsl/functions/StringFunctions.java
 ---
@@ -222,4 +222,16 @@ public Object apply(List args) {
   return null;
 }
   }
+
+  @Stellar( name="LENGTH"
+  , description = "Returns the length of a string"
+  , params = { "input - String" }
+  , returns = "String"
--- End diff --

This probably needs to change `returns = "String"` to indicate that it returns an 
Integer, which is the length of the String.  However you want to say that is 
fine, but it does not actually return a String.
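
For illustration only, the corrected metadata might read as follows. The `@Stellar` annotation declared at the top of this sketch is a local stand-in so the example compiles on its own; it is not Metron's actual annotation definition, and the `LengthSketch` class and its null handling are assumptions as well.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.List;
import java.util.function.Function;

// Local stand-in for the annotation shown in the diff; declared here only
// so the sketch compiles on its own. It is not Metron's actual definition.
@Retention(RetentionPolicy.RUNTIME)
@interface Stellar {
  String name();
  String description();
  String[] params();
  String returns();
}

@Stellar( name = "LENGTH"
        , description = "Returns the length of a string"
        , params = { "input - String" }
        , returns = "Integer"   // the function yields a number, not a String
        )
final class LengthSketch implements Function<List<Object>, Object> {
  @Override
  public Object apply(List<Object> args) {
    // Assumed behavior: a missing or null argument has length 0.
    if (args == null || args.isEmpty() || args.get(0) == null) {
      return 0;
    }
    return args.get(0).toString().length();
  }
}
```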




[GitHub] incubator-metron issue #294: METRON-487: Correct the license in the StixExtr...

2016-10-06 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/incubator-metron/pull/294
  
Note, this also adds a license to the unit test REST service




[GitHub] incubator-metron pull request #294: METRON-487: Correct the license in the S...

2016-10-06 Thread cestella
GitHub user cestella opened a pull request:

https://github.com/apache/incubator-metron/pull/294

METRON-487: Correct the license in the StixExtractorTest

The StixExtractorTest has an example taken from the Stix project which 
blanket licenses its examples as BSD. We need to pull it out of the test and 
note it in the LICENSE file

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cestella/incubator-metron METRON-487

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-metron/pull/294.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #294


commit e116bd21c07890496f791cf1ed92116caa5d2370
Author: cstella 
Date:   2016-10-06T15:35:01Z

Updating licensing.

commit 7e433872df9b45a5b6d8a7718994b6af53b3857b
Author: cstella 
Date:   2016-10-06T15:35:58Z

updating license






[GitHub] incubator-metron pull request #284: METRON-474 - fix vagrant ansible default...

2016-10-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-metron/pull/284




Re: Name conventions for parsers

2016-10-06 Thread Otto Fowler
Each of these split things would need to end up in their own topology,
since they would each have different STELLAR and Enrichment configurations.

It would be simpler, I think, to split them than to have a topology chain
that ‘switches’ over a type of field and muddies the Stellar configs etc.

If that is true, then the question is whether to split as part of the external
delivery (not Metron’s problem), for example in NiFi, or to have a
‘gateway-splitter’ topology with only split rules to feed the other typed topologies.

Or I’m totally wrong and you can forgive me ;)

O

On October 6, 2016 at 08:32:51, zeo...@gmail.com (zeo...@gmail.com) wrote:

If we don't do it by device I would be concerned that some more
appliance-based systems wouldn't allow the flexibility to split things up
to different destinations, nor would they allow external additions (NiFi,
etc.). This is where I am right now, where I can send from certain appliances
into my syslog infrastructure, then either force my syslog architecture to
selectively send onto Metron, or parse and then send into a generic JSON
parser (I will probably go the latter route). In order to standardize and
simplify, I would suggest continuing down the device-based route.

Generally, I expect the community to grow and for parsers to just exist,
and some users to only do minor updates to them or throw together grok
parsers using GROK_PREDICT() where necessary. In fact I would hope that is
the case, as it would indicate a broader user base.

Jon

On Thu, Oct 6, 2016 at 8:02 AM Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> > On 6 Oct 2016, at 12:22, Yohann Lepage  wrote:
> >
> > 2016-10-06 12:21 GMT+02:00 zeo...@gmail.com :
> >> I would think that instead we work to make each parser able to handle
> all
> >> the known outputs (and document explicitly what outputs per parser are
> >> supported) from a product and go back to vendor_product, with versions
> of
> >> the product supported/tested and version of the parser being stored in
> code
> >> and documentation only.
> > +1
> >
>
> +1 - this is similar to the evolving schema problem, and probably belongs
> in code.
>
> >> I'm currently working on mechanisms to get logs into Metron most
> >> efficiently because all of my syslog comes in one big pipe.
> > I have a similar use case. Most of the time, admins are ok to forward
> > logs from rsyslog/syslog-ng to the SIEM as they don't want to install
> > an agent ( *.* @@siem.intra:514;).
> >
> > The result is that you receive a mix of log
> > (sudo/apache/mysql/audit/etc) from the same device and the SIEM have
> > to deals with it.
> >
> > So, it would be really useful that Metron could handle a syslog flow
> > and automatically apply the right parser for each log. In order to
> > help Metron, a config could be provide by the "Security Platform
> > Engineer" to preselect a list of parser per device (as you know what
> > type of logs a device should send). This feature exists in
> > commercial SIEM.
> >
>
> +1 for this too. One question though, do you think it’s viable to do this
> by device. I would expect multiple types of syslog coming from the same
> physical device, especially when dealing with things like server logs.
>
> This could be handled with minimal parse and routing in NiFi potentially,
> but that may make setup more complex than the sort of mapping you’re
> talking about here. Thoughts?
>
> Simon

-- 

Jon


[GitHub] incubator-metron issue #286: METRON-326 Error Handling in ElasticsearchWrite...

2016-10-06 Thread justinleet
Github user justinleet commented on the issue:

https://github.com/apache/incubator-metron/pull/286
  
Moving things using option 2 (writer package in common) seems stable after 
running locally a few times.  Travis also passed.  Not sure why moving them to 
a different module causes such issues.




Re: Metron correlation capabilities

2016-10-06 Thread Carolyn Duby
Thanks James.




On 10/5/16, 6:13 PM, "James Sirota"  wrote:

>Hi Carolyn,
>
>The correlation capabilities are done via ES queries and are visualized in 
>Kibana.  Metron's Stellar transformation, enrichment, and threat intel 
>correlation capabilities allow you to pull up all relevant data and context 
>for all telemetries ingested with a single query.  Metron's PCAP services then 
>allow you to tie it in with the underlying packet capture.  
>
>With respect to ML analytics, Metron has Model as a Service that allows the 
>creation of stand alone models, ensembles of models, or chaining of multiple 
>models and provides model provisioning, discovery, and scoring.  If your 
>customer has pre-existing analytics packs they wish to run on top of Metron 
>please refer them to the boards and we will help them get the models to run on 
>MaaS.  
>
>Thanks,
>James
>
>05.10.2016, 14:41, "Carolyn Duby" :
>> Does Metron have any correlation capabilities that we can demonstrate now?
>>
>> Are any analytics packs ready to show?
>>
>> We have a customer asking about these capabilities.
>>
>> Thanks
>> Carolyn
>
>--- 
>Thank you,
>
>James Sirota
>PPMC- Apache Metron (Incubating)
>jsirota AT apache DOT org
>


Re: Name conventions for parsers

2016-10-06 Thread zeo...@gmail.com
If we don't do it by device I would be concerned that some more
appliance-based systems wouldn't allow the flexibility to split things up
to different destinations, nor would they allow external additions (NiFi,
etc.).  This is where I am right now, where I can send from certain appliances
into my syslog infrastructure, then either force my syslog architecture to
selectively send onto Metron, or parse and then send into a generic JSON
parser (I will probably go the latter route).  In order to standardize and
simplify, I would suggest continuing down the device-based route.

Generally, I expect the community to grow and for parsers to just exist,
and some users to only do minor updates to them or throw together grok
parsers using GROK_PREDICT() where necessary.  In fact I would hope that is
the case, as it would indicate a broader user base.

Jon

On Thu, Oct 6, 2016 at 8:02 AM Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> > On 6 Oct 2016, at 12:22, Yohann Lepage  wrote:
> >
> > 2016-10-06 12:21 GMT+02:00 zeo...@gmail.com :
> >> I would think that instead we work to make each parser able to handle
> all
> >> the known outputs (and document explicitly what outputs per parser are
> >> supported) from a product and go back to vendor_product, with versions
> of
> >> the product supported/tested and version of the parser being stored in
> code
> >> and documentation only.
> > +1
> >
>
> +1 - this is similar to the evolving schema problem, and probably belongs
> in code.
>
> >> I'm currently working on mechanisms to get logs into Metron most
> >> efficiently because all of my syslog comes in one big pipe.
> > I have a similar use case. Most of the time, admins are ok to forward
> > logs from rsyslog/syslog-ng to the SIEM as they don't want to install
> > an agent  ( *.* @@siem.intra:514;).
> >
> > The result is that you receive a mix of log
> > (sudo/apache/mysql/audit/etc) from the same device and the SIEM have
> > to deals with it.
> >
> > So, it would be really useful that Metron could handle a syslog flow
> > and automatically apply the right parser for each log. In order to
> > help Metron, a config could be provide by the "Security Platform
> > Engineer" to preselect a list of parser per device (as you know what
> > type of logs a device  should send).  This feature exists in
> > commercial SIEM.
> >
>
> +1 for this too. One question though, do you think it’s viable to do this
> by device. I would expect multiple types of syslog coming from the same
> physical device, especially when dealing with things like server logs.
>
> This could be handled with minimal parse and routing in NiFi potentially,
> but that may make setup more complex than the sort of mapping you’re
> talking about here. Thoughts?
>
> Simon

-- 

Jon


Re: Name conventions for parsers

2016-10-06 Thread Simon Elliston Ball
> On 6 Oct 2016, at 12:22, Yohann Lepage  wrote:
> 
> 2016-10-06 12:21 GMT+02:00 zeo...@gmail.com :
>> I would think that instead we work to make each parser able to handle all
>> the known outputs (and document explicitly what outputs per parser are
>> supported) from a product and go back to vendor_product, with versions of
>> the product supported/tested and version of the parser being stored in code
>> and documentation only.
> +1
> 

+1 - this is similar to the evolving schema problem, and probably belongs in 
code.

>> I'm currently working on mechanisms to get logs into Metron most
>> efficiently because all of my syslog comes in one big pipe.
> I have a similar use case. Most of the time, admins are ok to forward
> logs from rsyslog/syslog-ng to the SIEM as they don't want to install
> an agent  ( *.* @@siem.intra:514;).
> 
> The result is that you receive a mix of log
> (sudo/apache/mysql/audit/etc) from the same device and the SIEM have
> to deals with it.
> 
> So, it would be really useful that Metron could handle a syslog flow
> and automatically apply the right parser for each log. In order to
> help Metron, a config could be provide by the "Security Platform
> Engineer" to preselect a list of parser per device (as you know what
> type of logs a device  should send).  This feature exists in
> commercial SIEM.
> 

+1 for this too. One question though, do you think it’s viable to do this by 
device. I would expect multiple types of syslog coming from the same physical 
device, especially when dealing with things like server logs. 

This could be handled with minimal parse and routing in NiFi potentially, but 
that may make setup more complex than the sort of mapping you’re talking about 
here. Thoughts? 

Simon

Re: Name conventions for parsers

2016-10-06 Thread Yohann Lepage
2016-10-06 12:21 GMT+02:00 zeo...@gmail.com :
> I would think that instead we work to make each parser able to handle all
> the known outputs (and document explicitly what outputs per parser are
> supported) from a product and go back to vendor_product, with versions of
> the product supported/tested and version of the parser being stored in code
> and documentation only.
+1

> I'm currently working on mechanisms to get logs into Metron most
> efficiently because all of my syslog comes in one big pipe.
I have a similar use case. Most of the time, admins are ok to forward
logs from rsyslog/syslog-ng to the SIEM as they don't want to install
an agent  ( *.* @@siem.intra:514;).

The result is that you receive a mix of logs
(sudo/apache/mysql/audit/etc) from the same device and the SIEM has
to deal with it.

So, it would be really useful if Metron could handle a syslog flow
and automatically apply the right parser for each log. In order to
help Metron, a config could be provided by the "Security Platform
Engineer" to preselect a list of parsers per device (as you know what
type of logs a device  should send).  This feature exists in
commercial SIEM.
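
To make the idea concrete, here is a minimal sketch of per-device parser preselection. Everything in it is hypothetical: the class and method names, the device hostname, and the use of a regular expression as the "does this parser apply" test are assumptions for illustration, not an existing Metron feature.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

final class ParserRoutingSketch {

  // A candidate parser: a name plus a pattern that recognizes its log lines.
  static final class Candidate {
    final String parserName;
    final Pattern matcher;

    Candidate(String parserName, String regex) {
      this.parserName = parserName;
      this.matcher = Pattern.compile(regex);
    }
  }

  // Device hostname -> parsers the engineer expects that device to need.
  private final Map<String, List<Candidate>> byDevice = new HashMap<>();

  // Registers the preselected parsers for one device, in priority order.
  void preselect(String device, Candidate... candidates) {
    byDevice.put(device, Arrays.asList(candidates));
  }

  // Returns the first preselected parser whose pattern matches the message,
  // or empty when the device is unknown or nothing matches.
  Optional<String> choose(String device, String message) {
    List<Candidate> candidates = byDevice.get(device);
    if (candidates == null) {
      return Optional.empty();
    }
    for (Candidate candidate : candidates) {
      if (candidate.matcher.matcher(message).find()) {
        return Optional.of(candidate.parserName);
      }
    }
    return Optional.empty();
  }
}
```

Restricting the candidates per device keeps the matching cheap and avoids trying every known parser against every line; messages that match nothing would still need a fallback, which this sketch leaves to the caller.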

My 2 cents
-- 
Yohann L.


Re: Name conventions for parsers

2016-10-06 Thread zeo...@gmail.com
I would think that instead we work to make each parser able to handle all
the known outputs (and document explicitly what outputs per parser are
supported) from a product and go back to vendor_product, with versions of
the product supported/tested and version of the parser being stored in code
and documentation only.  I know this adds a little bit of overhead but from
the usability perspective, it is often hard enough to get the logs split to
a product-specific Kafka topic in the first place, let alone one per log
type and needing to repoint for each version.

There is also the question about what we do for open source logs - do we
substitute vendor for license?  I don't see much benefit to bro_bro and
storm_storm but bsd_bro and apache_storm make much more sense to me.
Obviously this doesn't provide the version of the license, and there is still
potential for overlap, but I didn't have any better thoughts offhand.

I'm currently working on mechanisms to get logs into Metron most
efficiently because all of my syslog comes in one big pipe.  I need to
apply upstream filters that are somewhat fragile in order to forward it
properly.  I may even still need to parse the logs outside of Metron and
then send them in using the generic JSON parser for sanity, depending on how
things go in application.  I really don't want to make ingest more
complicated than needed and I have the feeling there will be many others in
the same situation as I am - all of my prior companies would be.

Jon

On Wed, Oct 5, 2016, 14:33 James Sirota  wrote:

> Well...a product can produce multiple types of telemetry.  There are also
> multiple versions of products that produce the same telemetry, but in
> different formats (string, json, xml, etc).  so maybe
> vendor_telemetry_version_format?
>
> 05.10.2016, 11:14, "Carolyn Duby" :
> > It would be helpful to have the function as well as we get more storm
> components. For example vendor_product_parse
> >
> > On 10/5/16, 1:43 PM, "James Sirota"  wrote:
> >
> >> I agree. Would you like to put forth a recommendation for a naming
> convention?
> >>
> >> 05.10.2016, 10:33, "Simon Elliston Ball" :
> >>>  At present we do not have a formal convention. Many organizations
> will no doubt want to create their own conventions to match existing naming
> methodologies.
> >>>
> >>>  However, it seems like an excellent idea to at least produce some
> community driven recommendations for a standard baseline those without
> strong existing methods could adopt.
> >>>
> >>>  I like your vendor-product approach, but would consider adding
> something around model / series / version to that. Does anyone have any
> thoughts on how such a taxonomy would work best?
> >>>
> >>>  Simon
> >>>
>    On 5 Oct 2016, at 18:22, Vladimir Shlyakhtin <
> vladimir.shlyakh...@sstech.us> wrote:
> 
>    Hi
> 
>    Does Metron have any recommendation for name convention for
> parsers? Like vendor-product.
> 
>    Thanks
> 
>    - Vladimir
> >>
> >> ---
> >> Thank you,
> >>
> >> James Sirota
> >> PPMC- Apache Metron (Incubating)
> >> jsirota AT apache DOT org
>
> ---
> Thank you,
>
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org
>
-- 

Jon