Re: [CALL FOR TEST DATA] Request help identifying public domain or opensource test data sets for Metron testing

Dima Kovalyov Thu, 04 May 2017 02:44:24 -0700

Hello Matt,

It has taken us a long time to get back to working on this. Thank you for the 
response.


I wanted to ask if anything has changed since our last discussion about 
contributing parsers, enrichments and generators. Is there anything else we 
should be doing other than:
1. Sign a Corporate CLA with Apache (link<https://www.apache.org/licenses/#clas>).
2. Sign an Individual CLA for the submitter 
(instructions<https://www.apache.org/licenses/#clas>). Do I need to do that 
in addition to #1?
3. Register on Apache GitHub and JIRA.
4. Open a master JIRA ticket for submissions from SSTECH.
5. Create a sub-task for each piece of code we are going to submit.
6. Send an email to dev@metron.apache.org<mailto:dev@metron.apache.org> describing 
the proposed changes and referencing the JIRA ticket. What should we expect in 
reply: approval or suggestions?
7. Fork the Apache Metron master branch internally, merge our changes and test 
them using the single-node Vagrant environment.
8. Create a Pull Request (PR). How do we do that?
9. Wait for the dev team to review, accept changes and answer any questions or 
suggestions.

The above applies to code that:
1. Has been written and tested.
2. Is covered by unit tests.
3. Can be built using Maven.
4. Has a place in the Apache Metron folder tree.

- Dima


On 10/08/2016 06:43 AM, Matt Foley wrote:
Hi Dima,
Sorry this is getting a little long, but the TL;DR of the Metron Development 
Environment Setup 
Instructions<https://cwiki.apache.org/confluence/display/METRON/Metron+Development+Environment+Setup+Instructions>
is:

A. Open a Jira for the work you want to do, or the contribution you want to 
make.  Since you have several parsers, you might open an umbrella Jira, with 
four subtask jiras, each of which includes the parser and test data generator 
for one of the four technologies you mentioned.
B. Send an email to the dev list proposing what you want to submit, and 
referencing the Jira.
C. Fork the Apache Metron code base in your personal github area.
D. Make sure your contribution works correctly with the latest master branch 
code.
E. Decide where in the code tree your contribution would fit best.  The parsers 
themselves would of course go under metron-platform/metron-parsers/.  The data 
generators could reasonably be put in the test/ subdirectory, perhaps under 
metron-platform/metron-parsers/src/test/java/org/apache/metron/writers 
(although we would defer to the reviewers).
F. Add the necessary maven glue so the new pieces build along with the core.
G. Metron requires all submissions to have unit tests with thorough coverage, 
so add those if they aren’t there yet.
H. When things are ready to submit, commit everything to your GitHub fork, and 
create a Pull Request (PR).
I. Watch the PR and Jira for responses.  Respond to questions, accept feedback 
or suggest alternative solutions, and work through the process with the 
community.  If things need lengthy discussion, you may be asked to continue it 
on the dev list.
J. With patience, all issues will be agreed on, and the contribution will be 
accepted into Metron, for the benefit of the whole community.
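
For readers following along, steps C, D, F and H might look roughly like the 
sketch below. It is only an illustration under stated assumptions: the GitHub 
user name, the METRON-XXXX key, the repository URLs and the helper name 
contribution_commands are placeholders, not values from this thread. The 
script prints the commands rather than running them.

```python
# Hypothetical helper: lists the git/maven commands for steps C, D, F and H.
# All names and URLs below are placeholders assumed for illustration.
import shlex


def contribution_commands(github_user, jira_key,
                          upstream="https://github.com/apache/metron.git"):
    """Return the commands as argument lists; nothing is executed."""
    fork = f"https://github.com/{github_user}/metron.git"
    return [
        # Step C: clone your personal fork (run the rest inside the clone).
        ["git", "clone", fork],
        # Track the Apache repository so you can follow the latest master.
        ["git", "remote", "add", "upstream", upstream],
        ["git", "fetch", "upstream"],
        # Step D: start a branch, named after the Jira, from upstream master.
        ["git", "checkout", "-b", jira_key, "upstream/master"],
        # Steps F and G: build everything and run the unit tests.
        ["mvn", "clean", "install"],
        # Step H: publish the branch to your fork before opening the PR.
        ["git", "push", "origin", jira_key],
    ]


if __name__ == "__main__":
    for cmd in contribution_commands("your-github-user", "METRON-XXXX"):
        print(shlex.join(cmd))
```

Once the branch is pushed, the Pull Request itself is opened from the GitHub 
web page of your fork.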

Hope this helps.  Feel free to contact me directly, or just ask questions on 
the dev list.
Best regards,
—Matt


On Oct 7, 2016, at 6:05 PM, Matt Foley 
<ma...@apache.org<mailto:ma...@apache.org>> wrote:

Dima, that’s great!

Since you’re talking about a code contribution (or several :-), let’s move the 
discussion over to the 
d...@metron.incubator.apache.org<mailto:d...@metron.incubator.apache.org> list, 
after this response.  Briefly, here’s how you submit a contribution.

First the housekeeping:
1. If Sstech has not yet signed a Corporate CLA with Apache, please ask them to 
do so (instructions<https://www.apache.org/licenses/#clas>)
2. If you, or a colleague who will submit the contributions, has not yet signed 
an Individual CLA, please do so 
(instructions<https://www.apache.org/licenses/#clas>)

Since you’ve been successfully writing Metron parsers, you almost certainly 
have already done the following, but I’ll mention them here for the sake of 
other readers:
3. If you’re not on the dev mailing list, please join it 
(instructions<https://cwiki.apache.org/confluence/display/METRON/Community+Resources>)
4. If you weren’t a registered user of Apache’s Jira, you would request to be 
added, but I see you already are, so that’s good.
5. If you don’t yet have an account on Github.com<http://github.com/>, sign up 
for one (the free level is fine).
6. Set up a Metron Development Environment, and establish the ability to spin 
up a single-node test environment 
(instructions<https://cwiki.apache.org/confluence/display/METRON/Metron+Development+Environment+Setup+Instructions>)

To actually make the contribution, you follow the process shown in:
https://cwiki.apache.org/confluence/display/METRON/Metron+Development+Environment+Setup+Instructions

I’ll go into more detail in a direct email.
Thanks a lot for being interested in submitting these!

Cheers,
—Matt

________________________________
From: Dima Kovalyov <dima.koval...@sstech.us<mailto:dima.koval...@sstech.us>>
Sent: Friday, October 07, 2016 4:44 PM
To: u...@metron.incubator.apache.org<mailto:u...@metron.incubator.apache.org>; 
Satish Abburi
Subject: Re: [CALL FOR TEST DATA] Request help identifying public domain or 
opensource test data sets for Metron testing

Hello Matt,

We (Sstech team) currently have parsers and data generators for BlueCoat, Unix, 
MS Exchange, MS Windows and we would gladly contribute them.

Can you please share the procedure for submitting these pieces?
Thank you.

- Dima

On 10/08/2016 01:49 AM, Matt Foley wrote:
Hi all,
Enhanced testing of Metron, especially performance testing, would be aided by 
having data sets of realistic size, that exercise one or more of the various 
parts of Metron:

  *   each Parser (bro, yaf, snort, squid, ...)
  *   each Enhancer (geo, user, assets, ...)
  *   each Threat Intel module (Soltra, Hail a TAXII, ...)

Data sets must meet the following criteria:

  *   opensource or public domain
  *   suitably scrubbed, containing no Personally Identifiable Information
  *   unencumbered by company sensitivity, security, or IP concerns.

They may take the form of raw PCAP streams, or they may be already parsed or 
otherwise pre-processed.

If you know of opensource or public domain data sets of this kind, please 
respond with the URL, in this email thread or to the Jira ticket 
METRON-491<https://issues.apache.org/jira/browse/METRON-491>.

If you have an appropriate data set that your company would be willing to 
contribute, please also respond and we will help in any way we can.


Thanks,
--Matt
