Thank you for all the feedback, everyone.  I will attempt to summarize all
the input we've received and update my initial proposal.  We can discuss
further if anyone is still unclear, and I will volunteer to capture all the
details in a document of some kind once we all come to a consensus.

Looks like everyone is in agreement on the top-level projects.  Nick is
working on a task that will require an additional top-level project, so I
am going to add that in as well:

metron-deployment
metron-platform
metron-ui
metron-sensors

All of these except metron-platform are well understood and don't warrant
any more discussion.  For metron-platform there seem to be 2 areas that
are not as clear:

- whether we need a common project
- how do we organize test related code

I agree with David and others that a common project will likely get
misused and could become unnecessarily bloated.  But I suspect there will
be cases where we have common code being used across multiple projects
(this is already happening).  In this case we will either need this common
project or we will have to keep common code in one of the other projects
and have all other projects depend on it.  For the latter, an example would
be keeping common code in enrichment and having parsers declare enrichment
as a dependency.  There are a couple of downsides I see with this approach:

- parser topology jars now bring along all the enrichment dependencies
- since more code from various projects is being packaged together,
version conflicts are more likely and poms become more complicated due to
all the necessary exclusions
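
To make the first downside concrete, here is a rough Python sketch that
compares what a parsers jar would pull in under each option.  The module
names come from this thread, but the dependency lists and the
enrichment-only-lib-* names are made up for illustration; they are not
read from our actual poms.

```python
def transitive_deps(module, graph):
    """Return every dependency reachable from `module` in `graph`."""
    seen = set()
    stack = list(graph.get(module, ()))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph.get(dep, ()))
    return seen


# Hypothetical third-party libraries that only enrichment needs.
ENRICHMENT_LIBS = ["enrichment-only-lib-1", "enrichment-only-lib-2"]

# Option A (assumed layout): common code lives in metron-enrichment.
option_a = {
    "metron-parsers": ["metron-enrichment"],
    "metron-enrichment": ENRICHMENT_LIBS,
}

# Option B (assumed layout): common code lives in metron-common.
option_b = {
    "metron-parsers": ["metron-common"],
    "metron-enrichment": ["metron-common"] + ENRICHMENT_LIBS,
    "metron-common": [],
}

parsers_a = transitive_deps("metron-parsers", option_a)
parsers_b = transitive_deps("metron-parsers", option_b)
print(sorted(parsers_a))
print(sorted(parsers_b))
```

Under these assumptions the parsers jar in option A carries every
enrichment-only library, while in option B it carries only metron-common.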

My thinking is that any jar file being deployed should only contain what
it needs.  Curious what others think here.  My vote would be to maintain a
common project (or whatever we want to call it) and be diligent about not
letting project-specific code slip in there.

I believe Nick was the first person to ask about the test-related projects
and why we would need separate test and integration-test projects.  The
reason is that our integration-test classes currently depend on other
projects (not surprising since they are integration tests).  If there are
utilities we want to make available to all projects (mock classes,
utilities for reading sample data, etc.) then they can't live in
integration-test because that would introduce circular dependencies.  If
it is possible to refactor our current Metron-Testing project so that it
doesn't depend on any other projects, then we can keep the utilities
there.  Otherwise we need a separate project for testing utilities.  I
suspect removing other project dependencies from Metron-Testing will prove
more difficult than it's worth, so my vote would be to have 2 test-related
projects.
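
The circular dependency can be shown with a small Python sketch.  The
graphs below are simplified assumptions about who depends on whom (only
parsers is shown), not a dump of our real poms, and find_cycle is just an
illustrative helper.

```python
def find_cycle(graph):
    """Return one dependency cycle as a list of modules, or None."""
    in_progress, done = set(), set()
    path = []

    def visit(node):
        in_progress.add(node)
        path.append(node)
        for dep in graph.get(node, ()):
            if dep in in_progress:
                # Found a back edge: slice the current path into a cycle.
                return path[path.index(dep):] + [dep]
            if dep not in done:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        in_progress.discard(node)
        done.add(node)
        return None

    for node in graph:
        if node not in done:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Assumed: shared utilities live in integration-test, which itself
# depends on parsers in order to run the parser topology tests.
one_test_project = {
    "metron-integration-test": ["metron-parsers"],
    "metron-parsers": ["metron-integration-test"],
}

# Assumed: shared utilities move to a project with no Metron dependencies.
two_test_projects = {
    "metron-test-utilities": [],
    "metron-parsers": ["metron-test-utilities"],
    "metron-integration-test": ["metron-parsers", "metron-test-utilities"],
}

print(find_cycle(one_test_project))
print(find_cycle(two_test_projects))
```

With a single test project the sketch reports a cycle between
metron-integration-test and metron-parsers.  With the utilities split
out, the graph is acyclic.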

So here is where our metron-platform organization stands:

metron-common *
metron-integration-test *
metron-test-utilities *
metron-data-management
metron-pcap
metron-parsers
metron-enrichment
        metron-solr
        metron-elasticsearch
metron-api

* may or may not change depending on the outcome of this discussion

Thoughts?

Ryan Merriman


On 4/11/16, 4:15 PM, "Debojyoti Dutta" <ddu...@gmail.com> wrote:

>If you load up your Irc client just type
>/join #apache-metron-dev
>
>Sent from my iPhone
>
>> On Apr 11, 2016, at 12:06 PM, James Sirota <jsir...@hortonworks.com>
>>wrote:
>> 
>> Great, thanks, Debo.  Where can I find instructions on how to get to it?
>> 
>> Thanks,
>> James 
>> 
>> 
>> 
>> 
>>> On 4/11/16, 9:41 AM, "Debo Dutta (dedutta)" <dedu...@cisco.com> wrote:
>>> 
>>> Hi James 
>>> 
>>> Ok set it up and ack ...
>>> 
>>> Thx
>>> 
>>> 
>>> 
>>> 
>>> 
>>>> On 4/10/16, 6:31 PM, "James Sirota" <jsir...@hortonworks.com> wrote:
>>>> 
>>>> Hi Debo,
>>>> 
>>>> I think it would be great if you set it up
>>>> 
>>>> Thanks,
>>>> James 
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On 4/10/16, 6:25 PM, "Debojyoti Dutta" <ddu...@gmail.com> wrote:
>>>>> 
>>>>> I have set it up for another open source effort in the past and it
>>>>> was not very hard. Am happy to volunteer if needed.
>>>>> 
>>>>> Thx 
>>>>> Debo
>>>>> 
>>>>> Sent from my iPhone
>>>>> 
>>>>>> On Apr 10, 2016, at 5:53 PM, James Sirota <jsir...@hortonworks.com>
>>>>>>wrote:
>>>>>> 
>>>>>> I'd be open to an IRC channel.  Does anyone know if Apache allows
>>>>>> this?  If yes, does anyone know how to set one up?
>>>>>> 
>>>>>> Thanks,
>>>>>> James 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On 4/10/16, 4:52 PM, "Debojyoti Dutta" <ddu...@gmail.com> wrote:
>>>>>>> 
>>>>>>> Hi Nick 
>>>>>>> 
>>>>>>> I like your suggestions. For the enrichment layer do you think it
>>>>>>> would also include any advanced analytics? Else we might want to
>>>>>>> have an analytics layer.
>>>>>>> 
>>>>>>> It would be good to have an arch which could be extended for new
>>>>>>> functionality.
>>>>>>> 
>>>>>>> However Ryan's suggestion of the ui API and deployer also makes
>>>>>>> sense.
>>>>>>> 
>>>>>>> Should we have an IRC channel to discuss this or maybe etherpad?
>>>>>>> 
>>>>>>> Debo
>>>>>>> 
>>>>>>> Sent from my iPhone
>>>>>>> 
>>>>>>>> On Apr 10, 2016, at 4:36 PM, Nick Allen <n...@nickallen.org>
>>>>>>>>wrote:
>>>>>>>> 
>>>>>>>> It might help to think of our code base as four separate types of
>>>>>>>> functionality.  This is primarily meant to give us a framework to
>>>>>>>> think about the organization of Metron (and drive more discussion),
>>>>>>>> rather than my proposal for a specific structure.
>>>>>>>> 
>>>>>>>> - Sensor - Anything that captures external, non-streaming data and
>>>>>>>> presents it in a form ready for stream processing.
>>>>>>>> - Input - Responsible for preparing streaming data for enrichment.
>>>>>>>> The existing "parsers" fit neatly into this space.
>>>>>>>> - Enrichment - Responsible for enriching an incoming data feed like
>>>>>>>> geoip, asset enrichment, threat intel lookups, etc.
>>>>>>>> - Output - Responsible for persisting data that has been processed
>>>>>>>> by Metron which obviously means search indexers or data stores.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Fri, Apr 8, 2016 at 4:46 PM, Ryan Merriman
>>>>>>>><rmerri...@hortonworks.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> All,
>>>>>>>>> 
>>>>>>>>> I would like to propose a review and refactor of the current
>>>>>>>>> project organization within Metron.  Much of the way the legacy
>>>>>>>>> code was organized does not make sense anymore and could be
>>>>>>>>> designed so that it is easier to navigate and understand.  Our
>>>>>>>>> test coverage has increased substantially so I believe we can do
>>>>>>>>> this with confidence.
>>>>>>>>> 
>>>>>>>>> First off, I think we should agree on a naming convention.  I see
>>>>>>>>> some projects (YARN and Storm for example) that prepend the
>>>>>>>>> sub-project with the name of the top-level project (storm-core
>>>>>>>>> for example).  Metron also currently does this (Metron-Common).  I
>>>>>>>>> think that's fine, although in the case of Metron, I feel like
>>>>>>>>> having "Metron" prepended is redundant.  Regardless of whether we
>>>>>>>>> decide to stick with that approach, I propose that project names
>>>>>>>>> be uniform and lowercase.  For example, under these assumptions
>>>>>>>>> "Metron-Common" would change to "common".
>>>>>>>>> 
>>>>>>>>> The first level of organization makes sense to me.  Only change I
>>>>>>>>> would make would be to project names:
>>>>>>>>> 
>>>>>>>>> *   deployment
>>>>>>>>> *   streaming
>>>>>>>>> *   ui
>>>>>>>>> 
>>>>>>>>> Or if we want to keep metron in project names:
>>>>>>>>> 
>>>>>>>>> *   metron-deployment
>>>>>>>>> *   metron-streaming
>>>>>>>>> *   metron-ui
>>>>>>>>> 
>>>>>>>>> For now I don't see any changes necessary in deployment or ui
>>>>>>>>> organization.  I see the streaming project structure primarily
>>>>>>>>> driven by 2 things:  the Maven dependency tree and deployment
>>>>>>>>> targets.  For example, solr and elasticsearch code should be
>>>>>>>>> separated (because their dependency on lucene conflicts) but both
>>>>>>>>> will depend on common enrichment code.  Also, now that parser,
>>>>>>>>> enrichment and pcap topologies are separate, code for those
>>>>>>>>> topologies will be deployed as separate jars.  No reason to
>>>>>>>>> include parser code in enrichment topologies and vice-versa.  Any
>>>>>>>>> other considerations I'm missing?
>>>>>>>>> 
>>>>>>>>> With that being said, here is my initial proposal:
>>>>>>>>> 
>>>>>>>>> *   common -  Any common code that all topologies depend on
>>>>>>>>> (configuration classes, generic writers for example).  No
>>>>>>>>> dependencies on other Metron projects.
>>>>>>>>> *   test - Contains utilities for writing unit tests, sample
>>>>>>>>> configs and sample data.  Will depend on common.
>>>>>>>>> *   integration-test - Contains utilities and classes needed to
>>>>>>>>> run our integration tests (in memory components for example).
>>>>>>>>> Will depend on common and test.
>>>>>>>>> *   dataload - Contains all code related to data loading.  Will
>>>>>>>>> also include any property files needed and integration tests.
>>>>>>>>> Will depend on common, test (test scope), and integration-test
>>>>>>>>> (test scope).
>>>>>>>>> *   parser - All code specific to the parser topologies.  Would
>>>>>>>>> also include scripts, property files, flux files and parser
>>>>>>>>> topology integration tests.  This project will depend on common,
>>>>>>>>> test (test scope), and integration-test (test scope).
>>>>>>>>> *   enrichment - All code specific to the enrichment topologies
>>>>>>>>> (except solr and elasticsearch).  Would also include scripts,
>>>>>>>>> property files, flux files and enrichment topology integration
>>>>>>>>> tests.  This project will depend on common, test (test scope),
>>>>>>>>> and integration-test (test scope).
>>>>>>>>> *   elasticsearch - All Elasticsearch related code.  Will depend
>>>>>>>>> on enrichment.
>>>>>>>>> *   solr - All Solr related code.  Will depend on enrichment.
>>>>>>>>> *   pcap - All code specific to the topology dedicated to pcap.
>>>>>>>>> Would also include scripts, property files, flux files and pcap
>>>>>>>>> integration tests.  This project will depend on common, test
>>>>>>>>> (test scope) and integration-test (test scope).
>>>>>>>>> *   api - This will serve as a generic replacement for
>>>>>>>>> Metron-Pcap_Service.  Will contain all code to build a Metron web
>>>>>>>>> service middle layer that can expose APIs through REST or other
>>>>>>>>> client protocols.  Could possibly depend on all other projects or
>>>>>>>>> separated further if version conflicts arise (separate api
>>>>>>>>> projects for solr and elasticsearch for example).
>>>>>>>>> 
>>>>>>>>> Looking forward to hearing everyone's feedback and great ideas.
>>>>>>>>> 
>>>>>>>>> Ryan Merriman
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> -- 
>>>>>>>> Nick Allen <n...@nickallen.org>
>>>>> 
>
