+1 from me.

I would also like to address the configs and make sure the configs are in the 
same place.  Do you have ideas on where we would put those?

Thanks,
James 



On 4/13/16, 6:50 AM, "Ryan Merriman" <rmerri...@hortonworks.com> wrote:

>Thank you for all the feedback everyone.  I will attempt to summarize all
>the input we've received and update my initial proposal.  We can discuss
>further if anyone is still unclear and I will volunteer to capture all the
>details in a document of some kind once we all come to a consensus.
>
>Looks like everyone is in agreement for the top level projects.  Nick is
>working on a task that will require an additional top level project so I am
>going to add that in as well:
>
>metron-deployment
>metron-platform
>metron-ui
>metron-sensors
>
>All of these except metron-platform are well understood and don't warrant
>any more discussion.  For metron-platform there seem to be 2 areas that
>are not as clear: 
>
>- whether we need a common project
>- how do we organize test related code
>
>I agree with David and others that a common project will likely get
>misused and could become unnecessarily bloated.  But I suspect there will be
>cases where we have common code being used across multiple projects (this is
>already happening).  In this case we will either need this common project
>or we will have to keep common code in one of the other projects and have
>all other projects extend that. For the latter, an example would be
>keeping common code in enrichment and having parsers declare enrichment as
>a dependency.  There are a couple downsides I see with this approach:
>
>- parser topology jars now bring along all the enrichment dependencies
>- since more code from various projects is being packaged together,
>version conflicts are more likely and poms become more complicated due to
>all the necessary exclusions
>
>My thinking is that any jar file being deployed should only contain what
>it needs.  Curious what others think here.  My vote would be to maintain a
>common project (or whatever we want to call it) and be diligent about not
>letting project-specific code slip in there.
>
>I believe Nick was the first person to ask the question about projects
>related to test code and why we would need separate test and integration
>test.  The reason for this is that our integration-test classes currently
>depend on other projects (not surprising since they are integration
>tests).  If there are utilities we want to make available to all projects
>(mock classes, utilities for reading sample data, etc.) then they can't live
>in integration-test because that will introduce circular dependencies.  If
>it is possible to refactor our current Metron-Testing project so that it
>doesn't depend on any other projects, then we can keep utilities here.
>Otherwise we need a separate project for testing utilities.  I suspect
>removing other project dependencies from Metron-Testing will prove more
>difficult than it's worth so my vote would be to have 2 test related
>projects.
>
>So here is where our metron-platform organization stands:
>
>metron-common *
>metron-integration-test *
>metron-test-utilities *
>metron-data-management
>metron-pcap
>metron-parsers
>metron-enrichment
>       metron-solr
>       metron-elasticsearch
>metron-api
>
>* may or may not change depending on the outcome of this discussion
>
>Thoughts?
>
>Ryan Merriman
>
>
>On 4/11/16, 4:15 PM, "Debojyoti Dutta" <ddu...@gmail.com> wrote:
>
>>If you load up your Irc client just type
>>/join #apache-metron-dev
>>
>>Sent from my iPhone
>>
>>> On Apr 11, 2016, at 12:06 PM, James Sirota <jsir...@hortonworks.com>
>>>wrote:
>>> 
>>> Great, thanks, Debo.  Where can I find instructions on how to get to it?
>>> 
>>> Thanks,
>>> James 
>>> 
>>> 
>>> 
>>> 
>>>> On 4/11/16, 9:41 AM, "Debo Dutta (dedutta)" <dedu...@cisco.com> wrote:
>>>> 
>>>> Hi James 
>>>> 
>>>> Ok set it up and ack ...
>>>> 
>>>> Thx
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On 4/10/16, 6:31 PM, "James Sirota" <jsir...@hortonworks.com> wrote:
>>>>> 
>>>>> Hi Debo,
>>>>> 
>>>>> I think it would be great if you set it up
>>>>> 
>>>>> Thanks,
>>>>> James 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 4/10/16, 6:25 PM, "Debojyoti Dutta" <ddu...@gmail.com> wrote:
>>>>>> 
>>>>>> I have set it up for another open source effort in the past and it
>>>>>>was not very hard. Am happy to volunteer if needed.
>>>>>> 
>>>>>> Thx 
>>>>>> Debo
>>>>>> 
>>>>>> Sent from my iPhone
>>>>>> 
>>>>>>> On Apr 10, 2016, at 5:53 PM, James Sirota <jsir...@hortonworks.com>
>>>>>>>wrote:
>>>>>>> 
>>>>>>> I'd be open to an IRC channel.  Does anyone know if Apache allows
>>>>>>>this?  If yes, does anyone know how to set one up?
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> James 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> On 4/10/16, 4:52 PM, "Debojyoti Dutta" <ddu...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> Hi Nick 
>>>>>>>> 
>>>>>>>> I like your suggestions. For the enrichment layer do you think it
>>>>>>>>would also include any advanced analytics? Otherwise we might want to
>>>>>>>>have an analytics layer.
>>>>>>>> 
>>>>>>>> It would be good to have an arch which could be extended for new
>>>>>>>>functionality.
>>>>>>>> 
>>>>>>>> However Ryan's suggestion of the ui API and deployer also makes
>>>>>>>>sense. 
>>>>>>>> 
>>>>>>>> Should we have an IRC channel to discuss this or maybe etherpad?
>>>>>>>> 
>>>>>>>> Debo
>>>>>>>> 
>>>>>>>> Sent from my iPhone
>>>>>>>> 
>>>>>>>>> On Apr 10, 2016, at 4:36 PM, Nick Allen <n...@nickallen.org>
>>>>>>>>>wrote:
>>>>>>>>> 
>>>>>>>>> It might help to think of our code base as four separate types of
>>>>>>>>> functionality.  This is primarily meant to give us a framework to
>>>>>>>>>think
>>>>>>>>> about the organization of Metron (and drive more discussion),
>>>>>>>>>rather than
>>>>>>>>> my proposal for a specific structure.
>>>>>>>>> 
>>>>>>>>> - Sensor - Anything that captures external, non-streaming data and
>>>>>>>>> presents it in a form ready for stream processing.
>>>>>>>>> - Input - Responsible for preparing streaming data for
>>>>>>>>>enrichment.  The
>>>>>>>>> existing "parsers" fit neatly into this space.
>>>>>>>>> - Enrichment - Responsible for enriching an incoming data feed
>>>>>>>>>like
>>>>>>>>> geoip, asset enrichment, threat intel lookups, etc.
>>>>>>>>> - Output - Responsible for persisting data that has been
>>>>>>>>>processed by
>>>>>>>>> Metron which obviously means search indexers or data stores.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, Apr 8, 2016 at 4:46 PM, Ryan Merriman
>>>>>>>>><rmerri...@hortonworks.com>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> All,
>>>>>>>>>> 
>>>>>>>>>> I would like to propose a review and refactor of the current
>>>>>>>>>>project
>>>>>>>>>> organization within Metron.  Much of the way the legacy code was
>>>>>>>>>>organized
>>>>>>>>>> does not make sense anymore and could be designed so that it is
>>>>>>>>>>easier to
>>>>>>>>>> navigate and understand.  Our test coverage has increased
>>>>>>>>>>substantially so
>>>>>>>>>> I believe we can do this with confidence.
>>>>>>>>>> 
>>>>>>>>>> First off, I think we should agree on a naming convention.  I
>>>>>>>>>>see some
>>>>>>>>>> projects (YARN and Storm for example) that prepend the
>>>>>>>>>>sub-project with the
>>>>>>>>>> name of the top-level project (storm-core for example).  Metron
>>>>>>>>>>also
>>>>>>>>>> currently does this (Metron-Common).  I think that's fine,
>>>>>>>>>>although in the
>>>>>>>>>> case of Metron, I feel like having "Metron" prepended is
>>>>>>>>>>redundant.
>>>>>>>>>> Regardless of whether we decide to stick with that approach, I
>>>>>>>>>>propose that
>>>>>>>>>> project names be uniform and lowercase.  For example, under these
>>>>>>>>>> assumptions "Metron-Common" would change to "common".
>>>>>>>>>> 
>>>>>>>>>> The first level of organization makes sense to me.  Only change
>>>>>>>>>>I would
>>>>>>>>>> make would be to project names:
>>>>>>>>>> 
>>>>>>>>>> *   deployment
>>>>>>>>>> *   streaming
>>>>>>>>>> *   ui
>>>>>>>>>> 
>>>>>>>>>> Or if we want to keep metron in project names:
>>>>>>>>>> 
>>>>>>>>>> *   metron-deployment
>>>>>>>>>> *   metron-streaming
>>>>>>>>>> *   metron-ui
>>>>>>>>>> 
>>>>>>>>>> For now I don't see any changes necessary in deployment or ui
>>>>>>>>>> organization.  I see the streaming project structure primarily
>>>>>>>>>>driven by 2
>>>>>>>>>> things:  the Maven dependency tree and deployment targets.  For
>>>>>>>>>>example,
>>>>>>>>>> solr and elasticsearch code should be separated (because their
>>>>>>>>>>dependency
>>>>>>>>>> on lucene conflicts) but both will depend on common enrichment
>>>>>>>>>>code.  Also,
>>>>>>>>>> now that parser, enrichment and pcap topologies are separate,
>>>>>>>>>>code for
>>>>>>>>>> those topologies will be deployed as separate jars.  No reason
>>>>>>>>>>to include
>>>>>>>>>> parser code in enrichment topologies and vice-versa.  Any other
>>>>>>>>>> considerations I'm missing?
>>>>>>>>>> 
>>>>>>>>>> With that being said, here is my initial proposal:
>>>>>>>>>> 
>>>>>>>>>> *   common -  Any common code that all topologies depend on
>>>>>>>>>> (configuration classes, generic writers for example).  No
>>>>>>>>>>dependencies on
>>>>>>>>>> other Metron projects.
>>>>>>>>>> *   test - Contains utilities for writing unit tests, sample
>>>>>>>>>>configs and
>>>>>>>>>> sample data.  Will depend on common.
>>>>>>>>>> *   integration-test - Contains utilities and classes needed to
>>>>>>>>>>run our
>>>>>>>>>> integration tests (in memory components for example).  Will
>>>>>>>>>>depend on
>>>>>>>>>> common and test.
>>>>>>>>>> *   dataload - Contains all code related to data loading.  Will
>>>>>>>>>>also
>>>>>>>>>> include any property files needed and integration tests.  Will
>>>>>>>>>>depend on
>>>>>>>>>> common, test (test scope), and integration-test (test scope).
>>>>>>>>>> *   parser - All code specific to the parser topologies.  Would
>>>>>>>>>>also
>>>>>>>>>> include scripts, property files, flux files and parser topology
>>>>>>>>>>integration
>>>>>>>>>> tests.  This project will depend on common, test (test scope),
>>>>>>>>>>and
>>>>>>>>>> integration-test (test scope).
>>>>>>>>>> *   enrichment - All code specific to the enrichment topologies
>>>>>>>>>>(except
>>>>>>>>>> solr and elasticsearch).  Would also include scripts, property
>>>>>>>>>>files, flux
>>>>>>>>>> files and enrichment topology integration tests.  This project
>>>>>>>>>>will depend
>>>>>>>>>> on common, test (test scope), and integration-test (test scope).
>>>>>>>>>> *   elasticsearch - All Elasticsearch related code.  Will depend
>>>>>>>>>>on
>>>>>>>>>> enrichment.
>>>>>>>>>> *   solr - All Solr related code.  Will depend on enrichment.
>>>>>>>>>> *   pcap - All code specific to the topology dedicated to pcap.
>>>>>>>>>>Would
>>>>>>>>>> also include scripts, property files, flux files and pcap
>>>>>>>>>>integration
>>>>>>>>>> tests.  This project will depend on common, test (test scope) and
>>>>>>>>>> integration-test (test scope).
>>>>>>>>>> *   api - This will serve as a generic replacement for
>>>>>>>>>> Metron-Pcap_Service.  Will contain all code to build a Metron
>>>>>>>>>>web service
>>>>>>>>>> middle layer that can expose APIs through REST or other client
>>>>>>>>>>protocols.
>>>>>>>>>> Could possibly depend on all other projects or separated further
>>>>>>>>>>if version
>>>>>>>>>> conflicts arise (separate api projects for solr and
>>>>>>>>>>elasticsearch for
>>>>>>>>>> example).
>>>>>>>>>> 
>>>>>>>>>> Looking forward to hearing everyone's feedback and great ideas.
>>>>>>>>>> 
>>>>>>>>>> Ryan Merriman
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> -- 
>>>>>>>>> Nick Allen <n...@nickallen.org>
>>>>>> 
>>
>
>
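
The dependency rules in Ryan's initial proposal, and the circular-dependency concern raised in his follow-up, can be checked mechanically. Below is a minimal sketch, not anything from the Metron build itself: the names PROPOSED, build_order and has_cycle are hypothetical, the graph is transcribed by hand from the proposal (api is left out because its dependencies were not settled), and the "circular" variant is an assumed example of integration-test depending on a project that also depends on it. It needs Python 3.9+ for graphlib.

```python
from graphlib import TopologicalSorter, CycleError

# Hand-transcribed from the initial proposal: project -> projects it depends on.
# "api" is omitted because its dependencies were left open in the thread.
PROPOSED = {
    "common": [],
    "test": ["common"],
    "integration-test": ["common", "test"],
    "dataload": ["common", "test", "integration-test"],
    "parser": ["common", "test", "integration-test"],
    "enrichment": ["common", "test", "integration-test"],
    "elasticsearch": ["enrichment"],
    "solr": ["enrichment"],
    "pcap": ["common", "test", "integration-test"],
}


def build_order(deps):
    """Return one valid build order; raises CycleError if deps are circular."""
    return list(TopologicalSorter(deps).static_order())


def has_cycle(deps):
    """True if the dependency graph contains a circular dependency."""
    try:
        build_order(deps)
        return False
    except CycleError:
        return True


order = build_order(PROPOSED)

# Assumed example: integration-test starts depending on parser, while parser
# still depends on integration-test for its own tests.
circular = dict(PROPOSED)
circular["integration-test"] = ["common", "test", "parser"]

if __name__ == "__main__":
    print("build order:", " -> ".join(order))
    print("proposed layout has a cycle:", has_cycle(PROPOSED))
    print("variant layout has a cycle:", has_cycle(circular))
```

This only models which project declares which other project as a dependency. It ignores Maven scopes, so a real check would still go through the build tool.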
