+1 from me. I would also like to address the configs and make sure they all live in the same place. Do you have ideas on where we would put them?

Thanks,
James

On 4/13/16, 6:50 AM, "Ryan Merriman" <rmerri...@hortonworks.com> wrote:

>Thank you for all the feedback, everyone. I will attempt to summarize all
>the input we've received and update my initial proposal. We can discuss
>further if anyone is still unclear, and I will volunteer to capture all
>the details in a document of some kind once we all come to a consensus.
>
>Looks like everyone is in agreement on the top-level projects. Nick is
>working on a task that will require an additional top-level project, so I
>am going to add that in as well:
>
>metron-deployment
>metron-platform
>metron-ui
>metron-sensors
>
>All of these except metron-platform are well understood and don't warrant
>any more discussion. For metron-platform there seem to be 2 areas that
>are not as clear:
>
>- whether we need a common project
>- how we organize test-related code
>
>I agree with David and others that a common project will likely get
>misused and could become unnecessarily bloated. But I suspect there will
>be cases where we have common code being used across multiple projects
>(this is already happening). In this case we will either need this common
>project or we will have to keep common code in one of the other projects
>and have all other projects extend that. For the latter, an example would
>be keeping common code in enrichment and having parsers declare enrichment
>as a dependency. There are a couple of downsides I see with this approach:
>
>- parser topology jars now bring along all the enrichment dependencies
>- since more code from various projects is being packaged together,
>  version conflicts are more likely and poms become more complicated due
>  to all the necessary exclusions
>
>My thinking is that any jar file being deployed should only contain what
>it needs. Curious what others think here. My vote would be to maintain a
>common project (or whatever we want to call it) and be diligent about not
>letting project-specific code slip in there.
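
To make the trade-off above a bit more concrete, here is a rough Python sketch of what a parser topology jar would pull in under the two layouts. The module names come from this thread; the third-party library names (`json-lib`, `geoip-lib`, `hbase-client`) and all of the edges are placeholders I made up for illustration, not something read from our poms.

```python
# Rough sketch: what ends up on a topology jar's classpath under two layouts.
# Module names come from this thread; the third-party library names and the
# edges are hypothetical placeholders, not read from the real poms.

def transitive_deps(module, graph):
    """Return every module or library reachable from `module`."""
    seen = set()
    stack = list(graph.get(module, ()))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph.get(dep, ()))
    return seen

# Layout A: shared code lives in enrichment, so parsers depend on enrichment.
layout_a = {
    "metron-parsers": ["metron-enrichment"],
    "metron-enrichment": ["json-lib", "geoip-lib", "hbase-client"],
}

# Layout B: shared code lives in a small common project.
layout_b = {
    "metron-parsers": ["metron-common"],
    "metron-enrichment": ["metron-common", "geoip-lib", "hbase-client"],
    "metron-common": ["json-lib"],
}

parsers_a = transitive_deps("metron-parsers", layout_a)
parsers_b = transitive_deps("metron-parsers", layout_b)

# Libraries the parser jar carries only because of the enrichment dependency.
extra_in_a = parsers_a - parsers_b - {"metron-enrichment"}

print(sorted(parsers_a))
print(sorted(parsers_b))
print(sorted(extra_in_a))
```

With these made-up edges, the parser jar in layout A drags in the enrichment-only libraries, which is the first downside Ryan lists. In a real build, `mvn dependency:tree` on each project would show the actual picture.
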
>
>I believe Nick was the first person to ask the question about projects
>related to test code and why we would need separate test and integration
>test projects. The reason for this is that our integration-test classes
>currently depend on other projects (not surprising since they are
>integration tests). If there are utilities we want to make available to
>all projects (mock classes, utilities for reading sample data, etc.) then
>they can't live in integration-test because that will introduce circular
>dependencies. If it is possible to refactor our current Metron-Testing
>project so that it doesn't depend on any other projects, then we can keep
>utilities here. Otherwise we need a separate project for testing
>utilities. I suspect removing other project dependencies from
>Metron-Testing will prove more difficult than it's worth, so my vote would
>be to have 2 test-related projects.
>
>So here is where our metron-platform organization stands:
>
>metron-common *
>metron-integration-test *
>metron-test-utilities *
>metron-data-management
>metron-pcap
>metron-parsers
>metron-enrichment
>  metron-solr
>  metron-elasticsearch
>metron-api
>
>* may or may not change depending on the outcome of this discussion
>
>Thoughts?
>
>Ryan Merriman
>
>
>On 4/11/16, 4:15 PM, "Debojyoti Dutta" <ddu...@gmail.com> wrote:
>
>>If you load up your IRC client just type
>>/join #apache-metron-dev
>>
>>Sent from my iPhone
>>
>>> On Apr 11, 2016, at 12:06 PM, James Sirota <jsir...@hortonworks.com> wrote:
>>>
>>> Great, thanks, Debo. Where can I find instructions on how to get to it?
>>>
>>> Thanks,
>>> James
>>>
>>>> On 4/11/16, 9:41 AM, "Debo Dutta (dedutta)" <dedu...@cisco.com> wrote:
>>>>
>>>> Hi James
>>>>
>>>> Ok set it up and ack ...
>>>>
>>>> Thx
>>>>
>>>>> On 4/10/16, 6:31 PM, "James Sirota" <jsir...@hortonworks.com> wrote:
>>>>>
>>>>> Hi Debo,
>>>>>
>>>>> I think it would be great if you set it up
>>>>>
>>>>> Thanks,
>>>>> James
>>>>>
>>>>>> On 4/10/16, 6:25 PM, "Debojyoti Dutta" <ddu...@gmail.com> wrote:
>>>>>>
>>>>>> I have set it up for another open source effort in the past and it
>>>>>> was not very hard. Am happy to volunteer if needed.
>>>>>>
>>>>>> Thx
>>>>>> Debo
>>>>>>
>>>>>> Sent from my iPhone
>>>>>>
>>>>>>> On Apr 10, 2016, at 5:53 PM, James Sirota <jsir...@hortonworks.com> wrote:
>>>>>>>
>>>>>>> I'd be open to an IRC channel. Does anyone know if Apache allows
>>>>>>> this? If yes, does anyone know how to set one up?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> James
>>>>>>>
>>>>>>>> On 4/10/16, 4:52 PM, "Debojyoti Dutta" <ddu...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi Nick
>>>>>>>>
>>>>>>>> I like your suggestions. For the enrichment layer do you think it
>>>>>>>> would also include any advanced analytics. Else we might want to
>>>>>>>> have an analytics layer.
>>>>>>>>
>>>>>>>> It would be good to have an arch which could be extended for new
>>>>>>>> functionality.
>>>>>>>>
>>>>>>>> However Ryan's suggestion of the ui API and deployer also makes
>>>>>>>> sense.
>>>>>>>>
>>>>>>>> Should we have an IRC channel to discuss this or maybe etherpad?
>>>>>>>>
>>>>>>>> Debo
>>>>>>>>
>>>>>>>> Sent from my iPhone
>>>>>>>>
>>>>>>>>> On Apr 10, 2016, at 4:36 PM, Nick Allen <n...@nickallen.org> wrote:
>>>>>>>>>
>>>>>>>>> It might help to think of our code base as four separate types of
>>>>>>>>> functionality. This is primarily meant to give us a framework to
>>>>>>>>> think about the organization of Metron (and drive more
>>>>>>>>> discussion), rather than my proposal for a specific structure.
>>>>>>>>>
>>>>>>>>> - Sensor - Anything that captures external, non-streaming data
>>>>>>>>>   and presents it in a form ready for stream processing.
>>>>>>>>> - Input - Responsible for preparing streaming data for
>>>>>>>>>   enrichment. The existing "parsers" fit neatly into this space.
>>>>>>>>> - Enrichment - Responsible for enriching an incoming data feed
>>>>>>>>>   like geoip, asset enrichment, threat intel lookups, etc.
>>>>>>>>> - Output - Responsible for persisting data that has been
>>>>>>>>>   processed by Metron which obviously means search indexers or
>>>>>>>>>   data stores.
>>>>>>>>>
>>>>>>>>> On Fri, Apr 8, 2016 at 4:46 PM, Ryan Merriman
>>>>>>>>> <rmerri...@hortonworks.com> wrote:
>>>>>>>>>
>>>>>>>>>> All,
>>>>>>>>>>
>>>>>>>>>> I would like to propose a review and refactor of the current
>>>>>>>>>> project organization within Metron. Much of the way the legacy
>>>>>>>>>> code was organized does not make sense anymore and could be
>>>>>>>>>> designed so that it is easier to navigate and understand. Our
>>>>>>>>>> test coverage has increased substantially so I believe we can
>>>>>>>>>> do this with confidence.
>>>>>>>>>>
>>>>>>>>>> First off, I think we should agree on a naming convention. I
>>>>>>>>>> see some projects (YARN and Storm for example) that prepend the
>>>>>>>>>> sub-project with the name of the top-level project (storm-core
>>>>>>>>>> for example). Metron also currently does this (Metron-Common).
>>>>>>>>>> I think that's fine, although in the case of Metron, I feel
>>>>>>>>>> like having "Metron" prepended is redundant. Regardless of
>>>>>>>>>> whether we decide to stick with that approach, I propose that
>>>>>>>>>> project names be uniform and lowercase. For example, under
>>>>>>>>>> these assumptions "Metron-Common" would change to "common".
>>>>>>>>>>
>>>>>>>>>> The first level of organization makes sense to me. Only change
>>>>>>>>>> I would make would be to project names:
>>>>>>>>>>
>>>>>>>>>> * deployment
>>>>>>>>>> * streaming
>>>>>>>>>> * ui
>>>>>>>>>>
>>>>>>>>>> Or if we want to keep metron in project names:
>>>>>>>>>>
>>>>>>>>>> * metron-deployment
>>>>>>>>>> * metron-streaming
>>>>>>>>>> * metron-ui
>>>>>>>>>>
>>>>>>>>>> For now I don't see any changes necessary in deployment or ui
>>>>>>>>>> organization. I see the streaming project structure primarily
>>>>>>>>>> driven by 2 things: the Maven dependency tree and deployment
>>>>>>>>>> targets. For example, solr and elasticsearch code should be
>>>>>>>>>> separated (because their dependency on lucene conflicts) but
>>>>>>>>>> both will depend on common enrichment code. Also, now that
>>>>>>>>>> parser, enrichment and pcap topologies are separate, code for
>>>>>>>>>> those topologies will be deployed as separate jars. No reason
>>>>>>>>>> to include parser code in enrichment topologies and vice-versa.
>>>>>>>>>> Any other considerations I'm missing?
>>>>>>>>>>
>>>>>>>>>> With that being said, here is my initial proposal:
>>>>>>>>>>
>>>>>>>>>> * common - Any common code that all topologies depend on
>>>>>>>>>>   (configuration classes, generic writers for example). No
>>>>>>>>>>   dependencies on other Metron projects.
>>>>>>>>>> * test - Contains utilities for writing unit tests, sample
>>>>>>>>>>   configs and sample data. Will depend on common.
>>>>>>>>>> * integration-test - Contains utilities and classes needed to
>>>>>>>>>>   run our integration tests (in memory components for example).
>>>>>>>>>>   Will depend on common and test.
>>>>>>>>>> * dataload - Contains all code related to data loading. Will
>>>>>>>>>>   also include any property files needed and integration tests.
>>>>>>>>>>   Will depend on common, test (test scope), and
>>>>>>>>>>   integration-test (test scope).
>>>>>>>>>> * parser - All code specific to the parser topologies. Would
>>>>>>>>>>   also include scripts, property files, flux files and parser
>>>>>>>>>>   topology integration tests. This project will depend on
>>>>>>>>>>   common, test (test scope), and integration-test (test scope).
>>>>>>>>>> * enrichment - All code specific to the enrichment topologies
>>>>>>>>>>   (except solr and elasticsearch). Would also include scripts,
>>>>>>>>>>   property files, flux files and enrichment topology
>>>>>>>>>>   integration tests. This project will depend on common, test
>>>>>>>>>>   (test scope), and integration-test (test scope).
>>>>>>>>>> * elasticsearch - All Elasticsearch related code. Will depend
>>>>>>>>>>   on enrichment.
>>>>>>>>>> * solr - All Solr related code. Will depend on enrichment.
>>>>>>>>>> * pcap - All code specific to the topology dedicated to pcap.
>>>>>>>>>>   Would also include scripts, property files, flux files and
>>>>>>>>>>   pcap integration test. This project will depend on common,
>>>>>>>>>>   test (test scope) and integration-test (test scope).
>>>>>>>>>> * api - This will serve as a generic replacement for
>>>>>>>>>>   Metron-Pcap_Service. Will contain all code to build a Metron
>>>>>>>>>>   web service middle layer that can expose APIs through REST or
>>>>>>>>>>   other client protocols. Could possibly depend on all other
>>>>>>>>>>   projects or separated further if version conflicts arise
>>>>>>>>>>   (separate api projects for solr and elasticsearch for
>>>>>>>>>>   example).
>>>>>>>>>>
>>>>>>>>>> Looking forward to hearing everyone's feedback and great ideas.
>>>>>>>>>>
>>>>>>>>>> Ryan Merriman
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Nick Allen <n...@nickallen.org>
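
As a sanity check on the dependency list in the initial proposal quoted above, here is a small Python sketch that encodes it as a graph, derives a valid build order, and shows how a circular dependency would surface. The edges are transcribed from the proposal (ignoring Maven scopes); `api` is left out because its dependencies were still open. The `broken` variant is a hypothetical change I invented to illustrate the circular-dependency concern, not something anyone has proposed. It uses `graphlib` from the standard library, so it assumes Python 3.9 or later.

```python
from graphlib import TopologicalSorter, CycleError  # Python 3.9+

# Module -> modules it depends on, transcribed from the initial proposal.
# Maven scopes are ignored; api is omitted since its deps were undecided.
proposal = {
    "common": [],
    "test": ["common"],
    "integration-test": ["common", "test"],
    "dataload": ["common", "test", "integration-test"],
    "parser": ["common", "test", "integration-test"],
    "enrichment": ["common", "test", "integration-test"],
    "elasticsearch": ["enrichment"],
    "solr": ["enrichment"],
    "pcap": ["common", "test", "integration-test"],
}

def build_order(graph):
    """Return the modules in an order where dependencies come first."""
    return list(TopologicalSorter(graph).static_order())

def has_cycle(graph):
    """True if the module graph contains a circular dependency."""
    try:
        build_order(graph)
    except CycleError:
        return True
    return False

order = build_order(proposal)
print(order)

# Hypothetical change: the shared test utilities start using enrichment
# classes, while enrichment still depends on test.
broken = dict(proposal, test=["common", "enrichment"])
print(has_cycle(proposal), has_cycle(broken))
```

The proposal as written sorts cleanly, with common first and the solr/elasticsearch projects after enrichment. The hypothetical variant does not, which is the reason for keeping shared test utilities free of dependencies on the other platform projects.
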