As of now, I think the following classes are not used:
Metron-EnrichmentAdapters
org.apache.metron.enrichment.adapters.cif.AbstractCIFAdapter.java
org.apache.metron.enrichment.adapters.cif.CIFHbaseAdapter.java
org.apache.metron.enrichment.adapters.whois.WhoisHBaseAdapter.java

Metron-DataLoads
org.apache.metron.dataloads.cif.HBaseTableLoad.java

Thanks,
Frank Lu

On 4/18/16, 3:05 PM, "Ryan Merriman" <rmerri...@hortonworks.com> wrote:

>All,
>
>I put together a list of all the project Java assets that details where
>they will be moved (or potentially deleted) as part of the project
>reorganization. Feedback welcome.
>
>Ryan Merriman
>
>On 4/13/16, 9:42 AM, "James Sirota" <jsir...@hortonworks.com> wrote:
>
>>I would not have configs as a project but rather as a folder structure
>>that other modules can point to
>>
>>Thanks,
>>James
>>
>>On 4/13/16, 7:32 AM, "Ryan Merriman" <rmerri...@hortonworks.com> wrote:
>>
>>>James brings up a good point. I propose adding another project under
>>>metron-platform called metron-configuration. This would be a fairly
>>>lightweight project that would contain anything related to configuration
>>>(property files, json files, flux files, etc).
>>>
>>>On 4/13/16, 8:56 AM, "James Sirota" <jsir...@hortonworks.com> wrote:
>>>
>>>>+1 from me.
>>>>
>>>>I would also like to address the configs and make sure the configs are
>>>>in the same place. Do you have ideas on where we would put those?
>>>>
>>>>Thanks,
>>>>James
>>>>
>>>>On 4/13/16, 6:50 AM, "Ryan Merriman" <rmerri...@hortonworks.com> wrote:
>>>>
>>>>>Thank you for all the feedback everyone. I will attempt to summarize
>>>>>all the input we've received and update my initial proposal. We can
>>>>>discuss further if anyone is still unclear and I will volunteer to
>>>>>capture all the details in a document of some kind once we all come to
>>>>>a consensus.
>>>>>
>>>>>Looks like everyone is in agreement for the top level projects.
>>>>>Nick is working on a task that will require an additional top level
>>>>>project so I am going to add that in as well:
>>>>>
>>>>>metron-deployment
>>>>>metron-platform
>>>>>metron-ui
>>>>>metron-sensors
>>>>>
>>>>>All of these except metron-platform are well understood and don't
>>>>>warrant any more discussion. For metron-platform there seem to be 2
>>>>>areas that are not as clear:
>>>>>
>>>>>- whether we need a common project
>>>>>- how do we organize test related code
>>>>>
>>>>>I agree with David and others that a common project will likely get
>>>>>misused and could become unnecessarily bloated. But I suspect there
>>>>>will be cases where we have common code being used across multiple
>>>>>projects (is already happening). In this case we will either need this
>>>>>common project or we will have to keep common code in one of the other
>>>>>projects and have all other projects extend that. For the latter, an
>>>>>example would be keeping common code in enrichment and having parsers
>>>>>declare enrichment as a dependency. There are a couple downsides I see
>>>>>with this approach:
>>>>>
>>>>>- parser topology jars now bring along all the enrichment dependencies
>>>>>- since more code from various projects is being packaged together,
>>>>>version conflicts are more likely and poms become more complicated due
>>>>>to all the necessary exclusions
>>>>>
>>>>>My thinking is that any jar file being deployed should only contain
>>>>>what it needs. Curious what others think here. My vote would be to
>>>>>maintain a common project (or whatever we want to call it) and be
>>>>>diligent about not letting project-specific code slip in there.
>>>>>
>>>>>I believe Nick was the first person to ask the question about projects
>>>>>related to test code and why we would need separate test and
>>>>>integration test.
>>>>>The reason for this is that our integration-test classes currently
>>>>>depend on other projects (not surprising since they are integration
>>>>>tests). If there are utilities we want to make available to all
>>>>>projects (mock classes, utilities for reading sample data, etc) then
>>>>>it can't live in integration-test because that will introduce circular
>>>>>dependencies. If it is possible to refactor our current Metron-Testing
>>>>>project so that it doesn't depend on any other projects, then we can
>>>>>keep utilities here. Otherwise we need a separate project for testing
>>>>>utilities. I suspect removing other project dependencies from
>>>>>Metron-Testing will prove more difficult than it's worth so my vote
>>>>>would be to have 2 test related projects.
>>>>>
>>>>>So here is where our metron-platform organization stands:
>>>>>
>>>>>metron-common *
>>>>>metron-integration-test *
>>>>>metron-test-utilities *
>>>>>metron-data-management
>>>>>metron-pcap
>>>>>metron-parsers
>>>>>metron-enrichment
>>>>>  metron-solr
>>>>>  metron-elasticsearch
>>>>>metron-api
>>>>>
>>>>>* may or may not change depending on the outcome of this discussion
>>>>>
>>>>>Thoughts?
>>>>>
>>>>>Ryan Merriman
>>>>>
>>>>>On 4/11/16, 4:15 PM, "Debojyoti Dutta" <ddu...@gmail.com> wrote:
>>>>>
>>>>>>If you load up your IRC client just type
>>>>>>/join #apache-metron-dev
>>>>>>
>>>>>>Sent from my iPhone
>>>>>>
>>>>>>> On Apr 11, 2016, at 12:06 PM, James Sirota <jsir...@hortonworks.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Great, thanks, Debo. Where can I find instructions on how to get to
>>>>>>> it?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> James
>>>>>>>
>>>>>>>> On 4/11/16, 9:41 AM, "Debo Dutta (dedutta)" <dedu...@cisco.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi James
>>>>>>>>
>>>>>>>> Ok set it up and ack ...
>>>>>>>>
>>>>>>>> Thx
>>>>>>>>
>>>>>>>>> On 4/10/16, 6:31 PM, "James Sirota" <jsir...@hortonworks.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi Debo,
>>>>>>>>>
>>>>>>>>> I think it would be great if you set it up
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> James
>>>>>>>>>
>>>>>>>>>> On 4/10/16, 6:25 PM, "Debojyoti Dutta" <ddu...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> I have set it up for another open source effort in the past and
>>>>>>>>>> it was not very hard. Am happy to volunteer if needed.
>>>>>>>>>>
>>>>>>>>>> Thx
>>>>>>>>>> Debo
>>>>>>>>>>
>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>
>>>>>>>>>>> On Apr 10, 2016, at 5:53 PM, James Sirota
>>>>>>>>>>> <jsir...@hortonworks.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I'd be open to an IRC channel. Does anyone know if Apache
>>>>>>>>>>> allows this? If yes, does anyone know how to set one up?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> James
>>>>>>>>>>>
>>>>>>>>>>>> On 4/10/16, 4:52 PM, "Debojyoti Dutta" <ddu...@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Nick
>>>>>>>>>>>>
>>>>>>>>>>>> I like your suggestions. For the enrichment layer do you think
>>>>>>>>>>>> it would also include any advanced analytics? Else we might
>>>>>>>>>>>> want to have an analytics layer.
>>>>>>>>>>>>
>>>>>>>>>>>> It would be good to have an arch which could be extended for
>>>>>>>>>>>> new functionality.
>>>>>>>>>>>>
>>>>>>>>>>>> However Ryan's suggestion of the ui API and deployer also makes
>>>>>>>>>>>> sense.
>>>>>>>>>>>>
>>>>>>>>>>>> Should we have an IRC channel to discuss this or maybe
>>>>>>>>>>>> etherpad?
>>>>>>>>>>>>
>>>>>>>>>>>> Debo
>>>>>>>>>>>>
>>>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>>>
>>>>>>>>>>>>> On Apr 10, 2016, at 4:36 PM, Nick Allen <n...@nickallen.org>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> It might help to think of our code base as four separate types
>>>>>>>>>>>>> of functionality. This is primarily meant to give us a
>>>>>>>>>>>>> framework to think about the organization of Metron (and drive
>>>>>>>>>>>>> more discussion), rather than my proposal for a specific
>>>>>>>>>>>>> structure.
>>>>>>>>>>>>>
>>>>>>>>>>>>> - Sensor - Anything that captures external, non-streaming data
>>>>>>>>>>>>>   and presents it in a form ready for stream processing.
>>>>>>>>>>>>> - Input - Responsible for preparing streaming data for
>>>>>>>>>>>>>   enrichment. The existing "parsers" fit neatly into this
>>>>>>>>>>>>>   space.
>>>>>>>>>>>>> - Enrichment - Responsible for enriching an incoming data feed
>>>>>>>>>>>>>   like geoip, asset enrichment, threat intel lookups, etc.
>>>>>>>>>>>>> - Output - Responsible for persisting data that has been
>>>>>>>>>>>>>   processed by Metron which obviously means search indexers or
>>>>>>>>>>>>>   data stores.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Apr 8, 2016 at 4:46 PM, Ryan Merriman
>>>>>>>>>>>>> <rmerri...@hortonworks.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> All,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would like to propose a review and refactor of the current
>>>>>>>>>>>>>> project organization within Metron. Much of the way the
>>>>>>>>>>>>>> legacy code was organized does not make sense anymore and
>>>>>>>>>>>>>> could be designed so that it is easier to navigate and
>>>>>>>>>>>>>> understand. Our test coverage has increased substantially so
>>>>>>>>>>>>>> I believe we can do this with confidence.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> First off, I think we should agree on a naming convention. I
>>>>>>>>>>>>>> see some projects (YARN and Storm for example) that prepend
>>>>>>>>>>>>>> the sub-project with the name of the top-level project
>>>>>>>>>>>>>> (storm-core for example). Metron also currently does this
>>>>>>>>>>>>>> (Metron-Common). I think that's fine, although in the case of
>>>>>>>>>>>>>> Metron, I feel like having "Metron" prepended is redundant.
>>>>>>>>>>>>>> Regardless of whether we decide to stick with that approach,
>>>>>>>>>>>>>> I propose that project names be uniform and lowercase. For
>>>>>>>>>>>>>> example, under these assumptions "Metron-Common" would change
>>>>>>>>>>>>>> to "common".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The first level of organization makes sense to me. Only
>>>>>>>>>>>>>> change I would make would be to project names:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> * deployment
>>>>>>>>>>>>>> * streaming
>>>>>>>>>>>>>> * ui
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Or if we want to keep metron in project names:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> * metron-deployment
>>>>>>>>>>>>>> * metron-streaming
>>>>>>>>>>>>>> * metron-ui
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For now I don't see any changes necessary in deployment or ui
>>>>>>>>>>>>>> organization. I see the streaming project structure primarily
>>>>>>>>>>>>>> driven by 2 things: the Maven dependency tree and deployment
>>>>>>>>>>>>>> targets. For example, solr and elasticsearch code should be
>>>>>>>>>>>>>> separated (because their dependency on lucene conflicts) but
>>>>>>>>>>>>>> both will depend on common enrichment code. Also, now that
>>>>>>>>>>>>>> parser, enrichment and pcap topologies are separate, code for
>>>>>>>>>>>>>> those topologies will be deployed as separate jars.
>>>>>>>>>>>>>> No reason to include parser code in enrichment topologies and
>>>>>>>>>>>>>> vice-versa. Any other considerations I'm missing?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> With that being said, here is my initial proposal:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> * common - Any common code that all topologies depend on
>>>>>>>>>>>>>>   (configuration classes, generic writers for example). No
>>>>>>>>>>>>>>   dependencies on other Metron projects.
>>>>>>>>>>>>>> * test - Contains utilities for writing unit tests, sample
>>>>>>>>>>>>>>   configs and sample data. Will depend on common.
>>>>>>>>>>>>>> * integration-test - Contains utilities and classes needed to
>>>>>>>>>>>>>>   run our integration tests (in memory components for
>>>>>>>>>>>>>>   example). Will depend on common and test.
>>>>>>>>>>>>>> * dataload - Contains all code related to data loading. Will
>>>>>>>>>>>>>>   also include any property files needed and integration
>>>>>>>>>>>>>>   tests. Will depend on common, test (test scope), and
>>>>>>>>>>>>>>   integration-test (test scope).
>>>>>>>>>>>>>> * parser - All code specific to the parser topologies. Would
>>>>>>>>>>>>>>   also include scripts, property files, flux files and parser
>>>>>>>>>>>>>>   topology integration tests. This project will depend on
>>>>>>>>>>>>>>   common, test (test scope), and integration-test (test
>>>>>>>>>>>>>>   scope).
>>>>>>>>>>>>>> * enrichment - All code specific to the enrichment topologies
>>>>>>>>>>>>>>   (except solr and elasticsearch). Would also include
>>>>>>>>>>>>>>   scripts, property files, flux files and enrichment topology
>>>>>>>>>>>>>>   integration tests.
>>>>>>>>>>>>>>   This project will depend on common, test (test scope), and
>>>>>>>>>>>>>>   integration-test (test scope).
>>>>>>>>>>>>>> * elasticsearch - All Elasticsearch related code. Will depend
>>>>>>>>>>>>>>   on enrichment.
>>>>>>>>>>>>>> * solr - All Solr related code. Will depend on enrichment.
>>>>>>>>>>>>>> * pcap - All code specific to the topology dedicated to pcap.
>>>>>>>>>>>>>>   Would also include scripts, property files, flux files and
>>>>>>>>>>>>>>   pcap integration test. This project will depend on common,
>>>>>>>>>>>>>>   test (test scope) and integration-test (test scope).
>>>>>>>>>>>>>> * api - This will serve as a generic replacement for
>>>>>>>>>>>>>>   Metron-Pcap_Service. Will contain all code to build a
>>>>>>>>>>>>>>   Metron web service middle layer that can expose APIs
>>>>>>>>>>>>>>   through REST or other client protocols. Could possibly
>>>>>>>>>>>>>>   depend on all other projects or separated further if
>>>>>>>>>>>>>>   version conflicts arise (separate api projects for solr and
>>>>>>>>>>>>>>   elasticsearch for example).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Looking forward to hearing everyone's feedback and great
>>>>>>>>>>>>>> ideas.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ryan Merriman
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Nick Allen <n...@nickallen.org>
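
The circular-dependency concern raised in the thread (shared test utilities cannot live in integration-test if integration-test itself depends on the other projects) can be illustrated with a small sketch. The two dependency graphs below are simplified assumptions based on the discussion, not the contents of the actual poms, and `build_order` is a hypothetical helper name. It uses Python's standard `graphlib` module (Python 3.9+) only to show whether a valid build order exists.

```python
from graphlib import CycleError, TopologicalSorter

# Each key is a module; its value is the set of modules it depends on.
# Both graphs are simplified assumptions taken from the thread, not real poms.

# Variant A: shared test utilities live inside metron-integration-test,
# so parsers and enrichment depend on it while it depends on them.
SINGLE_TEST_PROJECT = {
    "metron-common": set(),
    "metron-parsers": {"metron-common", "metron-integration-test"},
    "metron-enrichment": {"metron-common", "metron-integration-test"},
    "metron-integration-test": {
        "metron-common", "metron-parsers", "metron-enrichment",
    },
}

# Variant B: utilities are split out into metron-test-utilities, which
# depends only on metron-common.
TWO_TEST_PROJECTS = {
    "metron-common": set(),
    "metron-test-utilities": {"metron-common"},
    "metron-parsers": {"metron-common", "metron-test-utilities"},
    "metron-enrichment": {"metron-common", "metron-test-utilities"},
    "metron-integration-test": {
        "metron-common", "metron-test-utilities",
        "metron-parsers", "metron-enrichment",
    },
}


def build_order(dependencies):
    """Return modules with dependencies first, or None if there is a cycle."""
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError:
        return None


if __name__ == "__main__":
    print("single test project:", build_order(SINGLE_TEST_PROJECT))
    print("two test projects:  ", build_order(TWO_TEST_PROJECTS))
```

Under these assumptions the single-project variant has no valid build order (the helper returns None), while the two-project variant does, which matches the vote in the thread for two test related projects.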