Re: [DISCUSS] Project reorganization

Casey Stella Mon, 11 Apr 2016 08:58:31 -0700

I'm generally in favor of keeping an integration test project only for
integration test infrastructure (i.e., the in-memory components) and having
the integration tests live in the projects that contain the components
being tested.


On Mon, Apr 11, 2016 at 11:36 David Lyle <dlyle65...@gmail.com> wrote:

> I think I was thinking along the same lines as James, let me read it back
> to make sure:
>
> Metron
>   Platform
>      Common (*)
>      Integration-Test (*)
>      DataManagement
>      PCAP
>      Parsers
>      Enrichment
>        Solr
>        Elasticsearch
>   Deployment
>   Streaming
>   UI
>
> For Common and Integration-Test, I'd be interested in a little more
> discussion around keeping them. I lean toward not having them. I understand
> and support the goal of reuse, but I've found these catch-all projects
> don't always facilitate that aim. We may be better served in the long run
> by aligning these classes with their initial users. For example, wouldn't
> all the bolt interfaces and abstract classes be better homed in Enrichment?
> Configuration classes may be best as a separate project under Platform? The
> classes in Metron-Testing may have to stick around as a separate project,
> but perhaps not; they seem to be tightly aligned with enrichment-type
> integration testing.
>
> Also, since we're going to have to refactor the poms as part of this
> effort, there are some first-order principles that I'd be interested in
> hearing others' thoughts about:
>
> 1) mvn (whatever) should run from the top level and each sub-module.
> 2) The top level pom should use a dependencyManagement section to avoid
> global_version type variables.
> 3) All plugins and dependencies should have a specified version (fwiw, I
> think we're pretty good here, but it's worth a look)
> 4) Versioning- master/trunk should be version-SNAPSHOT.
> 5) Other thoughts?
>
>
> -D...
>
>
> On Sun, Apr 10, 2016 at 8:31 PM, James Sirota <jsir...@hortonworks.com>
> wrote:
>
> > Hi Debo,
> >
> > I think it would be great if you set it up
> >
> > Thanks,
> > James
> >
> >
> >
> >
> > On 4/10/16, 6:25 PM, "Debojyoti Dutta" <ddu...@gmail.com> wrote:
> >
> > >I have set it up for another open source effort in the past and it was
> > >not very hard. Am happy to volunteer if needed.
> > >
> > >Thx
> > >Debo
> > >
> > >Sent from my iPhone
> > >
> > >> On Apr 10, 2016, at 5:53 PM, James Sirota <jsir...@hortonworks.com> wrote:
> > >>
> > >> I’d be open to an IRC channel.  Does anyone know if Apache allows
> > >> this?  If yes, does anyone know how to set one up?
> > >>
> > >> Thanks,
> > >> James
> > >>
> > >>
> > >>
> > >>
> > >>> On 4/10/16, 4:52 PM, "Debojyoti Dutta" <ddu...@gmail.com> wrote:
> > >>>
> > >>> Hi Nick
> > >>>
> > >>> I like your suggestions. For the enrichment layer, do you think it
> > >>> would also include any advanced analytics? Otherwise we might want to
> > >>> have an analytics layer.
> > >>>
> > >>> It would be good to have an architecture that could be extended for new
> > >>> functionality.
> > >>>
> > >>> However, Ryan's suggestion of the ui API and deployer also makes sense.
> > >>>
> > >>> Should we have an IRC channel to discuss this or maybe etherpad?
> > >>>
> > >>> Debo
> > >>>
> > >>> Sent from my iPhone
> > >>>
> > >>>> On Apr 10, 2016, at 4:36 PM, Nick Allen <n...@nickallen.org> wrote:
> > >>>>
> > >>>> It might help to think of our code base as four separate types of
> > >>>> functionality.  This is primarily meant to give us a framework to think
> > >>>> about the organization of Metron (and drive more discussion), rather than
> > >>>> my proposal for a specific structure.
> > >>>>
> > >>>>  - Sensor - Anything that captures external, non-streaming data and
> > >>>>  presents it in a form ready for stream processing.
> > >>>>  - Input - Responsible for preparing streaming data for enrichment.  The
> > >>>>  existing "parsers" fit neatly into this space.
> > >>>>  - Enrichment - Responsible for enriching an incoming data feed with
> > >>>>  things like geoip, asset enrichment, threat intel lookups, etc.
> > >>>>  - Output - Responsible for persisting data that has been processed by
> > >>>>  Metron, which obviously means search indexers or data stores.
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Fri, Apr 8, 2016 at 4:46 PM, Ryan Merriman <rmerri...@hortonworks.com>
> > >>>> wrote:
> > >>>>
> > >>>>> All,
> > >>>>>
> > >>>>> I would like to propose a review and refactor of the current project
> > >>>>> organization within Metron.  Much of the way the legacy code was
> > >>>>> organized does not make sense anymore and could be designed so that it
> > >>>>> is easier to navigate and understand.  Our test coverage has increased
> > >>>>> substantially so I believe we can do this with confidence.
> > >>>>>
> > >>>>> First off, I think we should agree on a naming convention.  I see some
> > >>>>> projects (YARN and Storm for example) that prepend the sub-project with
> > >>>>> the name of the top-level project (storm-core for example).  Metron also
> > >>>>> currently does this (Metron-Common).  I think that's fine, although in
> > >>>>> the case of Metron, I feel like having "Metron" prepended is redundant.
> > >>>>> Regardless of whether we decide to stick with that approach, I propose
> > >>>>> that project names be uniform and lowercase.  For example, under these
> > >>>>> assumptions "Metron-Common" would change to "common".
> > >>>>>
> > >>>>> The first level of organization makes sense to me.  Only change I would
> > >>>>> make would be to project names:
> > >>>>>
> > >>>>> *   deployment
> > >>>>> *   streaming
> > >>>>> *   ui
> > >>>>>
> > >>>>> Or if we want to keep metron in project names:
> > >>>>>
> > >>>>> *   metron-deployment
> > >>>>> *   metron-streaming
> > >>>>> *   metron-ui
> > >>>>>
> > >>>>> For now I don't see any changes necessary in deployment or ui
> > >>>>> organization.  I see the streaming project structure primarily driven
> > >>>>> by 2 things:  the Maven dependency tree and deployment targets.  For
> > >>>>> example, solr and elasticsearch code should be separated (because their
> > >>>>> lucene dependencies conflict) but both will depend on common
> > >>>>> enrichment code.  Also, now that parser, enrichment and pcap topologies
> > >>>>> are separate, code for those topologies will be deployed as separate
> > >>>>> jars.  No reason to include parser code in enrichment topologies and
> > >>>>> vice-versa.  Any other considerations I'm missing?
> > >>>>>
> > >>>>> With that being said, here is my initial proposal:
> > >>>>>
> > >>>>> *   common - Any common code that all topologies depend on
> > >>>>>     (configuration classes, generic writers for example).  No
> > >>>>>     dependencies on other Metron projects.
> > >>>>> *   test - Contains utilities for writing unit tests, sample configs and
> > >>>>>     sample data.  Will depend on common.
> > >>>>> *   integration-test - Contains utilities and classes needed to run our
> > >>>>>     integration tests (in-memory components for example).  Will depend on
> > >>>>>     common and test.
> > >>>>> *   dataload - Contains all code related to data loading.  Will also
> > >>>>>     include any property files needed and integration tests.  Will depend
> > >>>>>     on common, test (test scope), and integration-test (test scope).
> > >>>>> *   parser - All code specific to the parser topologies.  Would also
> > >>>>>     include scripts, property files, flux files and parser topology
> > >>>>>     integration tests.  This project will depend on common, test (test
> > >>>>>     scope), and integration-test (test scope).
> > >>>>> *   enrichment - All code specific to the enrichment topologies (except
> > >>>>>     solr and elasticsearch).  Would also include scripts, property files,
> > >>>>>     flux files and enrichment topology integration tests.  This project
> > >>>>>     will depend on common, test (test scope), and integration-test (test
> > >>>>>     scope).
> > >>>>> *   elasticsearch - All Elasticsearch related code.  Will depend on
> > >>>>>     enrichment.
> > >>>>> *   solr - All Solr related code.  Will depend on enrichment.
> > >>>>> *   pcap - All code specific to the topology dedicated to pcap.  Would
> > >>>>>     also include scripts, property files, flux files and pcap integration
> > >>>>>     tests.  This project will depend on common, test (test scope) and
> > >>>>>     integration-test (test scope).
> > >>>>> *   api - This will serve as a generic replacement for
> > >>>>>     Metron-Pcap_Service.  Will contain all code to build a Metron web
> > >>>>>     service middle layer that can expose APIs through REST or other
> > >>>>>     client protocols.  Could possibly depend on all other projects or be
> > >>>>>     separated further if version conflicts arise (separate api projects
> > >>>>>     for solr and elasticsearch for example).
> > >>>>>
> > >>>>> Looking forward to hearing everyone's feedback and great ideas.
> > >>>>>
> > >>>>> Ryan Merriman
> > >>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> Nick Allen <n...@nickallen.org>
> > >>>
> > >
> >
>
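
As a rough illustration of David's points 2 and 3 above (a top-level
dependencyManagement section, and a version for every dependency), here is a
minimal Python sketch that flags dependencies in a child pom that have no
version of their own and are not covered by the parent's dependencyManagement.
The two pom snippets and the org.example coordinates are hypothetical, not
taken from Metron, and the check ignores plugins, properties and inherited
parents.

```python
import xml.etree.ElementTree as ET

# Standard Maven POM namespace.
NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical parent pom: versions are pinned once in dependencyManagement.
PARENT_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>org.example</groupId>
        <artifactId>example-lib</artifactId>
        <version>1.0.0</version>
      </dependency>
    </dependencies>
  </dependencyManagement>
</project>"""

# Hypothetical child pom: one managed, one explicit, one unversioned dependency.
CHILD_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>org.example</groupId>
      <artifactId>example-lib</artifactId>
    </dependency>
    <dependency>
      <groupId>org.example</groupId>
      <artifactId>other-lib</artifactId>
      <version>2.0.0</version>
    </dependency>
    <dependency>
      <groupId>org.example</groupId>
      <artifactId>unmanaged-lib</artifactId>
    </dependency>
  </dependencies>
</project>"""


def _key(dep):
    """Return the (groupId, artifactId) pair of a <dependency> element."""
    return (
        dep.findtext("m:groupId", namespaces=NS),
        dep.findtext("m:artifactId", namespaces=NS),
    )


def managed_keys(parent_pom_xml):
    """Collect the coordinates listed in the parent's dependencyManagement."""
    root = ET.fromstring(parent_pom_xml)
    path = "m:dependencyManagement/m:dependencies/m:dependency"
    return {_key(dep) for dep in root.findall(path, NS)}


def unversioned(child_pom_xml, managed):
    """List direct dependencies with no <version> and no managed entry."""
    root = ET.fromstring(child_pom_xml)
    missing = []
    for dep in root.findall("m:dependencies/m:dependency", NS):
        if dep.find("m:version", NS) is None and _key(dep) not in managed:
            missing.append(_key(dep))
    return missing


if __name__ == "__main__":
    for group_id, artifact_id in unversioned(CHILD_POM, managed_keys(PARENT_POM)):
        print(f"no version for {group_id}:{artifact_id}")
```

A check along these lines could run over each sub-module pom once the poms
are refactored.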

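The module dependencies in Ryan's initial proposal above can be written down
as a small graph and checked mechanically. The sketch below is only an
illustration under stated assumptions: the edges are transcribed from the
proposal's "will depend on" sentences, "api" is left out because its
dependencies were still open, and the helper names are made up for this
example. It uses Python's standard graphlib module (Python 3.9+) to derive
one valid build order and confirms that solr and elasticsearch never depend
on each other.

```python
from graphlib import CycleError, TopologicalSorter

# Module -> direct dependencies, transcribed from the proposal.
# "api" is omitted because its dependencies were left open.
MODULE_DEPS = {
    "common": set(),
    "test": {"common"},
    "integration-test": {"common", "test"},
    "dataload": {"common", "test", "integration-test"},
    "parser": {"common", "test", "integration-test"},
    "enrichment": {"common", "test", "integration-test"},
    "elasticsearch": {"enrichment"},
    "solr": {"enrichment"},
    "pcap": {"common", "test", "integration-test"},
}


def build_order(deps):
    """Return one valid build order; raise ValueError if there is a cycle."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as err:
        raise ValueError(f"dependency cycle: {err.args[1]}") from err


def transitive_deps(deps, module):
    """Return every module reachable from `module` through dependencies."""
    seen = set()
    stack = list(deps[module])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(deps[dep])
    return seen


if __name__ == "__main__":
    print(" -> ".join(build_order(MODULE_DEPS)))
    # The lucene conflict means these two must stay independent.
    assert "solr" not in transitive_deps(MODULE_DEPS, "elasticsearch")
    assert "elasticsearch" not in transitive_deps(MODULE_DEPS, "solr")
```

The same table would also show what moving the integration tests into the
component projects changes: only the test-scope edges, not the build order.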