Thanks for the feedback so far everyone.  All good points.

Otto, if we did decide to go down the Docker route, we could
use /master/metron-contrib/metron-docker as a starting point.  The reason I
initially created that module was to support Management UI testing because
full dev was unusable for that purpose at that time.  This is the same use
case.  A lot of the work has already been done but we would need to review
it and bring it up to date with the current state of master.  Once we get
it to a point where we can manually spin up the Docker environment and get
the e2e tests to pass, we would then need to add it into our Travis
workflow.
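
To make that last step concrete, the Travis integration might look roughly like the sketch below. The compose file path matches what's in master today, but the wait script and test command are assumptions for illustration, not actual files in the repo:

```yaml
# Hypothetical .travis.yml fragment: spin the Docker backend up once,
# run the e2e suite against it, then tear everything down.
services:
  - docker

before_script:
  - docker-compose -f metron-contrib/metron-docker/compose/docker-compose.yml up -d
  # wait for containers to report healthy before running tests (script is hypothetical)
  - ./scripts/wait-for-services.sh

script:
  - npm run e2e   # run the UI e2e tests against the dockerized backend

after_script:
  - docker-compose -f metron-contrib/metron-docker/compose/docker-compose.yml down
```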

Mike, yes this is one of the options I listed at the start of the DISCUSS
thread, although I'm not sure I agree with the Docker disadvantages you
list.  We could use a similar approach for HDFS in Docker by setting it to
local FS and creating a shared volume that all the containers have access
to.  I've also found that Docker Compose makes the networking part much
easier.  What other advantages can you think of that in-memory components
in separate processes would offer us?  Are there other disadvantages with
using Docker?
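
For the HDFS point, a minimal compose sketch of what I mean (image names and paths here are placeholders, not what's in metron-docker): a named volume mounted into each container so services configured to use the local FS (e.g. fs.defaultFS=file:///) all see the same files, standing in for HDFS.

```yaml
# Hypothetical docker-compose fragment: one shared named volume in place of HDFS.
version: '2'
services:
  storm:
    image: storm:1.0           # placeholder image
    volumes:
      - shared-fs:/data        # topologies write "HDFS" output here
  indexing:
    image: metron-indexing     # placeholder image
    volumes:
      - shared-fs:/data        # readers see the same local filesystem
volumes:
  shared-fs:
```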

Justin, I think that's a really good point and I would be on board with
it.  I see this use case (e2e testing infrastructure) as a good way to
evaluate our options without making major changes across our codebase.  I
would agree that standardizing on an approach would be ideal and something
we should work towards.  The debugging request is also something that would
be extremely helpful.  The only issue I see is debugging a Storm topology:
that would still need to be run locally using LocalCluster because remote
debugging does not work well in Storm (per previous comments from Storm
committers).  At one point I was able to get this to work with Docker
containers but we would definitely need to revisit it and create tooling
around it.

So in summary, I think we agree on these points so far:

   - no one seems to be in favor of mocking our backend so I'll take that
   option off the table
   - everyone seems to be in favor of moving to a strategy where we spin up
   backend services at the beginning of all tests and spin down at the end,
   rather than spinning up/down for each class or suite of tests
   - the ability to debug our code locally is important and something to
   keep in mind as we evaluate our options

I think the next step is to decide whether we pursue in-memory/separate
process vs Docker.  Having used both, there are a couple of disadvantages I
see with the in-memory approach:

   - The in-memory components are different from real installations and
   come with separate issues.  There have been cases where an in-memory
   component had a bug (looking at you Kafka) that a normal installation
   wouldn't have and required effort to put workarounds in place.
   - Spinning up the in-memory components in separate processes and
   managing their life cycles is not a trivial task.  In Otto's words, I
   believe this will inevitably become a "large chunk of custom development
   that has to be maintained".  Docker Compose exposes a declarative interface
   that is much simpler in my opinion (check out
   
https://github.com/apache/metron/blob/master/metron-contrib/metron-docker/compose/docker-compose.yml
   as an example).  I also think our testing infrastructure will be more
   accessible to outside contributors because Docker is a common skill in the
   industry.  Otherwise a contributor would have to come up to speed with our
   custom in-memory process module before being able to make any meaningful
   contributions.

I can live with the first one but the second one is a big issue IMO.  Even
if we do decide to use the in-memory components I think we need to delegate
the process management stuff to another framework not maintained by us.

How do others feel?  What other considerations are there?

On Wed, Nov 29, 2017 at 6:59 AM, Justin Leet <justinjl...@gmail.com> wrote:

> As an additional consideration, it would be really nice to get our current
> set of integration tests to be able to be run on this infrastructure as
> well. Or at least able to be converted in a known manner. Eventually, we
> could probably split out the integration tests from the unit tests
> entirely. It would likely improve the build times if we were reusing the
> components between test classes (keep in mind right now, we only reuse
> between test cases in a given class).
>
> In my mind, ideally we have a single infra for integration and e2e tests.
> I'd like to be able to run them from IntelliJ and debug them directly (or
> at least be able to easily, and in a well documented manner, be able to do
> remote debugging of them). Obviously, that's easier said than done, but
> what I'd like to avoid is us having essentially two different ways to do the
> same thing (spin up some of our dependency components and run code against
> them). I'm worried that's quick vs full dev all over again.  But without us
> being able to easily kill one because half of tests depend on one and half
> on the other.
>
> On Wed, Nov 29, 2017 at 1:22 AM, Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > What about just spinning up each of the components in their own process?
> > It's even lighter weight, doesn't have the complications for HDFS (you
> can
> > use the local FS easily, for example), and doesn't have any issues around
> > ports and port mapping with the containers.
> >
> > On Tue, Nov 28, 2017 at 3:48 PM, Otto Fowler <ottobackwa...@gmail.com>
> > wrote:
> >
> > > As long as there is not a large chunk of custom deployment that has to
> be
> > > maintained docker sounds ideal.
> > > I would like to understand what it would take to create the docker e2e
> > env.
> > >
> > >
> > >
> > > On November 28, 2017 at 17:27:13, Ryan Merriman (merrim...@gmail.com)
> > > wrote:
> > >
> > > Currently the e2e tests for our Alerts UI depend on full dev being up
> > and
> > > running. This is not a good long term solution because it forces a
> > > contributor/reviewer to run the tests manually with full dev running.
> It
> > > would be better if the backend services could be made available to the
> > e2e
> > > tests while running in Travis. This would allow us to add the e2e tests
> > to
> > > our automated build process.
> > >
> > > What is the right approach? Here are some options I can think of:
> > >
> > > - Use the in-memory components we use for the backend integration tests
> > > - Use a Docker approach
> > > - Use mock components designed for the e2e tests
> > >
> > > Mocking the backend would be my least favorite option because it would
> > > introduce a complex module of code that we have to maintain.
> > >
> > > The in-memory approach has some shortcomings but we may be able to
> solve
> > > some of those by moving components to their own process and spinning
> them
> > > up/down at the beginning/end of tests. Plus we are already using them.
> > >
> > > My preference would be Docker because it most closely mimics a real
> > > installation and gives you isolation, networking and dependency
> > management
> > > features OOTB. In many cases Dockerfiles are maintained and published
> by
> > a
> > > third party and require no work other than some setup like loading data
> > or
> > > templates/schemas. Elasticsearch is a good example.
> > >
> > > I believe we could make any of these approaches work in Travis. What
> does
> > > everyone think?
> > >
> > > Ryan
> > >
> >
>
