So the issue with metron-docker is that it is all custom setup for the Metron
components, and understanding how to maintain it when you make changes to
the system is difficult for developers.
This is a particular issue for me, because I would have to rewrite a big
chunk of it to accommodate 777.

I would feel better about using Docker if each Docker container only had the
base services and did not require a separate but parallel deployment path to
Ambari. That is to say, if the Docker components were functionally equivalent
to the in-memory components and limited to their functionality and usage. I
apologize if that is in fact what you are getting at.

Then we could move the integrations and e2e to them.



On November 29, 2017 at 10:00:20, Ryan Merriman (merrim...@gmail.com) wrote:

Thanks for the feedback so far, everyone. All good points.

Otto, if we did decide to go down the Docker route, we could
use /master/metron-contrib/metron-docker as a starting point. The reason I
initially created that module was to support Management UI testing, because
full dev was unusable for that purpose at the time. This is the same use
case. A lot of the work has already been done, but we would need to review
it and bring it up to date with the current state of master. Once we get
it to a point where we can manually spin up the Docker environment and get
the e2e tests to pass, we would then need to add it to our Travis
workflow.
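
Wiring the compose environment into Travis could look something like the
fragment below. This is only a sketch of the idea, not working config: the
compose file path matches the metron-docker module linked later in this
thread, but the e2e script name and UI module path are assumptions.

```yaml
# Hypothetical .travis.yml fragment -- a sketch, not actual project config.
# The npm script name and metron-alerts path are assumptions.
services:
  - docker

before_script:
  # Bring up the backend containers in the background
  - docker-compose -f metron-contrib/metron-docker/compose/docker-compose.yml up -d

script:
  # Run the e2e suite against the containerized backend
  - cd metron-interface/metron-alerts && npm run e2e
```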

Mike, yes, this is one of the options I listed at the start of the discuss
thread, although I'm not sure I agree with the Docker disadvantages you
list. We could use a similar approach for HDFS in Docker by setting it to
the local FS and creating a shared volume that all the containers have
access to. I've also found that Docker Compose makes the networking part
much easier. What other advantages would in-memory components in separate
processes offer that you can think of? Are there other disadvantages to
using Docker?
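
The shared-volume idea above can be sketched in Compose roughly as follows;
service names, images, and mount paths here are purely illustrative
assumptions, not anything from the actual metron-docker module:

```yaml
# Sketch only: a named volume standing in for HDFS, mounted by every service.
version: '2'

services:
  storm:
    image: metron/storm        # hypothetical image name
    volumes:
      - metron-data:/data      # all services read/write the same files
  indexing:
    image: metron/indexing     # hypothetical image name
    volumes:
      - metron-data:/data

volumes:
  metron-data:                 # shared named volume replacing HDFS
```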

Justin, I think that's a really good point and I would be on board with
it. I see this use case (e2e testing infrastructure) as a good way to
evaluate our options without making major changes across our codebase. I
would agree that standardizing on an approach would be ideal and something
we should work towards. The debugging request is also something that would
be extremely helpful. The only issue I see is debugging a Storm topology:
that would still need to be run locally using LocalCluster, because remote
debugging does not work well in Storm (per previous comments from Storm
committers). At one point I was able to get this working with Docker
containers, but we would definitely need to revisit it and create tooling
around it.
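
For the non-Storm services, remote debugging from a container is mostly a
matter of passing the standard JDWP agent flags to the JVM and publishing
the debug port. A hedged sketch (the service name, JAVA_OPTS handling, and
port choice are assumptions):

```yaml
# Sketch: enabling remote JVM debugging for a containerized service.
# Service name, image, and environment-variable handling are assumptions.
services:
  metron-rest:
    image: metron/rest         # hypothetical image name
    environment:
      # JDWP agent: listen on 5005, don't suspend the JVM on startup
      - JAVA_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005
    ports:
      - "5005:5005"            # expose the debug port to the host
```

An IDE could then attach a remote-debug configuration to localhost:5005.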

So in summary, I think we agree on these points so far:

- no one seems to be in favor of mocking our backend so I'll take that
option off the table
- everyone seems to be in favor of moving to a strategy where we spin up
backend services at the beginning of all tests and spin them down at the
end, rather than spinning up/down for each class or suite of tests
- the ability to debug our code locally is important and something to
keep in mind as we evaluate our options

I think the next step is to decide whether we pursue the in-memory/separate
process approach or Docker. Having used both, there are a couple of
disadvantages I see with the in-memory approach:

- The in-memory components are different from real installations and
come with their own issues. There have been cases where an in-memory
component had a bug (looking at you, Kafka) that a normal installation
wouldn't have, requiring effort to put workarounds in place.
- Spinning up the in-memory components in separate processes and
managing their life cycles is not a trivial task. In Otto's words, I
believe this will inevitably become a "large chunk of custom development
that has to be maintained". Docker Compose exposes a declarative interface
that is much simpler in my opinion (see
https://github.com/apache/metron/blob/master/metron-contrib/metron-docker/compose/docker-compose.yml
as an example). I also think our testing infrastructure would be more
accessible to outside contributors, because Docker is a common skill in the
industry. Otherwise a contributor would have to come up to speed with our
custom in-memory process module before being able to make any meaningful
contributions.

I can live with the first one, but the second one is a big issue IMO. Even
if we do decide to use the in-memory components, I think we need to delegate
the process management to another framework that we don't maintain ourselves.

How do others feel? What other considerations are there?

On Wed, Nov 29, 2017 at 6:59 AM, Justin Leet <justinjl...@gmail.com> wrote:

> As an additional consideration, it would be really nice to get our current
> set of integration tests to be able to be run on this infrastructure as
> well. Or at least able to be converted in a known manner. Eventually, we
> could probably split out the integration tests from the unit tests
> entirely. It would likely improve the build times if we were reusing the
> components between test classes (keep in mind that right now we only reuse
> between test cases in a given class).
>
> In my mind, ideally we have a single infra for integration and e2e tests.
> I'd like to be able to run them from IntelliJ and debug them directly (or
> at least be able, in a well-documented manner, to do remote debugging of
> them). Obviously, that's easier said than done, but what I'd like to avoid
> is us having essentially two different ways to do the same thing (spin up
> some of our dependency components and run code against them). I'm worried
> that's quick vs full dev all over again, but without us being able to
> easily kill one, because half of the tests depend on one and half on the
> other.
>
> On Wed, Nov 29, 2017 at 1:22 AM, Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > What about just spinning up each of the components in their own
> > process? It's even lighter weight, doesn't have the complications for
> > HDFS (you can use the local FS easily, for example), and doesn't have
> > any issues around ports and port mapping with the containers.
> >
> > On Tue, Nov 28, 2017 at 3:48 PM, Otto Fowler <ottobackwa...@gmail.com>
> > wrote:
> >
> > > As long as there is not a large chunk of custom deployment that has
> > > to be maintained, Docker sounds ideal.
> > > I would like to understand what it would take to create the Docker
> > > e2e env.
> > >
> > >
> > >
> > > On November 28, 2017 at 17:27:13, Ryan Merriman (merrim...@gmail.com)
> > > wrote:
> > >
> > > Currently the e2e tests for our Alerts UI depend on full dev being up
> > > and running. This is not a good long-term solution because it forces a
> > > contributor/reviewer to run the tests manually with full dev running.
> > > It would be better if the backend services could be made available to
> > > the e2e tests while running in Travis. This would allow us to add the
> > > e2e tests to our automated build process.
> > >
> > > What is the right approach? Here are some options I can think of:
> > >
> > > - Use the in-memory components we use for the backend integration
> > > tests
> > > - Use a Docker approach
> > > - Use mock components designed for the e2e tests
> > >
> > > Mocking the backend would be my least favorite option because it
> > > would introduce a complex module of code that we have to maintain.
> > >
> > > The in-memory approach has some shortcomings but we may be able to
> > > solve some of those by moving components to their own processes and
> > > spinning them up/down at the beginning/end of tests. Plus we are
> > > already using them.
> > >
> > > My preference would be Docker because it most closely mimics a real
> > > installation and gives you isolation, networking and dependency
> > > management features OOTB. In many cases Dockerfiles are maintained
> > > and published by a third party and require no work other than some
> > > setup, like loading data or templates/schemas. Elasticsearch is a
> > > good example.
> > >
> > > I believe we could make any of these approaches work in Travis. What
> > > does everyone think?
> > >
> > > Ryan
> > >
> >
>
