Thanks for the feedback so far everyone. All good points. Otto, if we did decide to go down the Docker route, we could use /master/metron-contrib/metron-docker as a starting point. The reason I initially created that module was to support Management UI testing, because full dev was unusable for that purpose at the time. This is the same use case. A lot of the work has already been done, but we would need to review it and bring it up to date with the current state of master. Once we get to a point where we can manually spin up the Docker environment and get the e2e tests to pass, we would then need to add it to our Travis workflow.
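To make the Docker option concrete, here is a rough sketch of what a compose file for the e2e environment could look like. To be clear, the service names, images, and the shared volume standing in for HDFS below are illustrative placeholders, not the actual contents of metron-docker:

```yaml
# Illustrative sketch only: images, versions, and paths are placeholders,
# not the real metron-docker compose file.
version: '2'
services:
  zookeeper:
    image: zookeeper:3.4
  kafka:
    image: wurstmeister/kafka          # hypothetical image choice
    depends_on:
      - zookeeper
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:5.6.2
    environment:
      - discovery.type=single-node
  metron:
    build: ./metron                    # hypothetical local build context
    depends_on:
      - kafka
      - elasticsearch
    volumes:
      # Instead of an HDFS container, components would be configured to use
      # the local FS, with a named volume shared between containers.
      - metron-data:/data
volumes:
  metron-data:
```

The shared named volume is the piece that would let us avoid an HDFS container entirely, per the local-FS idea discussed below.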
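For the Travis piece, once the Docker environment can be spun up reliably by hand, wiring it into the build could be along these lines. This is a hypothetical sketch: the helper script and the `npm run e2e` entry point are assumptions, not existing tooling:

```yaml
# Hypothetical .travis.yml fragment; script names are placeholders.
sudo: required
services:
  - docker
before_script:
  - docker-compose -f metron-contrib/metron-docker/compose/docker-compose.yml up -d
  - ./wait-for-services.sh   # hypothetical helper that polls until services are healthy
script:
  - npm run e2e              # assumes the existing e2e test entry point
after_script:
  - docker-compose -f metron-contrib/metron-docker/compose/docker-compose.yml down
```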
Mike, yes this is one of the options I listed at the start of the discuss thread, although I'm not sure I agree with the Docker disadvantages you list. We could take a similar approach to HDFS in Docker by setting it to the local FS and creating a shared volume that all the containers have access to. I've also found that Docker Compose makes the networking part much easier. What other advantages can you think of that in-memory components in separate processes would offer us? Are there other disadvantages to using Docker?

Justin, I think that's a really good point and I would be on board with it. I see this use case (e2e testing infrastructure) as a good way to evaluate our options without making major changes across our codebase. I agree that standardizing on an approach would be ideal and something we should work towards. The debugging request is also something that would be extremely helpful. The only issue I see is debugging a Storm topology; this would still need to be run locally using LocalCluster because remote debugging does not work well in Storm (per previous comments from Storm committers). At one point I was able to get this to work with Docker containers, but we would definitely need to revisit it and create tooling around it.

So in summary, I think we agree on these points so far:

- no one seems to be in favor of mocking our backend, so I'll take that option off the table
- everyone seems to be in favor of moving to a strategy where we spin up backend services at the beginning of all tests and spin them down at the end, rather than spinning up/down for each class or suite of tests
- the ability to debug our code locally is important and something to keep in mind as we evaluate our options

I think the next step is to decide whether we pursue in-memory/separate processes vs Docker. Having used both, there are a couple of disadvantages I see with the in-memory approach:

- The in-memory components are different from real installations and come with their own issues.
There have been cases where an in-memory component had a bug (looking at you, Kafka) that a normal installation wouldn't have, and it took effort to put workarounds in place.
- Spinning up the in-memory components in separate processes and managing their life cycles is not a trivial task. In Otto's words, I believe this will inevitably become a "large chunk of custom development that has to be maintained". Docker Compose exposes a declarative interface that is much simpler in my opinion (see https://github.com/apache/metron/blob/master/metron-contrib/metron-docker/compose/docker-compose.yml as an example). I also think our testing infrastructure will be more accessible to outside contributors because Docker is a common skill in the industry. Otherwise a contributor would have to come up to speed with our custom in-memory process module before being able to make any meaningful contributions.

I can live with the first one, but the second one is a big issue IMO. Even if we do decide to use the in-memory components, I think we need to delegate the process management to another framework not maintained by us.

How do others feel? What other considerations are there?

On Wed, Nov 29, 2017 at 6:59 AM, Justin Leet <justinjl...@gmail.com> wrote:

> As an additional consideration, it would be really nice to get our current
> set of integration tests to be able to be run on this infrastructure as
> well. Or at least able to be converted in a known manner. Eventually, we
> could probably split out the integration tests from the unit tests
> entirely. It would likely improve the build times if we were reusing the
> components between test classes (keep in mind right now, we only reuse
> between test cases in a given class).
>
> In my mind, ideally we have a single infra for integration and e2e tests.
> I'd like to be able to run them from IntelliJ and debug them directly (or
> at least be able, easily and in a well-documented manner, to do remote
> debugging of them). Obviously, that's easier said than done, but what I'd
> like to avoid is us having essentially two different ways to do the same
> thing (spin up some of our dependency components and run code against
> them). I'm worried that's quick vs full dev all over again, but without us
> being able to easily kill one, because half of the tests depend on one and
> half on the other.
>
> On Wed, Nov 29, 2017 at 1:22 AM, Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > What about just spinning up each of the components in their own process?
> > It's even lighter weight, doesn't have the complications for HDFS (you
> > can use the local FS easily, for example), and doesn't have any issues
> > around ports and port mapping with the containers.
> >
> > On Tue, Nov 28, 2017 at 3:48 PM, Otto Fowler <ottobackwa...@gmail.com>
> > wrote:
> >
> > > As long as there is not a large chunk of custom deployment that has
> > > to be maintained, docker sounds ideal.
> > > I would like to understand what it would take to create the docker
> > > e2e env.
> > >
> > > On November 28, 2017 at 17:27:13, Ryan Merriman (merrim...@gmail.com)
> > > wrote:
> > >
> > > Currently the e2e tests for our Alerts UI depend on full dev being up
> > > and running. This is not a good long-term solution because it forces a
> > > contributor/reviewer to run the tests manually with full dev running.
> > > It would be better if the backend services could be made available to
> > > the e2e tests while running in Travis. This would allow us to add the
> > > e2e tests to our automated build process.
> > >
> > > What is the right approach?
> > > Here are some options I can think of:
> > >
> > > - Use the in-memory components we use for the backend integration tests
> > > - Use a Docker approach
> > > - Use mock components designed for the e2e tests
> > >
> > > Mocking the backend would be my least favorite option because it would
> > > introduce a complex module of code that we have to maintain.
> > >
> > > The in-memory approach has some shortcomings, but we may be able to
> > > solve some of those by moving components to their own processes and
> > > spinning them up/down at the beginning/end of tests. Plus, we are
> > > already using them.
> > >
> > > My preference would be Docker because it most closely mimics a real
> > > installation and gives you isolation, networking, and dependency
> > > management features OOTB. In many cases Dockerfiles are maintained and
> > > published by a third party and require no work other than some setup,
> > > like loading data or templates/schemas. Elasticsearch is a good
> > > example.
> > >
> > > I believe we could make any of these approaches work in Travis. What
> > > does everyone think?
> > >
> > > Ryan