We will also need ES and Metron REST containers for the e2e tests, but you get the idea. I think having the tests be responsible for setup will work.
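As a rough sketch of what the Compose side of that could look like, something like the following might be enough for the e2e suite to target. The images, tags, and ports here are illustrative assumptions, not the actual metron-docker definitions:

```yaml
# Hypothetical docker-compose.yml for the e2e backend.
# Images and tags are placeholders, not the real metron-docker setup.
version: '2'
services:
  zookeeper:
    image: zookeeper:3.4            # assumed image/tag
    ports:
      - "2181:2181"
  kafka:
    image: wurstmeister/kafka       # assumed image; any Kafka image would do
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    ports:
      - "9092:9092"
    depends_on:
      - zookeeper
  elasticsearch:
    image: elasticsearch:5.6        # assumed image/tag
    ports:
      - "9200:9200"
  metron-rest:
    build: ./metron-rest            # hypothetical local Dockerfile
    ports:
      - "8082:8082"
    depends_on:
      - kafka
      - elasticsearch
  # An HDFS container could be added here, or the components could be
  # pointed at the local FS via a shared volume instead.
```

The tests themselves would then be responsible for loading templates, creating topics, deploying whatever they need, and cleaning up afterwards.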
Maybe the next step is to build a simple example and let everyone try it out. If we don't like it, we can throw it away.

On Wed, Nov 29, 2017 at 1:28 PM, Otto Fowler <ottobackwa...@gmail.com> wrote:

So we will just have a:

ZK container
Kafka container
HDFS container

and not deploy any Metron stuff to them in the Docker setup? The test itself will deploy what it needs and clean up?

On November 29, 2017 at 11:53:46, Ryan Merriman (merrim...@gmail.com) wrote:

"I would feel better using docker if each docker container only had the base services, and did not require a separate but parallel deployment path to ambari"

This is exactly how it works. There is a container for each base service, just like we now have an in-memory component for each base service. There is also no deployment path to Ambari; Ambari is not involved at all.

From a client perspective (our e2e/integration tests in this case), there really is not much of a difference. At the end of the day, services are up and running and available on various ports.

Also, there is going to be maintenance required no matter what approach we decide on. If we add another ES template that needs to be loaded by the MPack, our e2e/integration test infrastructure will also have to load that template. I have had to do this with our current integration tests.

On Nov 29, 2017, at 9:38 AM, Otto Fowler <ottobackwa...@gmail.com> wrote:

The issue with metron-docker is that it is all custom setup for the Metron components, and understanding how to maintain it when you make changes to the system is difficult for developers. This is a particular issue for me, because I would have to rewrite a big chunk of it to accommodate 777.

I would feel better using Docker if each container only had the base services and did not require a separate but parallel deployment path to Ambari. That is to say, the Docker components would be functionally equivalent to the in-memory components and limited to their functionality and usage. I apologize if that is in fact what you are getting at.

Then we could move the integration and e2e tests to them.

On November 29, 2017 at 10:00:20, Ryan Merriman (merrim...@gmail.com) wrote:

Thanks for the feedback so far, everyone. All good points.

Otto, if we did decide to go down the Docker route, we could use /master/metron-contrib/metron-docker as a starting point. The reason I initially created that module was to support Management UI testing, because full dev was unusable for that purpose at the time. This is the same use case. A lot of the work has already been done, but we would need to review it and bring it up to date with the current state of master. Once we get it to a point where we can manually spin up the Docker environment and get the e2e tests to pass, we would then need to add it into our Travis workflow.

Mike, yes, this is one of the options I listed at the start of the discuss thread, although I'm not sure I agree with the Docker disadvantages you list. We could use a similar approach for HDFS in Docker by setting it to local FS and creating a shared volume that all the containers have access to. I've also found that Docker Compose makes the networking part much easier.
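To make that shared-volume idea concrete, a hedged Compose fragment (the volume and mount names are made up for illustration):

```yaml
# Fragment only: a named volume standing in for HDFS, with the components
# configured to write to the local FS. All names here are illustrative.
services:
  metron-rest:
    volumes:
      - metron-data:/var/metron/data
  e2e-tests:
    volumes:
      - metron-data:/var/metron/data
volumes:
  metron-data:
```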
What other advantages would running the in-memory components in separate processes offer? Are there other disadvantages to using Docker?

Justin, I think that's a really good point and I would be on board with it. I see this use case (e2e testing infrastructure) as a good way to evaluate our options without making major changes across our codebase. I would agree that standardizing on an approach would be ideal and something we should work towards. The debugging request is also something that would be extremely helpful. The only issue I see is debugging a Storm topology; that would still need to be run locally using LocalCluster, because remote debugging does not work well in Storm (per previous comments from Storm committers). At one point I was able to get this to work with Docker containers, but we would definitely need to revisit it and create tooling around it.

So in summary, I think we agree on these points so far:

- no one seems to be in favor of mocking our backend, so I'll take that option off the table
- everyone seems to be in favor of moving to a strategy where we spin up backend services at the beginning of all tests and spin them down at the end, rather than spinning up/down for each class or suite of tests
- the ability to debug our code locally is important and something to keep in mind as we evaluate our options

I think the next step is to decide whether we pursue in-memory/separate processes vs. Docker. Having used both, there are a couple of disadvantages I see with the in-memory approach:

- The in-memory components are different from real installations and come with their own issues. There have been cases where an in-memory component had a bug (looking at you, Kafka) that a normal installation wouldn't have, and it required effort to put workarounds in place.
- Spinning up the in-memory components in separate processes and managing their life cycles is not a trivial task. In Otto's words, I believe this will inevitably become a "large chunk of custom development that has to be maintained". Docker Compose exposes a declarative interface that is much simpler in my opinion (see https://github.com/apache/metron/blob/master/metron-contrib/metron-docker/compose/docker-compose.yml as an example). I also think our testing infrastructure would be more accessible to outside contributors, because Docker is a common skill in the industry. Otherwise, a contributor would have to come up to speed with our custom in-memory process module before being able to make any meaningful contribution.

I can live with the first one, but the second one is a big issue IMO. Even if we do decide to use the in-memory components, I think we need to delegate the process management to another framework not maintained by us.

How do others feel? What other considerations are there?

On Wed, Nov 29, 2017 at 6:59 AM, Justin Leet <justinjl...@gmail.com> wrote:

As an additional consideration, it would be really nice for our current set of integration tests to be able to run on this infrastructure as well, or at least to be convertible in a known manner. Eventually, we could probably split out the integration tests from the unit tests entirely.
It would likely improve build times if we were reusing the components between test classes (keep in mind that right now, we only reuse between test cases in a given class).

In my mind, ideally we have a single infrastructure for integration and e2e tests. I'd like to be able to run them from IntelliJ and debug them directly (or at least be able to do remote debugging of them easily and in a well-documented manner). Obviously, that's easier said than done, but what I'd like to avoid is us having essentially two different ways to do the same thing (spin up some of our dependency components and run code against them). I'm worried that's quick dev vs. full dev all over again, but without us being able to easily kill one, because half of the tests depend on one and half on the other.

On Wed, Nov 29, 2017 at 1:22 AM, Michael Miklavcic <michael.miklav...@gmail.com> wrote:

What about just spinning up each of the components in its own process? It's even lighter weight, doesn't have the complications for HDFS (you can easily use the local FS, for example), and doesn't have any issues around ports and port mapping with containers.

On Tue, Nov 28, 2017 at 3:48 PM, Otto Fowler <ottobackwa...@gmail.com> wrote:

As long as there is not a large chunk of custom deployment that has to be maintained, Docker sounds ideal. I would like to understand what it would take to create the Docker e2e environment.

On November 28, 2017 at 17:27:13, Ryan Merriman (merrim...@gmail.com) wrote:

Currently the e2e tests for our Alerts UI depend on full dev being up and running. This is not a good long-term solution, because it forces a contributor/reviewer to run the tests manually with full dev running. It would be better if the backend services could be made available to the e2e tests while running in Travis. This would allow us to add the e2e tests to our automated build process.

What is the right approach? Here are some options I can think of:

- Use the in-memory components we use for the backend integration tests
- Use a Docker approach
- Use mock components designed for the e2e tests

Mocking the backend would be my least favorite option, because it would introduce a complex module of code that we have to maintain.

The in-memory approach has some shortcomings, but we may be able to solve some of those by moving components into their own processes and spinning them up/down at the beginning/end of the tests. Plus, we are already using them.

My preference would be Docker, because it most closely mimics a real installation and gives you isolation, networking, and dependency management features OOTB. In many cases, Dockerfiles are maintained and published by a third party and require no work other than some setup like loading data or templates/schemas. Elasticsearch is a good example.
I believe we could make any of these approaches work in Travis. What does everyone think?

Ryan
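For what the Travis hookup mentioned above might look like, here is a rough .travis.yml sketch; the compose file path comes from the link earlier in the thread, but the Maven goal and the overall wiring are assumptions:

```yaml
# Rough sketch of wiring the Docker option into Travis; the test command
# is an assumption, not an existing build target.
sudo: required
services:
  - docker
before_script:
  # spin the backend containers up once, before the whole e2e suite runs
  - docker-compose -f metron-contrib/metron-docker/compose/docker-compose.yml up -d
script:
  - mvn verify   # hypothetical: whatever target runs the e2e tests
after_script:
  # tear everything down at the end of the build
  - docker-compose -f metron-contrib/metron-docker/compose/docker-compose.yml down
```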