Mike, I think we are in agreement: any solution involving in-memory components would have them running in separate processes vs. a single process like they do now.
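[Editor's note: the "separate processes" approach discussed in this thread implies some lifecycle-management code. A minimal, hypothetical sketch of what that might look like — `ComponentProcess` and the `sleep` stand-in command are illustrative, not part of Metron:]

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: manage an external component in its own OS process,
// in the spirit of "spin up each component in a separate process".
public class ComponentProcess {
    private final ProcessBuilder builder;
    private Process process;

    public ComponentProcess(String... command) {
        // inheritIO so the component's logs show up in the test output
        this.builder = new ProcessBuilder(command).inheritIO();
    }

    public void start() throws Exception {
        process = builder.start();
    }

    public void stop() throws Exception {
        process.destroy();  // polite shutdown first (SIGTERM on Unix)
        if (!process.waitFor(5, TimeUnit.SECONDS)) {
            process.destroyForcibly();  // escalate if it ignores the request
        }
    }

    public boolean isRunning() {
        return process != null && process.isAlive();
    }

    public static void main(String[] args) throws Exception {
        // "sleep" is a stand-in for a real component launch script
        // (e.g. a Kafka broker); assumes a Unix-like environment.
        ComponentProcess component = new ComponentProcess("sleep", "30");
        component.start();
        System.out.println("running=" + component.isRunning());
        component.stop();
        System.out.println("running=" + component.isRunning());
    }
}
```

Even this toy version hints at the edge cases (timeouts, forced kills, crashed children) that a real process manager would have to handle for every component.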
> On Nov 29, 2017, at 9:14 AM, Michael Miklavcic <michael.miklav...@gmail.com> wrote:
>
> I understood the item on "in-memory components" to be similar to what we're
> currently doing in the integration tests, which we cannot and should not
> do. They are spun up in a single JVM process, which causes major problems
> with classpath isolation. My main point here is to be sure each component
> is separate from the others, and that they can be utilized for both the
> e2e and integration tests.
>
>> On Nov 29, 2017 8:00 AM, "Ryan Merriman" <merrim...@gmail.com> wrote:
>>
>> Thanks for the feedback so far, everyone. All good points.
>>
>> Otto, if we did decide to go down the Docker route, we could use
>> /master/metron-contrib/metron-docker as a starting point. The reason I
>> initially created that module was to support Management UI testing,
>> because full dev was unusable for that purpose at the time. This is the
>> same use case. A lot of the work has already been done, but we would need
>> to review it and bring it up to date with the current state of master.
>> Once we get it to a point where we can manually spin up the Docker
>> environment and get the e2e tests to pass, we would then need to add it
>> into our Travis workflow.
>>
>> Mike, yes, this is one of the options I listed at the start of the
>> discuss thread, although I'm not sure I agree with the Docker
>> disadvantages you list. We could use a similar approach for HDFS in
>> Docker by setting it to the local FS and creating a shared volume that
>> all the containers have access to. I've also found that Docker Compose
>> makes the networking part much easier. What other advantages would
>> in-memory components in separate processes offer us that you can think
>> of? Are there other disadvantages to using Docker?
>>
>> Justin, I think that's a really good point and I would be on board with
>> it.
I see this use case (e2e testing infrastructure) as a good way to
>> evaluate our options without making major changes across our codebase. I
>> would agree that standardizing on an approach would be ideal and
>> something we should work towards. The debugging request is also something
>> that would be extremely helpful. The only issue I see is debugging a
>> Storm topology; this would still need to be run locally using
>> LocalCluster because remote debugging does not work well in Storm (per
>> previous comments from Storm committers). At one point I was able to get
>> this to work with Docker containers, but we would definitely need to
>> revisit it and create tooling around it.
>>
>> So in summary, I think we agree on these points so far:
>>
>> - no one seems to be in favor of mocking our backend, so I'll take that
>> option off the table
>> - everyone seems to be in favor of moving to a strategy where we spin up
>> backend services at the beginning of all tests and spin them down at the
>> end, rather than spinning up/down for each class or suite of tests
>> - the ability to debug our code locally is important and something to
>> keep in mind as we evaluate our options
>>
>> I think the next step is to decide whether we pursue in-memory/separate
>> processes vs. Docker. Having used both, there are a couple of
>> disadvantages I see with the in-memory approach:
>>
>> - The in-memory components are different from real installations and
>> come with their own issues. There have been cases where an in-memory
>> component had a bug (looking at you, Kafka) that a normal installation
>> wouldn't have, and it required effort to put workarounds in place.
>> - Spinning up the in-memory components in separate processes and
>> managing their life cycles is not a trivial task. In Otto's words, I
>> believe this will inevitably become a "large chunk of custom development
>> that has to be maintained".
Docker Compose exposes a declarative
>> interface that is much simpler, in my opinion (check out
>> https://github.com/apache/metron/blob/master/metron-contrib/metron-docker/compose/docker-compose.yml
>> as an example). I also think our testing infrastructure will be more
>> accessible to outside contributors because Docker is a common skill in
>> the industry. Otherwise a contributor would have to come up to speed with
>> our custom in-memory process module before being able to make any
>> meaningful contributions.
>>
>> I can live with the first one, but the second one is a big issue IMO.
>> Even if we do decide to use the in-memory components, I think we need to
>> delegate the process management to another framework not maintained by
>> us.
>>
>> How do others feel? What other considerations are there?
>>
>> On Wed, Nov 29, 2017 at 6:59 AM, Justin Leet <justinjl...@gmail.com> wrote:
>>
>>> As an additional consideration, it would be really nice to get our
>>> current set of integration tests to be able to run on this
>>> infrastructure as well, or at least able to be converted in a known
>>> manner. Eventually, we could probably split the integration tests out
>>> from the unit tests entirely. It would likely improve the build times if
>>> we were reusing the components between test classes (keep in mind, right
>>> now we only reuse them between test cases in a given class).
>>>
>>> In my mind, ideally we have a single infra for integration and e2e
>>> tests. I'd like to be able to run them from IntelliJ and debug them
>>> directly (or at least be able to easily, and in a well-documented
>>> manner, do remote debugging of them). Obviously, that's easier said than
>>> done, but what I'd like to avoid is us having essentially two different
>>> ways to do the same thing (spin up some of our dependency components and
>>> run code against them). I'm worried that's quick vs. full dev all over
>>> again.
But without
>>> us being able to easily kill one, because half of the tests depend on
>>> one and half on the other.
>>>
>>> On Wed, Nov 29, 2017 at 1:22 AM, Michael Miklavcic <
>>> michael.miklav...@gmail.com> wrote:
>>>
>>>> What about just spinning up each of the components in their own
>>>> process? It's even lighter weight, doesn't have the complications for
>>>> HDFS (you can use the local FS easily, for example), and doesn't have
>>>> any issues around ports and port mapping with the containers.
>>>>
>>>> On Tue, Nov 28, 2017 at 3:48 PM, Otto Fowler <ottobackwa...@gmail.com>
>>>> wrote:
>>>>
>>>>> As long as there is not a large chunk of custom deployment that has to
>>>>> be maintained, Docker sounds ideal.
>>>>> I would like to understand what it would take to create the Docker e2e
>>>>> env.
>>>>>
>>>>> On November 28, 2017 at 17:27:13, Ryan Merriman (merrim...@gmail.com)
>>>>> wrote:
>>>>>
>>>>> Currently the e2e tests for our Alerts UI depend on full dev being up
>>>>> and running. This is not a good long-term solution because it forces a
>>>>> contributor/reviewer to run the tests manually with full dev running.
>>>>> It would be better if the backend services could be made available to
>>>>> the e2e tests while running in Travis. This would allow us to add the
>>>>> e2e tests to our automated build process.
>>>>>
>>>>> What is the right approach? Here are some options I can think of:
>>>>>
>>>>> - Use the in-memory components we use for the backend integration tests
>>>>> - Use a Docker approach
>>>>> - Use mock components designed for the e2e tests
>>>>>
>>>>> Mocking the backend would be my least favorite option because it would
>>>>> introduce a complex module of code that we have to maintain.
>>>>>
>>>>> The in-memory approach has some shortcomings, but we may be able to
>>>>> solve some of those by moving components to their own processes and
>>>>> spinning them up/down at the beginning/end of tests. Plus, we are
>>>>> already using them.
>>>>>
>>>>> My preference would be Docker because it most closely mimics a real
>>>>> installation and gives you isolation, networking, and dependency
>>>>> management features OOTB. In many cases Dockerfiles are maintained and
>>>>> published by a third party and require no work other than some setup,
>>>>> like loading data or templates/schemas. Elasticsearch is a good
>>>>> example.
>>>>>
>>>>> I believe we could make any of these approaches work in Travis. What
>>>>> does everyone think?
>>>>>
>>>>> Ryan
>>>>>
>>>>
>>>
>>
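[Editor's note: the "shared volume in place of HDFS" idea raised in the thread could look roughly like this in Docker Compose. This is a minimal, hypothetical sketch; the service names, image tags, and mount path are illustrative, not taken from the real metron-docker compose file linked above.]

```yaml
# Hypothetical sketch: a named volume shared across containers stands in
# for HDFS when components are configured to write to the local FS.
version: '2'
services:
  elasticsearch:
    image: elasticsearch:2.3.3        # assumed tag; the real file may differ
    ports:
      - "9200:9200"
  storm:
    image: example/metron-storm       # illustrative image name
    volumes:
      - shared-fs:/data               # writers target file:///data instead of hdfs://
    depends_on:
      - elasticsearch
volumes:
  shared-fs: {}
```

Because every service declares the same named volume, files written by one container are immediately visible to the others, which is the property the thread wants from HDFS without running a NameNode/DataNode in the test environment.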