Mike, I think we are in agreement: any solution involving in-memory components would have them running in separate processes vs. a single process like they do now.
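[Editor's note: the "separate processes" approach discussed in this thread implies some lifecycle-management code. A minimal, hypothetical sketch of what that might look like — `ComponentProcess` and the `sleep` stand-in command are illustrative, not part of Metron:]

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: manage an external component in its own OS process,
// in the spirit of "spin up each component in a separate process".
public class ComponentProcess {
    private final ProcessBuilder builder;
    private Process process;

    public ComponentProcess(String... command) {
        // inheritIO so the component's logs show up in the test output
        this.builder = new ProcessBuilder(command).inheritIO();
    }

    public void start() throws Exception {
        process = builder.start();
    }

    public void stop() throws Exception {
        process.destroy();  // polite shutdown first (SIGTERM on Unix)
        if (!process.waitFor(5, TimeUnit.SECONDS)) {
            process.destroyForcibly();  // escalate if it ignores the request
        }
    }

    public boolean isRunning() {
        return process != null && process.isAlive();
    }

    public static void main(String[] args) throws Exception {
        // "sleep" is a stand-in for a real component launch script
        // (e.g. a Kafka broker); assumes a Unix-like environment.
        ComponentProcess component = new ComponentProcess("sleep", "30");
        component.start();
        System.out.println("running=" + component.isRunning());
        component.stop();
        System.out.println("running=" + component.isRunning());
    }
}
```

Even this toy version hints at the edge cases (timeouts, forced kills, crashed children) that a real process manager would have to handle for every component.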
> On Nov 29, 2017, at 9:14 AM, Michael Miklavcic <michael.miklav...@gmail.com> wrote:
>
> I understood the item on "in-memory components" to be similar to what we're
> currently doing in the integration tests, which we cannot and should not
> do. They are spun up in a single JVM process, which causes major problems
> with classpath isolation. My main point here is to be sure each component
> is separate from the others, and that they can be utilized for both the
> e2e and integration tests.
>
>> On Nov 29, 2017 8:00 AM, "Ryan Merriman" <merrim...@gmail.com> wrote:
>>
>> Thanks for the feedback so far, everyone. All good points.
>>
>> Otto, if we did decide to go down the Docker route, we could use
>> /master/metron-contrib/metron-docker as a starting point. The reason I
>> initially created that module was to support Management UI testing,
>> because full dev was unusable for that purpose at the time. This is the
>> same use case. A lot of the work has already been done, but we would need
>> to review it and bring it up to date with the current state of master.
>> Once we get it to a point where we can manually spin up the Docker
>> environment and get the e2e tests to pass, we would then need to add it
>> into our Travis workflow.
>>
>> Mike, yes, this is one of the options I listed at the start of the
>> discuss thread, although I'm not sure I agree with the Docker
>> disadvantages you list. We could use a similar approach for HDFS in
>> Docker by setting it to the local FS and creating a shared volume that
>> all the containers have access to. I've also found that Docker Compose
>> makes the networking part much easier. What other advantages would
>> in-memory components in separate processes offer us that you can think
>> of? Are there other disadvantages to using Docker?
>>
>> Justin, I think that's a really good point and I would be on board with
>> it.
I see this use case (e2e testing infrastructure) as a good way to
>> evaluate our options without making major changes across our codebase. I
>> would agree that standardizing on an approach would be ideal and
>> something we should work towards. The debugging request is also something
>> that would be extremely helpful. The only issue I see is debugging a
>> Storm topology; this would still need to be run locally using
>> LocalCluster because remote debugging does not work well in Storm (per
>> previous comments from Storm committers). At one point I was able to get
>> this to work with Docker containers, but we would definitely need to
>> revisit it and create tooling around it.
>>
>> So in summary, I think we agree on these points so far:
>>
>> - no one seems to be in favor of mocking our backend, so I'll take that
>> option off the table
>> - everyone seems to be in favor of moving to a strategy where we spin up
>> backend services at the beginning of all tests and spin them down at the
>> end, rather than spinning up/down for each class or suite of tests
>> - the ability to debug our code locally is important and something to
>> keep in mind as we evaluate our options
>>
>> I think the next step is to decide whether we pursue in-memory/separate
>> processes vs. Docker. Having used both, there are a couple of
>> disadvantages I see with the in-memory approach:
>>
>> - The in-memory components are different from real installations and
>> come with their own issues. There have been cases where an in-memory
>> component had a bug (looking at you, Kafka) that a normal installation
>> wouldn't have, and it required effort to put workarounds in place.
>> - Spinning up the in-memory components in separate processes and
>> managing their life cycles is not a trivial task. In Otto's words, I
>> believe this will inevitably become a "large chunk of custom development
>> that has to be maintained".
Docker Compose exposes a declarative
>> interface that is much simpler, in my opinion (check out
>> https://github.com/apache/metron/blob/master/metron-contrib/metron-docker/compose/docker-compose.yml
>> as an example). I also think our testing infrastructure will be more
>> accessible to outside contributors because Docker is a common skill in
>> the industry. Otherwise a contributor would have to come up to speed with
>> our custom in-memory process module before being able to make any
>> meaningful contributions.
>>
>> I can live with the first one, but the second one is a big issue IMO.
>> Even if we do decide to use the in-memory components, I think we need to
>> delegate the process management to another framework not maintained by
>> us.
>>
>> How do others feel? What other considerations are there?
>>
>> On Wed, Nov 29, 2017 at 6:59 AM, Justin Leet <justinjl...@gmail.com> wrote:
>>
>>> As an additional consideration, it would be really nice to get our
>>> current set of integration tests to be able to run on this
>>> infrastructure as well, or at least able to be converted in a known
>>> manner. Eventually, we could probably split the integration tests out
>>> from the unit tests entirely. It would likely improve the build times if
>>> we were reusing the components between test classes (keep in mind, right
>>> now we only reuse them between test cases in a given class).
>>>
>>> In my mind, ideally we have a single infra for integration and e2e
>>> tests. I'd like to be able to run them from IntelliJ and debug them
>>> directly (or at least be able to easily, and in a well-documented
>>> manner, do remote debugging of them). Obviously, that's easier said than
>>> done, but what I'd like to avoid is us having essentially two different
>>> ways to do the same thing (spin up some of our dependency components and
>>> run code against them). I'm worried that's quick vs. full dev all over
>>> again.
But without
>>> us being able to easily kill one, because half of the tests depend on
>>> one and half on the other.
>>>
>>> On Wed, Nov 29, 2017 at 1:22 AM, Michael Miklavcic <
>>> michael.miklav...@gmail.com> wrote:
>>>
>>>> What about just spinning up each of the components in their own
>>>> process? It's even lighter weight, doesn't have the complications for
>>>> HDFS (you can use the local FS easily, for example), and doesn't have
>>>> any issues around ports and port mapping with the containers.
>>>>
>>>> On Tue, Nov 28, 2017 at 3:48 PM, Otto Fowler <ottobackwa...@gmail.com>
>>>> wrote:
>>>>
>>>>> As long as there is not a large chunk of custom deployment that has to
>>>>> be maintained, Docker sounds ideal.
>>>>> I would like to understand what it would take to create the Docker e2e
>>>>> env.
>>>>>
>>>>> On November 28, 2017 at 17:27:13, Ryan Merriman (merrim...@gmail.com)
>>>>> wrote:
>>>>>
>>>>> Currently the e2e tests for our Alerts UI depend on full dev being up
>>>>> and running. This is not a good long-term solution because it forces a
>>>>> contributor/reviewer to run the tests manually with full dev running.
>>>>> It would be better if the backend services could be made available to
>>>>> the e2e tests while running in Travis. This would allow us to add the
>>>>> e2e tests to our automated build process.
>>>>>
>>>>> What is the right approach? Here are some options I can think of:
>>>>>
>>>>> - Use the in-memory components we use for the backend integration tests
>>>>> - Use a Docker approach
>>>>> - Use mock components designed for the e2e tests
>>>>>
>>>>> Mocking the backend would be my least favorite option because it would
>>>>> introduce a complex module of code that we have to maintain.
>>>>>
>>>>> The in-memory approach has some shortcomings, but we may be able to
>>>>> solve some of those by moving components to their own processes and
>>>>> spinning them up/down at the beginning/end of tests. Plus, we are
>>>>> already using them.
>>>>>
>>>>> My preference would be Docker because it most closely mimics a real
>>>>> installation and gives you isolation, networking, and dependency
>>>>> management features OOTB. In many cases Dockerfiles are maintained and
>>>>> published by a third party and require no work other than some setup,
>>>>> like loading data or templates/schemas. Elasticsearch is a good
>>>>> example.
>>>>>
>>>>> I believe we could make any of these approaches work in Travis. What
>>>>> does everyone think?
>>>>>
>>>>> Ryan
>>>>>
>>>>
>>>
>>
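[Editor's note: the "shared volume in place of HDFS" idea raised in the thread could look roughly like this in Docker Compose. This is a minimal, hypothetical sketch; the service names, image tags, and mount path are illustrative, not taken from the real metron-docker compose file linked above.]

```yaml
# Hypothetical sketch: a named volume shared across containers stands in
# for HDFS when components are configured to write to the local FS.
version: '2'
services:
  elasticsearch:
    image: elasticsearch:2.3.3        # assumed tag; the real file may differ
    ports:
      - "9200:9200"
  storm:
    image: example/metron-storm       # illustrative image name
    volumes:
      - shared-fs:/data               # writers target file:///data instead of hdfs://
    depends_on:
      - elasticsearch
volumes:
  shared-fs: {}
```

Because every service declares the same named volume, files written by one container are immediately visible to the others, which is the property the thread wants from HDFS without running a NameNode/DataNode in the test environment.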