Hi,
I would like to submit a proposal to improve the way we test CouchDB. Here is the link to the rendered version: https://rawgit.com/iilyak/e65b4ddf8a46416f84be/raw/4c8dff91d51fcfd6e5a7f65b7e9b0f8c96ee3a61/proposal.html

Below is the source of the proposal

------------------------------

# Testing couchdb

It is currently quite hard to test CouchDB. The main difficulties are complex setup functions and weak isolation of test cases. There are some functions in the `test_util` module which simplify setup a little, but not by much. The purpose of this proposal is to define some requirements for a testing infrastructure and to propose a solution which satisfies most of them.

## Why is it so hard to write certain kinds of tests?

- Setup/teardown overhead.
- Difficult-to-reproduce failure modes.
- Tests which require clusters rather than single nodes.
- Tests for issues that only manifest at scale (e.g. >100 nodes).
- Verifying/manipulating aspects of internal state during/after a test.
- PRs for functional changes are not isolated to single repositories, which makes integration with CI (required to enforce any must-have-passing-tests rules we might want) or manual testing of multi-repo changes difficult.
- Code which is difficult to test due to coupling of IO with pure logic.

## What might make things easier?

- A high-level API for creating test fixtures (e.g. nodes/clusters in specific failure states, but also other things like configuration files, logs, etc.).
- A high-level API for manipulating node/cluster state (e.g. stop/start a node, simulate packet loss, partition one or more nodes).
- Lower-level constructs for directly modifying the code under test, e.g.:
  - forcing a particular error to be thrown at a particular time
  - tapping into the output of a logger
  - tapping into the metrics collector
  - being able to tell that a specified function has been called, and possibly get the arguments passed to it (see the meck sketch after this list)
  - being able to store terms into temporary storage during the execution of a test case
  - facilities to group tests
  - the ability to suppress logging or individual log messages
  - being able to run the same behaviour tests against different implementations
  - being able to run the same test suites against different configurations
- Tooling for testing specific branches of the sub-repositories (e.g. https://cloudup.com/cOgxRPbt9aP).
- A manifest repo which links to all proposed changes that span multiple repos (e.g. https://github.com/iilyak/couchdb-manifest).
- Refactoring to separate IO from pure logic (see the first sketch after this list).
- Adding function specifications and increasing the use of dialyzer as part of the CI chain.
- Tracking test suite execution time to detect performance degradation.
- Refactoring so that the names of named processes can be specified.
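To illustrate the IO/pure-logic refactoring, here is a minimal sketch. The module, its functions, and the numbers are hypothetical, not existing CouchDB code; the point is only that the decision logic becomes a plain function we can unit test without any fixture:

    -module(compaction_policy).
    -export([maybe_compact/1, needs_compaction/2]).
    -include_lib("eunit/include/eunit.hrl").

    -define(THRESHOLD, 1.5).

    %% Impure shell: does IO at the edges, delegates the decision.
    maybe_compact(DbName) ->
        Info = db_info(DbName),
        case needs_compaction(Info, ?THRESHOLD) of
            true  -> compact(DbName);
            false -> ok
        end.

    %% Pure core: a plain function over data, testable with zero setup.
    needs_compaction(#{disk_size := Disk, data_size := Data}, Threshold)
            when Data > 0 ->
        Disk / Data > Threshold;
    needs_compaction(_Info, _Threshold) ->
        false.

    needs_compaction_test_() ->
        [?_assert(needs_compaction(#{disk_size => 300, data_size => 100}, 1.5)),
         ?_assertNot(needs_compaction(#{disk_size => 120, data_size => 100}, 1.5)),
         ?_assertNot(needs_compaction(#{disk_size => 300, data_size => 0}, 1.5))].

    %% IO stubs, present only to keep the sketch self-contained.
    db_info(_DbName) -> #{disk_size => 300, data_size => 100}.
    compact(_DbName) -> ok.

Only the thin `maybe_compact/1` shell would need the heavier fixtures described below.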
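And for detecting that a function was called: meck already covers the single-node case, so `couch_test` could simply wrap it. A sketch; the choice of `couch_log:error/2` as the target is only for illustration:

    -module(call_detection_tests).
    -include_lib("eunit/include/eunit.hrl").

    logger_called_test() ->
        %% non_strict lets the sketch run even without couch_log on the path.
        ok = meck:new(couch_log, [non_strict]),
        try
            meck:expect(couch_log, error, fun(_Fmt, _Args) -> ok end),
            couch_log:error("boom: ~p", [some_reason]),
            %% Was couch_log:error/2 called at all? ('_' matches anything)
            ?assert(meck:called(couch_log, error, ['_', '_'])),
            %% Pull the actual arguments out of meck's call history.
            [{_Pid, {couch_log, error, [_Fmt, [Reason]]}, ok}] =
                meck:history(couch_log),
            ?assertEqual(some_reason, Reason)
        after
            meck:unload(couch_log)
        end.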
# Proposal

- Create a `couch_test` app which would contain helper functions to make eunit easier to use for CouchDB testing. It would include solutions for some of the problems identified earlier.
- Write new tests using `cdt:setup` / `cdt:make_cases`.
- Use [riak_test's intercepts](https://github.com/basho/riak_test/tree/master/intercepts).
- Update the Makefile to use `src/couch_test/bin/cdt ci`.
- The exact implementation might differ from the design below.
- Improve as we go.

# couch_test (cdt) design

    couch_test/
    +-- include/
    |   +-- intercept.hrl
    |   +-- cdt.hrl
    +-- intercepts/
    +-- setups/            - Would contain setup/teardown code for reuse
    +-- src/
    |   +-- intercept.erl
    |   +-- cdt.erl
    |   +-- combinatorics.erl
    +-- bin/
    |   +-- cdt
    +-- rebar.config
    +-- etc/
        +-- cdt.conf.example
        +-- local.conf.example
        +-- test_cluster.conf.example

Just to illustrate the idea, here is the (eventual) list of commands for `cdt`:

`cdt --help`

- cdt ci
- cdt all
- cdt unit
- cdt integration
- cdt system
- cdt perf
- cdt props
- cdt scale -nodes 200
- cdt vmargs -period 3600000 -nodes 3 - run the VM with different vmargs (generated using a powerset) to find the best combination of options and flags; we restart the VM with the next set of options after 1 hour
- cdt -file file.erl
- cdt -module module
- cdt -test module:fun
- cdt mixed -nodes 10 -old versionA -new versionB

## setup modules

    -module(cdt_chttpd_setup).
    -export([chttpd/4]).

    chttpd(A, B, C, D) ->
        {setup(A, B, C, D), teardown()}.

    setup(A, B, C, D) ->
        fun(Ctx) ->
            %% use A and B to set up chttpd
            %% store C and D in the context for later use in tests
            NewCtx = update_ctx(Ctx, C, D),
            {?MODULE, NewCtx}
        end.

    teardown() ->
        fun(Ctx) ->
            %% clean up, using Ctx to find out what to do
            {?MODULE, Ctx}
        end.

Setups could be composed into a chain:

    -module(my_unit_test).
    -include_lib("cdt.hrl").

    %% Imports are optional
    -import(cdt_chttpd_setup, [chttpd/4]).
    -import(cdt_couch_setup, [couch/3]).
    -import(cdt_cluster_setup, [cluster/1]).
    -import(cdt_fault_setup, [disconnect/2, drop_packets/3]).

    setup(Type) ->
        Chain = [
            couch(Type, 1, 2),
            chttpd(backdoor, foo, bar, baz),
            cluster(3),
            disconnect(1, 2),
            drop_packets(2, 3, 30) %% drop 30% of packets between db2 and db3
        ],
        Args = [],
        Opts = [],
        cdt:setup(Chain, Args, Opts).

    teardown(_Type, Ctx) ->
        cdt:teardown(Ctx).

## Injecting networking problems

Since a cross-platform solution is required, it is better to use something Erlang-based for fault injection. We could extend [epmdpxy](https://github.com/dergraf/epmdpxy) to simulate latency or connectivity problems. It would need to be extended so that it can selectively induce problems between specified nodes (without affecting communication between the test master and the test slaves). In this case nodes should be started using:

    ERL_EPMD_PORT=43690 erl

## Tapping into the logger

Don't log or produce noisy output by default, but allow the verbosity to be controlled. It is also possible to split the output:

- stdout: errors and output from eunit
- a file per test suite run: detailed logging

In some cases we should be able to:

- suppress a log message
- check an error message produced by the logger

The easiest way to achieve both goals is a permanent intercept for `couch_log.erl`. Another approach could be a special `couch_log` backend.
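For example, a minimal intercept in the riak_test style might look like the following. Everything here is an assumption about how `intercept.erl` would work (it would rename the real module to `couch_log_orig` and route calls through the intercept module, as riak_test does); `couch_log_intercepts` and the `captured_logs` ets table are hypothetical names:

    -module(couch_log_intercepts).
    -compile(export_all).
    -include("intercept.hrl").

    %% Assumed convention (borrowed from riak_test): the original module is
    %% renamed to couch_log_orig and its functions get an _orig suffix.
    -define(M, couch_log_orig).

    %% Suppress log lines matching a noisy pattern; forward the rest.
    quiet_error(Fmt, Args) ->
        case string:str(Fmt, "connection closed") of
            0 -> ?M:error_orig(Fmt, Args);
            _ -> ok
        end.

    %% Record every error so a test can assert on it afterwards. The
    %% captured_logs ets table would be created by the test's setup chain.
    capturing_error(Fmt, Args) ->
        true = ets:insert(captured_logs, {os:timestamp(), Fmt, Args}),
        ?M:error_orig(Fmt, Args).

A test would install `quiet_error` (or `capturing_error`) in place of `couch_log:error/2` through its setup chain, run the code under test, and then read its assertions out of `captured_logs`.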
## Fixtures

Store fixtures in `tests/fixtures` of the applications we are testing. We might also have some common fixtures in `couch_test/fixtures`. All fixtures should be templates. The `couch_test` app would have some helpers to find and include the fixtures.

It would be helpful to support the following types of fixtures:

- module
- file
- data structure
- `<name>.script` - similar to `file:script`, but with template rendering
- `<name>` - similar to `file:consult`, but with template rendering

## Grouping test cases

Use a list of lists as a test name to determine whether grouping is needed. For example:

    apply_options_test_() ->
        Funs = [fun ensure_apply_is_called/2],
        Cases = combinatorics:powerset([pipe, concurrent]),
        cdt:make_cases(
            ["apply options tests", "Apply with options: ~p"],
            fun setup/1, fun teardown/2,
            Cases, Funs).

This would generate the following test cases:

    {"apply options tests", [
        {"Apply with options: []", [
            {foreachx, fun setup/1, fun teardown/2, [
                {[], fun ensure_apply_is_called/2}
            ]}
        ]},
        {"Apply with options: [pipe]", [
            {foreachx, fun setup/1, fun teardown/2, [
                {[pipe], fun ensure_apply_is_called/2}
            ]}
        ]},
        {"Apply with options: [concurrent]", [
            {foreachx, fun setup/1, fun teardown/2, [
                {[concurrent], fun ensure_apply_is_called/2}
            ]}
        ]},
        {"Apply with options: [pipe, concurrent]", [
            {foreachx, fun setup/1, fun teardown/2, [
                {[pipe, concurrent], fun ensure_apply_is_called/2}
            ]}
        ]}
    ]}

## Test annotations

In order to distinguish kinds of tests we would need to annotate test cases. We could use one of the following, in order of my personal preference (any other ideas?):

1. Implement a parse transform using `merl` to support annotations:

        -scope([integration, unit, cluster]).
        my_tests_() -> ok.

2. Split different kinds of tests into different modules, and maybe keep them in different directories.
3. Pass the scope to `cdt:make_cases`.
4. Introduce a naming convention for test names:
   - `i_my_integration_tests_() -> ok.`
   - `u_my_unit_tests_() -> ok.`
5. Have a module where we add every test case into the appropriate scope.

--------------------------------------------

Best regards,
ILYA