Hi,
I would like to submit a proposal to improve the way we test CouchDB. Here is the link to the rendered version: https://rawgit.com/iilyak/e65b4ddf8a46416f84be/raw/4c8dff91d51fcfd6e5a7f65b7e9b0f8c96ee3a61/proposal.html

Below is the source of the proposal

------------------------------

# Testing couchdb

It is currently quite hard to test CouchDB. The main difficulties are complex setup functions and weak isolation of test cases. There are some functions in the `test_util` module which simplify setup a little, but not by much. The purpose of this proposal is to define some requirements for a testing infrastructure and to propose a solution which satisfies most of them.

## Why is it so hard to write certain kinds of tests?

- Setup/teardown overhead.
- Difficult-to-reproduce failure modes.
- Tests which require clusters rather than single nodes.
- Tests for issues that only manifest at scale (e.g. >100 nodes).
- Verifying/manipulating aspects of internal state during/after a test.
- PRs for functional changes are not isolated to single repositories, which makes integration with CI (required to enforce any must-have-passing-tests rules we might want) or manual testing of multi-repo changes difficult.
- Code which is difficult to test due to coupling of IO with pure logic.

## What might make things easier?

- A high-level API for creating test fixtures (e.g. nodes/clusters in specific failure states, but also other things like configuration files, logs, etc.).
- A high-level API for manipulating node/cluster state (e.g. stop/start a node, simulate packet loss, partition one or more nodes).
- Lower-level constructs for directly modifying the code under test, e.g.:
  - forcing a particular error to be thrown at a particular time
  - tapping into the output of a logger
  - tapping into the metrics collector
  - being able to tell that a specified function has been called, and possibly get the arguments passed to it (see the meck sketch after this list)
  - being able to store terms into temporary storage during the execution of a test case
  - facilities to group tests
  - the ability to suppress logging or individual log messages
  - being able to run the same behaviour tests against different implementations
  - being able to run the same test suites against different configurations
- Tooling for testing specific branches of the sub-repositories (e.g. https://cloudup.com/cOgxRPbt9aP).
- A manifest repo which links to all proposed changes that span multiple repos (e.g. https://github.com/iilyak/couchdb-manifest).
- Refactoring to separate IO from pure logic (see the first sketch after this list).
- Adding function specifications and increasing the use of dialyzer as part of the CI chain.
- Tracking test suite execution time to detect performance degradation.
- Refactoring so that the names of named processes can be specified.
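To illustrate the IO/pure-logic refactoring, here is a minimal sketch. The module, its functions, and the numbers are hypothetical, not existing CouchDB code; the point is only that the decision logic becomes a plain function we can unit test without any fixture:

    -module(compaction_policy).
    -export([maybe_compact/1, needs_compaction/2]).
    -include_lib("eunit/include/eunit.hrl").

    -define(THRESHOLD, 1.5).

    %% Impure shell: does IO at the edges, delegates the decision.
    maybe_compact(DbName) ->
        Info = db_info(DbName),
        case needs_compaction(Info, ?THRESHOLD) of
            true  -> compact(DbName);
            false -> ok
        end.

    %% Pure core: a plain function over data, testable with zero setup.
    needs_compaction(#{disk_size := Disk, data_size := Data}, Threshold)
            when Data > 0 ->
        Disk / Data > Threshold;
    needs_compaction(_Info, _Threshold) ->
        false.

    needs_compaction_test_() ->
        [?_assert(needs_compaction(#{disk_size => 300, data_size => 100}, 1.5)),
         ?_assertNot(needs_compaction(#{disk_size => 120, data_size => 100}, 1.5)),
         ?_assertNot(needs_compaction(#{disk_size => 300, data_size => 0}, 1.5))].

    %% IO stubs, present only to keep the sketch self-contained.
    db_info(_DbName) -> #{disk_size => 300, data_size => 100}.
    compact(_DbName) -> ok.

Only the thin `maybe_compact/1` shell would need the heavier fixtures described below.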
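And for detecting that a function was called: meck already covers the single-node case, so `couch_test` could simply wrap it. A sketch; the choice of `couch_log:error/2` as the target is only for illustration:

    -module(call_detection_tests).
    -include_lib("eunit/include/eunit.hrl").

    logger_called_test() ->
        %% non_strict lets the sketch run even without couch_log on the path.
        ok = meck:new(couch_log, [non_strict]),
        try
            meck:expect(couch_log, error, fun(_Fmt, _Args) -> ok end),
            couch_log:error("boom: ~p", [some_reason]),
            %% Was couch_log:error/2 called at all? ('_' matches anything)
            ?assert(meck:called(couch_log, error, ['_', '_'])),
            %% Pull the actual arguments out of meck's call history.
            [{_Pid, {couch_log, error, [_Fmt, [Reason]]}, ok}] =
                meck:history(couch_log),
            ?assertEqual(some_reason, Reason)
        after
            meck:unload(couch_log)
        end.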
# Proposal

- Create a `couch_test` app which would contain helper functions to make eunit easier to use for CouchDB testing. It would include solutions for some of the problems identified earlier.
- Write new tests using `cdt:setup` / `cdt:make_cases`.
- Use [riak_test's intercepts](https://github.com/basho/riak_test/tree/master/intercepts).
- Update the Makefile to use `src/couch_test/bin/cdt ci`.
- The exact implementation might differ from the design below.
- Improve as we go.

# couch_test (cdt) design

    couch_test/
    +-- include/
    |   +-- intercept.hrl
    |   +-- cdt.hrl
    +-- intercepts/
    +-- setups/            - Would contain setup/teardown code for reuse
    +-- src/
    |   +-- intercept.erl
    |   +-- cdt.erl
    |   +-- combinatorics.erl
    +-- bin/
    |   +-- cdt
    +-- rebar.config
    +-- etc/
        +-- cdt.conf.example
        +-- local.conf.example
        +-- test_cluster.conf.example

Just to illustrate the idea, here is the (eventual) list of commands for `cdt`:

`cdt --help`

- cdt ci
- cdt all
- cdt unit
- cdt integration
- cdt system
- cdt perf
- cdt props
- cdt scale -nodes 200
- cdt vmargs -period 3600000 -nodes 3 - run the VM with different vmargs (generated using a powerset) to find the best combination of options and flags; we restart the VM with the next set of options after 1 hour
- cdt -file file.erl
- cdt -module module
- cdt -test module:fun
- cdt mixed -nodes 10 -old versionA -new versionB

## setup modules

    -module(cdt_chttpd_setup).
    -export([chttpd/4]).

    chttpd(A, B, C, D) ->
        {setup(A, B, C, D), teardown()}.

    setup(A, B, C, D) ->
        fun(Ctx) ->
            %% use A and B to set up chttpd
            %% store C and D in the context for later use in tests
            NewCtx = update_ctx(Ctx, C, D),
            {?MODULE, NewCtx}
        end.

    teardown() ->
        fun(Ctx) ->
            %% clean up, using Ctx to find out what to do
            {?MODULE, Ctx}
        end.

Setups could be composed into a chain:

    -module(my_unit_test).
    -include_lib("cdt.hrl").

    %% Imports are optional
    -import(cdt_chttpd_setup, [chttpd/4]).
    -import(cdt_couch_setup, [couch/3]).
    -import(cdt_cluster_setup, [cluster/1]).
    -import(cdt_fault_setup, [disconnect/2, drop_packets/3]).

    setup(Type) ->
        Chain = [
            couch(Type, 1, 2),
            chttpd(backdoor, foo, bar, baz),
            cluster(3),
            disconnect(1, 2),
            drop_packets(2, 3, 30) %% drop 30% of packets between db2 and db3
        ],
        Args = [],
        Opts = [],
        cdt:setup(Chain, Args, Opts).

    teardown(_Type, Ctx) ->
        cdt:teardown(Ctx).

## Injecting networking problems

Since a cross-platform solution is required, it is better to use something Erlang-based for fault injection. We could extend [epmdpxy](https://github.com/dergraf/epmdpxy) to simulate latency or connectivity problems. It would need to be extended so that it can selectively induce problems between specified nodes (without affecting communication between the test master and the test slaves). In this case nodes should be started using:

    ERL_EPMD_PORT=43690 erl

## Tapping into the logger

Don't log or produce noisy output by default, but allow the verbosity to be controlled. It is also possible to split the output:

- stdout: errors and output from eunit
- a file per test suite run: detailed logging

In some cases we should be able to:

- suppress a log message
- check an error message produced by the logger

The easiest way to achieve both goals is a permanent intercept for `couch_log.erl`. Another approach could be a special `couch_log` backend.
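For example, a minimal intercept in the riak_test style might look like the following. Everything here is an assumption about how `intercept.erl` would work (it would rename the real module to `couch_log_orig` and route calls through the intercept module, as riak_test does); `couch_log_intercepts` and the `captured_logs` ets table are hypothetical names:

    -module(couch_log_intercepts).
    -compile(export_all).
    -include("intercept.hrl").

    %% Assumed convention (borrowed from riak_test): the original module is
    %% renamed to couch_log_orig and its functions get an _orig suffix.
    -define(M, couch_log_orig).

    %% Suppress log lines matching a noisy pattern; forward the rest.
    quiet_error(Fmt, Args) ->
        case string:str(Fmt, "connection closed") of
            0 -> ?M:error_orig(Fmt, Args);
            _ -> ok
        end.

    %% Record every error so a test can assert on it afterwards. The
    %% captured_logs ets table would be created by the test's setup chain.
    capturing_error(Fmt, Args) ->
        true = ets:insert(captured_logs, {os:timestamp(), Fmt, Args}),
        ?M:error_orig(Fmt, Args).

A test would install `quiet_error` (or `capturing_error`) in place of `couch_log:error/2` through its setup chain, run the code under test, and then read its assertions out of `captured_logs`.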
## Fixtures

Store fixtures in `tests/fixtures` of the applications we are testing. We might also have some common fixtures in `couch_test/fixtures`. All fixtures should be templates. The `couch_test` app would have some helpers to find and include the fixtures.

It would be helpful to support the following types of fixtures:

- module
- file
- data structure
- `<name>.script` - similar to `file:script`, but with template rendering
- `<name>` - similar to `file:consult`, but with template rendering

## Grouping test cases

Use a list of lists as a test name to determine whether grouping is needed. For example:

    apply_options_test_() ->
        Funs = [fun ensure_apply_is_called/2],
        Cases = combinatorics:powerset([pipe, concurrent]),
        cdt:make_cases(
            ["apply options tests", "Apply with options: ~p"],
            fun setup/1, fun teardown/2,
            Cases, Funs).

This would generate the following test cases:

    {"apply options tests", [
        {"Apply with options: []", [
            {foreachx, fun setup/1, fun teardown/2, [
                {[], fun ensure_apply_is_called/2}
            ]}
        ]},
        {"Apply with options: [pipe]", [
            {foreachx, fun setup/1, fun teardown/2, [
                {[pipe], fun ensure_apply_is_called/2}
            ]}
        ]},
        {"Apply with options: [concurrent]", [
            {foreachx, fun setup/1, fun teardown/2, [
                {[concurrent], fun ensure_apply_is_called/2}
            ]}
        ]},
        {"Apply with options: [pipe, concurrent]", [
            {foreachx, fun setup/1, fun teardown/2, [
                {[pipe, concurrent], fun ensure_apply_is_called/2}
            ]}
        ]}
    ]}

## Test annotations

In order to distinguish kinds of tests we would need to annotate test cases. We could use one of the following, in order of my personal preference (any other ideas?):

1. Implement a parse transform using `merl` to support annotations:

        -scope([integration, unit, cluster]).
        my_tests_() -> ok.

2. Split different kinds of tests into different modules, and maybe keep them in different directories.
3. Pass the scope to `cdt:make_cases`.
4. Introduce a naming convention for test names:
   - `i_my_integration_tests_() -> ok.`
   - `u_my_unit_tests_() -> ok.`
5. Have a module where we add every test case into the appropriate scope.

--------------------------------------------

Best regards,
ILYA