[ https://issues.apache.org/jira/browse/CASSANDRA-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ariel Weisberg updated CASSANDRA-14134: --------------------------------------- Resolution: Fixed Reviewer: Ariel Weisberg Fix Version/s: 4.0 Status: Resolved (was: Ready to Commit) +1 Committed as: Cassandra [e73bc32a940175a21700721bafaa593aa86a1a5f|https://github.com/apache/cassandra/commit/e73bc32a940175a21700721bafaa593aa86a1a5f] dtests [49b2dda4e6643d2b18376d504b5fea4c0b3354a7|https://github.com/apache/cassandra-dtest/commit/49b2dda4e6643d2b18376d504b5fea4c0b3354a7] cassandra-builds [6df93e990a93ea4e422b3b67b55cab49554909eb|https://github.com/apache/cassandra-builds/commit/6df93e990a93ea4e422b3b67b55cab49554909eb] and [63504a1ce2cec70e86bfd2f913325db8190f8f88|https://github.com/apache/cassandra-builds/commit/63504a1ce2cec70e86bfd2f913325db8190f8f88] and [d3cd2e8cec57633f8a064c3620f9f5af442640ca|https://github.com/apache/cassandra-builds/commit/d3cd2e8cec57633f8a064c3620f9f5af442640ca] Thanks! > Migrate dtests to use pytest and python3 > ---------------------------------------- > > Key: CASSANDRA-14134 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14134 > Project: Cassandra > Issue Type: Improvement > Components: Testing > Reporter: Michael Kjellman > Assignee: Michael Kjellman > Priority: Major > Fix For: 4.0 > > > h4. Get the C* dtests running on the pytest framework. > C* DTests currently run using the python test framework nosetest. This > framework has been largely abandoned with no releases since 2015 and a > general strong consensus in the python community that pytest is the future. > h4. Why should we do this. > Currently (and historically) dtests have always been difficult to run, flaky > and unpredictable in CI environments, and almost impossible to debug. > On November 28th, 2017, I proposed on the dev@ list that we move the dtests > from nosetests to pytests. I got replies from Jon Haddad, Philip Thompson, > and kurt greaves with really only "+1" like replies to the proposal. > Since then I've been working pretty much non stop to complete the large > refactor of dtests to pytests. As part of this effort (and due to the > migration tools that exist require it) I also ported the code to python3 > (from the current python 2.7 based code-base). > h4. High-level summary of key changes, improvements, and new features. > * Migrate dtests from executing using the nosetest framework to pytest > * Port the entire code base from Python 2.7 to Python 3.6 > * Update run_dtests.py to work with pytest > * Add --dtest-print-tests-only option to run_dtests.py to get easily parsable > list of all available collected tests > * Update README.md for executing the dtests with pytest > * Add new debugging tips section to README.md to help with some basics of > debugging python3 and pytest > * Migrate all existing Enviornment Variable usage as a means to control dtest > operation modes to argparse command line options with documented help on each > toggles intended usage > * Migration of old unitTest and nose based test structure to modern pytest > fixture approach > * Automatic detection of physical system resources to automatically determine > if @pytest.mark.resource_intensive annotated tests should be collected and > run on the system where they are being executed > * new pytest fixture replacements for @since and @pytest.mark.upgrade_test > annotations > * Migration to python logging framework > * Upgrade thrift bindings to latest version with full python3 compatibility > * Remove deprecated cql and pycassa dependencies and migrate any remaining > tests to fully remove those dependencies > * Fixed dozens of tests that would hang the pytest framework forever when run > in CI enviornments > * Ran code nearly 300 times in CircleCI during the migration and to find, > identify, and fix any tests capable of hanging CI > * Upgrade Tests do not yet run in CI and still need additional migration work > (although all upgrade test classes compile successfully) > I started with the *nose2pytest* [https://github.com/pytest-dev/nose2pytest] > migration tool. As this required python 3 language support I found myself > down the 2to3 python migration path. While painful to do this, the benefits > of python3 over python2.7 are numerous and moving to python3 for the > additional debugging tools now available to use when fixing dtests makes the > effort worth it for that reason alone! > After the automated tools did their thing I began what was a much longer and > tedious manual process than I ever could have expected due to the custom many > ways we did things in dtests (frequently to work around nosetest limitations > of missing features that thankfully are now all included with the pytest > framework). I've done nearly 300 test runs of my migration branch with > circleci. > The latest CircleCI runs can be found at: > (dtests without vnodes) [https://circleci.com/gh/mkjellman/cassandra/277] > (dtests with vnodes) [https://circleci.com/gh/mkjellman/cassandra/278] > With vnodes, there are currently only 6 remaining dtest test failures. > Without vnodes, there are 12 remaining dtest failures. > It turns out that after the dtests were moved to ASF Jenkins from cassci, the > jobs were misconfigured and we actually haven't been running the dtests in > the non-vnodes configuration. The current most recent trunk dtest job to > complete on ASF Jenkins (with vnodes) was > [https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/387/]. > That test run had 36 test failures. > There are great performance improvements so far with pytests. With a > parallism factor of 90x in CircleCI the dtests-no-vnodes job completed all > tests in 16 minutes and 24 seconds and the dtests-with-vnodes job completed > all tests in 14 minutes and 22 seconds.. (Compare that to the 8+ hours ASF > Jenkins currently takes!). > A dtest on pytest compatible CircleCI configuration is available at > https://github.com/mkjellman/cassandra/blob/trunk_circle/.circleci/config.yml > A very small patch is required to ccm for 2 places that were not yet python3 > compatible. (I didn't do a full audit of ccm, but these appear to be the only > two issues I've found even with a ton of testing when using ccmlib with > dtests and running with python 3). > {code} > --- /home/cassandra/env/src/ccm/ccmlib/node.py 2017-12-13 03:11:13.000000000 > +0000-8").splitlines())) > +++ node.py 2017-12-19 09:51:49.083186951 +0000 > @@ -57,10 +57,16 @@ > message = "Subprocess {} exited with non-zero status; exit status: > {}".format(command, exit_status) > if stdout: > message += "; \nstdout: " > - message += stdout > + if isinstance(stdout, str): > + message += stdout > + else: > + message += stdout.decode("utf-8") > if stderr: > message += "; \nstderr: " > - message += stderr > + if isinstance(stderr, str): > + message += stderr > + else: > + message += stderr.decode("utf-8") > > Exception.__init__(self, message) > > @@ -2009,7 +2015,10 @@ > > out, _, _ = handle_external_tool_process(p, ["sstableutil", > '--type', 'final', ks, table]) > > - return sorted(filter(lambda s: s.endswith('-Data.db'), > out.splitlines())) > + if isinstance(out, str): > + return sorted(filter(lambda s: s.endswith('-Data.db'), > out.splitlines())) > + else: > + return sorted(filter(lambda s: s.endswith('-Data.db'), > out.decode("utf-8").splitlines())) > > def _get_load_from_info_output(info): > {code} > For the actual dtest work itself: I've squashed all my work down into a > single commit available at > [https://github.com/mkjellman/cassandra-dtest/tree/dtests_on_pytest_v2] > And the actual commit available at > [https://github.com/mkjellman/cassandra-dtest/commit/11201c149d1198c0d6d1a479e08bfb8695f14c9e] > I tried to put a lot of effort into making the end user experience better for > both bootstrapping and running dtests for the first time, but also day to > day. I hope people like the approach! In addition to moving to argparse with > commented command line arguments to control execution, I also updated the > README.md to reflect the new pytest world and also added a new section with a > few tips I learned while I was debugging hundreds of tests during this > migration process that hopefully are helpful to the rest of the community > going forward! > I'm sure the first commit above isn't 100% perfect (although I really have > put a lot of effort into it) but my one request is to hopefully get consensus > with this first revision of changes committed to reduce any chance of further > drift with the codebase as changes are committed into master. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org