[kudu-CR] build: adapt new Java flaky test infrastructure to existing controls

Adar Dembo (Code Review) Wed, 03 Apr 2019 13:46:05 -0700

Adar Dembo has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/12917 )


Change subject: build: adapt new Java flaky test infrastructure to existing 
controls
......................................................................

build: adapt new Java flaky test infrastructure to existing controls

Now that Java tests are reporting success/failure, we can use the existing
flaky test controls to drive it. As a refresher, the C++ tests rely on these
environment variables:
- RUN_FLAKY_ONLY: whether to run just flaky tests or all tests
- KUDU_FLAKY_TEST_ATTEMPTS: number of attempts for flaky tests
- KUDU_FLAKY_TEST_LIST: path to list of flaky tests, one on each line
- KUDU_RETRY_ALL_FAILED_TESTS: whether to retry all tests or just the ones
                               in the flaky test list

The algorithm is roughly:
  if RUN_FLAKY_ONLY or KUDU_FLAKY_TEST_ATTEMPTS > 1:
    populate KUDU_FLAKY_TEST_LIST from test result server

  if RUN_FLAKY_ONLY:
    testset = tests listed in KUDU_FLAKY_TEST_LIST
  else:
    testset = all tests

  for t in testset:
    if KUDU_RETRY_ALL_FAILED_TESTS or (KUDU_FLAKY_TEST_LIST and
                                       t in KUDU_FLAKY_TEST_LIST):
      num_attempts = KUDU_FLAKY_TEST_ATTEMPTS (or 1 if unset)
    else:
      num_attempts = 1

    run t up to num_attempts times

You can see it at work in build-and-test.sh/run-test.sh. You can also see it
in dist-test.py though notably, it doesn't care about RUN_FLAKY_ONLY because
we never used that particular combination (presumably the list of flaky
tests is short enough that it wouldn't benefit from distributed testing).

This patch attempts to mirror these exact semantics for Java tests. Here are
the interesting changes:
- In RetryRule, rerunFailingTestsCount is gone. The behavior is informed via
  the aforementioned environment variables instead.
- In build-and-test.sh, if RUN_FLAKY_ONLY is set, parse the flaky test list
  into a series of --tests gradle command line arguments.
- In dist-test.py, opt into the C++ flaky test handling (which reflects the
  above algorithm). There are also some small changes to flaky handling to
  accommodate Java's per-method flaky test tracking.

Note: all of this assumes that there's no overlap between the names of any
C++ or Java tests, which is currently true as all C++ tests have names like
"tablet-test" or "master_cert_authority-itest" while all Java tests are
prefixed with "org.apache.kudu...". If this were to change, we'd need to
properly "namespace" the test results in the reporting infrastructure and
fetch the flaky test lists separately for C++ and Java tests. For now
there's just one flaky test list, and both ctest and gradle are OK with
being asked to run irrelevant tests (they'll just be ignored).

Change-Id: Ia89598d7eeb5ab642ab4ebb7aa583adcce770eae
Reviewed-on: http://gerrit.cloudera.org:8080/12917
Reviewed-by: Grant Henke <granthe...@apache.org>
Tested-by: Adar Dembo <a...@cloudera.com>
---
M build-support/dist_test.py
M build-support/jenkins/build-and-test.sh
M java/gradle/tests.gradle
M java/kudu-test-utils/src/main/java/org/apache/kudu/test/junit/RetryRule.java
4 files changed, 145 insertions(+), 31 deletions(-)

Approvals:
  Grant Henke: Looks good to me, approved
  Adar Dembo: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/12917
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia89598d7eeb5ab642ab4ebb7aa583adcce770eae
Gerrit-Change-Number: 12917
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>

[kudu-CR] build: adapt new Java flaky test infrastructure to existing controls

Reply via email to