[ https://issues.apache.org/jira/browse/HADOOP-11045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117835#comment-14117835 ]
Yongjun Zhang commented on HADOOP-11045:
----------------------------------------

Example output for job Hadoop-Common-0.23-Build
{code}
[yzhang@localhost jenkinsftf]$ ./determine-flaky-tests-hadoop.py -j Hadoop-Common-0.23-Build -n 8
****Recently FAILED builds in url: https://builds.apache.org//job/Hadoop-Common-0.23-Build
    THERE ARE 5 builds (out of 5) that have failed tests in the past 8 days, as listed below:

===>https://builds.apache.org/job/Hadoop-Common-0.23-Build/1059/testReport (2014-09-01 02:01:30)
    Failed test: org.apache.hadoop.io.compress.TestCodec.testSnappyCodec
===>https://builds.apache.org/job/Hadoop-Common-0.23-Build/1058/testReport (2014-08-31 02:01:30)
    Failed test: org.apache.hadoop.io.compress.TestCodec.testSnappyCodec
===>https://builds.apache.org/job/Hadoop-Common-0.23-Build/1057/testReport (2014-08-30 02:01:30)
    Failed test: org.apache.hadoop.io.compress.TestCodec.testSnappyCodec
===>https://builds.apache.org/job/Hadoop-Common-0.23-Build/1056/testReport (2014-08-29 02:01:30)
    Failed test: org.apache.hadoop.io.compress.TestCodec.testSnappyCodec
===>https://builds.apache.org/job/Hadoop-Common-0.23-Build/1055/testReport (2014-08-28 02:01:30)
    Failed test: org.apache.hadoop.io.compress.TestCodec.testSnappyCodec

All failed tests <#occurrences: testName>:
    5: org.apache.hadoop.io.compress.TestCodec.testSnappyCodec
{code}

Example output for Hadoop-Hdfs-trunk:
{code}
[yzhang@localhost jenkinsftf]$ ./determine-flaky-tests-hadoop.py -n 7
****Recently FAILED builds in url: https://builds.apache.org//job/Hadoop-Hdfs-trunk
    THERE ARE 7 builds (out of 8) that have failed tests in the past 7 days, as listed below:

===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1858/testReport (2014-09-01 04:31:30)
    Failed test: org.apache.hadoop.hdfs.web.TestWebHDFS.testWebHdfsDeleteSnapshot
===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1857/testReport (2014-08-31 04:31:30)
    Failed test: org.apache.hadoop.hdfs.web.TestWebHDFSForHA.testFailoverAfterOpen
    Failed test: org.apache.hadoop.hdfs.web.TestWebHDFSForHA.testSecureHAToken
===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1856/testReport (2014-08-30 09:46:54)
    Failed test: org.apache.hadoop.hdfs.TestDFSClientRetries.testIdempotentAllocateBlockAndClose
    Failed test: org.apache.hadoop.hdfs.TestDFSClientRetries.testFailuresArePerOperation
    Failed test: org.apache.hadoop.hdfs.TestDFSClientRetries.testRetryOnChecksumFailure
    Failed test: org.apache.hadoop.hdfs.TestDFSClientRetries.testWriteTimeoutAtDataNode
    Failed test: org.apache.hadoop.hdfs.TestDFSClientRetries.testDFSClientRetriesOnBusyBlocks
    Failed test: org.apache.hadoop.hdfs.TestDFSClientRetries.testClientDNProtocolTimeout
    Failed test: org.apache.hadoop.hdfs.TestDFSClientRetries.testGetFileChecksum
    Failed test: org.apache.hadoop.hdfs.TestDFSClientRetries.testNamenodeRestart
===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1855/testReport (2014-08-30 04:31:30)
    Failed test: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testBalancer
    Failed test: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testUnevenDistribution
===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1854/testReport (2014-08-29 04:31:30)
    Could not open testReport
===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1853/testReport (2014-08-28 09:37:18)
    Could not open testReport
===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1852/testReport (2014-08-28 09:28:48)
    Could not open testReport

All failed tests <#occurrences: testName>:
    1: org.apache.hadoop.hdfs.TestDFSClientRetries.testIdempotentAllocateBlockAndClose
    1: org.apache.hadoop.hdfs.web.TestWebHDFSForHA.testFailoverAfterOpen
    1: org.apache.hadoop.hdfs.web.TestWebHDFS.testWebHdfsDeleteSnapshot
    1: org.apache.hadoop.hdfs.TestDFSClientRetries.testFailuresArePerOperation
    1: org.apache.hadoop.hdfs.web.TestWebHDFSForHA.testSecureHAToken
    1: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testUnevenDistribution
    1: org.apache.hadoop.hdfs.TestDFSClientRetries.testRetryOnChecksumFailure
    1: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testBalancer
    1: org.apache.hadoop.hdfs.TestDFSClientRetries.testWriteTimeoutAtDataNode
    1: org.apache.hadoop.hdfs.TestDFSClientRetries.testDFSClientRetriesOnBusyBlocks
    1: org.apache.hadoop.hdfs.TestDFSClientRetries.testClientDNProtocolTimeout
    1: org.apache.hadoop.hdfs.TestDFSClientRetries.testGetFileChecksum
    1: org.apache.hadoop.hdfs.TestDFSClientRetries.testNamenodeRestart
{code}

> Introducing a tool to detect flaky tests of hadoop jenkins test job
> -------------------------------------------------------------------
>
>                 Key: HADOOP-11045
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11045
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: build, tools
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>
> File this jira to introduce a tool to detect flaky tests of hadoop jenkins
> test jobs.
> I developed the tool on top of some initial work [~tlipcon] did. We find it
> quite useful. With Todd's agreement, I'd like to push it to upstream so all
> of us can share (thanks Todd for the initial work and support). I hope you
> find the tool useful.
> This is a tool for hadoop contributors rather than hadoop users. Thanks
> [~tedyu] for the advice to put to dev-support dir.
> Description of the tool:
> {code}
> #
> # Given a jenkins test job, this script examines all runs of the job done
> # within specified period of time (number of days prior to the execution
> # time of this script), and reports all failed tests.
> #
> # The output of this script includes a section for each run that has failed
> # tests, with each failed test name listed.
> #
> # More importantly, at the end, it outputs a summary section to list all failed
> # tests within all examined runs, and indicate how many runs a same test
> # failed, and sorted all failed tests by how many runs each test failed in.
> #
> # This way, when we see failed tests in PreCommit build, we can quickly tell
> # whether a failed test is a new failure or it failed before, and it may just
> # be a flaky test.
> #
> # Of course, to be 100% sure about the reason of a failed test, closer look
> # at the failed test for the specific run is necessary.
> #
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
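
For illustration only, the summary step from the tool description above (count how many runs each test failed in, then sort by that count) could be sketched in Python roughly as follows. This is not the code of determine-flaky-tests-hadoop.py. The function name summarize_failures, the input shape (a dict from a build identifier to the failed test names of that build), and the sample identifiers are all assumptions made for this sketch.

```python
from collections import Counter


def summarize_failures(failed_by_build):
    """Return (occurrences, test_name) pairs, most frequent first.

    failed_by_build is a hypothetical input shape: a dict mapping a build
    identifier to the list of test names that failed in that build.
    """
    counts = Counter()
    for tests in failed_by_build.values():
        # A test counts once per build, even if it is listed twice.
        counts.update(set(tests))
    # Sort by descending count; break ties by name so output is stable.
    return sorted(
        ((n, name) for name, n in counts.items()),
        key=lambda pair: (-pair[0], pair[1]),
    )


if __name__ == "__main__":
    # Placeholder data, not taken from any real Jenkins job.
    sample = {
        "build-3": ["pkg.TestA.testX"],
        "build-2": ["pkg.TestA.testX", "pkg.TestB.testY"],
        "build-1": [],
    }
    print("All failed tests <#occurrences: testName>:")
    for n, name in summarize_failures(sample):
        print("    %d: %s" % (n, name))
```

A test that appears with a high count across recent runs is a candidate flaky test (or a persistent breakage), whereas a count of 1 says little on its own, which is why the description still recommends a closer look at the specific run.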