[ https://issues.apache.org/jira/browse/IMPALA-8114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16753260#comment-16753260 ]
ASF subversion and git services commented on IMPALA-8114:
---------------------------------------------------------

Commit b3318ad434f1a82a880afeaf600233ea76e8ca0f in impala's branch refs/heads/master from Lars Volker
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b3318ad ]

IMPALA-8114: Deflake test_breakpad.py

A test failed recently in a private build and it looked like the loop in
wait_for_num_processes had terminated too early. To make sure that the
forked-off processes that write the minidumps have actually started, we
now sleep for 1 second before entering the wait loop.

Change-Id: Ifcd1fbb498c475a1f186f490abaf90b47ecba05b
Reviewed-on: http://gerrit.cloudera.org:8080/12273
Reviewed-by: Tim Armstrong <tarmstr...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>

> Build test failure in test_breakpad.py
> --------------------------------------
>
> Key: IMPALA-8114
> URL: https://issues.apache.org/jira/browse/IMPALA-8114
> Project: IMPALA
> Issue Type: Bug
> Components: Infrastructure
> Affects Versions: Impala 3.1.0
> Reporter: Paul Rogers
> Assignee: Lars Volker
> Priority: Blocker
>
> Recent builds have failed due to a failure in {{test_breakpad.py}}. Assigning
> to Tim as the person who most recently touched this file.
> Test output:
> {noformat}
> 09:04:35 ==================================== ERRORS ====================================
> 09:04:35 ___ ERROR at teardown of TestBreakpadExhaustive.test_minidump_cleanup_thread ___
> 09:04:35 custom_cluster/test_breakpad.py:49: in teardown_method
> 09:04:35 self.kill_cluster(SIGKILL)
> 09:04:35 custom_cluster/test_breakpad.py:80: in kill_cluster
> 09:04:35 self.kill_processes(processes, signal)
> 09:04:35 custom_cluster/test_breakpad.py:85: in kill_processes
> 09:04:35 process.kill(signal)
> 09:04:35 common/impala_cluster.py:330: in kill
> 09:04:35 assert 0, "No processes %s found" % self.cmd
> 09:04:35 E AssertionError: No processes
> ['/data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/be/build/latest/service/impalad',
> '-kudu_client_rpc_timeout_ms', '0', '-kudu_master_hosts', 'localhost',
> '--mem_limit=12884901888', '-logbufsecs=5', '-v=1', '-max_log_files=0',
> '-log_filename=impalad',
> '-log_dir=/data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/logs/custom_cluster_tests',
> '-beeswax_port=21000', '-hs2_port=21050', '-be_port=22000',
> '-krpc_port=27000', '-state_store_subscriber_port=23000',
> '-webserver_port=25000', '-max_minidumps=2', '-logbufsecs=1',
> '-minidump_path=/tmp/tmpKaSw_w', '--default_query_options='] found
> {noformat}
>
> Distilled {{TEST-impala-custom-cluster.xml}} output:
> {noformat}
> -- 2019-01-23 08:00:43,585 INFO MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> …
> -- 2019-01-23 08:00:43,667 INFO MainThread: Killing:
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/be/build/latest/service/statestored
> -logbufsecs=5 -v=1 -max_log_files=0 -log_filename=statestored
> -log_dir=/data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/logs/custom_cluster_tests
> -max_minidumps=2 -logbufsecs=1 -minidump_path=/tmp/tmpKaSw_w (PID: 16809)
> with signal 10
> -- 2019-01-23 08:00:43,692 INFO MainThread: Found 6 impalad/1 statestored/1 catalogd process(es)
> ...
> E AssertionError: No processes
> ['/data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/be/build/latest/service/impalad
> {noformat}
>
> Notice that the main thread appears to be killing statestore, but fails to
> kill impalad. Notice that a message appears that says that all impalads are
> running in the midst of the code that tries to shut down the cluster. Is this
> test multi-threaded? Is there more than one “main thread”? Are these main
> threads working at cross purposes? What recent change may have caused this?
>
> Also, it looks like the script is sending signal 10 (SIGUSR1) while the
> statestore (in its log) says it got a SIGTERM (15):
> {noformat}
> I0123 08:00:44.086009 16868 thrift-client.cc:78] Couldn't open transport for
> impala-ec2-centoCaught signal: SIGTERM. Daemon will exit.
> {noformat}
>
> Not terribly familiar with this area of the product, so bumping it over to
> the BE team.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
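
For readers who want to see the shape of the fix described in the commit message (sleep briefly, then poll), here is a minimal sketch. It is not the code from test_breakpad.py: the helper name wait_for_count, its parameters, and the injected count_fn callable are all assumptions made for illustration. The idea is that a poll taken before the forked-off minidump writers have started could already see the expected count and return too early, so a short grace period precedes the first check.

```python
import time


def wait_for_count(count_fn, expected, timeout_s=30.0,
                   initial_delay_s=1.0, poll_interval_s=0.1):
    """Poll count_fn() until it returns `expected` or the timeout expires.

    Hypothetical helper for illustration only; it is not Impala's
    wait_for_num_processes and does not reproduce its signature.
    """
    # Grace period: give freshly forked children time to appear, so the
    # first poll cannot match `expected` before the work has even started.
    time.sleep(initial_delay_s)

    deadline = time.monotonic() + timeout_s
    while True:
        if count_fn() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)
```

In a real test, count_fn would be whatever counts the matching processes; passing it in keeps the sketch independent of any particular process-listing library. A fixed sleep only narrows the race window rather than closing it, which fits the commit's stated goal of deflaking the test.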