[ 
https://issues.apache.org/jira/browse/IMPALA-12556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17785719#comment-17785719
 ] 

ASF subversion and git services commented on IMPALA-12556:
----------------------------------------------------------

Commit 97eae40d10193b5cfb10ca4d2b4034dce44274d0 in impala's branch 
refs/heads/master from wzhou-code
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=97eae40d1 ]

IMPALA-12556: Fix flaky test test_two_statestored_with_force_active

Test test_two_statestored_with_force_active failed occasionally
because both statestore instances were assigned the active role.

This patch fixes the issue by handling the case where both statestore
instances are restarted with the flag "statestore_force_active" in the
same way as the case where both instances are restarted without it.
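
The intended behavior can be sketched roughly as follows. This is a
minimal illustration, not the actual Impala code: the function name
resolve_active_roles, the boolean inputs, and the tie-break rule (the
first instance wins) are all assumptions made for this example. It only
shows the idea that a symmetric flag setting goes through the same
negotiation whether the flag is set on both instances or on neither, so
that at most one instance ends up active.

```python
def resolve_active_roles(force_active_1, force_active_2):
    """Return (is_active_1, is_active_2) for two statestore instances.

    Hypothetical sketch: the tie-break below is an assumption for
    illustration, not the rule Impala actually applies.
    """
    if force_active_1 != force_active_2:
        # Exactly one instance is forced active, so it takes the active role.
        return force_active_1, force_active_2
    # Both or neither are forced: handle the two cases identically.
    # Assumed tie-break: the first instance becomes active.
    return True, False


for flags in [(False, False), (True, False), (False, True), (True, True)]:
    active_1, active_2 = resolve_active_roles(*flags)
    # Same invariant as the check that failed in statestore-subscriber.cc.
    assert not active_1 or not active_2
    print(flags, "->", (active_1, active_2))
```

Under this sketch, the (True, True) input produces the same result as
(False, False), which is the property the patch description states.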

Testing:
 - Ran test_two_statestored_with_force_active repeatedly on Jenkins,
   hundreds of times, without failure.
 - Ran test_two_statestored_with_force_active repeatedly on a local
   machine, a thousand times, without failure.
 - Ran all tests in test_statestored_ha.py repeatedly for over 12 hours
   on Jenkins without failure.
 - Passed core tests.
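
Repeated runs like those listed above could be driven by a small helper
along these lines. This is only a sketch: the helper name
run_repeatedly is hypothetical, and the command shown is a placeholder
that always succeeds, not the real Impala test invocation, which is not
given in this message.

```python
import subprocess
import sys


def run_repeatedly(cmd, iterations):
    """Run cmd `iterations` times and return the number of failed runs."""
    failures = 0
    for _ in range(iterations):
        # A non-zero exit status counts as one failed run.
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        if result.returncode != 0:
            failures += 1
    return failures


# Placeholder command; substitute the real test invocation here.
placeholder_cmd = [sys.executable, "-c", "pass"]
print(run_repeatedly(placeholder_cmd, 3), "failures")
```

Counting failures rather than stopping at the first one gives a rough
failure rate, which is useful when judging whether a flaky test has
actually been fixed.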

Change-Id: I3e6f85233ff6fa747a6aa5ef8d093627885d20b2
Reviewed-on: http://gerrit.cloudera.org:8080/20699
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Wenzhe Zhou <wz...@cloudera.com>


> test_two_statestored_with_force_active fails or is flaky
> --------------------------------------------------------
>
>                 Key: IMPALA-12556
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12556
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Distributed Exec
>    Affects Versions: Impala 4.4.0
>            Reporter: Laszlo Gaal
>            Assignee: Wenzhe Zhou
>            Priority: Blocker
>
> custom_cluster.test_statestored_ha.TestStatestoredHA.test_two_statestored_with_force_active
>  failed in a precommit run.
> Symptom:
> {code}
> common/custom_cluster_test_suite.py:208: in setup_method
>     self._start_impala_cluster(cluster_args, **kwargs)
> common/custom_cluster_test_suite.py:330: in _start_impala_cluster
>     check_call(cmd + options, close_fds=True)
> ../toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/subprocess.py:190: in check_call
>     raise CalledProcessError(retcode, cmd)
> E   CalledProcessError: Command 
> '['/home/ubuntu/Impala/bin/start-impala-cluster.py', 
> '--state_store_args=--statestore_update_frequency_ms=50     
> --statestore_priority_update_frequency_ms=50     
> --statestore_heartbeat_frequency_ms=50', '--cluster_size=3', 
> '--num_coordinators=3', 
> '--log_dir=/home/ubuntu/Impala/logs/custom_cluster_tests', '--log_level=1', 
> '--state_store_args=--statestore_force_active=true ', 
> '--enable_statestored_ha', '--impalad_args=--default_query_options=']' 
> returned non-zero exit status 1
> {code}
> The test dies with a FATAL log entry in catalogd's log:
> {code}
> DCHECK found in log file: /home/ubuntu/Impala/logs/custom_cluster_tests/catalogd.FATAL
> {code}
> {code}
> Log file created at: 2023/11/11 23:36:24
> Running on machine: ip-172-31-52-128
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> F1111 23:36:24.798915 2270244 statestore-subscriber.cc:336] Check failed: !statestore_is_active || !statestore2_is_active
> {code}
> Offending precommit run: 
> https://jenkins.impala.io/job/ubuntu-20.04-from-scratch/874/ (preserved).
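
The FATAL entry above follows the log line format stated in the file
header ([IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg). When
triaging such logs, a line in that format can be split into fields with
a regular expression, as in the sketch below. The names GLOG_LINE and
parse_glog_line are hypothetical and not part of Impala; the pattern is
written from the format string alone and may not cover every variant.

```python
import re

# Pattern derived from: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
GLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<thread>\d+) "
    r"(?P<file>[^:\s]+):(?P<line>\d+)\] (?P<msg>.*)$"
)


def parse_glog_line(line):
    """Return a dict of fields for a matching log line, or None otherwise."""
    match = GLOG_LINE.match(line)
    return match.groupdict() if match else None


sample = (
    "F1111 23:36:24.798915 2270244 statestore-subscriber.cc:336] "
    "Check failed: !statestore_is_active || !statestore2_is_active"
)
entry = parse_glog_line(sample)
print(entry["severity"], entry["file"], entry["line"])
print(entry["msg"])
```

Filtering parsed entries on severity "F" and a message starting with
"Check failed:" is one way to pick out DCHECK failures like this one.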



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
