Binglin Chang created KUDU-1391:
-----------------------------------
Summary: 2 of 3 replica alive but failed to elect leader
Key: KUDU-1391
URL: https://issues.apache.org/jira/browse/KUDU-1391
Project: Kudu
Issue Type: Bug
Reporter: Binglin Chang
Last weekend many TS hit a lot of "too many open files" errors (haven't upgraded to ,
when I used our internal deploy tool to restart the cluster (stop all TS, then start
all TS), the control machine had an issue that seemed to block on writes to the
ssh terminal (maybe a USB driver issue, not related to this bug), so only half
(about 30) of the TS were shut down. After maybe 10 minutes, I switched to
another control host and performed the whole restart.
After that, writes were blocked because 1 tablet had no leader. According to the
web UI, 2 of its 3 replicas were in the follower state and 1 was
TABLET_DATA_TOMBSTONED, but all elections failed. I will attach the logs of the
2 followers.