Yong-Hao Zou created ZOOKEEPER-4407:
---------------------------------------
Summary: Zookeeper crashes after commit fail
Key: ZOOKEEPER-4407
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4407
Project: ZooKeeper
Issue Type: Improvement
Components: server
Affects Versions: 3.7.0
Reporter: Yong-Hao Zou
I got the following "Severe unrecoverable error" after a transient file write
error, and the server exited.
{code:java}
2021-11-01 10:55:41,215 [myid:4] - ERROR [SyncThread:4:ZooKeeperCriticalThread@49] - Severe unrecoverable error, from thread : SyncThread:4
java.io.IOException: Write error
    at java.base/java.io.FileOutputStream.writeBytes(Native Method)
    at java.base/java.io.FileOutputStream.write(FileOutputStream.java:354)
    at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81)
    at java.base/java.io.BufferedOutputStream.flush(BufferedOutputStream.java:142)
    at org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:377)
    at org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:599)
    at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:657)
    at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:235)
    at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:169)
{code}
I think this behavior is by design, introduced in
https://issues.apache.org/jira/browse/ZOOKEEPER-2247.
But is it good design for the server to exit because of a single failed commit?
I think it would be better to just let the commit fail, or let the leader step
down to follower, and keep the server running.
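
As a rough illustration of the "let the commit fail" option, the sketch below retries a failing commit a bounded number of times and only gives up if every attempt fails. It does not use any ZooKeeper class: `CommitRetrySketch`, `Commit`, and `commitWithRetry` are hypothetical names, and the sketch assumes that retrying a failed flush is safe, which would still need to be verified for the real transaction log (a partially written buffer could otherwise be written twice).

```java
import java.io.IOException;

public class CommitRetrySketch {

    /** Hypothetical stand-in for a transaction log flush; not a ZooKeeper interface. */
    @FunctionalInterface
    public interface Commit {
        void run() throws IOException;
    }

    /**
     * Runs the commit up to maxAttempts times.
     * Returns the number of attempts used on success,
     * or rethrows the last IOException if every attempt failed.
     */
    public static int commitWithRetry(Commit commit, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                commit.run();
                return attempt;
            } catch (IOException e) {
                // Treat the error as possibly transient and try again.
                last = e;
            }
        }
        // All attempts failed: the caller decides whether to step down or exit.
        throw last;
    }

    public static void main(String[] args) throws IOException {
        int[] calls = {0};
        // Simulated commit that fails once with a transient write error.
        int attempts = commitWithRetry(() -> {
            if (calls[0]++ == 0) {
                throw new IOException("Write error");
            }
        }, 3);
        System.out.println("committed after " + attempts + " attempt(s)");
    }
}
```

With this shape, the decision to exit (or to step down from leader) moves to the caller and is taken only after the retries are exhausted, instead of on the first `IOException`.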