Yong-Hao Zou created ZOOKEEPER-4407:
---------------------------------------

             Summary: Zookeeper crashes after commit fail
                 Key: ZOOKEEPER-4407
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4407
             Project: ZooKeeper
          Issue Type: Improvement
          Components: server
    Affects Versions: 3.7.0
            Reporter: Yong-Hao Zou


I got the following "Severe unrecoverable error" because of a transient 
file-write error, and the server exited.
{code:java}
2021-11-01 10:55:41,215 [myid:4] - ERROR [SyncThread:4:ZooKeeperCriticalThread@49] - Severe unrecoverable error, from thread : SyncThread:4
java.io.IOException: Write error
        at java.base/java.io.FileOutputStream.writeBytes(Native Method)
        at java.base/java.io.FileOutputStream.write(FileOutputStream.java:354)
        at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81)
        at java.base/java.io.BufferedOutputStream.flush(BufferedOutputStream.java:142)
        at org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:377)
        at org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:599)
        at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:657)
        at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:235)
        at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:169)
{code}
I think this behavior was introduced by design in https://issues.apache.org/jira/browse/ZOOKEEPER-2247.

But is it a good design for the server to exit because of a single failed commit?

I think it would be better to just let the commit fail, or have the leader 
step down to follower, and keep the server running.
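
To make the suggestion concrete, here is a minimal sketch of the control flow 
I have in mind. It is not ZooKeeper's actual code: the class, the Commit 
interface, the Outcome enum and commitOrStepDown are all hypothetical names 
made up for illustration, and the bounded retry is my own addition on top of 
"let the commit fail". Whether retrying is safe after a partial write is a 
separate question that this sketch does not address.

```java
import java.io.IOException;

// Hypothetical sketch only: none of these names exist in ZooKeeper.
class CommitFailureSketch {

    /** A commit step that may fail with an I/O error (stands in for a txn-log flush). */
    interface Commit {
        void run() throws IOException;
    }

    /** What the caller should do after the commit attempts. */
    enum Outcome { COMMITTED, STEPPED_DOWN }

    /**
     * Runs the commit up to maxAttempts times. If every attempt fails, returns
     * STEPPED_DOWN so the caller can give up leadership instead of exiting.
     */
    static Outcome commitOrStepDown(Commit commit, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                commit.run();
                return Outcome.COMMITTED;
            } catch (IOException e) {
                // Log and fall through to the next attempt rather than killing the process.
                System.err.println("commit attempt " + attempt + " failed: " + e.getMessage());
            }
        }
        return Outcome.STEPPED_DOWN;
    }

    public static void main(String[] args) {
        // Simulate a transient error: the first call fails, the second succeeds.
        int[] calls = {0};
        Outcome outcome = commitOrStepDown(() -> {
            if (calls[0]++ == 0) {
                throw new IOException("Write error");
            }
        }, 3);
        System.out.println(outcome);
    }
}
```

The point is only that the I/O error is turned into a return value the caller 
can act on (for example by stepping down), instead of going straight to a 
process exit.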



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
