[ https://issues.apache.org/jira/browse/ZOOKEEPER-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16028323#comment-16028323 ]
Flavio Junqueira commented on ZOOKEEPER-1936:
---------------------------------------------

I think it was simply closed; I had a few comments there that were never addressed.

> Server exits when unable to create data directory due to race
> --------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1936
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1936
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.6, 3.5.0
>            Reporter: Harald Musum
>            Assignee: Ted Yu
>            Priority: Minor
>             Fix For: 3.5.4, 3.6.0
>
>         Attachments: ZOOKEEPER-1936.branch-3.4.patch, ZOOKEEPER-1936.patch,
> ZOOKEEPER-1936.v2.patch, ZOOKEEPER-1936.v3.patch, ZOOKEEPER-1936.v3.patch,
> ZOOKEEPER-1936.v4.patch, ZOOKEEPER-1936.v5.patch
>
>
> We sometimes see the ZooKeeper server fail to start, with this
> error in the log:
> [2014-05-27 09:29:48.248] ERROR : -
> .org.apache.zookeeper.server.ZooKeeperServerMain Unexpected exception,
> exiting abnormally\nexception=\njava.io.IOException: Unable to create data
> directory /home/y/var/zookeeper/version-2\n\tat
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.<init>(FileTxnSnapLog.java:85)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)\n\tat
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)\n\tat
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)\n\t
> [...]
> A thread dump from the JVM gives this:
> "PurgeTask" daemon prio=10 tid=0x000000000201d000 nid=0x1727 runnable [0x00007f55d7dc7000]
>    java.lang.Thread.State: RUNNABLE
>         at java.io.UnixFileSystem.createDirectory(Native Method)
>         at java.io.File.mkdir(File.java:1310)
>         at java.io.File.mkdirs(File.java:1337)
>         at org.apache.zookeeper.server.persistence.FileTxnSnapLog.<init>(FileTxnSnapLog.java:84)
>         at org.apache.zookeeper.server.PurgeTxnLog.purge(PurgeTxnLog.java:68)
>         at org.apache.zookeeper.server.DatadirCleanupManager$PurgeTask.run(DatadirCleanupManager.java:140)
>         at java.util.TimerThread.mainLoop(Timer.java:555)
>         at java.util.TimerThread.run(Timer.java:505)
> "zookeeper server" prio=10 tid=0x00000000027df800 nid=0x1715 runnable [0x00007f55d7ed8000]
>    java.lang.Thread.State: RUNNABLE
>         at java.io.UnixFileSystem.createDirectory(Native Method)
>         at java.io.File.mkdir(File.java:1310)
>         at java.io.File.mkdirs(File.java:1337)
>         at org.apache.zookeeper.server.persistence.FileTxnSnapLog.<init>(FileTxnSnapLog.java:84)
>         at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)
>         at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
>         at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
>         at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
>         at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
> [...]
> So it seems that when autopurge is used (as it is in our case), the purge
> task can run at the same time as the server itself is starting. The
> FileTxnSnapLog constructor checks whether the data directory exists and
> creates it if it does not. When both threads do this at the same time, one
> of the mkdirs() calls fails and the server exits the JVM.
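
The race described in the quoted report can be sketched in isolation. This is only an illustration under stated assumptions, not the attached patch or the actual FileTxnSnapLog code: the class DataDirRace and the helpers createStrict and createTolerant are hypothetical names. createStrict mirrors the check-then-create pattern the report describes; createTolerant shows one common way to survive the race, by treating a failed mkdirs() as success when the directory exists afterwards.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

public class DataDirRace {

    // Hypothetical helper: check-then-create, as described in the report.
    // If another thread creates the directory between exists() and mkdirs(),
    // mkdirs() returns false and this method throws.
    static void createStrict(File dir) throws IOException {
        if (!dir.exists()) {
            if (!dir.mkdirs()) {
                throw new IOException("Unable to create data directory " + dir);
            }
        }
    }

    // Hypothetical helper: tolerate a concurrent creator. A false return from
    // mkdirs() is only an error if the directory still does not exist.
    static void createTolerant(File dir) throws IOException {
        if (!dir.mkdirs() && !dir.isDirectory()) {
            throw new IOException("Unable to create data directory " + dir);
        }
    }

    public static void main(String[] args) throws Exception {
        File base = Files.createTempDirectory("zk-race").toFile();
        File dataDir = new File(base, "version-2");

        // Two threads standing in for the purge task and server startup.
        Runnable task = () -> {
            try {
                createTolerant(dataDir);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
        Thread purge = new Thread(task, "PurgeTask");
        Thread server = new Thread(task, "zookeeper server");
        purge.start();
        server.start();
        purge.join();
        server.join();

        System.out.println("data directory exists: " + dataDir.isDirectory());
    }
}
```

Swapping createTolerant for createStrict in the task may reproduce the failure from the log, but only intermittently, since it depends on thread timing.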