[ https://issues.apache.org/jira/browse/ZOOKEEPER-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Raul Gutierrez Segales updated ZOOKEEPER-2201:
----------------------------------------------
    Release Note:   (was: Merged:

http://svn.apache.org/viewvc?view=revision&revision=1683878
http://svn.apache.org/viewvc?view=revision&revision=1683930
http://svn.apache.org/viewvc?view=revision&revision=1683931

(pasting SVN URLs since the github mirror seems to be lagging).

Thanks [~dnadolny]!
)

> Network issues can cause cluster to hang due to near-deadlock
> -------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2201
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2201
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.4.6, 3.5.0
>            Reporter: Donny Nadolny
>            Assignee: Donny Nadolny
>            Priority: Critical
>             Fix For: 3.4.7, 3.5.2, 3.6.0
>
>         Attachments: ZOOKEEPER-2201-branch-34.patch, ZOOKEEPER-2201.patch,
> ZOOKEEPER-2201.patch, ZOOKEEPER-2201.patch, ZOOKEEPER-2201.patch,
> ZOOKEEPER-2201.patch
>
>
> {{DataTree.serializeNode}} synchronizes on the {{DataNode}} it is about to
> serialize then writes it out via {{OutputArchive.writeRecord}}, potentially
> to a network connection. Under default linux TCP settings, a network
> connection where the other side completely disappears will hang (blocking on
> the {{java.net.SocketOutputStream.socketWrite0}} call) for over 15 minutes.
> During this time, any attempt to create/delete/modify the {{DataNode}} will
> cause the leader to hang at the beginning of the request processor chain:
> {noformat}
> "ProcessThread(sid:5 cport:-1):" prio=10 tid=0x00000000026f1800 nid=0x379c waiting for monitor entry [0x00007fe6c2a8c000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.zookeeper.server.PrepRequestProcessor.getRecordForPath(PrepRequestProcessor.java:163)
>         - waiting to lock <0x00000000d4cd9e28> (a org.apache.zookeeper.server.DataNode)
>         - locked <0x00000000d2ef81d0> (a java.util.ArrayList)
>         at org.apache.zookeeper.server.PrepRequestProcessor.pRequest2Txn(PrepRequestProcessor.java:345)
>         at org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:534)
>         at org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:131)
> {noformat}
> Additionally, any attempt to send a snapshot to a follower or to disk will
> hang.
> Because the ping packets are sent by another thread which is unaffected,
> followers never time out and become leader, even though the cluster will make
> no progress until either the leader is killed or the TCP connection times
> out. This isn't exactly a deadlock since it will resolve itself eventually,
> but as mentioned above this will take > 15 minutes with the default TCP retry
> settings in linux.
> A simple solution to this is: in {{DataTree.serializeNode}} we can take a
> copy of the contents of the {{DataNode}} (as is done with its children) in
> the synchronized block, then call {{writeRecord}} with the copy of the
> {{DataNode}} outside of the original {{DataNode}} synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
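
For illustration, the copy-then-write approach proposed in the description might look roughly like the sketch below. This is not the committed patch and does not use ZooKeeper's classes: `Node`, `CopyThenWriteSketch`, and this `serializeNode` signature are hypothetical stand-ins, and a plain `OutputStream` takes the place of `OutputArchive`. The only point it tries to show is that the node's monitor is held for an in-memory copy, while the potentially blocking write happens after the monitor is released.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CopyThenWriteSketch {
    /** Hypothetical stand-in for a tree node; this is NOT ZooKeeper's DataNode. */
    static final class Node {
        private byte[] data;

        Node(byte[] data) {
            this.data = data;
        }

        // Writers lock the node itself, like the request processor in the report.
        synchronized void setData(byte[] newData) {
            this.data = newData;
        }
    }

    /** Copies the node's state under its monitor, then writes without holding it. */
    static void serializeNode(Node node, OutputStream out) throws IOException {
        byte[] copy;
        synchronized (node) {
            // Short critical section: only an in-memory copy happens here.
            copy = Arrays.copyOf(node.data, node.data.length);
        }
        // The write may block for a long time on a dead connection, but the
        // monitor is already released, so threads locking this node can proceed.
        out.write(copy);
    }

    public static void main(String[] args) throws IOException {
        Node node = new Node("hello".getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        serializeNode(node, out);
        System.out.println(out.toString("UTF-8"));
    }
}
```

The trade-off in this sketch is one extra copy per serialized node in exchange for never holding a node's lock across I/O.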