Dear Zeppeliners,

1. With Zeppelin 0.6.0, I found that the server is very likely to hang at synchronized (noteSocketMap) in NotebookServer.broadcast. For example:

"Thread-133" #189 prio=5 os_prio=0 tid=0x00007efc14001000 nid=0x7d3f waiting for monitor entry [0x00007efc56cef000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.zeppelin.socket.NotebookServer.broadcast(NotebookServer.java:301)
        - waiting to lock <0x00007efe1d340c90> (a java.util.HashMap)
        at org.apache.zeppelin.socket.NotebookServer.broadcastNote(NotebookServer.java:389)
        at org.apache.zeppelin.socket.NotebookServer$ParagraphListenerImpl.afterStatusChange(NotebookServer.java:1148)
        at org.apache.zeppelin.scheduler.Job.setStatus(Job.java:150)
        at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.afterStatusChange(RemoteScheduler.java:408)
        at org.apache.zeppelin.scheduler.RemoteScheduler$JobStatusPoller.getStatus(RemoteScheduler.java:272)
        - locked <0x00007f08e9f9fb50> (a org.apache.zeppelin.scheduler.RemoteScheduler$JobStatusPoller)
        at org.apache.zeppelin.scheduler.RemoteScheduler$JobStatusPoller.run(RemoteScheduler.java:211)

2. I then found that the root cause is in Jetty: org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block hangs inside NotebookSocket.send while the thread still holds the lock on that same HashMap (<0x00007efe1d340c90>), so Zeppelin hangs as well:

"Thread-132" #187 prio=5 os_prio=0 tid=0x00007efc7001e800 nid=0x7d3a waiting on condition [0x00007efcc0bd6000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00007f08e9ca9de8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:219)
        at org.eclipse.jetty.websocket.common.BlockingWriteCallback$WriteBlocker.block(BlockingWriteCallback.java:83)
        at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.blockingWrite(WebSocketRemoteEndpoint.java:107)
        at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.sendString(WebSocketRemoteEndpoint.java:387)
        at org.apache.zeppelin.socket.NotebookSocket.send(NotebookSocket.java:69)
        at org.apache.zeppelin.socket.NotebookServer.broadcast(NotebookServer.java:308)
        - locked <0x00007efe1d340c90> (a java.util.HashMap)
        at org.apache.zeppelin.socket.NotebookServer.access$000(NotebookServer.java:62)
        at org.apache.zeppelin.socket.NotebookServer$ParagraphListenerImpl.onProgressUpdate(NotebookServer.java:1121)
        at org.apache.zeppelin.scheduler.JobProgressPoller.run(JobProgressPoller.java:51)

3. Finally, I switched to the latest Jetty:

    <jetty.version>9.3.14.v20161028</jetty.version>

The hang has not occurred again so far. Please consider upgrading Jetty in the next release. Thank you.

Best,
Zell
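
P.S. In case it helps to see the pattern from the two traces in isolation, here is a minimal Java sketch. It is not Zeppelin's code: MessageSink, BroadcastSketch and both broadcast methods are hypothetical names made up for illustration. The first method sends while holding the map lock, as in the Thread-132 trace. The second copies the targets under the lock and sends after releasing it. This is only one possible way to keep a stuck write from blocking other broadcasters. It is not what the Jetty upgrade does, and I have not tried it in Zeppelin.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a WebSocket connection; not Zeppelin's NotebookSocket.
interface MessageSink {
    void send(String msg) throws Exception; // may block, like a blocking WebSocket write
}

// Hypothetical sketch class; names and structure are assumptions, not Zeppelin code.
class BroadcastSketch {
    // noteId -> connections; package-private so the demo can inspect the lock.
    final Map<String, List<MessageSink>> noteSocketMap = new HashMap<>();

    void addConnection(String noteId, MessageSink sink) {
        synchronized (noteSocketMap) {
            noteSocketMap.computeIfAbsent(noteId, k -> new ArrayList<>()).add(sink);
        }
    }

    // Pattern from the traces: every send runs while the map's monitor is held,
    // so one write that never returns blocks all other broadcasters.
    int broadcastHoldingLock(String noteId, String msg) {
        synchronized (noteSocketMap) {
            List<MessageSink> sinks = noteSocketMap.get(noteId);
            return sinks == null ? 0 : sendAll(sinks, msg);
        }
    }

    // Alternative: copy the targets under the lock, then send after releasing it.
    int broadcastWithSnapshot(String noteId, String msg) {
        List<MessageSink> targets = new ArrayList<>();
        synchronized (noteSocketMap) {
            List<MessageSink> sinks = noteSocketMap.get(noteId);
            if (sinks != null) {
                targets.addAll(sinks);
            }
        }
        return sendAll(targets, msg);
    }

    // Sends to each sink and returns how many sends completed without an exception.
    private static int sendAll(List<MessageSink> sinks, String msg) {
        int sent = 0;
        for (MessageSink sink : sinks) {
            try {
                sink.send(msg);
                sent++;
            } catch (Exception e) {
                // A real server would log the failure and drop the connection.
            }
        }
        return sent;
    }

    public static void main(String[] args) {
        BroadcastSketch server = new BroadcastSketch();
        // This sink reports whether the broadcasting thread holds the map lock during send.
        server.addConnection("note1", msg ->
            System.out.println(msg + ": lock held during send = "
                + Thread.holdsLock(server.noteSocketMap)));
        server.broadcastHoldingLock("note1", "holding-lock");
        server.broadcastWithSnapshot("note1", "snapshot");
    }
}
```

Running main shows that the lock is held during send in the first variant and not in the second. The trade-off of the snapshot variant is that a message may still be sent to a connection that was removed from the map just after the copy was taken.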