[jira] [Commented] (HBASE-3708) createAndFailSilent is not so silent; leaves lots of logging in ensemble logs
[ https://issues.apache.org/jira/browse/HBASE-3708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019464#comment-13019464 ]

Dmitriy V. Ryaboy commented on HBASE-3708:
------------------------------------------

I posted the patch a while back, but I guess something was wrong with the RB integration? Here it is: https://review.cloudera.org/r/1672/

> createAndFailSilent is not so silent; leaves lots of logging in ensemble logs
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-3708
>                 URL: https://issues.apache.org/jira/browse/HBASE-3708
>             Project: HBase
>          Issue Type: Bug
>          Components: zookeeper
>    Affects Versions: 0.90.1
>            Reporter: stack
>            Assignee: Dmitriy V. Ryaboy
>
> Clients on startup create a ZKWatcher instance. Part of construction is a
> check that the hbase dirs are all up in zk. It's done by making the following
> call:
> http://hbase.apache.org/xref/org/apache/hadoop/hbase/zookeeper/ZKUtil.html#898
> A user complains that it's making for lots of logging every second over on the
> zk ensemble:
> 14:59 seeing lots of these in the ZK log though, dozens per second of
> "Got user-level KeeperException when processing sessionid:0x42daa1daab0ecbe
> type:create cxid:0x1 zxid:0xfffe txntype:unknown reqpath:n/a
> Error Path:/hbase Error:KeeperErrorCode = NodeExists for /hbase"

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
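The issue above comes down to a blind create: each client attempts to create /hbase, the client swallows the NodeExists failure, but the ensemble has already logged it. A minimal sketch of one way to avoid that is to check for the node before creating it. This is an illustration only, not necessarily what the linked patch does. `FakeEnsemble`, `createBlind`, `createIfMissing`, and `logLinesAfter` are hypothetical names, and `FakeEnsemble` is an in-memory stand-in rather than the real ZooKeeper client API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CreateSilentSketch {
    /** Hypothetical in-memory stand-in for a ZooKeeper ensemble; NOT the real client API. */
    static class FakeEnsemble {
        final Set<String> nodes = new HashSet<>();
        final List<String> serverLog = new ArrayList<>();

        boolean exists(String path) {
            return nodes.contains(path);
        }

        /** Returns false, and records a server-side log line, if the node already exists. */
        boolean create(String path) {
            if (!nodes.add(path)) {
                serverLog.add("NodeExists for " + path);
                return false;
            }
            return true;
        }
    }

    /** Blind create: the client ignores the failure, but the server has already logged it. */
    static void createBlind(FakeEnsemble zk, String path) {
        zk.create(path);
    }

    /** Check first: only attempt the create when the node appears to be missing. */
    static void createIfMissing(FakeEnsemble zk, String path) {
        if (!zk.exists(path)) {
            // With a real ensemble, another client could create the node between the
            // check and this call, so a NodeExists failure must still be tolerated.
            zk.create(path);
        }
    }

    /** Simulates `clients` sequential start-ups and returns the number of server log lines. */
    static int logLinesAfter(boolean checkFirst, int clients) {
        FakeEnsemble zk = new FakeEnsemble();
        for (int i = 0; i < clients; i++) {
            if (checkFirst) {
                createIfMissing(zk, "/hbase");
            } else {
                createBlind(zk, "/hbase");
            }
        }
        return zk.serverLog.size();
    }

    public static void main(String[] args) {
        System.out.println("blind create: " + logLinesAfter(false, 5) + " server log lines");
        System.out.println("check first:  " + logLinesAfter(true, 5) + " server log lines");
    }
}
```

The check costs one extra round trip per start-up in the common case, in exchange for keeping the ensemble log quiet; the create path still has to tolerate NodeExists because the check and the create are not atomic.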
[jira] Created: (HBASE-3642) Web UI should be available during startup
Web UI should be available during startup
-----------------------------------------

                 Key: HBASE-3642
                 URL: https://issues.apache.org/jira/browse/HBASE-3642
             Project: HBase
          Issue Type: Improvement
            Reporter: Dmitriy V. Ryaboy
            Priority: Minor

Currently, HBase does not provide a web interface during its start-up period -- while it's waiting for RSes to report in, replaying logs, etc. It would be great if the Web UI was available and showed the current status.
[jira] Commented: (HBASE-3622) Deadlock in HBaseServer (JVM bug?)
[ https://issues.apache.org/jira/browse/HBASE-3622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006668#comment-13006668 ]

Dmitriy V. Ryaboy commented on HBASE-3622:
------------------------------------------

We are deploying the membar change and adding monitoring for this condition; we will update if it shows up again.

> Deadlock in HBaseServer (JVM bug?)
> ----------------------------------
>
>                 Key: HBASE-3622
>                 URL: https://issues.apache.org/jira/browse/HBASE-3622
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3622.patch
>
>
> On Dmitriy's cluster:
> {code}
> "IPC Reader 0 on port 60020" prio=10 tid=0x2aacb4a82800 nid=0x3a72 waiting on condition [0x429ba000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for <0x2aaabf5fa6d0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
>         at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
>         at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
>         at java.util.concurrent.LinkedBlockingQueue.signalNotEmpty(LinkedBlockingQueue.java:103)
>         at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:267)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:985)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:946)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:522)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:316)
>         - locked <0x2aaabf580fb0> (a org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> ...
> "IPC Server handler 29 on 60020" daemon prio=10 tid=0x2aacbc163800 nid=0x3acc waiting on condition [0x462f3000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for <0x2aaabf5e3800> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1025)
> "IPC Server handler 28 on 60020" daemon prio=10 tid=0x2aacbc161800 nid=0x3acb waiting on condition [0x461f2000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for <0x2aaabf5e3800> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1025)
> ...
> {code}
> This region server stayed in this state for hours. The reader is waiting to
> put and the handlers are waiting to take, and they wait on different lock
> ids. It reminds me of the UseMembar thing about the JVM sometimes failing to
> notify waiters. In any case, that RS needed to be closed in order to get out
> of that state.
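For context on the trace above: `java.util.concurrent.LinkedBlockingQueue` keeps separate put and take locks, and `take()` waits on a "not empty" condition while `put()` signals it via `signalNotEmpty`. So seeing the reader and the handlers parked on different objects is expected on its own; the anomaly is that neither side ever makes progress. The sketch below shows the normal handoff and returns how many items were consumed before a timeout, which is one rough way to watch for a stall. `QueueHandoffSketch`, `runHandoff`, and the capacity, counts, and timeout are all hypothetical choices for illustration, not HBase code or the monitoring mentioned in the comment.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class QueueHandoffSketch {
    /**
     * Starts `handlers` consumer threads, puts `calls` items from the calling thread,
     * and returns how many items had been handled when the timeout expired
     * (or when all were handled, whichever came first).
     */
    static int runHandoff(int handlers, int calls, long timeoutMs) {
        LinkedBlockingQueue<Integer> callQueue = new LinkedBlockingQueue<>(10);
        AtomicInteger handled = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(calls);

        for (int i = 0; i < handlers; i++) {
            Thread handler = new Thread(() -> {
                try {
                    while (true) {
                        callQueue.take();            // parks while the queue is empty
                        handled.incrementAndGet();
                        done.countDown();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "handler-" + i);
            handler.setDaemon(true);                 // do not keep the JVM alive
            handler.start();
        }

        try {
            for (int c = 0; c < calls; c++) {
                callQueue.put(c);                    // blocks when full; signals waiting takers
            }
            done.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handled.get();
    }

    public static void main(String[] args) {
        int handled = runHandoff(3, 100, 5000);
        System.out.println("handled " + handled + " of 100 calls");
    }
}
```

On a healthy JVM every item is handed off well within the timeout. A count that stays below the number of calls would correspond to the stuck state in the trace, where items cannot be enqueued or dequeued even though both sides are waiting.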