[jira] [Commented] (HBASE-3708) createAndFailSilent is not so silent; leaves lots of logging in ensemble logs

2011-04-13 Thread Dmitriy V. Ryaboy (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-3708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019464#comment-13019464 ]

Dmitriy V. Ryaboy commented on HBASE-3708:
--

I posted the patch a while back, but I guess something was wrong with the RB 
integration? Here it is: https://review.cloudera.org/r/1672/

> createAndFailSilent is not so silent; leaves lots of logging in ensemble logs
> -
>
> Key: HBASE-3708
> URL: https://issues.apache.org/jira/browse/HBASE-3708
> Project: HBase
> Issue Type: Bug
> Components: zookeeper
> Affects Versions: 0.90.1
> Reporter: stack
> Assignee: Dmitriy V. Ryaboy
>
> Clients on startup create a ZKWatcher instance.  Part of construction is a
> check that the hbase dirs are all up in zk.  It's done by making the following
> call:
> http://hbase.apache.org/xref/org/apache/hadoop/hbase/zookeeper/ZKUtil.html#898
> A user complains that it's making for lots of logging every second over on the
> zk ensemble:
> 14:59 seeing lots of these in the ZK log though, dozens per second of 
> "Got user-level KeeperException when processing sessionid:0x42daa1daab0ecbe 
> type:create cxid:0x1 zxid:0xfffe txntype:unknown reqpath:n/a 
> Error Path:/hbase Error:KeeperErrorCode = NodeExists for /hbase"
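
A minimal sketch of why a create-and-ignore call can be silent for the client
yet noisy for the ensemble, and of one possible mitigation: checking for the
node before creating it. FakeEnsemble and SilentCreate are hypothetical
stand-ins written for this illustration, not the ZooKeeper client API or the
code in ZKUtil, and the sketch assumes that a read-only existence check
produces no server-side error entry. It is not claimed to be the patch posted
for this issue.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory stand-in for a ZooKeeper ensemble (not the real API). */
final class FakeEnsemble {
    private final Set<String> nodes = new HashSet<>();
    private int loggedErrors = 0; // models server-side "NodeExists" log lines

    /** Read-only check; assumed to leave no error entry in the server log. */
    boolean exists(String path) {
        return nodes.contains(path);
    }

    /** Returns false, and records one server-side log line, if the node exists. */
    boolean create(String path) {
        if (!nodes.add(path)) {
            loggedErrors++;
            return false;
        }
        return true;
    }

    int loggedErrors() {
        return loggedErrors;
    }
}

final class SilentCreate {
    /** Create and swallow the failure: quiet for the client, noisy for the server. */
    static void createAndIgnoreFailure(FakeEnsemble zk, String path) {
        zk.create(path);
    }

    /** Check first, so the common "already there" case issues no failing create. */
    static void createIfAbsent(FakeEnsemble zk, String path) {
        if (!zk.exists(path)) {
            zk.create(path); // can still fail if another client wins the race
        }
    }

    public static void main(String[] args) {
        FakeEnsemble noisy = new FakeEnsemble();
        FakeEnsemble quiet = new FakeEnsemble();
        // Five clients start up and each makes sure /hbase is present.
        for (int client = 0; client < 5; client++) {
            createAndIgnoreFailure(noisy, "/hbase");
            createIfAbsent(quiet, "/hbase");
        }
        System.out.println("create-and-ignore logged errors: " + noisy.loggedErrors());
        System.out.println("check-then-create logged errors: " + quiet.loggedErrors());
    }
}
```

The check-then-create variant still has to tolerate NodeExists, since another
client can create the node between the check and the create; it only removes
the failing create from the common path.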

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] Created: (HBASE-3642) Web UI should be available during startup

2011-03-14 Thread Dmitriy V. Ryaboy (JIRA)
Web UI should be available during startup
-

Key: HBASE-3642
URL: https://issues.apache.org/jira/browse/HBASE-3642
Project: HBase
Issue Type: Improvement
Reporter: Dmitriy V. Ryaboy
Priority: Minor


Currently, HBase does not provide a web interface during its start-up period -- 
while it's waiting for RSes to report in, replaying logs, etc. It would be 
great if the Web UI was available and showed the current status.



[jira] Commented: (HBASE-3622) Deadlock in HBaseServer (JVM bug?)

2011-03-14 Thread Dmitriy V. Ryaboy (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-3622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006668#comment-13006668 ]

Dmitriy V. Ryaboy commented on HBASE-3622:
--

We are deploying the membar change and adding monitoring for this condition;
will update if it shows up again.

> Deadlock in HBaseServer (JVM bug?)
> --
>
> Key: HBASE-3622
> URL: https://issues.apache.org/jira/browse/HBASE-3622
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.1
> Reporter: Jean-Daniel Cryans
> Priority: Critical
> Fix For: 0.92.0
>
> Attachments: HBASE-3622.patch
>
>
> On Dmitriy's cluster:
> {code}
> "IPC Reader 0 on port 60020" prio=10 tid=0x2aacb4a82800 nid=0x3a72 waiting on condition [0x429ba000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x2aaabf5fa6d0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
>         at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
>         at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
>         at java.util.concurrent.LinkedBlockingQueue.signalNotEmpty(LinkedBlockingQueue.java:103)
>         at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:267)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:985)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:946)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:522)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:316)
>         - locked <0x2aaabf580fb0> (a org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> ...
> "IPC Server handler 29 on 60020" daemon prio=10 tid=0x2aacbc163800 nid=0x3acc waiting on condition [0x462f3000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x2aaabf5e3800> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1025)
> "IPC Server handler 28 on 60020" daemon prio=10 tid=0x2aacbc161800 nid=0x3acb waiting on condition [0x461f2000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x2aaabf5e3800> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1025)
> ...
> {code}
> This region server stayed in this state for hours. The reader is waiting to
> put and the handlers are waiting to take, and they wait on different lock
> ids. It reminds me of the UseMembar thing about the JVM sometimes failing to
> notify waiters. In any case, that RS had to be shut down to get out of that
> state.
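
For context on the dump above: java.util.concurrent.LinkedBlockingQueue guards
puts and takes with separate locks, and consumers blocked in take() park on a
condition object rather than on the lock itself, so differing ids between the
reader and the handlers are expected on their own. The sketch below shows one
way monitoring for parked threads could look, using the standard ThreadMXBean
API to report what a thread is parked on. ParkedThreadReport and its method
names are hypothetical, and this is not the monitoring deployed on the cluster.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.concurrent.LinkedBlockingQueue;

final class ParkedThreadReport {
    /** Name of the lock or condition the thread is parked on, or null if none. */
    static String parkedOn(Thread t) {
        ThreadInfo info = ManagementFactory.getThreadMXBean().getThreadInfo(t.getId());
        return info == null ? null : info.getLockName();
    }

    /** Polls until the thread reaches WAITING or the timeout elapses. */
    static boolean awaitWaiting(Thread t, long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (t.getState() != Thread.State.WAITING) {
            if (System.currentTimeMillis() > deadline) {
                return false;
            }
            Thread.sleep(10);
        }
        return true;
    }

    /** Parks a handler-like thread in take() and reports what it waits on. */
    static String takeBlockerName() throws InterruptedException {
        LinkedBlockingQueue<String> callQueue = new LinkedBlockingQueue<>();
        Thread handler = new Thread(() -> {
            try {
                callQueue.take(); // blocks while the queue is empty
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "demo handler");
        handler.setDaemon(true);
        handler.start();

        String blocker = awaitWaiting(handler, 5000) ? parkedOn(handler) : null;

        callQueue.put("call"); // a healthy put wakes the parked handler
        handler.join(5000);
        return blocker;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("handler was parked on: " + takeBlockerName());
    }
}
```

A real monitor would sample all threads periodically and alert only when a
reader stays parked inside put() while handlers sit in take() for longer than
some threshold; the threshold and the alerting path are left out here.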
