Hi, I'm running HBase 1.2.0 on FreeBSD via the ports system ( http://www.freshports.org/databases/hbase/), and it is generally working well. However, in an HA setup, the HBase master spins at 200% CPU usage when it is active and this follows the active master and disappears when standby. Since this cluster is fairly idle, and an older production Linux one uses much less master CPU, I am curious what is going on.
Using visualvm, I can see that the ProcedureExecutor threads are quite busy. Using dtrace, I can see that there is tons of native umtx activity. I am guessing that some procedure is failing, and continuously retrying as fast as the procedure dispatch allows. Using dtrace, I can also see that a lot of time seems to be spent in the JVM's 'libzip.so' native library. I'm wondering if it's a classloader run amok or something. I need to rebuild the JVM with debugging to get more out of dtrace, but the JVM doesn't implement a dtrace ustack helper for FreeBSD like it does on Solarish. Hopefully then I can get some idea of what is going on. Does this speak to anyone for ideas to look into? Other than noticing the CPU usage in top, the master seems to function fine and is responsive. Here's a sample thread dump, this doesn't really jump out to me, but I don't know if any of my captures are at the right moment either: "ProcedureExecutor-7" - Thread t@116 java.lang.Thread.State: WAITING at sun.misc.Unsafe.park(Native Method) - parking to wait for <4653d04c> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) at org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler.poll(MasterProcedureScheduler.java:145) at org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler.poll(MasterProcedureScheduler.java:139) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:803) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$400(ProcedureExecutor.java:75) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$2.run(ProcedureExecutor.java:494) Locked ownable synchronizers: - None "ProcedureExecutor-5" - Thread t@114 java.lang.Thread.State: WAITING at sun.misc.Unsafe.park(Native Method) - parking to wait for <4653d04c> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) at org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler.poll(MasterProcedureScheduler.java:145) at org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler.poll(MasterProcedureScheduler.java:139) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:803) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$400(ProcedureExecutor.java:75) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$2.run(ProcedureExecutor.java:494) Locked ownable synchronizers: - None "ProcedureExecutor-4" - Thread t@113 java.lang.Thread.State: WAITING at sun.misc.Unsafe.park(Native Method) - parking to wait for <4653d04c> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) at org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler.poll(MasterProcedureScheduler.java:145) at org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler.poll(MasterProcedureScheduler.java:139) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:803) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$400(ProcedureExecutor.java:75) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$2.run(ProcedureExecutor.java:494) Locked ownable synchronizers: - None "ProcedureExecutor-3" - Thread t@112 java.lang.Thread.State: WAITING at sun.misc.Unsafe.park(Native Method) - parking to wait for <4653d04c> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) at org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler.poll(MasterProcedureScheduler.java:145) at org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler.poll(MasterProcedureScheduler.java:139) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:803) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$400(ProcedureExecutor.java:75) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$2.run(ProcedureExecutor.java:494) Locked ownable synchronizers: - None "ProcedureExecutor-2" - Thread t@111 java.lang.Thread.State: WAITING at sun.misc.Unsafe.park(Native Method) - parking to wait for <4653d04c> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) at org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler.poll(MasterProcedureScheduler.java:145) at org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler.poll(MasterProcedureScheduler.java:139) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:803) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$400(ProcedureExecutor.java:75) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$2.run(ProcedureExecutor.java:494) Locked ownable synchronizers: - None "ProcedureExecutor-1" - Thread t@110 java.lang.Thread.State: RUNNABLE at org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler.poll(MasterProcedureScheduler.java:139) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:803) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$400(ProcedureExecutor.java:75) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$2.run(ProcedureExecutor.java:494) Locked ownable synchronizers: - None "ProcedureExecutor-0" - Thread t@109 java.lang.Thread.State: WAITING at sun.misc.Unsafe.park(Native Method) - parking to wait for <4653d04c> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) at org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler.poll(MasterProcedureScheduler.java:145) at org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler.poll(MasterProcedureScheduler.java:139) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:803) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$400(ProcedureExecutor.java:75) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$2.run(ProcedureExecutor.java:494) Locked ownable synchronizers: - None