[jira] [Commented] (HBASE-26480) Close NamedQueueRecorder to allow HMaster/RS to shutdown gracefully
[ https://issues.apache.org/jira/browse/HBASE-26480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17449234#comment-17449234 ]

Rushabh Shah commented on HBASE-26480:
--------------------------------------

Thank you [~Xiaolin Ha] for the review and merge!

> Close NamedQueueRecorder to allow HMaster/RS to shutdown gracefully
> -------------------------------------------------------------------
>
>                 Key: HBASE-26480
>                 URL: https://issues.apache.org/jira/browse/HBASE-26480
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.7.0
>            Reporter: Rushabh Shah
>            Assignee: Rushabh Shah
>            Priority: Major
>             Fix For: 1.8.0
>
>
> Saw one case in our production cluster where the RS was not exiting. Saw this non-daemon thread in the hung RS stack trace:
> {noformat}
> "main.slowlog.append-pool-pool1-t1" #26 prio=5 os_prio=31 tid=0x7faf23bf7800 nid=0x6d07 waiting on condition [0x73f4d000]
>    java.lang.Thread.State: WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for <0x0004039e3840> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
>       at com.lmax.disruptor.BlockingWaitStrategy.waitFor(BlockingWaitStrategy.java:47)
>       at com.lmax.disruptor.ProcessingSequenceBarrier.waitFor(ProcessingSequenceBarrier.java:56)
>       at com.lmax.disruptor.BatchEventProcessor.processEvents(BatchEventProcessor.java:159)
>       at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:125)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> {noformat}
>
> This thread comes from the [NamedQueueRecorder|https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/namequeues/NamedQueueRecorder.java#L65] implementation.
> This bug does not exist in branch-2 and master, since the Disruptor initialization has changed there and we also set daemon=true. See [this code|https://github.com/apache/hbase/blob/branch-2/hbase-server/src/main/java/org/apache/hadoop/hbase/namequeues/NamedQueueRecorder.java#L68]
>
> FYI [~vjasani] [~zhangduo]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
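
The failure mode described above (a non-daemon pool thread parked forever keeps the JVM alive) can be reproduced and avoided without the Disruptor at all. The following is a minimal sketch, not the actual HBASE-26480 patch: the class `RecorderShutdownSketch`, the nested `QueueRecorder`, and its `add`/`close`/`isTerminated` methods are hypothetical names chosen for illustration, and a plain `LinkedBlockingQueue` stands in for the Disruptor ring buffer. It shows the two remedies mentioned in the issue: marking the worker thread as a daemon, and explicitly closing the recorder so its executor is shut down.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class RecorderShutdownSketch {

  /** Hypothetical stand-in for a recorder that owns one background consumer thread. */
  static final class QueueRecorder implements AutoCloseable {
    private final LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final ExecutorService executor;

    QueueRecorder(boolean daemon) {
      AtomicInteger counter = new AtomicInteger();
      // Remedy 1: a daemon worker thread does not block JVM exit.
      ThreadFactory factory = runnable -> {
        Thread t = new Thread(runnable, "slowlog-append-sketch-" + counter.incrementAndGet());
        t.setDaemon(daemon);
        return t;
      };
      this.executor = Executors.newSingleThreadExecutor(factory);
      this.executor.submit(this::consume);
    }

    /** Blocks on the queue, much like the parked thread in the stack trace. */
    private void consume() {
      try {
        while (true) {
          queue.take(); // parks until an event arrives or the thread is interrupted
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // restore the flag and let the worker exit
      }
    }

    void add(String event) {
      queue.offer(event);
    }

    /** Remedy 2: interrupt the parked worker and wait for the pool to terminate. */
    @Override
    public void close() throws InterruptedException {
      executor.shutdownNow();
      executor.awaitTermination(5, TimeUnit.SECONDS);
    }

    boolean isTerminated() {
      return executor.isTerminated();
    }
  }

  public static void main(String[] args) throws Exception {
    // Non-daemon worker: without close(), this JVM would never exit.
    QueueRecorder recorder = new QueueRecorder(false);
    recorder.add("slow-rpc");
    recorder.close();
    System.out.println("terminated=" + recorder.isTerminated());
  }
}
```

If the `recorder.close()` call is removed from `main` while `daemon` is `false`, the process hangs after `main` returns, which mirrors the hung RS described in the issue. Either remedy alone is enough to let the JVM exit; an explicit close additionally gives the owner control over when the worker stops.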
[jira] [Commented] (HBASE-26480) Close NamedQueueRecorder to allow HMaster/RS to shutdown gracefully
[ https://issues.apache.org/jira/browse/HBASE-26480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448687#comment-17448687 ]

Hudson commented on HBASE-26480:
--------------------------------

Results for branch branch-1
	[build #184 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-1/184/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-1/184//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-1/184//JDK7_Nightly_Build_Report/]

(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-1/184//JDK8_Nightly_Build_Report_(Hadoop2)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.