[ https://issues.apache.org/jira/browse/HBASE-6018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276787#comment-13276787 ]
Jonathan Hsieh commented on HBASE-6018:
---------------------------------------

This line seems related to an attempt to enqueue a work item into a SynchronousQueue introduced in HBASE-4859. I don't understand why a SynchronousQueue is used (it has no capacity!)

Problem goes away after this change:

{code}
diff --git a/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java b/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
index 83aa316..8a050fd 100644
--- a/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
+++ b/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
@@ -33,7 +33,8 @@ import java.util.SortedSet;
 import java.util.TreeMap;
 import java.util.TreeSet;
 import java.util.concurrent.ConcurrentSkipListMap;
-import java.util.concurrent.SynchronousQueue;
+//import java.util.concurrent.SynchronousQueue;
+import java.util.concurrent.LinkedBlockingQueue;
 import java.util.concurrent.ThreadPoolExecutor;
 import java.util.concurrent.TimeUnit;
 import java.util.concurrent.atomic.AtomicInteger;
@@ -217,9 +218,9 @@ public class HBaseFsck {
     this.conf = conf;
     int numThreads = conf.getInt("hbasefsck.numthreads", MAX_NUM_THREADS);
-    executor = new ThreadPoolExecutor(1, numThreads,
+    executor = new ThreadPoolExecutor(numThreads, numThreads,
         THREADS_KEEP_ALIVE_SECONDS, TimeUnit.SECONDS,
-        new SynchronousQueue<Runnable>());
+        new LinkedBlockingQueue<Runnable>());
     executor.allowCoreThreadTimeOut(true);
   }
{code}

> hbck fails with a RejectedExecutionException
> --------------------------------------------
>
>                 Key: HBASE-6018
>                 URL: https://issues.apache.org/jira/browse/HBASE-6018
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1, 0.94.0
>            Reporter: Jonathan Hsieh
>            Assignee: Jonathan Hsieh
>
> On a long running job 0.94.0rc3 cluster, we get to a point where hbck consistently encounters this error and fails:
> {code}
> Exception in thread "main" java.util.concurrent.RejectedExecutionException
>     at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:1768)
>     at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
>     at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
>     at org.apache.hadoop.hbase.util.HBaseFsck.loadHdfsRegionInfos(HBaseFsck.java:633)
>     at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:354)
>     at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:382)
>     at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3120)
> {code}
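For anyone trying to reproduce the rejection outside of hbck, the following is a minimal standalone sketch of the behavior described in the comment above. It is not HBaseFsck code: the class name QueueCapacityDemo, the helper countRejected, the pool sizes (1 core / 2 max), and the 60-second keep-alive are all made up for illustration. It submits tasks that block until released, so every worker thread stays busy while further tasks arrive.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; names and sizes are illustrative assumptions.
class QueueCapacityDemo {

  /** Submits `tasks` blocking tasks and returns how many the pool rejected. */
  static int countRejected(BlockingQueue<Runnable> queue, int core, int max, int tasks) {
    ThreadPoolExecutor executor =
        new ThreadPoolExecutor(core, max, 60, TimeUnit.SECONDS, queue);
    // Every task waits on this latch, so worker threads stay occupied
    // for the whole submission loop.
    final CountDownLatch release = new CountDownLatch(1);
    int rejected = 0;
    for (int i = 0; i < tasks; i++) {
      try {
        executor.execute(() -> {
          try {
            release.await();
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        });
      } catch (RejectedExecutionException e) {
        // Default AbortPolicy: thrown when the task can neither be queued
        // nor handed to a new thread.
        rejected++;
      }
    }
    // Unblock the accepted tasks and wait for the pool to drain.
    release.countDown();
    executor.shutdown();
    try {
      executor.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return rejected;
  }

  public static void main(String[] args) {
    // Original-style setup: 1 core thread, 2 max, zero-capacity hand-off queue.
    System.out.println("SynchronousQueue rejected: "
        + countRejected(new SynchronousQueue<Runnable>(), 1, 2, 3));
    // Patched-style setup: core == max, unbounded queue.
    System.out.println("LinkedBlockingQueue rejected: "
        + countRejected(new LinkedBlockingQueue<Runnable>(), 2, 2, 3));
  }
}
```

With these sizes the first call should report one rejection (two threads busy, nowhere to park the third task) and the second call none. Note that with an unbounded LinkedBlockingQueue a ThreadPoolExecutor never grows past its core size, which is presumably why the patch also raises the core size from 1 to numThreads.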