Here is the related code, from BucketCache, that disables the bucket cache:

    if (this.ioErrorStartTime > 0) {
      if (cacheEnabled && (now - ioErrorStartTime) > this.ioErrorsTolerationDuration) {
        LOG.error("IO errors duration time has exceeded " + ioErrorsTolerationDuration +
          "ms, disabling cache, please check your IOEngine");
        disableCache();
      }
    }
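
For reference, the toleration window above comes from the config key
hbase.bucketcache.ioengine.errors.tolerated.duration, which defaults to
60000 ms if I am reading the 1.3 sources right. A minimal standalone sketch
of that logic (hypothetical class, window hard coded, just to illustrate
the behavior):

    public class IoErrorTolerationSketch {
      // Window before the cache is disabled; in HBase this comes from
      // hbase.bucketcache.ioengine.errors.tolerated.duration (default 60000 ms).
      private static final long TOLERATION_MS = 60_000L;

      private long ioErrorStartTime = -1; // -1 means no outstanding IO errors
      private boolean cacheEnabled = true;

      // Called on each IO error; mirrors the BucketCache snippet above.
      void onIoError() {
        long now = System.currentTimeMillis();
        if (this.ioErrorStartTime > 0) {
          if (cacheEnabled && (now - ioErrorStartTime) > TOLERATION_MS) {
            System.err.println("IO errors lasted more than " + TOLERATION_MS
                + " ms, disabling cache");
            cacheEnabled = false; // stands in for disableCache()
          }
        } else {
          this.ioErrorStartTime = now; // first error starts the clock
        }
      }

      // As I recall, the real code resets the clock after a successful IO,
      // so only a full window of continuous errors disables the cache.
      void onIoSuccess() {
        this.ioErrorStartTime = -1;
      }
    }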

Can you search the region server log to see whether the above occurred
(e.g. grep for "disabling cache, please check your IOEngine")?

Was this server the only one whose bucket cache was disabled?
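
A side note on the first stack trace below: java.nio FileChannels are
interruptible, so interrupting a thread that is blocked in a channel read
(here the hfile-prefetch thread) makes the JDK close the channel for all
users, which would explain the subsequent ClosedChannelExceptions from the
writer threads. A tiny standalone demo of that JDK behavior (hypothetical
class name, not HBase code):

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    public class InterruptClosesChannelDemo {
      public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("demo", ".bin");
        Files.write(tmp, new byte[1024]);
        final FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ);

        Thread reader = new Thread(() -> {
          try {
            // A pending interrupt makes the next channel IO close the channel
            // and throw ClosedByInterruptException, as in the first trace.
            Thread.currentThread().interrupt();
            ch.read(ByteBuffer.allocate(16), 0);
          } catch (IOException e) {
            System.out.println("reader got: " + e);
          }
        });
        reader.start();
        reader.join();

        // The channel is now closed for every thread, so later accesses fail
        // with ClosedChannelException, as in the writer-thread traces.
        System.out.println("channel still open? " + ch.isOpen()); // false
        try {
          ch.read(ByteBuffer.allocate(16), 0);
        } catch (IOException e) {
          System.out.println("main got: " + e);
        }
      }
    }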

Cheers

On Sun, Feb 25, 2018 at 6:20 AM, Saad Mufti <saad.mu...@oath.com.invalid>
wrote:

> Hi,
>
> I am running an HBase 1.3.1 cluster on AWS EMR. The bucket cache is
> configured to use two attached EBS volumes of 50 GB each; to be on the
> safe side, I provisioned the cache at a bit under the combined capacity,
> 98 GB per instance. My tables have column families set to prefetch on
> open.
>
> On some instances, the bucket cache starts throwing errors during cluster
> startup, and eventually it gets disabled entirely on that instance. The
> instance stays up as a valid region server, and the only clue in the
> region server UI is that the bucket cache tab reports a count of 0 and a
> size of 0 bytes.
>
> I have already opened a ticket with AWS to check for problems with the
> EBS volumes, but I wanted to tap the open source community's hive-mind to
> see what kind of problem could cause the bucket cache to be disabled. If
> the application depends on the bucket cache for performance, wouldn't it
> be better to remove that region server from the pool when its bucket
> cache cannot be recovered or re-enabled?
>
> The errors look like the following. I would appreciate any insight, thanks:
>
> 2018-02-25 01:12:47,780 ERROR [hfile-prefetch-1519513834057] bucket.BucketCache: Failed reading block 332b0634287f4c42851bc1a55ffe4042_1348128 from bucket cache
> java.nio.channels.ClosedByInterruptException
>         at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>         at sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:746)
>         at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
>         at org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine$FileReadAccessor.access(FileIOEngine.java:219)
>         at org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine.accessFile(FileIOEngine.java:170)
>         at org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine.read(FileIOEngine.java:105)
>         at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getBlock(BucketCache.java:492)
>         at org.apache.hadoop.hbase.io.hfile.CombinedBlockCache.getBlock(CombinedBlockCache.java:84)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.getCachedBlock(HFileReaderV2.java:279)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:420)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$1.run(HFileReaderV2.java:209)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
>
> and
>
> 2018-02-25 01:12:52,432 ERROR [regionserver/ip-xx-xx-xx-xx.xx-xx-xx.us-east-1.ec2.xx.net/xx.xx.xx.xx:16020-BucketCacheWriter-7] bucket.BucketCache: Failed writing to bucket cache
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110)
>         at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:758)
>         at org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine$FileWriteAccessor.access(FileIOEngine.java:227)
>         at org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine.accessFile(FileIOEngine.java:170)
>         at org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine.write(FileIOEngine.java:116)
>         at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache$RAMQueueEntry.writeToCache(BucketCache.java:1357)
>         at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache$WriterThread.doDrain(BucketCache.java:883)
>         at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache$WriterThread.run(BucketCache.java:838)
>         at java.lang.Thread.run(Thread.java:748)
>
> and later
> 2018-02-25 01:13:47,783 INFO  [regionserver/ip-10-194-246-70.aolp-ds-dev.us-east-1.ec2.aolcloud.net/10.194.246.70:16020-BucketCacheWriter-4] bucket.BucketCache: regionserver/ip-10-194-246-70.aolp-ds-dev.us-east-1.ec2.aolcloud.net/10.194.246.70:16020-BucketCacheWriter-4 exiting, cacheEnabled=false
> 2018-02-25 01:13:47,864 WARN  [regionserver/ip-10-194-246-70.aolp-ds-dev.us-east-1.ec2.aolcloud.net/10.194.246.70:16020-BucketCacheWriter-6] bucket.FileIOEngine: Failed syncing data to /mnt1/hbase/bucketcache
> 2018-02-25 01:13:47,864 ERROR [regionserver/ip-10-194-246-70.aolp-ds-dev.us-east-1.ec2.aolcloud.net/10.194.246.70:16020-BucketCacheWriter-6] bucket.BucketCache: Failed syncing IO engine
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110)
>         at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:379)
>         at org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine.sync(FileIOEngine.java:128)
>         at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache$WriterThread.doDrain(BucketCache.java:911)
>         at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache$WriterThread.run(BucketCache.java:838)
>         at java.lang.Thread.run(Thread.java:748)
>
> ----
> Saad
>
