From the logs, it seems there was some issue with the file that was used
by the bucket cache. Probably the volume where the file was mounted had
some issues. If you can confirm that, then this issue should be pretty
straightforward. If not, let us know; we can help.
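
To confirm, you could check the OS-side logs on that instance. A rough
sketch (the grep patterns and mount points below are illustrative, adjust
to your instances):

    # look for kernel-level device I/O errors around the failure window
    dmesg | grep -iE 'i/o error|xvd|nvme'
    # check that the filesystem backing the cache file is still mounted
    mount | grep -E '/mnt1|/mnt2'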

Regards
Ram

On Sun, Feb 25, 2018 at 9:40 PM, Ted Yu <[email protected]> wrote:

> Here is related code for disabling bucket cache:
>
>     if (this.ioErrorStartTime > 0) {
>       if (cacheEnabled && (now - ioErrorStartTime) > this.ioErrorsTolerationDuration) {
>         LOG.error("IO errors duration time has exceeded " + ioErrorsTolerationDuration +
>           "ms, disabling cache, please check your IOEngine");
>         disableCache();
>
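> The toleration window in that check should be configurable; if I remember
> right the property is hbase.bucketcache.ioengine.errors.tolerated.duration
> (default 60000 ms), so something like the following in hbase-site.xml would
> give a flaky volume more time to recover before the cache is disabled
> (worth double-checking the name against your version):
>
>     <property>
>       <name>hbase.bucketcache.ioengine.errors.tolerated.duration</name>
>       <value>300000</value>
>     </property>
>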
> Can you search the region server log to see if the above occurred?
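>
> For example, something like (log path depends on your layout):
>
>     grep "IO errors duration time has exceeded" /var/log/hbase/*regionserver*.log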
>
> Was this server the only one with a disabled cache?
>
> Cheers
>
> On Sun, Feb 25, 2018 at 6:20 AM, Saad Mufti <[email protected]>
> wrote:
>
> > Hi,
> >
> > I am running an HBase 1.3.1 cluster on AWS EMR. The bucket cache is
> > configured to use two attached EBS disks of 50 GB each, and I provisioned
> > the bucket cache at a bit less than the total, 98 GB per instance, to be
> > on the safe side. My tables have column families set to prefetch on open.
> >
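> > For reference, the relevant parts of my configuration look roughly like
> > the following (values are from memory and simplified to one cache file,
> > so treat this as a sketch; the second volume is set up analogously):
> >
> >     <property>
> >       <name>hbase.bucketcache.ioengine</name>
> >       <value>file:/mnt1/hbase/bucketcache</value>
> >     </property>
> >     <property>
> >       <name>hbase.bucketcache.size</name>
> >       <value>100352</value> <!-- ~98 GB, in MB -->
> >     </property>
> >
> > and prefetch is enabled per column family, e.g. from the shell (table and
> > family names here are placeholders):
> >
> >     alter 'my_table', {NAME => 'cf', PREFETCH_BLOCKS_ON_OPEN => 'true'}
> >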
> > On some instances during cluster startup, the bucket cache starts
> > throwing errors, and eventually the bucket cache gets completely disabled
> > on this instance. The instance still stays up as a valid region server,
> > and the only clue in the region server UI is that the bucket cache tab
> > reports a count of 0 and a size of 0 bytes.
> >
> > I have already opened a ticket with AWS to see if there are problems with
> > the EBS volumes, but wanted to tap the open source community's hive mind
> > to see what kind of problem would cause the bucket cache to get disabled.
> > If the application depends on the bucket cache for performance, wouldn't
> > it be better to just remove that region server from the pool if its bucket
> > cache cannot be recovered/enabled?
> >
> > The errors look like the following. Would appreciate any insight, thanks:
> >
> > 2018-02-25 01:12:47,780 ERROR [hfile-prefetch-1519513834057] bucket.BucketCache: Failed reading block 332b0634287f4c42851bc1a55ffe4042_1348128 from bucket cache
> > java.nio.channels.ClosedByInterruptException
> >         at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
> >         at sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:746)
> >         at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
> >         at org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine$FileReadAccessor.access(FileIOEngine.java:219)
> >         at org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine.accessFile(FileIOEngine.java:170)
> >         at org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine.read(FileIOEngine.java:105)
> >         at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getBlock(BucketCache.java:492)
> >         at org.apache.hadoop.hbase.io.hfile.CombinedBlockCache.getBlock(CombinedBlockCache.java:84)
> >         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.getCachedBlock(HFileReaderV2.java:279)
> >         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:420)
> >         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$1.run(HFileReaderV2.java:209)
> >         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> >         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> >         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> >         at java.lang.Thread.run(Thread.java:748)
> >
> > and
> >
> > 2018-02-25 01:12:52,432 ERROR [regionserver/ip-xx-xx-xx-xx.xx-xx-xx.us-east-1.ec2.xx.net/xx.xx.xx.xx:16020-BucketCacheWriter-7] bucket.BucketCache: Failed writing to bucket cache
> > java.nio.channels.ClosedChannelException
> >         at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110)
> >         at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:758)
> >         at org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine$FileWriteAccessor.access(FileIOEngine.java:227)
> >         at org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine.accessFile(FileIOEngine.java:170)
> >         at org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine.write(FileIOEngine.java:116)
> >         at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache$RAMQueueEntry.writeToCache(BucketCache.java:1357)
> >         at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache$WriterThread.doDrain(BucketCache.java:883)
> >         at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache$WriterThread.run(BucketCache.java:838)
> >         at java.lang.Thread.run(Thread.java:748)
> >
> > and later
> > 2018-02-25 01:13:47,783 INFO  [regionserver/ip-10-194-246-70.aolp-ds-dev.us-east-1.ec2.aolcloud.net/10.194.246.70:16020-BucketCacheWriter-4] bucket.BucketCache: regionserver/ip-10-194-246-70.aolp-ds-dev.us-east-1.ec2.aolcloud.net/10.194.246.70:16020-BucketCacheWriter-4 exiting, cacheEnabled=false
> > 2018-02-25 01:13:47,864 WARN  [regionserver/ip-10-194-246-70.aolp-ds-dev.us-east-1.ec2.aolcloud.net/10.194.246.70:16020-BucketCacheWriter-6] bucket.FileIOEngine: Failed syncing data to /mnt1/hbase/bucketcache
> > 2018-02-25 01:13:47,864 ERROR [regionserver/ip-10-194-246-70.aolp-ds-dev.us-east-1.ec2.aolcloud.net/10.194.246.70:16020-BucketCacheWriter-6] bucket.BucketCache: Failed syncing IO engine
> > java.nio.channels.ClosedChannelException
> >         at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110)
> >         at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:379)
> >         at org.apache.hadoop.hbase.io.hfile.bucket.FileIOEngine.sync(FileIOEngine.java:128)
> >         at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache$WriterThread.doDrain(BucketCache.java:911)
> >         at org.apache.hadoop.hbase.io.hfile.bucket.BucketCache$WriterThread.run(BucketCache.java:838)
> >         at java.lang.Thread.run(Thread.java:748)
> >
> > ----
> > Saad
> >
>
