Mind putting up a thread dump?

How many spindles?

How do the I/O stats compare between a healthy RS and a stuck one?
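
Something like this is usually enough to capture both; the process
lookup and iostat options below are just an illustration:

  # Thread dump of the RegionServer process (run as the hbase user)
  jstack $(pgrep -f HRegionServer) > /tmp/rs-threaddump.txt

  # Per-device I/O stats in 5-second samples; compare %util, await and
  # r/s between a healthy RS and the stuck one
  iostat -xm 5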

Thanks,
S


On Wed, Mar 27, 2019 at 11:57 AM Srinidhi Muppalla <srinid...@trulia.com>
wrote:

> Hello,
>
> We've noticed an issue in our HBase cluster where one of the
> region-servers has a spike in I/O wait associated with a spike in load
> on that node. As a result, our request times to the cluster increase
> dramatically. Initially we suspected hotspotting, but the issue
> persisted even after we temporarily blocked requests to the
> highest-volume regions on that region-server. Moreover, the request
> counts for the regions on that region-server shown in the HBase UI were
> not particularly high, and our own application-level metrics showed
> that the request volume we were sending was not unusually high either.
> From a thread dump of the region-server, it appears that our get and
> scan requests are getting stuck while reading blocks from our bucket
> cache, leaving the threads in a 'runnable' state. For context, we are
> running HBase 1.3.0 on an EMR cluster backed by S3, and our bucket
> cache runs in file mode. Our region-servers all have SSDs. We have a
> combined cache with the standard L1 LRU cache and an L2 file-mode
> bucket cache. Our bucket cache utilization is less than 50% of the
> allocated space.
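>
> For reference, the relevant cache settings live in hbase-site.xml. The
> sketch below shows the properties involved; the cache file path and the
> EMR config location are illustrative, not our exact values:
>
>   # Show the L1/L2 cache settings (on EMR the config typically lives
>   # under /etc/hbase/conf):
>   #   hbase.bucketcache.ioengine -> e.g. file:/mnt/hbase/bucketcache.data
>   #   hbase.bucketcache.size     -> bucket cache capacity
>   #   hfile.block.cache.size     -> on-heap L1 LRU size (fraction of heap)
>   grep -A1 -E 'bucketcache|hfile.block.cache.size' \
>       /etc/hbase/conf/hbase-site.xml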
>
> We suspect that disk space utilization on the region-server is part of
> the issue, since our maximum disk utilization also increased as this
> happened. What can we do to minimize disk space utilization? The actual
> HFiles are on S3 -- only the cache, application logs, and write-ahead
> logs are on the region-servers. Other than disk space utilization, what
> factors could cause high I/O wait in HBase, and is there anything we
> can do to minimize it?
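>
> For what it's worth, this is roughly how we're checking where the space
> goes on a region-server; the mount point and paths are guesses at
> typical EMR locations rather than our exact layout:
>
>   # Overall utilization of the volume holding the cache, logs and WALs
>   df -h /mnt
>
>   # Largest consumers under the local HBase directories
>   du -sh /mnt/hbase/* /var/log/hbase 2>/dev/null | sort -h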
>
> Right now, the only thing that works is terminating and recreating the
> cluster (which we can do safely because it's S3 backed).
>
> Thanks!
> Srinidhi
>
