Mind putting up a thread dump? How many spindles?
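Something quick along these lines would grab both. It's only a rough sketch: it assumes the JDK tools (jps/jstack) and iostat/lsblk (sysstat/util-linux) are installed on the region-server host, so adjust as needed:

import subprocess

# Rough sketch: collect a thread dump plus disk stats on a region-server
# host. Run it on the affected node (or wrap it in ssh).

def regionserver_pid():
    # jps lists running JVMs; the region-server shows up as "HRegionServer".
    out = subprocess.run(["jps"], capture_output=True, text=True,
                         check=True).stdout
    for line in out.splitlines():
        pid, _, name = line.partition(" ")
        if name.strip() == "HRegionServer":
            return pid
    raise RuntimeError("no HRegionServer process found")

def collect():
    pid = regionserver_pid()
    # Full thread dump of the region-server JVM.
    with open("rs-threaddump.txt", "w") as f:
        subprocess.run(["jstack", "-l", pid], stdout=f, check=True)
    # Per-device utilization/await, sampled 6 times at 5s intervals, plus
    # the block-device layout (how many disks, which one backs the cache).
    with open("rs-iostat.txt", "w") as f:
        subprocess.run(["iostat", "-x", "5", "6"], stdout=f, check=True)
        subprocess.run(["lsblk"], stdout=f, check=True)

if __name__ == "__main__":
    collect()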
If you compare the I/O stats between a good RS and a stuck one, how do
they compare?

Thanks,
S

On Wed, Mar 27, 2019 at 11:57 AM Srinidhi Muppalla <srinid...@trulia.com>
wrote:

> Hello,
>
> We've noticed an issue in our HBase cluster where one of the
> region-servers has a spike in I/O wait associated with a spike in load
> for that node. As a result, our request times to the cluster increase
> dramatically. Initially, we suspected that we were experiencing
> hotspotting, but even after temporarily blocking requests to the
> highest-volume regions on that region-server, the issue persisted.
> Moreover, when looking at request counts to the regions on the
> region-server from the HBase UI, they were not particularly high, and
> our own application-level metrics on the requests we were making were
> not very high either. From looking at a thread dump of the
> region-server, it appears that our get and scan requests are getting
> stuck when trying to read from the blocks in our bucket cache, leaving
> the threads in a 'runnable' state. For context, we are running HBase
> 1.3.0 on an S3-backed cluster on EMR, and our bucket cache is running
> in file mode. Our region-servers all have SSDs. We have a combined
> cache with the standard L1 LRU cache and the L2 file-mode bucket cache.
> Our bucket cache utilization is less than 50% of the allocated space.
>
> We suspect that part of the issue is our disk space utilization on the
> region-server, as our max disk space utilization also increased as this
> happened. What can we do to minimize disk space utilization? The actual
> HFiles are on S3 -- only the cache, application logs, and write-ahead
> logs are on the region-servers. Other than disk space utilization, what
> factors could cause high I/O wait in HBase, and is there anything we
> can do to minimize it?
>
> Right now, the only thing that works is terminating and recreating the
> cluster (which we can do safely because it's S3-backed).
>
> Thanks!
> Srinidhi
>
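For reference, alongside an OS-level iostat comparison, one low-effort way
to run the good-RS-versus-stuck-RS comparison suggested at the top of this
thread is to pull the /jmx servlet from each region-server and diff the
counters. The sketch below is only illustrative: the host names are
placeholders, 16030 is the default region-server info port (adjust if yours
differs), and the keyword filter is just a guess at which metric names are
worth looking at.

import json
import urllib.request

# Pull numeric metrics from a region-server's /jmx endpoint and keep only
# the ones whose names look cache- or read-related.
INFO_PORT = 16030  # default hbase.regionserver.info.port; adjust if changed
KEYWORDS = ("blockcache", "bucketcache", "read", "scan", "get")

def rs_metrics(host):
    url = "http://%s:%d/jmx" % (host, INFO_PORT)
    with urllib.request.urlopen(url, timeout=10) as resp:
        beans = json.load(resp)["beans"]
    metrics = {}
    for bean in beans:
        for key, value in bean.items():
            if isinstance(value, (int, float)) and any(
                    k in key.lower() for k in KEYWORDS):
                metrics["%s::%s" % (bean.get("name", "?"), key)] = value
    return metrics

def compare(good_host, stuck_host):
    # Print every shared metric whose value differs between the two servers.
    good, stuck = rs_metrics(good_host), rs_metrics(stuck_host)
    for key in sorted(good.keys() & stuck.keys()):
        if good[key] != stuck[key]:
            print("%s\n  good:  %s\n  stuck: %s" % (key, good[key], stuck[key]))

if __name__ == "__main__":
    compare("rs-good.example.com", "rs-stuck.example.com")  # placeholder hosts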