Saad,

Did you validate that the hbase:meta region is not on the “hot” region server?

Regards,
John Leach

> On Dec 1, 2016, at 1:50 PM, Saad Mufti <[email protected]> wrote:
>
> Hi,
>
> We are using HBase 1.0 on CDH 5.5.2. We have taken great care to avoid
> hotspotting due to inadvertent data patterns by prepending an MD5-based
> 4-digit hash prefix to all our data keys. This works fine most of the time,
> but more and more recently (as often as once or twice a day) we have
> occasions where one region server suddenly becomes "hot" (CPU above or
> around 95% in various monitoring tools). When it happens it lasts for
> hours. Occasionally the hotspot jumps to another region server when the
> master decides the region is unresponsive and reassigns it to another
> server.
>
> For the longest time, we thought this must be some single rogue key in our
> input data that is being hammered. All attempts to track this down have
> failed though, and the following behavior argues against this being
> application based:
>
> 1. Plotting Get and Put rates by region on the "hot" region server in
> Cloudera Manager Charts shows that no single region is an outlier.
>
> 2. Cleanly restarting just the region server process causes its regions to
> randomly migrate to other region servers, then it gets new ones from the
> HBase master, basically a sort of shuffling, and then the hotspot goes away.
> If it were application based, you'd expect the hotspot to just jump to
> another region server.
>
> 3. We have pored through the region server logs and can't see anything out
> of the ordinary happening.
>
> The only other pertinent thing to mention might be that we have a special
> process of our own running outside the cluster that does cluster-wide major
> compaction in a rolling fashion, where each batch consists of one region
> from each region server, and it waits until one batch is completely done
> before starting another. We have seen no real impact on the hotspot from
> shutting this down, and in normal times it doesn't impact our read or write
> performance much.
>
> We are at our wit's end. Does anyone have experience with a scenario like
> this? Any help/guidance would be most appreciated.
>
> -----
> Saad
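
For anyone following the thread, the key salting described in the quoted message might look roughly like the sketch below. It is only an illustration under assumptions: the message does not say how the prefix is built, so this takes "4-digit" to mean the first four hex digits of the MD5 digest of the original key, and the class and method names (SaltedKey, saltedKey) are hypothetical, not Saad's actual code.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class SaltedKey {

    // Hypothetical helper: prepends the first 4 hex digits of MD5(key) to the key.
    // Assumption: the prefix is hex; the original message does not specify this.
    static String saltedKey(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            // The first two digest bytes yield four hex digits.
            String prefix = String.format("%02x%02x", digest[0] & 0xff, digest[1] & 0xff);
            return prefix + key;
        } catch (NoSuchAlgorithmException e) {
            // Every standard Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // Print the salted form of a sample key.
        System.out.println(saltedKey("abc"));
    }
}
```

Because the prefix depends only on the key, reads can recompute it, and lexically adjacent keys spread across the key space. It does not protect against a single key that is itself requested very often, which is the case the quoted message tried to rule out.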
