No.

----
Saad

On Fri, Dec 2, 2016 at 3:27 PM, Ted Yu <ted...@yahoo.com.invalid> wrote:

> Somehow I couldn't access the pastebin (I am in China now).
> Did the region server showing the hotspot host meta?
> Thanks
>
> On Friday, December 2, 2016 11:53 AM, Saad Mufti <saad.mu...@gmail.com> wrote:
>
> We're in AWS with d2.4xlarge instances. Each instance has 12 independent
> spindles/disks from what I can tell.
>
> We have charted get_rate and mutate_rate by host and:
>
> a) mutate_rate shows no real outliers
> b) read_rate shows the overall rate on the "hotspot" region server is a bit
> higher than on every other server, not severely but enough that it is
> noticeable. But when we chart get_rate on that server by region, no one
> region stands out.
>
> get_rate chart by host:
>
> https://snag.gy/hmoiDw.jpg
>
> mutate_rate chart by host:
>
> https://snag.gy/jitdMN.jpg
>
> ----
> Saad
>
>
> On Fri, Dec 2, 2016 at 2:34 PM, John Leach <jle...@splicemachine.com> wrote:
>
> > Here is what I see...
> >
> > * Short Compaction Running on Heap
> >
> > "regionserver/ip-10-99-181-146.aolp-prd.us-east-1.ec2.aolcloud.net/10.99.181.146:60020-shortCompactions-1480229281547" - Thread t@242
> >    java.lang.Thread.State: RUNNABLE
> >     at org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder.compressSingleKeyValue(FastDiffDeltaEncoder.java:270)
> >     at org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder.internalEncode(FastDiffDeltaEncoder.java:245)
> >     at org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder.encode(BufferedDataBlockEncoder.java:987)
> >     at org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder.encode(FastDiffDeltaEncoder.java:58)
> >     at org.apache.hadoop.hbase.io.hfile.HFileDataBlockEncoderImpl.encode(HFileDataBlockEncoderImpl.java:97)
> >     at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.write(HFileBlock.java:866)
> >     at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:270)
> >     at org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87)
> >     at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:949)
> >     at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:282)
> >     at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:105)
> >     at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:124)
> >     at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1233)
> >     at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1770)
> >     at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:520)
> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >     at java.lang.Thread.run(Thread.java:745)
> >
> > * WAL Syncs waiting... ALL 5
> >
> > "sync.0" - Thread t@202
> >    java.lang.Thread.State: TIMED_WAITING
> >     at java.lang.Object.wait(Native Method)
> >     - waiting on <67ba892d> (a java.util.LinkedList)
> >     at org.apache.hadoop.hdfs.DFSOutputStream.waitForAckedSeqno(DFSOutputStream.java:2337)
> >     at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2224)
> >     at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:2116)
> >     at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
> >     at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:173)
> >     at org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1379)
> >     at java.lang.Thread.run(Thread.java:745)
> >
> > * Mutations backing up very badly...
> >
> > "B.defaultRpcServer.handler=103,queue=7,port=60020" - Thread t@155
> >    java.lang.Thread.State: TIMED_WAITING
> >     at java.lang.Object.wait(Native Method)
> >     - waiting on <6ab54ea3> (a org.apache.hadoop.hbase.regionserver.wal.SyncFuture)
> >     at org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:167)
> >     at org.apache.hadoop.hbase.regionserver.wal.FSHLog.blockOnSync(FSHLog.java:1504)
> >     at org.apache.hadoop.hbase.regionserver.wal.FSHLog.publishSyncThenBlockOnCompletion(FSHLog.java:1498)
> >     at org.apache.hadoop.hbase.regionserver.wal.FSHLog.sync(FSHLog.java:1632)
> >     at org.apache.hadoop.hbase.regionserver.HRegion.syncOrDefer(HRegion.java:7737)
> >     at org.apache.hadoop.hbase.regionserver.HRegion.processRowsWithLocks(HRegion.java:6504)
> >     at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:6352)
> >     at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:6334)
> >     at org.apache.hadoop.hbase.regionserver.HRegion.mutateRow(HRegion.java:6325)
> >     at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutateRows(RSRpcServices.java:418)
> >     at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:1916)
> >     at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32213)
> >     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2034)
> >     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
> >     at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
> >     at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
> >     at java.lang.Thread.run(Thread.java:745)
> >
> > Too many writers being blocked attempting to write to WAL.
> >
> > What does your disk infrastructure look like? Can you get away with
> > Multi-wal? Ugh...
> >
> > Regards,
> > John Leach
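Multi-WAL, for reference, is a region-server-side setting rather than anything in the client. A minimal sketch of the relevant properties follows, written as runnable Java purely for illustration; in a real deployment they belong in hbase-site.xml on every region server. The property names are the ones documented for the 1.x RegionGroupingProvider; whether the multiwal provider is actually usable on the HBase 1.0 / CDH 5.5.2 build discussed in this thread would need to be verified.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class MultiWalConfigSketch {
    public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        // Switch from a single WAL per region server to multiple WALs
        // (the RegionGroupingProvider).
        conf.set("hbase.wal.provider", "multiwal");
        // Number of WAL groups per region server; 2 is the documented default.
        conf.setInt("hbase.wal.regiongrouping.numgroups", 2);
        System.out.println("hbase.wal.provider = " + conf.get("hbase.wal.provider"));
    }
}

With 12 spindles per d2.4xlarge, raising the group count above the default is the sort of thing to benchmark rather than assume.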
> >
> > > On Dec 2, 2016, at 1:20 PM, Saad Mufti <saad.mu...@gmail.com> wrote:
> > >
> > > Hi Ted,
> > >
> > > Finally we have another hotspot going on, same symptoms as before. Here
> > > is the pastebin for the stack trace from the region server that I
> > > obtained via VisualVM:
> > >
> > > http://pastebin.com/qbXPPrXk
> > >
> > > Would really appreciate any insight you or anyone else can provide.
> > >
> > > Thanks.
> > >
> > > ----
> > > Saad
> > >
> > >
> > > On Thu, Dec 1, 2016 at 6:08 PM, Saad Mufti <saad.mu...@gmail.com> wrote:
> > >
> > >> Sure will, the next time it happens.
> > >>
> > >> Thanks!!!
> > >>
> > >> ----
> > >> Saad
> > >>
> > >>
> > >> On Thu, Dec 1, 2016 at 5:01 PM, Ted Yu <ted...@yahoo.com.invalid> wrote:
> > >>
> > >>> From #2 in the initial email, the hbase:meta might not be the cause for
> > >>> the hotspot.
> > >>>
> > >>> Saad:
> > >>> Can you pastebin the stack trace of the hot region server when this
> > >>> happens again?
> > >>>
> > >>> Thanks
> > >>>
> > >>>> On Dec 2, 2016, at 4:48 AM, Saad Mufti <saad.mu...@gmail.com> wrote:
> > >>>>
> > >>>> We used a pre-split into 1024 regions at the start, but we miscalculated
> > >>>> our data size, so there were still auto-split storms at the beginning as
> > >>>> the data size stabilized; it has ended up at around 9500 or so regions,
> > >>>> plus a few thousand regions for a few other (much smaller) tables. But we
> > >>>> haven't had any new auto-splits in a couple of months, and the hotspots
> > >>>> only started happening recently.
> > >>>>
> > >>>> Our hashing scheme is very simple: we take the MD5 of the key, then form
> > >>>> a 4-digit prefix based on the first two bytes of the MD5, normalized to
> > >>>> be within the range 0-1023. I am fairly confident about this scheme,
> > >>>> especially since even during the hotspot we see no evidence so far that
> > >>>> any particular region is taking disproportionate traffic (based on
> > >>>> Cloudera Manager per-region charts on the hotspot server). Does that look
> > >>>> like a reasonable scheme to randomize which region any given key goes to?
> > >>>> And the start of the hotspot doesn't seem to correspond to any region
> > >>>> splitting or moving from one server to another.
> > >>>>
> > >>>> Thanks.
> > >>>>
> > >>>> ----
> > >>>> Saad
> > >>>>
> > >>>>
> > >>>>> On Thu, Dec 1, 2016 at 3:32 PM, John Leach <jle...@splicemachine.com> wrote:
> > >>>>>
> > >>>>> Saad,
> > >>>>>
> > >>>>> Region move or split causes client connections to simultaneously
> > >>>>> refresh their meta.
> > >>>>>
> > >>>>> Key word is "supposed". We have seen meta hotspotting from time to
> > >>>>> time and on different versions at Splice Machine.
> > >>>>>
> > >>>>> How confident are you in your hashing algorithm?
> > >>>>>
> > >>>>> Regards,
> > >>>>> John Leach
> > >>>>>
> > >>>>>
> > >>>>>> On Dec 1, 2016, at 2:25 PM, Saad Mufti <saad.mu...@gmail.com> wrote:
> > >>>>>>
> > >>>>>> No, never thought about that. I just figured out how to locate the
> > >>>>>> server for that table after you mentioned it. We'll have to keep an
> > >>>>>> eye on it next time we have a hotspot to see if it coincides with the
> > >>>>>> hotspot server.
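For concreteness, a minimal sketch of the salting scheme Saad describes a few messages up: MD5 of the key, first two bytes mapped into 0-1023, used as a 4-digit prefix. The modulo-1024 normalization and the decimal formatting are assumptions made for illustration; the actual implementation isn't shown anywhere in the thread.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public final class SaltedKey {

    /** Prefix the original key with a 4-digit bucket in [0000, 1023]. */
    public static String salt(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            // First two digest bytes as an unsigned 16-bit value, folded into
            // 0-1023 (assumed normalization; the thread only says "normalized
            // to be within the range 0-1023").
            int bucket = (((digest[0] & 0xFF) << 8) | (digest[1] & 0xFF)) % 1024;
            return String.format("%04d%s", bucket, key);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 should always be available", e);
        }
    }
}

The same bucket arithmetic would also generate the 1024 pre-split boundaries "0000" through "1023", and running it over a sample of production keys is a quick way to confirm the prefixes really are uniformly distributed.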
> > >>>>>>
> > >>>>>> What would be the theory for how it could become a hotspot? Isn't the
> > >>>>>> client supposed to cache it and only go back for a refresh if it hits
> > >>>>>> a region that is not in its expected location?
> > >>>>>>
> > >>>>>> ----
> > >>>>>> Saad
> > >>>>>>
> > >>>>>>
> > >>>>>> On Thu, Dec 1, 2016 at 2:56 PM, John Leach <jle...@splicemachine.com> wrote:
> > >>>>>>
> > >>>>>>> Saad,
> > >>>>>>>
> > >>>>>>> Did you validate that Meta is not on the "Hot" region server?
> > >>>>>>>
> > >>>>>>> Regards,
> > >>>>>>> John Leach
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>> On Dec 1, 2016, at 1:50 PM, Saad Mufti <saad.mu...@gmail.com> wrote:
> > >>>>>>>>
> > >>>>>>>> Hi,
> > >>>>>>>>
> > >>>>>>>> We are using HBase 1.0 on CDH 5.5.2. We have taken great care to avoid
> > >>>>>>>> hotspotting due to inadvertent data patterns by prepending an MD5-based
> > >>>>>>>> 4-digit hash prefix to all our data keys. This works fine most of the
> > >>>>>>>> time, but more and more recently (as much as once or twice a day) we
> > >>>>>>>> have occasions where one region server suddenly becomes "hot" (CPU above
> > >>>>>>>> or around 95% in various monitoring tools). When it happens it lasts for
> > >>>>>>>> hours; occasionally the hotspot might jump to another region server as
> > >>>>>>>> the master decides the region is unresponsive and gives its region to
> > >>>>>>>> another server.
> > >>>>>>>>
> > >>>>>>>> For the longest time, we thought this must be some single rogue key in
> > >>>>>>>> our input data that is being hammered. All attempts to track this down
> > >>>>>>>> have failed though, and the following behavior argues against this being
> > >>>>>>>> application based:
> > >>>>>>>>
> > >>>>>>>> 1. Plotting Get and Put rate by region on the "hot" region server in
> > >>>>>>>> Cloudera Manager charts shows no single region is an outlier.
> > >>>>>>>>
> > >>>>>>>> 2. Cleanly restarting just the region server process causes its regions
> > >>>>>>>> to randomly migrate to other region servers, then it gets new ones from
> > >>>>>>>> the HBase master, basically a sort of shuffling, and then the hotspot
> > >>>>>>>> goes away. If it were application based, you'd expect the hotspot to
> > >>>>>>>> just jump to another region server.
> > >>>>>>>>
> > >>>>>>>> 3. We have pored through the region server logs and can't see anything
> > >>>>>>>> out of the ordinary happening.
> > >>>>>>>>
> > >>>>>>>> The only other pertinent thing to mention might be that we have a
> > >>>>>>>> special process of our own running outside the cluster that does
> > >>>>>>>> cluster-wide major compaction in a rolling fashion, where each batch
> > >>>>>>>> consists of one region from each region server, and it waits until one
> > >>>>>>>> batch is completely done before starting another. We have seen no real
> > >>>>>>>> impact on the hotspot from shutting this down, and in normal times it
> > >>>>>>>> doesn't impact our read or write performance much.
> > >>>>>>>>
> > >>>>>>>> We are at our wit's end. Anyone have experience with a scenario like
> > >>>>>>>> this? Any help/guidance would be most appreciated.
> > >>>>>>>>
> > >>>>>>>> -----
> > >>>>>>>> Saad
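The rolling major compaction tool Saad mentions in his first message isn't shown anywhere in the thread. A rough sketch of that batching pattern against the HBase 1.x Admin API might look like the following: one region per region server per batch, waiting for the whole batch to finish before starting the next. The Admin methods are real, but the batching logic, the 10-second polling interval, and the absence of error handling are assumptions for illustration, and region assignments can of course change while such a tool runs.

import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.protobuf.generated.AdminProtos.GetRegionInfoResponse.CompactionState;

public class RollingMajorCompactor {

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {

            // Snapshot of regions grouped by the server currently hosting them.
            List<List<HRegionInfo>> perServer = new ArrayList<>();
            for (ServerName sn : admin.getClusterStatus().getServers()) {
                perServer.add(admin.getOnlineRegions(sn));
            }

            boolean more = true;
            for (int batch = 0; more; batch++) {
                // Build the batch: the batch-th region from each server, if it has one.
                List<HRegionInfo> current = new ArrayList<>();
                more = false;
                for (List<HRegionInfo> regions : perServer) {
                    if (batch < regions.size()) {
                        current.add(regions.get(batch));
                        more = true;
                    }
                }
                if (!more) {
                    break;
                }

                // Kick off a major compaction for every region in the batch.
                for (HRegionInfo region : current) {
                    admin.majorCompactRegion(region.getRegionName());
                }

                // Wait until no region in the batch still reports a major compaction.
                // (A real tool would also handle the window before the request is
                // picked up, regions that move, and failures.)
                boolean busy = true;
                while (busy) {
                    busy = false;
                    for (HRegionInfo region : current) {
                        CompactionState state =
                            admin.getCompactionStateForRegion(region.getRegionName());
                        if (state == CompactionState.MAJOR
                            || state == CompactionState.MAJOR_AND_MINOR) {
                            busy = true;
                        }
                    }
                    if (busy) {
                        Thread.sleep(10_000L);
                    }
                }
            }
        }
    }
}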