I would check compaction, and investigate throttling it if it's what is causing the high CPU.
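
If it helps, here is a minimal sketch (assuming the standard HBase 1.x Java
client; the table name is a placeholder) for checking from a client whether
compactions are running on a table while the CPU is hot:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CompactionCheck {
    public static void main(String[] args) throws Exception {
        // Picks up hbase-site.xml from the classpath.
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            // Prints NONE / MINOR / MAJOR / MAJOR_AND_MINOR for the table as a whole.
            System.out.println(admin.getCompactionState(TableName.valueOf("my_table")));
        }
    }
}

The region server's compactionQueueLength metric (visible in Cloudera Manager)
is another quick way to see whether compactions are piling up.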

On Sat, Dec 3, 2016 at 6:20 AM Saad Mufti <saad.mu...@gmail.com> wrote:

> No.
>
> ----
> Saad
>
>
> On Fri, Dec 2, 2016 at 3:27 PM, Ted Yu <ted...@yahoo.com.invalid> wrote:
>
> > Somehow I couldn't access the pastebin (I am in China now).
> > Did the region server showing the hotspot host hbase:meta?
> > Thanks
> >
> >     On Friday, December 2, 2016 11:53 AM, Saad Mufti <saad.mu...@gmail.com>
> > wrote:
> >
> >
> >  We're in AWS with D2.4xLarge instances. Each instance has 12 independent
> > spindles/disks from what I can tell.
> >
> > We have charted get_rate and mutate_rate by host and
> >
> > a) mutate_rate shows no real outliers
> > b) get_rate shows the overall rate on the "hotspot" region server is a bit
> > higher than on every other server, not severely but enough to be noticeable.
> > But when we chart get_rate on that server by region, no single region
> > stands out.
> >
> > get_rate chart by host:
> >
> > https://snag.gy/hmoiDw.jpg
> >
> > mutate_rate chart by host:
> >
> > https://snag.gy/jitdMN.jpg
> >
> > ----
> > Saad
> >
> >
> > On Fri, Dec 2, 2016 at 2:34 PM, John Leach <jle...@splicemachine.com>
> > wrote:
> >
> > > Here is what I see...
> > >
> > >
> > > * Short Compaction Running on Heap
> > > "regionserver/ip-10-99-181-146.aolp-prd.us-east-1.ec2.aolcloud.net/10.99.181.146:60020-shortCompactions-1480229281547" - Thread t@242
> > >    java.lang.Thread.State: RUNNABLE
> > >    at org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder.compressSingleKeyValue(FastDiffDeltaEncoder.java:270)
> > >    at org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder.internalEncode(FastDiffDeltaEncoder.java:245)
> > >    at org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder.encode(BufferedDataBlockEncoder.java:987)
> > >    at org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder.encode(FastDiffDeltaEncoder.java:58)
> > >    at org.apache.hadoop.hbase.io.hfile.HFileDataBlockEncoderImpl.encode(HFileDataBlockEncoderImpl.java:97)
> > >    at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.write(HFileBlock.java:866)
> > >    at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:270)
> > >    at org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87)
> > >    at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:949)
> > >    at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:282)
> > >    at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:105)
> > >    at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:124)
> > >    at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1233)
> > >    at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1770)
> > >    at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:520)
> > >    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> > >    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> > >    at java.lang.Thread.run(Thread.java:745)
> > >
> > >
> > > * WAL Syncs waiting…  ALL 5
> > > "sync.0" - Thread t@202
> > >    java.lang.Thread.State: TIMED_WAITING
> > >    at java.lang.Object.wait(Native Method)
> > >    - waiting on <67ba892d> (a java.util.LinkedList)
> > >    at org.apache.hadoop.hdfs.DFSOutputStream.waitForAckedSeqno(DFSOutputStream.java:2337)
> > >    at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2224)
> > >    at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:2116)
> > >    at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
> > >    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:173)
> > >    at org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1379)
> > >    at java.lang.Thread.run(Thread.java:745)
> > >
> > > * Mutations backing up very badly...
> > >
> > > "B.defaultRpcServer.handler=103,queue=7,port=60020" - Thread t@155
> > >    java.lang.Thread.State: TIMED_WAITING
> > >    at java.lang.Object.wait(Native Method)
> > >    - waiting on <6ab54ea3> (a org.apache.hadoop.hbase.regionserver.wal.SyncFuture)
> > >    at org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:167)
> > >    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.blockOnSync(FSHLog.java:1504)
> > >    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.publishSyncThenBlockOnCompletion(FSHLog.java:1498)
> > >    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.sync(FSHLog.java:1632)
> > >    at org.apache.hadoop.hbase.regionserver.HRegion.syncOrDefer(HRegion.java:7737)
> > >    at org.apache.hadoop.hbase.regionserver.HRegion.processRowsWithLocks(HRegion.java:6504)
> > >    at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:6352)
> > >    at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:6334)
> > >    at org.apache.hadoop.hbase.regionserver.HRegion.mutateRow(HRegion.java:6325)
> > >    at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutateRows(RSRpcServices.java:418)
> > >    at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:1916)
> > >    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32213)
> > >    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2034)
> > >    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
> > >    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
> > >    at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
> > >    at java.lang.Thread.run(Thread.java:745)
> > >
> > >
> > > Too many writers are being blocked attempting to write to the WAL.
> > >
> > > What does your disk infrastructure look like?  Can you get away with
> > > multi-WAL?  Ugh...
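
For reference, multi-WAL is enabled per region server in hbase-site.xml along
these lines (property names as documented for HBase 1.x; which grouping
strategies and defaults are actually available depends on the exact HBase/CDH
release, so treat this as a sketch to verify against the docs for your version):

<property>
  <name>hbase.wal.provider</name>
  <value>multiwal</value>
</property>
<!-- Number of WAL groups per region server used by the region-grouping provider. -->
<property>
  <name>hbase.wal.regiongrouping.numgroups</name>
  <value>2</value>
</property>

With d2.4xlarge instances exposing 12 local disks, spreading WAL writes over
more than one disk is the main appeal here.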
> > >
> > > Regards,
> > > John Leach
> > >
> > >
> > > > On Dec 2, 2016, at 1:20 PM, Saad Mufti <saad.mu...@gmail.com> wrote:
> > > >
> > > > Hi Ted,
> > > >
> > > > Finally we have another hotspot going on, same symptoms as before. Here
> > > > is the pastebin for the stack trace from the region server, which I
> > > > obtained via VisualVM:
> > > >
> > > > http://pastebin.com/qbXPPrXk
> > > >
> > > > Would really appreciate any insight you or anyone else can provide.
> > > >
> > > > Thanks.
> > > >
> > > > ----
> > > > Saad
> > > >
> > > >
> > > > On Thu, Dec 1, 2016 at 6:08 PM, Saad Mufti <saad.mu...@gmail.com>
> > wrote:
> > > >
> > > >> Sure will, the next time it happens.
> > > >>
> > > >> Thanks!!!
> > > >>
> > > >> ----
> > > >> Saad
> > > >>
> > > >>
> > > >> On Thu, Dec 1, 2016 at 5:01 PM, Ted Yu <ted...@yahoo.com.invalid>
> > > wrote:
> > > >>
> > > >>> Based on #2 in the initial email, hbase:meta might not be the cause of
> > > >>> the hotspot.
> > > >>>
> > > >>> Saad:
> > > >>> Can you pastebin the stack trace of the hot region server when this
> > > >>> happens again?
> > > >>>
> > > >>> Thanks
> > > >>>
> > > >>>> On Dec 2, 2016, at 4:48 AM, Saad Mufti <saad.mu...@gmail.com>
> > wrote:
> > > >>>>
> > > >>>> We used a pre-split into 1024 regions at the start, but we miscalculated
> > > >>>> our data size, so there were still auto-split storms at the beginning. As
> > > >>>> data size stabilized, we ended up at around 9500 or so regions, plus a
> > > >>>> few thousand regions for a few other (much smaller) tables. But we
> > > >>>> haven't had any new auto-splits in a couple of months, and the hotspots
> > > >>>> only started happening recently.
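
As an aside, that kind of pre-split can be expressed with the 1.x client API
roughly like this (a sketch only; the table and column family names are
placeholders):

import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplit {
    // One split key per boundary: "0001" .. "1023" gives 1024 regions.
    static byte[][] prefixSplits(int regions) {
        byte[][] splits = new byte[regions - 1][];
        for (int i = 1; i < regions; i++) {
            splits[i - 1] = Bytes.toBytes(String.format("%04d", i));
        }
        return splits;
    }

    static void createTable(Admin admin) throws java.io.IOException {
        HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("my_table"));
        desc.addFamily(new HColumnDescriptor("d"));
        admin.createTable(desc, prefixSplits(1024));
    }
}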
> > > >>>>
> > > >>>> Our hashing scheme is very simple: we take the MD5 of the key, then form
> > > >>>> a 4-digit prefix based on the first two bytes of the MD5, normalized to
> > > >>>> be within the range 0-1023. I am fairly confident in this scheme,
> > > >>>> especially since even during the hotspot we see no evidence so far that
> > > >>>> any particular region is taking disproportionate traffic (based on
> > > >>>> Cloudera Manager per-region charts on the hotspot server). Does that look
> > > >>>> like a reasonable scheme to randomize which region any given key goes to?
> > > >>>> And the start of the hotspot doesn't seem to correspond to any region
> > > >>>> splitting or moving from one server to another.
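
For concreteness, a minimal sketch of that kind of prefixing (plain JDK
MessageDigest; the helper name is made up, and it assumes the 4-digit bucket
comes from the first two MD5 bytes reduced mod 1024):

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class KeySalter {
    /** Returns the original key prefixed with a 4-digit bucket in [0, 1023]. */
    static String saltedKey(String key) throws Exception {
        byte[] md5 = MessageDigest.getInstance("MD5")
                .digest(key.getBytes(StandardCharsets.UTF_8));
        // First two bytes -> 16-bit value, reduced to the 0-1023 range.
        int bucket = (((md5[0] & 0xFF) << 8) | (md5[1] & 0xFF)) % 1024;
        return String.format("%04d", bucket) + key;
    }
}

Distinct keys will be spread essentially uniformly across the 1024 buckets by
MD5; only heavy traffic to one key (or a handful of keys) would concentrate on
a single region under this scheme.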
> > > >>>>
> > > >>>> Thanks.
> > > >>>>
> > > >>>> ----
> > > >>>> Saad
> > > >>>>
> > > >>>>
> > > >>>>> On Thu, Dec 1, 2016 at 3:32 PM, John Leach <jle...@splicemachine.com>
> > > >>>>> wrote:
> > > >>>>>
> > > >>>>> Saad,
> > > >>>>>
> > > >>>>> A region move or split causes client connections to simultaneously
> > > >>>>> refresh their meta.
> > > >>>>>
> > > >>>>> Key word is "supposed".  We have seen meta hotspotting from time to
> > > >>>>> time and on different versions at Splice Machine.
> > > >>>>>
> > > >>>>> How confident are you in your hashing algorithm?
> > > >>>>>
> > > >>>>> Regards,
> > > >>>>> John Leach
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>>> On Dec 1, 2016, at 2:25 PM, Saad Mufti <saad.mu...@gmail.com>
> > > wrote:
> > > >>>>>>
> > > >>>>>> No, never thought about that. I just figured out how to locate the
> > > >>>>>> server for that table after you mentioned it. We'll have to keep an
> > > >>>>>> eye on it next time we have a hotspot to see if it coincides with the
> > > >>>>>> hotspot server.
> > > >>>>>>
> > > >>>>>> What would be the theory for how it could become a hotspot? Isn't the
> > > >>>>>> client supposed to cache it and only go back for a refresh if it hits
> > > >>>>>> a region that is not in its expected location?
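
For what it's worth, one way to check from the 1.x client API which server is
currently hosting hbase:meta is something like this (connection setup omitted;
a sketch, not a full program):

import java.io.IOException;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.RegionLocator;

public class MetaLocation {
    static void printMetaHost(Connection conn) throws IOException {
        try (RegionLocator locator = conn.getRegionLocator(TableName.META_TABLE_NAME)) {
            // hbase:meta is a single region, so the location of the empty start
            // row is the server hosting it.
            System.out.println(locator.getRegionLocation(HConstants.EMPTY_START_ROW).getHostname());
        }
    }
}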
> > > >>>>>>
> > > >>>>>> ----
> > > >>>>>> Saad
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> On Thu, Dec 1, 2016 at 2:56 PM, John Leach <jle...@splicemachine.com>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>> Saad,
> > > >>>>>>>
> > > >>>>>>> Did you validate that Meta is not on the “Hot” region server?
> > > >>>>>>>
> > > >>>>>>> Regards,
> > > >>>>>>> John Leach
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>> On Dec 1, 2016, at 1:50 PM, Saad Mufti <saad.mu...@gmail.com>
> > > >>> wrote:
> > > >>>>>>>>
> > > >>>>>>>> Hi,
> > > >>>>>>>>
> > > >>>>>>>> We are using HBase 1.0 on CDH 5.5.2. We have taken great care to
> > > >>>>>>>> avoid hotspotting due to inadvertent data patterns by prepending an
> > > >>>>>>>> MD5-based 4-digit hash prefix to all our data keys. This works fine
> > > >>>>>>>> most of the time, but more and more recently (as much as once or
> > > >>>>>>>> twice a day) we have occasions where one region server suddenly
> > > >>>>>>>> becomes "hot" (CPU above or around 95% in various monitoring tools).
> > > >>>>>>>> When it happens it lasts for hours, and occasionally the hotspot
> > > >>>>>>>> might jump to another region server as the master decides the region
> > > >>>>>>>> is unresponsive and reassigns it to another server.
> > > >>>>>>>>
> > > >>>>>>>> For the longest time, we thought this must be some single rogue key
> > > >>>>>>>> in our input data that is being hammered. All attempts to track this
> > > >>>>>>>> down have failed, though, and the following behavior argues against
> > > >>>>>>>> this being application-based:
> > > >>>>>>>>
> > > >>>>>>>> 1. Plotting Get and Put rate by region on the "hot" region server
> > > >>>>>>>> in Cloudera Manager charts shows no single region is an outlier.
> > > >>>>>>>>
> > > >>>>>>>> 2. Cleanly restarting just the region server process causes its
> > > >>>>>>>> regions to randomly migrate to other region servers; then it gets
> > > >>>>>>>> new ones from the HBase master, basically a sort of shuffling, and
> > > >>>>>>>> then the hotspot goes away. If it were application-based, you'd
> > > >>>>>>>> expect the hotspot to just jump to another region server.
> > > >>>>>>>>
> > > >>>>>>>> 3. We have pored through the region server logs and can't see
> > > >>>>>>>> anything out of the ordinary happening.
> > > >>>>>>>>
> > > >>>>>>>> The only other pertinent thing to mention might be that we have a
> > > >>>>>>>> special process of our own running outside the cluster that does
> > > >>>>>>>> cluster-wide major compaction in a rolling fashion, where each batch
> > > >>>>>>>> consists of one region from each region server, and it waits until
> > > >>>>>>>> one batch is completely done before starting another. We have seen
> > > >>>>>>>> no real impact on the hotspot from shutting this down, and in normal
> > > >>>>>>>> times it doesn't impact our read or write performance much.
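
For context, that batching could be sketched with the 1.x client API roughly
as follows (one region per server per batch; the wait-for-completion logic is
omitted and all names are placeholders, so this is illustrative only):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.hadoop.hbase.HRegionLocation;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.RegionLocator;

public class RollingMajorCompact {
    static void compactOneBatch(Connection conn, Admin admin, TableName table) throws Exception {
        // Group the table's regions by the server currently hosting them.
        Map<ServerName, List<byte[]>> byServer = new HashMap<>();
        try (RegionLocator locator = conn.getRegionLocator(table)) {
            for (HRegionLocation loc : locator.getAllRegionLocations()) {
                byServer.computeIfAbsent(loc.getServerName(), s -> new ArrayList<>())
                        .add(loc.getRegionInfo().getRegionName());
            }
        }
        // Kick off a major compaction of one region per server; the caller would
        // then poll (e.g. via admin.getCompactionStateForRegion) until the batch
        // finishes before moving on to the next one.
        for (List<byte[]> regions : byServer.values()) {
            if (!regions.isEmpty()) {
                admin.majorCompactRegion(regions.get(0));
            }
        }
    }
}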
> > > >>>>>>>>
> > > >>>>>>>> We are at our wits' end. Does anyone have experience with a
> > > >>>>>>>> scenario like this? Any help/guidance would be most appreciated.
> > > >>>>>>>>
> > > >>>>>>>> -----
> > > >>>>>>>> Saad
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>
> > > >>>>>
> > > >>>
> > > >>
> > > >>
> > >
> > >
> >
> >
> >
>
