Re: Hot Region Server With No Hot Region

Ted Yu Tue, 13 Dec 2016 12:57:53 -0800

I looked at CellCounter, but it doesn't provide what you are looking for.

Maybe we can enhance it so that, given a threshold on the number of
qualifiers in a row (say 100,000), it outputs the rows that have at least
that many qualifiers.
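
As a rough illustration of the counting such an enhancement would need
(a sketch only, not existing CellCounter code): if the table is scanned
with a small batch size (Scan.setBatch) and a KeyOnlyFilter, a wide row
comes back as several partial Results instead of one huge one, so the
client should not need to hold the whole row in memory. The per-row totals
can then be accumulated chunk by chunk. The names below (WideRowFinder,
Chunk, findWideRows) are hypothetical, and the chunks are plain Java
objects standing in for scan results, so nothing here has been run
against a cluster.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: report rows whose total cell count reaches a threshold. */
final class WideRowFinder {

    /**
     * Hypothetical stand-in for one partial result of a batched scan:
     * the row key plus the number of cells carried by this chunk.
     */
    static final class Chunk {
        final String rowKey;
        final int cellCount;

        Chunk(String rowKey, int cellCount) {
            this.rowKey = rowKey;
            this.cellCount = cellCount;
        }
    }

    /**
     * Assumes chunks of the same row arrive consecutively, as they would
     * from a scan in row-key order. Only a running counter is kept, never
     * the cells themselves.
     */
    static Map<String, Long> findWideRows(Iterable<Chunk> chunks, long threshold) {
        Map<String, Long> wideRows = new LinkedHashMap<>();
        String currentRow = null;
        long count = 0;
        for (Chunk chunk : chunks) {
            if (!chunk.rowKey.equals(currentRow)) {
                // Row boundary: record the previous row if it was wide enough.
                if (currentRow != null && count >= threshold) {
                    wideRows.put(currentRow, count);
                }
                currentRow = chunk.rowKey;
                count = 0;
            }
            count += chunk.cellCount;
        }
        // Do not forget the last row of the scan.
        if (currentRow != null && count >= threshold) {
            wideRows.put(currentRow, count);
        }
        return wideRows;
    }

    public static void main(String[] args) {
        List<Chunk> chunks = Arrays.asList(
            new Chunk("0001-a", 3),
            new Chunk("0002-b", 1000),
            new Chunk("0002-b", 1000),
            new Chunk("0002-b", 500),
            new Chunk("0003-c", 7));
        System.out.println(findWideRows(chunks, 2000));
    }
}
```

With a threshold of 2000 this prints {0002-b=2500}. Against HBase, each
Chunk would presumably be built from a Result's row key (Result.getRow)
and its cell count (Result.size).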

On Tue, Dec 13, 2016 at 12:47 PM, Saad Mufti <[email protected]> wrote:

> Thanks everyone for the feedback. We tracked this down to having a bad
> design using dynamic columns, there were a few (very few) rows that
> accumulated up to 200,000 dynamic columns. When we got any activity that
> caused us to try to read one of these rows, it resulted in a hot region
> server.
>
> Follow up question, we are now in the process of cleaning up those rows as
> identified, but some are so big that trying to read them in the cleanup
> process kills it with out of memory exceptions. Is there any way to
> identify rows with too many columns without actually reading them all?
>
> Thanks.
>
> ----
> Saad
>
>
> On Sat, Dec 3, 2016 at 3:20 PM, Ted Yu <[email protected]> wrote:
>
> > I took a look at the stack trace.
> >
> > Region server log would give us more detail on the frequency and duration
> > of compactions.
> >
> > Cheers
> >
> > On Sat, Dec 3, 2016 at 7:39 AM, Jeremy Carroll <[email protected]>
> > wrote:
> >
> > > I would check compaction, investigate throttling if it's causing high CPU.
> > >
> > > On Sat, Dec 3, 2016 at 6:20 AM Saad Mufti <[email protected]> wrote:
> > >
> > > > No.
> > > >
> > > > ----
> > > > Saad
> > > >
> > > >
> > > > On Fri, Dec 2, 2016 at 3:27 PM, Ted Yu <[email protected]> wrote:
> > > >
> > > > > Somehow I couldn't access the pastebin (I am in China now).
> > > > > Did the region server showing the hotspot host meta?
> > > > > Thanks
> > > > >
> > > > >     On Friday, December 2, 2016 11:53 AM, Saad Mufti <[email protected]> wrote:
> > > > >
> > > > >
> > > > > We're in AWS with D2.4xLarge instances. Each instance has 12
> > > > > independent spindles/disks from what I can tell.
> > > > >
> > > > > We have charted get_rate and mutate_rate by host and
> > > > >
> > > > > a) mutate_rate shows no real outliers
> > > > > b) read_rate shows the overall rate on the "hotspot" region server
> > > > > is a bit higher than every other server, not severely but enough
> > > > > that it is a bit noticeable. But when we chart get_rate on that
> > > > > server by region, no one region stands out.
> > > > >
> > > > > get_rate chart by host:
> > > > >
> > > > > https://snag.gy/hmoiDw.jpg
> > > > >
> > > > > mutate_rate chart by host:
> > > > >
> > > > > https://snag.gy/jitdMN.jpg
> > > > >
> > > > > ----
> > > > > Saad
> > > > >
> > > > >
> > > > > On Fri, Dec 2, 2016 at 2:34 PM, John Leach <[email protected]> wrote:
> > > > >
> > > > > > Here is what I see...
> > > > > >
> > > > > >
> > > > > > * Short Compaction Running on Heap
> > > > > > "regionserver/ip-10-99-181-146.aolp-prd.us-east-1.ec2.aolcloud.net/10.99.181.146:60020-shortCompactions-1480229281547" - Thread t@242
> > > > > >    java.lang.Thread.State: RUNNABLE
> > > > > >    at org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder.compressSingleKeyValue(FastDiffDeltaEncoder.java:270)
> > > > > >    at org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder.internalEncode(FastDiffDeltaEncoder.java:245)
> > > > > >    at org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder.encode(BufferedDataBlockEncoder.java:987)
> > > > > >    at org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder.encode(FastDiffDeltaEncoder.java:58)
> > > > > >    at org.apache.hadoop.hbase.io.hfile.HFileDataBlockEncoderImpl.encode(HFileDataBlockEncoderImpl.java:97)
> > > > > >    at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.write(HFileBlock.java:866)
> > > > > >    at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:270)
> > > > > >    at org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87)
> > > > > >    at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:949)
> > > > > >    at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:282)
> > > > > >    at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:105)
> > > > > >    at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:124)
> > > > > >    at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1233)
> > > > > >    at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1770)
> > > > > >    at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:520)
> > > > > >    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> > > > > >    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> > > > > >    at java.lang.Thread.run(Thread.java:745)
> > > > > >
> > > > > >
> > > > > > * WAL Syncs waiting…  ALL 5
> > > > > > "sync.0" - Thread t@202
> > > > > >    java.lang.Thread.State: TIMED_WAITING
> > > > > >    at java.lang.Object.wait(Native Method)
> > > > > >    - waiting on <67ba892d> (a java.util.LinkedList)
> > > > > >    at org.apache.hadoop.hdfs.DFSOutputStream.waitForAckedSeqno(DFSOutputStream.java:2337)
> > > > > >    at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2224)
> > > > > >    at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:2116)
> > > > > >    at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
> > > > > >    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:173)
> > > > > >    at org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1379)
> > > > > >    at java.lang.Thread.run(Thread.java:745)
> > > > > >
> > > > > > * Mutations backing up very badly...
> > > > > >
> > > > > > "B.defaultRpcServer.handler=103,queue=7,port=60020" - Thread t@155
> > > > > >    java.lang.Thread.State: TIMED_WAITING
> > > > > >    at java.lang.Object.wait(Native Method)
> > > > > >    - waiting on <6ab54ea3> (a org.apache.hadoop.hbase.regionserver.wal.SyncFuture)
> > > > > >    at org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:167)
> > > > > >    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.blockOnSync(FSHLog.java:1504)
> > > > > >    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.publishSyncThenBlockOnCompletion(FSHLog.java:1498)
> > > > > >    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.sync(FSHLog.java:1632)
> > > > > >    at org.apache.hadoop.hbase.regionserver.HRegion.syncOrDefer(HRegion.java:7737)
> > > > > >    at org.apache.hadoop.hbase.regionserver.HRegion.processRowsWithLocks(HRegion.java:6504)
> > > > > >    at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:6352)
> > > > > >    at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:6334)
> > > > > >    at org.apache.hadoop.hbase.regionserver.HRegion.mutateRow(HRegion.java:6325)
> > > > > >    at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutateRows(RSRpcServices.java:418)
> > > > > >    at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:1916)
> > > > > >    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32213)
> > > > > >    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2034)
> > > > > >    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
> > > > > >    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
> > > > > >    at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
> > > > > >    at java.lang.Thread.run(Thread.java:745)
> > > > > >
> > > > > >
> > > > > > Too many writers being blocked attempting to write to WAL.
> > > > > >
> > > > > > What does your disk infrastructure look like?  Can you get away with
> > > > > > Multi-wal?  Ugh...
> > > > > >
> > > > > > Regards,
> > > > > > John Leach
> > > > > >
> > > > > >
> > > > > > > On Dec 2, 2016, at 1:20 PM, Saad Mufti <[email protected]> wrote:
> > > > > > >
> > > > > > > Hi Ted,
> > > > > > >
> > > > > > > Finally we have another hotspot going on, same symptoms as before,
> > > > > > > here is the pastebin for the stack trace from the region server
> > > > > > > that I obtained via VisualVM:
> > > > > > >
> > > > > > > http://pastebin.com/qbXPPrXk
> > > > > > >
> > > > > > > Would really appreciate any insight you or anyone else can provide.
> > > > > > >
> > > > > > > Thanks.
> > > > > > >
> > > > > > > ----
> > > > > > > Saad
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Dec 1, 2016 at 6:08 PM, Saad Mufti <[email protected]> wrote:
> > > > > > >
> > > > > > >> Sure will, the next time it happens.
> > > > > > >>
> > > > > > >> Thanks!!!
> > > > > > >>
> > > > > > >> ----
> > > > > > >> Saad
> > > > > > >>
> > > > > > >>
> > > > > > >> On Thu, Dec 1, 2016 at 5:01 PM, Ted Yu <[email protected]> wrote:
> > > > > > >>
> > > > > > >>> From #2 in the initial email, the hbase:meta might not be the
> > > > > > >>> cause for the hotspot.
> > > > > > >>>
> > > > > > >>> Saad:
> > > > > > >>> Can you pastebin stack trace of the hot region server when this
> > > > > > >>> happens again ?
> > > > > > >>>
> > > > > > >>> Thanks
> > > > > > >>>
> > > > > > >>>> On Dec 2, 2016, at 4:48 AM, Saad Mufti <[email protected]> wrote:
> > > > > > >>>>
> > > > > > >>>> We used a pre-split into 1024 regions at the start but we
> > > > > > >>>> miscalculated our data size, so there were still auto-split
> > > > > > >>>> storms at the beginning as data size stabilized, it has ended up
> > > > > > >>>> at around 9500 or so regions, plus a few thousand regions for a
> > > > > > >>>> few other tables (much smaller). But haven't had any new
> > > > > > >>>> auto-splits in a couple of months. And the hotspots only started
> > > > > > >>>> happening recently.
> > > > > > >>>>
> > > > > > >>>> Our hashing scheme is very simple, we take the MD5 of the key,
> > > > > > >>>> then form a 4 digit prefix based on the first two bytes of the
> > > > > > >>>> MD5 normalized to be within the range 0-1023. I am fairly
> > > > > > >>>> confident about this scheme especially since even during the
> > > > > > >>>> hotspot we see no evidence so far that any particular region is
> > > > > > >>>> taking disproportionate traffic (based on Cloudera Manager per
> > > > > > >>>> region charts on the hotspot server). Does that look like a
> > > > > > >>>> reasonable scheme to randomize which region any given key goes
> > > > > > >>>> to? And the start of the hotspot doesn't seem to correspond to
> > > > > > >>>> any region splitting or moving from one server to another
> > > > > > >>>> activity.
> > > > > > >>>>
> > > > > > >>>> Thanks.
> > > > > > >>>>
> > > > > > >>>> ----
> > > > > > >>>> Saad
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>>> On Thu, Dec 1, 2016 at 3:32 PM, John Leach <[email protected]> wrote:
> > > > > > >>>>>
> > > > > > >>>>> Saad,
> > > > > > >>>>>
> > > > > > >>>>> Region move or split causes client connections to
> > > > > > >>>>> simultaneously refresh their meta.
> > > > > > >>>>>
> > > > > > >>>>> Key word is supposed.  We have seen meta hot spotting from time
> > > > > > >>>>> to time and on different versions at Splice Machine.
> > > > > > >>>>>
> > > > > > >>>>> How confident are you in your hashing algorithm?
> > > > > > >>>>>
> > > > > > >>>>> Regards,
> > > > > > >>>>> John Leach
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>>>> On Dec 1, 2016, at 2:25 PM, Saad Mufti <[email protected]> wrote:
> > > > > > >>>>>>
> > > > > > >>>>>> No never thought about that. I just figured out how to locate
> > > > > > >>>>>> the server for that table after you mentioned it. We'll have
> > > > > > >>>>>> to keep an eye on it next time we have a hotspot to see if it
> > > > > > >>>>>> coincides with the hotspot server.
> > > > > > >>>>>>
> > > > > > >>>>>> What would be the theory for how it could become a hotspot?
> > > > > > >>>>>> Isn't the client supposed to cache it and only go back for a
> > > > > > >>>>>> refresh if it hits a region that is not in its expected
> > > > > > >>>>>> location?
> > > > > > >>>>>>
> > > > > > >>>>>> ----
> > > > > > >>>>>> Saad
> > > > > > >>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>> On Thu, Dec 1, 2016 at 2:56 PM, John Leach <[email protected]> wrote:
> > > > > > >>>>>>
> > > > > > >>>>>>> Saad,
> > > > > > >>>>>>>
> > > > > > >>>>>>> Did you validate that Meta is not on the “Hot” region server?
> > > > > > >>>>>>>
> > > > > > >>>>>>> Regards,
> > > > > > >>>>>>> John Leach
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>>> On Dec 1, 2016, at 1:50 PM, Saad Mufti <[email protected]> wrote:
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Hi,
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> We are using HBase 1.0 on CDH 5.5.2. We have taken great
> > > > > > >>>>>>>> care to avoid hotspotting due to inadvertent data patterns
> > > > > > >>>>>>>> by prepending an MD5 based 4 digit hash prefix to all our
> > > > > > >>>>>>>> data keys. This works fine most of the time, but more and
> > > > > > >>>>>>>> more (as much as once or twice a day) recently we have
> > > > > > >>>>>>>> occasions where one region server suddenly becomes "hot"
> > > > > > >>>>>>>> (CPU above or around 95% in various monitoring tools). When
> > > > > > >>>>>>>> it happens it lasts for hours, occasionally the hotspot
> > > > > > >>>>>>>> might jump to another region server as the master decides
> > > > > > >>>>>>>> the region is unresponsive and gives its region to another
> > > > > > >>>>>>>> server.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> For the longest time, we thought this must be some single
> > > > > > >>>>>>>> rogue key in our input data that is being hammered. All
> > > > > > >>>>>>>> attempts to track this down have failed though, and the
> > > > > > >>>>>>>> following behavior argues against this being application
> > > > > > >>>>>>>> based:
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> 1. plotted Get and Put rate by region on the "hot" region
> > > > > > >>>>>>>> server in Cloudera Manager Charts, shows no single region is
> > > > > > >>>>>>>> an outlier.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> 2. cleanly restarting just the region server process causes
> > > > > > >>>>>>>> its regions to randomly migrate to other region servers,
> > > > > > >>>>>>>> then it gets new ones from the HBase master, basically a
> > > > > > >>>>>>>> sort of shuffling, then the hotspot goes away. If it were
> > > > > > >>>>>>>> application based, you'd expect the hotspot to just jump to
> > > > > > >>>>>>>> another region server.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> 3. have pored through region server logs and can't see
> > > > > > >>>>>>>> anything out of the ordinary happening
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> The only other pertinent thing to mention might be that we
> > > > > > >>>>>>>> have a special process of our own running outside the
> > > > > > >>>>>>>> cluster that does cluster wide major compaction in a rolling
> > > > > > >>>>>>>> fashion, where each batch consists of one region from each
> > > > > > >>>>>>>> region server, and it waits until one batch is completely
> > > > > > >>>>>>>> done before starting another. We have seen no real impact on
> > > > > > >>>>>>>> the hotspot from shutting this down and in normal times it
> > > > > > >>>>>>>> doesn't impact our read or write performance much.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> We are at our wit's end, anyone have experience with a
> > > > > > >>>>>>>> scenario like this? Any help/guidance would be most
> > > > > > >>>>>>>> appreciated.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> -----
> > > > > > >>>>>>>> Saad
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>
> > > > > > >>
> > > > > > >>
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>
