OK, let me elaborate on our GC/heap size settings, as Nick requested in the
earlier email (smile):

For on heap:
export HBASE_REGIONSERVER_OPTS="-Xmx43008m -Xms43008m -Xmn8192m
-XX:SurvivorRatio=3 -XX:MaxTenuringThreshold=15 -Xss256k -XX:+UseParNewGC
-XX:+PrintPromotionFailure -XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=63  -XX:ParallelGCThreads=16"
hbase.regionserver.global.memstore.size => 0.35
hbase.regionserver.global.memstore.size.lower.limit => 0.83
hbase.hregion.memstore.mslab.enabled => true
MSLAB pool disabled
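
For reference, with these flags the young generation is 8GB, split into
roughly 4.8GB of eden plus two ~1.6GB survivor spaces (SurvivorRatio=3),
the old generation is about 34GB, and with UseCMSInitiatingOccupancyOnly
the CMS cycle starts at ~63% old-gen occupancy, i.e. around 21.4GB.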

For off heap:
hbase.bucketcache.size => 12288

So that's 42GB on heap and 12GB off heap, still using CMS rather than G1 GC,
with around 12GB of global memstore (0.35 x 42GB gives a ~14.7GB upper bound,
and flushes kick in at 0.83 of that, i.e. ~12.2GB), and MSLAB enabled but no
pool.
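
In case it helps map the above back to config files, the settings correspond
roughly to the hbase-site.xml sketch below. It is a sketch rather than a copy
of our actual file: the ioengine and chunkpool entries are implied by "offheap
bucket cache" and "MSLAB pool disabled" above rather than listed explicitly,
so please double-check the property names against your own version.

  <property>
    <name>hbase.regionserver.global.memstore.size</name>
    <value>0.35</value>
  </property>
  <property>
    <name>hbase.regionserver.global.memstore.size.lower.limit</name>
    <value>0.83</value>
  </property>
  <property>
    <name>hbase.hregion.memstore.mslab.enabled</name>
    <value>true</value>
  </property>
  <property>
    <!-- implied above: required for an offheap bucket cache -->
    <name>hbase.bucketcache.ioengine</name>
    <value>offheap</value>
  </property>
  <property>
    <!-- interpreted as MB here, i.e. 12GB -->
    <name>hbase.bucketcache.size</name>
    <value>12288</value>
  </property>
  <property>
    <!-- implied above: 0 keeps the MSLAB chunk pool disabled -->
    <name>hbase.hregion.memstore.chunkpool.maxsize</name>
    <value>0</value>
  </property>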


Best Regards,
Yu

On 21 November 2016 at 13:34, Anoop John <anoop.hb...@gmail.com> wrote:

> Yu Li,
>         What is the global memstore percent that you have?  40 or 50%?
> G1GC or CMS?  MSLAB on or off?  MSLAB pool?  What percentage?  All of
> these will help us with the write path off-heaping as well.
>
> -Anoop-
>
> On Mon, Nov 21, 2016 at 10:58 AM, Yu Li <car...@gmail.com> wrote:
> > Thanks @Nick for the feedback, and sorry for the lagging response.
> >
> > Regarding the backport, I've opened HBASE-17138 for further discussion,
> > FYI.
> >
> > Regarding the fullGC part, Anoop is right: we still don't have write-path
> > offheap, so full GCs will still happen. And since the block cache normally
> > occupies 30% of the heap (hbase.block.cache.size is 0.3 by default), we
> > reduced the heap size by 30% after moving the block cache offheap, and the
> > full GC time came down by a similar ratio.
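> >
> > (Rough math: scaling the previous ~11s full GC pause by the ~30% heap
> > reduction predicts about 11s x 0.7 ~= 7.7s, which is close to the ~7s we
> > now observe.)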
> >
> > Best Regards,
> > Yu
> >
> > On 21 November 2016 at 00:29, Anoop John <anoop.hb...@gmail.com> wrote:
> >
> >> Regarding your question on the max GC pause, Nick: the use case is a
> >> read + write workload, as far as I know. The write path still creates
> >> heavy on-heap usage and garbage, which might be a reason. Once we have
> >> the write path off heap as well, the bulk of our memory needs can be
> >> handled off heap, and I believe that can bring the pause down further.
> >> The point is that we can then run with a much smaller Xmx. The off-heap
> >> memory we use consists of fixed buffers which live for the life of the
> >> RS, i.e. DBBs in the bucket cache and DBBs in the MemstoreChunkPool.
> >>
> >> -Anoop-
> >>
> >> On Sunday, November 20, 2016, Nick Dimiduk <ndimi...@apache.org> wrote:
> >> > Very encouraging indeed! Thank you so much for sharing your results
> >> > with the community!!! This is excellent to see and we really appreciate
> >> > your openness. I have a couple of comments/questions.
> >> >
> >> > (1) From the DISCUSS thread re: EOL of 1.1, it seems we'll continue to
> >> > support 1.x releases for some time, including with overlap of the 2.0
> >> > line. Based on the energy of the community, I would guess these will be
> >> > maintained until 2.2 at least. Therefore, offheap patches that have seen
> >> > production exposure seem like reasonable candidates for backport,
> >> > perhaps in a 1.4 or 1.5 release timeframe.
> >> >
> >> > (2) I'm surprised to see your max GC pause go from 11s -> 7s. Do I
> >> > understand you correctly? This is an improvement to be sure, but I would
> >> > have expected more gain. Can you elaborate on your GC and heap size
> >> > settings? Have you profiled the heaps at all to see where the pressure
> >> > lies once the bulk of the data pathway is moved to direct memory?
> >> >
> >> > Thanks a lot!
> >> > Nick
> >> >
> >> > On Fri, Nov 18, 2016 at 12:11 AM Yu Li <car...@gmail.com> wrote:
> >> >
> >> >> Dear all,
> >> >>
> >> >> We have backported read path offheap (HBASE-11425) to our customized
> >> >> hbase-1.1.2 (thanks @Anoop for the help/support) and run it online for
> >> >> more than a month, and would like to share our experience, for what
> >> >> it's worth (smile).
> >> >>
> >> >> Generally speaking, we gained a better and more stable
> >> >> throughput/performance with offheap, and below are some details:
> >> >>
> >> >> 1. QPS becomes more stable with offheap
> >> >>
> >> >> Performance w/o offheap:
> >> >>
> >> >> Performance w/ offheap:
> >> >>
> >> >> These data come from our online A/B test cluster (450 physical
> >> >> machines, each with 256GB memory and 64 cores) running real-world
> >> >> workloads. They show that with offheap we get a more stable throughput
> >> >> as well as better performance.
> >> >>
> >> >> We're not showing the full online data here because online we
> >> >> published the version with both offheap and NettyRpcServer together,
> >> >> so there is no standalone comparison data for offheap.
> >> >>
> >> >> 2. Full GC frequency and cost
> >> >>
> >> >> Average full GC STW time was reduced from 11s to 7s with offheap.
> >> >>
> >> >> 3. Young GC frequency and cost
> >> >>
> >> >> No performance degradation observed with offheap.
> >> >>
> >> >> 4. Peak throughput of one single RS
> >> >>
> >> >> On Singles' Day (11/11), the peak throughput of a single RS reached
> >> >> 100K QPS, of which about 90K were Gets. Combined with the network
> >> >> in/out data, we can tell that the average result size of a Get request
> >> >> is ~1KB.
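> >> >>
> >> >> (That works out to roughly 90K x 1KB ~= 90MB/s of Get response payload
> >> >> per region server at peak, ignoring RPC overhead.)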
> >> >>
> >> >> Offheap is used on all online machines (more than 1600 nodes) instead
> >> >> of LruCache, so the above QPS comes from the offheap bucket cache,
> >> >> along with NettyRpcServer (HBASE-15756).
> >> >> Just let us know if you have any comments. Thanks.
> >> >>
> >> >> Best Regards,
> >> >> Yu
