Good point Jingcheng, something I had a feeling some of you might ask (smile). I'd say it's mainly for historical reasons, plus we had some concerns about issues like HBASE-16195. But yes, we will consider enabling the pool before the write-path offheap work is done, now that 11/11 has already passed and there are not so many eyes on our cluster.
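For reference, enabling the pool on our 1.1-based branch should just be a matter of the MemStoreChunkPool properties in hbase-site.xml, something like the sketch below (the fractions are only illustrative placeholders rather than values we have settled on, assuming I read the branch-1 MemStoreChunkPool code right):

  <property>
    <name>hbase.hregion.memstore.chunkpool.maxsize</name>
    <!-- fraction of the global memstore size the pool may retain; setting it above 0 enables the pool -->
    <value>0.5</value>
  </property>
  <property>
    <name>hbase.hregion.memstore.chunkpool.initialsize</name>
    <!-- fraction of the pool to pre-allocate at region server startup -->
    <value>0.1</value>
  </property>

The pool only kicks in with hbase.hregion.memstore.mslab.enabled=true, which we already have.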
Best Regards,
Yu

On 22 November 2016 at 14:30, Du, Jingcheng <jingcheng...@intel.com> wrote:

> Thanks Yu.
> You disabled the MSLAB pool in the running, why did you choose to disable
> it?
> I compared the performance with and without MSLAB pool once before, the
> GC pause can be improved if the MSLAB pool was enabled (I used G1GC at
> that time). Maybe the pool can help in your case?
>
> Regards,
> Jingcheng
>
> -----Original Message-----
> From: Yu Li [mailto:car...@gmail.com]
> Sent: Tuesday, November 22, 2016 12:25 PM
> To: dev@hbase.apache.org
> Subject: Re: Use experience and performance data of offheap from Alibaba
> online cluster
>
> Ok, let me elaborate our settings on GC/heapsize per Nick's request in a
> former email (smile):
>
> For on heap:
> export HBASE_REGIONSERVER_OPTS="-Xmx43008m -Xms43008m -Xmn8192m
> -XX:SurvivorRatio=3 -XX:MaxTenuringThreshold=15 -Xss256k -XX:+UseParNewGC
> -XX:+PrintPromotionFailure -XX:+UseCMSInitiatingOccupancyOnly
> -XX:CMSInitiatingOccupancyFraction=63 -XX:ParallelGCThreads=16"
> hbase.regionserver.global.memstore.size => 0.35
> hbase.regionserver.global.memstore.size.lower.limit => 0.83
> hbase.hregion.memstore.mslab.enabled => true
> MSLAB pool disabled
>
> For off heap:
> hbase.bucketcache.size => 12288
>
> So 42GB onheap and 12GB offheap, still using CMS not G1 GC, around 12GB
> memstore globally, with MSLAB but no pool.
>
> Best Regards,
> Yu
>
> On 21 November 2016 at 13:34, Anoop John <anoop.hb...@gmail.com> wrote:
>
> > Yu Li,
> >      What is the global memstore percent that u have? 40 or 50%?
> > G1GC or CMS? MSLAB on or off? MSLAB pool? What percentage? All
> > these will help us in write path off heaping also.
> >
> > -Anoop-
> >
> > On Mon, Nov 21, 2016 at 10:58 AM, Yu Li <car...@gmail.com> wrote:
> > > Thanks @Nick for the feedback, and sorry for the lagging response.
> > >
> > > Regarding the backport, I've opened HBASE-17138 for further
> > > discussion, FYI.
> > >
> > > Regarding the fullGC part, Anoop is right, we still don't have
> > > write-path offheap so there'll still be fullGC happening. And since
> > > the block cache normally occupies 30% of the overall heap
> > > (hbase.block.cache.size is 0.3 by default), we reduce the heap size
> > > by 30% after making blockcache offheap, and the fullGC time is
> > > reduced by a similar ratio.
> > >
> > > Best Regards,
> > > Yu
> > >
> > > On 21 November 2016 at 00:29, Anoop John <anoop.hb...@gmail.com> wrote:
> > >
> > > > Regarding ur Q on max GC pause Nick, the use case is R+W workload.
> > > > This is what I know.. the write path is still having heavy on heap
> > > > usage and garbage. Might be a reason I guess.. once we can have
> > > > write path off heap also, the huge % of our memory need can then
> > > > be handled by off heap and I believe that can bring this further
> > > > down. The point is then we can work with much reduced xmx. The off
> > > > heap memory what we use is fixed buffers which live for the life
> > > > of the RS, ie. DBBs in Bucket cache / DBBs in MemstoreChunkPool.
> > > >
> > > > -Anoop-
> > > >
> > > > On Sunday, November 20, 2016, Nick Dimiduk <ndimi...@apache.org> wrote:
> > > > > Very encouraging indeed! Thank you so much for sharing your
> > > > > results with the community!!! This is excellent to see and we
> > > > > really appreciate your openness. I have a couple comments/questions.
> > > > >
> > > > > (1) from the DISCUSS thread re: EOL of 1.1, it seems we'll
> > > > > continue to support 1.x releases for some time, including with
> > > > > overlap of the 2.0 line. Based on the energy of the community, I
> > > > > would guess these will be maintained until 2.2 at least.
> > > > > Therefore, offheap patches that have seen production exposure
> > > > > seem like a reasonable candidate for backport, perhaps in a 1.4
> > > > > or 1.5 release timeframe.
> > > > >
> > > > > (2) I'm surprised to see your max GC pause go from 11s -> 7s. Do
> > > > > I understand you correctly? This is an improvement to be sure,
> > > > > but I would have expected more gain. Can you elaborate on your
> > > > > GC and heapsize settings? Have you profiled the heaps at all to
> > > > > see where the pressure lies once the bulk of the data pathway is
> > > > > moved to direct memory?
> > > > >
> > > > > Thanks a lot!
> > > > > Nick
> > > > >
> > > > > On Fri, Nov 18, 2016 at 12:11 AM Yu Li <car...@gmail.com> wrote:
> > > > >
> > > > > > Dear all,
> > > > > >
> > > > > > We have backported read path offheap (HBASE-11425) to our
> > > > > > customized hbase-1.1.2 (thanks @Anoop for the help/support)
> > > > > > and run it online for more than a month, and would like to
> > > > > > share our experience, for what it's worth (smile).
> > > > > >
> > > > > > Generally speaking, we gained a better and more stable
> > > > > > throughput/performance with offheap, and below are some details:
> > > > > >
> > > > > > 1. QPS become more stable with offheap
> > > > > >
> > > > > > Performance w/o offheap:
> > > > > >
> > > > > > Performance w/ offheap:
> > > > > >
> > > > > > These data come from our online A/B test cluster (with 450
> > > > > > physical machines, and each with 256G memory + 64 core) with
> > > > > > real world workloads, it shows using offheap we could gain a
> > > > > > more stable throughput as well as better performance
> > > > > >
> > > > > > Not showing fully online data here because for online we
> > > > > > published the version with both offheap and NettyRpcServer
> > > > > > together, so no standalone comparison data for offheap
> > > > > >
> > > > > > 2. Full GC frequency and cost
> > > > > >
> > > > > > Average Full GC STW time reduce from 11s to 7s with offheap.
> > > > > >
> > > > > > 3. Young GC frequency and cost
> > > > > >
> > > > > > No performance degradation observed with offheap.
> > > > > >
> > > > > > 4. Peak throughput of one single RS
> > > > > >
> > > > > > On Singles Day (11/11), peak throughput of one single RS
> > > > > > reached 100K, among which 90K from Get. Plus internet in/out
> > > > > > data we could know the average result size of get request is ~1KB
> > > > > >
> > > > > > Offheap are used on all online machines (more than 1600 nodes)
> > > > > > instead of LruCache, so the above QPS is gained from offheap
> > > > > > bucketcache, along with NettyRpcServer(HBASE-15756).
> > > > > > Just let us know if any comments. Thanks.
> > > > > >
> > > > > > Best Regards,
> > > > > > Yu
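(A note on wiring this up: the settings quoted above map onto hbase-site.xml and hbase-env.sh roughly as sketched below. The thread only lists hbase.bucketcache.size, so the ioengine setting and the MaxDirectMemorySize flag are assumptions about what an offheap bucket cache deployment would also need; the 14336m figure is just the 12GB cache plus some illustrative headroom for RPC direct buffers.)

In hbase-env.sh, alongside the existing HBASE_REGIONSERVER_OPTS:

  # assumed: direct memory must cover the bucket cache plus working buffers
  export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:MaxDirectMemorySize=14336m"

In hbase-site.xml:

  <property>
    <name>hbase.bucketcache.ioengine</name>
    <!-- assumed: "offheap" puts the bucket cache in direct memory -->
    <value>offheap</value>
  </property>
  <property>
    <name>hbase.bucketcache.size</name>
    <!-- on 1.x a value above 1.0 is read as megabytes, i.e. a 12GB cache -->
    <value>12288</value>
  </property>
  <property>
    <name>hbase.regionserver.global.memstore.size</name>
    <value>0.35</value>
  </property>
  <property>
    <name>hbase.regionserver.global.memstore.size.lower.limit</name>
    <value>0.83</value>
  </property>
  <property>
    <name>hbase.hregion.memstore.mslab.enabled</name>
    <value>true</value>
  </property>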