I have been experimenting with out-of-process compaction recently. I will post
my results (and a patch) when they are ready. Compaction in HBase is an
area with many possible improvements.

-Vladimir Rodionov

On Thu, Oct 9, 2014 at 2:30 PM, Andrew Purtell <[email protected]> wrote:

> On Thu, Oct 9, 2014 at 6:31 AM, Jean-Marc Spaggiari <
> [email protected]
> > wrote:
>
> > For #4, one more thing we might want to add is a safety valve to increase
> > the throttle in case the compaction queue becomes bigger than a certain value?
> >
> > JM
> >
> >
> Would that make resolution of the problem leading to a large queue in the
> first place more difficult, do you think?
>
>
>
>
> > 2014-10-09 1:20 GMT-04:00 lars hofhansl <[email protected]>:
> >
> > > Hi Michael,
> > >
> > > your math is right.
> > >
> > >
> > > I think the issue is that it actually is easy to max out the ToR switch
> > > (and hence starve out other traffic), so we might want to protect the
> ToR
> > > switch from prolonged heavy compaction traffic in order to keep some of
> > the
> > > bandwidth free for other traffic.
> > > Vladimir's issues were around slowing other traffic while compactions are
> > > running.
> > >
> > >
> > > -- Lars
> > >
> > >
> > >
> > > ----- Original Message -----
> > > From: Michael Segel <[email protected]>
> > > To: [email protected]; lars hofhansl <[email protected]>
> > > Cc: Vladimir Rodionov <[email protected]>
> > > Sent: Wednesday, October 8, 2014 12:30 PM
> > > Subject: Re: Compactions nice to have features
> > >
> > >
> > >
> > > On Oct 5, 2014, at 11:01 PM, lars hofhansl <[email protected]> wrote:
> > >
> > > >>> - rack IO throttle. We should add that to accommodate for over
> > > subscription at the ToR level.
> > > >> Can you decipher that, Lars?
> > > >
> > > > ToR is "Top of Rack" switch. Over subscription means that a ToR
> switch
> > > usually does not have enough bandwidth to serve traffic in and out of
> > rack
> > > at full speed.
> > > > For example if you had 40 machines in a rack with 1ge links each, and
> > > the ToR switch has a 10ge uplink, you'd say the ToR switch is 4 to 1
> over
> > > > subscribed.
> > > >
> > > >
> > > > Was just trying to say: "Yeah, we need that" :)
> > > >
> > >
> > >
> > > Hmmm.
> > >
> > > Rough math…  using 3.5” SATA II (7200 RPM) drives … 4 drives would max
> > out
> > > 1GbE.  So then  a server with 12 drives would max out 3Gb/S. Assuming
> > 3.5”
> > > drives. 2.5” drives and SATAIII would push this up.
> > > So in theory you could get 5Gb/S or more from a node.
> > >
> > > > 16 servers per rack… (again YMMV based on power, heat, etc … ) that’s
> > 48Gb/S
> > > and up.
> > >
> > > If you had 20 servers and they had smaller (2.5” drives) 5Gb/S x 20 =
> > > 100Gb/S.
> > >
> > > So what’s the width of the fabric?  (YMMV based on ToR)
> > >
> > > I don’t know why you’d want to ‘throttle’ because the limits of the ToR
> > > would throttle you already.
> > >
> > > Of course I’m assuming that you’re running a M/R job that’s going full
> > > bore.
> > >
> > >
> > > Are you seeing this?
> > > I would imagine that you’d have a long running job maxing out the I/O
> and
> > > seeing a jump in wait CPU over time.
> > >
> > > And what’s the core to spindle ratio?
> > >
> >
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>

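For anyone who wants to replay the numbers quoted in this thread, here is a
minimal Python sketch of the arithmetic. The function names are made up for
illustration and are not part of HBase. The "4 drives per 1GbE" figure is
Michael's rough estimate from the thread, not a measured value.

```python
def oversubscription_ratio(hosts: int, host_link_gbps: float, uplink_gbps: float) -> float:
    """Total host-facing bandwidth divided by the ToR uplink bandwidth."""
    return (hosts * host_link_gbps) / uplink_gbps


def node_disk_gbps(drives: int, drives_per_gbps: float = 4.0) -> float:
    """Rough per-node disk throughput, assuming ~4 drives saturate 1 Gb/s
    (the estimate from the thread, an assumption rather than a benchmark)."""
    return drives / drives_per_gbps


def rack_demand_gbps(servers: int, per_node_gbps: float) -> float:
    """Aggregate throughput a rack could generate if every node ran flat out."""
    return servers * per_node_gbps


if __name__ == "__main__":
    # Lars's example: 40 machines with 1GbE links behind a 10GbE uplink.
    print(oversubscription_ratio(40, 1.0, 10.0))       # 4.0, i.e. 4 to 1

    # Michael's example: 12 drives per server, 16 servers per rack.
    per_node = node_disk_gbps(12)                       # 3.0 Gb/s
    print(rack_demand_gbps(16, per_node))               # 48.0 Gb/s

    # Michael's second example: 20 servers at about 5 Gb/s each.
    print(rack_demand_gbps(20, 5.0))                    # 100.0 Gb/s
```

Comparing the rack demand against the uplink capacity gives a feel for how
quickly sustained compaction traffic could crowd out other traffic at the ToR.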