Re: Jeff Dean on fast response in an unreliable world

Ted Dunning Wed, 12 Sep 2012 15:41:41 -0700

It isn't a doubling.  It is a power.

If the probability of exceeding the SLA is p, then the probability that two
independent resources will both exceed the SLA is p^2.  For three, the
probability is p^3.
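
As a quick check of that arithmetic, here is a minimal sketch in Python.  The
function name is made up for illustration, and p = 0.02 is borrowed from the
2% figure in the simulation discussed in this message.  It assumes the
requests fail independently.

```python
def miss_probability(p: float, replicas: int) -> float:
    """Chance that all `replicas` independent requests exceed the SLA."""
    return p ** replicas


# p = 0.02 is the single-request chance of exceeding the SLA.
for k in (1, 2, 3):
    print(f"{k} request(s): {miss_probability(0.02, k):.4%}")
```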

To be concrete, I just did a simulation with a mixture of two log-normal
distributions.  Using a mixture distribution here is important to emulate
the long-tailed nature of response time distributions ... it doesn't
suffice to use normal distributions.

With a long-tailed distribution that has a median response of 20 ms, the
raw distribution has about a 2% chance of a response > 50 ms.  Using
the lesser of two responses gives a probability of a > 50 ms response of
0.04%.  Three responses give a probability of 0.0008%.  For most
applications, the difference between 2 and 3 replicated queries is
negligible.
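
A simulation along these lines can be sketched as follows.  The mixture
weights and log-normal parameters are assumptions, picked only to give a
median near 20 ms and roughly a 2% chance of exceeding 50 ms.  They are not
the parameters used for the numbers above, so the estimates will differ
somewhat, and the three-replica estimate is noisy at this sample size.

```python
import math
import random


def sample_latency_ms(rng: random.Random) -> float:
    # Assumed mixture: 95% "normal" responses around 20 ms,
    # 5% slow responses with a much heavier tail.
    if rng.random() < 0.95:
        return rng.lognormvariate(math.log(20.0), 0.3)
    return rng.lognormvariate(math.log(40.0), 1.0)


def tail_probabilities(trials=200_000, sla_ms=50.0, max_replicas=3, seed=42):
    """Estimate P(fastest of k responses > sla_ms) for k = 1..max_replicas."""
    rng = random.Random(seed)
    misses = [0] * max_replicas
    for _ in range(trials):
        best = math.inf
        for k in range(max_replicas):
            # `best` is the fastest of the first k + 1 independent responses.
            best = min(best, sample_latency_ms(rng))
            if best > sla_ms:
                misses[k] += 1
    return [m / trials for m in misses]


for k, prob in enumerate(tail_probabilities(), start=1):
    print(f"{k} request(s): P(> 50 ms) ~ {prob:.4%}")
```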

Moreover, if the second query has an artificial delay of a few ms, you get
nearly the same improvements in probability of meeting the SLA, but you pay
much lower average cost because you rarely invoke the redundant queries.
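
The delayed backup can be sketched the same way.  Everything here is an
assumption made for illustration: the latency mixture, the 50 ms SLA, the
delay values tried, and the function names.  The backup is sent only if the
primary has not answered within delay_ms, and the faster of the two answers
is used.  A delay of 0 corresponds to always sending both queries.

```python
import math
import random


def sample_latency_ms(rng: random.Random) -> float:
    # Assumed long-tailed mixture of two log-normals (median near 20 ms).
    if rng.random() < 0.95:
        return rng.lognormvariate(math.log(20.0), 0.3)
    return rng.lognormvariate(math.log(40.0), 1.0)


def hedged_request_stats(delay_ms, trials=200_000, sla_ms=50.0, seed=42):
    """Return (P(response > sla_ms), fraction of requests that sent a backup)."""
    rng = random.Random(seed)
    misses = 0
    backups = 0
    for _ in range(trials):
        latency = sample_latency_ms(rng)
        if latency > delay_ms:
            # Primary still outstanding at delay_ms, so the backup goes out.
            backups += 1
            latency = min(latency, delay_ms + sample_latency_ms(rng))
        if latency > sla_ms:
            misses += 1
    return misses / trials, backups / trials


for delay in (0.0, 5.0, 25.0):
    miss, backup = hedged_request_stats(delay)
    print(f"delay {delay:4.1f} ms: P(> 50 ms) ~ {miss:.4%}, backups sent {backup:.1%}")
```

Sweeping delay_ms shows how the miss probability and the fraction of
requests that trigger a backup move against each other for a given latency
distribution.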

So the reason that 2 are used instead of 3 is that 2 helps a lot while 3
only improves things slightly more.

On Wed, Sep 12, 2012 at 1:01 PM, Constantine Peresypkin <
[email protected]> wrote:

> If you do a double query you're increasing your chances to success by
> factor of 2 only.
> Why not triple or quadruple?
>
> On Wed, Sep 12, 2012 at 10:14 PM, Ted Dunning <[email protected]>
> wrote:
>
> > Heavens.... we can easily satisfy both needs.
> >
> > Just have a parameter that can be set to 0 (= universal double query) or
> > Integer.MAX_VALUE to get no backups at all.
> >
> > On Wed, Sep 12, 2012 at 11:47 AM, Constantine Peresypkin <
> > [email protected]> wrote:
> >
> > > > The PowerDrill paper also mentions a variant of this where each query
> > > > fragment is sent to two machines, and the results for that fragment are
> > > > used from whatever machine responds first.
> > >
> > >
> > > To send each query or request twice cluster load will be increased by
> > > 100%.
> > >
> >
>
