It isn't a doubling. It is a power. If the probability that one resource exceeds the SLA is p, then the probability that two independent resources both exceed the SLA is p^2. For three, the probability is p^3.
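Here is a rough sketch of how one could check the power law numerically. The mixture parameters, the 50 ms threshold constant and the function names are all made up for illustration. They are tuned only to give a median near 20 ms and roughly a 2% chance of exceeding 50 ms, and they are not taken from any real measurement.

```python
import math
import random

# Hypothetical mixture parameters (assumptions for illustration only): chosen so
# the median is near 20 ms and roughly 2% of responses exceed 50 ms.
BODY = (math.log(20.0), 0.3)   # (mu, sigma) of the common, fast component
TAIL = (math.log(60.0), 0.5)   # (mu, sigma) of the rare, slow component
TAIL_WEIGHT = 0.03             # fraction of requests drawn from the slow component
SLA_MS = 50.0


def sample_latency(rng):
    """Draw one response time in ms from the two-component log-normal mixture."""
    mu, sigma = TAIL if rng.random() < TAIL_WEIGHT else BODY
    return rng.lognormvariate(mu, sigma)


def miss_probability(replicas, trials=200_000, seed=1):
    """Estimate P(fastest of `replicas` independent responses > SLA_MS)."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        # The query meets the SLA if ANY replica answers in time,
        # so only the fastest response matters.
        fastest = min(sample_latency(rng) for _ in range(replicas))
        if fastest > SLA_MS:
            misses += 1
    return misses / trials


if __name__ == "__main__":
    p = miss_probability(1)
    for n in (1, 2, 3):
        simulated = miss_probability(n)
        print(f"{n} replica(s): simulated {simulated:.6f}, p^{n} = {p ** n:.6f}")
```

With this many trials the three-replica estimate rests on only a handful of misses, so expect it to be noisy. Raise `trials` if you want that figure to settle down.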
To be concrete, I just did a simulation with a mixture of two log-normal distributions. Using a mixture distribution here is important to emulate the long-tailed nature of response time distributions; normal distributions don't suffice.

With a long-tailed distribution that has a median response time of 20 ms, the raw distribution has about a 2% chance of a response taking longer than 50 ms. Taking the faster of two responses gives a probability of 0.04% of exceeding 50 ms. Taking the fastest of three gives 0.0008%. For most applications, the practical difference between 2 and 3 replicated queries is nil.

Moreover, if the second query is only sent after an artificial delay of a few ms, you get nearly the same improvement in the probability of meeting the SLA, but you pay a much lower average cost because you rarely invoke the redundant queries.

So the reason that 2 are used instead of 3 is that 2 helps a lot while 3 only improves things slightly more.

On Wed, Sep 12, 2012 at 1:01 PM, Constantine Peresypkin <[email protected]> wrote:

> If you do a double query you're increasing your chances to success by
> factor of 2 only.
> Why not triple or quadruple?
>
> On Wed, Sep 12, 2012 at 10:14 PM, Ted Dunning <[email protected]>
> wrote:
>
> > Heavens.... we can easily satisfy both needs.
> >
> > Just have a parameter that can be set to 0 (= universal double query) or
> > Integer.MAX_INTEGER to get no backups at all.
> >
> > On Wed, Sep 12, 2012 at 11:47 AM, Constantine Peresypkin <
> > [email protected]> wrote:
> >
> > >
> > > The PowerDrill paper also mentions a variant of this where each query
> > > fragment is sent to two machines, and the results for that fragment are
> > > used from whatever machine responds first.
> > >
> > >
> > > To send each query or request twice cluster load will be increased by
> > > 100%.
> > >
