Re: Priority based queuing

Willy Tarreau Wed, 02 May 2018 10:23:32 -0700

On Wed, May 02, 2018 at 12:44:06PM -0400, Patrick Hemmer wrote:
> Can you elaborate on what you're thinking of for a time-based queue?
> 
> What I'm imagining you mean is that you would write a rule to set the
> max queue time, and haproxy would insert it into the queue sorting on
> TIME_NOW() + MAX_QUEUE_TIME. The main difference I see to this approach
> vs scoring, is that you ensure that an item doesn't sit on the queue
> forever (or whatever `timeout queue` is set to) if higher priority stuff
> keeps getting inserted before it.


In part, but mostly that under contention you can still maintain a
good quality of service. I remember having had a discussion a very
long time ago with a gaming site operator explaining that he wanted
to send profile management requests to a dedicated slow server so
that people filling in their profile do not disturb the ones playing.
For me it's a good example : "how long do you think these people are
willing to wait for the server to respond to each click to update
their profile?". Let's say 2 seconds is OK, you just add a 2s offset
to the position in the queue for such requests. If in the mean time
other higher priority requests come in, the delayed one may face the
2 seconds delay. But it will not face more. And obviously if no other
request is pending before it in the queue it will be served earlier.

A similar example is for certain shopping sites which consider that
once you have something in your shopping cart, you're much more likely
to finish the process and to pay because you've found the item you
were looking for, so you're more willing to tolerate a slightly higher
response time, and as such provide a better response time to newcomers.
But while this small delay can easily be expressed in milliseconds
(probably 50/100), it's almost impossible to define it using positions.
What if 50 new visitors suddenly come in ? You don't want the guy with
his item to suddenly experience a timeout. The time-based queue however
grants you a control over the service degradation you're accepting to
inflict to your users.

Another interesting example is when you want ot provide strong service
guarantees to some top-level visitors while running under intense load.
By using only a position, it's easy to say "I want the Gold class to
advance by 1000 positions". But if the site is highly loaded and you
have 10k pending requests, these 1000 positions remain irrelevant,
because the requests sits there, waiting for 90% of the pollution to
be flushed. If you say "I want these requests to advance by 10 seconds"
then no matter how many other requests you have in the queue, it will
really advance by 10 seconds and effectively shrink the response time
by 10 seconds.

Overall for user experience it's important to think in time and not
positions. The rationale behind this is simple : the user will never
accept as a justification for varying quality of service the number of
other visitors. And positions just depend on this, it's an average number
of people you're allowing to pass over. If I'm granted a pass ticket
allowing me to advance 10 places in the cinema queue and I find a
crowded street, I will go back home. If I'm told I won't wait more
than 5 minutes, I'll come regardless of the number of waiters.

> I don't think this is necessarily a good thing.

For having considered the two options several times in field over the
last decade when sites started to crawl, I became very well convinced ;-)

> If you're under a DOS
> attack, the goal is to get the good requests processed before any
> possible malicious requests. With a time based queue, those malicious
> requests will still get processed and starve out the good requests.

Precisely not that much because the good ones will pass before, and
the malicious ones will then be subject to the queue timeout if there
are too many.

> For
> example lets say you're under attack, a bad request comes in with
> max_queue_time=1000ms, and then after 999ms elapse, a good request comes
> in with max_queue_time=10ms. You have a good request, and a bad request
> on the queue, but HAProxy is going to process the bad request first
> because its timer is expiring first.

Absolutely but you fell into the trap of not considering that the queue
is supposed to be full, so you're in fact comparing only two requests,
while in practice you have many of them and there the effect is much
more visible.

> Essentially if haproxy is receiving
> X good requests per second, and Y bad requests per second, it's still
> going to forward X good per second, and Y bad per second, to the backend
> server. The only difference is that they're time shifted.

Except once subject to the queue timeout. It's a very important aspect we
must not dismiss.

> The other thing I could think you mean by time-based is to insert into
> the queue sorting on MAX_QUEUE_TIME, just like a score-based queue, but
> you would still record TIME_NOW() + MAX_QUEUE_TIME, and would reject
> requests that don't get processed by their deadline. Essentially a
> per-request version of the `timeout queue` setting. But I don't see any
> real value in this.
> 
> Or do you mean something else?

What I mean is that the queue is indexed on the computed service time
which is (current_time + offset), where offset can be either positive or
negative, null by default. The queue is sorted by service time, just like
any timer in fact. Then the queue collection simply picks the next event
in the queue.

Please also note that right now the queue already considers the time to
maintain fairness between requests targetting a specific server and those
who just want any server. This is what ensures no request starves in either
the server's queue or the backend's queue. With a time-sorted queue, instead
of picking the tasks according to their time of arrival, we'd simply pick
them based on the service time, and we could maintain an excellent fairness
between servers and backends.

Regarding the DOS situations, there are certain aspects which slightly
differ from the points around the quality of service. DOS fighting involves
sacrifice, and basic quality of service hardly provides this by default. In
our case, sacrifice happens on a queue size or time. And the decision should
depend on the request's importance, which is often very different from the
desired response time. The well known example is the "Pay" button on many
sites : you click on it, and if it doesn't respond fast, you will not press
Escape. You're keeping fingers crossed hoping it will not return an error,
even if it takes 30 seconds. And moreover, you're happy once it responds!
Here that's what makes the difference with the response time : in fact you'd
like to say that certain requests are highly sensitive but not urgent, and
that you'd like to be able to increase their timeout and ultimately get
served.

To fight DOS it's the same. Commonly, many sites consider that requests
holding a valid cookie correspond to authenticated users and score much
better in terms of trustability. Thus by having adjustable timeouts, it
would make it very easy to adjust the queue timeout for a request based
on the presence of a cookie or not.

Now when you think about it, a reasonable but simple strategy for some of
the examples above could be summarized to this :

   timeout queue 10s
   # pay button not urgent but needs to work
   http-request set-timeout queue 60s if PAY_BUTTON
   http-request set-priority -10s if PAY_BUTTON

   # unauthenticated less urgent, can be a DOS
   http-request set-priority -1s  if NO_COOKIE || FAVICON

   # some elements that need to be deliveered quickly
   http-request set-priority  1s  if INDEX_HTML || CSS || JS || ICONS

   # auto-completion needs to be fast but no problem if it doesn't work
   http-request set-timeout queue 1s if AUTO_COMPLETION_REQUEST
   http-request set-priority 10s  if AUTO_COMPLETION_REQUEST

   # inconsistent user-agents are suspicious, delay them just in case
   http-request set-priority -20s if SUSPICIOUS_USER_AGENT

   # too fast users need to slow down a little bit
   http-request set-priority -20s if { sc0_rate gt 10 } || { sc0_err_cnt gt 10 }

With a proper API we can even imagine having access to these request
properties from Lua to implement far more advanced policies.

Regards,
Willy

Re: Priority based queuing

Reply via email to