Hi Willy,

Thank you for your detailed answer.

On Mon, Mar 19, 2018 at 07:28:16PM +0100, Willy Tarreau wrote:
> Threading was clearly released with an experimental status, just like
> H2, because we knew we'd be facing some post-release issues in these
> two areas that are hard to get 100% right at once. However I consider
> that the situation has got much better, and to confirm this, both of
> these are now enabled by default in HapTech's products. With this said,
> I expect that over time we'll continue to see a few bugs, but not more
> than what we're seeing in various areas. For example, we didn't get a
> single issue on haproxy.org since it was updated to the 1.8.1 or so,
> 3 months ago. So this is getting quite good.

OK, it was not clear to me that it was experimental, since it was quite
widely advertised in several blog posts, but I probably missed
something. Thanks for the clarification though.
Since 1.8.1 the only issues we had with nbthread were indeed related
to using it along with seamless reload, but I will give it a second try
with the latest patches you released in the 1.8 tree today.
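For reference, this is roughly the setup we are exercising (a minimal
sketch only; the paths, addresses and names below are examples, not our
real configuration):

    global
        nbthread 2
        # the "expose-fd listeners" option on the stats socket is what
        # enables the seamless reload: the old process passes its
        # listening FDs to the new one over this socket
        stats socket /var/run/haproxy.sock mode 600 level admin expose-fd listeners

    defaults
        mode http
        timeout connect 5s
        timeout client  30s
        timeout server  30s

    frontend fe_main
        bind :8080
        default_backend be_main

    backend be_main
        server srv1 127.0.0.1:8000 check

    # reload with -x pointing at the same socket, e.g.:
    # haproxy -f /etc/haproxy/haproxy.cfg -x /var/run/haproxy.sock -sf $(pidof haproxy)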

> I ran a stress test on this patch, with a single server running with
> "maxconn 1", with a frontend bound to two threads. I measure exactly
> 30000 conn/s with a single thread (keep in mind that there's a single
> connection at once), and 28500 with two threads. Thus the sync point
> takes on average an extra 1.75 microsecond, compared to the 35
> microseconds it takes on average to finish processing the request
> (connect, server processing, response, close).
> Also if you're running with nbproc > 1 instead, the maxconn setting is
> not really respected since it becomes per-process. When you run with
> 8 processes it doesn't mean much anymore, or you need to have small
> maxconn settings, implying that sometimes a process might queue some
> requests while there are available slots in other processes. Thus I'd
> argue that the threads here significantly improve the situation by
> allowing all connection slots to be used by all CPUs, which is a real
> improvement which should theoretically show you lower latencies.

Thanks for these details. We will run some tests on our side as well;
the commit message made me worry about the last percentile of
requests, which might sometimes show crazy numbers.
I now understand better that we are speaking about an extra 1.75
microseconds.
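If it helps us compare numbers, here is how I read the test setup you
describe (a rough sketch; the addresses and names are my assumptions,
not taken from your test):

    global
        nbthread 2            # the frontend's bind line runs on both threads

    defaults
        mode http
        timeout connect 5s
        timeout client  30s
        timeout server  30s

    frontend fe_bench
        bind :8080
        default_backend be_single

    backend be_single
        # a single server with maxconn 1: only one connection is handled
        # at a time, the rest wait in the queue, and with threads that
        # single slot is shared by both threads
        server srv1 192.168.0.10:80 maxconn 1

With "nbproc 8" instead, the same maxconn 1 becomes one slot per
process (so up to 8 concurrent connections in total), and a request
queued in one process cannot use a free slot owned by another one,
which is exactly the per-process limitation you mention.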

> Note that if this is of interest to you, it's trivial to make haproxy
> run in busy polling mode, and in this case the performance increases to
> 30900 conn/s, at the expense of eating all your CPU (which possibly you
> don't care about if the latency is your worst enemy). We can possibly
> even improve this to ensure that it's done only when there are existing
> sessions on a given thread. Let me know if this is something that could
> be of interest to you, as I think we could make this configurable and
> bypass the sync point in this case.

Making this configurable would definitely be of interest to us.
I will try to have a look at it as well.
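Just so we are thinking of the same thing: I would imagine it being
exposed as a simple global knob, something along these lines (purely a
sketch, and the keyword name is my assumption):

    global
        nbthread 2
        # hypothetical switch: keep the polling loops spinning instead
        # of sleeping in the poller, trading CPU usage for lower wakeup
        # latency on otherwise idle threads
        busy-polling

Ideally it would only kick in when a thread actually has active
sessions, as you suggest, so that idle instances do not burn CPU for
nothing.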

> No they're definitely not for 1.8 and still really touchy. We're
> progressively attacking locks wherever we can. Some further patches
> will refine the scheduler to make it more parallel, and even the code
> above will continue to change, see it as a first step in the right
> direction.

Understood.

> We noticed a nice performance boost on the last one with many cores
> (24 threads, something like +40% on connection rate), but we'll probably
> see even better once the rest is addressed.

Indeed, I remember we spoke about those improvements at the last meetup.
Nice work, 1.9 already looks interesting from this point of view!


Cheers,

-- 
William
