Hi Brian,

On Tue, Oct 22, 2019 at 04:19:58PM +0200, Willy Tarreau wrote:
> At the moment I don't know what it requires to break it down per thread,
> so I'll add a github issue referencing your report so that we don't forget.
> Depending on the complexity, it may make sense to backport it once done,
> because it's still a regression.

So this morning I spent a little time on it and it turned out to be really
trivial to fix. I've now split the LRU cache per thread (just as it is when
using nbproc, in fact) so that we can get rid of the lock. As a result, I
jump from 67kH/s to 369kH/s on a test with 7 threads on 4 cores :-)

It could be backported as far as 1.8, so I did, as I really consider this
a regression and it can be a showstopper for some users migrating to
threads. The gains are smaller in 1.8 since scalability is much less
linear there. I'm going to work on issuing some releases today or over
this week; I'd encourage you to migrate to 2.0.8 once it's out.

Thanks again for your detailed report!
Willy