Hi Yann,

I do not see any documentation regarding the new configurable flag
ListenCoresBucketsRatio (maybe I missed it), and users may not be familiar
with it, so I still think it may be better to default it to 8, at least in
trunk.
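For anyone who wants to try it on current trunk, opting in should just be a
matter of setting the directive explicitly (a minimal sketch; 8 is the ratio
our original patch hard coded):

    # httpd.conf on trunk: opt in to listener buckets / SO_REUSEPORT support
    ListenCoresBucketsRatio 8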

Regarding how to make small systems take advantage of this patch, I actually
did some testing on systems with fewer cores. The data show that when a
system has fewer than 16 cores, more than 1 bucket does not bring any
throughput or response time benefit. The patch is mainly meant for big
systems, to resolve the scalability issue there. That is why we previously
hard coded the ratio to 8 (it only has an impact on systems with 16 cores or
more).
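To make the sizing concrete, as I understand the trunk logic the bucket count
is simply the number of online cores divided by the ratio (illustrative
figures, not measured data):

    ratio = 8:   8 cores -> 1 bucket (no change from today)
                16 cores -> 2 buckets
                64 cores -> 8 buckets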

The accept_mutex is not much of a bottleneck anymore with the current patch
implementation. The current implementation already cuts the one big mutex
into multiple smaller mutexes in the multiple listen statements case (each
bucket has its own dedicated accept_mutex). To prove this, our data show
performance parity between 1 listen statement (Listen 80, no accept_mutex)
and 2 listen statements (Listen 192.168.1.1:80 and Listen 192.168.1.2:80,
with accept_mutex) on the current trunk version. Compared against trunk
without the SO_REUSEPORT patch, we see a 28% performance gain in the 1 listen
statement case and a 69% gain in the 2 listen statements case.
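For reference, the two configurations compared above boil down to the
following (the addresses are just the example ones used in the prose):

    # Case 1: single listener, accept_mutex disabled,
    # one duplicated listen socket per bucket via SO_REUSEPORT
    Listen 80

    # Case 2: two listeners, accept_mutex enabled,
    # but each bucket gets its own dedicated mutex
    Listen 192.168.1.1:80
    Listen 192.168.1.2:80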

Regarding the approach where each child has its own listen socket, I did
some testing with the current trunk version, increasing the number of buckets
to equal a reasonable ServerLimit (this avoids changing the number of child
processes). I also verified that MaxRequestWorkers and ThreadsPerChild were
set properly. I used a single listen statement so that the accept_mutex was
disabled. Compared against the current approach, this has ~25% less
throughput with significantly higher response time (a rough sketch of the
setup follows).
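Roughly, the test setup looked like the sketch below; the numbers are
illustrative rather than the exact values from our runs, assuming a 32-core
system:

    # Single listen statement so the accept_mutex stays disabled
    Listen 80
    # Choose the ratio so that cores / ratio == ServerLimit,
    # i.e. one bucket (and one listen socket) per child process
    ListenCoresBucketsRatio 2     # 32 cores / 2 = 16 buckets
    ServerLimit             16
    ThreadsPerChild         25
    MaxRequestWorkers       400   # ServerLimit * ThreadsPerChild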

In addition to this, implementing a separate listen socket for each child
performs worse and also shows connection loss/timeout issues with the current
Linux kernel. Below is more information/data we collected with the "each
child process has its own listen socket" approach:
1. During the run, we noticed tons of “read timed out” errors. These errors
not only happen when the system is highly utilized; they even happen when the
system is only 10% utilized. The response time was high.
2. Compared to the current trunk implementation, we found the "each child has
its own listen socket" approach results in significantly higher (up to 10X)
response time at different CPU utilization levels. At the peak performance
level, it has 20+% less throughput with tons of “connection reset” errors in
addition to the “read timed out” errors. The current trunk implementation
does not have these errors.
3. During graceful restarts, there are tons of connection losses. 

Based on the above findings, I think we may want to keep the current approach. 
It is a clean, working and better performing one :-)

Thanks,
Yingqi


-----Original Message-----
From: Yann Ylavic [mailto:ylavic....@gmail.com]
Sent: Thursday, November 06, 2014 4:59 AM
To: httpd
Subject: Re: Listeners buckets and duplication w/ and w/o SO_REUSEPORT on trunk

Rebasing discussion here since this thread seems to be referenced in PR55897, 
and the discussion has somehow been forked and continued in [1].

[1]. 
http://mail-archives.apache.org/mod_mbox/httpd-dev/201410.mbox/%3c9acd5b67aac5594cb6268234cf29cf9aa37e9...@orsmsx113.amr.corp.intel.com%3E

On Sat, Oct 11, 2014 at 1:55 AM, Lu, Yingqi <yingqi...@intel.com> wrote:
> Attached patch is generated based on current trunk. It covers for 
> prefork/worker/event/eventopt MPM.

The patch (modified) has now been applied to trunk with r1635521.

On Thu, Oct 30, 2014 at 5:10 PM, Lu, Yingqi <yingqi...@intel.com> wrote:
> As this is getting better, I am wondering if you guys have plan to put this 
> SO_REUSEPORT patch into the stable version.
> If yes, do you have a rough timeline?

The whole feature could certainly be proposed for 2.4.x since there is no 
(MAJOR) API change.

On Thu, Nov 6, 2014 at 6:52 AM, Lu, Yingqi <yingqi...@intel.com> wrote:
> I just took some testing on the most recent trunk version.
> I found out that num_buckets is default to 1 (ListenCoresBucketsRatio is 
> default to 0).
> Adding ListenCoresBucketsRatio is great since user can have control over this.
> However, I am thinking it may be better to make this default at 8. 
> This will make the SO_REUSEPORT support to be default enabled (8 buckets).
(8 buckets with 64 CPU cores, lucky you...).

Yes, this change wrt your original patch is documented in the commit message,
including how to change it to an opt-out.
I chose the opt-in way because I almost always find it safer, especially for 
backports to stable.
I have no strong opinion on this regarding trunk, though; it could (easily)
be an opt-out there.

Let's see what others say on this and the backport to 2.4.x.
Anyone?

> In case users are not aware of this new ListenCoresBucketsRatio 
> configurable flag, they can still enjoy the performance benefits.

Users with 64 cores available should always care about performance tuning ;)

Btw, I wonder if there are other ways to take advantage of the listeners 
buckets (even with fewer cores).
The other advantage of SO_REUSEPORT is that, provided that each child has its 
own listeners bucket, we can avoid the accept mutex lock (which also seemed
to be a bottleneck if I recall your original proposal's discussion correctly).

Did you do any testing without the accept mutex and with a number of buckets
equal to some reasonable ServerLimit (and then play with ThreadsPerChild to
reach the MaxClients/MaxRequestWorkers goal)?

Regards,
Yann.
