Hi Jim, Thanks very much for your email! I will look into both of them and send an update tonight!
Thanks,
Yingqi

> On May 31, 2014, at 9:43 AM, "Jim Jagielski" <j...@jagunet.com> wrote:
>
> I also see:
>
> /* kill off the idle ones */
> - ap_mpm_pod_killpg(pod, retained->max_daemons_limit);
> + for (i = 0; i < num_buckets; i++) {
> +     ap_mpm_pod_killpg(pod[i], i, retained->max_daemons_limit);
> + }
>
> Is that right? Why isn't it: ap_mpm_pod_killpg(pod, retained->max_daemons_limit, i); ??
>
> /**
>  * Write data to the pipe-of-death, signalling that all child process
>  * should die.
>  * @param pod The pipe-of-death to write to.
>  * @param num The number of child processes to kill
> + * @param my_bucket the bucket that holds the dying child process.
>  */
> -AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num);
> +AP_DECLARE(void) ap_mpm_pod_killpg(ap_pod_t *pod, int num, int child_bucket);
>
> Isn't 'num' the same in both implementations??
>
>> On May 31, 2014, at 12:03 PM, Jim Jagielski <j...@jagunet.com> wrote:
>>
>> Sorry I didn't catch this earlier:
>>
>> I see
>>
>> +++ httpd-trunk.new/include/mpm_common.h 2014-05-16 13:07:03.892987491 -0400
>> @@ -267,16 +267,18 @@
>>  * Write data to the pipe-of-death, signalling that one child process
>>  * should die.
>>  * @param pod the pipe-of-death to write to.
>> + * @param my_bucket the bucket that holds the dying child process.
>>  */
>> -AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod);
>> +AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod, int child_bucket);
>>
>> We can't change the API at this point. We could add another function, e.g. ap_mpm_pod_signal_ex(), which takes the int param, but we can't modify ap_mpm_pod_signal() itself.
>>
>>> On May 30, 2014, at 11:15 AM, Lu, Yingqi <yingqi...@intel.com> wrote:
>>>
>>> Thank you very much!
>>>
>>> Thanks,
>>> Yingqi
>>>
>>> -----Original Message-----
>>> From: Jim Jagielski [mailto:j...@jagunet.com]
>>> Sent: Friday, May 30, 2014 7:07 AM
>>> To: dev@httpd.apache.org
>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
>>>
>>> Thx! Let me review. My plan is to fold into trunk this weekend.
>>>
>>>> On May 16, 2014, at 2:53 PM, Lu, Yingqi <yingqi...@intel.com> wrote:
>>>>
>>>> Hi Jim,
>>>>
>>>> Thanks very much for clarifying this with me. I added an #ifdef in the code to check _SC_NPROCESSORS_ONLN in the so_reuseport patch. The bucket patch does not use this parameter, so it remains the same.
>>>>
>>>> Attached are the two most recent patches. I have already updated bugzilla #55897 as well.
>>>>
>>>> Thanks,
>>>> Yingqi
>>>>
>>>> -----Original Message-----
>>>> From: Jim Jagielski [mailto:j...@jagunet.com]
>>>> Sent: Thursday, May 15, 2014 7:53 AM
>>>> To: dev@httpd.apache.org
>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
>>>>
>>>> I was thinking more about the sysconf(_SC_NPROCESSORS_ONLN) stuff... We could either check for that during config/build or protect it with an #ifdef in the code (and create some logging so the admin knows whether it was found or not).
>>>>
>>>>> On May 14, 2014, at 11:59 AM, Lu, Yingqi <yingqi...@intel.com> wrote:
>>>>>
>>>>> Hi Jim,
>>>>>
>>>>> Thanks very much for your email.
>>>>>
>>>>> In the SO_REUSEPORT patch, SO_REUSEPORT support is checked inside the listen.c file. If the feature is not supported on the OS (for example, Linux kernel < 3.9), it will fall back to the original behavior.
>>>>>
>>>>> In the bucket patch, there is no need to check the params. With a single listen statement, it is just the default behavior.
>>>>>
>>>>> Please let me know if this answers your question.
>>>>>
>>>>> Thanks,
>>>>> Yingqi
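To make the API-compatibility point above concrete (leaving ap_mpm_pod_signal() untouched and adding an _ex variant that carries the bucket index), here is a minimal sketch; the delegation body and the default bucket of 0 are illustrative assumptions, not the actual patch code:

    #include "ap_config.h"
    #include "httpd.h"
    #include "mpm_common.h"   /* ap_pod_t and the published ap_mpm_pod_signal() */

    /* Hypothetical extended entry point that also carries the child bucket index. */
    AP_DECLARE(apr_status_t) ap_mpm_pod_signal_ex(ap_pod_t *pod, int child_bucket);

    /* The published function keeps its existing signature and simply delegates,
     * assuming bucket 0 as the single-bucket default. */
    AP_DECLARE(apr_status_t) ap_mpm_pod_signal(ap_pod_t *pod)
    {
        return ap_mpm_pod_signal_ex(pod, 0);
    }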
>>>>> -----Original Message-----
>>>>> From: Jim Jagielski [mailto:j...@jagunet.com]
>>>>> Sent: Wednesday, May 14, 2014 6:57 AM
>>>>> To: dev@httpd.apache.org
>>>>> Subject: Re: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
>>>>>
>>>>> This is very cool!
>>>>>
>>>>> mod_status assumes that sysconf() exists, but do we need to do a config check on the params we use in these patches? We look OK on Linux, FreeBSD and OSX...
>>>>>
>>>>> I'm +1 on folding into trunk.
>>>>>
>>>>>> On May 13, 2014, at 7:55 PM, Lu, Yingqi <yingqi...@intel.com> wrote:
>>>>>>
>>>>>> Dear All,
>>>>>>
>>>>>> During the last couple of weeks, I spent some time extending the original two patches from the prefork MPM only to all three Linux MPMs (prefork, worker and event). Attached is the latest version of the two patches. The Bugzilla database has also been updated; the IDs for the two patches are #55897 and #56279. Please refer to the messages below for details on both patches.
>>>>>>
>>>>>> Quick test results on a modern dual-socket Intel platform (Linux kernel 3.13.9):
>>>>>>
>>>>>> SO_REUSEPORT patch (bugzilla #55897)
>>>>>> 1. Prefork MPM: 1 listen statement: 2.16X throughput improvement; 2 listen statements: 2.33X throughput improvement
>>>>>> 2. Worker MPM: 1 listen statement: 10% throughput improvement; 2 listen statements: 35% throughput improvement
>>>>>> 3. Event MPM: 1 listen statement: 13% throughput improvement; 2 listen statements: throughput parity, but 62% response time reduction (with the patch, response time is 38% of the original software)
>>>>>>
>>>>>> Bucket patch (bugzilla #56279, only impacts the multiple-listen-statement case)
>>>>>> 1. Prefork MPM: 2 listen statements: 42% throughput improvement
>>>>>> 2. Worker MPM: 2 listen statements: 7% throughput improvement
>>>>>>
>>>>>> In all the above test cases, significant response time reductions were observed alongside the throughput improvements.
>>>>>>
>>>>>> Please let me know your feedback and comments.
>>>>>>
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>>
>>>>>> Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
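On the sysconf() point Jim raises above (guarding _SC_NPROCESSORS_ONLN with an #ifdef and logging whether it was found), a small sketch of what such a guard can look like; the helper name and the fallback value of 1 are assumptions for illustration, not the patch's actual code:

    #include <unistd.h>
    #include "httpd.h"
    #include "http_log.h"

    /* Hypothetical helper: report the number of online CPUs when the platform
     * exposes _SC_NPROCESSORS_ONLN, otherwise fall back to 1 and say so. */
    static int detect_online_cpus(server_rec *s)
    {
        int num_cpus = 1;  /* conservative fallback */
    #ifdef _SC_NPROCESSORS_ONLN
        long n = sysconf(_SC_NPROCESSORS_ONLN);
        if (n > 0)
            num_cpus = (int)n;
        ap_log_error(APLOG_MARK, APLOG_INFO, 0, s,
                     "detected %d online CPU(s) via _SC_NPROCESSORS_ONLN", num_cpus);
    #else
        ap_log_error(APLOG_MARK, APLOG_INFO, 0, s,
                     "_SC_NPROCESSORS_ONLN not available; assuming %d CPU", num_cpus);
    #endif
        return num_cpus;
    }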
>>>>>> From: Lu, Yingqi [mailto:yingqi...@intel.com]
>>>>>> Sent: Monday, March 17, 2014 1:41 PM
>>>>>> To: dev@httpd.apache.org
>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
>>>>>>
>>>>>> Dear all,
>>>>>>
>>>>>> Based on the feedback we received, we modified this patch. Here is the most recent version. We also modified the Bugzilla database (Bugzilla# 55897 for the SO_REUSEPORT patch; Bugzilla# 56279 for the bucket patch).
>>>>>>
>>>>>> Below are the changes we made in this new version:
>>>>>>
>>>>>> Following Yann Ylavic's and other people's comments, we split the original patch into two separate patches: one with SO_REUSEPORT and one without. The SO_REUSEPORT patch does not change the original listen sockets; it just duplicates the original one into multiple copies. Since the listen sockets are identical, there is no need to change the idle_server_maintenance function. The bucket patch (without SO_REUSEPORT), on the other hand, breaks the original listen record (if there are multiple listen sockets) into multiple listen-record linked lists. In this case, idle_server_maintenance is implemented at the bucket level to handle the situation where traffic is imbalanced across the different listen sockets/child buckets. In the bucket patch, the polling in the child process is removed, since each child only listens on one socket.
>>>>>>
>>>>>> Following Arkadiusz Miskiewicz's comment, we now detect SO_REUSEPORT at run time.
>>>>>>
>>>>>> Following Jeff Trawick's comments:
>>>>>> 1. We generated the patches against httpd trunk.
>>>>>> 2. We tested the current patches and they do not impact the event and worker MPMs. If the current patches are accepted, we would be happy to extend them to the other Linux-based MPMs. There are not many code changes, but it takes some time to set up the workload for testing.
>>>>>> 3. We removed unnecessary comments and changed APLOGNO(). We also changed some of the parameter/variable/function names to better represent their meanings.
>>>>>> 4. There should be no built-in limitations in the SO_REUSEPORT patch. For the bucket patch, the only limitation is that the number of child buckets only scales up to MAX_SPAWN_RATE. If more than 32 (the current default MAX_SPAWN_RATE) listen statements are specified in httpd.conf, the number of buckets is capped at 32. The reason is that we implement idle_server_maintenance at the bucket level, and each bucket's own max spawn rate is set to MAX_SPAWN_RATE/num_buckets.
>>>>>>
>>>>>> Again, thanks very much for all the comments and feedback. Please let us know if there are more changes we need to make before the patches can be accepted.
>>>>>>
>>>>>> Thanks,
>>>>>> Yingqi Lu
>>>>>>
>>>>>> From: Lu, Yingqi
>>>>>> Sent: Tuesday, March 04, 2014 10:43 AM
>>>>>> To: dev@httpd.apache.org
>>>>>> Subject: RE: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
>>>>>>
>>>>>> Hi Jeff,
>>>>>>
>>>>>> Thanks very much for your time reviewing the patch! We will modify the patch according to your comments and repost it here.
>>>>>>
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>>
>>>>>> From: Jeff Trawick [mailto:traw...@gmail.com]
>>>>>> Sent: Tuesday, March 04, 2014 10:08 AM
>>>>>> To: Apache HTTP Server Development List
>>>>>> Subject: Re: FW: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
>>>>>>
>>>>>> On Tue, Mar 4, 2014 at 10:35 AM, Lu, Yingqi <yingqi...@intel.com> wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> I just want to ping again on this patch to gather your feedback and comments. Please refer to the messages below for patch details.
>>>>>>
>>>>>> If you need any additional information/supporting data, please let us know as well.
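As an illustration of the run-time detection mentioned in the message above: one common way to probe is to try setting SO_REUSEPORT on a scratch socket and fall back when the kernel refuses (e.g. Linux older than 3.9). This is only a sketch under that assumption, with a made-up helper name, not the code from the submitted patches:

    #include <sys/socket.h>
    #include "apr_network_io.h"
    #include "apr_portable.h"

    /* Hypothetical probe: returns 1 if the running kernel accepts SO_REUSEPORT,
     * 0 otherwise (older kernels, or the symbol missing at build time). */
    static int so_reuseport_supported(apr_pool_t *p)
    {
        int supported = 0;
    #ifdef SO_REUSEPORT
        apr_socket_t *probe = NULL;
        if (apr_socket_create(&probe, APR_INET, SOCK_STREAM, APR_PROTO_TCP, p)
                == APR_SUCCESS) {
            int one = 1;
            apr_os_sock_t fd;
            /* Rather than relying on an APR option name, this sketch sets the
             * option on the underlying descriptor directly. */
            if (apr_os_sock_get(&fd, probe) == APR_SUCCESS
                && setsockopt(fd, SOL_SOCKET, SO_REUSEPORT,
                              (const void *)&one, sizeof(one)) == 0) {
                supported = 1;
            }
            apr_socket_close(probe);
        }
    #else
        (void)p;  /* symbol not known at build time */
    #endif
        return supported;
    }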
>>>>>> Yeah, it has been on my todo list, but I don't have time to give an in-depth review at the moment. Here are a few questions/comments. (And you'll have to deal with the fact that it is unnecessarily tedious for me to evaluate higher-level considerations if there are a lot of distractions, such as the code comments below ;) But others are of course free to chime in.)
>>>>>>
>>>>>> The patch should be against httpd trunk. It probably won't take much time for you to create that patch and confirm basic operation.
>>>>>>
>>>>>> What is the impact on other MPMs, even if they shouldn't use or don't have the necessary code to use SO_REUSEPORT at this time?
>>>>>>
>>>>>> Have you tried the event MPM?
>>>>>>
>>>>>> Is there a way for the admin to choose this behavior? Most won't care, but everyone's behavior is changed AFAICT.
>>>>>>
>>>>>> Are there built-in limitations in this patch that we should be aware of? E.g., the free slot/spawn rate changes suggest to me that there can't be more than 1025 children???
>>>>>>
>>>>>> We should assume for now that there's no reason this couldn't be committed to trunk after review/rework, so make sure it is as close as you can get it to what you think is the final form.
>>>>>>
>>>>>> For the configure-time check for the 3.9 kernel: I think we'd also use AC_TRY_COMPILE at configure time to confirm that the SO_REUSEPORT definition is available, and not enable it if the system includes don't define it. (Does that cause a problem for any significant number of people?)
>>>>>>
>>>>>> Don't mention the patch in the patch ;) (e.g., "This function is added for the patch.")
>>>>>>
>>>>>> Incomplete comments on style/syntax issues:
>>>>>> * mixing declarations and statements (e.g., "duplr->next = 0; apr_socket_t *temps;") isn't supported by all compilers and is distracting when reviewing
>>>>>> * suitable identifier names (e.g., fix the global variable "flag" and whatever else isn't appropriate; "ap_post_config_listeners" should be renamed to indicate what it does)
>>>>>> * APLOGNO(99999) and comments about fixing it: instead put "APLOGNO()" and don't add reminders in comments
>>>>>> * this doesn't seem portable: "int free_slots[MAX_SPAWN_RATE/num_buckets];"
>>>>>> and so on
>>>>>>
>>>>>> Thanks,
>>>>>> Yingqi
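On Jeff's last style point above, the usual portable alternative to a C99 variable-length array is to size the array at run time from a pool. A minimal illustration of that idiom (the helper name is made up, and num_buckets is assumed to be at least 1; the 32 default matches the MAX_SPAWN_RATE value mentioned in the March 17 message above):

    #include "apr_pools.h"

    #ifndef MAX_SPAWN_RATE
    #define MAX_SPAWN_RATE 32   /* default cited elsewhere in this thread */
    #endif

    /* Portable replacement for "int free_slots[MAX_SPAWN_RATE/num_buckets];":
     * allocate the per-bucket slot array from a pool instead of the stack. */
    static int *alloc_free_slots(apr_pool_t *p, int num_buckets)
    {
        return apr_pcalloc(p, sizeof(int) * (MAX_SPAWN_RATE / num_buckets));
    }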
>>>>>> From: Lu, Yingqi
>>>>>> Sent: Friday, January 24, 2014 3:26 PM
>>>>>> To: dev@httpd.apache.org
>>>>>> Subject: [PATCH ASF bugzilla# 55897]prefork_mpm patch with SO_REUSEPORT support
>>>>>>
>>>>>> Dear All,
>>>>>>
>>>>>> Our analysis of the Apache httpd 2.4.7 prefork MPM, on 32- and 64-thread Intel Xeon 2600 series systems, using an open source three-tier social networking web server workload, revealed performance scaling issues. In the current software, a single listen statement (listen 80) provides better scalability due to un-serialized accept. However, when the system is under very high load, this can lead to a large number of child processes stuck in the D state. On the other hand, the serialized accept approach cannot scale under high load either. In our analysis, a 32-thread system with 2 listen statements specified could scale to just 70% utilization, and a 64-thread system with a single listen statement specified (listen 80, 4 network interfaces) could scale to only 60% utilization.
>>>>>>
>>>>>> Based on those findings, we created a prototype patch for the prefork MPM which improves performance and thread utilization. In Linux kernels newer than 3.9, SO_REUSEPORT is enabled. This feature allows multiple sockets to listen on the same IP:port and automatically round-robins connections among them. We use this feature to create multiple duplicated listener records of the original one and partition the child processes into buckets; each bucket listens on one IP:port. In the case of an older kernel which does not have SO_REUSEPORT enabled, we modified the "multiple listen statement" case by creating one listen record for each listen statement and partitioning the child processes into different buckets; each bucket listens on one IP:port.
>>>>>>
>>>>>> Quick tests of the patch, running the same workload, demonstrated a 22% throughput increase with a 32-thread system and 2 listen statements (Linux kernel 3.10.4). With an older kernel (Linux kernel 3.8.8, without SO_REUSEPORT), a 10% performance gain was measured. With a single listen statement (listen 80) configuration, we observed over a 2X performance improvement on modern dual-socket Intel platforms (Linux kernel 3.10.4). We also observed a large reduction in response time, in addition to the throughput improvement gained in our tests [1].
>>>>>>
>>>>>> Following the feedback on the bugzilla website where we originally submitted the patch, we removed the dependency on an APR change to simplify the patch testing process. Thanks to Jeff Trawick for his good suggestion! We are also actively working on extending the patch to the worker and event MPMs as a next step. Meanwhile, we would like to gather comments from all of you on the current prefork patch. Please take some time to test it and let us know how it works in your environment.
>>>>>>
>>>>>> This is our first patch to the Apache community. Please help us review it and let us know if there is anything we might revise to improve it. Your feedback is very much appreciated.
>>>>>>
>>>>>> Configuration:
>>>>>> <IfModule prefork.c>
>>>>>> ListenBacklog 105384
>>>>>> ServerLimit 105000
>>>>>> MaxClients 1024
>>>>>> MaxRequestsPerChild 0
>>>>>> StartServers 64
>>>>>> MinSpareServers 8
>>>>>> MaxSpareServers 16
>>>>>> </IfModule>
>>>>>>
>>>>>> [1] Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
>>>>>>
>>>>>> Thanks,
>>>>>> Yingqi
>>>>>>
>>>>>> --
>>>>>> Born in Roswell... married an alien...
>>>>>> http://emptyhammock.com/
>>>>>> http://edjective.org/
>>>>>>
>>>>>> --
>>>>>> Born in Roswell... married an alien...
>>>>>> http://emptyhammock.com/
>>>>>> http://edjective.org/
>>>>>>
>>>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
>>>> <httpd_trunk_so_reuseport.patch><httpd_trunk_bucket.patch>
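For background on the duplication mechanism the original posting describes (several listeners bound to the same IP:port so that each child bucket can accept independently), a bare-bones sketch at the plain-sockets level; the function name and IPv4-only setup are illustrative, and the real patches operate on httpd's listener records rather than raw sockets:

    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    /* Open one listening socket bound to the given port. Calling this once per
     * child bucket, with SO_REUSEPORT set before bind(), lets the kernel
     * round-robin incoming connections across the duplicated listeners.
     * Error handling is reduced to the bare minimum for brevity. */
    static int open_bucket_listener(unsigned short port)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        int one = 1;
        struct sockaddr_in addr;

        if (fd < 0)
            return -1;
    #ifdef SO_REUSEPORT
        setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));
    #endif
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(port);

        if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0
            || listen(fd, SOMAXCONN) < 0) {
            close(fd);
            return -1;
        }
        return fd;
    }

Since each bucket of child processes then owns a single listener, the per-child polling across multiple listen sockets can be dropped, as noted earlier in the thread for the bucket patch.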