Hi Willy,

I agree the haproxy logs show that, but we also monitor the time spent processing each request, which takes into account GC, reading data off the FS, and a number of other things inside the app, and I see no 3-second times in there or anything near it. I also have no 3-second outliers in the output from my test, so it seems a little weird that haproxy reports 3 seconds.

Also, I have the connection limits set really high to prevent queueing for now; we usually only have around 1000-2000 connections open.

uname -a

Linux 2.6.18-194.17.1.el5 #1 SMP Wed Sep 29 12:50:31 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

haproxy -vv

HA-Proxy version 1.4.15 2011/04/08
Copyright 2000-2010 Willy Tarreau <w...@1wt.eu>

Build options :
  TARGET  = linux26
  CPU     = generic
  CC      = gcc
  CFLAGS  = -m64 -march=x86-64 -O2 -g -fno-strict-aliasing
  OPTIONS = USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200

Encrypted password support via crypt(3): yes

Available polling systems :
     sepoll : pref=400,  test result OK
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 4 (4 usable), will use sepoll.

My config

global
    maxconn 200000
    stats socket /var/run/haproxy.stat mode 600
    pidfile /var/run/haproxy.pid
    daemon

defaults
    mode http
    timeout client 8000
    timeout server 5000
    timeout connect 5000
    timeout queue 5000
    option http-server-close
    option forwardfor
    option tcp-smart-connect
    option tcp-smart-accept
    balance roundrobin

frontend recs-west-cost
    bind XXXX
    maxconn 100000
    default_backend runtimes

frontend recs-central
    bind XXX
    maxconn 100000
    default_backend runtimes

frontend ssl-recs
    bind 127.0.0.2:8443
    maxconn 100000
    default_backend ssl-runtimes

backend runtimes
    mode http
    http-check disable-on-404
    http-check expect status 200
    option httpchk /rrserver/healthcheck
    server sf-rt-101 10.108.0.101:8101 weight 1 check
    server sf-rt-102 10.108.0.102:8101 weight 1 check
    server sf-rt-103 10.108.0.103:8101 weight 1 check
    server sf-rt-104 10.108.0.104:8101 weight 1 check
    server sf-rt-105 10.108.0.105:8101 weight 1 check
    server sf-rt-106 10.108.0.106:8101 weight 1 check
    server sf-rt-107 10.108.0.107:8101 weight 1 check
    server sf-rt-108 10.108.0.108:8101 weight 1 check
    server sf-rt-109 10.108.0.109:8101 weight 1 check
    server sf-rt-110 10.108.0.110:8101 weight 1 check
    server sf-rt-111 10.108.0.111:8101 weight 1 check
    server sf-rt-141 10.108.0.141:8101 weight 1 check
    server sf-rt-142 10.108.0.142:8101 weight 1 check

backend ssl-runtimes
    mode http
    http-check disable-on-404
    http-check expect status 200
    option httpchk /rrserver/healthcheck
    server sf-rt-101 10.108.0.101:8151 weight 1 check
    server sf-rt-102 10.108.0.102:8151 weight 1 check
    server sf-rt-103 10.108.0.103:8151 weight 1 check
    server sf-rt-104 10.108.0.104:8151 weight 1 check
    server sf-rt-105 10.108.0.105:8151 weight 1 check
    server sf-rt-106 10.108.0.106:8151 weight 1 check
    server sf-rt-107 10.108.0.107:8151 weight 1 check
    server sf-rt-108 10.108.0.108:8151 weight 1 check
    server sf-rt-109 10.108.0.109:8151 weight 1 check
    server sf-rt-110 10.108.0.110:8151 weight 1 check
    server sf-rt-111 10.108.0.111:8151 weight 1 check
    server sf-rt-141 10.108.0.141:8151 weight 1 check
    server sf-rt-142 10.108.0.142:8151 weight 1 check

userlist UsersFor_HAProxyStatistics
    group admin users XXX
    user XXXXX
    user XXXXX

listen stats *:9000
    mode http
    stats enable
    option contstats
    stats uri /haproxy_stats
    stats show-node
    stats show-legends
    acl AuthOkay_ReadOnly http_auth(UsersFor_HAProxyStatistics)
    acl AuthOkay_Admin http_auth_group(UsersFor_HAProxyStatistics) admin
    stats http-request auth realm HAProxy-Statistics unless AuthOkay_ReadOnly
    stats admin if AuthOkay_Admin

Thanks,
Matt C.

On Thu, Jun 9, 2011 at 1:01 PM, Willy Tarreau <w...@1wt.eu> wrote:
> Hi Matt,
>
> On Thu, Jun 09, 2011 at 11:37:00AM -0700, Matt Christiansen wrote:
>> I turned on those two options and seemed to help a little.
>>
>> We don't have a 2.6.30+ kernel so I don't believe option
>> splice-response will work(?). Thats one of the things I'm going to try
>> next.
>
> Splicing is OK since 2.6.27.something. But it will not affect the
> time distribution at all.
>
>> I used halog to narrow down the sample, it was still a few 100 lines
>> so I picked three at random.
>>
>> Jun 1 14:19:59 localhost haproxy[3124]: 76.102.107.85:28023
>> [01/Jun/2011:14:19:48.502] recs runtimes/sf-102 8062/0/0/3123/+11185
>> 200 +814 - - ---- 1267/1267/18/14/0 0/0 {Apache-Coyote/1.1|3827|||}
>> Jun 1 14:19:09 localhost haproxy[3124]: 96.229.202.77:56011
>> [01/Jun/2011:14:19:00.861] recs runtimes/sf-103 4982/0/0/3956/+8938
>> 200 +426 - - ---- 1214/1212/39/39/0 0/0 {Apache-Coyote/1.1|622|||}
>> Jun 1 14:22:09 localhost haproxy[3124]: 108.68.28.81:59854
>> [01/Jun/2011:14:19:02.218] recs runtimes/sf-110 3731/0/0/3844/+7575
>> 200 +523 - - ---- 1214/1212/45/43/0 0/0 {Apache-Coyote/1.1|4856|||}
>>
>> If you need more I can attach the log, im removing the request url and
>> referrer just because it has client info it, I'll have to ask if thats
>> ok to post.
>
> These lines indicate that the server took between 3.1 and 3.8 seconds
> to start responding. The variation in the server's response time is
> higher than the variation you observe in your tests, so it's really
> not easy to guess what happens, because it could as well be the server
> taking more time with requests from haproxy than from nginx as it could
> be haproxy by itself.
>
> Could you please post your config ? Just replace any sensible information
> such as IP addresses, server names or user/passwords with XXXX. It's very
> possible we could spot something abnormal or suboptimal there but without
> a config it's hard to tell.
>
> One thing which can be responsible for this is the frontend's backlog
> parameter, which defaults to the maxconn value. If it's too low, the
> system might drop some incoming connection requests from time to time.
>
> Also, could you please report the output of "haproxy -vv", which indicates
> the build options and supported pollers ? After that we could suggest a few
> parameters to test.
>
> Regards,
> Willy
>
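
For anyone following the quoted log lines, here is a rough Python sketch of how the timer field can be pulled apart. It assumes the standard HAProxy 1.4 HTTP log layout, where the first group of five slash-separated integers is Tq/Tw/Tc/Tr/Tt in milliseconds (client request time, queue time, connect time, server response time, total time), and that a leading "+" on Tt means the line was logged before the transfer finished, so the total is only a lower bound. The function name parse_timers and the SAMPLE constant are made up for illustration; this is not part of haproxy or halog.

```python
# Sketch only: split the timer field of an HAProxy HTTP log line.
# Assumes the first whitespace-separated token made of exactly five
# slash-separated integers is Tq/Tw/Tc/Tr/Tt (the later 5-part token
# holds connection counts, so the first match is the one we want).

TIMER_NAMES = ("Tq", "Tw", "Tc", "Tr", "Tt")

# One of the lines quoted above, joined back onto a single line.
SAMPLE = (
    "Jun 1 14:19:59 localhost haproxy[3124]: 76.102.107.85:28023 "
    "[01/Jun/2011:14:19:48.502] recs runtimes/sf-102 8062/0/0/3123/+11185 "
    "200 +814 - - ---- 1267/1267/18/14/0 0/0 {Apache-Coyote/1.1|3827|||}"
)


def parse_timers(line):
    """Return a dict of timers in ms, or None if no timer field is found."""
    for token in line.split():
        parts = token.split("/")
        if len(parts) != 5:
            continue
        try:
            values = [int(p) for p in parts]  # int() accepts "+11185" and "-1"
        except ValueError:
            continue  # five parts, but not all integers
        timers = dict(zip(TIMER_NAMES, values))
        # A "+" prefix marks Tt as a lower bound rather than a final value.
        timers["Tt_is_lower_bound"] = parts[4].startswith("+")
        return timers
    return None


if __name__ == "__main__":
    t = parse_timers(SAMPLE)
    print("server response time (Tr): %d ms" % t["Tr"])
```

Run over the three sampled lines, the Tr values come out as 3123, 3956 and 3844 ms, which is the server response time being discussed in the quoted reply.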