Hello,

We’ve been trying to upgrade our current LB configuration from nginx + haproxy 1.4 to haproxy 1.5. As you might guess, we currently use nginx for SSL termination and would like to drop it and let haproxy handle that instead. We’re seeing some odd behavior: we monitor the number of ports in each TCP state (TIME_WAIT, ESTABLISHED, etc.), and the graph looks very different for the nginx + haproxy setup compared with the haproxy-only setup.

The attached graph shows the number of TIME_WAIT and ESTABLISHED ports for the old configuration versus the new one (the bump is when we re-routed part of the traffic to the new LB). As you can see, with nginx + haproxy the number of TIME_WAIT ports is around 65k (which makes sense) and the number of ESTABLISHED ports is below 1k, but with haproxy 1.5 the numbers look completely different: TIME_WAIT and ESTABLISHED are at almost the same level, ~7k. We gather these metrics every minute.
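
For reference, a minimal sketch of how per-state counts like these could be gathered on Linux by reading /proc/net/tcp. This is an assumption for illustration only and not necessarily how the graph was produced; `count_tcp_states` and `SAMPLE` are hypothetical names, and the sample rows are made up.

```python
from collections import Counter

# TCP state codes as they appear (in hex) in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    # The first line is the column header, so skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        counts[TCP_STATES.get(fields[3].upper(), "UNKNOWN")] += 1
    return counts


# Made-up sample in the /proc/net/tcp layout (trailing columns omitted).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue
   0: 0100007F:1F90 00000000:0000 0A 00000000:00000000
   1: 0100007F:1F90 0100007F:B12C 01 00000000:00000000
   2: 0100007F:1F90 0100007F:B12D 06 00000000:00000000
   3: 0100007F:1F90 0100007F:B12E 06 00000000:00000000
"""

if __name__ == "__main__":
    print(dict(count_tcp_states(SAMPLE)))
```

On a real host the input would be the contents of /proc/net/tcp (and /proc/net/tcp6 for IPv6 sockets), sampled once a minute.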


The HAProxy 1.5 configuration file is attached to this email. Briefly, we use a listen section (in tcp mode) that terminates SSL traffic and then uses send-proxy/accept-proxy to pass it to a frontend, which in turn sends requests to a backend. By comparison, the haproxy 1.4 configuration has no “listen” section, only frontend -> backend.

Initially I thought the problem was that we had only one haproxy process, which spent so much time on SSL decryption/encryption that it couldn’t process requests fast enough (it hit 100% utilization straight away), but increasing the number of processes (nbproc) doesn’t seem to help.

Attachment: haproxy_1.5.cfg
Description: Binary data


So, that being said, I have a question:
  • Why is the number of established connections so different: less than 1k (on nginx + haproxy) vs ~7k (on haproxy 1.5)? Does haproxy handle ports differently? I guess haproxy might use some Linux socket tricks to avoid creating masses of TIME_WAIT sockets, or re-use them with its own algorithm rather than relying on the system to allocate ports. (By the way, we have net.ipv4.tcp_tw_reuse enabled.)

Another problem: a lot of connections are being dropped with the “cD” state in tcp mode after the "client timeout" has been reached. I wonder whether combining tcp mode (on listen) with http mode (on frontend/backend) via send-proxy/accept-proxy is not the best choice here? And does haproxy close the connection gracefully on “client timeout”, or does it just drop it?

Jul 24 06:03:20 lb001 haproxy[22533]: 100.1.102.223:45356 [24/Jul/2014:06:03:20.389] http-in app-tier/api011 68/0/0/21/89 200 456 - - ---- 19001/19001/159/6/0 0/0 "POST /2/devices/bluetooth-support-level.json HTTP/1.1"
Jul 24 06:03:20 lb001 haproxy[22533]: 100.1.102.223:45356 [24/Jul/2014:06:03:20.479] http-in app-tier/api016 87/0/0/32/119 200 511 - - ---- 18988/18988/164/8/0 0/0 "GET /2/user/-/devices.json HTTP/1.1"
Jul 24 06:03:20 lb001 haproxy[22533]: 100.1.102.223:45356 [24/Jul/2014:06:03:20.599] http-in app-tier/api012 87/0/1/31/119 200 679 - - ---- 19008/19008/154/7/0 0/0 "GET /2/user/-/devices/t.json HTTP/1.1"
Jul 24 06:03:20 lb001 haproxy[22533]: 100.1.102.223:45356 [24/Jul/2014:06:03:20.718] http-in app-tier/api016 84/0/0/33/117 200 772 - - ---- 19029/19029/158/9/0 0/0 "GET /2/user/-/devices/t/options.json HTTP/1.1"
Jul 24 06:03:21 lb001 haproxy[22533]: 100.1.102.223:45356 [24/Jul/2014:06:03:20.836] http-in app-tier/api021 153/0/1/25/179 200 448 - - ---- 19016/19016/198/7/0 0/0 "GET /2/user/-/devices/t/10134293/alarms.json HTTP/1.1"
Jul 24 06:03:21 lb001 haproxy[22533]: 100.1.102.223:45356 [24/Jul/2014:06:03:21.014] http-in app-tier/api025 93/0/0/37/130 200 4295 - - ---- 19036/19036/169/9/0 0/0 "GET /2/user/-/activities/s/date/2014-07-24/2014-07-24.json HTTP/1.1"
Jul 24 06:04:11 lb001 haproxy[22541]: 100.1.102.223:45356 [24/Jul/2014:06:03:20.280] https-in~ https-in/http-in 109/0/50866 7335 cD 1376/1376/1303/1303/0 0/0
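
As a reading aid for the last (tcp-mode) log line above, here is a minimal Python sketch that splits such a line into its timers and termination flags. The field order (Tw/Tc/Tt, bytes read, termination state) and the flag meanings follow the TCP log format described in the HAProxy documentation; the regex, the deliberately partial lookup tables, and the names `parse_tcp_log` and `LINE` are illustrative assumptions, not part of haproxy.

```python
import re
from datetime import datetime, timedelta

# Partial tables only: the HAProxy documentation lists more flags than these.
FIRST_FLAG = {
    "c": "client-side timeout expired",
    "s": "server-side timeout expired",
    "C": "session aborted by the client",
    "S": "session aborted by the server",
    "-": "normal completion",
}
SECOND_FLAG = {
    "D": "session was in the DATA phase",
    "C": "waiting for the connection to the server",
    "-": "normal completion",
}

# Matches the part of a tcp-mode log line that follows "haproxy[pid]: ".
TCP_LOG = re.compile(
    r"(?P<client>\S+) \[(?P<accept>[^\]]+)\] (?P<frontend>\S+) "
    r"(?P<backend>[^/\s]+)/(?P<server>\S+) "
    r"(?P<tw>-?\d+)/(?P<tc>-?\d+)/(?P<tt>\+?\d+) "
    r"(?P<bytes>\+?\d+) (?P<term>\S\S) "
)

LINE = (
    "Jul 24 06:04:11 lb001 haproxy[22541]: 100.1.102.223:45356 "
    "[24/Jul/2014:06:03:20.280] https-in~ https-in/http-in "
    "109/0/50866 7335 cD 1376/1376/1303/1303/0 0/0"
)


def parse_tcp_log(line):
    """Return timers, termination flags and the derived session end time."""
    m = TCP_LOG.search(line)
    if m is None:
        raise ValueError("not a tcp-mode haproxy log line")
    accept = datetime.strptime(m["accept"], "%d/%b/%Y:%H:%M:%S.%f")
    tt_ms = int(m["tt"])  # total session duration in milliseconds
    term = m["term"]
    return {
        "client": m["client"],
        "tt_ms": tt_ms,
        "bytes_read": int(m["bytes"]),
        "termination": term,
        "cause": FIRST_FLAG.get(term[0], "unknown"),
        "phase": SECOND_FLAG.get(term[1], "unknown"),
        # Session end = accept date + total session time.
        "end": accept + timedelta(milliseconds=tt_ms),
    }


if __name__ == "__main__":
    for key, value in parse_tcp_log(LINE).items():
        print(key, "=", value)
```

For the line above this gives Tt = 50866 ms and an end time of 06:04:11.146, which agrees with the syslog timestamp; the `c` flag indicates that the client-side timeout expired and `D` that the session was in the data phase.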

Thank you for your input! Please include me in Cc, since I’m not subscribed to this mailing list.
