Re: Segfault in liblua-5.3.so
Hi Willy,

Thanks for the response. Unfortunately we cannot reproduce this in testing, and we have disabled the reload-dependent feature in production. We will test more with the latest build and let you know.

On Sat, Feb 13, 2021 at 1:28 PM Willy Tarreau wrote:
> Hi Sachin,
>
> On Thu, Feb 11, 2021 at 03:11:09AM +0530, Sachin Shetty wrote:
> > Hi,
> >
> > We have a lua block that connects to memcache when a request arrives
> >
> > """
> > function get_from_gds(host, port, key)
> >     local sock = core.tcp()
> >     sock:settimeout(20)
> >     local result = DOMAIN_NOT_FOUND
> >     local status, error = sock:connect(host, port)
> >     if not status then
> >         core.Alert(GDS_LOG_PREFIX .. "GDS_ERROR: Error in connecting:" .. key .. ":" .. port .. ":" .. error)
> >         return GDS_ERROR, "Error: " .. error
> >     end
> >     sock:send(key .. "\r\n")
> >     while true do
> >         local s, status, partial = sock:receive("*l")
> >         if not s then
> >             core.Alert(GDS_LOG_PREFIX .. "GDS_ERROR: Error reading:" .. key .. ":" .. port .. ":" .. status)
> >             return GDS_ERROR, status
> >         end
> >         if s == "END" then break end
> >         result = s
> >     end
> >     sock:close()
> >     return result
> > end
> >
> > -- Comment: get_proxy calls get_from_gds
> >
> > core.register_action("get_proxy", { "http-req" }, get_proxy)
> > """
> >
> > The value is cached in a haproxy map so we don't make a memcache
> > connection for every request.
> >
> > At peak traffic, if we reload haproxy, that invalidates the map and the
> > surge causes quite a few memcache connections to fail. The error returned
> > is "Can't connect".
> >
> > We see the following messages in dmesg:
> >
> > [ +0.006924] haproxy[14258]: segfault at 0 ip 7f117fba94c4 sp
> > 7f1179eefe08 error 4 in liblua-5.3.so[7f117fba1000+37000]
> >
> > HA-Proxy version 2.0.18-be8b761 2020/09/30 - https://haproxy.org/
>
> Unfortunately, this is not enough to figure out the cause; you'll need
> to enable core dumps and pass one through gdb to obtain a more
> exploitable backtrace.
> Please take this opportunity to update, as I'm seeing 117 patches
> merged into 2.0 after your version, some of which affect Lua and
> others related to thread safety. One of them is even related to
> Lua+maps.
>
> Note, if that's not urgent on your side, we do have a few more fixes
> pending to be backported to 2.0 that will warrant yet another version.
> However, none of them seem related to your issue (but if you're willing
> to retest with the latest 2.0 snapshot, you're welcome of course).
>
> Willy
Segfault in liblua-5.3.so
Hi,

We have a lua block that connects to memcache when a request arrives:

"""
function get_from_gds(host, port, key)
    local sock = core.tcp()
    sock:settimeout(20)
    local result = DOMAIN_NOT_FOUND
    local status, error = sock:connect(host, port)
    if not status then
        core.Alert(GDS_LOG_PREFIX .. "GDS_ERROR: Error in connecting:" .. key .. ":" .. port .. ":" .. error)
        return GDS_ERROR, "Error: " .. error
    end
    sock:send(key .. "\r\n")
    while true do
        local s, status, partial = sock:receive("*l")
        if not s then
            core.Alert(GDS_LOG_PREFIX .. "GDS_ERROR: Error reading:" .. key .. ":" .. port .. ":" .. status)
            return GDS_ERROR, status
        end
        if s == "END" then break end
        result = s
    end
    sock:close()
    return result
end

-- Comment: get_proxy calls get_from_gds

core.register_action("get_proxy", { "http-req" }, get_proxy)
"""

The value is cached in a haproxy map so we don't make a memcache connection for every request.

At peak traffic, if we reload haproxy, that invalidates the map and the surge causes quite a few memcache connections to fail. Error returned is "Can't connect".

We see the following messages in dmesg:

[ +0.006924] haproxy[14258]: segfault at 0 ip 7f117fba94c4 sp 7f1179eefe08 error 4 in liblua-5.3.so[7f117fba1000+37000]

HA-Proxy version 2.0.18-be8b761 2020/09/30 - https://haproxy.org/

This is a recent issue; we never saw this in 1.8. Any idea? We only see this at peak load. At regular load we don't see this issue even when we reload haproxy.

Thanks
Sachin
Stats socket set map errors out if key does not exist
Hi,

As per the documentation (https://cbonte.github.io/haproxy-dconv/1.8/configuration.html#4.2-http-request), the stats socket "set map" should handle adding a new key as well as updating an existing key, but my tests show otherwise:

echo "set map /somemap.txt abcKey abcValue" | socat stdio /var/run/haproxy_socket
entry not found.

If the key is missing, "set map" fails. The stats socket documentation in the management section does not mention anything about this. Is there a way to do a single add-or-set ("setIfExists"-style) operation using the stats socket?

Thanks
Sachin
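For what it's worth, the management CLI also has an "add map" command for keys that don't exist yet, so an upsert can be emulated by trying "set map" and falling back to "add map" when the reply says the entry was not found. The sketch below is an illustration, not an official recipe: the socket path is an assumption from the post above, and note the two-command fallback is not atomic (another writer could add the key in between).

```python
import socket

SOCKET_PATH = "/var/run/haproxy_socket"  # assumption: taken from the example above

def _cli(cmd):
    """Send one command to the haproxy stats socket and return the raw reply."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.connect(SOCKET_PATH)
    s.sendall((cmd + "\n").encode())
    reply = s.recv(65536).decode()
    s.close()
    return reply

def fallback_cmd(map_file, key, value, set_reply):
    """Given the reply to 'set map', return the 'add map' fallback
    command when the key did not exist, else None."""
    if "entry not found" in set_reply.lower():
        return "add map %s %s %s" % (map_file, key, value)
    return None

def upsert(map_file, key, value):
    """Hypothetical upsert helper: set, then add if missing."""
    reply = _cli("set map %s %s %s" % (map_file, key, value))
    cmd = fallback_cmd(map_file, key, value, reply)
    if cmd:
        _cli(cmd)
```

There is no single atomic "set or add" command on the stats socket in 1.8, so this window between the two commands is unavoidable with this approach.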
Haproxy reload and maps
Hi,

We are using maps extensively in our architecture to map host headers to backends. The maps are seeded dynamically by a lua handler that calls an external service as requests arrive; there are no pre-seeded values in the map, and the physical map file is empty.

On haproxy reload at peak traffic, the maps are emptied, and I guess that is expected. But this causes a stampede to the external service, which causes some failures. Is there a way to prevent emptying of the map when we do a haproxy reload?

Thanks
Sachin
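One workaround (a sketch, not a built-in feature) is to dump the running map over the stats socket with "show map <file>" just before reloading and write the entries back to the physical map file, so the new process starts pre-seeded. This assumes the "show map" dump format of one "<ptr> <key> <value>" line per entry; the helper name is hypothetical.

```python
def map_dump_to_file_lines(show_map_output):
    """Convert 'show map <file>' CLI output (lines of '<ptr> <key> <value>')
    into 'key value' lines suitable for writing to the physical map file
    that the new haproxy process will preload on reload."""
    lines = []
    for line in show_map_output.splitlines():
        parts = line.split(None, 2)  # ptr, key, value
        if len(parts) == 3:
            lines.append(parts[1] + " " + parts[2])
    return lines
```

A reload wrapper would run the dump, write the file, then signal the reload; entries added after the dump are still lost, but the stampede window shrinks considerably.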
Re: Setting a unique header per server in a backend
Thank you Willy for the prompt response. We have a lot of servers, hundreds of them, but we are generating the configs using scripts, so this will work for us; it just makes the config long and complex. I will try it out.

Thanks
Sachin

On Wed, Jan 2, 2019 at 7:43 PM Willy Tarreau wrote:
> Hi Sachin,
>
> On Wed, Jan 02, 2019 at 07:33:03PM +0530, Sachin Shetty wrote:
> > Hi Willy,
> >
> > It seems the http-send-name-header directive is not sent with health-check
> > and I need it in the health-check as well :)
>
> Indeed it's not supported there because the health checks are independent
> of the traffic and could even be sent somewhere else. Also the request is
> forged per backend and the same request is sent to all servers in the farm.
>
> > is there a way to make it work with health-check as well?
>
> There is a solution; it's not pretty, and it depends on the number of
> servers you're dealing with in your farm. The solution consists in
> replacing health checks with trackers and manually configuring your
> health checks in separate backends, one per server. For example:
>
>    backend my_prod_backend
>        server s1 1.1.1.1:80 track chk_s1/srv
>        server s2 1.1.1.2:80 track chk_s2/srv
>        server s3 1.1.1.3:80 track chk_s3/srv
>
>    backend chk_s1
>        option httpchk GET /foo "HTTP/1.0\r\nHost: blah\r\nsrv: s1"
>        server srv 1.1.1.1:80 check
>
>    backend chk_s2
>        option httpchk GET /foo "HTTP/1.0\r\nHost: blah\r\nsrv: s2"
>        server srv 1.1.1.2:80 check
>
>    backend chk_s3
>        option httpchk GET /foo "HTTP/1.0\r\nHost: blah\r\nsrv: s3"
>        server srv 1.1.1.3:80 check
>
> As you can see, the check is performed by these chk_* backends and
> reflected in the prod backend thanks to the "track" directive. I know
> it's not pretty, but it provides a lot of flexibility, including the
> ability to have different checks per server.
>
> We definitely need to revamp the whole check subsystem to bring more
> flexibility...
>
> Cheers,
> Willy
Re: Setting a unique header per server in a backend
Hi Willy,

It seems the http-send-name-header directive is not sent with health-check and I need it in the health-check as well :)

Is there a way to make it work with health-check as well?

Thanks
Sachin

On Tue, Dec 18, 2018 at 5:18 PM Sachin Shetty wrote:
> Thank you Willy. http-send-name-header works for my use case.
>
> @Norman - Yes, we are looking at replacing the usage of X- headers.
>
> Thanks
> Sachin
>
> On Mon, Dec 17, 2018 at 2:18 AM Norman Branitsky <norman.branit...@micropact.com> wrote:
>
>> Don't forget the "X-" header prefix is deprecated:
>> https://tools.ietf.org/html/rfc6648
>>
>> Norman Branitsky
>>
>> On Dec 16, 2018, at 03:50, Willy Tarreau wrote:
>>
>> Hi Sachin,
>>
>> On Sat, Dec 15, 2018 at 10:32:21PM +0530, Sachin Shetty wrote:
>>> Hi,
>>>
>>> We have a tricky requirement to set a different header value in the request
>>> based on which server in a backend is picked.
>>>
>>> backend pod0
>>>     ...
>>>     server server1 server1:6180 check
>>>     server server2 server2:6180 check
>>>     server server3 server3:6180 check
>>>
>>> so when a request is forwarded to server1, I want to inject a header
>>> "X-Some-Header: Server1", "X-Some-Header: Server2" for server2, and so on.
>>
>> You have this with "http-send-name-header"; you need to pass it the
>> header field name and it will fill in the value with the server's name.
>> It will even support redispatch by rewinding the stream and rewriting
>> the value (which made it very tricky and infamous for quite some time).
>>
>>> Is it possible to register some lua action that would inject the header
>>> based on the server selected before the request is forwarded to the server.
>>
>> In fact, except for the directive above, it's not possible to perform
>> changes after the server has been selected, because the server is
>> selected when trying to connect, which happens after the contents are
>> being forwarded, thus you can't perform any processing anymore. There
>> is quite some ugly code to support http-send-name-header and it cannot
>> be generalized at all. Just to give you an idea, think that a hash-based
>> LB algo (balance uri, balance hdr) could decide to use some contents
>> you're about to modify... So the contents have to be fixed before the
>> server is chosen.
>>
>> Cheers,
>> Willy
Re: Setting a unique header per server in a backend
Thank you Willy. http-send-name-header works for my use case.

@Norman - Yes, we are looking at replacing the usage of X- headers.

Thanks
Sachin

On Mon, Dec 17, 2018 at 2:18 AM Norman Branitsky <norman.branit...@micropact.com> wrote:
> Don't forget the "X-" header prefix is deprecated:
> https://tools.ietf.org/html/rfc6648
>
> Norman Branitsky
>
> On Dec 16, 2018, at 03:50, Willy Tarreau wrote:
>
> Hi Sachin,
>
> On Sat, Dec 15, 2018 at 10:32:21PM +0530, Sachin Shetty wrote:
>> Hi,
>>
>> We have a tricky requirement to set a different header value in the request
>> based on which server in a backend is picked.
>>
>> backend pod0
>>     ...
>>     server server1 server1:6180 check
>>     server server2 server2:6180 check
>>     server server3 server3:6180 check
>>
>> so when a request is forwarded to server1, I want to inject a header
>> "X-Some-Header: Server1", "X-Some-Header: Server2" for server2, and so on.
>
> You have this with "http-send-name-header"; you need to pass it the
> header field name and it will fill in the value with the server's name.
> It will even support redispatch by rewinding the stream and rewriting
> the value (which made it very tricky and infamous for quite some time).
>
>> Is it possible to register some lua action that would inject the header
>> based on the server selected before the request is forwarded to the server.
>
> In fact, except for the directive above, it's not possible to perform
> changes after the server has been selected, because the server is
> selected when trying to connect, which happens after the contents are
> being forwarded, thus you can't perform any processing anymore. There
> is quite some ugly code to support http-send-name-header and it cannot
> be generalized at all. Just to give you an idea, think that a hash-based
> LB algo (balance uri, balance hdr) could decide to use some contents
> you're about to modify... So the contents have to be fixed before the
> server is chosen.
>
> Cheers,
> Willy
Setting a unique header per server in a backend
Hi,

We have a tricky requirement to set a different header value in the request based on which server in a backend is picked.

backend pod0
    ...
    server server1 server1:6180 check
    server server2 server2:6180 check
    server server3 server3:6180 check

So when a request is forwarded to server1, I want to inject a header "X-Some-Header: Server1", "X-Some-Header: Server2" for server2, and so on.

Is it possible to register some lua action that would inject the header based on the server selected, before the request is forwarded to the server?

Thanks
Sachin
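As the replies in this thread point out, the "http-send-name-header" directive covers this case without Lua: haproxy inserts a header whose value is the selected server's name. A minimal sketch on the example backend above (header name from the question; server names and ports as posted):

```
backend pod0
    # haproxy fills in the chosen server's name as the header value,
    # including on redispatch
    http-send-name-header X-Some-Header
    server server1 server1:6180 check
    server server2 server2:6180 check
    server server3 server3:6180 check
```

With this, a request dispatched to server1 carries "X-Some-Header: server1" (the server's configured name, not an arbitrary string).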
Re: Compression disabling on chunked response
Thanks Willy. Yes, I can understand no-transform disabling it; I just wanted to make sure that a chunked response no longer disables compression. That matches my tests as well, thanks for confirming it.

Thanks
Sachin

On Mon, Oct 8, 2018 at 11:00 PM Willy Tarreau wrote:
> Hi Sachin,
>
> On Fri, Oct 05, 2018 at 12:38:15PM +0530, Sachin Shetty wrote:
>> Hi,
>>
>> I see this in the documentation:
>>
>> Compression is disabled when:
>>   * ...
>>   * response header "Transfer-Encoding" contains "chunked" (Temporary Workaround)
>>
>> Is this still accurate?
>
> Ah no, it is not. I'm surprised to see it still present, but the problem
> with updating docs to reflect the removal of limitations is always the same!
>
>> I have tested a lot of responses from the server with compression enabled
>> in the backend and the server sending chunked responses; haproxy is
>> compressing the stream correctly.
>>
>> What am I missing? I am trying to figure out in what cases haproxy
>> could fail to compress a response from the server.
>
> It depends on a few factors like the presence of "Cache-Control: no-transform"
> in the response, or certain user-agents in the request that are known to
> be affected by bugs. You can take a look at these functions for an
> exhaustive list of exceptions:
>
>     select_compression_request_header()
>     select_compression_response_header()
>
> Hoping this helps,
> Willy
Compression disabling on chunked response
Hi,

I see this in the documentation:

Compression is disabled when:
  * ...
  * response header "Transfer-Encoding" contains "chunked" (Temporary Workaround)

Is this still accurate? I have tested a lot of responses from the server with compression enabled in the backend and the server sending chunked responses; haproxy is compressing the stream correctly.

What am I missing? I am trying to figure out in what cases haproxy could fail to compress a response from the server.
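For context, "compression enabled in the backend" usually means something like the following minimal sketch (backend name and MIME types are illustrative, not from the question):

```
backend apache_l1
    # enable gzip compression for the listed response MIME types
    compression algo gzip
    compression type text/html text/plain application/json
```

Whether a given response is actually compressed then depends on the runtime exceptions Willy lists (no-transform, buggy user-agents, etc.), not on chunked transfer encoding.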
Re: lua socket settimeout has no effect
Hi Cyril,

Thank you for the response. Please ignore the second timeout setting, I was testing different things. I have changed the lua code as you suggested - thanks for the hint there.

function get_from_gds(key)
    local sock = core.tcp()
    -- Connect timeout after patch
    sock:settimeout(3)
    local result = DOMAIN_NOT_FOUND
    local status, error = sock:connect(gds_host, gds_port)
    if not status then
        core.Alert("Error in connecting:" .. key .. ":" .. error)
        return "Error", "Error: " .. error
    end
    sock:send(key .. "\r\n")
    while true do
        local s, status, partial = sock:receive("*l")
        if not s then
            core.Alert("Error reading:" .. status)
            return "Error", status
        end
        if s == "END" then break end
        result = s
    end
    sock:close()
    core.Alert("Returning from GDS:" .. key .. ":" .. result)
    return result
end

but the receive still does not time out in 3 seconds.

On Mon, Aug 13, 2018 at 2:19 AM, Cyril Bonté wrote:
> On 12/08/2018 at 18:21, Sachin Shetty wrote:
>> Hi Cyril,
>>
>> I have created a very simple config to reproduce this. This config
>> always times out on read in 9 seconds.
>
> I think there are 3 issues.
>
> [...]
>> function get_from_gds(key)
>> [...]
>>     local sock = core.tcp()
>>     -- Connect timeout after patch
>>     sock:settimeout(3)
>>     local result = DOMAIN_NOT_FOUND
>>     local status, error = sock:connect(gds_host, gds_port)
>>     if not status then
>>         core.Alert("Error in connecting:" .. key .. ":" .. error)
>>         return "Error", "Error: " .. error
>>     end
>>     sock:settimeout(2)
>>     sock:send(key .. "\r\n")
>>     while true do
>>         local s, status, partial = sock:receive("*l")
>
> 1. The first one is in the LUA code, where you don't check the return code
> after calling sock:receive(). In this case, you enter an "infinite" loop,
> adding an extra time to the response almost equal to
> tune.lua.session-timeout (4s by default).
> You may want to add this:
>     if not s then
>         core.Alert("Error reading:" .. status)
>         return "Error", status
>     end
>
> Now, 2 other issues seem to be in haproxy, but I'm not sure if it's the
> right way to fix this (I add Willy and Thierry to the thread):
> 2. hlua_socket_settimeout() initializes rto/wto values; maybe it should
> also compute the rex/wex values:
>     socket->s->req.rex = tick_add_ifset(now_ms, tmout);
>     socket->s->req.wex = tick_add_ifset(now_ms, tmout);
>     socket->s->res.rex = tick_add_ifset(now_ms, tmout);
>     socket->s->res.wex = tick_add_ifset(now_ms, tmout);
> 3. It may require waking up the task if a new timeout is set after a
> first one was already set (in your case the task doesn't wake up after 2
> seconds because a first timeout was set to 3 seconds):
>     task_wakeup(socket->s->task, TASK_WOKEN_OTHER);
>
> At least, it seems to fix the issue, but before sending a patch I want to
> be sure that's how we should fix this.
>
> --
> Cyril Bonté
Re: lua socket settimeout has no effect
Hi Cyril,

I have created a very simple config to reproduce this. This config always times out on read in 9 seconds.

To create a service that times out, you can just use nc as follows:

nc -l

Haproxy Conf:

global
    pidfile /var/run/haproxy/l1_webui.pid
    log /dev/log local0 info alert
    log-tag haproxy_l1_webui
    user haproxy
    group haproxy
    chroot /var/lib/haproxy
    lua-load /home/sshetty/l1_rework_unit_test/gds.lua

defaults
    mode tcp
    log global
    option httplog
    timeout client 300s       # Client and server timeout must match the longest
    timeout server 300s       # time we may wait for a response from the server.
    timeout queue 30s         # Don't queue requests too long if saturated.
    timeout connect 4s        # There's no reason to change this one.
    timeout http-request 5s
    maxconn 2
    retries 2
    option redispatch
    option dontlognull
    option http-server-close

frontend http_l1_webui
    bind-process 1
    # http front end since some webdav and image requests are made over http, send as is to L1
    mode http
    option httplog
    bind 0.0.0.0:80
    http-request lua.get_proxy
    default_backend apache_l1

backend apache_l1
    mode http

-- Lua code:

gds_host = "127.0.0.1"
gds_port=

function get_from_gds(key)
    local sock = core.tcp()
    -- Connect timeout after patch
    sock:settimeout(3)
    local result = DOMAIN_NOT_FOUND
    local status, error = sock:connect(gds_host, gds_port)
    if not status then
        core.Alert("Error in connecting:" .. key .. ":" .. error)
        return "Error", "Error: " .. error
    end
    sock:settimeout(2)
    sock:send(key .. "\r\n")
    while true do
        local s, status, partial = sock:receive("*l")
        if s == "END" then break end
        result = s
    end
    sock:close()
    core.Alert("Returning from GDS:" .. key .. ":" .. result)
    return result
end

function protected_get_from_gds(key)
    local status, result = pcall(get_from_gds, key)
    if not status then
        core.Alert("Error in getting key: " .. key .. ":" .. result)
        return "Error"
    end
    return result
end

function get_proxy(txn)
    local time = os.time()
    local host = txn.sf:req_fhdr("host")
    local fqn = host
    host = host:gsub("%..*", "")
    core.Alert("Getting proxy for domain:" .. host .. ", connecting to " .. gds_host .. ":" .. gds_port)
    local result = protected_get_from_gds("get apache." .. host)
    core.Alert("Received Response:" .. host .. ":" .. result .. "<")
end

core.register_action("get_proxy", { "http-req" }, get_proxy)

Sample log timing out in 9 seconds:

Aug 12 16:16:12 l1ratelimit01 haproxy_l1_webui[23965]: Getting proxy for domain:qaus16march2017, connecting to 127.0.0.1:
Aug 12 16:16:21 l1ratelimit01 haproxy_l1_webui[23965]: Error in getting key: get apache.qaus16march2017:execution timeout
Aug 12 16:16:21 l1ratelimit01 haproxy_l1_webui[23965]: Received Response:qaus16march2017:Error<
Aug 12 16:16:21 l1ratelimit01 haproxy_l1_webui[23965]: 127.0.0.1:39304 [12/Aug/2018:16:16:12.164] http_l1_webui apache_l1/ 9026/-1/-1/-1/9026 503 212 - - SC-- 0/0/0/0/0 0/0 "GET /123 HTTP/1.1"

On Sun, Aug 12, 2018 at 4:02 PM, Cyril Bonté wrote:
> Hi,
>
> On 12/08/2018 at 08:41, Sachin Shetty wrote:
>> Hi Cyril,
>>
>> Any idea how I can deterministically set the read timeout as well?
>
> Well, currently I can't reproduce this at all. Can you provide some more
> details? Or even a full configuration providing the test case?
> From the tests I've made, read and write timeouts work as expected.
>
>> Thanks
>> Sachin
>>
>> On Fri, Jul 27, 2018 at 1:23 PM, Sachin Shetty <sshe...@egnyte.com> wrote:
>>
>>     Thank you Cyril, your patch fixed the connect issue.
>>
>>     Read timeout still seems a bit weird though: at settimeout(1), the
>>     read timeout kicks in at about 4 seconds, and at settimeout(2), the
>>     read timeout kicks in at about 8 seconds.
>>
>>     Is that expected? I couldn't find the read timeout explicitly set
>>     anywhere in the same source file.
>>
>>     Thanks
>>     Sachin
>>
>>     On Fri, Jul 27, 2018 at 5:18 AM, Cyril Bonté <cyril.bo...@free.fr> wrote:
>>
>>         Hi,
>>
>>         On 26/07/2018 at 19:54, Sachin Shetty wrote:
Re: lua socket settimeout has no effect
Hi Cyril,

Any idea how I can deterministically set the read timeout as well?

Thanks
Sachin

On Fri, Jul 27, 2018 at 1:23 PM, Sachin Shetty wrote:
> Thank you Cyril, your patch fixed the connect issue.
>
> Read timeout still seems a bit weird though: at settimeout(1), the read
> timeout kicks in at about 4 seconds, and at settimeout(2), the read
> timeout kicks in at about 8 seconds.
>
> Is that expected? I couldn't find the read timeout explicitly set
> anywhere in the same source file.
>
> Thanks
> Sachin
>
> On Fri, Jul 27, 2018 at 5:18 AM, Cyril Bonté wrote:
>
>> Hi,
>>
>> On 26/07/2018 at 19:54, Sachin Shetty wrote:
>>
>>> Hi,
>>>
>>> We are using a http-req lua action to dynamically set some app-specific
>>> metadata headers. The lua handler connects to an upstream memcache-like
>>> service over tcp to fetch additional metadata.
>>>
>>> Functionally everything works ok, but I am seeing that socket.settimeout
>>> has no effect. Irrespective of what I set in settimeout, if the upstream
>>> service is unreachable, connect always times out at 5 seconds, and read
>>> times out around 10 seconds. It seems like settimeout has no effect and
>>> it always picks defaults of 5 seconds for the connect timeout and 10
>>> seconds for the read timeout.
>>
>> For the connect timeout, it seems this is a hardcoded default value in
>> src/hlua.c:
>>     socket_proxy.timeout.connect = 5000; /* By default the timeout
>>                                             connection is 5s. */
>>
>> If it's possible, can you try the patch attached (for the 1.7.x branch)?
>> But please don't use it in production yet ;-)
>>
>>> Haproxy conf call:
>>>
>>> http-request lua.get_proxy
>>>
>>> Lua code sample:
>>>
>>> function get_proxy(txn)
>>>     local sock = core.tcp()
>>>     sock:settimeout(2)
>>>     status, error = sock:connect(gds_host, gds_port)
>>>     if not status then
>>>         core.Alert("1 Error in connecting:" .. key .. ":" .. error)
>>>         return result, "Error: " .. error
>>>     end
>>>     sock:send(key .. "\r\n")
>>>
>>> core.register_action("get_proxy", { "http-req" }, get_proxy)
>>>
>>> Haproxy version:
>>>
>>> HA-Proxy version 1.7.8 2017/07/07
>>> Copyright 2000-2017 Willy Tarreau <wi...@haproxy.org>
>>>
>>> Build options :
>>>   TARGET  = linux2628
>>>   CPU     = generic
>>>   CC      = gcc
>>>   CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -DTCP_USER_TIMEOUT=18
>>>   OPTIONS = USE_LINUX_TPROXY=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1
>>>
>>> Default settings :
>>>   maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
>>>
>>> Encrypted password support via crypt(3): yes
>>> Built with zlib version : 1.2.7
>>> Running on zlib version : 1.2.7
>>> Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
>>> Built with OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
>>> Running on OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
>>> OpenSSL library supports TLS extensions : yes
>>> OpenSSL library supports SNI : yes
>>> OpenSSL library supports prefer-server-ciphers : yes
>>> Built with PCRE version : 8.32 2012-11-30
>>> Running on PCRE version : 8.32 2012-11-30
>>> PCRE library supports JIT : no (USE_PCRE_JIT not set)
>>> Built with Lua version : Lua 5.3.2
>>> Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
>>>
>>> Available polling systems :
>>>       epoll : pref=300,  test result OK
>>>        poll : pref=200,  test result OK
>>>      select : pref=150,  test result OK
>>> Total: 3 (3 usable), will use epoll.
>>>
>>> Available filters :
>>>     [COMP] compression
>>>     [TRACE] trace
>>>     [SPOE] spoe
>>>
>>> Thanks
>>> Sachin
>>
>> --
>> Cyril Bonté
race condition causing some corruption in lua socket service response
Hi,

We are using a http-req lua action to dynamically set some app-specific metadata headers. The lua handler connects to an upstream memcache-like service over tcp to fetch additional metadata. Here is a simplified config:

function get_from_gds(txn)
    local key = txn.sf:req_fhdr("host")
    local sock = core.tcp()
    -- Connect timeout after patch
    sock:settimeout(3)
    local result = DOMAIN_NOT_FOUND
    local status, error = sock:connect(gds_host, gds_port)
    if not status then
        core.Alert("Error in connecting:" .. key .. ":" .. error)
        return GDS_ERROR, "Error: " .. error
    end
    sock:settimeout(2)
    sock:send(key .. "\r\n")
    while true do
        local s, status, partial = sock:receive("*l")
        if s == "END" then break end
        result = s
    end
    sock:close()
    core.Alert("Returning from GDS:" .. key .. ":" .. result)
    return result
end

core.register_action("get_proxy", { "http-req" }, get_from_gds)

Functionally it works. I am load testing with 100 concurrent threads over several hours, and roughly once in a million requests I see the result not being the same as what was sent by the upstream service. I have added logging and confirmed that the upstream server sends the correct response, but the result variable somehow gets mixed up with a value from another concurrently running request.

Any idea? I have looked at both the lua code and the upstream service and made sure all variables are local etc., but I am not able to spot anything.

Thanks
Sachin
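The read loop above has the same gap discussed elsewhere in this archive: sock:receive("*l") can return nil on error or timeout, and without a check the loop keeps spinning with whatever result it last saw. A language-independent sketch of the loop with the check in place (Python here purely for illustration; the recv_line callback stands in for sock:receive, returning None on error):

```python
END_SENTINEL = "END"

def read_value(recv_line):
    """Drain a memcache-style line protocol reply, keeping the last
    data line seen before the END sentinel. recv_line() returns the
    next line, or None on error/timeout (like sock:receive('*l')
    returning nil in the Lua code above)."""
    result = None
    while True:
        line = recv_line()
        if line is None:
            # Without this check the loop never terminates cleanly on
            # a read error and can report stale/garbage data.
            return None, "read error"
        if line == END_SENTINEL:
            break
        result = line
    return result, None
```

All state here is local to the call, which is the property to preserve in the Lua handler too: any value shared between concurrently yielding handlers (globals, upvalues) is a candidate for the once-in-a-million mix-up described above.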
Re: Performance of using lua calls for map manipulation on every request
Thank you Thierry for your reply. I will change to txn.f['req.fhdr'].

On Wed, Aug 1, 2018 at 2:31 PM, Thierry Fournier <thierry.fourn...@arpalert.org> wrote:
> Hi,
>
> The Lua overhead is very low. On my laptop I easily reach 18,000 HTTP
> requests per second with basic Lua processing. I guess that your code
> will not have a significant impact on perfs.
>
> Note that the function:
>
>     txn.http:req_get_headers()["host"][0]
>
> consumes more CPU than
>
>     txn.f['req.fhdr']('host')
>
> or
>
>     txn.sf['req.fhdr']('host')
>
> Other point: I'm not sure that the split() function exists.
>
> Thierry
>
>> On 27 Jul 2018, at 14:38, Sachin Shetty wrote:
>>
>> Hi,
>>
>> We are doing about 10K requests/minute on a single haproxy server; we
>> have enough CPUs and memory. Right now each request looks up a map for
>> backend info. It works well.
>>
>> Now we need to build some expiry logic around the map, like ignoring
>> some entries in the map after some time. I could do this in lua, but it
>> would mean that every request would make a lua call to look up a map
>> value and make a decision.
>>
>> My lua method looks like this:
>>
>> function get_proxy_from_map(txn)
>>     local host = txn.http:req_get_headers()["host"][0]
>>     local value = proxy_map_v2:lookup(host)
>>     if value then
>>         local values = split(value, ",")
>>         local proxy = values[1]
>>         local time = values[2]
>>         if os.time() > tonumber(time) then
>>             core.Alert("Expired: returning nil: " .. host)
>>             return
>>         else
>>             return proxy
>>         end
>>     end
>>     return
>> end
>>
>> Any suggestions on how this would impact performance? Our tests look ok.
>>
>> Thanks
>> Sachin
Performance of using lua calls for map manipulation on every request
Hi,

We are doing about 10K requests/minute on a single haproxy server; we have enough CPUs and memory. Right now each request looks up a map for backend info. It works well.

Now we need to build some expiry logic around the map, like ignoring some entries in the map after some time. I could do this in lua, but it would mean that every request would make a lua call to look up a map value and make a decision.

My lua method looks like this:

function get_proxy_from_map(txn)
    local host = txn.http:req_get_headers()["host"][0]
    local value = proxy_map_v2:lookup(host)
    if value then
        local values = split(value, ",")
        local proxy = values[1]
        local time = values[2]
        if os.time() > tonumber(time) then
            core.Alert("Expired: returning nil: " .. host)
            return
        else
            return proxy
        end
    end
    return
end

Any suggestions on how this would impact performance? Our tests look ok.

Thanks
Sachin
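The per-request work here is tiny: one map lookup, one string split, one numeric comparison. The expiry decision can be sketched as a pure function (Python for illustration; the "proxy,expiry_epoch" value layout is taken from the Lua method above, the function name is hypothetical):

```python
import time

def proxy_from_value(value, now=None):
    """Expiry check from get_proxy_from_map: map values are stored as
    'proxy,expiry_epoch'. Return the proxy name while the entry is
    still fresh, else None (expired, malformed, or missing)."""
    if value is None:
        return None
    now = now if now is not None else time.time()
    proxy, _, expiry = value.partition(",")
    if not expiry or now > float(expiry):
        return None  # expired or malformed entry
    return proxy
```

Testing the logic this way, independent of haproxy, also makes Thierry's point easy to act on: the expensive parts are the header access and map lookup, not the comparison itself.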
Re: lua socket settimeout has no effect
Thank you Cyril, your patch fixed the connect issue.

Read timeout still seems a bit weird though: at settimeout(1), the read timeout kicks in at about 4 seconds, and at settimeout(2), the read timeout kicks in at about 8 seconds.

Is that expected? I couldn't find the read timeout explicitly set anywhere in the same source file.

Thanks
Sachin

On Fri, Jul 27, 2018 at 5:18 AM, Cyril Bonté wrote:
> Hi,
>
> On 26/07/2018 at 19:54, Sachin Shetty wrote:
>
>> Hi,
>>
>> We are using a http-req lua action to dynamically set some app-specific
>> metadata headers. The lua handler connects to an upstream memcache-like
>> service over tcp to fetch additional metadata.
>>
>> Functionally everything works ok, but I am seeing that socket.settimeout
>> has no effect. Irrespective of what I set in settimeout, if the upstream
>> service is unreachable, connect always times out at 5 seconds, and read
>> times out around 10 seconds. It seems like settimeout has no effect and
>> it always picks defaults of 5 seconds for the connect timeout and 10
>> seconds for the read timeout.
>
> For the connect timeout, it seems this is a hardcoded default value in
> src/hlua.c:
>     socket_proxy.timeout.connect = 5000; /* By default the timeout
>                                             connection is 5s. */
>
> If it's possible, can you try the patch attached (for the 1.7.x branch)?
> But please don't use it in production yet ;-)
>
>> Haproxy conf call:
>>
>> http-request lua.get_proxy
>>
>> Lua code sample:
>>
>> function get_proxy(txn)
>>     local sock = core.tcp()
>>     sock:settimeout(2)
>>     status, error = sock:connect(gds_host, gds_port)
>>     if not status then
>>         core.Alert("1 Error in connecting:" .. key .. ":" .. error)
>>         return result, "Error: " .. error
>>     end
>>     sock:send(key .. "\r\n")
>>
>> core.register_action("get_proxy", { "http-req" }, get_proxy)
>>
>> Haproxy version:
>>
>> HA-Proxy version 1.7.8 2017/07/07
>> Copyright 2000-2017 Willy Tarreau <wi...@haproxy.org>
>>
>> Build options :
>>   TARGET  = linux2628
>>   CPU     = generic
>>   CC      = gcc
>>   CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -DTCP_USER_TIMEOUT=18
>>   OPTIONS = USE_LINUX_TPROXY=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1
>>
>> Default settings :
>>   maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
>>
>> Encrypted password support via crypt(3): yes
>> Built with zlib version : 1.2.7
>> Running on zlib version : 1.2.7
>> Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
>> Built with OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
>> Running on OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
>> OpenSSL library supports TLS extensions : yes
>> OpenSSL library supports SNI : yes
>> OpenSSL library supports prefer-server-ciphers : yes
>> Built with PCRE version : 8.32 2012-11-30
>> Running on PCRE version : 8.32 2012-11-30
>> PCRE library supports JIT : no (USE_PCRE_JIT not set)
>> Built with Lua version : Lua 5.3.2
>> Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
>>
>> Available polling systems :
>>       epoll : pref=300,  test result OK
>>        poll : pref=200,  test result OK
>>      select : pref=150,  test result OK
>> Total: 3 (3 usable), will use epoll.
>>
>> Available filters :
>>     [COMP] compression
>>     [TRACE] trace
>>     [SPOE] spoe
>>
>> Thanks
>> Sachin
>
> --
> Cyril Bonté
lua socket settimeout has no effect
Hi,

We are using an http-req lua action to dynamically set some app-specific metadata headers. The lua handler connects to an upstream memcache-like service over tcp to fetch additional metadata.

Functionally everything works ok, but I am seeing that socket.settimeout has no effect. Irrespective of what I set in settimeout, if the upstream service is unreachable, connect always times out at 5 seconds, and read times out around 10 seconds. It seems like settimeout has no effect and it always picks defaults of 5 seconds for the connect timeout and 10 seconds for the read timeout.

Haproxy conf call:

    http-request lua.get_proxy

Lua code sample:

    function get_proxy(txn)
        local sock = core.tcp()
        sock:settimeout(2)
        status, error = sock:connect(gds_host, gds_port)
        if not status then
            core.Alert("1 Error in connecting:" .. key .. ":" .. error)
            return result, "Error: " .. error
        end
        sock:send(key .. "\r\n")

    core.register_action("get_proxy", { "http-req" }, get_proxy)

Haproxy version:

HA-Proxy version 1.7.8 2017/07/07
Copyright 2000-2017 Willy Tarreau <wi...@haproxy.org>

Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -DTCP_USER_TIMEOUT=18
  OPTIONS = USE_LINUX_TPROXY=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.7
Running on zlib version : 1.2.7
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
Running on OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.32 2012-11-30
Running on PCRE version : 8.32 2012-11-30
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with Lua version : Lua 5.3.2
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
        [COMP] compression
        [TRACE] trace
        [SPOE] spoe

Thanks
Sachin
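For reference, the truncated snippet above can be completed into a self-contained action. This is only a sketch: it assumes HAProxy's Lua environment (`core.*`, `txn` exist only inside haproxy, not in standalone Lua), and `gds_host`, `gds_port` and `key` are placeholders. Note that on a stock 1.7 build, `settimeout()` does not cap the connect phase, since src/hlua.c hardcodes a 5s connect timeout (the subject of this thread).

```lua
-- Sketch only: runs inside HAProxy's Lua engine; gds_host, gds_port
-- and key are placeholder names, not real configuration.
function get_proxy(txn)
    local sock = core.tcp()
    sock:settimeout(2)   -- caps reads; caps connect only with Cyril's patch

    local status, err = sock:connect(gds_host, gds_port)
    if not status then
        core.Alert("Error in connecting: " .. tostring(err))
        return
    end

    sock:send(key .. "\r\n")

    local line, rerr = sock:receive("*l")   -- returns nil on read timeout
    sock:close()
    if not line then
        core.Alert("Error reading: " .. tostring(rerr))
        return
    end

    -- Hand the result back to the configuration via a txn-scoped variable.
    txn:set_var("txn.gds_result", line)
end

core.register_action("get_proxy", { "http-req" }, get_proxy)
```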
Slow download speeds on SSL v/s plain http
Hi,

We have been using haproxy in our production systems for a long time. Recently we spotted a slowdown in downloads over SSL compared to plain http. We are able to reproduce this in a test setup which has no other traffic. We have nbproc set according to the number of cpus.

Haproxy has two frontends, one SSL and one plain http. Both forward to the same webserver on localhost to serve a 1GB file. Over https we get about 150 MBPS, whereas on the http backend we get about 500 MBPS. Everything else is the same and all tests are on localhost, so there is no external network.

Haproxy version:

/usr/sbin/haproxy -vvv
HA-Proxy version 1.7.8 2017/07/07
Copyright 2000-2017 Willy Tarreau

Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -DTCP_USER_TIMEOUT=18
  OPTIONS = USE_LINUX_TPROXY=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.7
Running on zlib version : 1.2.7
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
Running on OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.32 2012-11-30
Running on PCRE version : 8.32 2012-11-30
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with Lua version : Lua 5.3.2
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
        [COMP] compression
        [TRACE] trace
        [SPOE] spoe

OS: Linux l1webui-ratelimit01 3.10.0-514.21.2.el7.x86_64 #1 SMP Tue Jun 20 12:24:47 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Thanks
Sachin
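For context, a mitigation commonly used on multi-process 1.7-era builds was to pin several worker processes to the TLS frontend so handshakes and record encryption do not compete with the plain-HTTP path on a single CPU. The sketch below is an assumption, not from this thread; the certificate path, ports and process counts are placeholders.

```
# Sketch only (1.7-era nbproc setup); paths and counts are placeholders.
global
    nbproc 4

frontend fe_https
    bind *:443 ssl crt /etc/haproxy/site.pem process 1-3
    bind-process 1-3          # dedicate 3 processes to TLS work
    default_backend local_web

frontend fe_http
    bind *:80 process 4
    bind-process 4            # plain HTTP gets its own process
    default_backend local_web

backend local_web
    server web1 127.0.0.1:8080
```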
Re: Dynamic backends decided by an external service
Thanks a lot Thierry, that was it, changing to http-request solved my issue. I am now able to leverage a fully dynamic backend :)

Thanks
Sachin

On 7/21/16, 10:52 PM, "thierry.fourn...@arpalert.org" <thierry.fourn...@arpalert.org> wrote:

>On Tue, 19 Jul 2016 15:28:25 +0530
>Sachin Shetty <sshe...@egnyte.com> wrote:
>
>> Hi,
>>
>> We always had a unique requirement of picking a backend based on the
>> response from an external http service. In the past we got this working
>> by routing requests via a modified apache and caching the headers in
>> maps for further requests, but now I am trying to simplify our topology
>> and get this done using just haproxy and lua.
>>
>> I am running into some problems:
>>
>> Lua method:
>>
>>     function choose_backend(txn)
>>         local host = txn.http:req_get_headers()["host"][0]
>>         core.Alert("Getting Info:" .. host)
>>         local sock = core.tcp()
>>         sock:connect("127.0.0.1", 6280)
>>         sock:send("GET /eos/rest/private/gds/l1/1.0/domain/" .. host .. "\r\n")
>>         result = sock:receive("*a")
>>         sock:close()
>>         core.Alert("Received Response:" .. result .. "<")
>>         core.set_map("/tmp/proxy_webui.map", host, result)
>>         core.Alert("Map Set:" .. host .. "-->" .. result .. "<")
>>     end
>>
>>     core.register_action("choose_backend", { "http-req" }, choose_backend)
>>
>> Haproxy conf:
>>
>>     frontend luatest
>>         mode http
>>         maxconn 1
>>         bind *:9000
>>
>>         use_backend %[hdr(host),lower,map(/tmp/proxy_webui.map)] if FALSE # To declare the map
>>
>>         http-request lua.choose_backend
>>
>>         tcp-request content capture hdr(host),map(/tmp/proxy_webui.map) len 80
>>
>>         acl is_ez_pod capture.req.hdr(0) http://127.0.0.1:6280
>>         use_backend ez_pod if is_ez_pod
>>
>>     backend ez_pod
>>         server ez_pod 192.168.56.101:6280 maxconn 2
>>
>> There are some issues:
>> 1. I do see Map Set called correctly in the logs, but the haproxy
>> capture does not find the key in the map for the first request. Is this
>> related to async execution of the lua routine?
>
>Hi,
>
>The processing of the request by haproxy is pending during the execution
>of the lua action.
>
>The "tcp-request content" directives are executed before the
>"http-request" ones. So HAProxy executes the capture first, and then the
>action which populates the map.
>
>Normally, HAProxy displays a warning when it starts if it detects a
>configuration order different from the execution order. Try replacing
>the "tcp-request" with "http-request".
>
>> 2. I expected subsequent requests to see the value at least, but even
>> that is not consistent: some requests see the value in the map and some
>> don't. Is there a better way to do this?
>
>It seems that you use a good way to do this.
>
>> I already found that I cannot invoke the lua routine with use_backend
>> because yield is not allowed in a sample fetch context.
>
>Yes, the usage of core.tcp() requires some yields, and haproxy doesn't
>support yielding during the execution of sample fetches. To bypass this
>problem, I use variables: I store the expected result in a variable
>during the execution of the action, and I use that variable with the
>use_backend directive.
>
>Thierry
>
>> What would be the best way to achieve this? Our requirement is similar
>> to what can be done with redis+nginx here:
>> http://openresty.org/en/dynamic-routing-based-on-redis.html except that
>> we have an http service that decides the backend instead of a redis
>> service.
>>
>> Thanks
>> Sachin
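The ordering fix Thierry describes can be sketched as a configuration-only change, reusing the names from this thread. This assumes HAProxy 1.6+, where "http-request capture" is available; since http-request rules run after all tcp-request content rules and in configuration order, the capture then sees the map entry the Lua action just wrote.

```
# Sketch: move the capture from tcp-request to http-request so it runs
# after lua.choose_backend has populated the map.
frontend luatest
    mode http
    bind *:9000

    http-request lua.choose_backend
    http-request capture hdr(host),map(/tmp/proxy_webui.map) len 80

    acl is_ez_pod capture.req.hdr(0) http://127.0.0.1:6280
    use_backend ez_pod if is_ez_pod
```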
Re: Dynamic backends decided by an external service
Hi,

Any suggestions? Basically, if I call http-request lua.choose_backend which seeds a map, will the map values be available in a subsequent map lookup in the same request?

Thanks
Sachin

From: Sachin Shetty <sshe...@egnyte.com>
Date: Tuesday, July 19, 2016 at 3:28 PM
To: "haproxy@formilux.org" <haproxy@formilux.org>
Subject: Dynamic backends decided by an external service

Hi,

We always had a unique requirement of picking a backend based on the response from an external http service. In the past we got this working by routing requests via a modified apache and caching the headers in maps for further requests, but now I am trying to simplify our topology and get this done using just haproxy and lua.

I am running into some problems:

Lua method:

    function choose_backend(txn)
        local host = txn.http:req_get_headers()["host"][0]
        core.Alert("Getting Info:" .. host)
        local sock = core.tcp()
        sock:connect("127.0.0.1", 6280)
        sock:send("GET /eos/rest/private/gds/l1/1.0/domain/" .. host .. "\r\n")
        result = sock:receive("*a")
        sock:close()
        core.Alert("Received Response:" .. result .. "<")
        core.set_map("/tmp/proxy_webui.map", host, result)
        core.Alert("Map Set:" .. host .. "-->" .. result .. "<")
    end

    core.register_action("choose_backend", { "http-req" }, choose_backend)

Haproxy conf:

    frontend luatest
        mode http
        maxconn 1
        bind *:9000

        use_backend %[hdr(host),lower,map(/tmp/proxy_webui.map)] if FALSE # To declare the map

        http-request lua.choose_backend

        tcp-request content capture hdr(host),map(/tmp/proxy_webui.map) len 80

        acl is_ez_pod capture.req.hdr(0) http://127.0.0.1:6280
        use_backend ez_pod if is_ez_pod

    backend ez_pod
        server ez_pod 192.168.56.101:6280 maxconn 2

There are some issues:

1. I do see Map Set called correctly in the logs, but the haproxy capture does not find the key in the map for the first request. Is this related to async execution of the lua routine?

2. I expected subsequent requests to see the value at least, but even that is not consistent: some requests see the value in the map and some don't. Is there a better way to do this?

I already found that I cannot invoke the lua routine with use_backend because yield is not allowed in a sample fetch context.

What would be the best way to achieve this? Our requirement is similar to what can be done with redis+nginx here: http://openresty.org/en/dynamic-routing-based-on-redis.html except that we have an http service that decides the backend instead of a redis service.

Thanks
Sachin
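Since yields are not allowed in a sample-fetch context, the variable-based pattern Thierry later recommends in this thread (do the socket I/O in an action, hand the result to use_backend through a variable) could look like this sketch. It assumes HAProxy 1.6+ (txn:set_var and the var() fetch); the upstream address, path and fallback backend name are placeholders.

```lua
-- Sketch of the variable-based routing approach; host/port/path and
-- the "default_pod" fallback are placeholders.
function choose_backend(txn)
    local host = txn.http:req_get_headers()["host"][0]
    local sock = core.tcp()
    sock:settimeout(2)
    if not sock:connect("127.0.0.1", 6280) then
        txn:set_var("txn.selected_backend", "default_pod")
        return
    end
    sock:send("GET /eos/rest/private/gds/l1/1.0/domain/" .. host .. "\r\n")
    local result = sock:receive("*l")   -- actions may yield, fetches may not
    sock:close()
    txn:set_var("txn.selected_backend", result or "default_pod")
end

core.register_action("choose_backend", { "http-req" }, choose_backend)
```

The matching configuration side would then route on the variable instead of the map:

```
    http-request lua.choose_backend
    use_backend %[var(txn.selected_backend)]
```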
Dynamic backends decided by an external service
Hi,

We always had a unique requirement of picking a backend based on the response from an external http service. In the past we got this working by routing requests via a modified apache and caching the headers in maps for further requests, but now I am trying to simplify our topology and get this done using just haproxy and lua.

I am running into some problems:

Lua method:

    function choose_backend(txn)
        local host = txn.http:req_get_headers()["host"][0]
        core.Alert("Getting Info:" .. host)
        local sock = core.tcp()
        sock:connect("127.0.0.1", 6280)
        sock:send("GET /eos/rest/private/gds/l1/1.0/domain/" .. host .. "\r\n")
        result = sock:receive("*a")
        sock:close()
        core.Alert("Received Response:" .. result .. "<")
        core.set_map("/tmp/proxy_webui.map", host, result)
        core.Alert("Map Set:" .. host .. "-->" .. result .. "<")
    end

    core.register_action("choose_backend", { "http-req" }, choose_backend)

Haproxy conf:

    frontend luatest
        mode http
        maxconn 1
        bind *:9000

        use_backend %[hdr(host),lower,map(/tmp/proxy_webui.map)] if FALSE # To declare the map

        http-request lua.choose_backend

        tcp-request content capture hdr(host),map(/tmp/proxy_webui.map) len 80

        acl is_ez_pod capture.req.hdr(0) http://127.0.0.1:6280
        use_backend ez_pod if is_ez_pod

    backend ez_pod
        server ez_pod 192.168.56.101:6280 maxconn 2

There are some issues:

1. I do see Map Set called correctly in the logs, but the haproxy capture does not find the key in the map for the first request. Is this related to async execution of the lua routine?

2. I expected subsequent requests to see the value at least, but even that is not consistent: some requests see the value in the map and some don't. Is there a better way to do this?

I already found that I cannot invoke the lua routine with use_backend because yield is not allowed in a sample fetch context.

What would be the best way to achieve this? Our requirement is similar to what can be done with redis+nginx here: http://openresty.org/en/dynamic-routing-based-on-redis.html except that we have an http service that decides the backend instead of a redis service.

Thanks
Sachin
Re: undefined symbol: lua_getmetatable in using luasocket
Thank you Cyril. I could not get it to work with 5.3 either; I am now trying to use the built-in sockets with core.tcp().

On 7/19/16, 4:00 AM, "Cyril Bonté" <cyril.bo...@free.fr> wrote:

>Hi Sachin,
>
>Le 18/07/2016 à 16:16, Sachin Shetty a écrit :
>> (...)
>> However when starting haproxy, I get this error:
>>
>> [ALERT] 199/063903 (7106) : parsing
>> [/home/egnyte/haproxy/conf/haproxy.conf:9] : lua runtime error: error
>> loading module 'socket.core' from file
>> '/usr/local/lib/lua/5.1/socket/core.so':
>>
>> /usr/local/lib/lua/5.1/socket/core.so: undefined symbol: lua_getmetatable
>
>From this previous line, it's not a haproxy issue. It looks like you
>are using a lua library for the wrong lua version.
>Try to use the library for lua 5.3.
>
>> Standalone lua scripts are fine with the require "socket" line and I do
>> see the output, but it fails to load within haproxy.
>>
>> Thanks
>> Sachin
>
>--
>Cyril Bonté
undefined symbol: lua_getmetatable in using luasocket
Hi,

I am trying to load a luasocket script which would make a rest call to an upstream service to determine the backend.

The script is as follows:

    http = require "socket.http"

    function choose_backend(txn, arg1)
        core.log(core.info, "Getting Info:" .. arg1)
        result, statuscode, content = http.request("http://localhost:6280/eos/rest/private/gds/l1/1.0/domain/" .. arg1)
        return result
    end

    core.register_fetches("choose_backend", choose_backend)

However when starting haproxy, I get this error:

    [ALERT] 199/063903 (7106) : parsing [/home/egnyte/haproxy/conf/haproxy.conf:9] : lua runtime error: error loading module 'socket.core' from file '/usr/local/lib/lua/5.1/socket/core.so':
    /usr/local/lib/lua/5.1/socket/core.so: undefined symbol: lua_getmetatable

Standalone lua scripts are fine with the require "socket" line and I do see the output, but it fails to load within haproxy.

Thanks
Sachin
How to log all response header values instead of the last occurrence
Hi,

As documented, "capture response header X-Via" will only log the last value of the header X-Via. We have backends that might inject more than one value for the header X-Via. Is there a way to log all the values?

Thanks
Sachin
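One possible workaround, sketched under the assumption that the build has Lua support: txn.http:res_get_headers() in HAProxy's Lua API returns every occurrence of a header as a table, and reading headers does not yield, so a sample fetch is safe here. The fetch name "all_via" and the join separator are my choices, not established configuration.

```lua
-- Sketch: join every X-Via occurrence into one loggable string.
-- Header names are lowercase keys; occurrences are indexed from 0,
-- so pairs() (not ipairs()) is used to walk them all.
function all_via(txn)
    local values = txn.http:res_get_headers()["x-via"]
    if not values then
        return "-"
    end
    local parts = {}
    for _, v in pairs(values) do
        parts[#parts + 1] = v
    end
    return table.concat(parts, ", ")
end

core.register_fetches("all_via", all_via)
```

It could then be referenced from a custom log-format as %[lua.all_via], instead of relying on "capture response header".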
Re: Haproxy running on 100% CPU and slow downloads
To close this thread out: we found the issue to be in the 1.6.4-20160426 patch that I was using. The issue is fixed in 1.6.5. Thanks Willy and Lukas.

Thanks
Sachin

On 5/13/16, 8:14 PM, "Willy Tarreau" <w...@1wt.eu> wrote:

>On Fri, May 13, 2016 at 07:32:36PM +0530, Sachin Shetty wrote:
>> In 24 hours all servers had connections growing, we have reverted the
>> patch for now.
>>
>> I have the show sess all output if you would like to see.
>
>Interestingly, in the "show sess all" from yesterday I'm seeing only
>negative "tofwd" values for stuck sessions. Exactly the type of thing
>which is supposedly fixed now (it's the problem with 2-4GB transfers).
>I don't understand, since I tested the backport and had confirmation
>from another user that it was OK for him. Maybe there's a corner case I
>haven't figured out which may depend on certain options.
>
>Could you please send me privately your config (remove the confidential
>stuff)? I think you gave it to me a few times already but I don't want
>to keep those, you know.
>
>Thanks,
>Willy
Re: Haproxy running on 100% CPU and slow downloads
In 24 hours all servers had connections growing; we have reverted the patch for now.

I have the show sess all output if you would like to see it.

Thanks
Sachin

On 5/12/16, 10:08 PM, "Sachin Shetty" <sshe...@egnyte.com> wrote:

>Hi Lukas,
>
>Attached output.
>
>Thanks
>Sachin
>
>On 5/12/16, 7:41 PM, "Lukas Tribus" <lu...@gmx.net> wrote:
>
>>Hi,
>>
>>Am 12.05.2016 um 14:37 schrieb Sachin Shetty:
>>> Hi Willy,
>>>
>>> We are seeing a strange problem on the patched server. We have several
>>> haproxy servers running but only one with the latest patch, and this
>>> haproxy has frozen twice in the last two days: basically it hits the max
>>> of 2000 open connections on the frontend and then stalls. From the logs
>>> it has 1999 connections on one of the backends, which is nginx, but
>>> nginx_status shows me only a few active connections. It only happens on
>>> the patched haproxy server and does not happen anywhere else. The
>>> interesting thing is that this haproxy is not the one doing SSL; we have
>>> two haproxies on the same box with the latest binary, and the SSL one
>>> seems ok but the non-SSL one keeps on accumulating connections.
>>>
>>> Right now, I see connections building on one backend, hitting 150 in the
>>> last few hours, but the backend nginx only shows about 20 active
>>> connections.
>>
>>Can you collect "show sess all" output from the admin socket?
>>
>>Lukas
Re: Haproxy running on 100% CPU and slow downloads
Hi Willy,

We are seeing a strange problem on the patched server. We have several haproxy servers running but only one with the latest patch, and this haproxy has frozen twice in the last two days: basically it hits the max of 2000 open connections on the frontend and then stalls. From the logs it has 1999 connections on one of the backends, which is nginx, but nginx_status shows me only a few active connections. It only happens on the patched haproxy server and does not happen anywhere else. The interesting thing is that this haproxy is not the one doing SSL; we have two haproxies on the same box with the latest binary, and the SSL one seems ok but the non-SSL one keeps on accumulating connections.

Right now, I see connections building on one backend, hitting 150 in the last few hours, but the backend nginx only shows about 20 active connections.

On 5/10/16, 5:47 PM, "Willy Tarreau" <w...@1wt.eu> wrote:

>On Tue, May 10, 2016 at 11:10:14AM +0530, Sachin Shetty wrote:
>> We deployed the latest and we saw throughput still dropped around peak
>> hours a bit, then we switched to nbproc 4 which is holding up ok.
>
>So probably you were reaching the processing limits for a single process;
>that can easily happen with SSL if a lot of rekeying has to be done.
>
>> Note that
>> 4 CPUs were not sufficient earlier, so I believe the latest version is
>> scaling better.
>
>Good, that confirms that you're not facing these bugs anymore. I'm currently
>starting a new release, which will make it easier for you to deploy.
>
>Thanks for the report,
>Willy
Re: Haproxy running on 100% CPU and slow downloads
We deployed the latest and we saw throughput still dropped a bit around peak hours, then we switched to nbproc 4 which is holding up ok. Note that 4 CPUs were not sufficient earlier, so I believe the latest version is scaling better.

Thanks Lukas and Willy.

On 4/29/16, 11:09 AM, "Willy Tarreau" wrote:

>Hi guys,
>
>On Tue, Apr 26, 2016 at 08:46:37AM +0200, Lukas Tribus wrote:
>> Hi Sachin,
>>
>> there is another fix Willy recently committed, it's ff9c7e24fb [1]
>> and it's in the snapshots [2] since 1.6.4-20160426.
>>
>> This is supposed to fix the issue altogether.
>>
>> Please let us know if this works for you.
>
>Yes, it should fix this. Please note that I've got one report on 1.5 of
>some huge transfers (multi-GB) stalling after this patch, and since I
>can't find any case where it could be wrong nor can I reproduce it, I
>suspect we may have a bug somewhere else (at least in 1.5) that was
>hidden by the bug this series of patches fixes. We had no such report on
>1.6 however.
>
>There's another case of high CPU usage which Cyril managed to isolate.
>The issue has been present since 1.4 and is *very* hard to reproduce;
>I even had to tweak some sysctls on my laptop to see it and am careful
>not to reboot it. It is triggered by *some* pipelined requests. We're
>currently working on fixing it; there are several ways to fix it but
>all of them come with their downsides for now (one of them being a
>different code path between 1.7 and 1.6/1.5/1.4, which doesn't appeal
>to me much).
>
>This is why I'm still waiting before issuing a new series of versions.
>
>In the mean time, feel free to test the latest 1.6 snapshot and report
>any issues you may face. I'm really committed to getting these issues
>fixed once and for all; it's getting irritating to see such bugs
>surviving, but I never give up the fight :-)
>
>Best regards,
>Willy
Re: Haproxy running on 100% CPU and slow downloads
Thanks Lukas and Willy. I am in the process of getting 1.6.4-20160426 deployed in our QA; I will keep you guys posted.

On 4/29/16, 11:09 AM, "Willy Tarreau" wrote:

>Hi guys,
>
>On Tue, Apr 26, 2016 at 08:46:37AM +0200, Lukas Tribus wrote:
>> Hi Sachin,
>>
>> there is another fix Willy recently committed, it's ff9c7e24fb [1]
>> and it's in the snapshots [2] since 1.6.4-20160426.
>>
>> This is supposed to fix the issue altogether.
>>
>> Please let us know if this works for you.
>
>Yes, it should fix this. Please note that I've got one report on 1.5 of
>some huge transfers (multi-GB) stalling after this patch, and since I
>can't find any case where it could be wrong nor can I reproduce it, I
>suspect we may have a bug somewhere else (at least in 1.5) that was
>hidden by the bug this series of patches fixes. We had no such report on
>1.6 however.
>
>There's another case of high CPU usage which Cyril managed to isolate.
>The issue has been present since 1.4 and is *very* hard to reproduce;
>I even had to tweak some sysctls on my laptop to see it and am careful
>not to reboot it. It is triggered by *some* pipelined requests. We're
>currently working on fixing it; there are several ways to fix it but
>all of them come with their downsides for now (one of them being a
>different code path between 1.7 and 1.6/1.5/1.4, which doesn't appeal
>to me much).
>
>This is why I'm still waiting before issuing a new series of versions.
>
>In the mean time, feel free to test the latest 1.6 snapshot and report
>any issues you may face. I'm really committed to getting these issues
>fixed once and for all; it's getting irritating to see such bugs
>surviving, but I never give up the fight :-)
>
>Best regards,
>Willy
Re: Haproxy running on 100% CPU and slow downloads
Hi Lukas,

We tried the patch; it seems better. As soon as we switched nbproc off, throughput did not drop immediately like it did with the earlier version; it started deteriorating slowly as traffic increased towards peak hours, but eventually it did crash to the same levels as before.

CPU usage was also better: only at peak hours did I see 100% CPU consumed by haproxy, otherwise it would be between 60-80%.

Please see the attached image measuring throughput: nbproc=20 until ~10PM, nbproc=1 from ~10PM to ~10AM, nbproc reverted to 20 from 10AM onwards. The Y-axis is speed in MBPS.

Thanks
Sachin

On 4/21/16, 12:57 PM, "Lukas Tribus" <lu...@gmx.net> wrote:

>Hi,
>
>Am 21.04.2016 um 08:11 schrieb Sachin Shetty:
>> Hi,
>>
>> any hints to further isolate this - we have deferred the problem by
>> adding all the cores we had, but I have a feeling that our request rate
>> is not that high (7K per minute at peak) and it will show up again as
>> traffic increases.
>>
>> Thanks
>> Sachin
>
>Try the fix 9c09ee87 [1], which is in snapshots since 1.6.4-20160412.
>
>cheers,
>
>lukas
>
>[1] http://www.haproxy.org/git?p=haproxy-1.6.git;a=commitdiff_plain;h=9c09ee87836bb2efd78a17f9b16d8afe0ec64018;hp=3bee40bfb7a35b624c5cc9d88daff5a9e3b99f33
>[2] http://www.haproxy.org/download/1.6/src/snapshot/
Re: Haproxy running on 100% CPU and slow downloads
Hi,

Any hints to further isolate this? We have deferred the problem by adding all the cores we had, but I have a feeling that our request rate is not that high (7K per minute at peak) and it will show up again as traffic increases.

Thanks
Sachin

On 4/18/16, 12:22 PM, "Sachin Shetty" <sshe...@egnyte.com> wrote:

>Hi Lukas,
>
>We upgraded to 1.6, went back to nbproc 1 from 12, and the problem showed
>up again: haproxy hitting 90-100% and monitors reporting an immediate
>download speed drop from 100MBPS to 10MBPS.
>
>I ran strace as you said; the output is huge, and I have attached a small
>subset of it to the email. Please let me know if you need more of the
>strace output.
>
>Thanks
>Sachin
>
>On 4/7/16, 5:51 PM, "Lukas Tribus" <lu...@gmx.net> wrote:
>
>>Hi,
>>
>>Am 05.04.2016 um 09:38 schrieb Sachin Shetty:
>>> Hi Lukas, Pavlos,
>>>
>>> Thanks for your response, more info as requested.
>>>
>>> 1. Attached conf with some obfuscation
>>> 2. Haproxy -vv
>>> HA-Proxy version 1.5.4 2014/09/02
>>> Copyright 2000-2014 Willy Tarreau <w...@1wt.eu>
>>
>>I would upgrade to something more recent, the number of bugfixes
>>since 1.5.4 amounts to more than 100!
>>
>>That said, I've not stumbled upon a particular bug explaining what
>>you are seeing.
>>
>>My suggestion would be to go back to nbproc 1 (it's easier to
>>troubleshoot), and run the 100% spinning process through
>>strace -tt -p and post the output.
>>
>>Thanks,
>>
>>Lukas
Re: Haproxy running on 100% CPU and slow downloads
Hi Lukas,

We upgraded to 1.6, went back to nbproc 1 from 12, and the problem showed up again: haproxy hitting 90-100% and monitors reporting an immediate download speed drop from 100MBPS to 10MBPS.

I ran strace as you said; the output is huge, and I have attached a small subset of it to the email. Please let me know if you need more of the strace output.

Thanks
Sachin

On 4/7/16, 5:51 PM, "Lukas Tribus" <lu...@gmx.net> wrote:

>Hi,
>
>Am 05.04.2016 um 09:38 schrieb Sachin Shetty:
>> Hi Lukas, Pavlos,
>>
>> Thanks for your response, more info as requested.
>>
>> 1. Attached conf with some obfuscation
>> 2. Haproxy -vv
>> HA-Proxy version 1.5.4 2014/09/02
>> Copyright 2000-2014 Willy Tarreau <w...@1wt.eu>
>
>I would upgrade to something more recent, the number of bugfixes
>since 1.5.4 amounts to more than 100!
>
>That said, I've not stumbled upon a particular bug explaining what
>you are seeing.
>
>My suggestion would be to go back to nbproc 1 (it's easier to
>troubleshoot), and run the 100% spinning process through
>strace -tt -p and post the output.
>
>Thanks,
>
>Lukas

23:30:41.257757 sendto(120, "...", 16384, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_MORE, NULL, 0) = 16384
23:30:41.258001 sendto(87, "...", 919, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 919
23:30:41.258077 read(33, "\27\3\3\0020", 5) = 5
23:30:41.258134 read(33, "...", 560) = 560
23:30:41.258201 read(3, "\26\3\3\0F", 5) = 5
23:30:41.258244 read(3, "...", 70) = 70
23:30:41.259294 read(3, "\24\3\3\0\1", 5) = 5
23:30:41.259347 read(3, "\1", 1) = 1
23:30:41.259514 read(3, "\26\3\3\0@", 5) = 5
23:30:41.259559 read(3, "...", 64) = 64
23:30:41.259668 write(3, "...", 75) = 75
23:30:41.259748 read(3, 0x7feeaed21343, 5) = -1 EAGAIN (Resource temporarily unavailable)
23:30:41.259818 read(71, "\26\3\1\2\6", 5) = 5
23:30:41.259863 read(71, "...", 518) = 518
23:30:41.280711 read(71, "\24\3\1\0\1", 5) = 5
23:30:41.280790 read(71, "\1", 1) = 1
23:30:41.280967 read(71, "\26\3\1\", 5) = 5
23:30:41.281012 read(71, "...", 48) = 48
23:30:41.281121 write(71, "...", 59) = 59
23:30:41.281199 read(71, 0x7feeaed21343, 5) = -1 EAGAIN (Resource temporarily unavailable)
23:30:41.281246 read(51, "...", 14977) = 14977
23:30:41.281405 sendto(56, "...", 16384, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_MORE, NULL, 0) = 16384
23:30:41.281472 read(38, 0x7feeaeb15183, 5) = -1 EAGAIN (Resource temporarily unavailable)
23:30:41.281517 read(140, "...", 7677) = 5840
23:30:41.281562 read(140, 0x7feeaec87a2b, 1837) = -1 EAGAIN (Resource temporarily unavailable)
23:30:41.281605 read(45, "\27\3\3\2\240", 5) = 5
23:30:41.281647 read(45, "...", 672) = 672
23:30:41.281699 read(31, "...", 48) = 48
23:30:41.281811 sendto(272, "...", 16384, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_MORE, NULL, 0) = 16384
23:30:41.281948 write(167, "...", 15525) = 15525
23:30:41.282025 read(72, "...", 15923) = 8184
23:30:41.282076 read(72, "...", 7739) = 1364
23:30:41.282119 read(72, 0x7feeaebf89c1, 6375) = -1 EAGAIN (Resource temporarily unavailable)
23:30:41.282162 read(24, "...", 1837) = 1837
23:30:41.282278 sendto(107, "...", 16384, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_MORE, NULL, 0) = 16384
23:30:41.282328 recvfrom(41, "...", 16384, 0, NULL, NULL) = 16384
23:30:41.282382 recvfrom(81, "...", 15360, 0, NULL, NULL) = 214
23:30:41.282438 write(21, "...", 389) = 389
23:30:41.282497 write(25, "...", 389) = 389
23:30:41.282563 write(25, "...", 53) = 53
23:30:41.282613 shutdown(25, SHUT_WR) = 0
23:30:41.282660 read(18, 0x7feeae813be3, 5) = -1 EAGAIN (Resource temporarily unavailable)
23:30:41.282704 sendto(92, "...", 818, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 818
23:30:41.282753 read(39, 0x7feeae813be3, 5) = -1 EAGAIN (Resource temporarily unavailable)
23:30:41.282796 sendto(88, "...", 2062, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 2062
23:30:41.282944 getsockname(33, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("Some-IP")}, [16]) = 0
23:30:41.283008 getsockopt(33, SOL_IP, 0x50 /* IP_??? */, "\2\0\1\273\n\31\220\17\0\0\0\0\0\0\0\0", [16]) = 0
23:30:41.283082 socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 77
23:30:41.283132 fcntl(77, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
23:30:41.283188 setsockopt(77, SOL_TCP, TCP_NODELAY, [1], 4) = 0
23:30:41.283233 connect(77, {sa_family=AF_INET, sin_port=htons(7300), sin_addr=inet_addr("Some-IP")}, 16) = -1 EINPROGRESS (Operation now in progress)
23:30:41.283415 getsockname(45, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("Some-IP")},
Rate limiting with multiple haproxy servers
Hi, We have multiple haproxy servers receiving traffic from our firewall, we want to apply some rate limiting that takes into account counters from all the haproxy servers. I am testing this with 1.6.4 and I tried the peer feature, but not able to get it to work. I understand that counter aggregation does not happen, but even replication doesn¹t seem to be working for me. Conf: Peers article peer haproxy1 127.0.0.1:11023 peer haproxy2 127.0.0.1:11024 global stats socket /tmp/haproxy.sock mode 600 level admin #maxconn 3000 #maxconn 1 defaults log 127.0.0.1 local1 option httplog mode http timeout server 120s timeout queue 1000s timeout client 1200s # CLient Inactive time timeout connect 100s # timeout for server connection timeout check 500s # timeout for server check pings maxconn 1 retries 2 option redispatch option http-server-close frontend haproxy1_l2 mode http option forwardfor capture cookie egnyte-proxy len 32 capture request header host len 32 bind *:1443 ssl crt /home/egnyte/haproxy/conf/key.pem crt /home/egnyte/haproxy/conf/certs tcp-request inspect-delay 5s tcp-request content accept if { req_ssl_hello_type 1 } stick-table type string size 1M expire 10m store conn_cur peers article acl is_range hdr_sub(Range) bytes= acl is_path_throttled path_beg /public-api/v1/fs-content-download acl is_path_throttled path_end /get_file acl is_path_throttled path_beg /wsgi/print_headers.py #tcp-request content track-sc1 base32 if is_range is_path_throttled http-request set-header X-track %[url] http-request track-sc1 req.hdr(X-track) if is_range is_path_throttled http-request deny if { sc1_conn_cur gt 2 } is_range is_path_throttled default_backend apache_l1 backend apache_l1 mode http maxconn 1 reqadd X-Haproxy-L1:\ true server apache_l1 127.0.0.1:80 Is there any other way to have rate limiting that can track the counters across haproxy servers? 
How about seeding counters into redis using lua and then reading them back to rate limit? Is it even feasible? I have not looked at it in detail yet, just wanted to see if somebody has tried something similar.

Thanks
Sachin
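For what it's worth, a very common reason peer replication appears to do nothing is a peer-name mismatch: haproxy only treats a `peer` line as the local one if its name matches the machine's hostname, or the name passed with `-L` on the command line. A minimal sketch of a working layout (addresses and hostnames here are invented for illustration, untested):

```
# start each process with: haproxy -f haproxy.cfg -L haproxy1  (resp. haproxy2)
peers article
    peer haproxy1 192.0.2.1:11023
    peer haproxy2 192.0.2.2:11024

frontend fe
    bind *:80
    # entries are replicated between peers (useful across reloads),
    # but live counters are NOT aggregated across nodes
    stick-table type string size 1m expire 10m store conn_cur peers article
```

Replication pushes table entries to the other peers; each node still increments its own counters, so a global limit has to account for the number of nodes.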
Re: Haproxy running on 100% CPU and slow downloads
Agree with both points.

Thanks
Sachin

On 4/7/16, 11:24 PM, "Willy Tarreau" <w...@1wt.eu> wrote:
>On Thu, Apr 07, 2016 at 10:59:24PM +0530, Sachin Shetty wrote:
>> Hi Willy,
>>
>> Sorry for the confusion. I wrote to you much before in my
>> investigation. I will take care going forward.
>
>OK, but in general the point remains, and it's not just for you but for
>everyone: the mailing list is here to reach around 1000 persons
>at once, so once your message is posted, you have to keep in mind that
>several of them will start to think about your problem even if they don't
>respond, which is why it is very important to be transparent about any
>progress made on parallel investigation or parallel contacts. Just like
>when you ask something of two distinct coworkers: one gives you a fast
>response, the other one comes the next day and says "I set up a lab
>yesterday to check what you asked me and I found this last night". You'll
>feel bad telling him "Oh, I already got the response, thank you anyway".
>
>> Only now I realized that I messed up the version numbers because it
>> seems we have different versions in our cluster.
>
>OK, similarly there's nothing wrong with making errors in bug reports, we
>all do this because we test lots of stuff and we end up confusing things.
>But once you notice something was wrong, simply respond again and fix the
>information. Reliable version information helps eliminate candidate
>patches and also helps people joining saying "same problem here".
>
>> We are now testing with 1.6.4 and trying to fast track it.
>
>OK, thanks for the feedback!
>
>Willy
Re: Haproxy running on 100% CPU and slow downloads
Hi Willy,

Sorry for the confusion. I wrote to you much before in my investigation. I will take care going forward.

Only now I realized that I messed up the version numbers, because it seems we have different versions in our cluster. We are now testing with 1.6.4 and trying to fast track it.

Thanks
Sachin

On 4/7/16, 6:31 PM, "Willy Tarreau" <w...@1wt.eu> wrote:
>Hi Sachin,
>
>On Thu, Apr 07, 2016 at 02:21:16PM +0200, Lukas Tribus wrote:
>> Hi,
>>
>> Am 05.04.2016 um 09:38 schrieb Sachin Shetty:
>> >Hi Lukas, Pavlos,
>> >
>> >Thanks for your response, more info as requested.
>> >
>> >1. Attached conf with some obfuscation
>> >2. Haproxy -vv
>> >HA-Proxy version 1.5.4 2014/09/02
>> >Copyright 2000-2014 Willy Tarreau <w...@1wt.eu>
>>
>> I would upgrade to something more recent, the number of bugfixes
>> since 1.5.4 amounts to more than 100!
>(...)
>
>I'm just discovering that you opened this thread twice in parallel,
>once with me in private and once with the ML, resulting in everyone
>doing the work twice and giving you the same advice twice. Please
>avoid this in the future: it wastes everyone's time and discourages
>people from responding to such questions. The place to ask is the ML,
>and if you contact someone privately, please at least point to the
>public question so that the response is public and it saves others'
>valuable time.
>
>Also the version you reported to me was different:
>
>  HA-Proxy version 1.5.9 2014/11/25
>
>Thanks,
>Willy
Re: Haproxy running on 100% CPU and slow downloads
Hi Lukas, Pavlos,

Thanks for your response, more info as requested.

1. Attached conf with some obfuscation

2. Haproxy -vv

HA-Proxy version 1.5.4 2014/09/02
Copyright 2000-2014 Willy Tarreau

Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -DTCP_USER_TIMEOUT=18
  OPTIONS = USE_LINUX_TPROXY=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.7
Compression algorithms supported : identity, deflate, gzip
Built with OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
Running on OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.32 2012-11-30
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND

Available polling systems :
      epoll : pref=300, test result OK
       poll : pref=200, test result OK
     select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.

3. uname -a

Linux avl-www10.dc.egnyte.lan 3.10.0-327.10.1.el7.x86_64 #1 SMP Tue Feb 16 17:03:50 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
[sshetty@avl-www10 haproxy_l1_sync]$

4. rfc5077-client seems ok

[✔] Prepare tests.
[✔] Run tests without use of tickets.
[✔] Display result set:

│ IP address    │ Try │ Cipher               │ Reuse │ SSL Session ID      │ Master key          │ Ticket │ Answer          │
│───────────────┼─────┼──────────────────────┼───────┼─────────────────────┼─────────────────────┼────────┼─────────────────│
│ 208.83.105.14 │ 0   │ ECDHE-RSA-AES256-SHA │ ✘     │ 40A2D3E903C2457551… │ B4A08BB73457356AA2… │ ✘      │ HTTP/1.1 200 OK │
│ 208.83.105.14 │ 1   │ ECDHE-RSA-AES256-SHA │ ✔     │ 40A2D3E903C2457551… │ B4A08BB73457356AA2… │ ✘      │ HTTP/1.1 200 OK │
│ 208.83.105.14 │ 2   │ ECDHE-RSA-AES256-SHA │ ✔     │ 40A2D3E903C2457551… │ B4A08BB73457356AA2… │ ✘      │ HTTP/1.1 200 OK │
│ 208.83.105.14 │ 3   │ ECDHE-RSA-AES256-SHA │ ✔     │ 40A2D3E903C2457551… │ B4A08BB73457356AA2… │ ✘      │ HTTP/1.1 200 OK │
│ 208.83.105.14 │ 4   │ ECDHE-RSA-AES256-SHA │ ✔     │ 40A2D3E903C2457551… │ B4A08BB73457356AA2… │ ✘      │ HTTP/1.1 200 OK │

[✔] Dump results to file.
[✔] Run tests with use of tickets.
[✔] Display result set:

│ IP address    │ Try │ Cipher               │ Reuse │ SSL Session ID      │ Master key          │ Ticket │ Answer          │
│───────────────┼─────┼──────────────────────┼───────┼─────────────────────┼─────────────────────┼────────┼─────────────────│
│ 208.83.105.14 │ 0   │ ECDHE-RSA-AES256-SHA │ ✘     │ E4559330FD100E69F5… │ 05F768F5574FD27E88… │ ✔      │ HTTP/1.1 200 OK │
│ 208.83.105.14 │ 1   │ ECDHE-RSA-AES256-SHA │ ✔     │ E4559330FD100E69F5… │ 05F768F5574FD27E88… │ ✔      │ HTTP/1.1 200 OK │
│ 208.83.105.14 │ 2   │ ECDHE-RSA-AES256-SHA │ ✔     │ E4559330FD100E69F5… │ 05F768F5574FD27E88… │ ✔      │ HTTP/1.1 200 OK │
│ 208.83.105.14 │ 3   │ ECDHE-RSA-AES256-SHA │ ✔     │ E4559330FD100E69F5… │ 05F768F5574FD27E88… │ ✔      │ HTTP/1.1 200 OK │
│ 208.83.105.14 │ 4   │ ECDHE-RSA-AES256-SHA │ ✔     │ E4559330FD100E69F5… │ 05F768F5574FD27E88… │ ✔      │ HTTP/1.1 200 OK │

[✔] Dump results to file.

On 4/5/16, 12:14 AM, "Lukas Tribus" wrote:
>Hi Sachin,
>
>(due to email troubles on my side this may look like a new thread, sorry
>about that)
>
>> We have quite a few regex and acls in our config, is there a way to
>> profile haproxy and see what could be slowing it down?
>
>You can use strace for syscalls or ltrace for library calls to see if
>something in particular shows up, but perf may be the better tool for
>this job (I never used it though).
>Like Pavlos said, let's collect some basic information first:
>
>- haproxy -vv output
>- uname -a
>- configuration (replace proprietary information but leave everything
>  else intact)
>- does TLS resumption correctly work? Check with rfc5077-client:
>
>git clone https://github.com/vincentbernat/rfc5077.git
>cd rfc5077
>make rfc5077-client
>
>./rfc5077-client
>
>There's a chance that it is SSL/TLS related.
>
>Regards,
>Lukas

haproxy.sync.conf
Description: Binary data
Haproxy running on 100% CPU and slow downloads
Hi,

I am chasing some weird capacity issues in our setup. Haproxy, which also does SSL, is forwarding requests to various other servers upstream. I am seeing a simple 100MB file download from our upstream components start to slow down from time to time, hitting as low as 1MBps, when usually it is greater than 100MBps. When this happens, I tried downloading the file from the upstream component bypassing haproxy from the same box, and that is fast enough (100MBps). So it seems like haproxy is getting jammed on something.

The only suspicious thing I see is that haproxy will be spinning at 100% CPU. So we added nbproc 4 and I still see the same pattern: when the speed drops, all haproxy processes are hitting 80-100%. The request rate when the speed drops is about 5K/minute, which is only 2X the rate when things are normal and download speeds are fine.

We have quite a few regex and acls in our config; is there a way to profile haproxy and see what could be slowing it down?

Thanks
Sachin
Haproxy and http chunked trailers
Hi,

We have started using HTTP trailers in HTTP chunked requests. HTTP trailers are pretty well defined in the spec but seem not widely used. We have haproxy forwarding the trailers to Apache Tomcat and it is all working fine; I just wanted to confirm from the group that it is working by design and won't stop working in some future release :)

Our request looks like this:

telnet somehost 80
POST /some-path HTTP/1.1
Authorization: Basic =
Host: somehost.domain.com
Transfer-Encoding: chunked
Trailer: My-Test-Trailer

50
111
1
0
My-Test-Trailer: some-value-new

As I said, the trailer My-Test-Trailer is forwarded to the backends and all good as of now.

Thanks
Sachin
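For anyone wanting to reproduce such a request programmatically, here is a small sketch (not from the original thread) that assembles a chunked body carrying a trailer field, following the chunked framing of RFC 7230; the payload and trailer values are invented for illustration:

```python
# Build an HTTP/1.1 chunked message body with a trailer section:
#   <chunk-size hex>\r\n<chunk-data>\r\n ... 0\r\n<trailer fields>\r\n
def chunked_with_trailer(payload: bytes, trailer_name: str, trailer_value: str) -> bytes:
    chunk = b"%X\r\n%s\r\n" % (len(payload), payload)   # one data chunk
    last = b"0\r\n"                                     # zero-size chunk ends the body
    trailer = trailer_name.encode() + b": " + trailer_value.encode() + b"\r\n"
    return chunk + last + trailer + b"\r\n"             # final CRLF closes the message

body = chunked_with_trailer(b"hello", "My-Test-Trailer", "some-value-new")
```

Sending `body` after the request headers (with `Transfer-Encoding: chunked` and a `Trailer:` header announcing the field) gives a request equivalent to the telnet session above.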
Re: Haproxy and http chunked trailers
Well we are only going to use it for incoming uploads APIs, so as long as somebody can make a post request using some client library or handcoded http request, we are fine. We won’t be generating any trailers ourselves in the response. Thanks Sachin On 7/22/15, 5:38 PM, Vincent Bernat ber...@luffy.cx wrote: ❦ 22 juillet 2015 17:22 +0530, Sachin Shetty sshe...@egnyte.com : We have started using Http trailers in http chunked request. Http trailers are pretty well defined in the spec but seems like not widely used. Are they supported by browsers? Last time I checked, this was not the case (at least for the Cookies trailer for example). -- Lord, what fools these mortals be! -- William Shakespeare, A Midsummer-Night's Dream
Re: Haproxy and http chunked trailers
Thanks Willy. Yeah trailers are rarely used and I am having a tough time making it work in Apache web server. Thanks for taking care of it in Haproxy from the start. :) On 7/22/15, 6:22 PM, Willy Tarreau w...@1wt.eu wrote: Hi Sachin, On Wed, Jul 22, 2015 at 05:22:00PM +0530, Sachin Shetty wrote: Hi, We have started using Http trailers in http chunked request. Http trailers are pretty well defined in the spec but seems like not widely used. We have haproxy forwarding the trailers to Apache tomcat and it is all working fine, I just wanted to confirm from the group that it is working by design and won¹t stop working in some future release :) Hehe that's a fun way to help spot future regressions :-) You should have specified the exact version you tested with. That said, chunked encoding was initially implemented with trailers support in both directions. That's typically the sort of thing you don't want to try to introduce later as it breaks the state machine and becomes much harder to do later than to do initially. So I was pretty sure it used to work, though I must confess I don't test them often :-) Cheers, Willy
Re: Limiting concurrent range connections
Tried it, I don't see the table populating at all.

stick-table type string size 1M expire 10m store conn_cur
acl is_range hdr_sub(Range) bytes=
acl is_path_throttled path_beg /public-api/v1/fs-content-download
#tcp-request content track-sc1 base32 if is_range is_path_throttled
http-request set-header X-track %[url]
tcp-request content track-sc1 req.hdr(X-track) if is_range is_path_throttled
http-request deny if { sc1_conn_cur gt 2 } is_range is_path_throttled

(egnyte_server)egnyte@egnyte-laptop:~$ echo show table haproxy_l2 | socat /tmp/haproxy.sock stdio
# table: haproxy_l2, type: string, size:1048576, used:0
(egnyte_server)egnyte@egnyte-laptop:~$

On 6/3/15 8:36 PM, Baptiste bed...@gmail.com wrote:

Yes, the url sample copies the whole URL as sent by the client. Simply give it a try on a staging server and let us know the status.

Baptiste

On Wed, Jun 3, 2015 at 3:19 PM, Sachin Shetty sshe...@egnyte.com wrote:

Thanks Baptiste - Will http-request set-header X-track %[url] help me track URLs with query parameters as well?

On 6/3/15 6:36 PM, Baptiste bed...@gmail.com wrote:

On Wed, Jun 3, 2015 at 2:17 PM, Sachin Shetty sshe...@egnyte.com wrote:

Hi,

I am trying to write some throttles that would limit concurrent connections for Range requests + specific URLs. For example, I want to allow only 2 concurrent range requests downloading a file /public-api/v1/fs-content-download

I have a working rule:

stick-table type string size 1M expire 10m store conn_cur
tcp-request inspect-delay 5s
acl is_range hdr_sub(Range) bytes=
acl is_path_throttled path_beg /public-api/v1/fs-content-download
tcp-request content track-sc1 base32 if is_range is_path_throttled
http-request deny if { sc1_conn_cur gt 2 } is_range is_path_throttled

Just wanted to see if there is a better way of doing this? Is this efficient enough? I need to include the query string as well in my tracker, but I could not figure that out.
Thanks
Sachin

Hi Sachin,

I would do it like this:

stick-table type string size 1M expire 10m store conn_cur
tcp-request inspect-delay 5s
tcp-request content accept if HTTP
acl is_range hdr_sub(Range) bytes=
acl is_path_throttled path_beg /public-api/v1/fs-content-download
http-request set-header X-track %[url]
http-request track-sc1 req.hdr(X-track) if is_range is_path_throttled
http-request deny if { sc1_conn_cur gt 2 } is_range is_path_throttled

There might be some typo, but you get the idea.

Baptiste
Re: Limiting concurrent range connections
I did try it; it needs 1.6-dev1, and that version segfaults as soon as the request is made:

(egnyte_server)egnyte@egnyte-laptop:~/haproxy$ ~/haproxy/sbin/haproxy -f conf/haproxy.conf -d
[WARNING] 154/044207 (24974) : Setting tune.ssl.default-dh-param to 1024 by default, if your workload permits it you should set it to at least 2048. Please set a value >= 1024 to make this warning disappear.
Note: setting global.maxconn to 2000.
Available polling systems :
      epoll : pref=300, test result OK
       poll : pref=200, test result OK
     select : pref=150, test result FAILED
Total: 3 (2 usable), will use epoll.
Using epoll() as the polling mechanism.
:haproxy_l2.accept(0005)=0009 from [192.168.56.102:50119]
Segmentation fault

Thanks
Sachin

On 6/4/15 3:45 PM, Baptiste bed...@gmail.com wrote:

Hi Sachin,

Look at my conf, I turned your tcp-request content statement into http-request.

Baptiste

On Thu, Jun 4, 2015 at 12:05 PM, Sachin Shetty sshe...@egnyte.com wrote:

Tried it, I don't see the table populating at all.

stick-table type string size 1M expire 10m store conn_cur
acl is_range hdr_sub(Range) bytes=
acl is_path_throttled path_beg /public-api/v1/fs-content-download
#tcp-request content track-sc1 base32 if is_range is_path_throttled
http-request set-header X-track %[url]
tcp-request content track-sc1 req.hdr(X-track) if is_range is_path_throttled
http-request deny if { sc1_conn_cur gt 2 } is_range is_path_throttled

(egnyte_server)egnyte@egnyte-laptop:~$ echo show table haproxy_l2 | socat /tmp/haproxy.sock stdio
# table: haproxy_l2, type: string, size:1048576, used:0
(egnyte_server)egnyte@egnyte-laptop:~$

On 6/3/15 8:36 PM, Baptiste bed...@gmail.com wrote:

Yes, the url sample copies the whole URL as sent by the client. Simply give it a try on a staging server and let us know the status.

Baptiste

On Wed, Jun 3, 2015 at 3:19 PM, Sachin Shetty sshe...@egnyte.com wrote:

Thanks Baptiste - Will http-request set-header X-track %[url] help me track URLs with query parameters as well?
On 6/3/15 6:36 PM, Baptiste bed...@gmail.com wrote:

On Wed, Jun 3, 2015 at 2:17 PM, Sachin Shetty sshe...@egnyte.com wrote:

Hi,

I am trying to write some throttles that would limit concurrent connections for Range requests + specific URLs. For example, I want to allow only 2 concurrent range requests downloading a file /public-api/v1/fs-content-download

I have a working rule:

stick-table type string size 1M expire 10m store conn_cur
tcp-request inspect-delay 5s
acl is_range hdr_sub(Range) bytes=
acl is_path_throttled path_beg /public-api/v1/fs-content-download
tcp-request content track-sc1 base32 if is_range is_path_throttled
http-request deny if { sc1_conn_cur gt 2 } is_range is_path_throttled

Just wanted to see if there is a better way of doing this? Is this efficient enough? I need to include the query string as well in my tracker, but I could not figure that out.

Thanks
Sachin

Hi Sachin,

I would do it like this:

stick-table type string size 1M expire 10m store conn_cur
tcp-request inspect-delay 5s
tcp-request content accept if HTTP
acl is_range hdr_sub(Range) bytes=
acl is_path_throttled path_beg /public-api/v1/fs-content-download
http-request set-header X-track %[url]
http-request track-sc1 req.hdr(X-track) if is_range is_path_throttled
http-request deny if { sc1_conn_cur gt 2 } is_range is_path_throttled

There might be some typo, but you get the idea.

Baptiste
Limiting concurrent range connections
Hi,

I am trying to write some throttles that would limit concurrent connections for Range requests + specific URLs. For example, I want to allow only 2 concurrent range requests downloading a file /public-api/v1/fs-content-download

I have a working rule:

stick-table type string size 1M expire 10m store conn_cur
tcp-request inspect-delay 5s
acl is_range hdr_sub(Range) bytes=
acl is_path_throttled path_beg /public-api/v1/fs-content-download
tcp-request content track-sc1 base32 if is_range is_path_throttled
http-request deny if { sc1_conn_cur gt 2 } is_range is_path_throttled

Just wanted to see if there is a better way of doing this? Is this efficient enough? I need to include the query string as well in my tracker, but I could not figure that out.

Thanks
Sachin
Re: Limiting concurrent range connections
Thanks Baptiste - Will http-request set-header X-track %[url] help me track URLs with query parameters as well?

On 6/3/15 6:36 PM, Baptiste bed...@gmail.com wrote:

On Wed, Jun 3, 2015 at 2:17 PM, Sachin Shetty sshe...@egnyte.com wrote:

Hi,

I am trying to write some throttles that would limit concurrent connections for Range requests + specific URLs. For example, I want to allow only 2 concurrent range requests downloading a file /public-api/v1/fs-content-download

I have a working rule:

stick-table type string size 1M expire 10m store conn_cur
tcp-request inspect-delay 5s
acl is_range hdr_sub(Range) bytes=
acl is_path_throttled path_beg /public-api/v1/fs-content-download
tcp-request content track-sc1 base32 if is_range is_path_throttled
http-request deny if { sc1_conn_cur gt 2 } is_range is_path_throttled

Just wanted to see if there is a better way of doing this? Is this efficient enough? I need to include the query string as well in my tracker, but I could not figure that out.

Thanks
Sachin

Hi Sachin,

I would do it like this:

stick-table type string size 1M expire 10m store conn_cur
tcp-request inspect-delay 5s
tcp-request content accept if HTTP
acl is_range hdr_sub(Range) bytes=
acl is_path_throttled path_beg /public-api/v1/fs-content-download
http-request set-header X-track %[url]
http-request track-sc1 req.hdr(X-track) if is_range is_path_throttled
http-request deny if { sc1_conn_cur gt 2 } is_range is_path_throttled

There might be some typo, but you get the idea.

Baptiste
Setting compression for specific request paths
Hi,

I see that we can set the compression type on a frontend or backend. Due to some application-level complications, we want haproxy to not compress specific request paths, for example /api, and compress the rest as usual. Any idea how this can be done? One way would be to route the requests through a different backend and disable compression there, but that would be an ugly config to maintain.

Thanks
Sachin
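One possible approach (an untested sketch; the acl name and addresses are invented): haproxy only compresses responses when the client advertised Accept-Encoding, so deleting that header on matching requests opts those paths out of compression while everything else stays compressed:

```
frontend fe
    bind *:80
    acl is_api path_beg /api
    # no Accept-Encoding on the request => haproxy won't compress
    # the response (nor will a well-behaved backend)
    http-request del-header Accept-Encoding if is_api
    default_backend apps

backend apps
    compression algo gzip
    compression type text/html text/plain application/json
    server app1 127.0.0.1:8080
```

On versions without `http-request del-header`, the older `reqidel ^Accept-Encoding: ... if is_api` rewrite rule should achieve the same effect. Note this also prevents the backend itself from compressing those responses, which may or may not be desirable.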
Re: Performance implications of using dynamic maps
Hi Willy,

I need one more clarification; I need the value in multiple acls:

acl is_a_v-1 hdr(host),map(/opt/haproxy/current/conf/proxy.map) a_v-1
acl is_a_v-2 hdr(host),map(/opt/haproxy/current/conf/proxy.map) a_v-2
acl is_a_v-3 hdr(host),map(/opt/haproxy/current/conf/proxy.map) a_v-3
..
..
acl is_a_v-10 hdr(host),map(/opt/haproxy/current/conf/proxy.map) a_v-10

Is there a way I could look up once and use the value in multiple acls? Unfortunately I cannot refer to an acl in another acl's conditions, which would have worked for me.

Thanks
Sachin

On 12/2/14 2:15 PM, Sachin Shetty sshe...@egnyte.com wrote:

Thanks a lot Willy. Yes, I tried my luck with stick tables, but could not find a way to store a key-value mapping for 1000s of host names. I will move this to testing, thanks for your help as always :)

Thanks
Sachin

On 12/2/14 1:01 PM, Willy Tarreau w...@1wt.eu wrote:

Hi Sachin,

On Sat, Nov 29, 2014 at 04:19:54PM +0530, Sachin Shetty wrote:

Hi,

In our architecture, we have thousands of host names resolving to a single haproxy; we dynamically decide a sticky backend based on our own custom sharding. To determine the shard info, we let the request flow in to a default apache proxy that processes the request and also responds with the shard info. To be able to serve the subsequent requests directly, bypassing the apache, we want to store the shard info received in the first request in a map and use it for subsequent requests.

1. Store the shard info from apache:

backend apache_l1
  mode http
  http-response set-map(/opt/haproxy/current/conf/proxy.map) %[res.hdr(X-Request-Host)] %[res.hdr(X-Backend-Id)]
  server apache_l1 IP:80

2. Use the backend directly for subsequent requests:

acl is_a_v-1 hdr(host),map(/opt/haproxy/current/conf/proxy.map) a_v-1
use_backend l2_haproxy if is_a_v-1

I have tested this config and it works well, but I am not sure about the performance.
For every request sent to Apache, we will be adding a key-value pair to the map, and we will be looking up the key's value for every request coming in to haproxy; is that ok considering this is a very high performance stack? The haproxy servers are pretty powerful and dedicated to just doing proxy.

Here you're using string-to-string mapping, which is one of the cheapest ones since there's no conversion of text to patterns. The string lookups are performed in a few tens of nanoseconds, so that does not count.

The update here will require :
- building a new key : log-format + strdup(result)
- building a new value : log-format + strdup(new)
- lookup of the key in the tree
- replacement or insertion of the key in the tree
- free(old_key)
- free(old_value)

I suspect that below 10-2 req/s you will not notice a significant difference. Above that it can cost a few percent extra CPU usage.

It's interesting to see that you have basically reimplemented stickiness using maps :-)

Regards,
Willy
Re: Performance implications of using dynamic maps
Thanks Willy, I need to do more than just pick a backend. So you feel even with a map of 10K keys, multiple lookups should be ok?

Thanks
Sachin

On 12/8/14 6:15 PM, Willy Tarreau w...@1wt.eu wrote:

Hi Sachin,

On Mon, Dec 08, 2014 at 06:04:35PM +0530, Sachin Shetty wrote:

Hi Willy,

I need one more clarification; I need the value in multiple acls:

acl is_a_v-1 hdr(host),map(/opt/haproxy/current/conf/proxy.map) a_v-1
acl is_a_v-2 hdr(host),map(/opt/haproxy/current/conf/proxy.map) a_v-2
acl is_a_v-3 hdr(host),map(/opt/haproxy/current/conf/proxy.map) a_v-3
..
..
acl is_a_v-10 hdr(host),map(/opt/haproxy/current/conf/proxy.map) a_v-10

is there a way I could look up once and use the value in multiple acls?

There would be an option for this. Using a capture would permit having a temporary variable containing the result of your map. Something like this, approximately :

tcp-request inspect-delay 10s
tcp-request capture %[hdr(host),map(/opt/haproxy/current/conf/proxy.map)] len 40

Then your ACLs can refer to capture.req.hdr(0) (assuming it's the first capture rule) :

acl is_a_v-1 capture.req.hdr(0) a_v-1
acl is_a_v-2 capture.req.hdr(0) a_v-2
acl is_a_v-3 capture.req.hdr(0) a_v-3
...

Note that when using rules like yours above (string-to-string mapping), the lookup is very fast; only the header extraction costs a little bit, so you should not be worried by these few rules. If you were to use case-insensitive match or regex match, it would be different and you'd really need this optimization.

If you only want to use these rules to select a proper backend, you could also use dynamic use_backend rules (but please carefully read the doc about use_backend and maps for the details) :

use_backend %[hdr(host),map(proxy.map)]

And you don't need any acl anymore, and everything is done in a single lookup.

Regards,
Willy
Re: Performance implications of using dynamic maps
Thanks a lot Willy. Yes, I tried my luck with stick tables, but could not find a way to store a key-value mapping for 1000s of host names. I will move this to testing, thanks for your help as always :)

Thanks
Sachin

On 12/2/14 1:01 PM, Willy Tarreau w...@1wt.eu wrote:

Hi Sachin,

On Sat, Nov 29, 2014 at 04:19:54PM +0530, Sachin Shetty wrote:

Hi,

In our architecture, we have thousands of host names resolving to a single haproxy; we dynamically decide a sticky backend based on our own custom sharding. To determine the shard info, we let the request flow in to a default apache proxy that processes the request and also responds with the shard info. To be able to serve the subsequent requests directly, bypassing the apache, we want to store the shard info received in the first request in a map and use it for subsequent requests.

1. Store the shard info from apache:

backend apache_l1
  mode http
  http-response set-map(/opt/haproxy/current/conf/proxy.map) %[res.hdr(X-Request-Host)] %[res.hdr(X-Backend-Id)]
  server apache_l1 IP:80

2. Use the backend directly for subsequent requests:

acl is_a_v-1 hdr(host),map(/opt/haproxy/current/conf/proxy.map) a_v-1
use_backend l2_haproxy if is_a_v-1

I have tested this config and it works well, but I am not sure about the performance. For every request sent to Apache, we will be adding a key-value pair to the map, and we will be looking up the key's value for every request coming in to haproxy; is that ok considering this is a very high performance stack? The haproxy servers are pretty powerful and dedicated to just doing proxy.

Here you're using string-to-string mapping, which is one of the cheapest ones since there's no conversion of text to patterns. The string lookups are performed in a few tens of nanoseconds, so that does not count.
The update here will require :
- building a new key : log-format + strdup(result)
- building a new value : log-format + strdup(new)
- lookup of the key in the tree
- replacement or insertion of the key in the tree
- free(old_key)
- free(old_value)

I suspect that below 10-2 req/s you will not notice a significant difference. Above that it can cost a few percent extra CPU usage.

It's interesting to see that you have basically reimplemented stickiness using maps :-)

Regards,
Willy
Re: http-keep-alive with SSL backend
Thanks Cyril, but no luck, I still see no connection reuse. For every new connection from the same client, haproxy makes a new connection to the server and terminates it right after.

Lukas, as per the documentation, the 1.5 dev version does support server side pooling:
http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#4.2-option%20http-keep-alive

Setting "option http-keep-alive" enables HTTP keep-alive mode on the client- and server- sides. This provides the lowest latency on the client side (slow network) and the fastest session reuse on the server side at the expense of maintaining idle connections to the servers. In general, it is possible with this option to achieve approximately twice the request rate that the "option http-server-close" option achieves on small objects. There are mainly two situations where this option may be useful :

- when the server is non-HTTP compliant and authenticates the connection instead of requests (eg: NTLM authentication)
- when the cost of establishing the connection to the server is significant compared to the cost of retrieving the associated object from the server.

On 11/30/14 4:58 PM, Cyril Bonté cyril.bo...@free.fr wrote:

Hi all,

Le 30/11/2014 11:54, Lukas Tribus a écrit :

Hi Sachin,

Hi,

We have SSL backends which are remote, so we want to use http-keep-alive to pool connections

Connection pooling/multiplexing is simply not (yet) supported. It is therefore expected behavior that 1 frontend connection equals 1 backend connection.

Sachin, as your configuration seems to not provide any sticky sessions (cookies, load balancing algorithm, nor stick tables), I believe you're in round robin. You can try to add "option prefer-last-server"; this will try to reuse the server connection for a same client connection.
http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#option%20prefer-last-server

--
Cyril Bonté
Re: http-keep-alive with SSL backend
Thanks Cyril, appreciate your help on this. I will take this up internally on how we could work around it. Thanks again.

Thanks
Sachin

On 11/30/14 5:47 PM, Cyril Bonté cyril.bo...@free.fr wrote:

Hi again Sachin,

Le 30/11/2014 13:01, Sachin Shetty a écrit :

Thanks Cyril, but no luck, I still see no connection reuse. For every new connection from the same client, haproxy makes a new connection to the server and terminates it right after.

Then, ensure that it can't be due to an explicit behaviour asked by the client or the server.

Lukas, as per the documentation, the 1.5 dev version does support server side pooling.
http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#4.2-option%20http-keep-alive

No, Lukas is right, there's no pooling yet in haproxy. haproxy will only try to reuse the previous server-side connection for the same client connection. This is not a pool of connections. Once the client closes its connection, there won't be any connection reuse on the server side.

--
Cyril Bonté
Performance implications of using dynamic maps
Hi,

In our architecture, we have thousands of host names resolving to a single haproxy; we dynamically decide a sticky backend based on our own custom sharding. To determine the shard info, we let the request flow in to a default apache proxy that processes the request and also responds with the shard info. To be able to serve the subsequent requests directly, bypassing the apache, we want to store the shard info received in the first request in a map and use it for subsequent requests.

1. Store the shard info from apache:

backend apache_l1
  mode http
  http-response set-map(/opt/haproxy/current/conf/proxy.map) %[res.hdr(X-Request-Host)] %[res.hdr(X-Backend-Id)]
  server apache_l1 IP:80

2. Use the backend directly for subsequent requests:

acl is_a_v-1 hdr(host),map(/opt/haproxy/current/conf/proxy.map) a_v-1
use_backend l2_haproxy if is_a_v-1

I have tested this config and it works well, but I am not sure about the performance. For every request sent to Apache, we will be adding a key-value pair to the map, and we will be looking up the key's value for every request coming in to haproxy; is that ok considering this is a very high performance stack? The haproxy servers are pretty powerful and dedicated to just doing proxy.

Thanks
Sachin
http-keep-alive with SSL backend
Hi,

We have SSL backends which are remote, so we want to use http-keep-alive to pool connections to the SSL backends; however, it does not seem to be working:

backend qa
  option http-keep-alive
  timeout http-keep-alive 30s
  server qa IP:443 maxconn 100 ssl verify none

I am monitoring connections to the backend using netstat, and I see the connection to the backend immediately drops as soon as my front end request finishes. Any suggestions? Is it because of the SSL backend?

Thanks
Sachin
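As the replies in this thread explain, haproxy 1.5 has no server-side connection pool; the closest available behavior is reusing the last server connection within a single client connection. A sketch of the backend with that option added (the server address is a placeholder, untested):

```
backend qa
    option http-keep-alive
    option prefer-last-server
    timeout http-keep-alive 30s
    server qa 192.0.2.10:443 maxconn 100 ssl verify none
```

Even with this, the server-side connection is closed once the client closes its own connection, so clients that don't keep their connections alive will still trigger a fresh TLS handshake per request.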
Re: How to redispatch a request after queue timeout
Thanks Willy. I am precisely using it for caching. I need requests to go to the same nodes for cache hits, but when the node is already swamped I would prefer a cache miss over a 503.

Thanks
Sachin

On 8/31/13 12:57 PM, Willy Tarreau w...@1wt.eu wrote:

On Fri, Aug 30, 2013 at 02:10:50PM +0530, Sachin Shetty wrote:

Thanks Lukas. Yes, I was hoping to work around this by setting a smaller maxqueue limit and queue timeout. So what other options do we have? I need to:

1. Send all requests for a host (mytest.mydomain.com) to one backend as long as it can serve.
2. If the backend is swamped, it should go to any other backend available.

I'm wondering if we should not try to implement this when the hash type is set to consistent. The principle of the consistent hash precisely is that we want the closest node, but we know that sometimes things will be slightly redistributed (eg: when adding/removing a server in the farm). So maybe it would make sense to specify that when using consistent hash, if a server has a maxqueue parameter and this maxqueue is reached, then look for the closest server. That might be OK with caches as well, as the ones close to each other tend to share a few objects when the farm size changes.

What do others think ?

Willy
Re: How to redispatch a request after queue timeout
We did try consistent hashing, but I found better distribution without it. We don't add or remove servers often so we should be ok. Our total pool is sized correctly and we are able to serve 100% of requests when we use roundrobin, however sticky on host is what causes some nodes to hit maxconn. My goal is to never send a 503 as long as we have other nodes available, which is always the case in our pool. Thanks Sachin On 8/31/13 1:17 PM, Willy Tarreau w...@1wt.eu wrote: On Sat, Aug 31, 2013 at 01:03:34PM +0530, Sachin Shetty wrote: Thanks Willy. I am precisely using it for caching. I need requests to go to the same nodes for cache hits, but when the node is already swamped I would prefer a cache miss over a 503. Then you should already be using hash-type consistent, otherwise when you lose or add a server, you redistribute everything and will end up with only about 1/#cache at the same place and all the rest with misses. Not many cache architectures resist to this, really. Interestingly, a long time ago I wanted to have some outgoing rules (they're on the diagram in the doc directory). The idea was to be able to apply some processing *after* the LB algorithm was called. Such processing could include detecting the selected server's queue size or any such thing and decide to force to use another server. But in practice it doesn't play well with the current sequencing so it was never done. It could have been useful in such a situation I think. I'll wait a bit for others to step up about the idea of redistributing connections only for consistent hashing. I really don't want to break existing setups (even though I think in this case it should be OK). Willy
Re: How to redispatch a request after queue timeout
Yes, no-queue is what I am looking for. If consistent hashing is easier to accommodate this change, I won't mind switching to consistent hashing when the fix is available. Right now, with no support for maxqueue failover, consistent hashing is even more severe for our use case. Thanks again Willy. Thanks Sachin On 8/31/13 2:17 PM, Willy Tarreau w...@1wt.eu wrote: On Sat, Aug 31, 2013 at 01:27:41PM +0530, Sachin Shetty wrote: We did try consistent hashing, but I found better distribution without it. That's known and normal. We don't add or remove servers often so we should be ok. It depends on what you do with them in fact, because most places will not accept that the whole farm goes down due to a server falling down causing 100% redistribution. If you have reverse caches in general it is not a big issue because the number of objects is very limited and caches can quickly refill. But outgoing caches generally take ages to fill up. Our total pool is sized correctly and we are able to serve 100% of requests when we use roundrobin, however sticky on host is what causes some nodes to hit maxconn. My goal is to never send a 503 as long as we have other nodes available which is always the case in our pool. OK so if we perform the proposed change it will not match your usage since you're not using consistent hashing anyway. So we might have to add another explicit option such as loose/strict assignment of the server. We could have 3 levels BTW :

- no-queue : find another server if the destination is full
- loose : find another server if the destination has reached maxqueue
- strict : never switch to another server

I would just like to find how to do something clean for the map-based hash that you're using without recomputing a map excluding the unusable server(s) but trying to stick as much as possible to the same servers to optimize hit rate.
Maybe scanning the table for the next usable server will be enough, though it will not match the same servers as the ones used in case of a change of the farm size. This could be a limitation that has to be accepted for this. Willy
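The behaviour Willy sketches above, hash to the closest node but spill to the next usable one when the preferred node is full, can be modelled in a few lines of Python. This is an illustrative consistent-hash ring only; haproxy's real implementation (and its notion of "full") differs, and the class and server names here are invented.

```python
# Sketch of "closest node, else next usable node on the ring".
# Illustrative only; not haproxy's actual algorithm.
import hashlib

def h(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, servers, points=64):
        # Each server gets several points on the ring for smoother spread.
        self.ring = sorted((h(f"{s}#{i}"), s) for s in servers for i in range(points))

    def pick(self, key, is_full=lambda s: False):
        start = h(key)
        # First point at or after the key's hash, wrapping around.
        idx = next((i for i, (p, _) in enumerate(self.ring) if p >= start), 0)
        n = len(self.ring)
        for off in range(n):
            server = self.ring[(idx + off) % n][1]
            if not is_full(server):
                return server  # closest usable server
        return None  # every server is full

ring = Ring(["s1", "s2", "s3"])
preferred = ring.pick("mytest.mydomain.com")
# If the preferred server is swamped, the request spills to a neighbour
# on the ring instead of queueing into a 503.
fallback = ring.pick("mytest.mydomain.com", is_full=lambda s: s == preferred)
assert fallback is not None and fallback != preferred
```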
Re: How to redispatch a request after queue timeout
Thanks Lukas. Yes, I was hoping to work around this by setting a smaller maxqueue limit and queue timeout. So what other options do we have, I need to: 1. Send all requests for a host (mytest.mydomain.com) to one backend as long as it can serve. 2. If the backend is swamped, it should go to any other backend available. Thanks Sachin On 8/30/13 1:57 PM, Lukas Tribus luky...@hotmail.com wrote: Hi Sachin, We want to maintain stickiness to a backend server based on the host header so balance hdr(host) works pretty well for us, however as soon as the backend hits max connections, requests pile up in the queue and eventually time out with a 503 and sQ in the logs. I don't think balance hdr(host) is the correct load-balancing method then. Is there a way to redispatch the requests to other servers ignoring the persistence on queue timeout? No; after a queue timeout (configurable by timeout queue), haproxy will drop the request. Think of what would happen when we redispatch on queue timeout: The request would wait seconds for a free slot on the backend and only then be redispatched to another server. This would delay many requests and hide the problem from you, while your customers furiously switch to a competitor. You should find another way to better distribute the load across your backends. Lukas
How to redispatch a request after queue timeout
Hi, We want to maintain stickiness to a backend server based on the host header, so balance hdr(host) works pretty well for us. However, as soon as the backend hits max connections, requests pile up in the queue and eventually time out with a 503 and sQ in the logs. Is there a way to redispatch the requests to other servers, ignoring the persistence, on queue timeout? My config:

frontend test
    bind *:7444
    mode http
    log global
    option httplog
    capture request header host len 32
    default_backend test_backend

backend test_backend
    balance hdr(host)
    server wsgi_11 qa-vm01:4180 maxconn 5 check inter 3
    server wsgi_22 qa-vm01:4180 maxconn 2 check inter 3

Logs:

Aug 29 07:32:21 qa-vm02 haproxy[24532]: 172.27.202.12:17288 [29/Aug/2013:07:32:11.466] eos test_backend/wsgi_11 0/1/-1/-1/1 503 212 sQ-- 6/6/6/5/0 19/0 {qa-vm02:7300|||} GET /wsgi/print_headers.py HTTP/1.1
Aug 29 07:32:21 qa-vm02 haproxy[24532]: 172.27.202.12:17289 [29/Aug/2013:07:32:11.467] eos test_backend/wsgi_11 0/1/-1/-1/1 503 212 sQ-- 5/5/5/5/0 20/0 {qa-vm02:7300|||} GET /wsgi/print_headers.py HTTP/1.1
Aug 29 07:32:21 qa-vm02 haproxy[24532]: 172.27.202.12:17289 [29/Aug/2013:07:32:11.467] eos test_backend/wsgi_11 0/1/-1/-1/1 503 212 sQ-- 5/5/5/5/0 20/0 {qa-vm02:7300|||} GET /wsgi/print_headers.py HTTP/1.1

Thanks Sachin
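Why balance hdr(host) plus maxconn produces those sQ 503s can be sketched with a toy model: every request for one hot host hashes to the same server, so that server's queue fills while the other sits idle. The numbers, the maxqueue limit, and the hash choice below are all illustrative, not haproxy's actual internals.

```python
# Toy model: hash-on-host stickiness with per-server maxconn and a
# bounded queue. Purely illustrative; values are made up.

SERVERS = {"wsgi_11": 5, "wsgi_22": 2}   # maxconn per server
active = {s: 0 for s in SERVERS}
queued = {s: 0 for s in SERVERS}
MAXQUEUE = 3

def server_for(host):
    names = sorted(SERVERS)
    return names[hash(host) % len(names)]  # sticky: same host, same server

def dispatch(host):
    s = server_for(host)
    if active[s] < SERVERS[s]:
        active[s] += 1
        return (s, "served")
    if queued[s] < MAXQUEUE:
        queued[s] += 1
        return (s, "queued")
    return (s, "503")   # queue full -> the sQ lines in the logs

hot = "mytest.mydomain.com"
results = [dispatch(hot)[1] for _ in range(12)]
assert results.count("503") > 0   # the hot host overflows its one server
# ... while every other server stays completely idle.
assert all(a == 0 for s, a in active.items() if s != server_for(hot))
```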
Re: add header does not happen every request due to keepalive
Thanks Emeric and Scott, I will try this out. Thanks Sachin On 7/12/13 12:27 AM, Emeric BRUN eb...@exceliance.fr wrote: original message- De: Sachin Shetty sshe...@egnyte.com A: haproxy@formilux.org Date: Thu, 11 Jul 2013 23:57:40 +0530 - Hi, We need to add a header to every request that is being routed via haproxy, we were able to achieve with a simple add header instruction: reqadd X-Haproxy-L1:\ true However it seems haproxy only adds this request to the first request in a keep alive connection stream and this header is missing when browser reuses the connection. We could work around this behavior using httpclose, however this would disable keep alive I guess. Is there a way to support keep alive and yet add the headers (or apply some rewrite rules) to all the request effectively terminating the keep alive at haproxy like Apache. We also need to get some rewrite rules going and would need haproxy to apply the rules in every request as well. use option http-server-close Regards, Emeric
add header does not happen every request due to keepalive
Hi, We need to add a header to every request that is being routed via haproxy. We were able to achieve this with a simple add-header instruction:

reqadd X-Haproxy-L1:\ true

However it seems haproxy only adds this header to the first request in a keep-alive connection stream, and the header is missing when the browser reuses the connection. We could work around this behavior using httpclose, however this would disable keep-alive I guess. Is there a way to support keep-alive and yet add the headers (or apply some rewrite rules) to all the requests, effectively terminating the keep-alive at haproxy like Apache does? We also need to get some rewrite rules going and would need haproxy to apply the rules on every request as well. Thanks Sachin
Haproxy equivalent of Apache mod_rewrite RewriteMap
Hi, We use RewriteMap extensively in Apache to look up an external service on the Host header to determine which downstream pool we want to use. Something like this in Apache:

RewriteMap d2u prg:/www/bin/dash2under.pl
RewriteRule - ${d2u:%{HOST}}

Is there a way to do this in haproxy? i.e. look up a backend pool name based on a header and then route the request to that backend. Please note that we cannot simply hash the requests to any backend since specific requests can only be handled by specific pools. Thanks Sachin
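The RewriteMap-style lookup being asked for can be sketched as follows: parse a map file of "key value" lines (the same shape as a haproxy map file) and pick a pool from the Host header, with a default fallback. The helper names and pool names are hypothetical, for illustration only.

```python
# Sketch of a host -> backend-pool lookup from a map file.
# File format mirrors "key value" map files; names are made up.
import io

def load_map(fp):
    table = {}
    for line in fp:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, _, value = line.partition(" ")
        table[key] = value.strip()
    return table

def backend_for(host, table, default="default_pool"):
    return table.get(host, default)

map_file = io.StringIO(
    "# host -> pool\n"
    "mytest.mydomain.com pool_a\n"
    "other.mydomain.com pool_b\n"
)
table = load_map(map_file)
assert backend_for("mytest.mydomain.com", table) == "pool_a"
assert backend_for("unknown.com", table) == "default_pool"
```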
Grouping servers for failover within a backend
Hi, We have four web servers in a single backend. Physically these four servers are on two different machines. A new session is made sticky by hashing on one of the headers. The regular flow is OK, but when one of the webservers is down for an in-flight session, the request should be re-dispatched to the webserver on the same machine if available. I looked at various options in the config but couldn't figure out a way to do it. Has anybody achieved anything similar with some config tweaks? Thanks Sachin
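The failover policy being asked for, sticky server first, then its sibling on the same machine, then anything else alive, can be sketched as a small selection function. The server and machine names below are hypothetical stand-ins for the four-servers-on-two-machines layout described.

```python
# Sketch of tiered failover: sticky -> same-machine sibling -> any live
# server. Layout and names are hypothetical.

MACHINES = {"web1": "m1", "web2": "m1", "web3": "m2", "web4": "m2"}

def pick(sticky, up):
    if sticky in up:
        return sticky
    # Prefer a sibling on the same physical machine ...
    for s, m in sorted(MACHINES.items()):
        if m == MACHINES[sticky] and s in up:
            return s
    # ... then spill to any other live server.
    for s in sorted(MACHINES):
        if s in up:
            return s
    return None

assert pick("web1", {"web1", "web2", "web3"}) == "web1"
assert pick("web1", {"web2", "web3"}) == "web2"   # sibling on m1
assert pick("web1", {"web3", "web4"}) == "web3"   # spill to m2
```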
RE: syslogd dropping requests from haproxy
We switched to rsyslog and have since seen a huge increase in the log volume. Thanks for all the help! Thanks Sachin -Original Message- From: Sachin Shetty [mailto:sshe...@egnyte.com] Sent: Saturday, October 15, 2011 11:35 AM To: 'Willy Tarreau' Cc: 'haproxy@formilux.org' Subject: RE: syslogd dropping requests from haproxy Thanks Willy - I will try these and let you know. -Original Message- From: Willy Tarreau [mailto:w...@1wt.eu] Sent: Saturday, October 15, 2011 11:32 AM To: Sachin Shetty Cc: haproxy@formilux.org Subject: Re: syslogd dropping requests from haproxy On Sat, Oct 15, 2011 at 01:35:46AM +0530, Sachin Shetty wrote: Found some more info: when haproxy is configured to send logs to a remote host instead of the local syslogd, it works fine. Definitely something to do with the local syslogd under heavy load. Check that your syslog doesn't log synchronously. For the basic sysklogd, it means you need to have a - in front of your file names. And even with this, sysklogd's limits are quickly reached (between 300 and 1000 logs/s depending on the machine). For high loads, I strongly recommend syslog-ng. It's the only one I found which managed to log more than 1 logs/s without dropping any : http://www.balabit.com/network-security/syslog-ng Regards, Willy
RE: syslogd dropping requests from haproxy
Thanks Willy - I will try these and let you know. -Original Message- From: Willy Tarreau [mailto:w...@1wt.eu] Sent: Saturday, October 15, 2011 11:32 AM To: Sachin Shetty Cc: haproxy@formilux.org Subject: Re: syslogd dropping requests from haproxy On Sat, Oct 15, 2011 at 01:35:46AM +0530, Sachin Shetty wrote: Found some more info: when haproxy is configured to send logs to a remote host instead of the local syslogd, it works fine. Definitely something to do with the local syslogd under heavy load. Check that your syslog doesn't log synchronously. For the basic sysklogd, it means you need to have a - in front of your file names. And even with this, sysklogd's limits are quickly reached (between 300 and 1000 logs/s depending on the machine). For high loads, I strongly recommend syslog-ng. It's the only one I found which managed to log more than 1 logs/s without dropping any : http://www.balabit.com/network-security/syslog-ng Regards, Willy
syslogd dropping requests from haproxy
Hi, We have a pretty heavily loaded haproxy server - more than one is running on a single machine. I am seeing that not all requests are being logged to syslogd. I am sure this is not related to the httpclose parameter, since the same conf file works fine on another machine where the load is pretty low. Has anybody faced any such problem? Any configs in haproxy to write to a different log file instead of syslogd? Thanks Sachin
RE: syslogd dropping requests from haproxy
Found some more info: when haproxy is configured to send logs to a remote host instead of the local syslogd, it works fine. Definitely something to do with the local syslogd under heavy load. Thanks Sachin From: Sachin Shetty [mailto:sshe...@egnyte.com] Sent: Saturday, October 15, 2011 12:57 AM To: 'haproxy@formilux.org' Subject: syslogd dropping requests from haproxy Hi, We have a pretty heavily loaded haproxy server - more than one is running on a single machine. I am seeing that not all requests are being logged to syslogd. I am sure this is not related to the httpclose parameter, since the same conf file works fine on another machine where the load is pretty low. Has anybody faced any such problem? Any configs in haproxy to write to a different log file instead of syslogd? Thanks Sachin
RE: Apache translates 500 to 502 from haproxy
Well, all the problems, the original one that we hit a couple of months ago and the current one, are related to one thing: Apache expects some request/response to be read by the downstream haproxy (and its backends), which refuses to do it due to some error condition and instead sends back an error status like 404, 502, 401 abruptly. Haproxy seems to send a correct response back to Apache as we have seen before; it's Apache that misinterprets it. Yeah, I definitely need to reproduce this problem in test and see what could be the real cause. I will keep you posted. Thanks Sachin -Original Message- From: Willy Tarreau [mailto:w...@1wt.eu] Sent: Tuesday, September 20, 2011 10:31 AM To: Sachin Shetty Cc: 'Cassidy, Bryan'; haproxy@formilux.org; 'Amrit Jassal' Subject: Re: Apache translates 500 to 502 from haproxy Hi Sachin, On Mon, Sep 19, 2011 at 01:47:28PM +0530, Sachin Shetty wrote: Hey Willy, So we are now hit by the side effect of this fix i.e. disabling httpclose. Two problems: 1. Entries in the log are missing, I guess you already warned me about it. Do you think if we disable keep alive in our Apache fronting haproxy, this problem will go away? Yes it will solve this issue at least. BTW, with what I saw in your trace, I really see no reason why http-server-close would not work, because the server advertises a correct content-length so haproxy should wait for both streams to synchronize. Are you sure you had http-server-close in both the frontend and the backend, and that you didn't have any remains of forceclose nor httpclose ? Just in doubt, if you're willing to make a new test, I'm interested in a new trace :-) 2. Related to one, but an interesting one. - A request comes to haproxy, as configured after waiting in haproxy queue for 10 seconds due to backend free connection unavailable, it sends a 503 back, logged correctly in haproxy and apache - The client retries, I think with Keep Alive over the same connection and it sees a 400 status back.
Now this request is nowhere in haproxy logs, so there is no way to see what happened in haproxy and who really dropped the ball. The connection never made it to the backend cherrypy server since it logs each request it receives. When you see the 400, is it the standard haproxy response or is it apache ? If it is haproxy, you should see it in its logs, which doesn't seem to be the case. It is possible that the client (or apache ?) continues to send a bit of the remaining POST data before the request and that this confuses the next hop (apache or haproxy). That's just a guess, of course. Cheers, Willy
RE: Apache translates 500 to 502 from haproxy
Hey Willy, So we are now hit by the side effect of this fix i.e. disabling httpclose. Two problems: 1. Entries in the log are missing, I guess you already warned me about it. Do you think if we disable keep alive in our Apache fronting haproxy, this problem will go away? 2. Related to one, but an interesting one. - A request comes to haproxy, as configured after waiting in haproxy queue for 10 seconds due to backend free connection unavailable, it sends a 503 back, logged correctly in haproxy and apache - The client retries, I think with Keep Alive over the same connection and it sees a 400 status back. Now this request is nowhere in haproxy logs so there is no way to see what happened in haproxy and who really dropped the ball. The connection never made it to the backend cherrypy server since it logs each request it receives. Thanks Sachin -Original Message- From: Sachin Shetty [mailto:sshe...@egnyte.com] Sent: Wednesday, June 15, 2011 5:23 PM To: 'Willy Tarreau' Cc: 'Cassidy, Bryan'; 'haproxy@formilux.org' Subject: RE: Apache translates 500 to 502 from haproxy tried with option http-server-close instead of option httpclose - no luck, it does not work. The only way I can get it to work is without either. Thanks Sachin -Original Message- From: Willy Tarreau [mailto:w...@1wt.eu] Sent: Wednesday, June 15, 2011 12:40 PM To: Sachin Shetty Cc: 'Cassidy, Bryan'; haproxy@formilux.org Subject: Re: Apache translates 500 to 502 from haproxy On Wed, Jun 15, 2011 at 12:11:58PM +0530, Sachin Shetty wrote: I also think apache is the issue. I think we have a few mischievous modules like mod_rpaf that I need to disable and test. I will keep you posted. For now do you see any severe problem if I disable httpclose as a workaround? As I said, in theory no, unless you need haproxy to analyse more than one request per connection. But please do make a test with http-server-close.
While there is little chance that it helps, it's not completely impossible, because it is able to reinforce keep-alive on the client side even if it is disabled on the server side. Willy
RE: Apache translates 500 to 502 from haproxy
I also think apache is the issue. I think we have a few mischievous modules like mod_rpaf that I need to disable and test. I will keep you posted. For now do you see any severe problem if I disable httpclose as a workaround? Thanks Sachin -Original Message- From: Willy Tarreau [mailto:w...@1wt.eu] Sent: Wednesday, June 15, 2011 11:08 AM To: Sachin Shetty Cc: 'Cassidy, Bryan'; haproxy@formilux.org Subject: Re: Apache translates 500 to 502 from haproxy On Tue, Jun 14, 2011 at 06:16:23PM +0530, Sachin Shetty wrote: We already had option httpclose in our config and removing it fixed it. We haven't tweaked any configs for a few months, I don't even know why we had this in the first place :) I read through the documentation, I don't think we need it. Do you have any reservations about taking it out? If you don't have the option, then only the first request and first response of each connection is analysed. So if Apache does keep-alive with the server over haproxy, then haproxy won't parse the second and subsequent requests. If you were already having httpclose, then haproxy did not close by itself, so that means that the server was closing the connection after it had nothing else to send. In other words, we have two servers (haproxy and the server behind it) who agree on acting the same way and the Apache definitely is the issue here. Could you just make a try with option http-server-close then ? I think it won't work due to the server closing, but if it does it would be better. I'll have to think about implementing a drain mode over keep-alive connections for this precise case : if the connection to the server is dead while the connection to the client is active with data still coming, we should silently drain those data. Regards, Willy
RE: Apache translates 500 to 502 from haproxy
Does disabling httpclose also mean that haproxy will not even log subsequent requests on the same connection? Thanks Sachin -Original Message- From: Willy Tarreau [mailto:w...@1wt.eu] Sent: Wednesday, June 15, 2011 12:40 PM To: Sachin Shetty Cc: 'Cassidy, Bryan'; haproxy@formilux.org Subject: Re: Apache translates 500 to 502 from haproxy On Wed, Jun 15, 2011 at 12:11:58PM +0530, Sachin Shetty wrote: I also think apache is the issue. I think we have a few mischievous modules like mod_rpaf that I need to disable and test. I will keep you posted. For now do you see any severe problem if I disable httpclose as a workaround? As I said, in theory no, unless you need haproxy to analyse more than one request per connection. But please do make a test with http-server-close. While there is little chance that it helps, it's not completely impossible because it is able to reinforce keep-alive on the client side even if it is disabled on the server side. Willy
RE: Apache translates 500 to 502 from haproxy
tried with option http-server-close instead of option httpclose - no luck, it does not work. The only way I can get it to work is without either. Thanks Sachin -Original Message- From: Willy Tarreau [mailto:w...@1wt.eu] Sent: Wednesday, June 15, 2011 12:40 PM To: Sachin Shetty Cc: 'Cassidy, Bryan'; haproxy@formilux.org Subject: Re: Apache translates 500 to 502 from haproxy On Wed, Jun 15, 2011 at 12:11:58PM +0530, Sachin Shetty wrote: I also think apache is the issue. I think we have a few mischievous modules like mod_rpaf that I need to disable and test. I will keep you posted. For now do you see any severe problem if I disable httpclose as a workaround? As I said, in theory no, unless you need haproxy to analyse more than one request per connection. But please do make a test with http-server-close. While there is little chance that it helps, it's not completely impossible because it is able to reinforce keep-alive on the client side even if it is disabled on the server side. Willy
RE: Apache translates 500 to 502 from haproxy
Hey Willy, would a tcpdump be fine? Thanks Sachin -Original Message- From: Willy Tarreau [mailto:w...@1wt.eu] Sent: Tuesday, June 14, 2011 12:14 PM To: Sachin Shetty Cc: 'Cassidy, Bryan'; haproxy@formilux.org Subject: Re: Apache translates 500 to 502 from haproxy On Tue, Jun 14, 2011 at 11:25:19AM +0530, Sachin Shetty wrote: Hey Willy, We have 1.4.11 - so how do we use the workaround - This is already in prod and I need to work around it before we can get to the root cause. The workaround is part of the code, so you're hitting a different issue I think. Please do not hesitate to send me a network capture of what passes between apache and haproxy. Maybe it's not hard to improve the workaround to cover your case if that's the issue. Regards, Willy
RE: Apache translates 500 to 502 from haproxy
Yeah, I understand. So what could we do? I am really stuck with this and not able to figure out any workaround either. -Original Message- From: Willy Tarreau [mailto:w...@1wt.eu] Sent: Tuesday, June 14, 2011 4:17 PM To: Sachin Shetty Cc: 'Cassidy, Bryan'; haproxy@formilux.org Subject: Re: Apache translates 500 to 502 from haproxy On Tue, Jun 14, 2011 at 02:14:15PM +0530, Sachin Shetty wrote: I looked at all the apache modules we have and haven't seen anything glaring there. Attached is a tcpdump from the cherrypy server when Apache is bypassing haproxy and forwarding to cherrypy directly. Obviously cherrypy does it in order: loads the whole file and then sends the response back, which works ok for Apache. I understand, but I see nothing in HTTP which makes it mandatory to read all the content once you have responded. If you respond, you don't need those data anymore. The common practice is to read at least some of them in order to avoid the issue with the close causing a reset. Haproxy reads as much as it can until the close. Once the connection is closed, it cannot read anymore. Apache accepts to read a lot more data, but with a limit too. Willy
RE: Apache translates 500 to 502 from haproxy
We already had option httpclose in our config and removing it fixed it. We haven't tweaked any configs for a few months, I don't even know why we had this in the first place :) I read through the documentation, I don't think we need it. Do you have any reservations about taking it out? Thanks Sachin -Original Message- From: Willy Tarreau [mailto:w...@1wt.eu] Sent: Tuesday, June 14, 2011 5:15 PM To: Sachin Shetty Cc: 'Cassidy, Bryan'; haproxy@formilux.org Subject: Re: Apache translates 500 to 502 from haproxy On Tue, Jun 14, 2011 at 04:27:00PM +0530, Sachin Shetty wrote: Yeah, I understand. So what could we do? I am really stuck with this and not able to figure out any workaround either. I really have no idea, because the server already replies before the data are sent. Maybe you could disable the http-server-close option so that haproxy works in tunnel mode. It will then not analyse any request nor response beyond headers and will let client and server agree on when to close the connection. You may even resort to the plain old option httpclose so that each side is aware that the connection is used only for one request. Willy
Re: Apache translates 500 to 502 from haproxy
Willy Tarreau w at 1wt.eu writes: On Fri, Jun 10, 2011 at 04:41:08PM +0530, Manoj Kumar wrote: Hi, We are forwarding specific requests from apache to haproxy which internally forwards it to a pool of cherrypy servers. We have seen that certain requests end up in 500 in haproxy and cherrypy logs, which is ok and understood, but apache instead sends a 502 to the client. Maybe for any reason apache is rejecting the response and aborting the connection, which might explain that message in your logs : [Fri Jun 10 00:46:01 2011] [error] (103)Software caused connection abort: proxy: pass request body failed to 192.168.2.100:9910 http://192.168.2.15:9910/ (192.168.2.15) Willy Hi Willy, I spent some more time looking in to this. Notice the error in the apache log: it is about the request body, not the response. I think this is what is going on:

1. Apache receives a POST request
2. Forwards to haproxy
3. haproxy forwards to Cherrypy
4. Cherrypy aborts the request due to some internal error, returns 401/500
5. haproxy sends the response back to Apache and terminates the connection
6. Apache however is still expecting somebody to read the posted body and barfs with a "pass request body failed" error

Now this is definitely due to haproxy, since if I skip haproxy and make Apache hit cherrypy directly, I see a proper response code from Apache. I think haproxy is terminating the connection prematurely when the backend server returns an error status code. Any idea?
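The workaround discussed later in this thread, reading as much of the pending request body as possible before closing so the peer doesn't see an abrupt reset, can be sketched with a socketpair standing in for the client connection. The function name and the drain limit are arbitrary choices for illustration; this is not haproxy's actual code.

```python
# Sketch: after an early error response, drain the remaining request
# body (up to a limit) before closing, to avoid a hard reset.
# socketpair stands in for a real client connection; names are made up.
import socket

def drain_then_close(conn, limit=1 << 20):
    conn.settimeout(0.2)
    drained = 0
    try:
        while drained < limit:
            chunk = conn.recv(65536)
            if not chunk:      # peer finished sending (EOF)
                break
            drained += len(chunk)
    except socket.timeout:
        pass                   # nothing more in flight; give up politely
    conn.close()
    return drained

client, server = socket.socketpair()
client.sendall(b"x" * 4096)            # pending POST body
client.shutdown(socket.SHUT_WR)        # client done writing
assert drain_then_close(server) == 4096
client.close()
```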
Re: Apache translates 500 to 502 from haproxy
Sachin Shetty sshetty@... writes: I see a similar thread here, no solution though https://forums.rightscale.com//showthread.php?t=210
RE: Apache translates 500 to 502 from haproxy
Hey Bryan, I did check the Cherrypy response by directly posting the same request via curl and it looks ok to me. A few interesting things are: 1. haproxy logs the response as 401 correctly - it's the Apache which is calling haproxy that marks it 502 2. It's a POST request 3. Even via haproxy, it works when posting smaller files, but we get the bad proxy error when posting a bigger file like a 1.5MB+ file Thanks Sachin -Original Message- From: Cassidy, Bryan [mailto:bcass...@winnipeg.ca] Sent: Tuesday, June 14, 2011 12:19 AM To: Sachin Shetty; haproxy@formilux.org Subject: RE: Apache translates 500 to 502 from haproxy Hello, Check that Cherrypy is serving up valid HTTP. You could also try setting HAProxy to balance in TCP mode instead of HTTP mode, though if this helps it would just be masking any problem that might exist. I once had a backend "500" response translated to 502 by HAProxy balancing in HTTP mode. I wrote "500" in quotes because the backend (Apache improperly configured in my case) served up an HTML document containing the words "500 internal server error", but didn't actually serve up any HTTP headers prior to the document - just the document itself. HAProxy then changed the response to a 502, as it should, because not including headers is obviously invalid HTTP. I was totally stumped until I ran tcpdump and saw what was happening. Your setup is different than mine was, of course. But maybe this will give you a lead... Hope this helps, Bryan -Original Message- From: Sachin Shetty [mailto:sshe...@egnyte.com] Sent: Monday, June 13, 2011 11:12 AM To: haproxy@formilux.org Subject: Re: Apache translates 500 to 502 from haproxy Willy Tarreau w at 1wt.eu writes: On Fri, Jun 10, 2011 at 04:41:08PM +0530, Manoj Kumar wrote: Hi, We are forwarding specific requests from apache to haproxy which internally forwards it to a pool of cherrypy servers.
We have seen that certain requests end up in 500 in haproxy and cherrypy logs, which is ok and understood, but apache instead sends a 502 to the client. Maybe for any reason apache is rejecting the response and aborting the connection, which might explain that message in your logs : [Fri Jun 10 00:46:01 2011] [error] (103)Software caused connection abort: proxy: pass request body failed to 192.168.2.100:9910 http://192.168.2.15:9910/ (192.168.2.15) Willy Hi Willy, I spent some more time looking in to this. Notice the error in the apache log: it is about the request body, not the response. I think this is what is going on: 1. Apache receives a POST request 2. Forwards to haproxy 3. haproxy forwards to Cherrypy 4. Cherrypy aborts the request due to some internal error, returns 401/500 5. haproxy sends the response back to Apache and terminates the connection 6. Apache however is still expecting somebody to read the posted body and barfs with a "pass request body failed" error Now this is definitely due to haproxy, since if I skip haproxy and make Apache hit cherrypy directly, I see a proper response code from Apache. I think haproxy is terminating the connection prematurely when the backend server returns an error status code. Any idea?
RE: Apache translates 500 to 502 from haproxy
Hey Willy, We have 1.4.11 - so how do we use the workaround - This is already in prod and I need to work around it before we can get to the root cause. ./haproxy -version HA-Proxy version 1.4.11 2011/02/10 Copyright 2000-2010 Willy Tarreau w...@1wt.eu Thanks Sachin -Original Message- From: Willy Tarreau [mailto:w...@1wt.eu] Sent: Tuesday, June 14, 2011 11:03 AM To: Sachin Shetty Cc: 'Cassidy, Bryan'; haproxy@formilux.org Subject: Re: Apache translates 500 to 502 from haproxy On Tue, Jun 14, 2011 at 12:28:55AM +0530, Sachin Shetty wrote: Hey Bryan, I did check the Cherrypy response by directly posting the same request via curl and it looks ok to me. A few interesting things are: 1. haproxy logs the response as 401 correctly - it's the Apache which is calling haproxy that marks it 502 2. It's a POST request 3. Even via haproxy, it works when posting smaller files, but we get the bad proxy error when posting a bigger file like a 1.5MB+ file I don't like this, it smells like the issue we have with Linux dropping all socket contents after a close() if the client sends more data. What version of haproxy are you running ? A workaround for this issue was added into 1.4.9. It made haproxy read as much as it could from the connection before closing. Ideally, a network trace of what is received and sent by haproxy when the issue happens would be of great help to try to improve the behaviour without turning it into an easy DoS vulnerability. Willy