Re: Problems with haproxy 1.7.3 on FreeBSD 11.0-p8
Thanks Willy,

On 07.03.2017 at 00:32, Willy Tarreau wrote:
> Sorry, when I said "revert" I meant typically like this:
>
>   patch -Rp1 < 0001-BUG-MEDIUM-tcp-don-t-poll-for-write-when-connect-suc.patch
>
> I've just tested here on 1.7.3 and it does apply correctly.
>
> With git apply you'll have to pass -R as well.
> Sorry for not being clear the first time.

I recompiled haproxy 1.7.3 with the patch reverted and will test it now. Please give me 9 hours to give you real feedback, but so far the monitoring system is quiet.

Regards,
Matthias

-- 
"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the universe trying to produce bigger and better idiots. So far, the universe is winning." -- Rich Cook
Re: [RFC PATCH] MEDIUM: persistent connections for SSL checks
On Mon, Mar 06, 2017 at 06:34:09PM -0800, Steven Davidovitz wrote:
> Interestingly, as far as I can tell, we are running into the problem
> described in this forum post:
> http://discourse.haproxy.org/t/backend-encryption-and-reusing-ssl-sessions/503/4
> Switching the conn_data_shutw_hard call to conn_data_shutw in checks.c
> decreased CPU usage completely. Forcing SSLv3 as in this email (
> https://www.mail-archive.com/haproxy@formilux.org/msg09105.html) also
> worked.

This is very interesting! The problem with not using conn_data_shutw_hard() is that conn_data_shutw() will only *try* to notify the other side about an imminent close, but at the same time we're going to close with SO_NOLINGER, resulting in the close notification being lost in the middle. And not using SO_NOLINGER is not an option, as we can end up with tons of TIME_WAIT sockets on haproxy, which clearly is not acceptable. More importantly, it means that we probably have a similar problem with production traffic if you don't use persistent connections to the server.

But now I'm thinking about something: I'm wondering if it is in fact the lack of a call to SSL_shutdown() which causes the connection not to be kept in haproxy's SSL context. In 1.8-dev this function changed a bit so that we first call SSL_set_quiet_shutdown() then SSL_shutdown().

> I haven't had time to dig further, and it may certainly be client
> misconfiguration, but could you shed any light on why that might be a
> problem?

It might be useful to put another haproxy behind the checks instead of your server to see if you observe the same effect. And if you can run a test with 1.8-dev it would be great. I'd rather backport just the change to ssl_sock_shutw() to 1.7 if it fixes the problem :-)

BTW, given that only the checks are causing the trouble, it's easy to start an independent process for this: just change all bind addresses and let the backends run their checks to see the effect.

Willy
Re: [RFC PATCH] MEDIUM: persistent connections for SSL checks
Thanks for the response!

On Mon, Mar 6, 2017 at 1:34 AM, Willy Tarreau wrote:
>
> [snip]
>
> Also it is not normal at all that SSL checks lead to CPU saturation.
> Normally, health checks are expected to store the last SSL_CTX in the
> server struct for later reuse, leading to a TLS resume connection.
> There is one case where this doesn't work, which is when SNI is being
> used on the server lines. Is this your case ? If so a better solution
> would be to have at least two (possibly a bit more) SSL_CTX per server,
> as was mentioned in commit 119a408 ("BUG/MEDIUM: ssl: for a handshake
> when server-side SNI changes"). This would improve both the checks
> performance and the production traffic performance by avoiding renegs
> on SNI change between two consecutive connections.

Interestingly, as far as I can tell, we are running into the problem described in this forum post:
http://discourse.haproxy.org/t/backend-encryption-and-reusing-ssl-sessions/503/4

Switching the conn_data_shutw_hard call to conn_data_shutw in checks.c decreased CPU usage completely. Forcing SSLv3 as in this email (https://www.mail-archive.com/haproxy@formilux.org/msg09105.html) also worked.

I haven't had time to dig further, and it may certainly be client misconfiguration, but could you shed any light on why that might be a problem?
Re: Problems with haproxy 1.7.3 on FreeBSD 11.0-p8
On Mon, Mar 06, 2017 at 11:19:18PM +0100, Matthias Fechner wrote:
> Dear Willy and Dmitry,
>
> On 06.03.2017 at 11:16, Willy Tarreau wrote:
> > with the attachment now (thanks Dmitry)
>
> hm, I'm not able to apply the patch:
> git apply --ignore-space-change --ignore-whitespace
> 0001-BUG-MEDIUM-tcp-don-t-poll-for-write-when-connect-suc.patch

Sorry, when I said "revert" I meant typically like this:

  patch -Rp1 < 0001-BUG-MEDIUM-tcp-don-t-poll-for-write-when-connect-suc.patch

I've just tested here on 1.7.3 and it does apply correctly. With git apply you'll have to pass -R as well. Sorry for not being clear the first time.

Willy
Re: Problems with haproxy 1.7.3 on FreeBSD 11.0-p8
Dear Willy and Dmitry,

On 06.03.2017 at 11:16, Willy Tarreau wrote:
> with the attachment now (thanks Dmitry)

hm, I'm not able to apply the patch:

  git apply --ignore-space-change --ignore-whitespace 0001-BUG-MEDIUM-tcp-don-t-poll-for-write-when-connect-suc.patch

But I get:

  0001-BUG-MEDIUM-tcp-don-t-poll-for-write-when-connect-suc.patch:41: trailing whitespace.
  if (connect(fd, (struct sockaddr *)&conn->addr.to, get_addr_len(&conn->addr.to)) == -1) {
  0001-BUG-MEDIUM-tcp-don-t-poll-for-write-when-connect-suc.patch:42: trailing whitespace.
  if (errno == EINPROGRESS || errno == EALREADY) {
  0001-BUG-MEDIUM-tcp-don-t-poll-for-write-when-connect-suc.patch:43: trailing whitespace.
  /* common case, let's wait for connect status */
  0001-BUG-MEDIUM-tcp-don-t-poll-for-write-when-connect-suc.patch:44: trailing whitespace.
  conn->flags |= CO_FL_WAIT_L4_CONN;
  0001-BUG-MEDIUM-tcp-don-t-poll-for-write-when-connect-suc.patch:45: trailing whitespace.
  }
  error: patch failed: src/proto_tcp.c:474
  error: src/proto_tcp.c: patch does not apply

It is a 1.7.3 version (I just did a make extract in the FreeBSD port and tried to apply the patch).

Regards,
Matthias

-- 
"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the universe trying to produce bigger and better idiots. So far, the universe is winning." -- Rich Cook
Re: Capturing browser TLS cipher suites
On Mon, Mar 06, 2017 at 09:31:40PM +0100, thierry.fourn...@arpalert.org wrote:
> You're right, I was hurried and tired. I didn't see the problem with
> the comparisons. I think that the attached version is ok. I reviewed all
> comments.

OK, this one looks good. I've just met a build issue here:

  src/ssl_sock.c: In function 'smp_fetch_ssl_fc_cl_str':
  src/ssl_sock.c:5716:69: error: operator '!' has no right operand
  #if (OPENSSL_VERSION_NUMBER >= 0x1000200fL) && !OPENSSL_NO_SSL_TRACE

I changed it like this (added "defined()"):

  #if (OPENSSL_VERSION_NUMBER >= 0x1000200fL) && !defined(OPENSSL_NO_SSL_TRACE)

And I've now merged it :-)

Thanks, and try to have some rest, I suspect one little guy is keeping you awake too late at night!

Willy
Re: HaProxy Hang
Willy, per your comment on /dev/random exhaustion: I think running haveged on servers doing crypto work is, or should be, best practice.

jerry

On 3/6/17 12:02 PM, Willy Tarreau wrote:
> Hi Mark,
>
> On Mon, Mar 06, 2017 at 02:49:28PM -0500, Mark S wrote:
> > As for the timing issue, I can add to the discussion with a few related
> > data points. In short, system uptime does not seem to be a commonality
> > to my situation.
>
> thanks!
>
> > 1) I had this issue affect 6 servers, spread across 5 data centers (only
> > 2 servers are in the same facility.) All servers stopped processing
> > requests at roughly the same moment, certainly within the same minute.
> > All servers running FreeBSD 11.0-RELEASE-p2 with HAProxy compiled
> > locally against OpenSSL-1.0.2k
>
> OK.
>
> > 2) System uptime was not at all similar across these servers, although
> > chances are most servers' HAProxy process start times would be similar.
> > The servers with the highest system uptime were at about 27 days at the
> > time of the incident, while the shortest were under a day or two.
>
> OK, so that means that haproxy could have hung in a day or two; then your
> case is much more common than the other reports. If your front LB is fair
> between the 6 servers, that could be related to a total number of
> requests or connections or something like this.
>
> > 3) HAProxy configurations are similar, but not exactly consistent
> > between servers - different IPs on the frontend, different ACLs and
> > backends.
>
> OK.
>
> > 4) The only synchronized application common to all of these servers is
> > OpenNTPd.
>
> Is there any risk that the ntpd causes time jumps into the future or the
> past for whatever reason ? Maybe there's something with kqueue and time
> jumps in recent versions ?
>
> > 5) I have since upgraded to HAProxy-1.7.3, same build process: the full
> > version output is below - and will of course report any observed issues.
> >
> > haproxy -vv
> > HA-Proxy version 1.7.3 2017/02/28
> (...)
>
> Everything there looks pretty standard.
>
> If it dies again it could be good to try with "nokqueue" in the global
> section (or start haproxy with -dk) to disable kqueue and switch to poll.
> It will eat a bit more CPU, so don't do this on all nodes at once.
>
> I'm thinking about other things :
> - if you're doing a lot of SSL we could imagine an issue with random
>   generation using /dev/random instead of /dev/urandom. I've met this
>   issue a long time ago on some apache servers where all the entropy was
>   progressively consumed until it was not possible anymore to get a
>   connection.
> - it could be useful to run "netstat -an" on a dead node before killing
>   haproxy and archive this for later analysis. It may reveal that all
>   file descriptors were used by CLOSE_WAIT connections (indicating a
>   close bug in haproxy) or something like this. If instead you see a lot
>   of FIN_WAIT1 or FIN_WAIT2 it may indicate an issue with some external
>   firewall or pf blocking some final traffic and leading to socket space
>   exhaustion.
>
> If you have the same issue that was reported with kevent() being called
> in loops and returning an error, you may definitely see tons of
> CLOSE_WAIT and it will indicate an issue with this poller, though I have
> no idea which one, especially since it doesn't change often and *seems*
> to work with previous versions.
>
> Best regards,
> Willy

-- 
Soundhound Devops
"What could possibly go wrong?"
Re: HaProxy Hang
On Mon, 06 Mar 2017 15:02:43 -0500, Willy Tarreau wrote:
> OK, so that means that haproxy could have hung in a day or two; then your
> case is much more common than the other reports. If your front LB is fair
> between the 6 servers, that could be related to a total number of
> requests or connections or something like this.

Another relevant point is that these servers are tied together using upstream, GeoIP-based DNS load balancing, so the request rate across servers varies quite a bit depending on the location. This would make a synchronized failure based on total requests less likely.

> I'm thinking about other things :
> - if you're doing a lot of SSL we could imagine an issue with random
>   generation using /dev/random instead of /dev/urandom. I've met this
>   issue a long time ago on some apache servers where all the entropy was
>   progressively consumed until it was not possible anymore to get a
>   connection.

I'll set up a script to capture the netstat and other info prior to reloading, should this issue re-occur. As for SSL, yes, we do a fair bit of SSL (about 30% of total request count), and HAProxy does the TLS termination and then hands off via TCP proxy.

Best,
-=Mark S.
Re: HaProxy Hang
Hi Mark,

On Mon, Mar 06, 2017 at 02:49:28PM -0500, Mark S wrote:
> As for the timing issue, I can add to the discussion with a few related
> data points. In short, system uptime does not seem to be a commonality to
> my situation.

thanks!

> 1) I had this issue affect 6 servers, spread across 5 data centers (only 2
> servers are in the same facility.) All servers stopped processing requests
> at roughly the same moment, certainly within the same minute. All servers
> running FreeBSD 11.0-RELEASE-p2 with HAProxy compiled locally against
> OpenSSL-1.0.2k

OK.

> 2) System uptime was not at all similar across these servers, although
> chances are most servers' HAProxy process start times would be similar.
> The servers with the highest system uptime were at about 27 days at the
> time of the incident, while the shortest were under a day or two.

OK, so that means that haproxy could have hung in a day or two; then your case is much more common than the other reports. If your front LB is fair between the 6 servers, that could be related to a total number of requests or connections or something like this.

> 3) HAProxy configurations are similar, but not exactly consistent between
> servers - different IPs on the frontend, different ACLs and backends.

OK.

> 4) The only synchronized application common to all of these servers is
> OpenNTPd.

Is there any risk that the ntpd causes time jumps into the future or the past for whatever reason? Maybe there's something with kqueue and time jumps in recent versions?

> 5) I have since upgraded to HAProxy-1.7.3, same build process: the full
> version output is below - and will of course report any observed issues.
>
> haproxy -vv
> HA-Proxy version 1.7.3 2017/02/28
(...)

Everything there looks pretty standard.

If it dies again it could be good to try with "nokqueue" in the global section (or start haproxy with -dk) to disable kqueue and switch to poll. It will eat a bit more CPU, so don't do this on all nodes at once.

I'm thinking about other things:

- if you're doing a lot of SSL, we could imagine an issue with random
  generation using /dev/random instead of /dev/urandom. I've met this issue
  a long time ago on some apache servers where all the entropy was
  progressively consumed until it was not possible anymore to get a
  connection.

- it could be useful to run "netstat -an" on a dead node before killing
  haproxy and archive this for later analysis. It may reveal that all file
  descriptors were used by CLOSE_WAIT connections (indicating a close bug
  in haproxy) or something like this. If instead you see a lot of FIN_WAIT1
  or FIN_WAIT2, it may indicate an issue with some external firewall or pf
  blocking some final traffic and leading to socket space exhaustion.

If you have the same issue that was reported with kevent() being called in loops and returning an error, you may definitely see tons of CLOSE_WAIT, and it will indicate an issue with this poller, though I have no idea which one, especially since it doesn't change often and *seems* to work with previous versions.

Best regards,
Willy
Re: Capturing browser TLS cipher suites
On Mon, Mar 06, 2017 at 07:19:00PM +0100, thierry.fourn...@arpalert.org wrote:
> You read my response one minute too early. The right patch is in the
> second email I sent. Sorry.

Thierry, please look below:

> On Mon, 6 Mar 2017 18:38:30 +0100
> Willy Tarreau wrote:
>
> > And below :
> >
> > > + if (len < rec_len + 4)
> > > +     return;
> > > + msg += 4;
> > > + end = msg + rec_len;
> > > + if (end <= msg)
> > > +     return;
> >
> > This one was still not fixed :-(

This.

> > > +
> > > + /* Expect 2 bytes for protocol version (1 byte for major and 1 byte
> > > +  * for minor, the random, composed by 4 bytes for the unix time and
> > > +  * 28 bytes for unix payload, and then 1 byte for the session id. So
> > > +  * we jump 1 + 1 + 4 + 28 + 1 bytes.
> > > +  */
> > > + msg += 1 + 1 + 4 + 28 + 1;
> > > + if (msg >= end)
> > > +     return;
> >
> > This one neither :-(

And this. And now below, your latest patch:

> From c0bf9fcf4e78d65641a589083ddca14377c620fd Mon Sep 17 00:00:00 2001
> From: Thierry FOURNIER
> Date: Sat, 25 Feb 2017 12:45:22 +0100
> Subject: [PATCH 2/2] MEDIUM: ssl: add new sample-fetch which captures the
>  cipherlist
> (...)
> + msg += 4;
> + end = msg + rec_len;
> + if (end <= msg)
> +     return;

This.

> + /* Expect 2 bytes for protocol version (1 byte for major and 1 byte
> +  * for minor, the random, composed by 4 bytes for the unix time and
> +  * 28 bytes for unix payload, and then 1 byte for the session id. So
> +  * we jump 1 + 1 + 4 + 28 + 1 bytes.
> +  */
> + msg += 1 + 1 + 4 + 28 + 1;
> + if (msg >= end)
> +     return;

And this.

As you can see, these ones were left unchanged. It's the 4th (5th?) time I'm reading the whole patch to check if all comments were properly addressed. That's not acceptable. As you know, I'm fine if you disagree with my comments and just say that I'm wrong or to go f*ck myself because I'm too retarded to read your code; that's perfect.

But what really irritates me is that I spend a lot of time reading code and making comments twice (three times with this one), and this time is a pure waste because you didn't even *read* them. This is exactly what discourages anyone from reviewing code. *All points* in a review have to be addressed or contested. By sending the "fixed" version you claim that you addressed everything, which is false. That's really bad, because now I don't trust your patches anymore and I have to read them fully again in case you developed them in a hurry. If you don't have time, I prefer that you say you'll post an update later rather than making me read the same unfixed code multiple times :-(

Willy
Re: HaProxy Hang
On Mon, 06 Mar 2017 01:35:19 -0500, Willy Tarreau wrote:
> On Fri, Mar 03, 2017 at 07:54:46PM +0300, Dmitry Sivachenko wrote:
> > On 03 Mar 2017, at 19:36, David King wrote:
> > > Thanks for the response! That's interesting, I don't suppose you have
> > > the details of the other issues?
> >
> > First report is
> > https://www.mail-archive.com/haproxy@formilux.org/msg25060.html
> > Second one
> > https://www.mail-archive.com/haproxy@formilux.org/msg25067.html
>
> Thanks for the links Dmitry. That's indeed really odd. If all hang at the
> same time, timing or uptime looks like a good candidate. There's not much
> which is really specific to FreeBSD in haproxy. However, the kqueue
> poller is only used there (and on OpenBSD), and uses timing for the
> timeout. Thus it sounds likely that there could be an issue there, either
> in haproxy or FreeBSD.
>
> A hang every 2-3 months makes me think about the 49.7 days it takes for a
> millisecond counter to wrap. These bugs are hard to troubleshoot. We used
> to have such an issue a long time ago in linux 2.4 when the timer was set
> to 100 Hz; it required 497 days to know whether the bug was solved or not
> (obviously it now is).
>
> I've just compared ev_epoll.c and ev_kqueue.c in case I could spot
> anything obvious, but from what I'm seeing they're pretty much similar,
> so I don't see what there could cause this bug. And since it apparently
> works fine on FreeBSD 10, at best one of our bugs could only trigger a
> system bug if it exists.
>
> David, if your workload permits it, you can disable kqueue and haproxy
> will automatically fall back to poll. For this you can simply put
> "nokqueue" in the global section. poll() doesn't scale as well as
> kqueue(); it's cheaper on low connection counts but it will use more CPU
> above ~1000 concurrent connections.
>
> Regards,
> Willy

Hi Willy,

As for the timing issue, I can add to the discussion with a few related data points. In short, system uptime does not seem to be a commonality to my situation.
1) I had this issue affect 6 servers, spread across 5 data centers (only 2 servers are in the same facility). All servers stopped processing requests at roughly the same moment, certainly within the same minute. All servers running FreeBSD 11.0-RELEASE-p2 with HAProxy compiled locally against OpenSSL-1.0.2k.

2) System uptime was not at all similar across these servers, although chances are most servers' HAProxy process start times would be similar. The servers with the highest system uptime were at about 27 days at the time of the incident, while the shortest were under a day or two.

3) HAProxy configurations are similar, but not exactly consistent between servers - different IPs on the frontend, different ACLs and backends.

4) The only synchronized application common to all of these servers is OpenNTPd.

5) I have since upgraded to HAProxy-1.7.3, same build process; the full version output is below - and I will of course report any observed issues.

haproxy -vv
HA-Proxy version 1.7.3 2017/02/28
Copyright 2000-2017 Willy Tarreau

Build options :
  TARGET  = freebsd
  CPU     = generic
  CC      = clang
  CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement
  OPTIONS = USE_OPENSSL=1 USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built without compression support (neither USE_ZLIB nor USE_SLZ are set)
Compression algorithms supported : identity("identity")
Built with OpenSSL version : OpenSSL 1.0.2k  26 Jan 2017
Running on OpenSSL version : OpenSSL 1.0.2k  26 Jan 2017
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.39 2016-06-14
Running on PCRE version : 8.39 2016-06-14
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built without Lua support
Built with transparent proxy support using: IP_BINDANY IPV6_BINDANY

Available polling systems :
      kqueue : pref=300,  test result OK
        poll : pref=200,  test result OK
      select : pref=150,  test result OK
Total: 3 (3 usable), will use kqueue.

Available filters :
  [SPOE] spoe
  [TRACE] trace
  [COMP] compression

Cheers,
-=Mark
Re: Capturing browser TLS cipher suites
You read my response one minute too early. The right patch is in the second email I sent. Sorry.

On Mon, 6 Mar 2017 18:38:30 +0100 Willy Tarreau wrote:
> On Mon, Mar 06, 2017 at 06:30:33PM +0100, thierry.fourn...@arpalert.org wrote:
> > > > + /* Next three bytes are the length of the message. The total length
> > > > +  * must be this decoded length + 4. If the length given as argument
> > > > +  * is not the same, we abort the protocol dissector.
> > > > +  */
> > > > + rec_len = (msg[1] << 3) + (msg[2] << 2) + msg[3];
> > >
> > > Here. The correct statement is :
> > >
> > >     rec_len = msg[1] * 65536 + msg[2] * 256 + msg[3];
> > >
> > > (or << 16, << 8)
>
> But Thierry, are you doing it on purpose to annoy me ? It's the third
> time you get it wrong after I propose the correct version, as you can
> see with your version below it's still wrong and differs from the two
> proposed versions above :
>
> > + rec_len = (msg[1] << 24) + (msg[2] << 16) + msg[3];
>
> And below :
>
> > + if (len < rec_len + 4)
> > +     return;
> > + msg += 4;
> > + end = msg + rec_len;
> > + if (end <= msg)
> > +     return;
>
> This one was still not fixed :-(
>
> > +
> > + /* Expect 2 bytes for protocol version (1 byte for major and 1 byte
> > +  * for minor, the random, composed by 4 bytes for the unix time and
> > +  * 28 bytes for unix payload, and then 1 byte for the session id. So
> > +  * we jump 1 + 1 + 4 + 28 + 1 bytes.
> > +  */
> > + msg += 1 + 1 + 4 + 28 + 1;
> > + if (msg >= end)
> > +     return;
>
> This one neither :-(
>
> > +
> > + /* Next two bytes are the ciphersuite length. */
> > + if (msg + 2 > end)
> > +     return;
> > + rec_len = (msg[0] << 16) + msg[1];
>
> This one is still wrong as well :-(
>
> Please double-check next time, it's time consuming to re-read the same
> bugs between two versions, each time I have to reread the whole patch.
>
> Willy

From c0bf9fcf4e78d65641a589083ddca14377c620fd Mon Sep 17 00:00:00 2001
From: Thierry FOURNIER
Date: Sat, 25 Feb 2017 12:45:22 +0100
Subject: [PATCH 2/2] MEDIUM: ssl: add new sample-fetch which captures the
 cipherlist
X-Bogosity: Ham, tests=bogofilter, spamicity=0.00, version=1.2.4

This new sample-fetch captures the cipher list offered by the client SSL connection during the client-hello phase. This is useful for fingerprinting the SSL connection.
---
 doc/configuration.txt |  32 ++
 src/ssl_sock.c        | 287 +
 2 files changed, 319 insertions(+)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 25167cc..aaffa38 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -618,6 +618,7 @@ The following keywords are supported in the "global" section :
    - tune.ssl.maxrecord
    - tune.ssl.default-dh-param
    - tune.ssl.ssl-ctx-cache-size
+   - tune.ssl.capture-cipherlist-size
    - tune.vars.global-max-size
    - tune.vars.proc-max-size
    - tune.vars.reqres-max-size
@@ -1502,6 +1503,11 @@ tune.ssl.ssl-ctx-cache-size
   dynamically is expensive, they are cached. The default cache size is set to
   1000 entries.
 
+tune.ssl.capture-cipherlist-size
+  Sets the maximum size of the buffer used for capturing client-hello cipher
+  list. If the value is 0 (default value) the capture is disabled, otherwise
+  a buffer is allocated for each SSL/TLS connection.
+
 tune.vars.global-max-size
 tune.vars.proc-max-size
 tune.vars.reqres-max-size
@@ -13871,6 +13877,32 @@ ssl_fc_cipher : string
   Returns the name of the used cipher when the incoming connection was made
   over an SSL/TLS transport layer.
 
+ssl_fc_cipherlist_bin : binary
+  Returns the binary form of the client hello cipher list. The maximum
+  returned value length is bounded by the value of
+  "tune.ssl.capture-cipherlist-size". Note that this sample-fetch is
+  available only with OpenSSL > 0.9.7.
+
+ssl_fc_cipherlist_hex : string
+  Returns the binary form of the client hello cipher list encoded as
+  hexadecimal. The maximum returned value length is bounded by the value of
+  "tune.ssl.capture-cipherlist-size". Note that this sample-fetch is
+  available only with OpenSSL > 0.9.7.
+
+ssl_fc_cipherlist_str : string
+  Returns the decoded text form of the client hello cipher list. The maximum
+  number of ciphers returned is bounded by the value of
+  "tune.ssl.capture-cipherlist-size". Note that this sample-fetch is only
+  available with OpenSSL > 1.0.2 compiled with the option enable-ssl-trace.
+  If the function is not enabled, this sample-fetch returns the hash
+  like "ssl_fc_cipherlist_xxh".
+
+ssl_fc_cipherlist_xxh : integer
+  Returns a xxh64 of the cipher list. This hash can be returned only if the
+  value of "tune.ssl.capture-cipherlist-size" is set greater than 0; however
+  the hash takes into account all the data of the cipher list. Note that
+  this sample-fetch is available only
Re: [PATCH] BUG/MEDIUM: ssl: in bind line, ssl-options after 'crt' are ignored.
On Mon, Mar 06, 2017 at 04:50:02PM +0100, Emmanuel Hocdet wrote:
> This fix is for current 1.8dev with "MEDIUM: ssl: remove ssl-options from
> crt-list" applied.

Strangely, it refuses to apply to ssl_sock.c: 14 of 14 hunks rejected. I tried by hand (patch -p1, patch -lp1), same result. I don't understand why; the code looks the same. Maybe mangled spaces? (They shouldn't be mangled, as it's sent as an attachment.) I'd rather avoid having to copy-paste each of them one at a time, so if you could spot what causes this I would appreciate it, because for now I'm stumped. Thanks!

Willy
Re: Capturing browser TLS cipher suites
Hi,

This is the new patch, without the bug; the previous one was tested too quickly.

Thierry

On Mon, 6 Mar 2017 18:30:33 +0100 thierry.fourn...@arpalert.org wrote:
> On Mon, 6 Mar 2017 12:35:47 +0100
> Willy Tarreau wrote:
>
> > Hi Thierry,
> >
> > On Sat, Feb 25, 2017 at 01:01:54PM +0100, thierry.fourn...@arpalert.org
> > wrote:
> > > The patch implementing this idea is in attachment. It returns the
> > > client-hello cipher list as binary, hexadecimal string, xxh64 and with
> > > the decoded ciphers.
> >
> > Is this supposed to be the last version ? I'm asking because it's still
> > bogus regarding the length calculation :
> >
> > > +static inline
> > > +void ssl_sock_parse_clienthello(int write_p, int version, int content_type,
> > > +                                const void *buf, size_t len,
> > > +                                struct ssl_capture *capture)
> > > +{
> > > +	unsigned char *msg;
> > > +	unsigned char *end;
> > > +	unsigned int rec_len;
> > > +
> > > +	/* This function is called for "from client" and "to server"
> > > +	 * connections. The combination of write_p == 0 and content_type == 22
> > > +	 * is only available during "from client" connection.
> > > +	 */
> > > +
> > > +	/* "write_p" is set to 0 if the bytes are received messages,
> > > +	 * otherwise it is set to 1.
> > > +	 */
> > > +	if (write_p != 0)
> > > +		return;
> > > +
> > > +	/* content_type contains the type of message received or sent
> > > +	 * according with the SSL/TLS protocol spec. This message is
> > > +	 * encoded with one byte. The value 256 (two bytes) is used
> > > +	 * for designing the SSL/TLS record layer. According with the
> > > +	 * rfc6101, the expected messages (other than 256) are:
> > > +	 *  - change_cipher_spec(20)
> > > +	 *  - alert(21)
> > > +	 *  - handshake(22)
> > > +	 *  - application_data(23)
> > > +	 *  - (255)
> > > +	 * We are interested by the handshake and specially the client
> > > +	 * hello.
> > > +	 */
> > > +	if (content_type != 22)
> > > +		return;
> > > +
> > > +	/* The message length is at least 4 bytes, containing the
> > > +	 * message type and the message length.
> > > +	 */
> > > +	if (len < 4)
> > > +		return;
> > > +
> > > +	/* First byte of the handshake message is the type of
> > > +	 * message. The known types are:
> > > +	 *  - hello_request(0)
> > > +	 *  - client_hello(1)
> > > +	 *  - server_hello(2)
> > > +	 *  - certificate(11)
> > > +	 *  - server_key_exchange(12)
> > > +	 *  - certificate_request(13)
> > > +	 *  - server_hello_done(14)
> > > +	 * We are interested by the client hello.
> > > +	 */
> > > +	msg = (unsigned char *)buf;
> > > +	if (msg[0] != 1)
> > > +		return;
> > > +
> > > +	/* Next three bytes are the length of the message. The total length
> > > +	 * must be this decoded length + 4. If the length given as argument
> > > +	 * is not the same, we abort the protocol dissector.
> > > +	 */
> > > +	rec_len = (msg[1] << 3) + (msg[2] << 2) + msg[3];
> >
> > Here. The correct statement is :
> >
> >     rec_len = msg[1] * 65536 + msg[2] * 256 + msg[3];
> >
> > (or << 16, << 8)
>
> Arg ! you're right ! Thanks for the review.
>
> > > +	if (len < rec_len + 4)
> > > +		return;
> > > +	msg += 4;
> > > +	end = msg + rec_len;
> > > +	if (end <= msg)
> > > +		return;
> >
> > This one looks wrong as it prevents rec_len from being NULL; the
> > correct overflow test is if (end < msg).
> >
> > > +	/* Expect 2 bytes for protocol version (1 byte for major and 1 byte
> > > +	 * for minor, the random, composed by 4 bytes for the unix time and
> > > +	 * 28 bytes for unix payload, and then 1 byte for the session id. So
> > > +	 * we jump 1 + 1 + 4 + 28 + 1 bytes.
> > > +	 */
> > > +	msg += 1 + 1 + 4 + 28 + 1;
> > > +	if (msg >= end)
> > > +		return;
> >
> > It seems like this one should be "if (msg > end)" given that it accounts
> > for a length. However, given that it's covered by the next one, maybe it
> > can simply be dropped.
> >
> > > +	/* Next two bytes are the ciphersuite length. */
> > > +	if (msg + 2 > end)
> > > +		return;
> > > +	rec_len = (msg[0] << 2) + msg[1];
> >
> > Wrong shift again.
>
> Thanks, a new patch in attachment.
>
> > > +	msg += 2;
> > > +	if (msg + rec_len > end || msg + rec_len < msg)
> > > +		return;
> > > +
> > > +	/* Compute the xxh64 of the ciphersuite. */
> > > +	capture->xxh64 = XXH64(msg, rec_len, 0);
> > > +
> > > +	/* Capture the ciphersuite. */
> > > +	capture->ciphersuite_len = rec_len;
> > > +	if (capture->ciphersuite_len > global_ssl.capture_cipherlist)
> > > +		capture->ciphersuite_len = global_ssl.capture_cipherlist;
> > > +	memcpy(capture->ciphersuite, msg, capture->ciphersuite_len);
> > > +}
> >
> > The rest looks OK though. Just let me know.
> >
> > Thanks,
> > Willy

From c0bf9fcf4e78d65641a589083ddca14377c620fd Mon Sep 17 00:00:00 2001
From: Thierry FOURNIER
Date: Sat, 25 Feb 20
Re: Capturing browser TLS cipher suites
On Mon, Mar 06, 2017 at 06:30:33PM +0100, thierry.fourn...@arpalert.org wrote: > > > + /* Next three bytes are the length of the message. The total length > > > + * must be this decoded length + 4. If the length given as argument > > > + * is not the same, we abort the protocol dissector. > > > + */ > > > + rec_len = (msg[1] << 3) + (msg[2] << 2) + msg[3]; > > > > Here. The correct statement is : > > > > rec_len = msg[1] * 65536 + msg[2] * 256 + msg[3]; > > > > (or << 16, << 8) But Thierry, are you doing it on purpose to annoy me ? It's the third time you get it wrong after I propose the correct version, as you can see with your version below it's still wrong and differs from the two proposed versions above : > + rec_len = (msg[1] << 24) + (msg[2] << 16) + msg[3]; And below : > + if (len < rec_len + 4) > + return; > + msg += 4; > + end = msg + rec_len; > + if (end <= msg) > + return; This one was still not fixed :-( > + > + /* Expect 2 bytes for protocol version (1 byte for major and 1 byte > + * for minor, the random, composed by 4 bytes for the unix time and > + * 28 bytes for unix payload, and them 1 byte for the session id. So > + * we jump 1 + 1 + 4 + 28 + 1 bytes. > + */ > + msg += 1 + 1 + 4 + 28 + 1; > + if (msg >= end) > + return; This one neither :-( > + > + /* Next two bytes are the ciphersuite length. */ > + if (msg + 2 > end) > + return; > + rec_len = (msg[0] << 16) + msg[1]; This one is still wrong as well :-( Please double-check next time, it's time consuming to re-read the same bugs between two versions, each time I have to reread the whole patch. Willy
Re: Capturing browser TLS cipher suites
On Mon, Mar 06, 2017 at 06:30:34PM +0100, thierry.fourn...@arpalert.org wrote: > On Mon, 6 Mar 2017 14:54:44 +0100 > Emmanuel Hocdet wrote: > > xxh64 is not a fingerprint class algorithme, sha256 should be use. > > > Hi Manu, > > My choice is driven regarding these hash algorithm elements: > > - The repartiion of the hash (and the collision risk) > > - The execution time. > > I choosed xxh64 because it is very quick, the repartion is good and the > collision risk is low. Obviously sha1 is better because the collision > risk is very low, but is very slow. So I prefer xxh64. Yep and also in the end we only keep 32 or 64 bit of the resulting hash, we're not doing crypto here. The typical use case is to have a reasonably good indication whether two very large cipher lists are similar or not without storing them. Willy
Re: Capturing browser TLS cipher suites
On Mon, 6 Mar 2017 12:35:47 +0100 Willy Tarreau wrote: > Hi Thierry, > > On Sat, Feb 25, 2017 at 01:01:54PM +0100, thierry.fourn...@arpalert.org wrote: > > The patch implementing this idea is in attachment. It returns the > > client-hello cioher list as binary, hexadecimal string, xxh64 and with > > the decoded ciphers. > > Is this supposed to be the last version ? I'm asking because it's still > bogus regarding the length calculation : > > > +static inline > > +void ssl_sock_parse_clienthello(int write_p, int version, int content_type, > > +const void *buf, size_t len, > > +struct ssl_capture *capture) > > +{ > > + unsigned char *msg; > > + unsigned char *end; > > + unsigned int rec_len; > > + > > + /* This function is called for "from client" and "to server" > > +* connections. The combination of write_p == 0 and content_type == 22 > > +* is only avalaible during "from client" connection. > > +*/ > > + > > + /* "write_p" is set to 0 is the bytes are received messages, > > +* otherwise it is set to 1. > > +*/ > > + if (write_p != 0) > > + return; > > + > > + /* content_type contains the type of message received or sent > > +* according with the SSL/TLS protocol spec. This message is > > +* encoded with one byte. The value 256 (two bytes) is used > > +* for designing the SSL/TLS record layer. According with the > > +* rfc6101, the expected message (other than 256) are: > > +* - change_cipher_spec(20) > > +* - alert(21) > > +* - handshake(22) > > +* - application_data(23) > > +* - (255) > > +* We are interessed by the handshake and specially the client > > +* hello. > > +*/ > > + if (content_type != 22) > > + return; > > + > > + /* The message length is at least 4 bytes, containing the > > +* message type and the message length. > > +*/ > > + if (len < 4) > > + return; > > + > > + /* First byte of the handshake message id the type of > > +* message. 
The konwn types are: > > +* - hello_request(0) > > +* - client_hello(1) > > +* - server_hello(2) > > +* - certificate(11) > > +* - server_key_exchange (12) > > +* - certificate_request(13) > > +* - server_hello_done(14) > > +* We are interested by the client hello. > > +*/ > > + msg = (unsigned char *)buf; > > + if (msg[0] != 1) > > + return; > > + > > + /* Next three bytes are the length of the message. The total length > > +* must be this decoded length + 4. If the length given as argument > > +* is not the same, we abort the protocol dissector. > > +*/ > > + rec_len = (msg[1] << 3) + (msg[2] << 2) + msg[3]; > > Here. The correct statement is : > > rec_len = msg[1] * 65536 + msg[2] * 256 + msg[3]; > > (or << 16, << 8) Arg ! you're right ! Thanks for the review. > > > > + if (len < rec_len + 4) > > + return; > > + msg += 4; > > + end = msg + rec_len; > > + if (end <= msg) > > + return; > > This one looks wrong as it prevents rec_len from being NULL, the > correct overflow test is if (end < msg). > > > + /* Expect 2 bytes for protocol version (1 byte for major and 1 byte > > +* for minor, the random, composed by 4 bytes for the unix time and > > +* 28 bytes for unix payload, and them 1 byte for the session id. So > > +* we jump 1 + 1 + 4 + 28 + 1 bytes. > > +*/ > > + msg += 1 + 1 + 4 + 28 + 1; > > + if (msg >= end) > > + return; > > It seems like this one should be "if (msg > end)" given that it accounts for > a length. However given that it's covered by the next one, maybe it can > simply be dropped. > > > + /* Next two bytes are the ciphersuite length. */ > > + if (msg + 2 > end) > > + return; > > + rec_len = (msg[0] << 2) + msg[1]; > > Wrong shift again. Thanks, a new patch in attachment. > > > + msg += 2; > > + if (msg + rec_len > end || msg + rec_len < msg) > > + return; > > + > > + /* Compute the xxh64 of the ciphersuite. */ > > + capture->xxh64 = XXH64(msg, rec_len, 0); > > + > > + /* Capture the ciphersuite. 
*/ > > + capture->ciphersuite_len = rec_len; > > + if (capture->ciphersuite_len > global_ssl.capture_cipherlist) > > + capture->ciphersuite_len = global_ssl.capture_cipherlist; > > + memcpy(capture->ciphersuite, msg, capture->ciphersuite_len); > > +} > > + > > The rest looks OK though. Just let me know. > > Thanks, > Willy > >From ba0e6354cbfd6d70c26eacfc7010a9735aaf90c2 Mon Sep 17 00:00:00 2001 From: Thierry FOURNIER Date: Sat, 25 Feb 2017 12:45:22 +0100 Subject: [PATCH 2/2] MEDIUM: ssl: add new sample-fetch which captures the cipherlist This new sample-fetch captures the cipher list offered by the client SSL connection during the client-hello phase. This is useful
Re: Capturing browser TLS cipher suites
On Mon, 6 Mar 2017 14:54:44 +0100 Emmanuel Hocdet wrote: > Hi Thierry > > > Le 25 févr. 2017 à 13:01, thierry.fourn...@arpalert.org a écrit : > > > > Hi all, > > > > On Thu, 9 Feb 2017 07:37:51 +0100 > > Willy Tarreau wrote: > > > >> Hi Olivier, > >> > >> On Sat, Feb 04, 2017 at 11:52:30AM +0100, Olivier Doucet wrote: > >>> Hello, > >>> > >>> I'm trying to capture the cipher suites sent by browser when negociating > >>> the encryption level with HAProxy. > >>> Digging into the haproxy doc, I can already find the TLS version and > >>> cipher > >>> used (variables %sslc and %sslv), but not the complete list of ciphers > >>> sent > >>> by the browser. > >>> > >>> Why such information ? This could be used as a method of fingerprintin ! > >>> For example, finding malware that emulates a browser. Such malwares could > >>> be spotted by comparing the user-agent field (on http level) with the > >>> cipher suites used (and how the are ordered) and see if they match. An > >>> example of implementation could be found here : > >>> https://www.securityartwork.es/2017/02/02/tls-client-fingerprinting-with-bro/ > >> > >> That's an interesting idea! I'm not sure how accurate it can be since > >> users can change their ciphers in their browser's config, and even the > >> list of negociated TLS versions (I do it personally). > > > > > > Yes, it is interesting ! > > > > > >>> Is this even possible with HAProxy ? > >> > >> I'm not sure. I don't even know if openssl exposes this. However if you > >> want to do this on the TCP connection only (without deciphering), you > >> could possibly extend the SSL client hello parser to emit the list of > >> such ciphers as a string. > > > > > > The patch implementing this idea is in attachment. It returns the > > client-hello cioher list as binary, hexadecimal string, xxh64 and with > > the decoded ciphers. > > xxh64 is not a fingerprint class algorithme, sha256 should be use. 
Hi Manu, my choice was driven by these properties of the hash algorithms: - the distribution of the hash (and the collision risk) - the execution time. I chose xxh64 because it is very fast, its distribution is good and the collision risk is low. Obviously sha1 is better because its collision risk is very low, but it is very slow. So I prefer xxh64. I think a low collision risk is acceptable because this hash is simply used to differentiate two different stacks coming from two different client devices. If an attacker found two different cipher lists with the same xxh64, I wouldn't care because this is not a risk. Maybe I could make this choice configurable, but it impacts the memory allocation and implies a new "tune.*" directive. Thierry > Manu > > >
[PATCH] BUG/MEDIUM: ssl: in bind line, ssl-options after 'crt' are ignored.
This fix is for current 1.8dev with "MEDIUM: ssl: remove ssl-options from crt-list" applied. 0001-BUG-MEDIUM-ssl-in-bind-line-ssl-options-after-crt-ar.patch Description: Binary data
Re: pre-connect header problem
On 06/03/2017 14:45, Simon E. Silva Lauinger wrote: bind *:443 name *:443 ssl crt /path/to/cert.pem mode tcp Did you also try with mode http on the frontend? .marcoc
Re: HTTP 429 Too Many Requests (tarpit deny_status)
Hi Willy, On Fri, Feb 10, Willy Tarreau wrote: > > How should I send the patches ? One commit for > > http_server_error/http_get_status_idx changes and tarpit deny_status > > parser / doc in another commit ? > > Yes that's the prefered way to do it, one commit per architecture or > functional change to ease review and bug tracking later. I'm including two commits for this feature. - 0001-MEDIUM-http_error_message-txn-status-http_get_status.patch Removes second argument from http_error_message and uses txn->status / http_get_status_idx to map the 200..504 status code to HTTP_ERR_200..504 enum. (http_get_status_idx has default return value of HTTP_ERR_500, is this ok ?) - 0002-MINOR-http-request-tarpit-deny_status.patch Adds http-request tarpit deny_status functionality. (depends on the 0001-... commit). Alternative implementation (0001-) for http_return_srv_error could be something like this: void http_return_srv_error(struct stream *s, struct stream_interface *si) { int err_type = si->err_type; int send_msg = 1; if (err_type & SI_ET_QUEUE_ABRT) http_server_error(s, si, SF_ERR_CLICL, SF_FINST_Q, 503, NULL); else if (err_type & SI_ET_CONN_ABRT) { if (s->txn->flags & TX_NOT_FIRST) send_msg = 0; http_server_error(s, si, SF_ERR_CLICL, SF_FINST_C, 503, NULL); } ... ... else /* SI_ET_CONN_OTHER and others */ http_server_error(s, si, SF_ERR_INTERNAL, SF_FINST_C, 500, NULL); if (send_msg) bo_putchk(s->res.buf, http_error_message(s)); } -Jarno -- Jarno Huuskonen >From 1f8f67d28d44e1fc43a255f61b70437dc5fdacbb Mon Sep 17 00:00:00 2001 From: Jarno Huuskonen Date: Mon, 6 Mar 2017 14:21:49 +0200 Subject: [PATCH 1/2] MEDIUM: http_error_message: txn->status / http_get_status_idx. X-Bogosity: Ham, tests=bogofilter, spamicity=0.00, version=1.2.4 This commit removes second argument(msgnum) from http_error_message and changes http_error_message to use s->txn->status/http_get_status_idx for mapping status code from 200..504 to HTTP_ERR_200..HTTP_ERR_504(enum). 
This is needed for http-request tarpit deny_status commit. --- include/proto/proto_http.h | 2 +- src/filters.c | 2 +- src/proto_http.c | 93 ++ 3 files changed, 62 insertions(+), 35 deletions(-) diff --git a/include/proto/proto_http.h b/include/proto/proto_http.h index 6c81766..9409df3 100644 --- a/include/proto/proto_http.h +++ b/include/proto/proto_http.h @@ -136,7 +136,7 @@ struct act_rule *parse_http_res_cond(const char **args, const char *file, int li void free_http_req_rules(struct list *r); void free_http_res_rules(struct list *r); void http_reply_and_close(struct stream *s, short status, struct chunk *msg); -struct chunk *http_error_message(struct stream *s, int msgnum); +struct chunk *http_error_message(struct stream *s); struct redirect_rule *http_parse_redirect_rule(const char *file, int linenum, struct proxy *curproxy, const char **args, char **errmsg, int use_fmt, int dir); int smp_fetch_cookie(const struct arg *args, struct sample *smp, const char *kw, void *private); diff --git a/src/filters.c b/src/filters.c index 9ec794a..cafc449 100644 --- a/src/filters.c +++ b/src/filters.c @@ -1069,7 +1069,7 @@ handle_analyzer_result(struct stream *s, struct channel *chn, http_reply_and_close(s, s->txn->status, NULL); else { s->txn->status = 400; - http_reply_and_close(s, 400, http_error_message(s, HTTP_ERR_400)); + http_reply_and_close(s, 400, http_error_message(s)); } } diff --git a/src/proto_http.c b/src/proto_http.c index 2d567c1..9239823 100644 --- a/src/proto_http.c +++ b/src/proto_http.c @@ -366,6 +366,26 @@ const char *get_reason(unsigned int status) } } +/* This function returns HTTP_ERR_ (enum) matching http status code. + * Returned value should match codes from http_err_codes. 
+ */ +static const int http_get_status_idx(unsigned int status) +{ + switch (status) { + case 200: return HTTP_ERR_200; + case 400: return HTTP_ERR_400; + case 403: return HTTP_ERR_403; + case 405: return HTTP_ERR_405; + case 408: return HTTP_ERR_408; + case 429: return HTTP_ERR_429; + case 500: return HTTP_ERR_500; + case 502: return HTTP_ERR_502; + case 503: return HTTP_ERR_503; + case 504: return HTTP_ERR_504; + default: return HTTP_ERR_500; + } +} + void init_proto_http() { int i; @@ -1031,10 +1051,10 @@ static void http_server_error(struct stream *s, struct stream_interface *si, channel_erase(si_oc(si)); channel_auto_close(si_ic(si)); channel_auto_read(si_ic(si)); - if (status > 0 && msg) { + if (status > 0) s->txn->status = status; + if (msg)
Re: Capturing browser TLS cipher suites
Hi Thierry > Le 25 févr. 2017 à 13:01, thierry.fourn...@arpalert.org a écrit : > > Hi all, > > On Thu, 9 Feb 2017 07:37:51 +0100 > Willy Tarreau wrote: > >> Hi Olivier, >> >> On Sat, Feb 04, 2017 at 11:52:30AM +0100, Olivier Doucet wrote: >>> Hello, >>> >>> I'm trying to capture the cipher suites sent by browser when negociating >>> the encryption level with HAProxy. >>> Digging into the haproxy doc, I can already find the TLS version and cipher >>> used (variables %sslc and %sslv), but not the complete list of ciphers sent >>> by the browser. >>> >>> Why such information ? This could be used as a method of fingerprintin ! >>> For example, finding malware that emulates a browser. Such malwares could >>> be spotted by comparing the user-agent field (on http level) with the >>> cipher suites used (and how the are ordered) and see if they match. An >>> example of implementation could be found here : >>> https://www.securityartwork.es/2017/02/02/tls-client-fingerprinting-with-bro/ >> >> That's an interesting idea! I'm not sure how accurate it can be since >> users can change their ciphers in their browser's config, and even the >> list of negociated TLS versions (I do it personally). > > > Yes, it is interesting ! > > >>> Is this even possible with HAProxy ? >> >> I'm not sure. I don't even know if openssl exposes this. However if you >> want to do this on the TCP connection only (without deciphering), you >> could possibly extend the SSL client hello parser to emit the list of >> such ciphers as a string. > > > The patch implementing this idea is in attachment. It returns the > client-hello cioher list as binary, hexadecimal string, xxh64 and with > the decoded ciphers. xxh64 is not a fingerprint class algorithme, sha256 should be use. Manu
pre-connect header problem
Hello HAProxy community, I have a problem with Chrome's pre-connect feature in conjunction with haproxy's "choose-backend-by-acl" feature. I really hope somebody can help me. Here is a simplified version of my config: frontend https bind *:443 name *:443 ssl crt /path/to/cert.pem mode tcp acl acl_58a704192fda06.42911106 hdr_end(host) -i .example.net use_backend _.example.net if acl_58a704192fda06.42911106 acl acl_58ad6d70f2d8f9.76023768 hdr(host) -i example.net use_backend example.net if acl_58ad6d70f2d8f9.76023768 backend _.example.net mode http server _.example.net 1.2.3.4:443 ssl verify none backend example.net mode http server example.net 5.6.7.8:443 ssl verify none The problem is that in Chrome random files are not being received. I think it has to do with Chrome's pre-connect "feature": Chrome opens an HTTP connection without sending anything, not even an HTTP header. HAProxy tries to assign the request to a backend using the unavailable Host header, so it fails. Does somebody know how to deal with this problem? Can I tell HAProxy to wait for data before it tries to assign the request to a backend? Thank you for your help! Regards, Simon
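One direction that might help (an untested suggestion on my part): in mode http, haproxy buffers until a complete request has been received before evaluating hdr() ACLs, and `timeout http-request` quietly closes pre-connects that never send one. A sketch of the idea, with the ACL name shortened for readability:

```
frontend https
    bind *:443 name *:443 ssl crt /path/to/cert.pem
    mode http
    # drop speculative pre-connects that never send a request
    timeout http-request 10s
    acl acl_wildcard hdr_end(host) -i .example.net
    use_backend _.example.net if acl_wildcard
```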
Re: Capturing browser TLS cipher suites
Hi Thierry, On Sat, Feb 25, 2017 at 01:01:54PM +0100, thierry.fourn...@arpalert.org wrote: > The patch implementing this idea is in attachment. It returns the > client-hello cioher list as binary, hexadecimal string, xxh64 and with > the decoded ciphers. Is this supposed to be the last version ? I'm asking because it's still bogus regarding the length calculation : > +static inline > +void ssl_sock_parse_clienthello(int write_p, int version, int content_type, > +const void *buf, size_t len, > +struct ssl_capture *capture) > +{ > + unsigned char *msg; > + unsigned char *end; > + unsigned int rec_len; > + > + /* This function is called for "from client" and "to server" > + * connections. The combination of write_p == 0 and content_type == 22 > + * is only avalaible during "from client" connection. > + */ > + > + /* "write_p" is set to 0 is the bytes are received messages, > + * otherwise it is set to 1. > + */ > + if (write_p != 0) > + return; > + > + /* content_type contains the type of message received or sent > + * according with the SSL/TLS protocol spec. This message is > + * encoded with one byte. The value 256 (two bytes) is used > + * for designing the SSL/TLS record layer. According with the > + * rfc6101, the expected message (other than 256) are: > + * - change_cipher_spec(20) > + * - alert(21) > + * - handshake(22) > + * - application_data(23) > + * - (255) > + * We are interessed by the handshake and specially the client > + * hello. > + */ > + if (content_type != 22) > + return; > + > + /* The message length is at least 4 bytes, containing the > + * message type and the message length. > + */ > + if (len < 4) > + return; > + > + /* First byte of the handshake message id the type of > + * message. The konwn types are: > + * - hello_request(0) > + * - client_hello(1) > + * - server_hello(2) > + * - certificate(11) > + * - server_key_exchange (12) > + * - certificate_request(13) > + * - server_hello_done(14) > + * We are interested by the client hello. 
> + */ > + msg = (unsigned char *)buf; > + if (msg[0] != 1) > + return; > + > + /* Next three bytes are the length of the message. The total length > + * must be this decoded length + 4. If the length given as argument > + * is not the same, we abort the protocol dissector. > + */ > + rec_len = (msg[1] << 3) + (msg[2] << 2) + msg[3]; Here. The correct statement is : rec_len = msg[1] * 65536 + msg[2] * 256 + msg[3]; (or << 16, << 8) > + if (len < rec_len + 4) > + return; > + msg += 4; > + end = msg + rec_len; > + if (end <= msg) > + return; This one looks wrong as it prevents rec_len from being NULL, the correct overflow test is if (end < msg). > + /* Expect 2 bytes for protocol version (1 byte for major and 1 byte > + * for minor, the random, composed by 4 bytes for the unix time and > + * 28 bytes for unix payload, and them 1 byte for the session id. So > + * we jump 1 + 1 + 4 + 28 + 1 bytes. > + */ > + msg += 1 + 1 + 4 + 28 + 1; > + if (msg >= end) > + return; It seems like this one should be "if (msg > end)" given that it accounts for a length. However given that it's covered by the next one, maybe it can simply be dropped. > + /* Next two bytes are the ciphersuite length. */ > + if (msg + 2 > end) > + return; > + rec_len = (msg[0] << 2) + msg[1]; Wrong shift again. > + msg += 2; > + if (msg + rec_len > end || msg + rec_len < msg) > + return; > + > + /* Compute the xxh64 of the ciphersuite. */ > + capture->xxh64 = XXH64(msg, rec_len, 0); > + > + /* Capture the ciphersuite. */ > + capture->ciphersuite_len = rec_len; > + if (capture->ciphersuite_len > global_ssl.capture_cipherlist) > + capture->ciphersuite_len = global_ssl.capture_cipherlist; > + memcpy(capture->ciphersuite, msg, capture->ciphersuite_len); > +} > + The rest looks OK though. Just let me know. Thanks, Willy
Re: Problems with haproxy 1.7.3 on FreeBSD 11.0-p8
with the attachment now (thanks Dmitry) On Mon, Mar 06, 2017 at 10:44:56AM +0100, Willy Tarreau wrote: > On Mon, Mar 06, 2017 at 09:59:21AM +0100, Matthias Fechner wrote: > > Hi Georg, > > > > Am 06.03.2017 um 09:43 schrieb Georg Faerber: > > > I'm not running FreeBSD myself, but have a look at [1]: In the > > > follow-ups to this thread there are two more people reporting problems. > > > > > > [1] https://www.mail-archive.com/haproxy@formilux.org/msg25093.html > > > > no, this cannot be the problem, because this error reported in [1] is > > related to haproxy version 1.7.2. > > Since we don't know what causes the issue above, we could end up discovering > that there's a behaviour change that depends on the workload. > > > My problem is related to 1.7.3. The problem was introduced by a change > > for 1.7.3. as 1.7.2 is running fine. > > Could you retry 1.7.3 by reverting the attached patch ? I don't see > why it would cause any trouble but that's the only likely candidate > I'm seeing between 1.7.2 and 1.7.3. If it fixes it, it may indicate > an issue with our implementation of the kqueue poller, possibly > explaining the issues other people have reported with fbsd 11 vs 10. > > Also any hint you can provide to help people reproduce it would be welcome! > > Thanks, > Willy >From cd4c5a3ecf5e77fb4734c423c914f7280199c763 Mon Sep 17 00:00:00 2001 From: Willy Tarreau Date: Wed, 25 Jan 2017 14:12:22 +0100 Subject: BUG/MEDIUM: tcp: don't poll for write when connect() succeeds X-Bogosity: Ham, tests=bogofilter, spamicity=0.00, version=1.2.4 While testing a tcp_fastopen related change, it appeared that in the rare case where connect() can immediately succeed, we still subscribe to write notifications on the socket, causing the conn_fd_handler() to immediately be called and a second call to connect() to be attempted to double-check the connection. 
In fact this issue had already been met with unix sockets (which often respond immediately) and partially addressed but incorrect so another patch will follow. But for TCP nothing was done. The fix consists in removing the WAIT_L4_CONN flag if connect() succeeds and to subscribe for writes only if some handshakes or L4_CONN are still needed. In addition in order not to fail raw TCP health checks, we have to continue to enable polling for data when nothing is scheduled for leaving and the connection is already established, otherwise the caller will never be notified. This fix should be backported to 1.7 and 1.6. (cherry picked from commit 819efbf4b532d718abeb5e5aa6b2521ed725fe17) --- src/proto_tcp.c | 30 +- 1 file changed, 25 insertions(+), 5 deletions(-) diff --git a/src/proto_tcp.c b/src/proto_tcp.c index f6d8ca1..c04f276 100644 --- a/src/proto_tcp.c +++ b/src/proto_tcp.c @@ -474,10 +474,16 @@ int tcp_connect_server(struct connection *conn, int data, int delack) if (global.tune.server_rcvbuf) setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &global.tune.server_rcvbuf, sizeof(global.tune.server_rcvbuf)); - if ((connect(fd, (struct sockaddr *)&conn->addr.to, get_addr_len(&conn->addr.to)) == -1) && - (errno != EINPROGRESS) && (errno != EALREADY) && (errno != EISCONN)) { - - if (errno == EAGAIN || errno == EADDRINUSE || errno == EADDRNOTAVAIL) { + if (connect(fd, (struct sockaddr *)&conn->addr.to, get_addr_len(&conn->addr.to)) == -1) { + if (errno == EINPROGRESS || errno == EALREADY) { + /* common case, let's wait for connect status */ + conn->flags |= CO_FL_WAIT_L4_CONN; + } + else if (errno == EISCONN) { + /* should normally not happen but if so, indicates that it's OK */ + conn->flags &= ~CO_FL_WAIT_L4_CONN; + } + else if (errno == EAGAIN || errno == EADDRINUSE || errno == EADDRNOTAVAIL) { char *msg; if (errno == EAGAIN || errno == EADDRNOTAVAIL) { msg = "no free ports"; @@ -514,6 +520,10 @@ int tcp_connect_server(struct connection *conn, int data, int delack) return 
SF_ERR_SRVCL; } } + else { + /* connect() == 0, this is great! */ + conn->flags &= ~CO_FL_WAIT_L4_CONN; + } conn->flags |= CO_FL_ADDR_TO_SET; @@ -523,7 +533,6 @@ int tcp_connect_server(struct connection *conn, int data, int delack) conn_ctrl_init(conn); /* registers the FD */ fdtab[fd].linger_risk = 1; /* close hard if needed */ - conn_sock_want_send(conn); /* for connect status */ if (conn_xprt_init(conn) < 0) { conn_force_close(conn); @@ -531,6 +540,17 @@ int tcp_connect_server(struct connection *conn, int data, int delack) return SF_ERR_RESOURCE; } + if (conn->flags & (CO_FL_HA
Re: Client Cert Improvements
> Le 4 mars 2017 à 15:03, mlist a écrit : > For those first 3 points we don't need renegotiation. > Current implementation is buggy, but once we merge: "BUG/MEDIUM: ssl: fix verify/ca-file per certificate" > all those issues will be addressed, without complex workarounds or multiple IPs. > > For 2.Allow to have the default cert to use for non-SNI client for the > same domain used also for client certificate request > > I done a test that demonstrates a not working behavior. > Emmanuel write me why. See test case and Emmanuel answer: > > My test Case and question: > . haproxy.conf: > bind :443 ssl crt-list /etc/haproxy/crt-list.txt > > . crtlist.cfg: > /cert1.pem [ca-file //ca1.pem ca-file > //ca1.pem verify optional] > /cert2.pem > /cert3.pem > > but any request for any domain for any hostname pop-up > on the client side client certificate selection window > popup selection is presented to client also for domain > not in cert1.pem but in cert2.pem and cert3.pem. > Also: what is the default certificate for not-SNI > client if one use crt-list file instead of crt on bind line ? (without > crt-list file is the first crt in the bind line) > > Emmanuel answer: > The default cert is always the first cert parsed. It's > cert1.pem in your configuration. > The default cert is a source of errors because it's > used in the SSL negotiation. > The [ca-file verify optional] is also present in > the SSL negotiation, the switch to the correct cert will not override it. > => You must move the cert1.pem later in your > configuration and let the default cert as neutral as possible. > > It's a open problem with openssl. I have trying to > create a neutral SSL context (without any certificat) before select the > certificat, but openssl don't like that. > Without a real solution, this behaviour should be > documented. > The fix is in last 1.8dev ( « BUG/MEDIUM: ssl: fix verify/ca-file per certificate » )
Re: Problems with haproxy 1.7.3 on FreeBSD 11.0-p8
On Mon, Mar 06, 2017 at 09:59:21AM +0100, Matthias Fechner wrote: > Hi Georg, > > Am 06.03.2017 um 09:43 schrieb Georg Faerber: > > I'm not running FreeBSD myself, but have a look at [1]: In the > > follow-ups to this thread there are two more people reporting problems. > > > > [1] https://www.mail-archive.com/haproxy@formilux.org/msg25093.html > > no, this cannot be the problem, because this error reported in [1] is > related to haproxy version 1.7.2. Since we don't know what causes the issue above, we could end up discovering that there's a behaviour change that depends on the workload. > My problem is related to 1.7.3. The problem was introduced by a change > for 1.7.3. as 1.7.2 is running fine. Could you retry 1.7.3 by reverting the attached patch ? I don't see why it would cause any trouble but that's the only likely candidate I'm seeing between 1.7.2 and 1.7.3. If it fixes it, it may indicate an issue with our implementation of the kqueue poller, possibly explaining the issues other people have reported with fbsd 11 vs 10. Also any hint you can provide to help people reproduce it would be welcome! Thanks, Willy
Re: [RFC PATCH] MEDIUM: persistent connections for SSL checks
Hi Steven, On Wed, Mar 01, 2017 at 04:03:17PM -0800, Steven Davidovitz wrote: > Having hundreds of HTTP SSL health checks leads to CPU saturation. > This patch allows HTTP health checks without any http-expect directives > to keep the connection open for subsequent health checks. This patch > also does not affect any TCP check code. I think something like this could possibly work, but at least the persistent setting should definitely be an option. Indeed, for many people, checking the connection is as important if not more as testing the fact that the service works behind. I can give you some examples, such as if you check another haproxy, this last one will never quit upon a reload or soft-stop, so your health checks will continuously check the old process and will not detect a crash of the new one which listens to the connections. We could imagine having a more general option (per server, per backend?) to indicate that HTTP checks want to be performed on persistent connections, not just the SSL ones. In fact we could specify how many consecutive checks are allowed over a persistent connection before renewing the connection. That would cover your use case, allow users to set a reasonable counter to ensure that after a few checks, the listener is properly tested, and may be useful to some users even with pure HTTP (eg: less logs on intermediate firewalls). Also it is not normal at all that SSL checks lead to CPU saturation. Normally, health checks are expected to store the last SSL_CTX in the server struct for later reuse, leading to a TLS resume connection. There is one case where this doesn't work which is when SNI is being used on the server lines. Is this your case ? If so a better solution would be to have at least two (possibly a bit more) SSL_CTX per server as was mentionned in commit 119a408 ("BUG/MEDIUM: ssl: for a handshake when server-side SNI changes"). 
This would improve both check performance and production traffic performance by avoiding renegotiations on SNI changes between two consecutive connections. BTW, very good job for a first submission! Cheers, Willy
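To check whether a server actually resumes TLS sessions (the mechanism the check path relies on to avoid full handshakes), openssl s_client's -reconnect mode connects once and then reconnects five times attempting to reuse the session. A self-contained local sketch using a throwaway self-signed certificate and an arbitrary port (14433); against a real backend you would point it at the server address instead:

```shell
set -e
dir=$(mktemp -d)
cd "$dir"
# Throwaway self-signed certificate for a local test server
openssl req -x509 -newkey rsa:2048 -nodes -keyout key.pem -out cert.pem \
        -subj /CN=localhost -days 1 2>/dev/null
# s_server caches sessions by default
openssl s_server -accept 14433 -key key.pem -cert cert.pem -quiet &
srv=$!
sleep 1
# -reconnect: one handshake, then five reconnects trying to resume.
# TLSv1.2 is forced so resumption via session ID is straightforward;
# "Reused," lines in the output indicate that resumption worked.
openssl s_client -connect 127.0.0.1:14433 -tls1_2 -reconnect \
        < /dev/null 2>/dev/null | grep -E '^(New|Reused),' > result.txt || true
kill "$srv" 2>/dev/null || true
cat result.txt
```

If the server does not resume sessions, every line reads "New," and each check pays a full handshake, which matches the CPU saturation symptom described above.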
Re: [PATCH] BUILD: ssl: fix build with -DOPENSSL_NO_DH
On Mon, Mar 06, 2017 at 10:13:31AM +0100, Willy Tarreau wrote: > On Fri, Mar 03, 2017 at 05:12:55PM +0100, Emmanuel Hocdet wrote: > > Build without DH support is broken. This fix is for 1.8dev. > > It significantly reduces the size and initial memory footprint of haproxy. > > Hmmm this one does not apply :-( Finally it applied after I applied MINOR-ssl-removes-SSL_CTX_set_ssl_version... Thanks, Willy
Re: [PATCH 1/2] MINOR: ssl: isolate SSL_CTX_new with initial negotiation environnement
On Fri, Mar 03, 2017 at 01:28:40PM +0100, Emmanuel Hocdet wrote: > New version of this patch. > Little cleanup but much better comment. Applied, thanks Manu. Willy
Re: [PATCH] BUILD: ssl: fix build with -DOPENSSL_NO_DH
On Fri, Mar 03, 2017 at 05:12:55PM +0100, Emmanuel Hocdet wrote: > Build without DH support is broken. This fix is for 1.8dev. > It significantly reduces the size and initial memory footprint of haproxy. Hmmm this one does not apply :-( Willy
Re: openssl-1.1 SNI callback causing client failures
On Fri, Mar 03, 2017 at 03:55:05PM +0100, Emmanuel Hocdet wrote: > Patch candidate to merge in 1.8dev. > I think this patch should be backported, at least to versions compatible with > openssl-1.1.0. Applied, thanks Manu! Willy
Re: Problems with haproxy 1.7.3 on FreeBSD 11.0-p8
On 2017-03-06 10:05, Matthias Fechner wrote: Dear Rainer, I opened a bug report here: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=217576 I have only one server already upgraded to FreeBSD 11. The 10.3 installations are running fine with haproxy 1.7.3. Thanks!
Re: Problems with haproxy 1.7.3 on FreeBSD 11.0-p8
Dear Rainer, On 06.03.2017 at 09:52, rai...@ultra-secure.de wrote: > it would be cool if somebody could open a PR at > > https://bugs.freebsd.org/ > > I personally don't use FreeBSD 11 for any of my HAProxy-installations > (yet), so I'm not really affected (yet) - but thanks for the heads-up. I opened a bug report here: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=217576 I have only one server already upgraded to FreeBSD 11. The 10.3 installations are running fine with haproxy 1.7.3. Regards, Matthias -- "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the universe trying to produce bigger and better idiots. So far, the universe is winning." -- Rich Cook
Re: Problems with haproxy 1.7.3 on FreeBSD 11.0-p8
Hi Georg, On 06.03.2017 at 09:43, Georg Faerber wrote: > I'm not running FreeBSD myself, but have a look at [1]: In the > follow-ups to this thread there are two more people reporting problems. > > [1] https://www.mail-archive.com/haproxy@formilux.org/msg25093.html No, this cannot be the same problem, because the error reported in [1] relates to haproxy version 1.7.2. My problem is related to 1.7.3: it was introduced by a change in 1.7.3, as 1.7.2 is running fine. Regards, Matthias
Re: Problems with haproxy 1.7.3 on FreeBSD 11.0-p8
Hi, it would be cool if somebody could open a PR at https://bugs.freebsd.org/ I personally don't use FreeBSD 11 for any of my HAProxy-installations (yet), so I'm not really affected (yet) - but thanks for the heads-up. Regards, Rainer
Re: Problems with haproxy 1.7.3 on FreeBSD 11.0-p8
Hi Matthias, On 17-03-06 09:34:07, Matthias Fechner wrote: > are problems with haproxy 1.7.3 on FreeBSD 11.0-p8 known? I'm not running FreeBSD myself, but have a look at [1]: In the follow-ups to this thread there are two more people reporting problems. Cheers, Georg [1] https://www.mail-archive.com/haproxy@formilux.org/msg25093.html
Problems with haproxy 1.7.3 on FreeBSD 11.0-p8
Dear all,

are problems with haproxy 1.7.3 on FreeBSD 11.0-p8 known? I see a lot of timeouts for all websites that are behind haproxy. Haproxy terminates the SSL connections and forwards to nginx. In front of haproxy I have sslh running. Downgrading to version 1.7.2 fixed the problem.

Here is my config (I removed some fqdns and the username and password):

global
    maxconn 2048
    user haproxy
    group haproxy
    daemon
    tune.ssl.default-dh-param 2048
    # logging
    ulimit-n 65536
    #log /var/run/log local0 info
    log /var/run/log local0 err
    # Configure ciphers to not use, see https://mozilla.github.io/server-side-tls/ssl-config-generator/
    ssl-default-bind-ciphers ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:ECDHE-ECDSA-DES-CBC3-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:!DSS
    ssl-default-bind-options no-sslv3 no-tls-tickets
    ssl-default-server-ciphers ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:ECDHE-ECDSA-DES-CBC3-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:!DSS
    ssl-default-server-options no-sslv3 no-tls-tickets

defaults
    mode http
    log global
    option httplog
    #option dontlog-normal
    timeout connect 80s
    timeout client 80s
    timeout server 80s
    #timeout check 1s
    #timeout http-keep-alive 1s
    #timeout http-request 400s  # slowloris protection
    option forwardfor
    option http-server-close
    default-server inter 3s fall 2 rise 2 slowstart 60s
    compression algo gzip
    compression type text/html text/plain text/css

frontend www-http
    bind *:80
    redirect scheme https code 301 if !{ ssl_fc }
    reqadd X-Forwarded-Proto:\ http
    default_backend nginx-backend

frontend www-https
    mode tcp
    bind 192.168.0.251:8443 ssl crt /usr/local/etc/haproxy/certs/ alpn h2,http/1.1
    bind 192.168.200.6:8443 ssl crt /usr/local/etc/haproxy/certs/ alpn h2,http/1.1
    bind localhost:443 ssl crt /usr/local/etc/haproxy/certs/ alpn h2,http/1.1
    bind 127.0.0.1:443 ssl crt /usr/local/etc/haproxy/certs/ alpn h2,http/1.1
    acl use_nginx ssl_fc_sni -i fqdn1 fqdn2
    acl http2 ssl_fc_alpn -i h2
    use_backend nginx-http2-backend if http2
    use_backend nginx-http-backend if use_nginx
    default_backend nginx-http-backend

backend nginx-backend
    server www-1 127.0.0.1:8082 check send-proxy

backend nginx-http2-backend
    mode tcp
    server www-1 127.0.0.1:8083 check send-proxy

backend nginx-http-backend
    mode tcp
    server www-1 127.0.0.1:8082 check send-proxy

frontend haproxy-stats
    bind 192.168.0.251:9001
    mode http
    stats enable
    stats hide-version
    stats realm Haproxy\ Statistics
    stats uri /haproxy_stats
    stats auth :

Regards, Matthias