Re: Few problems seen in haproxy? (threads, connections).
On Thu, Oct 11, 2018 at 08:18:21AM +0530, Krishna Kumar (Engineering) wrote:
> Hi Willy,
>
> Thank you very much for the in-depth analysis and configuration setting
> suggestions.
> I believe I have got the 3 key items to continue based on your mail:
>
> 1. Thread pinning
> 2. Fix system irq pinning accordingly
> 3. Listen on all threads.
>
> I will post the configuration changes and the results.

By the way, please pull the latest master fixes. I've addressed two issues there with locking :
  - one where the scheduler work was slightly too high, increasing the time spent on RQ lock
  - another one where I messed up on a fix, causing lock-free pools to be disabled (as seen in your output, where the POOL lock appears a lot)

On some tests I've run here, I've found the stick-tables lock to be a bottleneck when tracking is enabled. I don't have a short-term solution to this, but looking at the code it's obvious that it can significantly be improved (though it will take quite some time). I'll probably at least try to replace it with an RW lock as I think it could improve the situation.

The FD lock is another one requiring some lift-up. I'm certain it's possible, I just don't know if it will not degrade low-thread count performance by using too many atomic ops instead. We'll have to experiment.

Cheers,
Willy
Re: Few problems seen in haproxy? (threads, connections).
Hi Willy,

Thank you very much for the in-depth analysis and configuration setting suggestions. I believe I have got the 3 key items to continue based on your mail:

1. Thread pinning
2. Fix system irq pinning accordingly
3. Listen on all threads.

I will post the configuration changes and the results.

Regards,
- Krishna

On Wed, Oct 10, 2018 at 6:39 PM Willy Tarreau wrote:
> Hi Krishna,
> (...)
HAProxy "http-request auth" vs Safari WebSockets -- can this marriage be saved?
Hi all,

First, a little background (and my apologies if my questions are very basic, I'm new to HAProxy): I've got an embedded device with a built-in web server that allows the device to be controlled via HTTP requests and/or (more interactively) via WebSocket connections. This all works fine when the device's web server is accessed directly from Safari, Chrome, Internet Explorer, Firefox, etc.

The problem is, this embedded device doesn't implement any kind of security or access control, so it would be a bad idea to put it directly on an untrusted network, since any random person could point their web browser at it and mess up its settings.

To work around that problem, I connect this embedded device directly to the second Ethernet port of a Mac running HAProxy 1.8.13, so that I can use HAProxy's "http-request auth" feature (with or without SSL/TLS) to provide authentication. That way, nobody on the untrusted network can talk to my insecure embedded device directly; instead, they can point their web browsers to my Mac's IP address, and HAProxy makes them enter the secret username and password before any of their connections can get forwarded on through to the embedded device's web server.

This also works great, at least when the web browser is Chrome. If the web browser is Safari, on the other hand, the vanilla http/https stuff works fine, but the WebSocket connections error out when they hit HAProxy. In particular, the JavaScript served from the device's embedded web page can't connect to the embedded device's web server (using either the ws:// or wss:// protocol), and Safari's JavaScript console shows this error message when it tries:

[Error] WebSocket connection to 'wss://localhost:8080/' failed: Invalid HTTP version string: HTTP/1.0

My question is, does anyone know what might be going wrong here, or have any ideas about how I might get Safari's WebSockets to play nicely with HAProxy's client username/password authentication feature?
(Safari's WebSockets do work fine through HAProxy if I comment out the "http-request auth" line in the "frontend" section of my haproxy.cfg file, but then accessing my embedded device no longer requires a password, which defeats the point of the exercise.)

Thanks,
Jeremy

PS: some hopefully-relevant debugging info follows...

I'm testing with Safari 12.0.1 (13606.2.104) running on the HAProxy-hosting Mac. haproxy is v1.8.13. If I run haproxy with debugging output enabled, this is what I see when the JavaScript tries (and fails) to connect a WebSocket through HAProxy under Safari:

$ haproxy - -f /usr/local/etc/haproxy.cfg
[WARNING] 282/174723 (9610) : parsing [/usr/local/etc/haproxy.cfg:32] : a 'http-request' rule placed after a 'reqadd' rule will still be processed before.
Available polling systems :
     kqueue : pref=300, test result OK
       poll : pref=200, test result OK
     select : pref=150, test result FAILED
Total: 3 (2 usable), will use kqueue.

Available filters :
    [SPOE] spoe
    [COMP] compression
    [TRACE] trace
Using kqueue() as the polling mechanism.
[WARNING] 282/174725 (9610) : Server galaxynodes/server1 is DOWN, reason: Layer4 timeout, check duration: 2002ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
:localnodes.accept(0004)=0009 from [127.0.0.1:63330] ALPN=
:localnodes.clireq[0009:]: GET / HTTP/1.1
:localnodes.clihdr[0009:]: Upgrade: websocket
:localnodes.clihdr[0009:]: Connection: Upgrade
:localnodes.clihdr[0009:]: Host: localhost:8080
:localnodes.clihdr[0009:]: Origin: https://localhost:8080
:localnodes.clihdr[0009:]: Pragma: no-cache
:localnodes.clihdr[0009:]: Cache-Control: no-cache
:localnodes.clihdr[0009:]: Sec-WebSocket-Key: sUs/WOhQoe4plAvU5HQ+MQ==
:localnodes.clihdr[0009:]: Sec-WebSocket-Version: 13
:localnodes.clihdr[0009:]: Sec-WebSocket-Extensions: x-webkit-deflate-frame
:localnodes.clihdr[0009:]: User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.0.1 Safari/605.1.15
:localnodes.clicls[0009:]
:localnodes.closed[0009:]

… and then, just for comparison, here is the output from haproxy when the client is Google Chrome (v69.0.3497.100), and the WebSocket connection succeeds:

Jeremys-Mac-Pro:specs jaf$ haproxy - -f /usr/local/etc/haproxy.cfg
[WARNING] 282/175026 (9663) : parsing [/usr/local/etc/haproxy.cfg:32] : a 'http-request' rule placed after a 'reqadd' rule will still be processed before.
Available polling systems :
     kqueue : pref=300, test result OK
       poll : pref=200, test result OK
     select : pref=150, test result FAILED
Total: 3 (2 usable), will use kqueue.

Available filters :
    [SPOE]
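For readers trying to picture the setup being described, here is a minimal sketch of what such a haproxy.cfg might look like. It is not Jeremy's actual file, which was not posted: the proxy names localnodes and galaxynodes/server1 and port 8080 are taken from the debug output above, while the userlist name, the credentials, the realm, the backend address and the timeouts are placeholders assumed purely for illustration.

```haproxy
# Hypothetical reconstruction, NOT the poster's real configuration.
userlist device_users                       # assumed name
    user admin insecure-password changeme   # placeholder credentials

defaults
    mode http
    timeout connect 5s
    timeout client  1m
    timeout server  1m
    timeout tunnel  1h    # used once a connection has been upgraded (e.g. WebSocket)

frontend localnodes
    bind *:8080           # for the https/wss variant: bind *:8080 ssl crt <pem file>
    # Answer 401 and request credentials unless valid Basic auth was supplied
    http-request auth realm EmbeddedDevice unless { http_auth(device_users) }
    default_backend galaxynodes

backend galaxynodes
    server server1 192.168.2.2:80 check     # device address is an assumption
```

With a rule like this, any request arriving without a valid Authorization header, including a WebSocket upgrade request, is answered by HAProxy itself rather than being forwarded to the device.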
Re: [PATCH] MEDIUM: lua: Add stick table support for Lua
On Sat, Sep 29, 2018 at 20:18, Willy Tarreau wrote:
>
> Hi Adis,
>
> On Thu, Sep 27, 2018 at 05:32:22PM +0200, Adis Nezirovic wrote:
> > On Thu, Sep 27, 2018 at 04:52:29PM +0200, Thierry Fournier wrote:
> > > Hi Adis,
> > >
> > > Sorry for the delay, I processed a quick review, and all seems to be ok
> > > for me!
> > >
> > > BR,
> > > Thierry
> >
> > Great, happy to hear that, I hope guys will merge it soon.
>
> OK, just merged now.
>
> Thanks to you both!
>
> Willy
>

Hi Adis,

nice feature, thank you. Are there plans for adding write access as well? Currently I have a use case for this in Lua (get/set some sort of shared lock), and I'm planning to use HAProxy maps as a workaround instead of stick tables (or writing entries to stick tables from Lua via TCP to the admin socket).

Best Regards / Mit freundlichen Grüßen

Bjoern
Re: lua haproxy-auth-request - round 2
Hi Tim,

On 2018-10-09 10:40 AM, Tim Düsterhus wrote:
> Bob,
>
> On 05.10.2018 at 22:10, Computerisms Corporation wrote:
> > Okay, this is something I hadn't looked at. Not 100% sure I am
> > interpreting correctly, but assuming I am, then the browser is reporting
> > that the connection is successfully upgrading to websockets both with and
> > without the Lua script enabled. As far as I can tell, the only thing that
> > changes is the cookies and keys. There is a significant difference in the
> > "waiting" response, though; ~350ms with the lua script, but less than 5ms
> > without it.
>
> Use your browser's dev tools (F12 for Firefox, Chrome). Search for the
> websocket request. It's just called '/websocket' for Etherpad. It should
> return a 101 Switching Protocols. In Google Chrome you can even look at
> the WebSocket Frames.

okay, that is what I was looking at, and yes, it does show the websocket upgrade request, see below. it looks like the websockets upgrade is /socket.io followed by some arguments denoted with a /?. Can't paste, but it starts with a call to this file with code 200; getting en-gb.json with code 304; another call to /socket.io with code 200; then the code 101 with an upgrade; 10 seconds later it shows a POST to the /socket.io with code 200.

> > > b) Send credentials for basic authentication for Websockets.
> >
> > hm, so I don't seem to be able to connect to etherpad directly using a
> > ws:// scheme in chrome or firefox, and I think that is what you mean.
> > not sure if that is just me, though, will work some more on that and see
> > if I can figure out if I am doing something wrong there...
>
> No, use the dev tools and look at the request headers, whether there is
> an Authorization header. Of course oauth_proxy needs to be enabled.

I think by oauth_proxy you mean the auth method I am using with the lua script, in my case authnz_external from apache. if you literally mean oauth_proxy, maybe this is the link I am missing. I have not configured oauth_proxy anywhere, as I am/(was?) not planning to use it.
But by the request headers, I understand what you mean now. I am using firefox primarily, but it appears I get the same basic thing in chrome. Firefox and Chrome both report 101 switching protocols, the websocket upgrade header, and the authorization header:

Response headers (175 B)
    Connection: Upgrade
    Sec-WebSocket-Accept: gU996yXfDiutgquFxiRxGuffglc=
    Sec-WebSocket-Extensions: permessage-deflate
    Upgrade: websocket

Request headers (936 B)
    Accept: text/html,application/xhtml+xm…plication/xml;q=0.9,*/*;q=0.8
    Accept-Encoding: gzip, deflate, br
    Accept-Language: en-CA,en-US;q=0.7,en;q=0.3
    Authorization: Basic Ym9iQGNvbXB1dGVyaXNtcy5j…HMHJnZTB1cyhXZWUpRWdnbGl0cw==
    Cache-Control: no-cache
    Connection: keep-alive, Upgrade
    Cookie: _ga=GA1.2.1611432529.153149834…fFXUaWksCAAAJ; language=en-gb
    Host: pad.computerisms.ca
    Origin: https://pad.computerisms.ca
    Pragma: no-cache
    Sec-WebSocket-Extensions: permessage-deflate
    Sec-WebSocket-Key: DhxDcq4PkH+/TF2kaSW8JQ==
    Sec-WebSocket-Version: 13
    Upgrade: websocket
    User-Agent: Mozilla/5.0 (X11; Ubuntu; Linu…) Gecko/20100101 Firefox/62.0

> I attached a screenshot of Chrome's dev tools.

Thanks, that confirms I am using the correct thing.

> If you send me credentials in private I can take a look myself, if you
> want.

will follow up off list...

> Best regards
> Tim Düsterhus
Re: Few problems seen in haproxy? (threads, connections).
Hi Krishna,

On Tue, Oct 02, 2018 at 09:18:19PM +0530, Krishna Kumar (Engineering) wrote:
(...)
> 1. HAProxy system:
> Kernel: 4.17.13,
> CPU: 48 core E5-2670 v3
> Memory: 128GB memory
> NIC: Mellanox 40g with IRQ pinning
>
> 2. Client, 48 core similar to server. Test command line:
> wrk -c 4800 -t 48 -d 30s http:///128
>
> 3. HAProxy version: I am testing both 1.8.14 and 1.9-dev3 (git checkout as of
> Oct 2nd).
> # haproxy-git -vv
> HA-Proxy version 1.9-dev3 2018/09/29
(...)
> 4. HAProxy results for #processes and #threads
> #     Threads-RPS   Procs-RPS
> 1     20903         19280
> 2     46400         51045
> 4     96587         142801
> 8     172224        254720
> 12    210451        437488
> 16    173034        437375
> 24    79069         519367
> 32    55607         586367
> 48    31739         596148

Our largest thread test was on 12 cores and it happens that in your case it's also the optimal one.

However I do have some comments about your config, before going back to real thread issues :

> # cpu-map auto:1/1-48 0-39

=> you must absolutely pin your processes, and they must be pinned to cores *not* shared with the network card. That's critical. Moreover it's also important that threads are not split across multiple physical CPUs because the remote L3 cache access time over QPI/UPI is terrible. When you run on 12 threads with two 12-cores/24-threads CPUs, you could very well have haproxy using 12 threads from 6 cores, and the NIC using 12 threads from the other 6 cores of the same physical CPU. The second socket is, as usual, useless for anything requiring low latency. However it's perfect to run SSL. So you could be interested in testing if running the NIC on one socket (try to figure what node the PCIe lanes are physically connected to), and haproxy on the other one. It *could* be possible that you get more performance from 12 cores of each but I strongly doubt it based on a number of tests. If you use SSL however it's different as you will benefit from lots of cores much more than low latency.
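The pinning advice above could translate into something like the following global section. This is only a sketch under assumptions: it supposes that CPUs 0-11 are twelve hardware threads of the same physical socket and that the NIC's interrupts have been steered to other cores of that socket. The actual CPU numbering has to be checked on the machine itself (for example with lscpu), and the thread count of 12 simply mirrors the best result in the table.

```haproxy
global
    nbproc   1
    nbthread 12
    # Pin thread N of process 1 to its own CPU (threads 1-12 -> CPUs 0-11).
    # The CPU range is an assumption: choose CPUs on a single socket that
    # are not the ones handling the network card's interrupts.
    cpu-map auto:1/1-12 0-11
```

Compared with the commented-out `cpu-map auto:1/1-48 0-39` in the quoted configuration, the point of the sketch is that the thread set and the CPU set have the same size and stay within one socket.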
> bind :80 process 1/1-48

=> it's also capital for scalability to have individual bind lines. Here you have a single socket accessed from all 48 threads. There's no efficient thread load balancing here. By having this :

    bind :80 process 1/1
    bind :80 process 1/2
    ...
    bind :80 process 1/47
    bind :80 process 1/48

You will let the kernel perform the load balancing and distribute a fair load to all threads. This way none of them will risk to pick a larger share of the incoming connections than optimal. I know it's annoying to configure at the moment, I've been thinking about having a way to automatically iterate from a single config line (like the "auto" feature of cpu-map), but for now it's not done.

Now back to the thread measurements :

> 5. Lock stats for 1.9-dev3: Some write locks on average took a lot more time
> to acquire, e.g. "POOL" and "TASK_WQ". For 48 threads, I get:
> Stats about Lock FD:
> # write lock : 143933900
> # write unlock: 143933895 (-5)
> # wait time for write : 11370.245 msec

This one definitely is huge. We know some work is still needed on this lock and that there are still a few low hanging fruits but not much savings to expect short term. This output is very revealing however of the importance of this lock.

> # wait time for write/lock: 78.996 nsec

That's roughly the time it takes to access the other CPU's cache, so using your two sockets for the same process definitely hurts a lot here.

> Stats about Lock TASK_RQ:
> # write lock : 2062874
> # write unlock: 2062875 (1)
> # wait time for write : 7820.234 msec

This one is still far too large for what we'd hope, even though it has significantly shrunk since 1.8. It could be related to the poor distribution of the incoming connections across threads.

> # wait time for write/lock: 3790.941 nsec

Wow, 3.8 microseconds to acquire the write lock is a lot! I'm starting to suspect some longer tree walks than expected. Again, lack of fairness between threads can make this significantly worse than it should.
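As an illustration of the individual bind lines recommended earlier in this message, a frontend for a process running 4 threads (that is, assuming `nbthread 4` in the global section) might be written as below. The frontend and backend names are made up for the example; a 48-thread setup would simply continue the same pattern up to 1/48.

```haproxy
frontend fe_http                  # hypothetical name
    mode http
    # One listening socket per thread, so that the kernel distributes
    # incoming connections across the threads.
    bind :80 process 1/1
    bind :80 process 1/2
    bind :80 process 1/3
    bind :80 process 1/4
    default_backend be_app        # hypothetical name
```

Each `bind` line creates its own listening socket on port 80, which is what lets the kernel do the balancing instead of having every thread compete on a single shared socket.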
> Stats about Lock TASK_WQ:
> # write lock : 2601227
> # write unlock: 2601227 (0)
> # wait time for write : 5019.811 msec
> # wait time for write/lock: 1929.786 nsec

For this one I had an idea that could significantly improve the situation. It will unlikely be for 1.9 though.

> Stats about Lock POOL:
> # write lock : 2823393
> # write unlock: 2823393 (0)
> # wait time for write : 11984.706 msec

Interesting. Very interesting... I thought we merged the lock-free pool code. Maybe I was mistaken. Or I misunderstood something about the nature of the changes that went in early during the 1.9-dev cycle. This one is not expected to be hard to address, and I even see how we could go further with a small thread-local cache since I developed such a thing a few years ago (though it was over-engineered) that could easily be retrofitted here.

> # wait time for
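The "wait time for write/lock" figures in the quoted statistics are simply the total write wait time divided by the number of write locks taken. As a quick check against the numbers quoted in this message (only the three locks whose full figures appear are shown):

```latex
\[
t_{\text{per lock}} = \frac{T_{\text{wait}}}{N_{\text{locks}}}
\]
\begin{align*}
\text{FD:}       &\quad \frac{11370.245\ \text{ms}}{143\,933\,900} \approx 79.0\ \text{ns} \\
\text{TASK\_RQ:} &\quad \frac{7820.234\ \text{ms}}{2\,062\,874}    \approx 3.79\ \mu\text{s} \\
\text{TASK\_WQ:} &\quad \frac{5019.811\ \text{ms}}{2\,601\,227}    \approx 1.93\ \mu\text{s}
\end{align*}
```

So the FD lock accumulates the largest total wait of these three only because it is taken roughly 55 to 70 times more often, while each TASK_RQ or TASK_WQ acquisition costs roughly 25 to 50 times more than an FD acquisition.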
Re: HA-Proxy configuration
On Wed, 10 Oct 2018 at 07:08, anjireddy.komire...@wipro.com <anjireddy.komire...@wipro.com> wrote:

> Hi Team,
>
> I am looking for HA-Proxy configuration help in our project. Can I know
> someone who can give more information on configuration using 2 different
> HA-Proxy servers for high availability?
>
> Feel free to contact me on - 9849916124
>

Hey there,

Welcome to the public mailing list for users of the open source haproxy tool.

You'd probably do best by posting the configuration and HA setup as far as you've managed to get it going, and asking questions about specific problems you encounter along the way. You're more likely to get help via email than via telephone!

Here is the starter guide for the current stable version: http://cbonte.github.io/haproxy-dconv/1.8/intro.html. There are links along the top of that page to the configuration and management manuals, which will be of interest as you evolve your HA setup.

If, instead, you feel you would like to trade money for time, and want to take advantage of a commercial support option, some are listed here: http://www.haproxy.org/#supp

As a backstop, my UK company is already set up as a supplier inside Wipro's procurement system. Do get in touch if the routes I've mentioned above don't meet your needs :-)

All the best,
Jonathan
--
Jonathan Matthews
London, UK
http://www.jpluscplusm.com/contact.html
Re: HA-Proxy configuration
Hi.

Have you seen this link list? There are some good examples of HA solutions:
http://www.haproxy.org/#link

In general, keepalived and haproxy are a good combination.

On which platform do you plan to run the HA setup?

Best regards
Aleks

Original message
From: "anjireddy.komire...@wipro.com"
Sent: 10 October 2018 08:05:24 CEST
To: "haproxy@formilux.org"
CC: "santhosh.pa...@wipro.com"
Subject: HA-Proxy configuration

Hi Team,

I am looking for HA-Proxy configuration help in our project. Can I know someone who can give more information on configuration using 2 different HA-Proxy servers for high availability?

Feel free to contact me on - 9849916124

Regards,
Anjireddy.

The information contained in this electronic message and any attachments to this message are intended for the exclusive use of the addressee(s) and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately and destroy all copies of this message and any attachments. WARNING: Computer viruses can be transmitted via email. The recipient should check this email and any attachments for the presence of viruses. The company accepts no liability for any damage caused by any virus transmitted by this email. www.wipro.com
HA-Proxy configuration
Hi Team,

I am looking for HA-Proxy configuration help in our project. Can I know someone who can give more information on configuration using 2 different HA-Proxy servers for high availability?

Feel free to contact me on - 9849916124

Regards,
Anjireddy.

The information contained in this electronic message and any attachments to this message are intended for the exclusive use of the addressee(s) and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately and destroy all copies of this message and any attachments. WARNING: Computer viruses can be transmitted via email. The recipient should check this email and any attachments for the presence of viruses. The company accepts no liability for any damage caused by any virus transmitted by this email. www.wipro.com