Re: Few problems seen in haproxy? (threads, connections).

2018-10-10 Thread Willy Tarreau
On Thu, Oct 11, 2018 at 08:18:21AM +0530, Krishna Kumar (Engineering) wrote:
> Hi Willy,
> 
> Thank you very much for the in-depth analysis and configuration setting
> suggestions.
> I believe I have got the 3 key items to continue based on your mail:
> 
> 1. Thread pinning
> 2. Fix system irq pinning accordingly
> 3. Listen on all threads.
> 
> I will post the configuration changes and the results.

By the way, please pull the latest master fixes. I've addressed two issues
there with locking :
  - one where the scheduler work was slightly too high, increasing the time
spent on RQ lock
  - another one where I messed up on a fix, causing lock-free pools to be
disabled (as seen in your output, where the POOL lock appears a lot)

On some tests I've run here, I've found the stick-tables lock to be a
bottleneck when tracking is enabled. I don't have a short-term solution
to this, but looking at the code it's obvious that it can significantly
be improved (though it will take quite some time). I'll probably at least
try to replace it with an RW lock as I think it could improve the situation.

The FD lock is another one requiring some lift-up. I'm certain it's possible,
I just don't know if it will not degrade low-thread count performance by
using too many atomic ops instead. We'll have to experiment.

Cheers,
Willy



Re: Few problems seen in haproxy? (threads, connections).

2018-10-10 Thread Krishna Kumar (Engineering)
Hi Willy,

Thank you very much for the in-depth analysis and configuration setting
suggestions.
I believe I have got the 3 key items to continue based on your mail:

1. Thread pinning
2. Fix system irq pinning accordingly
3. Listen on all threads.

I will post the configuration changes and the results.

Regards,
- Krishna


On Wed, Oct 10, 2018 at 6:39 PM Willy Tarreau  wrote:

> Hi Krishna,
>
> On Tue, Oct 02, 2018 at 09:18:19PM +0530, Krishna Kumar (Engineering) wrote:
> (...)
> > 1. HAProxy system:
> > Kernel: 4.17.13,
> > CPU: 48 core E5-2670 v3
> > Memory: 128GB memory
> > NIC: Mellanox 40g with IRQ pinning
> >
> > 2. Client, 48 core similar to server. Test command line:
> > wrk -c 4800 -t 48 -d 30s http:///128
> >
> > 3. HAProxy version: I am testing both 1.8.14 and 1.9-dev3 (git checkout as
> > of Oct 2nd).
> > # haproxy-git -vv
> > HA-Proxy version 1.9-dev3 2018/09/29
> (...)
> > 4. HAProxy results for #processes and #threads
> > #    Threads-RPS   Procs-RPS
> > 1    20903         19280
> > 2    46400         51045
> > 4    96587         142801
> > 8    172224        254720
> > 12   210451        437488
> > 16   173034        437375
> > 24   79069         519367
> > 32   55607         586367
> > 48   31739         596148
>
> Our largest thread test was on 12 cores and it happens that in your case
> it's also the optimal one.
>
> However I do have some comments about your config, before going back to
> real thread issues :
>
> > # cpu-map auto:1/1-48 0-39
>   => you must absolutely pin your processes, and they must be pinned
>  to cores *not* shared with the network card. That's critical.
>  Moreover it's also important that threads are not split across
>  multiple physical CPUs because the remote L3 cache access time
>  over QPI/UPI is terrible. When you run on 12 threads with two
>  12-cores/24-threads CPUs, you could very well have haproxy using
>  12 threads from 6 cores, and the NIC using 12 threads from the
>  other 6 cores of the same physical CPU. The second socket is,
>  as usual, useless for anything requiring low latency. However
>  it's perfect to run SSL. So you could be interested in testing
>  if running the NIC on one socket (try to figure what node the
>  PCIe lanes are physically connected to), and haproxy on the other
>  one. It *could* be possible that you get more performance from 12
>  cores of each but I strongly doubt it based on a number of tests.
>  If you use SSL however it's different as you will benefit from
>  lots of cores much more than low latency.
>
> > bind :80 process 1/1-48
>   => it's also capital for scalability to have individual bind lines. Here
>  you have a single socket accessed from all 48 threads. There's no
>  efficient thread load balancing here. By having this :
>
>  bind :80 process 1/1
>  bind :80 process 1/2
>  ...
>  bind :80 process 1/47
>  bind :80 process 1/48
>
>  You will let the kernel perform the load balancing and distribute a
>  fair load to all threads. This way none of them will risk to pick a
>  larger share of the incoming connections than optimal. I know it's
>  annoying to configure at the moment, I've been thinking about having
>  a way to automatically iterate from a single config line (like the
>  "auto" feature of cpu-map), but for now it's not done.
>
> Now back to the thread measurements :
>
> > 5. Lock stats for 1.9-dev3: Some write locks on average took a lot more time
> > to acquire, e.g. "POOL" and "TASK_WQ". For 48 threads, I get:
> > Stats about Lock FD:
> > # write lock  : 143933900
> > # write unlock: 143933895 (-5)
> > # wait time for write : 11370.245 msec
>
> This one definitely is huge. We know some work is still needed on this lock
> and that there are still a few low hanging fruits but not much savings to
> expect short term. This output is very revealing however of the importance
> of this lock.
>
> > # wait time for write/lock: 78.996 nsec
>
> That's roughly the time it takes to access the other CPU's cache, so using
> your two sockets for the same process definitely hurts a lot here.
>
> > Stats about Lock TASK_RQ:
> > # write lock  : 2062874
> > # write unlock: 2062875 (1)
> > # wait time for write : 7820.234 msec
>
> This one is still far too large for what we'd hope, even though it
> has significantly shrunk since 1.8. It could be related to the poor
> distribution of the incoming connections across threads.
>
> > # wait time for write/lock: 3790.941 nsec
>
> Wow, 3.8 microseconds to acquire the write lock is a lot! I'm starting
> to suspect some longer tree walks than expected. Again, lack of fairness
> between threads can make this significantly worse than it should.
>
> > Stats about Lock TASK_WQ:
> > # write lock  : 2601227
> > # write unlock: 2601227 (0)
> > # wait time for write : 5019.811 msec
> > # wait time for write/lock: 1929.786 nsec
>
> For this one I had an idea that could significantly improve the situation.
> It will unlikely be for 1.9 

HAProxy "http-request auth" vs Safari WebSockets -- can this marriage be saved?

2018-10-10 Thread Jeremy Friesner
Hi all,

First, a little background (and my apologies if my questions are very basic, 
I’m new to HAProxy) — I’ve got an embedded device with a built-in web server 
that allows the device to be controlled via HTTP requests and/or (more 
interactively) via WebSocket connections.  This all works fine when the 
device’s web server is accessed directly from Safari, Chrome, Internet Explorer, 
Firefox, etc.

The problem is, this embedded device doesn’t implement any kind of security or 
access-control, so it would be a bad idea to put it directly on an untrusted 
network, since any random person could point their web browser at it and mess 
up its settings.

To work around that problem, I connect this embedded device directly to 
the second Ethernet port of a Mac running HAProxy 1.8.13, so that I can use 
HAProxy’s “http-request auth” feature (with or without SSL/TLS) to provide 
authentication.  That way, nobody on the untrusted network can talk to my 
insecure embedded device directly; instead, they can point their web browsers 
to my Mac’s IP address, and HAProxy makes them enter the secret 
username-and-password before any of their connections can get forwarded on 
through to the embedded device’s web server.

This also works great — at least, it works great when the web browser is 
Chrome.  If the web browser is Safari on the other hand, the vanilla http/https 
stuff works fine, but the WebSocket connections error-out when they hit 
HAProxy.  In particular, the JavaScript scripts served from the device’s 
embedded web page can’t connect to the embedded device’s web-server (using 
either ws:// or wss:// protocol), and Safari’s JavaScript console shows this 
error message when they try:

[Error] WebSocket connection to 'wss://localhost:8080/' failed: Invalid 
HTTP version string: HTTP/1.0

My question is, does anyone know what might be going wrong here, or have any 
ideas about how I might get Safari’s WebSockets to play nicely with HAProxy’s 
client-username/password authentication feature?   (Safari’s WebSockets do work 
fine through HAProxy if I comment out the “http-request auth” line in my 
haproxy.cfg file’s “frontend” section, but then accessing my embedded device no 
longer requires a password, which defeats the point of the exercise)

Thanks,
Jeremy

ps some hopefully-relevant debugging info follows...

I’m testing with Safari 12.0.1 (13606.2.104) running on the HAProxy-hosting 
Mac.   haproxy is v1.8.13.

If I run haproxy with debugging output enabled, this is what I see when the 
JavaScript tries (and fails) to connect a WebSocket through HAProxy under 
Safari:

$ haproxy - -f /usr/local/etc/haproxy.cfg
[WARNING] 282/174723 (9610) : parsing [/usr/local/etc/haproxy.cfg:32] : a 
'http-request' rule placed after a 'reqadd' rule will still be processed before.
Available polling systems :
 kqueue : pref=300,  test result OK
   poll : pref=200,  test result OK
 select : pref=150,  test result FAILED
Total: 3 (2 usable), will use kqueue.

Available filters :
[SPOE] spoe
[COMP] compression
[TRACE] trace
Using kqueue() as the polling mechanism.
[WARNING] 282/174725 (9610) : Server galaxynodes/server1 is DOWN, reason: 
Layer4 timeout, check duration: 2002ms. 1 active and 0 backup servers left. 0 
sessions active, 0 requeued, 0 remaining in queue.
:localnodes.accept(0004)=0009 from [127.0.0.1:63330] ALPN=
:localnodes.clireq[0009:]: GET / HTTP/1.1
:localnodes.clihdr[0009:]: Upgrade: websocket
:localnodes.clihdr[0009:]: Connection: Upgrade
:localnodes.clihdr[0009:]: Host: localhost:8080
:localnodes.clihdr[0009:]: Origin: https://localhost:8080
:localnodes.clihdr[0009:]: Pragma: no-cache
:localnodes.clihdr[0009:]: Cache-Control: no-cache
:localnodes.clihdr[0009:]: Sec-WebSocket-Key: 
sUs/WOhQoe4plAvU5HQ+MQ==
:localnodes.clihdr[0009:]: Sec-WebSocket-Version: 13
:localnodes.clihdr[0009:]: Sec-WebSocket-Extensions: 
x-webkit-deflate-frame
:localnodes.clihdr[0009:]: User-Agent: Mozilla/5.0 (Macintosh; 
Intel Mac OS X 10_13_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.0.1 
Safari/605.1.15
:localnodes.clicls[0009:]
:localnodes.closed[0009:]
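
One thing the trace above makes easy to check is which request headers the failing handshake actually carried; no Authorization header appears among the lines shown. The following is only a minimal sketch for pulling the header names out of such debug output. The regular expression is an assumption based on the line format visible in this trace, not a documented HAProxy log format, and the sample lines are a shortened copy of the trace.

```python
import re

# Assumed line shape, taken from the trace above:
#   :<proxy>.clihdr[<ids>]: <Header-Name>: <value>
CLIHDR = re.compile(r"^:\S+\.clihdr\[[^\]]*\]: ([^:]+):")


def request_header_names(trace):
    """Return the lower-cased names of all client request headers in a trace."""
    names = set()
    for line in trace.splitlines():
        match = CLIHDR.match(line)
        if match:  # wrapped continuation lines simply do not match
            names.add(match.group(1).strip().lower())
    return names


# Shortened copy of the Safari trace quoted above.
TRACE = """\
:localnodes.clireq[0009:]: GET / HTTP/1.1
:localnodes.clihdr[0009:]: Upgrade: websocket
:localnodes.clihdr[0009:]: Connection: Upgrade
:localnodes.clihdr[0009:]: Host: localhost:8080
:localnodes.clihdr[0009:]: Sec-WebSocket-Version: 13
"""

names = request_header_names(TRACE)
print("Authorization header present:", "authorization" in names)
```

Running the same function over the Chrome trace would show whether the two browsers differ in the headers they send on the upgrade request.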

… and then, just for comparison, here is the output from haproxy when the 
client is Google Chrome (v69.0.3497.100), and the WebSocket connection 
succeeds:

Jeremys-Mac-Pro:specs jaf$ haproxy - -f /usr/local/etc/haproxy.cfg
[WARNING] 282/175026 (9663) : parsing [/usr/local/etc/haproxy.cfg:32] : a 
'http-request' rule placed after a 'reqadd' rule will still be processed before.
Available polling systems :
 kqueue : pref=300,  test result OK
   poll : pref=200,  test result OK
 select : pref=150,  test result FAILED
Total: 3 (2 usable), will use kqueue.

Available filters :
[SPOE] 

Re: [PATCH] MEDIUM: lua: Add stick table support for Lua

2018-10-10 Thread bjun...@gmail.com
On Sat, 29 Sep 2018 at 20:18, Willy Tarreau wrote:
>
> Hi Adis,
>
> On Thu, Sep 27, 2018 at 05:32:22PM +0200, Adis Nezirovic wrote:
> > On Thu, Sep 27, 2018 at 04:52:29PM +0200, Thierry Fournier wrote:
> > > Hi Adis,
> > >
> > > Sorry for the delay, I processed a quick review, and all seems to be ok 
> > > for me!
> > >
> > > BR,
> > > Thierry
> >
> > Great, happy to hear that, I hope guys will merge it soon.
>
> OK, just merged now.
>
> Thanks to you both!
>
> Willy
>

Hi Adis,

nice feature, thank you. Are there plans for adding write access also?

Currently I have a use case for this in Lua (getting/setting some sort of
shared lock), and I'm planning to use HAProxy maps as a workaround instead
of stick tables (or writing entries to stick tables from Lua via TCP to the
admin socket).


Best regards

Bjoern



Re: lua haproxy-auth-request - round 2

2018-10-10 Thread Computerisms Corporation

Hi Tim,

On 2018-10-09 10:40 AM, Tim Düsterhus wrote:

Bob,

On 05.10.2018 at 22:10, Computerisms Corporation wrote:

Okay, this is something I hadn't looked at.  Not 100% sure I am
interpreting correctly, but assuming I am, then the browser is reporting
that the connection is successfully upgrading to websockets both with
and without the Lua script enabled.  As far as I can tell, the only
thing that changes is the cookies and keys.  There is a significant
difference in the "waiting" response, though; ~350ms with the lua
script, but less than 5ms without it.


Use your browser's dev tools (F12 for Firefox, Chrome). Search for the
websocket request. It's just called '/websocket' for Etherpad. It should
return a 101 Switching Protocols. In Google Chrome you can even look at
the WebSocket Frames.


okay, that is what I was looking at, and yes, it does show the websocket 
upgrade request, see below.  it looks like the websockets upgrade is 
/socket.io followed by some arguments denoted with a /?.


Can't paste, but it starts with a call to this file with code 200; 
getting en-gb.json with code 304;

another call to /socket.io with code 200;
then the code 101 with an upgrade;
10 seconds later it shows a POST to the /socket.io with code 200.




b) Send credentials for basic authentication for Websockets.


hm, so I don't seem to be able to connect to etherpad directly using a
ws:// scheme in chrome or firefox, and I think that is what you mean.
  not sure if that is just me, though, will work some more on that and
see if I can figure out if I am doing something wrong there...


No, use the dev tools and look at the request headers, whether there is
an Authorization header. Of course oauth_proxy needs to be enabled.


I think by oauth_proxy you mean the auth method I am using with the lua 
script, in my case authnz_external from apache.  if you literally mean 
oauth_proxy, maybe this is the link I am missing.  I have not configured 
oauth_proxy anywhere, as I am/(was?) not planning to use it.


But by the request headers, I understand what you mean now.  I am using 
firefox primarily, but it appears I get basically the same thing in chrome. 
Firefox and Chrome both report 101 switching protocols, the websocket 
upgrade header, and the authorization header:


Response headers (175 B)
Connection                Upgrade
Sec-WebSocket-Accept      gU996yXfDiutgquFxiRxGuffglc=
Sec-WebSocket-Extensions  permessage-deflate
Upgrade                   websocket

Request headers (936 B)
Accept                    text/html,application/xhtml+xm…plication/xml;q=0.9,*/*;q=0.8
Accept-Encoding           gzip, deflate, br
Accept-Language           en-CA,en-US;q=0.7,en;q=0.3
Authorization             Basic Ym9iQGNvbXB1dGVyaXNtcy5j…HMHJnZTB1cyhXZWUpRWdnbGl0cw==
Cache-Control             no-cache
Connection                keep-alive, Upgrade
Cookie                    _ga=GA1.2.1611432529.153149834…fFXUaWksCAAAJ; language=en-gb
Host                      pad.computerisms.ca
Origin                    https://pad.computerisms.ca
Pragma                    no-cache
Sec-WebSocket-Extensions  permessage-deflate
Sec-WebSocket-Key         DhxDcq4PkH+/TF2kaSW8JQ==
Sec-WebSocket-Version     13
Upgrade                   websocket
User-Agent                Mozilla/5.0 (X11; Ubuntu; Linu…) Gecko/20100101 Firefox/62.0
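
For anyone who wants to confirm that a 101 response like the one above really completes the WebSocket handshake, the Sec-WebSocket-Accept value can be recomputed from the Sec-WebSocket-Key. The sketch below follows the rule in RFC 6455 (SHA-1 of the key concatenated with a fixed GUID, then base64); the function name is made up for illustration, and the sample key is the one from the RFC's own example rather than the one in this dump.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(key):
    """Compute the Sec-WebSocket-Accept value expected for a Sec-WebSocket-Key."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Example key taken from RFC 6455, section 1.3.
print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))
```

Passing the Sec-WebSocket-Key from the request headers above and comparing the result with the Sec-WebSocket-Accept in the response would show whether the backend answered this particular handshake.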


I attached a screenshot of Chrome's dev tools.


Thanks, that confirms I am using the correct thing.


If you send me credentials in private I can take a look myself, if you want.


will follow up off list...



Best regards
Tim Düsterhus





Re: Few problems seen in haproxy? (threads, connections).

2018-10-10 Thread Willy Tarreau
Hi Krishna,

On Tue, Oct 02, 2018 at 09:18:19PM +0530, Krishna Kumar (Engineering) wrote:
(...)
> 1. HAProxy system:
> Kernel: 4.17.13,
> CPU: 48 core E5-2670 v3
> Memory: 128GB memory
> NIC: Mellanox 40g with IRQ pinning
> 
> 2. Client, 48 core similar to server. Test command line:
> wrk -c 4800 -t 48 -d 30s http:///128
> 
> 3. HAProxy version: I am testing both 1.8.14 and 1.9-dev3 (git checkout as
> of Oct 2nd).
> # haproxy-git -vv
> HA-Proxy version 1.9-dev3 2018/09/29
(...)
> 4. HAProxy results for #processes and #threads
> #    Threads-RPS   Procs-RPS
> 1    20903         19280
> 2    46400         51045
> 4    96587         142801
> 8    172224        254720
> 12   210451        437488
> 16   173034        437375
> 24   79069         519367
> 32   55607         586367
> 48   31739         596148

Our largest thread test was on 12 cores and it happens that in your case
it's also the optimal one.
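
The point about 12 threads being the optimum can be read straight off the quoted table. As a small sketch (the variable names are illustrative only, the numbers are copied from the table), this finds the best-performing count for each mode and the speed-up relative to a single thread or process:

```python
# (count, threads RPS, processes RPS), copied from the table quoted above.
RESULTS = [
    (1, 20903, 19280),
    (2, 46400, 51045),
    (4, 96587, 142801),
    (8, 172224, 254720),
    (12, 210451, 437488),
    (16, 173034, 437375),
    (24, 79069, 519367),
    (32, 55607, 586367),
    (48, 31739, 596148),
]

best_threads = max(RESULTS, key=lambda row: row[1])
best_procs = max(RESULTS, key=lambda row: row[2])

print("threads peak at", best_threads[0], "with", best_threads[1], "RPS")
print("processes peak at", best_procs[0], "with", best_procs[2], "RPS")

# Speed-up of each run relative to the single thread / single process run.
base_threads, base_procs = RESULTS[0][1], RESULTS[0][2]
for count, threads_rps, procs_rps in RESULTS:
    print(f"{count:3d}  threads x{threads_rps / base_threads:5.2f}"
          f"  procs x{procs_rps / base_procs:5.2f}")
```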

However I do have some comments about your config, before going back to
real thread issues :

> # cpu-map auto:1/1-48 0-39
  => you must absolutely pin your processes, and they must be pinned
 to cores *not* shared with the network card. That's critical.
 Moreover it's also important that threads are not split across
 multiple physical CPUs because the remote L3 cache access time
 over QPI/UPI is terrible. When you run on 12 threads with two
 12-cores/24-threads CPUs, you could very well have haproxy using
 12 threads from 6 cores, and the NIC using 12 threads from the
 other 6 cores of the same physical CPU. The second socket is,
 as usual, useless for anything requiring low latency. However
 it's perfect to run SSL. So you could be interested in testing
 if running the NIC on one socket (try to figure what node the
 PCIe lanes are physically connected to), and haproxy on the other
 one. It *could* be possible that you get more performance from 12
 cores of each but I strongly doubt it based on a number of tests.
 If you use SSL however it's different as you will benefit from
 lots of cores much more than low latency.
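
The pinning advice above could be turned into configuration along the following lines. This is only a sketch: the CPU list "0-5,24-29" is a made-up example of six cores plus their hyperthread siblings on one physical CPU, and the real numbering has to be read from the machine (for instance from /sys/devices/system/node/node0/cpulist on Linux). The helper names are hypothetical; the generated lines use the per-thread "cpu-map <process>/<thread> <cpu>" form.

```python
def parse_cpulist(cpulist):
    """Expand a Linux cpulist string such as "0-5,24-29" into CPU numbers."""
    cpus = []
    for part in cpulist.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def cpu_map_lines(cpus, process=1):
    """Emit one cpu-map line per CPU, binding thread N to the Nth CPU."""
    return [f"cpu-map {process}/{thread} {cpu}"
            for thread, cpu in enumerate(cpus, start=1)]


# Assumed layout: haproxy gets these 12 hardware threads, while the NIC
# IRQs are pinned elsewhere on the same physical CPU.
haproxy_cpus = parse_cpulist("0-5,24-29")
lines = cpu_map_lines(haproxy_cpus)
print(f"nbthread {len(haproxy_cpus)}")
print("\n".join(lines))
```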

> bind :80 process 1/1-48
  => it's also capital for scalability to have individual bind lines. Here
 you have a single socket accessed from all 48 threads. There's no
 efficient thread load balancing here. By having this :

 bind :80 process 1/1
 bind :80 process 1/2
 ...
 bind :80 process 1/47
 bind :80 process 1/48

 You will let the kernel perform the load balancing and distribute a
 fair load to all threads. This way none of them will risk to pick a
 larger share of the incoming connections than optimal. I know it's
 annoying to configure at the moment, I've been thinking about having
 a way to automatically iterate from a single config line (like the
 "auto" feature of cpu-map), but for now it's not done.
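
Until such an iteration feature exists, the individual bind lines can be generated by a small script and pasted into the frontend. A minimal sketch, assuming a single process and the same ":80" address as in the example above (the function name is made up):

```python
def bind_lines(address, nbthread, process=1):
    """Return one 'bind <address> process <process>/<thread>' line per thread."""
    return [f"bind {address} process {process}/{thread}"
            for thread in range(1, nbthread + 1)]


lines = bind_lines(":80", 48)
print("\n".join(lines))
```

Each generated line opens its own listening socket, which is what lets the kernel spread incoming connections across the threads.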

Now back to the thread measurements :

> 5. Lock stats for 1.9-dev3: Some write locks on average took a lot more time
> to acquire, e.g. "POOL" and "TASK_WQ". For 48 threads, I get:
> Stats about Lock FD:
> # write lock  : 143933900
> # write unlock: 143933895 (-5)
> # wait time for write : 11370.245 msec

This one definitely is huge. We know some work is still needed on this lock
and that there are still a few low hanging fruits but not much savings to
expect short term. This output is very revealing however of the importance
of this lock.

> # wait time for write/lock: 78.996 nsec

That's roughly the time it takes to access the other CPU's cache, so using
your two sockets for the same process definitely hurts a lot here.

> Stats about Lock TASK_RQ:
> # write lock  : 2062874
> # write unlock: 2062875 (1)
> # wait time for write : 7820.234 msec

This one is still far too large for what we'd hope, even though it
has significantly shrunk since 1.8. It could be related to the poor
distribution of the incoming connections across threads.

> # wait time for write/lock: 3790.941 nsec

Wow, 3.8 microseconds to acquire the write lock is a lot! I'm starting
to suspect some longer tree walks than expected. Again, lack of fairness
between threads can make this significantly worse than it should.

> Stats about Lock TASK_WQ:
> # write lock  : 2601227
> # write unlock: 2601227 (0)
> # wait time for write : 5019.811 msec
> # wait time for write/lock: 1929.786 nsec

For this one I had an idea that could significantly improve the situation.
It will unlikely be for 1.9 though.
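
For reference, the "wait time for write/lock" figures quoted above are simply the total wait time divided by the number of write locks. A short sketch of that arithmetic, using the numbers from the quoted stats (the dictionary and function names are illustrative only):

```python
# (number of write locks, total wait time in msec), from the stats quoted above.
LOCK_STATS = {
    "FD": (143933900, 11370.245),
    "TASK_RQ": (2062874, 7820.234),
    "TASK_WQ": (2601227, 5019.811),
}


def wait_per_lock_ns(write_locks, wait_msec):
    """Average wait per write lock in nanoseconds (1 msec = 1e6 nsec)."""
    return wait_msec * 1e6 / write_locks


for name, (locks, wait_msec) in LOCK_STATS.items():
    print(f"{name:8s} {wait_per_lock_ns(locks, wait_msec):10.3f} nsec per write lock")
```

This makes it easier to compare locks: FD is taken far more often but is cheap per acquisition, while TASK_RQ and TASK_WQ are taken less often but cost microseconds each time.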

> Stats about Lock POOL:
> # write lock  : 2823393
> # write unlock: 2823393 (0)
> # wait time for write : 11984.706 msec

Interesting. Very interesting... I thought we merged the lock-free pool
code. Maybe I was mistaken. Or I misunderstood something about the nature
of the changes that went in early during the 1.9-dev cycle. This one is
not expected to be hard to address, and I even see how we could go further
with a small thread-local cache since I developed such a thing a few years
ago (though it was over-engineered) that could easily be retrofitted here.

> # wait time for 

Re: HA-Proxy configuration

2018-10-10 Thread Jonathan Matthews
On Wed, 10 Oct 2018 at 07:08, anjireddy.komire...@wipro.com <
anjireddy.komire...@wipro.com> wrote:

> Hi Team,
>
>
> I am looking for HA-Proxy configuration help in our project. Can I get in
> touch with someone who can give more information on configuration using
> 2 different HA-Proxy servers for high availability?
>
>
> Feel free to contact me on - 9849916124
>

Hey there,


Welcome to the public mailing list for users of the open source haproxy
tool.


You'd probably do best by posting the configuration and HA setup as far as
you've managed to get it going, and asking questions about specific
problems you encounter along the way. You're more likely to get help via
email than via telephone!


Here is the starter guide for the current stable version:

http://cbonte.github.io/haproxy-dconv/1.8/intro.html. There are links along
the top of that page to the configuration and management manuals, which
will be of interest as you evolve your HA setup.


If, instead, you feel you would like to trade time for money, and want to
take advantage of a commercial support option, some are listed here:

http://www.haproxy.org/#supp


As a backstop, my UK company is already set up as a supplier inside Wipro's
procurement system. Do get in touch if the routes I've mentioned above
don't meet your needs :-)


All the best,

Jonathan

-- 

Jonathan Matthews

London, UK

http://www.jpluscplusm.com/contact.html





Re: HA-Proxy configuration

2018-10-10 Thread Aleksandar Lazic
Hi.

Have you seen this list of links? It has some good examples of HA solutions.

http://www.haproxy.org/#link

In general, keepalived and haproxy are a good combination.

On which platform do you plan to run the HA setup?

Best regards
Aleks


-------- Original Message --------
From: "anjireddy.komire...@wipro.com" 
Sent: 10 October 2018 08:05:24 CEST
To: "haproxy@formilux.org" 
CC: "santhosh.pa...@wipro.com" 
Subject: HA-Proxy configuration

Hi Team,


I am looking for HA-Proxy configuration help in our project. Can I get in 
touch with someone who can give more information on configuration using 
2 different HA-Proxy servers for high availability?


Feel free to contact me on - 9849916124


Regards,

Anjireddy.

The information contained in this electronic message and any attachments to 
this message are intended for the exclusive use of the addressee(s) and may 
contain proprietary, confidential or privileged information. If you are not the 
intended recipient, you should not disseminate, distribute or copy this e-mail. 
Please notify the sender immediately and destroy all copies of this message and 
any attachments. WARNING: Computer viruses can be transmitted via email. The 
recipient should check this email and any attachments for the presence of 
viruses. The company accepts no liability for any damage caused by any virus 
transmitted by this email. www.wipro.com


HA-Proxy configuration

2018-10-10 Thread anjireddy.komire...@wipro.com
Hi Team,


I am looking for HA-Proxy configuration help in our project. Can I get in 
touch with someone who can give more information on configuration using 
2 different HA-Proxy servers for high availability?


Feel free to contact me on - 9849916124


Regards,

Anjireddy.
