Daniel,

Here's a different approach to your problem.

As someone who wears too many hats, I am often asked about re-encrypting connections that are terminated by haproxy. Since this traffic flows between a small number of systems, it is much more efficient to build a small IPsec overlay network between the haproxy and target servers.

This probably cuts the encryption cost of each connection by an order of magnitude or more. You do IKE renegotiation on the order of once an hour, versus TLS handshakes hundreds of times per second, and the bulk message encryption is quite fast on any modern server processor. You also don't need to manage certs on the inside machines. These load reductions apply to both ends, so the tomcat machines will benefit as well.
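To put rough numbers on that ratio (the handshake rate below is a hypothetical figure for illustration, not a measurement from Daniel's setup):

```python
# Hypothetical numbers to illustrate the ratio, not measurements from this thread.
tls_handshakes_per_sec = 200      # expensive public-key operations at the edge
ike_renegotiations_per_hour = 1   # one IKE rekey per hour on the tunnel

tls_handshakes_per_hour = tls_handshakes_per_sec * 3600
ratio = tls_handshakes_per_hour // ike_renegotiations_per_hour

print(tls_handshakes_per_hour)  # 720000
print(ratio)                    # 720000 TLS handshakes per IKE renegotiation
```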

The downside is that you either need to mesh the IPsec connections or run a routing daemon to handle reachability on the IPsec overlay. One way or the other, you will probably need some automation to get this into production. (I build these as a somewhat star-like set of IPsec connections with the haproxy systems at the center, then run a dirt-simple routing protocol with bird.)
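For the routing piece, a minimal bird sketch for the hub, assuming the tunnels appear as kernel interfaces named ipsec0, ipsec1, and so on (the interface names, area, and costs are all hypothetical):

```
# /etc/bird.conf on a haproxy hub -- a sketch, not a production config
protocol kernel {
    export all;        # push routes learned via OSPF into the kernel table
}
protocol device {
}
protocol ospf {
    area 0 {
        interface "ipsec*" { cost 10; };   # run OSPF over the tunnel interfaces
        interface "lo" { stub; };          # announce loopback, form no adjacency
    };
}
```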

At your traffic rate, I would do this as my default approach.

jerry

On 06/22/2017 01:21 AM, Daniel Heitepriem wrote:
Hi everyone,

thanks for your suggestions. Let me go through them step by step:

    Actually, I would have suggested the opposite: making the whole
    thing less expensive, by going full blown keep-alive with
    http-reuse:

    option http-keep-alive
    option prefer-last-server
    timeout http-keep-alive 30s
    http-reuse safe

I will try these settings, thank you Lukas. If I understood the manual
correctly, with "prefer-last-server" set HAProxy tries to reuse an
already established connection to a backend for an active session
instead of rerouting it to another backend.
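For context, these directives would all live together in a defaults (or listen/backend) section; the section, server names, and addresses below are made up:

```
# Sketch only -- backend name, server names, and addresses are hypothetical
defaults
    mode http
    option http-keep-alive
    option prefer-last-server
    timeout http-keep-alive 30s
    http-reuse safe

backend tomcats
    balance roundrobin
    server tc1 10.0.0.11:8443 ssl verify none
    server tc2 10.0.0.12:8443 ssl verify none
```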

    Why specify ulimit? Haproxy will do this for you, you are just
    asking for trouble. I suggest you remove this.

By default, Solaris 11 has a ulimit of 256:
-bash-4.4$ ulimit -n
256
If HAProxy can raise the limit beyond these 256 file descriptors on its
own, that would be fine and the "ulimit" parameter indeed isn't necessary.
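As a rough sanity check (a back-of-the-envelope rule, not a figure from the haproxy docs): each proxied connection needs a front-side and a back-side socket, so the descriptor budget is about 2 * maxconn plus a little overhead for listeners, health checks, and logging:

```python
def fd_budget(maxconn, listeners=10, overhead=100):
    """Rough file-descriptor budget for an HTTP proxy: one frontend
    and one backend socket per connection, plus listeners and misc."""
    return 2 * maxconn + listeners + overhead

# Solaris' default soft limit of 256 covers only about 70 concurrent
# connections under this estimate:
print(fd_budget(70))    # 250
print(fd_budget(2000))  # 4110
```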

    Maybe something on your backend (conntrack or the application)
    is rate-limiting per IP, or the aggressive client you are facing
    is keep-aliving properly with the backend, while it doesn't when
    using haproxy.

A rate limit per IP is not active on any of our backends. I rather
suspect that our HAProxy config isn't sane and has some paradoxical
parameters in it. Most of the clients which access our application
use multiple machines which are NATed to the same IP on their side,
so on our side we just see one incoming IP with several hundred to a
few thousand connections.

    if we can see the tomcat connector settings (and logs possibly)
    maybe something there is causing issues.


Here are our Tomcat connector settings, which are identical across our
backends:

    <Connector port="8443"
protocol="org.apache.coyote.http11.Http11NioProtocol"
               maxThreads="1024" enableLookups="false"
               acceptCount="500"
               compression="on" compressableMimeType="application/xml"
               clientAuth="false" URIEncoding="UTF-8"
               keystoreFile="/opt/tomcat/conf/.keystore"
               keystorePass="XXX" keyAlias="tomcat"
               SSLEnabled="true" scheme="https" secure="true"
               sslEnabledProtocols="TLSv1.2"
               ciphers="TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256,
                        TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,
                        TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384,
                        TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,
                        TLS_RSA_WITH_AES_128_CBC_SHA256,
                        TLS_RSA_WITH_AES_128_CBC_SHA,
                        TLS_RSA_WITH_AES_256_CBC_SHA256,
                        TLS_RSA_WITH_AES_256_CBC_SHA" />

First, I will try the settings that Lukas suggested. This could take
some time, as we have to reproduce the problem in our test environment.
I will get back to you once I have some results.

Thank you very much and regards,
Daniel

--
Jerry Scharf, Soundhound DevOps
"What could possibly go wrong?"
