Re: Help! HAProxy randomly failing health checks!

Igor Cicimov Sat, 19 Mar 2016 02:45:16 -0700

On Thu, Mar 17, 2016 at 12:46 PM, Igor Cicimov <
ig...@encompasscorporation.com> wrote:


>
>
> On Thu, Mar 17, 2016 at 11:14 AM, Zachary Punches <zpunc...@getcake.com>
> wrote:
>
>> I wanna say average is like 4-6 connections a second? Super minimal
>>
>> From what I’ve seen in the logs during the SSL errors, the log hangs then
>> outputs a bunch of SSL errors all at once.
>>
>> Here it the output from sysctl –p
>> net.ipv4.ip_forward = 0
>> net.ipv4.conf.default.rp_filter = 1
>> net.ipv4.conf.default.accept_source_route = 0
>> kernel.sysrq = 0
>> kernel.core_uses_pid = 1
>> net.ipv4.tcp_syncookies = 1
>> error: "net.bridge.bridge-nf-call-ip6tables" is an unknown key
>> error: "net.bridge.bridge-nf-call-iptables" is an unknown key
>> error: "net.bridge.bridge-nf-call-arptables" is an unknown key
>> kernel.msgmnb = 65536
>> kernel.msgmax = 65536
>> kernel.shmmax = 68719476736
>> kernel.shmall = 4294967296
>>
>> What would lowering the tune.ssl.default-dh-param to 1024 do?
>> From: Igor Cicimov <ig...@encompasscorporation.com>
>> Date: Wednesday, March 16, 2016 at 5:01 PM
>> To: Zachary Punches <zpunc...@getcake.com>
>> Cc: Baptiste <bed...@gmail.com>, "haproxy@formilux.org" <
>> haproxy@formilux.org>
>> Subject: Re: Help! HAProxy randomly failing health checks!
>>
>>
>>
>> On Thu, Mar 17, 2016 at 10:55 AM, Zachary Punches <zpunc...@getcake.com>
>> wrote:
>>
>>> Thanks for the reply!
>>>
>>> Ok so based on what you saw in my config, does it look like we’re
>>> misconfigured enough to cause this to happen?
>>>
>>> If we were misconfigured, one would assume we would go down all the time
>>> yeah?
>>>
>>> From: Igor Cicimov <ig...@encompasscorporation.com>
>>> Date: Wednesday, March 16, 2016 at 4:50 PM
>>> To: Zachary Punches <zpunc...@getcake.com>
>>> Cc: Baptiste <bed...@gmail.com>, "haproxy@formilux.org" <
>>> haproxy@formilux.org>
>>> Subject: Re: Help! HAProxy randomly failing health checks!
>>>
>>>
>>>
>>> On Thu, Mar 17, 2016 at 10:47 AM, Igor Cicimov <
>>> ig...@encompasscorporation.com> wrote:
>>>
>>>>
>>>>
>>>> On Thu, Mar 17, 2016 at 5:29 AM, Zachary Punches <zpunc...@getcake.com>
>>>> wrote:
>>>>
>>>>> I’m not, these guys aren’t sitting behind an ELB. They sit behind
>>>>> route53 routing. If one of the proxy boxes fails 3 checks in 30 seconds
>>>>> (with 4 checks done a second) then Route53 changes its routing from the
>>>>> first proxy box to the second
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 3/15/16, 9:46 PM, "Baptiste" <bed...@gmail.com> wrote:
>>>>>
>>>>> >Maybe you're checking a third party VM :)
>>>>> >
>>>>>
>>>>
>>>> AFAIK the Route53 health checks come from different points around the
>>>> globe and it is possible that at some time of the day AWS has scheduled
>>>> some specific end points to perform the HC. And it is possible that those
>>>> ones have different SSL settings from the ones performing the HC during
>>>> your day time. I would suggest you bring up this issue with AWS support,
>>>> let them know your SSL cypher settings in HAP and ask if they are
>>>> compatible with ALL their servers performing SSL health checks.
>>>>
>>>> I personally haven't seen any issues with failed SSL handshakes coming
>>>> from AWS servers and have HAP's running in AU and UK regions.
>>>>
>>>> Igor
>>>>
>>>
>>> That is if you are absolutely sure that the failed handshakes are not
>>> caused by overload or misconfigured (system) settings on HAP
>>>
>>>
>> I was saying this in regards to system (kernel) settings. For example,
>> assuming Unix/Linux is your net.core.somaxconn actually set *higher* than
>> your maxconn which is set to 30000 and 15000 in HAP? Any other kernel
>> settings you might have changed? (output of "sysctl -p" command)
>>
>> What is your pick hour load, how many connections/sessions are you seeing
>> on each HAP?
>>
>> Another suggestion is maybe set tune.ssl.default-dh-param to 1024 and see
>> if that helps.
>>
>
>
> Ok, so on default ubuntu cloud instance this is what we have:
>
> # sysctl -a | grep maxconn
> net.core.somaxconn = 128
>
> which is too low for production server. Check your value and adjust it to
> your needs.
>
> By the way, what is accept-proxy doing there in your setup? Is there
> anything else in front of HAP using PROXY protocol?
>
>
Wait a minute:

bind *:1027 # Health checking port

are you maybe using https health check on a non SSL port?

Re: Help! HAProxy randomly failing health checks!

Reply via email to