Err... more precisely:

HA-Proxy version 1.4.15 2011/04/08
Copyright 2000-2010 Willy Tarreau <[email protected]>

Build options :
  TARGET  = linux26
  CPU     = generic
  CC      = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing
  OPTIONS = USE_REGPARM=1 USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200

Encrypted password support via crypt(3): yes

Available polling systems :
     sepoll : pref=400,  test result OK
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 4 (4 usable), will use sepoll.
On May 24, 2012, at 5:31 PM, Lange, Kevin M. (GSFC-423.0)[RAYTHEON COMPANY]
wrote:
>
>
> I thought it was a bug in the reporting, considering we've played with
> numerous values for the various timeouts as an experiment, but wanted your
> thoughts.
> This is v1.4.15.
>
> [root@opsslb1 log]# haproxy -v
> HA-Proxy version 1.4.15 2011/04/08
> Copyright 2000-2010 Willy Tarreau <[email protected]>
>
> On May 24, 2012, at 5:17 PM, Willy Tarreau wrote:
>
>> Hi Kevin,
>>
>> On Thu, May 24, 2012 at 04:04:03PM -0500, Lange, Kevin M.
>> (GSFC-423.0)[RAYTHEON COMPANY] wrote:
>>> Hi,
>>> We're having odd behavior (apparently have always but didn't realize it),
>>> where our backend httpchks "time out":
>>>
>>> May 24 04:03:33 opsslb1 haproxy[4594]: Server webapp_ops_bk/webapp_ops1 is
>>> DOWN, reason: Layer7 timeout, check duration: 1002ms. 0 active and 0 backup
>>> servers left. 1 sessions active, 0 requeued, 0 remaining in queue.
>>> May 24 04:41:55 opsslb1 haproxy[4594]: Server webapp_ops_bk/webapp_ops1 is
>>> DOWN, reason: Layer7 timeout, check duration: 1001ms. 0 active and 0 backup
>>> servers left. 2 sessions active, 0 requeued, 0 remaining in queue.
>>> May 24 08:38:10 opsslb1 haproxy[4594]: Server webapp_ops_bk/webapp_ops1 is
>>> DOWN, reason: Layer7 timeout, check duration: 1002ms. 0 active and 0 backup
>>> servers left. 1 sessions active, 0 requeued, 0 remaining in queue.
>>> May 24 08:53:37 opsslb1 haproxy[4594]: Server webapp_ops_bk/webapp_ops2 is
>>> DOWN, reason: Layer7 timeout, check duration: 1001ms. 0 active and 0 backup
>>> servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
>>> May 24 09:32:20 opsslb1 haproxy[4594]: Server webapp_ops_bk/webapp_ops2 is
>>> DOWN, reason: Layer7 timeout, check duration: 1002ms. 0 active and 0 backup
>>> servers left. 3 sessions active, 0 requeued, 0 remaining in queue.
>>> May 24 09:35:01 opsslb1 haproxy[4594]: Server webapp_ops_bk/webapp_ops3 is
>>> DOWN, reason: Layer7 timeout, check duration: 1001ms. 0 active and 0 backup
>>> servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
>>> May 24 09:41:37 opsslb1 haproxy[4594]: Server webapp_ops_bk/webapp_ops2 is
>>> DOWN, reason: Layer7 timeout, check duration: 1001ms. 0 active and 0 backup
>>> servers left. 1 sessions active, 0 requeued, 0 remaining in queue.
>>> May 24 09:56:41 opsslb1 haproxy[4594]: Server webapp_ops_bk/webapp_ops3 is
>>> DOWN, reason: Layer7 timeout, check duration: 1002ms. 0 active and 0 backup
>>> servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
>>> May 24 10:01:45 opsslb1 haproxy[4594]: Server webapp_ops_bk/webapp_ops1 is
>>> DOWN, reason: Layer7 timeout, check duration: 1001ms. 0 active and 0 backup
>>> servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
>>>
>>>
>>> We've been playing with the timeout values, and we don't know what is
>>> controlling the "Layer7 timeout, check duration: 1002ms". The backend
>>> service availability check (by hand) typically takes 2-3 seconds on average.
>>> Here is the relevant haproxy setup.
>>>
>>> #---------------------------------------------------------------------
>>> # Global settings
>>> #---------------------------------------------------------------------
>>> global
>>>     log-send-hostname opsslb1
>>>     log 127.0.0.1 local1 info
>>>     # chroot /var/lib/haproxy
>>>     pidfile /var/run/haproxy.pid
>>>     maxconn 1024
>>>     user haproxy
>>>     group haproxy
>>>     daemon
>>>
>>> #---------------------------------------------------------------------
>>> # common defaults that all the 'listen' and 'backend' sections will
>>> # use if not designated in their block
>>> #---------------------------------------------------------------------
>>> defaults
>>>     mode http
>>>     log global
>>>     option dontlognull
>>>     option httpclose
>>>     option httplog
>>>     option forwardfor
>>>     option redispatch
>>>     timeout connect 500 # default 10 second time out if a backend is not found
>>>     timeout client 50000
>>>     timeout server 3600000
>>>     maxconn 60000
>>>     retries 3
>>>
>>> frontend webapp_ops_ft
>>>
>>>     bind 10.0.40.209:80
>>>     default_backend webapp_ops_bk
>>>
>>> backend webapp_ops_bk
>>>     balance roundrobin
>>>     option httpchk HEAD /app/availability
>>>     reqrep ^Host:.* Host:\ webapp.example.com
>>>     server webapp_ops1 opsapp1.ops.example.com:41000 check inter 30000
>>>     server webapp_ops2 opsapp2.ops.example.com:41000 check inter 30000
>>>     server webapp_ops3 opsapp3.ops.example.com:41000 check inter 30000
>>>     timeout check 15000
>>>     timeout connect 15000
>>
>> This is quite strange. The timeout is defined first by "timeout check" or if
>> unset, by "inter". So in your case you should observe a 15sec timeout, not
>> one second.
>>
>> What exact version is this ? (haproxy -vv)
>>
>> It looks like a bug, however it could be a bug in the timeout handling as
>> well as in the reporting. I'd suspect the latter since you're saying that
>> the service takes 2-3 sec to respond and you don't seem to see errors
>> that often.
>>
>> Regards,
>> Willy
>>
>
> Kevin Lange
> [email protected]
> [email protected]
> W: +1 (301) 851-8450
> Raytheon | NASA | ECS Evolution Development Program
> https://www.echo.com | https://www.raytheon.com
>
Kevin Lange
[email protected]
[email protected]
W: +1 (301) 851-8450
Raytheon | NASA | ECS Evolution Development Program
https://www.echo.com | https://www.raytheon.com
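
The rule Willy states in the quoted reply ("timeout check" first, or "inter" if it is unset) can be written down as a small sketch to show how far the logged check durations are from what the configuration should give. This models only that one sentence, not HAProxy's actual check code; the function name effective_check_timeout_ms is hypothetical, and the numbers are copied from the config and log lines quoted in this thread.

```python
from typing import Optional


def effective_check_timeout_ms(timeout_check_ms: Optional[int], inter_ms: int) -> int:
    """Hypothetical helper: "timeout check" if set, otherwise "inter".

    This encodes only the rule as stated in the thread, under the
    assumption that no other setting affects the check timeout.
    """
    if timeout_check_ms is not None:
        return timeout_check_ms
    return inter_ms


# Values from backend webapp_ops_bk: "timeout check 15000", "check inter 30000".
expected_ms = effective_check_timeout_ms(timeout_check_ms=15000, inter_ms=30000)

# "check duration" values from the nine "Layer7 timeout" log lines quoted above.
observed_ms = [1002, 1001, 1002, 1001, 1002, 1001, 1001, 1002, 1001]

print(f"expected check timeout: {expected_ms} ms")
print(f"longest logged check duration: {max(observed_ms)} ms")
print(f"gap: {expected_ms - max(observed_ms)} ms")
```

Every logged duration falls just past one second, which matches neither the 15000 ms "timeout check" nor the 30000 ms "inter". That mismatch is why the report is being treated as a possible bug in either the timeout handling or the reporting.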

