Hello,

I'm stuck on an issue. Could you help me, please?

I have a service that gets about 1K connections/sec 
and 15K requests/sec at peak. 
My service must respond within 120 ms at most.
The client sends me these requests over keep-alive connections.

But I get a lot of 400 and 408 errors.
Example:
... <NOSRV> -1/-1/-1/-1/50185 408 212 - - cR-- 5236/1728/0/0/0 0/0 "<BADREQ>"
... <NOSRV> -1/-1/-1/-1/13282 400 187 - - CR-- 5169/3506/0/0/0 0/0 "<BADREQ>"

In the docs, I found the explanation of these errors:
- for the first one:
       The client never completed its request, which was aborted by the
       time-out ("c---") after 50s, while the proxy was waiting for the request
       headers ("-R--").  Nothing was sent to any server, but the proxy could
       send a 408 return code to the client.
- for the second one: 
       the client never completed its request and aborted itself ("C---") after
       8.5s, while the proxy was waiting for the request headers ("-R--").
       Nothing was sent to any server.

As I understand it, the client established a connection 
and in the first case didn't send any request 
for 50 sec (so I closed the connection with a 408 code). 
In the second case the client aborted 
the connection without having sent a complete request.
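
To double-check my reading of the two log lines, here is a small Python sketch that decodes the timers and the termination flags. The regular expression and the `decode` helper are my own illustration, not something HAProxy provides, and the field order is assumed from the "option httplog" format as I understand it from the docs.

```python
import re

# Hypothetical helper (not part of HAProxy): decodes the tail of an
# "option httplog" line such as the two samples in this post.
# Assumed field order: Tq/Tw/Tc/Tr/Tt status bytes cookie cookie
# termination_state actconn/feconn/beconn/srv_conn/retries
LOG_TAIL = re.compile(
    r'(?P<tq>-?\d+)/(?P<tw>-?\d+)/(?P<tc>-?\d+)/(?P<tr>-?\d+)/(?P<tt>\+?\d+)\s+'
    r'(?P<status>\d{3})\s+(?P<bytes>\+?\d+)\s+\S+\s+\S+\s+'
    r'(?P<state>\S{4})\s+'
    r'(?P<actconn>\d+)/(?P<feconn>\d+)/(?P<beconn>\d+)/'
    r'(?P<srvconn>\d+)/(?P<retries>\+?\d+)'
)

# Only the flags that appear in my logs are mapped here.
FIRST_FLAG = {
    "c": "client-side timeout expired",
    "C": "client aborted the connection",
}
SECOND_FLAG = {"R": "proxy was waiting for a complete request"}


def decode(line):
    """Return the interesting fields of one log line, or None if no match."""
    m = LOG_TAIL.search(line)
    if m is None:
        return None
    state = m.group("state")
    return {
        "total_ms": int(m.group("tt")),
        "status": int(m.group("status")),
        "state": state,
        "cause": FIRST_FLAG.get(state[0], "other"),
        "phase": SECOND_FLAG.get(state[1], "other"),
        "actconn": int(m.group("actconn")),
        "feconn": int(m.group("feconn")),
    }


if __name__ == "__main__":
    samples = [
        '... <NOSRV> -1/-1/-1/-1/50185 408 212 - - cR-- 5236/1728/0/0/0 0/0 "<BADREQ>"',
        '... <NOSRV> -1/-1/-1/-1/13282 400 187 - - CR-- 5169/3506/0/0/0 0/0 "<BADREQ>"',
    ]
    for sample in samples:
        print(decode(sample))
```

If this reading is right, the first line is a connection that stayed silent until "timeout client" (50s) expired, and the second is a connection the client closed itself after about 13 s, in both cases before a full request arrived.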

Maybe my settings (e.g. sysctl.conf or something else) 
are not optimized for such a load?
The main problem is that I'm not a system administrator, 
I'm a developer ((( 
But over the last couple of months I have had to read so many docs )))

Another question:
Is it possible to abort a request with a 204 code 
if it takes longer than a given time (e.g. 120 ms)?

The configuration file is:

global
    daemon
    pidfile /var/run/haproxy-3.pid
    maxconn 250000
    tune.bufsize    8024
    log 127.0.0.1 local0

defaults
    log global
    mode http
    option httplog
    #option dontlognull
    option dontlog-normal

    no option httpclose
    option http-server-close
    no option forceclose
    option forwardfor
    balance roundrobin
    option redispatch
    retries 3

    timeout client 50s
    timeout http-keep-alive 30s
    timeout server 50s
    timeout connect 10s

frontend http_front
    maxconn 30000
    bind    xxx.xxx.xxx.xxx:80
    reqadd X-Scheme:\ http

    acl is_value path_beg /some/path/
    use_backend some_backend if is_value

backend some_backend
    option http-server-close

    server server1.1 xxx.xxx.xxx.xxx:8101 weight 1 maxconn 100 check port 8101
    server server1.2 xxx.xxx.xxx.xxx:8102 weight 1 maxconn 100 check port 8102
    # ..... (I have about 32 app instances on 4 servers)
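
As a rough sanity check of the backend limits above, here is a back-of-the-envelope calculation in Python. The numbers come from this post; the assumption that every request takes the full 120 ms is a worst case I made up for the estimate, not a measurement, and "32 instances" is approximate.

```python
# Figures taken from this post. Treating every request as taking the
# full 120 ms is an assumed worst case, not something I measured.
peak_requests_per_sec = 15_000
max_response_time_sec = 0.120
app_instances = 32          # approximate count across 4 servers
maxconn_per_server = 100    # per "server" line in the backend

# Requests in flight at once = arrival rate x time spent per request.
in_flight = peak_requests_per_sec * max_response_time_sec
backend_slots = app_instances * maxconn_per_server

print(f"worst-case requests in flight: {in_flight:.0f}")
print(f"backend slots (instances x maxconn): {backend_slots}")
print(f"headroom factor: {backend_slots / in_flight:.2f}")
```

Under these assumptions the backend offers about 3200 slots for roughly 1800 requests in flight, so the per-server maxconn does not look like the obvious bottleneck, but I may be missing something.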



Here is sysctl.conf:
# TCP tuning

# Do a 'modprobe tcp_cubic' first
net.ipv4.tcp_congestion_control = cubic

# Turn on the tcp_window_scaling
net.ipv4.tcp_window_scaling = 1

# Increase the maximum total buffer-space allocatable
# This is measured in units of pages (4096 bytes)
net.ipv4.tcp_mem = 65536 131072 262144
#net.ipv4.tcp_mem = 4096 1048576 16777216
net.ipv4.udp_mem = 65536 131072 262144

# Increase the read-buffer space allocatable
net.ipv4.tcp_rmem = 8192 87380 16777216
#net.ipv4.tcp_rmem = 4096 1048576 16777216
net.ipv4.udp_rmem_min = 16384
net.core.rmem_default = 131072
#net.core.rmem_default = 1048576
net.core.rmem_max = 16777216

# Increase the write-buffer-space allocatable
net.ipv4.tcp_wmem = 8192 65536 16777216
#net.ipv4.tcp_wmem = 4096 1048576 16777216
net.ipv4.udp_wmem_min = 16384
net.core.wmem_default = 131072
#net.core.wmem_default = 1048576
net.core.wmem_max = 16777216

# Increase number of incoming connections backlog
#net.core.netdev_max_backlog = 60000
net.core.netdev_max_backlog = 4096
net.core.dev_weight = 64

# Increase number of incoming connections
net.core.somaxconn = 60000

# Increase the maximum amount of option memory buffers
net.core.optmem_max = 65536 # 20480

# Increase the tcp-time-wait buckets pool size 
# to prevent simple DOS attacks
net.ipv4.tcp_max_tw_buckets = 1440000 # 131072
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_tw_reuse = 1

# Limit number of orphans, each orphan can eat up to 16M 
# (max wmem) of unswappable memory
net.ipv4.tcp_max_orphans = 16384
net.ipv4.tcp_orphan_retries = 0

# don't cache ssthresh from previous connection
net.ipv4.tcp_no_metrics_save = 1 # 0
net.ipv4.tcp_moderate_rcvbuf = 1

net.core.netdev_budget = 30000
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_fack = 1
net.ipv4.tcp_fin_timeout = 10
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_low_latency = 0

net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_sack = 1
net.ipv4.tcp_synack_retries = 3
net.ipv4.tcp_syncookies = 1

