Hey,
having a dual-node 6.0 setup in production, I decided to move forward with
one of the machines and upgrade it to 6.1-stable. This ended up with the
benchmark tool "locking" the 6.1 machine.

Background:
Nodes are Xeon E5-2642v3 3.4GHz x12, 16G RAM, 64G DOM modules as disks,
4x X540T (ix) - 2x on-board and 2x PCI-card.

All 4x X540T are connected to 2x Cisco Nexus 3000-series switches, forming
LACP trunks (1x on-board + 1x PCI each):
trunk0 - external (VLAN), 1x NIC connected to switch1 and 1x NIC connected to
switch2 (ix0 + ix3)
trunk1 - internal (VLAN), 1x NIC connected to switch1 and 1x NIC connected to
switch2 (ix1 + ix2)
As I have 2x Nexus 3000, vPC is configured and sits on top of the LACP
port-channel on their end.
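
For reference, the trunks are plain hostname.if(5) LACP trunks with the VLANs
stacked on top, roughly like this (tags and addresses left out):

# /etc/hostname.trunk0  (external side, ix0 + ix3)
trunkproto lacp trunkport ix0 trunkport ix3
up

# /etc/hostname.vlanN  (one per external VLAN)
vlan <tag> vlandev trunk0
inet <address> <netmask>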

Each obsd node has several carp interfaces configured on top of trunk0,
and only one carp interface on trunk1 - carp1.

Each switch acts as a default gw (VRRP configured) for every existing VLAN,
except the one towards trunk1.
The default gateway for the switches themselves is the IP on carp1.
The switches run OSPF, as do the obsd nodes.
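
carp1 itself is nothing fancy; a minimal sketch of its hostname.if, with
vhid/password/addresses as placeholders (preemption comes from the
net.inet.carp.preempt sysctl further down):

# /etc/hostname.carp1  (internal gateway VIP; parent is trunk1,
# or the vlan on top of it if the internal net is tagged)
inet <address> <netmask> NONE vhid <id> carpdev trunk1 pass <secret>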

The obsd nodes are the front line, facing the Internet (2x uplinks go into
the 2x Nexus and traffic is then passed to the 2x obsd nodes).
They run relayd with SSL offload and plain HTTP.
Besides relayd, there are ospfd, ntpd, snmpd and bgpd (for distributed
blacklisting across the other global nodes).

The problem:
While doing a bench with https://github.com/wg/wrk from my laptop (OS X,
1Gbps max. pipe) against the environment (HTTPS), relayd experienced
problems handling the traffic.

shell# ./wrk -t16 -c1500 -d90s --latency https://<URL>

wrk hammering Apache 2.4 (behind those nodes) serving a txt file, with avg
7k-10k req/s as an output:

wrk -t16 -c1500 -d90s --latency https://<URL>/ping.txt
Running 2m test @ https://<URL>/ping.txt
  16 threads and 1500 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   131.17ms   70.91ms   1.97s    91.70%
    Req/Sec   651.06    135.80     1.09k    84.95%
  Latency Distribution
     50%  131.90ms
     75%  144.63ms
     90%  159.63ms
     99%  230.92ms
  927039 requests in 1.50m, 190.12MB read
  Socket errors: connect 0, read 0, write 0, timeout 1330
Requests/sec:  10290.54
Transfer/sec:      2.11MB

wrk hammering Apache 2.4, mod_proxy_balancer, with NodeJS nodes behind Apache:

wrk -t16 -c1500 -d90s --latency https://<URL>/nodejs
Running 2m test @ https://<URL>/nodejs
  16 threads and 1500 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   445.91ms  518.66ms   2.00s    83.49%
    Req/Sec    56.80     26.89   180.00     68.48%
  Latency Distribution
     50%  217.57ms
     75%  374.15ms
     90%    1.50s
     99%    1.95s
  80673 requests in 1.50m, 1.12GB read
  Socket errors: connect 0, read 5534, write 0, timeout 18099
Requests/sec:    895.42
Transfer/sec:     12.72MB     

'top' showed no interrupt load at all, but rather heavy system CPU and some
user CPU:
20-30% - user
80-90% - system
relayd (12 forks, matching the number of cores) - 99% usage.

I basically killed both machines running 6.0, hence my decision to upgrade
to 6.1.
However, during the tests against 6.0, my ssh session never got terminated
("kicked out"), even with this high load (0% CPU idle).
6.1 showed different symptoms - ssh session termination, login via the
web-based IPMI GUI hanging right after the login step,
ping not responding (from the switches and from node1, which is still on 6.0).
After a while, with the bench aborted, 6.1 eventually let me in via ssh
(the terminal via IPMI still hanging).

snmpd, which has been running (remember), is polled by another system that
draws graphs.
What was seen on those graphs is a high rate of output error packets on the
trunks, not on the NICs (ix) themselves.
Also, syslog, with 'log all' enabled for relayd, showed a lot of
'buffer timeout event' messages,
and ospfd complaining about 'no buffer space available'.
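
The same per-interface error counters can be cross-checked on the box itself
with something like:

shell# netstat -i | grep -e trunk -e ix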

I had to modify relayd.conf to spawn only 8 preforks instead of 12,
and set

kern.maxclusters=24576 #12288
kern.maxfiles=65536 #32768

in order to survive the bench (i.e. keep the ssh session alive).
The values commented out are from the 6.0 setup.
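
Both can be bumped on a running system with sysctl(8), and mbuf usage
re-checked right away, before making them permanent in sysctl.conf:

shell# sysctl kern.maxclusters=24576 kern.maxfiles=65536
shell# netstat -m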

I’m looking for any advice here, which hopefully will lead to a stable and 
performant setup.
Configuration follows.

———sysctl.conf (obsd 6.0)————
net.inet.ip.forwarding=1
net.inet.ipcomp.enable=1        # 1=Enable the IPCOMP protocol
net.inet.etherip.allow=1        # 1=Enable the Ethernet-over-IP protocol
net.inet.tcp.ecn=1              # 1=Enable the TCP ECN extension
net.inet.carp.preempt=1 # 1=Enable carp(4) preemption
net.inet.carp.log=3             # log level of carp(4) info, default 2
ddb.panic=0                     # 0=Do not drop into ddb on a kernel panic
ddb.console=1                   # 1=Permit entry of ddb from the console
kern.pool_debug=0
net.inet.ip.maxqueue=2048
kern.somaxconn=4096
kern.maxclusters=12288
kern.maxfiles=32768
net.inet.ip.ifq.maxlen=2048


————login.conf———————
relayd:\
        :maxproc-max=31:\
        :openfiles-cur=65536:\
        :openfiles-max=65536:\
        :tc=daemon:
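
One thing to double-check after editing that class: if /etc/login.conf.db
exists it has to be rebuilt, and relayd restarted, before the new limits
take effect, e.g.:

shell# cap_mkdb /etc/login.conf
shell# rcctl restart relayd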

—————pf.conf———————
set block-policy drop
set limit { states 3000000, frags 2000, src-nodes 1000000 }
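
State usage during a run can be checked against that limit with pfctl(8),
e.g.:

shell# pfctl -si | grep "current entries"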

—————relayd.conf———————
interval 10
timeout 1000
prefork 8 #12
log all                 # <<< for debugging the situation
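
The relay definitions themselves are left out of the excerpt above; the setup
is the usual TLS-offload pattern, roughly along these lines (addresses below
are placeholders, not the real ones):

ext_addr = "192.0.2.10"                         # placeholder carp address

table <webhosts> { 192.0.2.21, 192.0.2.22 }     # placeholder apache backends

http protocol "www" {
        match request header set "X-Forwarded-For" value "$REMOTE_ADDR"
}

relay "tlsoffload" {
        listen on $ext_addr port 443 tls
        protocol "www"
        forward to <webhosts> port 80 check http "/ping.txt" code 200
}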

shell# netstat -m
1227 mbufs in use:
        626 mbufs allocated to data
        189 mbufs allocated to packet headers
        412 mbufs allocated to socket names and addresses
0/232/64 mbuf 2048 byte clusters in use (current/peak/max)
423/2865/120 mbuf 2112 byte clusters in use (current/peak/max)
0/160/64 mbuf 4096 byte clusters in use (current/peak/max)
0/200/64 mbuf 8192 byte clusters in use (current/peak/max)
0/14/112 mbuf 9216 byte clusters in use (current/peak/max)
0/20/80 mbuf 12288 byte clusters in use (current/peak/max)
0/16/64 mbuf 16384 byte clusters in use (current/peak/max)
0/8/64 mbuf 65536 byte clusters in use (current/peak/max)
23400 Kbytes allocated to network (5% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines
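
The same mbuf counters can also be watched live during a run with systat(1),
which makes it easier to catch the peaks:

shell# systat mbufs 1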

Kernel is stock, latest via syspatch.

P.S.
ifq.drops were never observed, nor a high ifq.len (max. 5 pkts in queue).
PF had max. 290k states.
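
The ifq counters can be read directly with:

shell# sysctl net.inet.ip.ifq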
 
Br
//mxb