Hi,

I've been able to get my hands on some rather nice servers with 2 x 12-core 
Intel CPUs and was wondering if anybody had any decent tuning tips to get BIND 
to respond at a faster rate.

I'm seeing that adding cores beyond a single die gives pretty much no real 
improvement in throughput. I understand the NUMA boundaries etc., but this 
hasn't been my experience on previous iterations of Intel CPUs, at least not 
this dramatically. When I use more than a single die, CPU utilization continues 
to rise with the core count, but throughput doesn't increase to match.

All the testing I've done so far (dnsperf from multiple sources) seems to 
plateau around 340k qps per BIND host.

Some notes:
- Primarily looking at UDP throughput here
- Intention is for high-throughput, authoritative only
- The zone files used for testing are fairly small and reside completely 
in-memory; no disk IO involved
- RHEL7, bind 9.10 series, iptables 'NOTRACK' firmly in place
- Current configure:

built by make with '--build=x86_64-redhat-linux-gnu' 
'--host=x86_64-redhat-linux-gnu' '--program-prefix=' 
'--disable-dependency-tracking' '--prefix=/usr' '--exec-prefix=/usr' 
'--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' 
'--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib64' 
'--libexecdir=/usr/libexec' '--sharedstatedir=/var/lib' 
'--mandir=/usr/share/man' '--infodir=/usr/share/info' '--localstatedir=/var' 
'--with-libtool' '--enable-threads' '--enable-ipv6' '--with-pic' 
'--enable-shared' '--disable-static' '--disable-openssl-version-check' 
'--with-tuning=large' '--with-libxml2' '--with-libjson' 
'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 
'CFLAGS= -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions 
-fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 
-mtune=generic -fPIC' 'LDFLAGS=-Wl,-z,relro ' 'CPPFLAGS= -DDIG_SIGCHASE -fPIC'

Things tried:
- Using 'taskset' to bind to a single CPU die and limiting BIND's CPU count 
with '-n' doesn't improve much over letting BIND make its own decision
- NIC interfaces are set for TOE
- rmem & wmem changes (beyond a point) seem to do little to improve 
performance; they mainly just make throughput more consistent
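
For reference, the single-die pinning above can be sketched roughly as below. 
This is only an illustration, not the exact commands used: it assumes the CPU 
to NUMA node mapping is taken from `lscpu -p=CPU,NODE`, that one NUMA node 
corresponds to one die, and the helper names (cpus_by_node, cpu_list, 
pinned_command) are made up for the example. 'taskset -c' and named's '-n' and 
'-f' are the only real flags involved.

```python
from collections import defaultdict


def cpus_by_node(lscpu_text):
    """Group logical CPU ids by NUMA node.

    Expects text in the two-column CSV form printed by
    `lscpu -p=CPU,NODE` (comment lines start with '#').
    """
    nodes = defaultdict(list)
    for line in lscpu_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        cpu, node = line.split(",")[:2]
        if node == "":
            # No NUMA information reported for this CPU; skip it.
            continue
        nodes[int(node)].append(int(cpu))
    return {n: sorted(c) for n, c in nodes.items()}


def cpu_list(cpus):
    """Format sorted CPU ids as a taskset-style list, e.g. '0-2,4'."""
    ranges = []
    start = prev = None
    for c in cpus:
        if start is None:
            start = prev = c
        elif c == prev + 1:
            prev = c
        else:
            ranges.append((start, prev))
            start = prev = c
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def pinned_command(lscpu_text, node, named_args=("-f",)):
    """Build an argv that pins named to one NUMA node.

    Assumption: '-n' is set to the number of logical CPUs on that node,
    which counts hyperthreads if they are enabled.
    """
    cpus = cpus_by_node(lscpu_text)[node]
    return ["taskset", "-c", cpu_list(cpus),
            "named", "-n", str(len(cpus)), *named_args]
```

On a real host the input text would come from something like 
subprocess.run(["lscpu", "-p=CPU,NODE"], capture_output=True, text=True).stdout, 
and the resulting list can be handed to subprocess.run() or joined into a shell 
command line.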

I've yet to investigate the switch throughput or tweaking (don't yet have 
access to it).

So, any thoughts?

Stuart