Hi!
My system runs FreeBSD 12.2 on ZFS. Samba 4.13.14 runs in a jail with
Bind 9.16.23 as its DNS backend. I also have Bind 9.16.23 on another
server, working as a secondary DNS. The secondary Bind gets its zones
from the DC by zone transfer with a TSIG key. The DC listens on several
subnetworks (loopback and 3 others).
Some time ago I moved the DC from one jail to another, and now I see
strange behaviour of Bind on the new DC.
When I set another DNS server in resolv.conf on the new DC, for example
the old DC or the secondary Bind, everything works fine. The new DC
successfully resolves any record with the nslookup or host commands,
from itself or from another host.
When I set localhost or its own internal IP in resolv.conf on the new
DC, Bind periodically freezes, with the following pattern:
- Bind stops replying to requests for ~5 minutes. Then it starts working
again without a service restart, and later freezes again.
- In the daytime (when employees are in the office) it freezes after
less than 1 minute of work; at night, after 10-15 minutes.
- If I change resolv.conf from the secondary Bind to the internal IP, I
do not need to restart Bind or Samba for the periodic freezing to start
or stop. I just change the nameserver record and wait. If Bind was
frozen when resolv.conf was changed, it stays frozen until ~5 minutes
after the freeze started, and then works fine.
- If I change resolv.conf from the secondary Bind to loopback, I DO need
to restart Bind for the freezing to start or stop.
- When Bind is frozen, the service cannot be stopped with the stop
command and the process is not killed by a default kill; only kill -9
works.
- The internal Samba DNS works fine and does not freeze when resolv.conf
points to localhost.
- Sometimes Bind does not freeze for all subnetworks. It can freeze for
localhost and 2 subnetworks, while in the one remaining subnetwork the
DC's Bind still successfully resolves any records from any subnetwork.
But I saw this situation only once and cannot reproduce it for now.
- There are no special Bind log records with "debug 50" at the time of
the freeze or before it. It freezes after arbitrary messages, and I see
all of the same messages in the log when Bind works without freezing.
- I tried to run Bind with logging to the terminal, but did not see any
additional information when it froze. The terminal output is the same
as in the log files.
- rndc freezes as well.
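
To record exactly when each listen address stops answering and for how
long, a small UDP probe could be left running on another host or jail.
This is only a sketch under assumptions: the function names
(build_query, probe, watch), the addresses and the record name are
hypothetical placeholders, and the probe only checks that some reply
arrives within the timeout, without validating its content.

```python
import socket
import struct
import time


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query packet (RFC 1035 wire format) for `name`."""
    # Header: ID, flags (only RD set), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    # Root label terminator, then QTYPE (default A) and QCLASS IN.
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)


def probe(server, name, timeout=2.0, port=53):
    """Return the round-trip time in seconds, or None if no reply arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(build_query(name), (server, port))
            sock.recvfrom(4096)
        except OSError:  # includes timeouts and "connection refused"
            return None
        return time.monotonic() - start


def watch(servers, name, interval=5.0):
    """Print one timestamped line per server per round, until interrupted."""
    while True:
        for server in servers:
            rtt = probe(server, name)
            status = "no reply" if rtt is None else f"{rtt * 1000:.1f} ms"
            print(time.strftime("%H:%M:%S"), server, status, flush=True)
        time.sleep(interval)
```

For example, watch(["127.0.0.1", "192.0.2.10"], "dc1.example.com") would
print one line per address every 5 seconds; both addresses and the name
are placeholders for the DC's real listen IPs and a real record. The
timestamps of the "no reply" lines can then be compared per subnetwork
and against the Bind log.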
I found one workaround for this problem. The server that hosts the jail
with the DC has 40 CPUs (20 cores and 40 threads). Therefore, when I
start named, it creates 40 workers for every listen IP, i.e. 40 TCP and
40 UDP for every IP.
Because this is too much for my configuration, I intuitively decided to
try decreasing the number of named workers to 10 with "-n 10". With that
setting and the correct resolv.conf, everything has worked without
freezing for the last 2 weeks. Afterwards I tried setting "-n 40", the
same value that named chooses automatically. After the restart named
froze again. Maybe it was a coincidence, but with other settings named
does not stop freezing. I also noticed that when named works without
freezing, the "number of zones" in the "rndc status" output decreases
from 9 to 3. It seems that named has lost the Samba zones, yet resolving
records from them works fine.
I tried to collect some traces with ktrace and caught the moment of a
freeze. After the last record from the usual log (when Bind freezes),
the following records start repeating many times in kdump:
36460 named CALL nanosleep(0x7fffffffea30,0)
36460 named RET nanosleep 0
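
To see whether nanosleep really dominates the trace during a freeze, the
kdump text output could be summarized per system call. This is a rough
sketch: the function name count_calls is hypothetical, and the optional
second numeric column is an assumption about how a thread ID would
appear if kdump is asked to print one.

```python
import re
from collections import Counter

# Matches lines such as:
#   36460 named CALL nanosleep(0x7fffffffea30,0)
# An optional second numeric column (assumed to be a thread ID) is allowed.
CALL_RE = re.compile(r"^\s*(\d+)\s+(?:(\d+)\s+)?(\S+)\s+CALL\s+(\w+)")


def count_calls(lines):
    """Count CALL records per (pid, tid, syscall); tid is None if absent."""
    counts = Counter()
    for line in lines:
        match = CALL_RE.match(line)
        if match:
            pid, tid, _command, syscall = match.groups()
            counts[(pid, tid, syscall)] += 1
    return counts
```

Feeding it the lines of a saved kdump text file and printing
count_calls(lines).most_common(10) would show which calls repeat most
often in the captured window.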
What can be wrong here? How can I narrow down the problem further?