... and only on one host.

So to start: yes, my clocks are in sync to within 5 seconds.

First the info on the setup:

There's one master server, ns00.example.net, and two slave servers, ns01.example.net and ns11.example.net. The master serves about a dozen zones to the slaves and uses TSIG for the transfers. To make it more interesting, I can't replicate the issue transferring example.net with ns01; its named does it fine, albeit with a different TSIG key. This is on CentOS 5.3 i386, which has BIND 9.3.4-P1 (more specifically, RPM says bind-9.3.4-10.P1.el5).


[r...@ns11 ~]# rndc reload example.net
zone refresh queued
[r...@ns11 ~]#
Jun 22 14:28:21 ns11 named[1744]: 22-Jun-2009 14:28:21.775 general: debug 1: received control channel command 'null'
Jun 22 14:28:21 ns11 named[1744]: 22-Jun-2009 14:28:21.776 general: debug 1: received control channel command 'reload example.net'
Jun 22 14:28:21 ns11 named[1744]: 22-Jun-2009 14:28:21.776 general: debug 1: queue_soa_query: zone example.net/IN: enter
Jun 22 14:28:21 ns11 named[1744]: 22-Jun-2009 14:28:21.776 general: debug 1: soa_query: zone example.net/IN: enter
Jun 22 14:28:22 ns11 named[1744]: 22-Jun-2009 14:28:22.247 general: debug 1: refresh_callback: zone example.net/IN: enter
Jun 22 14:28:22 ns11 named[1744]: 22-Jun-2009 14:28:22.247 general: info: zone example.net/IN: refresh: failure trying master 1.1.2.50#53 (source 0.0.0.0#0): tsig verify failure
Jun 22 14:28:22 ns11 named[1744]: 22-Jun-2009 14:28:22.247 general: debug 1: queue_soa_query: zone example.net/IN: enter
Jun 22 14:28:22 ns11 named[1744]: 22-Jun-2009 14:28:22.278 general: debug 1: soa_query: zone example.net/IN: enter
Jun 22 14:28:22 ns11 named[1744]: 22-Jun-2009 14:28:22.279 general: debug 1: cancel_refresh: zone example.net/IN: enter

But when I do another zone, it transfers fine. Keep in mind this is from the same master, so the TSIG settings are exactly the same (I've set them up per-IP, not per-zone):

Jun 22 14:31:14 ns11 named[1744]: 22-Jun-2009 14:31:14.008 general: info: zone example.com/IN: Transfer started.
Jun 22 14:31:14 ns11 named[1744]: 22-Jun-2009 14:31:14.008 general: debug 1: zone example.com/IN: requesting IXFR from 1.1.2.50#53
Jun 22 14:31:14 ns11 named[1744]: 22-Jun-2009 14:31:14.100 general: debug 1: zone example.com/IN: zone transfer finished: success
Jun 22 14:31:14 ns11 named[1744]: 22-Jun-2009 14:31:14.100 general: info: zone example.com/IN: transferred serial 2009062204: TSIG 'ns11.example.net-ns01.example.net'

I can't make heads or tails of *WHY* exactly TSIG is throwing the verify error. Even with debugging turned up to 99, the above is all I get in my logs.

Just to make things more interesting, if I do a TSIG AXFR query directly from dig on ns11, it works with example.net!

[r...@ns11 ~]# dig @1.1.1.50 example.net axfr -y ns11.example.net-ns01.example.net.:2HL0vpUE2JYFxv0YaAtrVg==

; <<>> DiG 9.3.4-P1 <<>> @1.1.1.50 example.net axfr -y ns11.example.net-ns01.example.net.
; (1 server found)
;; global options:  printcmd
example.net. 86400 IN SOA example.net. support.example.net. 2009062202 600 300 3600000 86400
[snip]
example.net. 86400 IN SOA example.net. support.example.net. 2009062202 600 300 3600000 86400
ns11.example.net-ns01.example.net. 0 ANY TSIG hmac-md5.sig-alg.reg.int. 1245707011 300 16 l+rb6H0RuqwXCT6H4G6JgQ== 49169 NOERROR 0
;; Query time: 272 msec
;; SERVER: 1.1.1.50#53(1.1.1.50)
;; WHEN: Mon Jun 22 14:43:31 2009
;; XFR size: 32 records (messages 1)
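
The TSIG record at the end of that transfer can be sanity-checked by hand. Below is a rough Python sketch, not anything BIND ships: the function names are made up for illustration, and it assumes dig prints the fields after "TSIG" in the order algorithm, time signed, fudge, MAC size, MAC, original ID, error, other length. It decodes the time signed into UTC, confirms the MAC length, and checks whether a given clock offset would still fall inside the fudge window.

```python
import base64
from datetime import datetime, timezone


def parse_tsig_rdata(rdata):
    """Split TSIG rdata text (everything after 'TSIG') into named fields.

    Assumes the field order described above; hypothetical helper.
    """
    alg, signed, fudge, mac_size, mac, orig_id, error, other_len = rdata.split()
    return {
        "algorithm": alg,
        "time_signed": int(signed),    # seconds since the Unix epoch
        "fudge": int(fudge),           # allowed clock skew, in seconds
        "mac_size": int(mac_size),     # MAC length in bytes
        "mac": base64.b64decode(mac),  # raw MAC bytes
        "original_id": int(orig_id),
        "error": error,
        "other_len": int(other_len),
    }


def within_fudge(time_signed, fudge, now):
    """True if 'now' is no more than 'fudge' seconds away from 'time_signed'."""
    return abs(now - time_signed) <= fudge


# TSIG rdata copied from the dig output above.
rdata = ("hmac-md5.sig-alg.reg.int. 1245707011 300 16 "
         "l+rb6H0RuqwXCT6H4G6JgQ== 49169 NOERROR 0")
tsig = parse_tsig_rdata(rdata)

signed_utc = datetime.fromtimestamp(tsig["time_signed"], tz=timezone.utc)
print(signed_utc.isoformat())                  # 2009-06-22T21:43:31+00:00
print(len(tsig["mac"]) == tsig["mac_size"])    # True
# A 5 second skew is well inside the 300 second fudge; 301 seconds is not.
print(within_fudge(tsig["time_signed"], tsig["fudge"], tsig["time_signed"] + 5))    # True
print(within_fudge(tsig["time_signed"], tsig["fudge"], tsig["time_signed"] + 301))  # False
```

The decoded time of 21:43:31 UTC would line up with the WHEN line (14:43:31) if the host runs at UTC-7, and a 5 second clock difference is nowhere near the 300 second fudge.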

Help? I'm open to trying just about any crazy ideas at this point.


_______________________________________________
bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
