I have solved this problem. Here's what I did: I have more than one DNS server, and I could telnet to all of them on port 53 except one: the primary NS. I killed the BIND process on the primary and started BIND again, and the zone transfer happened immediately. My guess is that something went wrong when I made the mistake in the "domain.com" zone file while editing it: I corrected the mistake and sent a HUP to the BIND process, but somehow that left port 53 "filtered". Killing that process and starting a fresh BIND process fixed it, and everything is good now.
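For anyone who hits the same symptom, the telnet check I did by hand can be scripted. This is only a minimal sketch, assuming Python 3 is available; the function names `probe_tcp_port` and `probe_servers` are my own, and "filtered" here simply means the connection attempt timed out with no answer, which is how the primary looked to me.

```python
import socket


def probe_tcp_port(host, port=53, timeout=3.0):
    """Classify a TCP port the way a manual telnet test would.

    Returns 'open' if the connection is accepted, 'closed' if it is
    actively refused, 'filtered' if nothing answers before the timeout,
    and 'unreachable' for any other network error.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "closed"
    except socket.timeout:
        return "filtered"
    except OSError:
        return "unreachable"


def probe_servers(hosts, port=53, timeout=3.0):
    """Probe every host in `hosts` and return a {host: state} dict."""
    return {host: probe_tcp_port(host, port, timeout) for host in hosts}
```

Calling `probe_servers` with the list of your own name servers (master and slaves) shows at a glance which one is not answering on TCP port 53. It only tests TCP, which is what zone transfers use; normal UDP queries can still be answered while TCP is stuck.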
Maybe someone with UNIX internal knowledge can explain why it happened.

On Mon, Oct 13, 2008 at 1:26 PM, Chris Henderson <[EMAIL PROTECTED]> wrote:
> On Mon, Oct 13, 2008 at 9:27 AM, Chris Henderson <[EMAIL PROTECTED]> wrote:
>> On Fri, Oct 10, 2008 at 6:21 PM, Matus UHLAR - fantomas
>> <[EMAIL PROTECTED]> wrote:
>>> log on the slave and query the master. tcpdump the communication on the
>>> master too. Check both TCP and UDP communication.
>>
>> here's what I am getting from sniffing both the slave and master at
>> the same time:
>>
>> from the slave I can see:
>>
>> slave -> master DNS C port=55480
>>
>> slave -> master DNS C port=55480
>> slave -> master DNS C port=55480
>> slave -> master DNS C port=55480
>> slave -> master DNS C port=55480
>> slave -> master DNS C port=55480
>>
>> from the master I can see:
>>
>> slave -> master DNS C domain.com. Internet SOA ?
>> master -> slave DNS R domain.com. Internet SOA
>> slave -> master DNS C port=55571
>> slave -> master DNS C port=55571
>> slave -> master DNS C port=55571
>> slave -> master DNS C port=55571
>> slave -> master DNS C port=55571
>>
>> And in the slave's log I can that "timed-out" error.
>>
>> I don't have any firewall. Besides, I can ping, traceroute, ssh to and
>> from the master and slave without a problem.
>>
>> Thanks for any further help.
>>
>
> Further to my previous mail, I have another zone file from the same
> master server (called "203.10.21") - which is coming fine as zone
> transfer to the same slave. But my "domain.com" zone transfer is
> timing out. The size of the "domain.com" file is much smaller than the
> "203.10.21" zone file. The zone transfer stopped after I added a
> $origin RR to the master server's doamin.com file - which was a wrong
> entry and I reverted the change back and ran named-checkzone on
> domain.com which looks good.
>
> I'm running out of options here. The only thing I can think of is:
> delete "domain.com" file from the master, restore from backup the last
> known good file and see if zone transfer happens.
>
> Anyone has any other ideas? Bind is not really telling me why it's
> timing out while doing the zone transfer for "domain.com" and not for
> "203.10.21".
>
