Re: Dig Hangs during axfr request when not on localhost.

2019-06-14 Thread John Horne
On Fri, 2019-06-14 at 10:05 +, John Horne wrote:
> On Fri, 2019-06-14 at 08:53 +0100, Pete Fry via bind-users wrote:
> > Hi
> >
> > versions:
> > BIND 9.9.4-RedHat-9.9.4-74.el7_6.1 (Extended Support Version)
> > CentOS Linux release 7.6.1810 (Core)
> >
> > We are having a problem where large zone files (around 5MB) on our masters
> > are failing to be loaded on our slaves.
> >
> > after some investigation
> >
> > we can perform the following command whilst local on the master
> >
> > dig @localhost ZONE axfr
> >
> > and the command runs and exits successfully
> >
> > however if you run dig @IP.OF.MASTER ZONE axfr from a machine on the same
> > subnet the zone starts to transfer and then hangs at certain points, around
> > 150k bytes give or take, and fails to complete.
> >
> Hello,
>
> We have had the same problem on CentOS 7 servers after a recent bind yum
> update. For the moment we have downgraded BIND back to
> bind-9.9.4-73.el7_6.x86_64 and the zone transfers are working again.
>
Hi,

Looking a bit further into this, as far as I can tell the only difference
between version '9.9.4-73' and '9.9.4-74' is a fix for CVE-2018-5743 which
relates to TCP clients.

It's a bit confusing as we do set a limit for the TCP clients to 250. However,
the server sending the zone and the client requesting it are both lightly
loaded, and rndc shows that we are nowhere near the TCP limit on either server.

Also confusing, for me at least, is that we do log zone transfers, but usually
at a channel severity of 'info'. I changed this to dynamic, and controlled the
debug level using rndc. If I set the debug level to 9 then the transfer works.
Anything less than 9 and it fails.

It seems that the TCP connection is being lost for some reason, as the log file
(at a low debug level) shows the AXFR starting a few times. Each start
corresponds to when the transfer seems to hang. On the client side it shows the
connection as having timed out.




John.

--
John Horne | Senior Operations Analyst | Technology and Information Services
University of Plymouth | Drake Circus | Plymouth | Devon | PL4 8AA | UK

This email and any files with it are confidential and intended solely for the 
use of the recipient to whom it is addressed. If you are not the intended 
recipient then copying, distribution or other use of the information contained 
is strictly prohibited and you should not rely on it. If you have received this 
email in error please let the sender know immediately and delete it from your 
system(s). Internet emails are not necessarily secure. While we take every 
care, University of Plymouth accepts no responsibility for viruses and it is 
your responsibility to scan emails and their attachments. University of 
Plymouth does not accept responsibility for any changes made after it was sent. 
Nothing in this email or its attachments constitutes an order for goods or 
services unless accompanied by an official order form.
___
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe 
from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users


Re: Dig Hangs during axfr request when not on localhost.

2019-06-14 Thread Pete Fry via bind-users
> 
> We have had the same problem on CentOS 7 servers after a recent bind yum
> update. For the moment we have downgraded BIND back to
> bind-9.9.4-73.el7_6.x86_64 and the zone transfers are working again.
> 

John

Many thanks for this, can't believe we didn't try this first!

thanks again

Pete


Re: Dig Hangs during axfr request when not on localhost.

2019-06-14 Thread John Horne
On Fri, 2019-06-14 at 08:53 +0100, Pete Fry via bind-users wrote:
> Hi
>
> versions:
> BIND 9.9.4-RedHat-9.9.4-74.el7_6.1 (Extended Support Version)
> CentOS Linux release 7.6.1810 (Core)
>
> We are having a problem where large zone files (around 5MB) on our masters
> are failing to be loaded on our slaves.
>
> after some investigation
>
> we can perform the following command whilst local on the master
>
> dig @localhost ZONE axfr
>
> and the command runs and exits successfully
>
> however if you run dig @IP.OF.MASTER ZONE axfr from a machine on the same
> subnet the zone starts to transfer and then hangs at certain points, around
> 150k bytes give or take, and fails to complete.
>
Hello,

We have had the same problem on CentOS 7 servers after a recent bind yum
update. For the moment we have downgraded BIND back to
bind-9.9.4-73.el7_6.x86_64 and the zone transfers are working again.



John.

--
John Horne | Senior Operations Analyst | Technology and Information Services
University of Plymouth | Drake Circus | Plymouth | Devon | PL4 8AA | UK



Re: Dig Hangs during axfr request when not on localhost.

2019-06-14 Thread Pete Fry via bind-users
Regarding
(https://docstore.mik.ua/orelly/networking_2ndEd/dns/ch10_12.htm#dns4-CHP-10-SECT-12.1.6.html)

would the setting in 10.12.2.1, the data segment size limit, be at its default?


Re: Dig Hangs during axfr request when not on localhost.

2019-06-14 Thread Pete Fry via bind-users
Interesting. I don't suppose you know where the default AXFR size can be set so
I can do some testing?


Re: Dig Hangs during axfr request when not on localhost.

2019-06-14 Thread Ray Bellis




On 14/06/2019 09:38, Pete Fry via bind-users wrote:
> Interestingly, we have the same problem on our dev box (running the
> same versions)
>
> I took the decision to install the ISC BIND packages from
> (https://copr.fedorainfracloud.org/coprs/isc/bind/)
>
> running 9.14.2 and repeated the tests and it works, however the config
> will need work to have no errors and as we generally deploy via puppet
> rework will be required.
>
> We generally use the REDHAT approved bind for support reasons.
>
> if it was a network issue just upgrading bind shouldn't affect it should it?


Somewhere about BIND 9.11 the default size of AXFR message was reduced 
from the maximum of 65535 bytes down to 16384 because that allows for 
optimal DNS message compression.


I also suspect a network level issue such as MTU, but it's feasible that 
the above change may be allowing the packets to slip through.
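
As a rough illustration of the two message-size limits quoted above, the sketch below estimates how many DNS messages a transfer would need at each limit. It is only back-of-envelope arithmetic: it treats the "around 5MB" zone file size from the original report as if it were the wire size, which is an assumption (text zone files and wire-format records differ in size), and the function name is made up for this example.

```python
import math

ZONE_BYTES = 5 * 1024 * 1024  # "around 5MB" from the report; assumed equal to wire size
OLD_LIMIT = 65535             # maximum message size quoted above
NEW_LIMIT = 16384             # reduced size quoted above
LENGTH_PREFIX = 2             # DNS over TCP prefixes every message with a 2-byte length


def messages_needed(zone_bytes, max_message):
    """Lower bound on the message count if every message were filled to the limit."""
    return math.ceil(zone_bytes / max_message)


for limit in (OLD_LIMIT, NEW_LIMIT):
    count = messages_needed(ZONE_BYTES, limit)
    framing = count * LENGTH_PREFIX
    print(f"limit {limit}: at least {count} messages, {framing} bytes of length prefixes")
```

The smaller limit means many more, smaller messages for the same zone, which changes how the data is written to the TCP connection even though the total volume is about the same.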


kind regards,

Ray



Re: Dig Hangs during axfr request when not on localhost.

2019-06-14 Thread Pete Fry via bind-users
Interestingly, we have the same problem on our dev box (running the same
versions)

I took the decision to install the ISC BIND packages from (
https://copr.fedorainfracloud.org/coprs/isc/bind/)

running 9.14.2 and repeated the tests and it works, however the config will
need work to have no errors and as we generally deploy via puppet rework
will be required.

We generally use the REDHAT approved bind for support reasons.

if it was a network issue just upgrading bind shouldn't affect it should it?

Pete


On Fri, 14 Jun 2019 at 09:06, Anand Buddhdev wrote:

> On 14/06/2019 09:53, Pete Fry via bind-users wrote:
>
> Hi Pete,
>
> > however if you run dig @IP.OF.MASTER ZONE axfr from a machine on the same
> > subnet
> > the zone starts to transfer and then hangs at certain points, around 150k
> > bytes give or take, and fails to complete.
> >
> > any idea on what i can look into?
> >
> > smaller zones are transferring all OK
>
> I would immediately suspect something on your network. Packet loss,
> mismatched MTU, etc.
>
> If I were you, I would run tcpdump on both master and slave and then
> attempt a zone transfer, and examine that packet trace. See what's going
> on. Are there TCP retransmits? Which side is stalling? The sender or
> receiver? What, if anything, do you see in the log files of your BIND on
> both the master and slave?
>
> Anand
>


Re: Dig Hangs during axfr request when not on localhost.

2019-06-14 Thread Anand Buddhdev
On 14/06/2019 09:53, Pete Fry via bind-users wrote:

Hi Pete,

> however if you run dig @IP.OF.MASTER ZONE axfr from a machine on the same
> subnet
> the zone starts to transfer and then hangs at certain points, around 150k
> bytes give or take, and fails to complete.
> 
> any idea on what i can look into?
> 
> smaller zones are transferring all OK

I would immediately suspect something on your network. Packet loss,
mismatched MTU, etc.

If I were you, I would run tcpdump on both master and slave and then
attempt a zone transfer, and examine that packet trace. See what's going
on. Are there TCP retransmits? Which side is stalling? The sender or
receiver? What, if anything, do you see in the log files of your BIND on
both the master and slave?
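
Alongside a packet trace, a small probe can show how far a transfer gets before it stalls. The following is a minimal sketch using only the Python standard library, not a replacement for dig or tcpdump. It sends an AXFR query over TCP and counts the bytes and complete length-prefixed messages received until the peer closes the connection or a read times out. The function names and the 10 second timeout are assumptions made for this example. The sketch does not detect the end of the transfer, so a successful run can also finish with "timed out" if the server keeps the connection open; the useful comparison is the byte count from the master itself versus from another host.

```python
import socket
import struct


def encode_name(name):
    """Encode a domain name as DNS wire-format labels."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_axfr_query(zone, query_id=0x1234):
    """Build an AXFR query (QTYPE 252, class IN) with the 2-byte TCP length prefix."""
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 0)
    question = encode_name(zone) + struct.pack("!HH", 252, 1)
    message = header + question
    return struct.pack("!H", len(message)) + message


def count_axfr_bytes(sock):
    """Read until close or timeout; return (complete_messages, bytes_received, reason)."""
    buf = b""
    received = 0
    messages = 0
    reason = "closed"
    while True:
        try:
            chunk = sock.recv(65536)
        except socket.timeout:
            reason = "timed out"
            break
        if not chunk:
            break
        received += len(chunk)
        buf += chunk
        # Strip every complete length-prefixed message from the front of the buffer.
        while len(buf) >= 2:
            (length,) = struct.unpack("!H", buf[:2])
            if len(buf) < 2 + length:
                break
            buf = buf[2 + length:]
            messages += 1
    return messages, received, reason


def probe(master, zone, timeout=10.0):
    """Connect to master on TCP port 53, request the zone, and report progress."""
    with socket.create_connection((master, 53), timeout=timeout) as sock:
        sock.sendall(build_axfr_query(zone))
        return count_axfr_bytes(sock)
```

Calling probe("IP.OF.MASTER", "ZONE") with real values substituted for the placeholders, once on the master and once from the other machine, gives two byte counts to compare against what the packet trace shows.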

Anand


Dig Hangs during axfr request when not on localhost.

2019-06-14 Thread Pete Fry via bind-users
Hi

versions:
BIND 9.9.4-RedHat-9.9.4-74.el7_6.1 (Extended Support Version)
CentOS Linux release 7.6.1810 (Core)

We are having a problem where large zone files (around 5MB) on our masters
are failing to be loaded on our slaves.

after some investigation

we can perform the following command whilst local on the master

dig @localhost ZONE axfr

and the command runs and exits successfully

however if you run dig @IP.OF.MASTER ZONE axfr from a machine on the same
subnet the zone starts to transfer and then hangs at certain points, around
150k bytes give or take, and fails to complete.

any idea on what i can look into?

smaller zones are transferring all OK

Thanks for your help