Re: BIND and UDP tuning

2018-09-28 Thread Blake Hudson



Alex wrote on 9/26/2018 11:52 AM:

> This is all now running on a 165/35 cable system.

> Early in this thread or another, I provided a packet trace that showed
> what appears to me to never have received the replies - it just times
> out.

> It looks like there are periods of as many as 500 queries per second,
> although the usual amount is closer to 200 per second.


DOCSIS cable systems use an upstream request/grant system to avoid
collisions (the node acts like a hub on which only one cable modem can
transmit at a time). This leads to low packets-per-second (pps) rates
compared with Ethernet. Measured in pps, even a 10M Ethernet connection
(1k-10k pps) will outperform a 1gig cable connection (a few hundred pps).
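
As a rough back-of-the-envelope check of that argument, the sketch below
compares the query rates reported in this thread (200 to 500 per second)
against an upstream packet budget. The 300 pps ceiling and the figure of
one upstream packet per query are assumptions for illustration only, not
measurements of any particular cable plant.

```python
def upstream_headroom(queries_per_sec, packets_per_query, upstream_pps_limit):
    """Return the spare upstream packet budget; negative means over the limit."""
    return upstream_pps_limit - queries_per_sec * packets_per_query


# Assumed values: "a few hundred pps" is taken as 300, and each query is
# assumed to cost one upstream packet (a cache miss usually costs more).
ASSUMED_PPS_LIMIT = 300
ASSUMED_PACKETS_PER_QUERY = 1.0

for qps in (200, 500):  # usual and peak rates quoted in the thread
    spare = upstream_headroom(qps, ASSUMED_PACKETS_PER_QUERY, ASSUMED_PPS_LIMIT)
    state = "within budget" if spare >= 0 else "over budget"
    print(f"{qps} qps: {state} by {abs(spare):.0f} pps")
```

Under these assumptions the usual rate fits but the peaks do not, which
would match intermittent rather than constant query loss.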


Based on the info you've provided, I suspect that you may be running 
into this limit. As another poster suggested, you might consider moving 
your DNS server to a VPS hosted on an ethernet connection at a location 
more suited for DNS server operation or otherwise try to leverage your 
upstream provider's DNS or an outside DNS server.


--Blake
___
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe 
from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users


Re: BIND and UDP tuning

2018-09-28 Thread Alan Clegg
On 9/28/18 9:26 AM, Alex wrote:

>> Has your provider enabled qos?  I'd bet their dropping packets that
>> exceed qos rate limits would be considered "working as expected".
> 
> I asked and they had no idea what that even meant. The technician that
> was here replacing the modem also had no idea outside of what the
> hardware does.

You may want to consider buying a VPS somewhere other than behind the
modem at your (assumed) residence.

There are lots of 'em, some costing less than $5/month for a decent
little box (I have several scattered around the world) and when you have
a problem, they have a good chance of understanding what you are asking.

AlanC
-- 
Why don't we wander and follow la vie dansante.


Re: BIND and UDP tuning

2018-09-28 Thread Lee
On 9/28/18, Alex  wrote:
> Hi,
>
> On Fri, Sep 28, 2018 at 12:18 AM Lee  wrote:
>>
>> On 9/27/18, Alex  wrote:
>> > Hi,
>> >
>> >> Just a wild thought:
>> >> It works with a lower speed line (at least I read it that way) but has
>> >> problems with higher speeds.
>> >> Could it be that the line is so fast that it "overtakes" the host in
>> >> question?
>> >>
>> >> A faster incoming line will give less time between the packets for
>> >> processing.
>> >
>> > No, I actually upgraded from a 65/20mbit to a 165/35mbit recently,
>> > thinking it was too slow because it was happening at the slower speeds
>> > as well. I've also implemented some basic QoS to throttle outgoing
>> > smtp and prioritize DNS but it made no difference.
>>
>> Has your provider enabled qos?  I'd bet their dropping packets that
>> exceed qos rate limits would be considered "working as expected".
>
> I asked and they had no idea what that even meant.

Escalate?  Which is assuming you have a ticket open..

I had it a bit easier than you; I was in an enterprise environment &
had control of the routers on both sides + it was relatively easy to
demonstrate packet loss.

> The technician that
> was here replacing the modem also had no idea outside of what the
> hardware does.
>
> I've also asked on dslreports about this, and no one answered.
>
> It certainly seems to be more pronounced now than it ever was in the
> past. Sometimes so many queries are failing that it's impossible to
> use the network.

Can you make it happen on demand?  Troubleshooting is so much easier
if you can demonstrate the problem vs. trying to reconstruct what
happened after the fact.

>> Which brings up the question of exactly what does SERVFAIL mean?  Can
>> no response to a query result in SERVFAIL?  Is there a way to tell the
>> difference between no response & getting a response indicating a
>> failure?
>
> Early in this thread or another, I provided a packet trace that showed
> what appears to me to never have received the replies - it just times
> out. Also, the "Server Failure" messages are always on the loopback
> interface. I'd be happy to provide another trace if someone knows how
> to properly read it. I really have no idea what's causing the problem.

It would be nice if there was a way to tell if the problem was packet
drops (ie. no response to a query), getting a bad response from the
server or something else.  At least then you'd know where to direct
your attention..
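
One way to separate those cases might be a small probe that sends a
single query over UDP and reports whether anything came back at all, and
if so with which response code. This is only a sketch using the Python
standard library; the function names, the 3 second timeout and the fixed
query ID are arbitrary choices, not anything BIND provides.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query: RD flag set, one question, class IN."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)


def classify(reply):
    """Map a raw reply (or None for a timeout) to a short verdict."""
    if reply is None:
        return "no response (timeout)"
    if len(reply) < 12:
        return "malformed response"
    flags = struct.unpack("!H", reply[2:4])[0]
    rcode = flags & 0x000F  # low four bits of the flags word
    names = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}
    return "response: " + names.get(rcode, "rcode %d" % rcode)


def probe(server, name, timeout=3.0):
    """Send one query to server:53 and classify what comes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        try:
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            reply = None
    return classify(reply)
```

Something like probe("127.0.0.1", "www.example.com") would test the
local resolver. Note that a recursive server generally answers SERVFAIL
to its own client when its upstream queries time out, so to tell real
packet loss from a bad answer the probe has to be pointed at the remote
server, not at the loopback address.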

> Also, I recently raised the trace level to 99, but I don't see
> anything in the logs beyond level 4. Where do I find what the
> different trace levels are supposed to report?

No idea.  I'm running bind at home and very occasionally see things like
28-Sep-2018 1:04:32.552 query-errors: info: client @01F0C86745C0
127.0.0.1#63459 (www.Amazon.com): query failed (SERVFAIL) for
www.Amazon.com/IN/A at ..\query.c:8580

so I'd be interested in knowing if you get a resolution to the problem.

Lee


Re: BIND and UDP tuning

2018-09-28 Thread Alex
Hi,

On Fri, Sep 28, 2018 at 12:18 AM Lee  wrote:
>
> On 9/27/18, Alex  wrote:
> > Hi,
> >
> >> Just a wild thought:
> >> It works with a lower speed line (at least I read it that way) but has
> >> problems with higher speeds.
> >> Could it be that the line is so fast that it "overtakes" the host in
> >> question?
> >>
> >> A faster incoming line will give less time between the packets for
> >> processing.
> >
> > No, I actually upgraded from a 65/20mbit to a 165/35mbit recently,
> > thinking it was too slow because it was happening at the slower speeds
> > as well. I've also implemented some basic QoS to throttle outgoing
> > smtp and prioritize DNS but it made no difference.
>
> Has your provider enabled qos?  I'd bet their dropping packets that
> exceed qos rate limits would be considered "working as expected".

I asked and they had no idea what that even meant. The technician that
was here replacing the modem also had no idea outside of what the
hardware does.

I've also asked on dslreports about this, and no one answered.

It certainly seems to be more pronounced now than it ever was in the
past. Sometimes so many queries are failing that it's impossible to
use the network.

> Which brings up the question of exactly what does SERVFAIL mean?  Can
> no response to a query result in SERVFAIL?  Is there a way to tell the
> difference between no response & getting a response indicating a
> failure?

Early in this thread or another, I provided a packet trace that showed
what appears to me to never have received the replies - it just times
out. Also, the "Server Failure" messages are always on the loopback
interface. I'd be happy to provide another trace if someone knows how
to properly read it. I really have no idea what's causing the problem.

Also, I recently raised the trace level to 99, but I don't see
anything in the logs beyond level 4. Where do I find what the
different trace levels are supposed to report?

27-Sep-2018 16:57:29.688 query-errors: info: client @0x7fc7b0169ac0
127.0.0.1#31675 (72.212.15.199.backscatter.spameatingmonkey.net):
query failed (SERVFAIL) for
72.212.15.199.backscatter.spameatingmonkey.net/IN/A at
../../../bin/named/query.c:8580
26-Sep-2018 15:16:32.507 query-errors: debug 2: fetch completed at
../../../lib/dns/resolver.c:3927 for
b74c2d3722fbce8841edc1808ea0a31e.ix.dnsbl.manitu.net/A in 30.92:
timed out/success
[domain:manitu.net,referral:0,restart:5,qrysent:17,timeout:16,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
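
The bracketed counters at the end of that "fetch completed" line can be
pulled apart mechanically. The sketch below does so and reports how many
of the sent queries timed out; reading "qrysent" as queries sent and
"timeout" as queries that timed out is my interpretation of the field
names, and the helper name is made up.

```python
import re


def fetch_counters(log_line):
    """Extract the key:value counters from the bracketed tail of a log line."""
    match = re.search(r"\[([^\]]+)\]", log_line)
    if match is None:
        return {}
    counters = {}
    for field in match.group(1).split(","):
        key, _, value = field.partition(":")
        # Numeric fields become ints; "domain" stays a string.
        counters[key] = int(value) if value.isdigit() else value
    return counters


LINE = (
    "[domain:manitu.net,referral:0,restart:5,qrysent:17,timeout:16,"
    "lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]"
)

counters = fetch_counters(LINE)
ratio = counters["timeout"] / counters["qrysent"]
print(f"{counters['timeout']} of {counters['qrysent']} queries "
      f"to {counters['domain']} timed out ({ratio:.0%})")
```

For the line above that is 16 of 17, with neterr and badresp both at
zero, which looks more like replies never arriving than like bad
answers.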

There are also tons of messages involving disabling EDNS:
27-Sep-2018 16:57:29.549 edns-disabled: debug 1: success resolving
'232.123.75.208.dnsbl-3.uceprotect.net/A' (in
'dnsbl-3.uceprotect.net'?) after disabling EDNS

I've also just installed 'netdata', which is an app that reports on
system parameters, and find it frequently reporting messages like:
ipv4 tcp listen overflows = 4 overflows
inbound packets dropped = 22 packets
ipv4 udp receive buffer errors = 184 errors
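
To watch the UDP counter directly rather than through netdata, the
kernel's own numbers in /proc/net/snmp can be parsed. This is a sketch:
I am assuming the netdata "udp receive buffer errors" figure is derived
from the RcvbufErrors column, and the sample text below is made up to
show the file layout (only the 184 is taken from the message above).

```python
def udp_counters(snmp_text):
    """Parse the two 'Udp:' lines of /proc/net/snmp into a dict."""
    rows = [
        line.split()
        for line in snmp_text.splitlines()
        if line.startswith("Udp:")  # does not match the "UdpLite:" lines
    ]
    if len(rows) < 2:
        return {}
    names, values = rows[0][1:], rows[1][1:]
    return {name: int(value) for name, value in zip(names, values)}


# Illustrative sample in the header-line / value-line layout of the file.
SAMPLE = """\
Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
Udp: 1000 2 184 900 184 0
"""

print(udp_counters(SAMPLE)["RcvbufErrors"])

# On a live Linux system:
#   with open("/proc/net/snmp") as handle:
#       print(udp_counters(handle.read()))
```

Sampling this before and after a burst of failures would show whether
the receive buffer errors grow at the same moments the queries fail.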

I've also now made the following buffer adjustments based on this and
other perf tuning docs:
https://access.redhat.com/sites/default/files/attachments/20150325_network_performance_tuning.pdf
net.core.rmem_default = 8388608
net.core.rmem_max = 33554432
net.core.wmem_default = 52428800
net.core.wmem_max = 134217728
net.ipv4.udp_early_demux = 0
net.ipv4.udp_mem=764304 1019072 1528608
net.ipv4.tcp_rmem=16384 349520 16777216
net.core.rmem_max=16777216
net.ipv4.udp_rmem_min = 18192
net.ipv4.udp_wmem_min = 8192
net.core.netdev_budget = 1
net.core.netdev_max_backlog = 2000
net.core.netdev_max_backlog=10
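
A list like this is easy to get wrong when it is assembled from several
documents, since a key that appears twice silently ends up with only one
value. The sketch below reports such keys; it assumes the lines are
applied in the order shown, so that the later assignment wins. The
function name is made up for illustration.

```python
def effective_sysctls(text):
    """Return (effective, overridden): the last value per key, and the
    list of values for every key assigned more than one distinct value."""
    effective, overridden = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), " ".join(value.split())
        if key in effective and effective[key] != value:
            overridden.setdefault(key, [effective[key]]).append(value)
        effective[key] = value
    return effective, overridden


SETTINGS = """\
net.core.rmem_default = 8388608
net.core.rmem_max = 33554432
net.core.wmem_default = 52428800
net.core.wmem_max = 134217728
net.ipv4.udp_early_demux = 0
net.ipv4.udp_mem=764304 1019072 1528608
net.ipv4.tcp_rmem=16384 349520 16777216
net.core.rmem_max=16777216
net.ipv4.udp_rmem_min = 18192
net.ipv4.udp_wmem_min = 8192
net.core.netdev_budget = 1
net.core.netdev_max_backlog = 2000
net.core.netdev_max_backlog=10
"""

effective, overridden = effective_sysctls(SETTINGS)
for key, values in overridden.items():
    print(f"{key}: set to {values}, effective value {effective[key]}")
```

On the list above this flags net.core.rmem_max and
net.core.netdev_max_backlog. The resulting netdev_max_backlog of 10 and
the netdev_budget of 1 are also far below the usual kernel defaults
(1000 and 300), which may be worth double-checking.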

Thanks,
Alex


Re: BIND and UDP tuning

2018-09-28 Thread Alex
Hi,

> Hi Alex,
>
> Have you tried on a separate physical server? To rule out the actual hardware 
> as being the problem?
>
> Is this some  user grade PC with either onboard or external ethernet 
> interface, or a proper server grade equipment? Age of equipment? What else 
> does that machine do?

This is a Xeon 8-core E31240 3.30GHz with 16GB. It's a few years old.
I've also recently tried with an i7 8700 with 32GB running the same
version of fedora28 with the same bind and had the same problem. I've
also mentioned previously that I've tried unbound and had the same
postfix "Name service error" error.

I believe this error is not a recent thing - it goes back in the logs
for as long as I can see, meaning into previous versions of postfix
and fedora and bind. I've only now started to notice it and the impact
that I'd imagine it's having on our ability to effectively use RBLs
and process mail.

This server does only mail/spam filtering with
postfix/amavis/spamassassin using bind. It's configured as a recursive
caching server and not otherwise authoritative for any of our domains.

I've recently tried to configure it with "edns no;" and/or
"edns-udp-size 512;" and it's had no effect.

Thanks so much for your help.
Alex


Re: Root zone DNSSEC KSK rollover event - 2018/10/11, 16:00 UTC

2018-09-28 Thread Ray Bellis
On 28/09/2018 10:55, Anand Buddhdev wrote:

> On 11 October, the old key won't be removed. On that day, the new key
> will start signing the DNSKEY RRset. The old key (id 19036), will remain
> in the root zone; it just won't sign the DNSKEY RRset. Eventually, in
> the first quarter of 2019, it will be revoked, and then removed *after*
> the hold-down period.

My apologies to the list, Anand is correct!  I had misremembered which
phase of the roll we were in, getting confused by just how long KSK2017
has already been in the root zone for.

The guidance in our KB articles still stands, though. :)

kind regards,

Ray


Re: Root zone DNSSEC KSK rollover event - 2018/10/11, 16:00 UTC

2018-09-28 Thread Anand Buddhdev
On 28/09/2018 11:37, Ray Bellis wrote:

Hi Ray,

> At this time the old key will be removed from the root zone leaving only
> the new key (id 20326) in the zone.  If your DNS servers don't know and
> trust the new key at that point then DNSSEC validation errors will occur.

On 11 October, the old key won't be removed. On that day, the new key
will start signing the DNSKEY RRset. The old key (id 19036), will remain
in the root zone; it just won't sign the DNSKEY RRset. Eventually, in
the first quarter of 2019, it will be revoked, and then removed *after*
the hold-down period.
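
For anyone wondering where the "id" numbers come from: they are DNSKEY
key tags, a 16-bit checksum over the key's RDATA as described in RFC
4034, Appendix B. A minimal sketch of that calculation follows (it does
not cover the special case for algorithm 1). The key bytes in the
example are placeholders, not a real root key, so the result is not
19036 or 20326.

```python
import struct


def key_tag(flags, protocol, algorithm, public_key):
    """Compute the DNSKEY key tag per RFC 4034 Appendix B (not algorithm 1)."""
    rdata = struct.pack("!HBB", flags, protocol, algorithm) + public_key
    acc = 0
    for index, byte in enumerate(rdata):
        # Even-offset bytes form the high half of each 16-bit word.
        acc += byte << 8 if index % 2 == 0 else byte
    acc += (acc >> 16) & 0xFFFF
    return acc & 0xFFFF


# Placeholder key material for illustration only.
# Flags 257 marks a KSK, protocol is always 3, algorithm 8 is RSA/SHA-256.
print(key_tag(257, 3, 8, bytes([1, 2, 3, 4])))
```

Feeding in the decoded public key from a trust anchor should give the
same tag that the server reports for that key.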

Regards,
Anand


Root zone DNSSEC KSK rollover event - 2018/10/11, 16:00 UTC

2018-09-28 Thread Ray Bellis
This is a reminder for users of BIND that the most critical phase of the
rollover of the root zone's DNSSEC KSK is scheduled to happen at 16:00
UTC on Thursday 11th October.

At this time the old key will be removed from the root zone leaving only
the new key (id 20326) in the zone.  If your DNS servers don't know and
trust the new key at that point then DNSSEC validation errors will occur.

ISC has written two KB articles with information on how to check that
your BIND recursive DNS server is ready for the key roll.

The first is a short Operational Notification document which is ideal
for experienced BIND administrators with good familiarity with DNSSEC:

  

The second is a much more detailed document with more DNSSEC background
material and an overview of the entire key roll process:

  

Ray Bellis
ISC Research Fellow


Re: bind-9-packages: RPS and both '--enable-static' and '--disable-static'?

2018-09-28 Thread Michał Kępień
Hi James,

> Thank you for the https://www.isc.org/blogs/bind-9-packages/
> blog post and various binary distributions mentioned in it.
> 
> I am an end user, not a programmer, and I rely on Linux
> distributions and application packages and so having up-to-date
> content from authoritative sources is both helpful and very
> reassuring.
> 
> As a result of this, I now have the "stable" currently-9.12.2
> version from https://launchpad.net/~isc/+archive/ubuntu/bind
> installed on Ubuntu 18.04 here on my home desktop in order to
> hack away at something.

Thanks for giving our packages a shot!

> So I was looking forward to RPS having the effect of adding TCP
> to the mix and doing a much more respectable job of extracting
> the queries.
> 
> Which does lead to the question about some RPS documentation
> but that's sorta moot at this point.

I am not sure if you are aware of it but writing a library implementing
the DNSRPS API is not something entirely straightforward.  See the
"librpz_0_t" type in lib/dns/include/dns/librpz.h for a list of methods
comprising the library interface.  Also note that the only working
implementation whose existence I am aware of is a proprietary one.

Given the above, a strong enough motivation for using --enable-dnsrps in
the packages we build and support is lacking.  Note that Debian and its
derivatives provide tools which make rebuilding a source package with
different compile-time options fairly convenient.

> Also, when running "named -V", I see both '--enable-static' and
> '--disable-static' in the output.  I have no idea if this is
> sensible or not but it sure looks a little funny:

Thank you for catching this, we will fix it.

-- 
Best regards,
Michał Kępień