Hi Robert,

I have the same problem:

ospfd[94311]: desync; scheduling fib reload
ospfd[94311]: reloading interface list and routing table

Do you have a fix for your problem?

I have raised my mbuf cluster limit (kern.maxclusters) to 32768, with no success.

I have bgpd and ospfd on the same machine, and ospfd loads the complete bgpd
routing table (the whole internet) from the FIB. This could be my problem.

Thanks

Thomas

On 2017-10-30 11:15, Robert Blacquiere wrote:
Hi Theo,

On Sun, Oct 29, 2017 at 11:45:54AM -0600, Theo de Raadt wrote:

> Yes, on the route socket.  It is unreasonable for the kernel to
> maintain an infinite number of route change messages, so about 9 years
> ago we developed this scheme of marking the situation for userland to
> handle.  Such a mechanism didn't exist before, because no one had run
> into the concern before -- people weren't turning *BSD systems into
> full-table/high-churn routing systems before our daemons came along.

Thanks for explaining.
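For anyone finding this thread later: the marker Theo describes shows up in
userland as the RTM_DESYNC message on the route socket (see route(4)). Below
is a minimal sketch of how a daemon notices it; this is my own illustration,
not ospfd's actual code, and it assumes only the standard route socket API:

/*
 * Sketch: watch the route socket for the kernel's overflow marker.
 * The kernel does not queue unbounded route change messages; once it
 * has had to drop some, it delivers RTM_DESYNC, and the daemon must
 * re-read the whole FIB instead of trusting its incremental state.
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <net/route.h>

#include <err.h>
#include <unistd.h>

int
main(void)
{
    char buf[2048];
    struct rt_msghdr *rtm;
    ssize_t n;
    int s;

    if ((s = socket(AF_ROUTE, SOCK_RAW, AF_UNSPEC)) == -1)
        err(1, "socket");

    for (;;) {
        if ((n = read(s, buf, sizeof(buf))) == -1)
            err(1, "read");
        rtm = (struct rt_msghdr *)buf;
        if ((size_t)n < sizeof(*rtm) || rtm->rtm_version != RTM_VERSION)
            continue;
        if (rtm->rtm_type == RTM_DESYNC) {
            /* Messages were lost; schedule a full reload.  This is
             * the "desync; scheduling fib reload" in the logs above. */
            warnx("desync; would schedule fib reload here");
            continue;
        }
        /* normal incremental handling of RTM_ADD, RTM_DELETE, ... */
    }
}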

> > We have changed default sysctl settings for:
> > kern.maxclusters=24576
> > net.inet.ip.ifq.maxlen=4096
> > net.inet6.ip6.ifq.maxlen=1024
> >
> > as from netstat -m we ran out of 2048-byte mbuf clusters at the defaults.

> Come on, think for a second.  See "ip" and "ip6"?  That doesn't grow
> the queue on the routing socket.  If anything it probably makes
> your situation worse.

The ip and ip6 ifq settings were the first things I changed, to help with
drops on the interfaces. That worked; we now have no dropped traffic. And
yes, I know it does not help with the ospfd issue.


> As for growing the size of the route socket buffer -- it is unclear
> whether that won't make the situation worse.  When a desync is
> detected in userland, you will already have read and consumed the full
> queue -- which now has a gap in it, and requires a fresh restart.  So
> you are promising to do MORE wasteful work before recovering.
>
> Anyways, there are two circumstances where it happens: route buffer limits,
> or temporary mbuf shortage.  I think you've hit the latter.
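
The buffer being discussed is the receive buffer of the daemon's own route
socket, which userland could in principle grow with SO_RCVBUF. A sketch of
that follows, with the caveat above in mind: a bigger buffer only postpones
the overflow, and means more already-read messages are wasted once a gap
appears. The function name and the 256k figure are made up for illustration:

#include <sys/types.h>
#include <sys/socket.h>

#include <err.h>

/*
 * Open a route socket with an enlarged receive buffer.  This does not
 * prevent desyncs under sustained churn; it only delays them.
 */
int
open_route_socket(int rcvbuf)
{
    int s;

    if ((s = socket(AF_ROUTE, SOCK_RAW, AF_UNSPEC)) == -1)
        err(1, "socket");
    /* the kernel may clamp this to its socket buffer limits */
    if (setsockopt(s, SOL_SOCKET, SO_RCVBUF, &rcvbuf,
        sizeof(rcvbuf)) == -1)
        err(1, "setsockopt");
    return s;
}

/* usage: int s = open_route_socket(256 * 1024); */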

How can I fix this temporary mbuf shortage? I have been searching for how to
detect it. From the netstat -m output:

$ netstat -m
956 mbufs in use:
        933 mbufs allocated to data
        14 mbufs allocated to packet headers
        9 mbufs allocated to socket names and addresses
930/13264/24576 mbuf 2048 byte clusters in use (current/peak/max)
0/8/24576 mbuf 4096 byte clusters in use (current/peak/max)
0/8/24576 mbuf 8192 byte clusters in use (current/peak/max)
0/14/24584 mbuf 9216 byte clusters in use (current/peak/max)
0/10/24580 mbuf 12288 byte clusters in use (current/peak/max)
0/8/24576 mbuf 16384 byte clusters in use (current/peak/max)
0/8/24576 mbuf 65536 byte clusters in use (current/peak/max)
3768 Kbytes allocated to network (55% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines


We hit the maximum number of 2048-byte mbuf clusters, so I bumped kern.maxclusters.
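
One way to watch whether the shortage recurs (this is just how I read the
netstat -m fields, so treat it as a hint): the "denied" and "delayed"
counters tick up when an allocation actually fails or stalls, so checking
them together with the current limit shows whether you are still hitting
the ceiling:

$ sysctl kern.maxclusters
kern.maxclusters=24576
$ netstat -m | grep -E 'denied|delayed'
0 requests for memory denied
0 requests for memory delayed

systat(1) also has an "mbufs" view showing the same pool counters live,
which makes a short-lived peak easier to catch than a one-shot netstat -m.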

Does anybody know how to attack this issue? I have been searching for how to
debug this potential mbuf shortage correctly, but apparently went the wrong
way trying to fix it.

Regards

Robert

