Re: FreeBSD10.3-RELEASE. Kernel panic.

2016-10-12 Thread Donald Baud via freebsd-net

On 10/12/16 3:24 PM, Zaphod Beeblebrox wrote:

While my mpd5 servers are possibly less busy (I haven't had common 
crashes), I have noticed a "group" of problems.


1. The carrier dropping communication (ie: fiber cut or l2 switch 
breakage) of the L2TP streams can leave mpd5 in a state where it will 
not die and will not destroy interfaces (requires reboot to clear).
I've encountered that once on 10.3, and I had tweaked some sysctl values 
while monitoring:

> vmstat -z | head -1; vmstat -z | grep -i netgraph

You might want to search for other people's experience with the following 
values:

# net.graph.maxdgram   # this is set in /etc/sysctl.conf
# net.graph.recvspace  # this is set in /etc/sysctl.conf
# net.graph.maxdata    # this is set in /boot/loader.conf
# net.graph.maxalloc   # this is set in /boot/loader.conf

I'll leave it to others to comment on which values work best in their 
experience on FreeBSD10.3.
In my case, as I had explained, one of the recipes that worked for me is 
to comment those settings out and leave the kernel values at their defaults.
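
A minimal sketch of how one might check whether these four values are still at the defaults quoted in this thread (20480 for maxdgram/recvspace, 4096 for maxdata/maxalloc). The helper name check_netgraph_defaults is made up for illustration, and the sample input is invented; the helper only reads "name: value" lines, which is the format sysctl prints on FreeBSD:

```shell
#!/bin/sh
# Compare net.graph settings against the defaults quoted in this thread.
# check_netgraph_defaults is a hypothetical helper, not part of FreeBSD.
check_netgraph_defaults() {
    awk -F': *' '
        BEGIN {
            def["net.graph.maxdgram"]  = 20480
            def["net.graph.recvspace"] = 20480
            def["net.graph.maxdata"]   = 4096
            def["net.graph.maxalloc"]  = 4096
        }
        ($1 in def) {
            if ($2 + 0 == def[$1])
                printf "%s: %s (default)\n", $1, $2
            else
                printf "%s: %s (tuned, thread default is %d)\n", $1, $2, def[$1]
        }'
}

# Invented sample input in sysctl output format.
printf '%s\n' \
    'net.graph.maxdgram: 1024000' \
    'net.graph.recvspace: 20480' \
    'net.graph.maxdata: 16384' |
    check_netgraph_defaults
```

On a live box the input would presumably come from
"sysctl net.graph.maxdgram net.graph.recvspace net.graph.maxdata net.graph.maxalloc"
piped into the helper.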


I've read on the mpd5 mailing list that FreeBSD-11 has had 
upgrades to the netgraph modules.
I am now using FreeBSD-11 and it looks like I don't need any of the 
kernel tweaks that I've described.


Also, may I suggest you troubleshoot the fiber-cut or L2 switch breakage 
by playing with some ipfw rules to simulate a fiber cut:

ex: ipfw add 100 deny ip from 10.10.10.10 to me

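
The ipfw idea above can be sketched as a small script that adds the deny rule, holds it for a while, and removes it again. The peer address and rule number are the ones from the example line; HOLD, DRY_RUN and the function names are assumptions made for illustration. By default it only prints the commands, so nothing is blocked until DRY_RUN=0 is set:

```shell
#!/bin/sh
# Sketch: block traffic from one peer for a while, then restore it.
PEER="${PEER:-10.10.10.10}"   # peer address from the example above
RULE="${RULE:-100}"           # rule number from the example above
HOLD="${HOLD:-120}"           # seconds to keep the rule (assumed value)
DRY_RUN="${DRY_RUN:-1}"       # 1 = print commands instead of running them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

simulate_cut() {
    run ipfw add "$RULE" deny ip from "$PEER" to me
    run sleep "$HOLD"
    run ipfw delete "$RULE"
}

simulate_cut
```

Watching mpd5 and "ifconfig" while the rule is in place should show whether the tunnels and ng interfaces get torn down cleanly.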
2. There are race conditions between quagga and mpd5 for 
adding/dropping routes.
While troubleshooting the mpd5 crashes, I removed net/quagga 
and installed net/bird instead.
I am now using net/bird, and I've written a little howto to get you started 
with it.

see: https://forums.freebsd.org/threads/56988/

3. if A is a pppoe client and B is the mpd5 server, A cannot access 
TCP services on B.  It can access tcp services _beyond_ B, but not on 
B. (there is a ticket open for this).


On Wed, Oct 12, 2016 at 10:51 AM, Donald Baud via freebsd-net 
<freebsd-net@freebsd.org> wrote:



On 10/12/16 1:13 AM, Julian Elischer wrote:

On 11/10/2016 8:56 PM, Donald Baud via freebsd-net wrote:

I've been plagued with these =daily= panics until I tried
the following recipes and the server has been up for 30
days so far:

Normally I should experiment more to see which one of the
recipes is really the fix, but I'm just glad that the
server is stable for now.


this is really great information.
It makes debugging a lot more possible.
I know it is a hard question, but do you have a way to
simulate this workload?

I have no real way to simulate this kind of workload


Sadly, I don't have a way to simulate the workload but I am very
interested to help fix these crashes since as Cassiano said, this
makes mpd5/freebsd useless for pppoe/l2tp termination.

At this point, I would suggest that Cassiano and Андрей confirm
that they don't get panics when they apply the recipes that I am
using.

I am still running many other cisco-vpdn gateways that I would
convert into mpd5/freebsd but my plan was stalled with the daily
crashes.
I'll wait a couple of weeks to be sure that my recipes are a valid
workaround before converting my remaining cisco gateways to mpd5.

-Dbaud



recipe-1: Don't let mpd5 start automatically when server
boots:
i.e. in: /etc/rc.conf
mpd5_enable="NO"
and wait about 5 minutes after server boots then issue:
/usr/local/etc/rc.d/mpd5 onestart


recipe-2: recompile the kernel with the NETGRAPH_DEBUG option:
options NETGRAPH
options NETGRAPH_DEBUG
options NETGRAPH_KSOCKET
options NETGRAPH_L2TP
options NETGRAPH_SOCKET
options NETGRAPH_TEE
options NETGRAPH_VJC
options NETGRAPH_PPP
options NETGRAPH_IFACE
options NETGRAPH_MPPC_COMPRESSION
options NETGRAPH_MPPC_ENCRYPTION
options NETGRAPH_TCPMSS
options IPFIREWALL

recipe-3: recompile the kernel and disable the IPv6 and
SCTP options:
nooptions   INET6
nooptions   SCTP

recipe-4: Don't use any of the sysctl optimizations
in other words I commented out all values in sysctl.conf:
# net.graph.maxdgram=20480  (this is the default)
# net.graph.recvspace=20480  (this is the default)

recipe-5: Don't use any of the loader.conf optimizations
in other words I commented out all values in load

Re: FreeBSD10.3-RELEASE. Kernel panic.

2016-10-12 Thread Donald Baud via freebsd-net


On 10/12/16 1:13 AM, Julian Elischer wrote:

On 11/10/2016 8:56 PM, Donald Baud via freebsd-net wrote:
I've been plagued with these =daily= panics until I tried the 
following recipes and the server has been up for 30 days so far:


Normally I should experiment more to see which one of the recipes is 
really the fix, but I'm just glad that the server is stable for now.


this is really great information.
It makes debugging a lot more possible.
I know it is a hard question, but do you have a way to simulate this 
workload?


I have no real way to simulate this kind of workload


Sadly, I don't have a way to simulate the workload but I am very 
interested to help fix these crashes since as Cassiano said, this makes 
mpd5/freebsd useless for pppoe/l2tp termination.


At this point, I would suggest that Cassiano and Андрей confirm that 
they don't get panics when they apply the recipes that I am using.


I am still running many other cisco-vpdn gateways that I would convert 
into mpd5/freebsd but my plan was stalled with the daily crashes.
I'll wait a couple of weeks to be sure that my recipes are a valid 
workaround before converting my remaining cisco gateways to mpd5.


-Dbaud



recipe-1: Don't let mpd5 start automatically when server boots:
i.e. in: /etc/rc.conf
mpd5_enable="NO"
and wait about 5 minutes after server boots then issue:
/usr/local/etc/rc.d/mpd5 onestart


recipe-2: recompile the kernel with the NETGRAPH_DEBUG option:
options NETGRAPH
options NETGRAPH_DEBUG
options NETGRAPH_KSOCKET
options NETGRAPH_L2TP
options NETGRAPH_SOCKET
options NETGRAPH_TEE
options NETGRAPH_VJC
options NETGRAPH_PPP
options NETGRAPH_IFACE
options NETGRAPH_MPPC_COMPRESSION
options NETGRAPH_MPPC_ENCRYPTION
options NETGRAPH_TCPMSS
options IPFIREWALL

recipe-3: recompile the kernel and disable the IPv6 and SCTP options:
nooptions   INET6
nooptions   SCTP

recipe-4: Don't use any of the sysctl optimizations
in other words I commented out all values in sysctl.conf:
# net.graph.maxdgram=20480  (this is the default)
# net.graph.recvspace=20480  (this is the default)

recipe-5: Don't use any of the loader.conf optimizations
in other words I commented out all values in loader.conf
# net.graph.maxdata=4096  (this is the default)
# net.graph.maxalloc=4096 (this is the default)


In my case, I had the panics with 10.3 and 11-PRERELEASE
11.0-PRERELEASE FreeBSD 11.0-PRERELEASE #2 r305587

With those recipes, I have been running without any crash for a month 
and counting.  That's 300 l2tp tunnels and 1400 l2tp sessions 
generating 700Mbit/s.



-DBaud


On Tuesday, October 11, 2016 7:30 AM, Cassiano Peixoto 
 wrote:

Hi,

There are many users complaining about this:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=186114

I've been dealing with this issue for one year with no solution. mpd5 as
pppoe server on FreeBSD is useless with this bug.

I really would like to see it working again, I think it's quite important
to both project and many users.

Thanks.

On Tue, Oct 11, 2016 at 3:24 AM, Eugene Grosbein  
wrote:



11.10.2016 11:02, Андрей Леушкин wrote:

Hello. I have a problem with "FreeBSD nas 10.3-RELEASE FreeBSD 10.3-RELEASE
#0: Fri Oct  7 21:12:56 YEKT 2016 nas@nas:/usr/obj/usr/src/sys/nasv3
   amd64"

Kernel panic is repeated at intervals of 2-3 days. At first I thought that
the problem is in the hardware, but the problem did not go away after
replacing the server platform.

Coredumps and more info on link
https://drive.google.com/open?id=0BxciMy2q7ZjTTkIxem9wTE1tM2M

Sorry for my english.
I'll wait for an answer.


This is a known and long-standing problem in the FreeBSD network stack.
It shows up when you have lots of network interfaces created/removed
frequently, as in your case of a Network Access Server (PPTP, PPPoE etc).

Generally, people run into this problem using the mpd5 network daemon.
mpd5 uses the NETGRAPH kernel subsystem to process traffic, and
if an interface disappears (e.g., a user disconnects)
while the kernel is still processing traffic obtained from this interface, it
panics.

There were lots of reports of this problem. No one seems to be working on
it at the moment.
You should file a PR using Bugzilla and attach your logs to it.

Eugene Grosbein



___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

Re: FreeBSD10.3-RELEASE. Kernel panic.

2016-10-11 Thread Donald Baud via freebsd-net
I've been plagued with these =daily= panics until I tried the following recipes 
and the server has been up for 30 days so far: 

Normally I should experiment more to see which one of the recipes is really 
the fix, but I'm just glad that the server is stable for now.


recipe-1: Don't let mpd5 start automatically when server boots:
i.e. in: /etc/rc.conf 
mpd5_enable="NO"
and wait about 5 minutes after server boots then issue: 
/usr/local/etc/rc.d/mpd5 onestart


recipe-2: recompile the kernel with the NETGRAPH_DEBUG option:
options NETGRAPH 
options NETGRAPH_DEBUG 
options NETGRAPH_KSOCKET 
options NETGRAPH_L2TP
options NETGRAPH_SOCKET
options NETGRAPH_TEE
options NETGRAPH_VJC
options NETGRAPH_PPP
options NETGRAPH_IFACE
options NETGRAPH_MPPC_COMPRESSION
options NETGRAPH_MPPC_ENCRYPTION
options NETGRAPH_TCPMSS
options IPFIREWALL

recipe-3: recompile the kernel and disable the IPv6 and SCTP options:
nooptions   INET6
nooptions   SCTP

recipe-4: Don't use any of the sysctl optimizations 
in other words I commented out all values in sysctl.conf:
# net.graph.maxdgram=20480  (this is the default)
# net.graph.recvspace=20480  (this is the default)

recipe-5: Don't use any of the loader.conf optimizations
in other words I commented out all values in loader.conf
# net.graph.maxdata=4096  (this is the default)
# net.graph.maxalloc=4096 (this is the default)


In my case, I had the panics with 10.3 and 11-PRERELEASE
11.0-PRERELEASE FreeBSD 11.0-PRERELEASE #2 r305587

With those recipes, I have been running without any crash for a month and 
counting.  That's 300 l2tp tunnels and 1400 l2tp sessions generating 700Mbit/s.
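
Recipe-1 (keep mpd5_enable="NO" and start mpd5 by hand about five minutes after boot) could be automated roughly as below. This is only a sketch: delayed_start is a made-up helper, and where to call it from (for example a boot-time cron entry or a local rc script) is left as an assumption. The 300 seconds and the rc.d path are the ones from the recipe:

```shell
#!/bin/sh
# Sketch for recipe-1: start mpd5 some time after boot instead of via rc.conf.
DELAY="${DELAY:-300}"                            # about 5 minutes, as in recipe-1
MPD5_RC="${MPD5_RC:-/usr/local/etc/rc.d/mpd5}"   # rc.d script path from recipe-1

# delayed_start SECONDS SCRIPT: hypothetical helper that waits in the
# background and then runs "SCRIPT onestart".
delayed_start() {
    ( sleep "$1"; "$2" onestart ) &
}

# Example call, left commented out so that sourcing this file does nothing:
# delayed_start "$DELAY" "$MPD5_RC"
```

"onestart" is used because the service stays disabled in /etc/rc.conf, exactly as in the recipe.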


-DBaud 


On Tuesday, October 11, 2016 7:30 AM, Cassiano Peixoto 
 wrote:
Hi,

There are many users complaining about this:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=186114

I've been dealing with this issue for one year with no solution. mpd5 as
pppoe server on FreeBSD is useless with this bug.

I really would like to see it working again, I think it's quite important
to both project and many users.

Thanks.

On Tue, Oct 11, 2016 at 3:24 AM, Eugene Grosbein  wrote:

> 11.10.2016 11:02, Андрей Леушкин wrote:
>
>> Hello. I have a problem with "FreeBSD nas 10.3-RELEASE FreeBSD 10.3-RELEASE
>> #0: Fri Oct  7 21:12:56 YEKT 2016 nas@nas:/usr/obj/usr/src/sys/nasv3
>>   amd64"
>>
>> Kernel panic is repeated at intervals of 2-3 days. At first I thought that
>> the problem is in the hardware, but the problem did not go away after
>> replacing the server platform.
>>
>> Coredumps and more info on link
>> https://drive.google.com/open?id=0BxciMy2q7ZjTTkIxem9wTE1tM2M
>>
>> Sorry for my english.
>> I'll wait for an answer.
>>
>
> This is a known and long-standing problem in the FreeBSD network stack.
> It shows up when you have lots of network interfaces created/removed
> frequently, as in your case of a Network Access Server (PPTP, PPPoE etc).
>
> Generally, people run into this problem using the mpd5 network daemon.
> mpd5 uses the NETGRAPH kernel subsystem to process traffic, and
> if an interface disappears (e.g., a user disconnects)
> while the kernel is still processing traffic obtained from this interface, it
> panics.
>
> There were lots of reports of this problem. No one seems to be working on
> it at the moment.
> You should file a PR using Bugzilla and attach your logs to it.
>
> Eugene Grosbein
>
>

Re: FreeBSD10.3-RELEASE. Kernel panic.

2016-10-11 Thread Donald Baud via freebsd-net
I've been plagued with these =daily= panics until I tried the following recipes 
and the server has been up for 30 days so far: 

Normally I should experiment more to see which one of the recipes is really the 
fix, but I'm just glad that the server is stable for now.

recipe-1: Don't let mpd5 start automatically when server boots:
i.e. in: /etc/rc.conf
mpd5_enable="NO"
and wait about 5 minutes after server boots then issue:
/usr/local/etc/rc.d/mpd5 onestart

recipe-2: recompile the kernel with the NETGRAPH_DEBUG option:
options NETGRAPH
options NETGRAPH_DEBUG
options NETGRAPH_KSOCKET
options NETGRAPH_L2TP
options NETGRAPH_SOCKET
options NETGRAPH_TEE
options NETGRAPH_VJC
options NETGRAPH_PPP
options NETGRAPH_IFACE
options NETGRAPH_MPPC_COMPRESSION
options NETGRAPH_MPPC_ENCRYPTION
options NETGRAPH_TCPMSS
options IPFIREWALL

recipe-3: recompile the kernel and disable the IPv6 and SCTP options:
nooptions   INET6
nooptions   SCTP

recipe-4: Don't use any of the sysctl optimizations
in other words I commented out all values in sysctl.conf:
# net.graph.maxdgram=20480  (this is the default)
# net.graph.recvspace=20480  (this is the default)

recipe-5: Don't use any of the loader.conf optimizations
in other words I commented out all values in loader.conf
# net.graph.maxdata=4096  (this is the default)
# net.graph.maxalloc=4096 (this is the default)

In my case, I had the panics with 10.3 and 11-PRERELEASE
11.0-PRERELEASE FreeBSD 11.0-PRERELEASE #2 r305587

With those recipes, I have been running without any crash for a month and 
counting.  That's 300 l2tp tunnels and 1400 l2tp sessions generating 700Mbit/s.

_
From: Cassiano Peixoto 
Sent: Tuesday, October 11, 2016 07:30
Subject: Re: FreeBSD10.3-RELEASE. Kernel panic.
To: Eugene Grosbein 
Cc:  , Андрей Леушкин 


Hi,

There are many users complaining about this:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=186114

I've been dealing with this issue for one year with no solution. mpd5 as
pppoe server on FreeBSD is useless with this bug.

I really would like to see it working again, I think it's quite important
to both project and many users.

Thanks.

On Tue, Oct 11, 2016 at 3:24 AM, Eugene Grosbein  wrote:

> 11.10.2016 11:02, Андрей Леушкин wrote:
>
>> Hello. I have a problem with "FreeBSD nas 10.3-RELEASE FreeBSD 10.3-RELEASE
>> #0: Fri Oct  7 21:12:56 YEKT 2016 nas@nas:/usr/obj/usr/src/sys/nasv3
>>   amd64"
>>
>> Kernel panic is repeated at intervals of 2-3 days. At first I thought that
>> the problem is in the hardware, but the problem did not go away after
>> replacing the server platform.
>>
>> Coredumps and more info on link
>> https://drive.google.com/open?id=0BxciMy2q7ZjTTkIxem9wTE1tM2M
>>
>> Sorry for my english.
>> I'll wait for an answer.
>>
>
> This is a known and long-standing problem in the FreeBSD network stack.
> It shows up when you have lots of network interfaces created/removed
> frequently, as in your case of a Network Access Server (PPTP, PPPoE etc).
>
> Generally, people run into this problem using the mpd5 network daemon.
> mpd5 uses the NETGRAPH kernel subsystem to process traffic, and
> if an interface disappears (e.g., a user disconnects)
> while the kernel is still processing traffic obtained from this interface, it
> panics.
>
> There were lots of reports of this problem. No one seems to be working on
> it at the moment.
> You should file a PR using Bugzilla and attach your logs to it.
>
> Eugene Grosbein
>
>

netgraph/ng_base.c causing panic daily

2016-09-02 Thread Donald Baud via freebsd-net
I need help troubleshooting what seems to be race conditions with hooks in 
netgraph/ng_base.c

Not sure what to look for in order to stop those daily panics on a machine 
running net/mpd5 with a few hundred l2tp sessions.

I'm suspecting a crash being caused by:
/usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:2403
error = (*rcvdata)(hook, item);
break;

What can I do to confirm my suspicion once I get a crash log using kgdb?

-D




# uname -a
11.0-PRERELEASE FreeBSD 11.0-PRERELEASE #0 r305284
#GENERIC kernel 

# mpd5 --version 
Version 5.8 (root@110amd64-quarterly-job-01 00:24 11-Aug-2016)

# cat /boot/loader.conf
net.graph.maxdata=16384
net.graph.maxalloc=16384

# cat /etc/sysctl.conf
net.inet.ip.intr_queue_maxlen=1024
net.graph.maxdgram=1024000
net.graph.recvspace=1024000


# /etc/rc.conf:
mpd_enable="YES"
bird_enable="YES"



## crash 0
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x28
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x82247a8b
stack pointer   = 0x28:0xfe2df390
frame pointer   = 0x28:0xfe2df3d0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 596 (ng_queue1)
trap number = 12
panic: page fault
cpuid = 1
KDB: stack backtrace:
#0 0x80b24087 at kdb_backtrace+0x67
#1 0x80ad9432 at vpanic+0x182
#2 0x80ad92a3 at panic+0x43
#3 0x80fa1d51 at trap_fatal+0x351
#4 0x80fa1f43 at trap_pfault+0x1e3
#5 0x80fa14cc at trap+0x26c
#6 0x80f84461 at calltrap+0x8
#7 0x8225669b at ng_l2tp_rcvdata_lower+0x4bb
#8 0x8224652e at ng_apply_item+0x14e
#9 0x822461a3 at ng_snd_item+0x383
#10 0x8225a05a at ng_ksocket_incoming2+0x17a
#11 0x822464c5 at ng_apply_item+0xe5
#12 0x82248ddd at ngthread+0x1bd
#13 0x80a900a5 at fork_exit+0x85
#14 0x80f8499e at fork_trampoline+0xe
Uptime: 4d7h58m21s
Dumping 708 out of 6111 MB:..3%..12%..21%..32%..41%..52%..62%..71%..82%..91%

Reading symbols from 
/usr/local/lib/vmware-tools/modules/drivers/vmmemctl.ko...done.
Loaded symbols for /usr/local/lib/vmware-tools/modules/drivers/vmmemctl.ko
Reading symbols from /boot/kernel/ipfw.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/ipfw.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ipfw.ko
Reading symbols from /boot/kernel/ng_socket.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/ng_socket.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ng_socket.ko
Reading symbols from /boot/kernel/netgraph.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/netgraph.ko.debug...done.
done.
Loaded symbols for /boot/kernel/netgraph.ko
Reading symbols from /boot/kernel/ng_mppc.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/ng_mppc.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ng_mppc.ko
Reading symbols from /boot/kernel/rc4.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/rc4.ko.debug...done.
done.
Loaded symbols for /boot/kernel/rc4.ko
Reading symbols from /boot/kernel/ng_l2tp.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/ng_l2tp.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ng_l2tp.ko
Reading symbols from /boot/kernel/ng_ksocket.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/ng_ksocket.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ng_ksocket.ko
Reading symbols from /boot/kernel/ng_tee.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/ng_tee.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ng_tee.ko
Reading symbols from /boot/kernel/ng_iface.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/ng_iface.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ng_iface.ko
Reading symbols from /boot/kernel/ng_ppp.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/ng_ppp.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ng_ppp.ko
Reading symbols from /boot/kernel/ng_tcpmss.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/ng_tcpmss.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ng_tcpmss.ko
Reading symbols from /boot/kernel/ng_vjc.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/ng_vjc.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ng_vjc.ko
#0  doadump (textdump=) at pcpu.h:221
221 __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) 

###
(kgdb) list *0x82247a8b
0x82247a8b is in ng_address_hook 
(/usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:3587).
3582 * that the peer node is present, though maybe invalid.
3583 */
3584    TOPOLOGY_RLOCK();
3585    if ((hook == NULL)
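
The "list *ADDR" step above starts from the second half of the "instruction pointer" line in the panic message. A small sketch that extracts that address and prints the matching kgdb command is shown here; ip_to_kgdb_list is a hypothetical helper name, and the sample line is the one from crash 0:

```shell
#!/bin/sh
# Turn a panic "instruction pointer" line into a kgdb "list *ADDR" command.
# ip_to_kgdb_list is a hypothetical helper, not a FreeBSD tool.
ip_to_kgdb_list() {
    sed -n 's/.*instruction pointer[[:space:]]*=[[:space:]]*0x[0-9a-fA-F]*:\(0x[0-9a-fA-F]*\).*/list *\1/p'
}

# Sample line taken from crash 0 above.
printf '%s\n' 'instruction pointer = 0x20:0x82247a8b' | ip_to_kgdb_list
# prints: list *0x82247a8b
```

The printed command is meant to be typed at the (kgdb) prompt after opening the kernel and the crash dump, as was done above.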

Re: kernel panic with netgraph and mpd5.8

2016-07-12 Thread Donald Baud via freebsd-net



On 2016-07-10 10:49, Donald Baud via freebsd-net wrote:
Hi I'm running an l2tp lns through mpd5.8 and it's been crashing 
twice in 24h.

This is a new project replacing a cisco 7206, 700-sessions 800mbit/s


I am not familiar with troubleshooting kernel panics,

I am suspecting that the crash is happening inside the netgraph 
module because the crash is happening at the


instruction pointer = 0x20:0x81c38283
I included the two crash logs.  I need some help to figure out 
what to do next.


-Dbaud



On 7/10/16 5:14 PM, Hooman Fazaeli wrote:
- Upgrade to mpd 5 (/usr/ports/net/mpd5)
- Try below workarounds:
https://lists.freebsd.org/pipermail/freebsd-bugs/2014-June/056548.html
https://lists.freebsd.org/pipermail/freebsd-bugs/2014-June/056549.html
https://lists.freebsd.org/pipermail/freebsd-net/2014-June/038954.html



On 7/10/16 8:43 PM, Donald Baud via freebsd-net wrote:
- I'm already using the latest mpd5:
> mpd5 --version
Version 5.8 (root@101amd64-quarterly-job-15 12:36  5-Jun-2016)

- I had already reviewed those links you mentioned.
Here is a summary of the main suggestions in them.
* Add a "sleep 1" to up-down interface events.
* Revert to RELENG8 or 9
* boost net.graph sysctl/loader.conf
  net.graph.maxdata=262140 # /boot/loader.conf
  net.graph.maxalloc=262140 # /boot/loader.conf

I was using the following tunings
net.graph.maxdgram=524288  (via sysctl.conf default=20480)
net.graph.recvspace=524288 (via sysctl.conf default=20480)
net.graph.maxdata=65536   (via loader.conf default=4096 )
net.graph.maxalloc=65536  (via loader.conf default=4096 )

I suspect that the panic might be caused by maxdata and maxalloc 
values that are too high.
I reduced both values to 20480; I'll report back if that reduces the 
occurrence of kernel panics.



vmstat -z | head -1 ; vmstat -z | grep -i graph

ITEM                 SIZE  LIMIT  USED  FREE         REQ  FAIL  SLEEP
NetGraph items:        72, 20491,    2, 1672,  467166841,    0,     0
NetGraph data items:   72, 20491,    0, 1643, 1240166475,    0,     0
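
A sketch of how the two NetGraph lines above could be summarised by a script, so that a non-zero FAIL count stands out. netgraph_zone_report is a made-up helper; it assumes the column order shown in the header above (SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP):

```shell
#!/bin/sh
# Summarise "NetGraph ..." rows of vmstat -z output read from stdin.
# netgraph_zone_report is a hypothetical helper, not a FreeBSD tool.
netgraph_zone_report() {
    awk -F: '/^NetGraph/ {
        # After the zone name: SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP
        split($2, f, ",")
        limit = f[2] + 0; used = f[3] + 0; fail = f[6] + 0
        note = (fail > 0) ? "allocation failures seen" : "no failures"
        printf "%s: used %d of %d, fail %d (%s)\n", $1, used, limit, fail, note
    }'
}

# The two rows quoted above.
printf '%s\n' \
    'NetGraph items:  72,  20491,   2,1672,467166841, 0,   0' \
    'NetGraph data items: 72,  20491,   0, 1643,1240166475,   0,   0' |
    netgraph_zone_report
```

On the server itself the input would be "vmstat -z | grep -i graph" as used above.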




The server crashed again this morning.
It looks like it crashes somewhere in the netgraph.ko module
Could someone please help me troubleshoot this issue, it crashes around 
the same location

instruction pointer= 0x20:0x81c3828d
The crash happens at random times not necessarily under heavy load.


- using plain GENERIC kernel
10.3-RELEASE-p4 FreeBSD 10.3-RELEASE-p4 #0: Sat May 28 12:23:44 UTC 2016
r...@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

- # kldstat
Id Refs Address    Size     Name
 1   32 0x8020     17bc6a8  kernel
 2    2 0x81c11000 114db    ipfw.ko
 3    1 0x81c23000 d32f     dummynet.ko
 4    1 0x81c31000 3831     ng_socket.ko
 5    8 0x81c35000 ba02     netgraph.ko
 6    1 0x81c41000 2b99     ng_mppc.ko
 7    1 0x81c44000 80c      rc4.ko
 8    1 0x81c45000 23dc     vmmemctl.ko
 9    1 0x81c48000 397d     ng_l2tp.ko
10    1 0x81c4c000 4b04     ng_ksocket.ko
11    1 0x81c51000 17d6     ng_tee.ko
12    1 0x81c53000 40d2     ng_iface.ko
13    1 0x81c58000 5829     ng_ppp.ko
14    1 0x81c5e000 18b1     ng_tcpmss.ko
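
The guess that the crash is inside netgraph.ko can be checked against this table: a module covers the range from Address to Address + Size (both hex). A minimal sketch follows; find_module is a hypothetical helper, and the addresses are used exactly as they appear in this mail (shortened), which works because the instruction pointer is shortened the same way:

```shell
#!/bin/sh
# find_module ADDR: read kldstat-style rows (Id Refs Address Size Name) on
# stdin and print the module containing ADDR, plus the offset into it.
# Hypothetical helper for illustration only.
find_module() {
    addr=$(( $1 ))
    while read -r id refs base size name; do
        case "$base" in
            0x*) ;;          # data row
            *) continue ;;   # header or blank line
        esac
        start=$(( $base ))
        end=$(( $start + 0x$size ))
        if [ "$addr" -ge "$start" ] && [ "$addr" -lt "$end" ]; then
            printf '%s+0x%x\n' "$name" $(( $addr - $start ))
        fi
    done
}

# Instruction pointer from this morning's crash and three rows of the table.
find_module 0x81c3828d <<'EOF'
Id Refs Address    Size Name
 4    1 0x81c31000 3831 ng_socket.ko
 5    8 0x81c35000 ba02 netgraph.ko
 6    1 0x81c41000 2b99 ng_mppc.ko
EOF
# prints: netgraph.ko+0x328d
```

So 0x81c3828d does fall inside netgraph.ko (0x81c35000 up to 0x81c40a02), which matches the suspicion above.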

-  /etc/rc.conf
mpd_enable="YES"
quagga_daemons="zebra ospfd"
devd_enable="NO"
ipv6_network_interfaces="none"
ip6addrctl_enable="NO"

- /etc/sysctl.conf
net.inet.ip.fastforwarding=1
hw.intr_storm_threshold=4
net.graph.maxdgram=524288
net.graph.recvspace=524288

- /boot/loader.conf
net.graph.maxdata=20480
net.graph.maxalloc=20480

- grep kernel: /var/log/messages
Jul 12 04:18:05 mybox syslogd: kernel boot file is /boot/kernel/kernel
Jul 12 04:18:05 mybox kernel:
Jul 12 04:18:05 mybox kernel:
Jul 12 04:18:05 mybox kernel: Fatal trap 9: general protection fault while in kernel mode
Jul 12 04:18:05 mybox kernel: cpuid = 0; apic id = 00
Jul 12 04:18:05 mybox kernel: instruction pointer= 0x20:0x81c3828d
Jul 12 04:18:05 mybox kernel: stack pointer= 0x28:0xfe0174da8380
Jul 12 04:18:05 mybox kernel: frame pointer= 0x28:0xfe0174da83c0
Jul 12 04:18:05 mybox kernel: code segment= base 0x0, limit 0xf, type 0x1b
Jul 12 04:18:05 mybox kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
Jul 12 04:18:05 mybox kernel: processor eflags= interrupt enabled, resume, IOPL = 0

Jul 12 04:18:05 mybox kernel: current process= 659 (ng_queue3)
Jul 12 04:18:05 mybox kernel: trap number= 9
Jul 12 04:18:05 mybox kernel: panic: general protection fault
Jul 12 04:18:05 mybox kernel: cpuid = 0
Jul 12 04:18:05 mybox kernel: KDB: stack backtrace:
Jul 12 04:18:05 mybox kernel: #0 0x8098e390 at kdb_backtrace+0x60
Jul 12 04:18:05 mybox kernel: #1 0x80951066 at vpanic+0x126
Jul 12 04:18:05 mybox kernel: #2 0x80950f33 at panic+0x43
Jul 12 04:18:05 

Re: kernel panic with netgraph and mpd3.8

2016-07-10 Thread Donald Baud via freebsd-net



On 2016-07-10 10:49, Donald Baud via freebsd-net wrote:
Hi I'm running an l2tp lns through mpd3.8 and it's been crashing 
twice in 24h.

This is a new project replacing a cisco 7206, 700-sessions 800mbit/s


I am not familiar with troubleshooting kernel panics,

I am suspecting that the crash is happening inside the netgraph 
module because the crash is happening at the


instruction pointer = 0x20:0x81c38283
I included the two crash logs.  I need some help to figure out 
what to do next.


-Dbaud



On 7/10/16 5:14 PM, Hooman Fazaeli wrote:
- Upgrade to mpd 5 (/usr/ports/net/mpd5)
- Try below workarounds:
https://lists.freebsd.org/pipermail/freebsd-bugs/2014-June/056548.html
https://lists.freebsd.org/pipermail/freebsd-bugs/2014-June/056549.html
https://lists.freebsd.org/pipermail/freebsd-net/2014-June/038954.html



- I'm already using the latest mpd5:
> mpd5 --version
Version 5.8 (root@101amd64-quarterly-job-15 12:36  5-Jun-2016)

- I had already reviewed those links you mentioned.
Here is a summary of the main suggestions in them.
* Add a "sleep 1" to up-down interface events.
* Revert to RELENG8 or 9
* boost net.graph sysctl/loader.conf
  net.graph.maxdata=262140 # /boot/loader.conf
  net.graph.maxalloc=262140 # /boot/loader.conf

I was using the following tunings
net.graph.maxdgram=524288  (via sysctl.conf default=20480)
net.graph.recvspace=524288 (via sysctl.conf default=20480)
net.graph.maxdata=65536   (via loader.conf default=4096 )
net.graph.maxalloc=65536  (via loader.conf default=4096 )

I suspect that the panic might be caused by maxdata and maxalloc 
values that are too high.
I reduced both values to 20480; I'll report back if that reduces the 
occurrence of kernel panics.


vmstat -z | head -1 ; vmstat -z | grep -i graph

ITEM                 SIZE  LIMIT  USED  FREE         REQ  FAIL  SLEEP
NetGraph items:        72, 20491,    2, 1672,  467166841,    0,     0
NetGraph data items:   72, 20491,    0, 1643, 1240166475,    0,     0





kernel panic with netgraph and mpd3.8

2016-07-09 Thread Donald Baud via freebsd-net
Hi I'm running an l2tp lns through mpd3.8 and it's been crashing twice in 24h.
This is a new project replacing a cisco 7206, 700-sessions 800mbit/s


I am not familiar with troubleshooting kernel panics,

I am suspecting that the crash is happening inside the netgraph module because 
the crash is happening at the 

instruction pointer = 0x20:0x81c38283
I included the two crash logs.  I need some help to figure out what to do 
next.

-Dbaud





The box is a:
# uname -a
FreeBSD mybox.example.com 10.3-RELEASE-p4 FreeBSD 10.3-RELEASE-p4 #0: Sat May 
28 12:23:44 UTC 2016 
r...@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64

# kldstat
Id Refs Address    Size     Name
 1   34 0x8020     17bc6a8  kernel
 2    2 0x81c11000 114db    ipfw.ko
 3    1 0x81c23000 d32f     dummynet.ko
 4    1 0x81c31000 3831     ng_socket.ko
 5    9 0x81c35000 ba02     netgraph.ko
 6    1 0x81c41000 2b99     ng_mppc.ko
 7    1 0x81c44000 80c      rc4.ko
 8    1 0x81c45000 23dc     vmmemctl.ko
 9    1 0x81c48000 397d     ng_l2tp.ko
10    1 0x81c4c000 4b04     ng_ksocket.ko
11    1 0x81c51000 17d6     ng_tee.ko
12    1 0x81c53000 40d2     ng_iface.ko
13    1 0x81c58000 5829     ng_ppp.ko
14    1 0x81c5e000 18b1     ng_tcpmss.ko
15    1 0x81c6     2df7     ng_vjc.ko


===
First crash dump:
Jul  8 08:09:04 mybox syslogd: kernel boot file is /boot/kernel/kernel
Jul  8 08:09:04 mybox kernel: 
Jul  8 08:09:04 mybox kernel: 
Jul  8 08:09:04 mybox kernel: Fatal trap 12: page fault while in kernel mode
Jul  8 08:09:04 mybox kernel: cpuid = 1; apic id = 01
Jul  8 08:09:04 mybox kernel: fault virtual address = 0x28
Jul  8 08:09:04 mybox kernel: fault code= supervisor read data, page not present
Jul  8 08:09:04 mybox kernel: instruction pointer   = 0x20:0x81c38283
Jul  8 08:09:04 mybox kernel: stack pointer = 0x28:0xfe0174d85540
Jul  8 08:09:04 mybox kernel: frame pointer = 0x28:0xfe0174d85580
Jul  8 08:09:04 mybox kernel: code segment  = base 0x0, limit 0xf, type 0x1b
Jul  8 08:09:04 mybox kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
Jul  8 08:09:04 mybox kernel: processor eflags  = interrupt enabled, resume, IOPL = 0
Jul  8 08:09:04 mybox kernel: current process   = 628 (ng_queue2)
Jul  8 08:09:04 mybox kernel: trap number   = 12
Jul  8 08:09:04 mybox kernel: panic: page fault
Jul  8 08:09:04 mybox kernel: cpuid = 1
Jul  8 08:09:04 mybox kernel: KDB: stack backtrace:
Jul  8 08:09:04 mybox kernel: #0 0x8098e390 at kdb_backtrace+0x60
Jul  8 08:09:04 mybox kernel: #1 0x80951066 at vpanic+0x126
Jul  8 08:09:04 mybox kernel: #2 0x80950f33 at panic+0x43
Jul  8 08:09:04 mybox kernel: #3 0x80d55f7b at trap_fatal+0x36b
Jul  8 08:09:04 mybox kernel: #4 0x80d5627d at trap_pfault+0x2ed
Jul  8 08:09:04 mybox kernel: #5 0x80d558fa at trap+0x47a
Jul  8 08:09:04 mybox kernel: #6 0x80d3b8d2 at calltrap+0x8
Jul  8 08:09:04 mybox kernel: #7 0x81c5e509 at ng_tcpmss_rcvdata+0x2d9
Jul  8 08:09:04 mybox kernel: #8 0x81c370ca at ng_apply_item+0x21a
Jul  8 08:09:04 mybox kernel: #9 0x81c36d1a at ng_snd_item+0x38a
Jul  8 08:09:04 mybox kernel: #10 0x81c5a1c8 at ng_ppp_comp_recv+0x148
Jul  8 08:09:04 mybox kernel: #11 0x81c370ca at ng_apply_item+0x21a
Jul  8 08:09:04 mybox kernel: #12 0x81c36d1a at ng_snd_item+0x38a
Jul  8 08:09:04 mybox kernel: #13 0x81c370ca at ng_apply_item+0x21a
Jul  8 08:09:04 mybox kernel: #14 0x81c36d1a at ng_snd_item+0x38a
Jul  8 08:09:04 mybox kernel: #15 0x81c370ca at ng_apply_item+0x21a
Jul  8 08:09:04 mybox kernel: #16 0x81c36d1a at ng_snd_item+0x38a
Jul  8 08:09:04 mybox kernel: #17 0x81c4d3e2 at ng_ksocket_incoming2+0x2f2
Jul  8 08:09:04 mybox kernel: Uptime: 5d17h47m38s
Jul  8 08:09:04 mybox kernel: Copyright (c) 1992-2016 The FreeBSD Project.
Jul  8 08:09:04 mybox kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 
1991, 1992, 1993, 1994
Jul  8 08:09:04 mybox kernel: The Regents of the University of California. All 
rights reserved.
Jul  8 08:09:04 mybox kernel: FreeBSD is a registered trademark of The FreeBSD 
Foundation.
Jul  8 08:09:04 mybox kernel: FreeBSD 10.3-RELEASE-p4 #0: Sat May 28 12:23:44 
UTC 2016
Jul  8 08:09:04 mybox kernel: 
r...@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
Jul  8 08:09:04 mybox kernel: FreeBSD clang version 3.4.1 
(tags/RELEASE_34/dot1-final 208032) 20140512
Jul  8 08:09:04 mybox kernel: CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz 
(2000.00-MHz K8-class CPU)
Jul  8 08:09:04 mybox kernel: Origin="GenuineIntel"  Id=0x206d7  Family=0x6  
Model=0x2d  Stepping=7
Jul  8 08:09:04 mybox kernel: 
Features=0x1fa3fbff
Jul  8