Re: FreeBSD10.3-RELEASE. Kernel panic.
On 10/12/16 3:24 PM, Zaphod Beeblebrox wrote:

> While my mpd5 servers are possibly less busy (I haven't had common
> crashes), I have noticed a "group" of problems.
>
> 1. The carrier dropping communication (i.e. fiber cut or L2 switch
> breakage) of the L2TP streams can leave mpd5 in a state where it will
> not die and will not destroy interfaces (requires reboot to clear).

I've encountered that once on 10.3, and I had tweaked some sysctl values while monitoring with:

    vmstat -z | head -1; vmstat -z | grep -i netgraph

You might want to search for other people's experience with the following values:

    # net.graph.maxdgram   # this is set in /etc/sysctl.conf
    # net.graph.recvspace  # this is set in /etc/sysctl.conf
    # net.graph.maxdata    # this is set in /boot/loader.conf
    # net.graph.maxalloc   # this is set in /boot/loader.conf

I'll leave it to others to comment on which values work best, based on their experience with FreeBSD 10.3. In my case, as I had explained, one of the recipes that worked for me is to comment these out and leave the kernel values at their defaults.

I've read on the mpd5 mailing list that FreeBSD 11 has had upgrades to the netgraph modules. I am now using FreeBSD 11 and it looks like I don't need any of the kernel tweaks that I've described.

Also, may I suggest you troubleshoot the fiber cut or L2 switch breakage by using an ipfw rule to simulate a fiber cut, e.g.:

    ipfw add 100 deny ip from 10.10.10.10 to me

> 2. There are race conditions between quagga and mpd5 for
> adding/dropping routes.

While troubleshooting the mpd5 crashes, I removed net/quagga and installed net/bird instead. I am now using net/bird. I've written a little howto to get you started with net/bird, see: https://forums.freebsd.org/threads/56988/

> 3. If A is a PPPoE client and B is the mpd5 server, A cannot access TCP
> services on B. It can access TCP services _beyond_ B, but not on B
> (there is a ticket open for this).
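The ipfw suggestion above can be wrapped in a small helper so the simulated outage is always undone. This is only a sketch: the rule number 100 and the peer address 10.10.10.10 come from the example above, while the `simulate_cut` name, the duration argument, and the `IPFW` override (which allows a dry run) are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: temporarily block an L2TP peer with ipfw to mimic a fiber cut.
# Set IPFW="echo ipfw" to print the commands instead of running them.
IPFW=${IPFW:-ipfw}

# simulate_cut PEER SECONDS [RULE]   (hypothetical helper, not part of ipfw)
simulate_cut() {
    peer=$1
    secs=$2
    rule=${3:-100}
    # Block traffic from the peer to this host (same rule as in the post).
    $IPFW add "$rule" deny ip from "$peer" to me || return 1
    sleep "$secs"
    # Remove the rule again so the tunnels can re-establish.
    $IPFW delete "$rule"
}

# Example (as root on the mpd5 server): simulate_cut 10.10.10.10 60
```

Running it first with `IPFW="echo ipfw"` shows the two commands without touching the firewall, which is a cheap way to check the rule number does not collide with an existing one.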
Re: FreeBSD10.3-RELEASE. Kernel panic.
On 10/12/16 1:13 AM, Julian Elischer wrote:

> On 11/10/2016 8:56 PM, Donald Baud via freebsd-net wrote:
>> I've been plagued with these =daily= panics until I tried the
>> following recipes and the server has been up for 30 days so far:
>> Normally I should experiment more to see which one of the recipes is
>> really the fix, but I'm just glad that the server is stable for now.
>
> this is really great information. It makes debugging a lot more
> possible. I know it is a hard question, but do you have a way to
> simulate this workload? I have no real way to simulate this kind of
> workload

Sadly, I don't have a way to simulate the workload, but I am very interested in helping to fix these crashes since, as Cassiano said, this makes mpd5/FreeBSD useless for PPPoE/L2TP termination.

At this point, I would suggest that Cassiano and Андрей confirm that they don't get panics when they apply the recipes that I am using.

I am still running many other cisco-vpdn gateways that I would convert to mpd5/FreeBSD, but my plan was stalled by the daily crashes. I'll wait a couple of weeks to be sure that my recipes are a valid workaround before converting my remaining Cisco gateways to mpd5.

-Dbaud

>> recipe-1: Don't let mpd5 start automatically when the server boots,
>> i.e. in /etc/rc.conf:
>>
>>     mpd5_enable="NO"
>>
>> and wait about 5 minutes after the server boots, then issue:
>>
>>     /usr/local/etc/rc.d/mpd5 onestart
>>
>> recipe-2: Recompile the kernel with the NETGRAPH_DEBUG option:
>>
>>     options NETGRAPH
>>     options NETGRAPH_DEBUG
>>     options NETGRAPH_KSOCKET
>>     options NETGRAPH_L2TP
>>     options NETGRAPH_SOCKET
>>     options NETGRAPH_TEE
>>     options NETGRAPH_VJC
>>     options NETGRAPH_PPP
>>     options NETGRAPH_IFACE
>>     options NETGRAPH_MPPC_COMPRESSION
>>     options NETGRAPH_MPPC_ENCRYPTION
>>     options NETGRAPH_TCPMSS
>>     options IPFIREWALL
>>
>> recipe-3: Recompile the kernel and disable the IPv6 and SCTP options:
>>
>>     nooptions INET6
>>     nooptions SCTP
>>
>> recipe-4: Don't use any of the sysctl optimizations; in other words,
>> I commented out all values in sysctl.conf:
>>
>>     # net.graph.maxdgram=20480 (this is the default)
>>     # net.graph.recvspace=20480 (this is the default)
>>
>> recipe-5: Don't use any of the loader.conf optimizations; in other
>> words, I commented out all values in loader.conf:
>>
>>     # net.graph.maxdata=4096 (this is the default)
>>     # net.graph.maxalloc=4096 (this is the default)
>>
>> In my case, I had the panics with 10.3 and 11-PRERELEASE:
>>
>>     11.0-PRERELEASE FreeBSD 11.0-PRERELEASE #2 r305587
>>
>> With those recipes, I have been running without any crash for a month
>> and counting. That's 300 l2tp tunnels and 1400 l2tp sessions
>> generating 700Mbit/s.
>>
>> -DBaud
>>
>> On Tuesday, October 11, 2016 7:30 AM, Cassiano Peixoto wrote:
>>> Hi,
>>>
>>> There are many users complaining about this:
>>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=186114
>>>
>>> I've been dealing with this issue for one year with no solution.
>>> mpd5 as a PPPoE server on FreeBSD is useless with this bug. I really
>>> would like to see it working again; I think it's quite important to
>>> both the project and many users.
>>>
>>> Thanks.
>>>
>>> On Tue, Oct 11, 2016 at 3:24 AM, Eugene Grosbein wrote:
>>>> 11.10.2016 11:02, Андрей Леушкин wrote:
>>>>> Hello. I have a problem with "FreeBSD nas 10.3-RELEASE FreeBSD
>>>>> 10.3-RELEASE #0: Fri Oct 7 21:12:56 YEKT 2016
>>>>> nas@nas:/usr/obj/usr/src/sys/nasv3 amd64"
>>>>>
>>>>> The kernel panic repeats at intervals of 2-3 days. At first I
>>>>> thought that the problem was in the hardware, but the problem did
>>>>> not go away after replacing the server platform.
>>>>>
>>>>> Coredumps and more info at this link:
>>>>> https://drive.google.com/open?id=0BxciMy2q7ZjTTkIxem9wTE1tM2M
>>>>>
>>>>> Sorry for my English.
>>>>> I'll wait for an answer.
>>>>
>>>> This is a known and long-standing problem in the FreeBSD network
>>>> stack. It shows up when you have lots of network interfaces
>>>> created/removed frequently, as in your case of a Network Access
>>>> Server (PPTP, PPPoE etc).
>>>>
>>>> Generally, people run into this problem using the mpd5 network
>>>> daemon. mpd5 uses the NETGRAPH kernel subsystem to process traffic,
>>>> and if an interface disappears (e.g. a user disconnected) while the
>>>> kernel still processes traffic obtained from this interface, it
>>>> panics.
>>>>
>>>> There were lots of reports of this problem. No one seems to be
>>>> working on it at the moment. You should file a PR using Bugzilla
>>>> and attach your logs to it.
>>>>
>>>> Eugene Grosbein

___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
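Recipe-1 above (keep mpd5 out of the normal boot sequence and start it by hand about five minutes later) could be automated along these lines. This is a minimal sketch, not something posted in the thread: the `delayed_start` helper name and the idea of calling it from a local boot script are assumptions; only the 5-minute wait and the `onestart` command come from the recipe.

```shell
#!/bin/sh
# Sketch of recipe-1: run a command only after a settling delay.
# delayed_start SECONDS COMMAND [ARGS...]   (hypothetical helper)
delayed_start() {
    delay=$1
    shift
    # Only run the command if the sleep completed normally.
    sleep "$delay" && "$@"
}

# Example, run in the background from a local boot script (assumed placement):
#   delayed_start 300 /usr/local/etc/rc.d/mpd5 onestart &
```

Because `onestart` ignores the rc.conf enable knob, this works while mpd5 stays disabled for the normal boot sequence, which is the point of the recipe.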
netgraph/ng_base.c causing panic daily
I need help troubleshooting what seem to be race conditions with hooks in netgraph/ng_base.c.

I'm not sure what to look for in order to stop these daily panics on a machine running net/mpd5 with a few hundred l2tp sessions.

I suspect the crash is being caused by:

/usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:2403
    error = (*rcvdata)(hook, item);
    break;

What can I do to confirm my suspicion once I get a crash log using kgdb?

-D

# uname -a
11.0-PRERELEASE FreeBSD 11.0-PRERELEASE #0 r305284 #GENERIC kernel

# mpd5 --version
Version 5.8 (root@110amd64-quarterly-job-01 00:24 11-Aug-2016)

# cat /boot/loader.conf
net.graph.maxdata=16384
net.graph.maxalloc=16384

# cat /etc/sysctl.conf
net.inet.ip.intr_queue_maxlen=1024
net.graph.maxdgram=1024000
net.graph.recvspace=1024000

# /etc/rc.conf:
mpd_enable="YES"
bird_enable="YES"

## crash 0

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0x28
fault code            = supervisor read data, page not present
instruction pointer   = 0x20:0x82247a8b
stack pointer         = 0x28:0xfe2df390
frame pointer         = 0x28:0xfe2df3d0
code segment          = base 0x0, limit 0xf, type 0x1b
                      = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags      = interrupt enabled, resume, IOPL = 0
current process       = 596 (ng_queue1)
trap number           = 12
panic: page fault
cpuid = 1
KDB: stack backtrace:
#0 0x80b24087 at kdb_backtrace+0x67
#1 0x80ad9432 at vpanic+0x182
#2 0x80ad92a3 at panic+0x43
#3 0x80fa1d51 at trap_fatal+0x351
#4 0x80fa1f43 at trap_pfault+0x1e3
#5 0x80fa14cc at trap+0x26c
#6 0x80f84461 at calltrap+0x8
#7 0x8225669b at ng_l2tp_rcvdata_lower+0x4bb
#8 0x8224652e at ng_apply_item+0x14e
#9 0x822461a3 at ng_snd_item+0x383
#10 0x8225a05a at ng_ksocket_incoming2+0x17a
#11 0x822464c5 at ng_apply_item+0xe5
#12 0x82248ddd at ngthread+0x1bd
#13 0x80a900a5 at fork_exit+0x85
#14 0x80f8499e at fork_trampoline+0xe
Uptime: 4d7h58m21s
Dumping 708 out of 6111 MB:..3%..12%..21%..32%..41%..52%..62%..71%..82%..91%

Reading symbols from /usr/local/lib/vmware-tools/modules/drivers/vmmemctl.ko...done.
Loaded symbols for /usr/local/lib/vmware-tools/modules/drivers/vmmemctl.ko
Reading symbols from /boot/kernel/ipfw.ko...Reading symbols from /usr/lib/debug//boot/kernel/ipfw.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ipfw.ko
Reading symbols from /boot/kernel/ng_socket.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_socket.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ng_socket.ko
Reading symbols from /boot/kernel/netgraph.ko...Reading symbols from /usr/lib/debug//boot/kernel/netgraph.ko.debug...done.
done.
Loaded symbols for /boot/kernel/netgraph.ko
Reading symbols from /boot/kernel/ng_mppc.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_mppc.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ng_mppc.ko
Reading symbols from /boot/kernel/rc4.ko...Reading symbols from /usr/lib/debug//boot/kernel/rc4.ko.debug...done.
done.
Loaded symbols for /boot/kernel/rc4.ko
Reading symbols from /boot/kernel/ng_l2tp.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_l2tp.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ng_l2tp.ko
Reading symbols from /boot/kernel/ng_ksocket.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_ksocket.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ng_ksocket.ko
Reading symbols from /boot/kernel/ng_tee.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_tee.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ng_tee.ko
Reading symbols from /boot/kernel/ng_iface.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_iface.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ng_iface.ko
Reading symbols from /boot/kernel/ng_ppp.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_ppp.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ng_ppp.ko
Reading symbols from /boot/kernel/ng_tcpmss.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_tcpmss.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ng_tcpmss.ko
Reading symbols from /boot/kernel/ng_vjc.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_vjc.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ng_vjc.ko
#0 doadump (textdump=) at pcpu.h:221
221 __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb)

### (kgdb) list *0x82247a8b
0x82247a8b is in ng_address_hook (/usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:3587).
3582 * that the peer node is present, though maybe invalid.
3583 */
3584 TOPOLOGY_RLOCK();
3585 if ((hook == NULL)
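When comparing several crashes like the one above, it can help to reduce each backtrace to just its netgraph frames. The following is a rough sketch under stated assumptions: the `ng_frames` name is made up, and it only relies on the "#N address at symbol+offset" layout visible in the backtrace above (with or without a syslog prefix). It does not replace inspecting the dump in kgdb.

```shell
#!/bin/sh
# Sketch: list the netgraph functions that appear in a KDB stack backtrace.
# Reads panic text on stdin and prints "frame symbol" for each ng* frame.
ng_frames() {
    awk '
        {
            # Scan fields so syslog-prefixed lines are handled too.
            for (i = 1; i <= NF - 3; i++) {
                if ($i ~ /^#[0-9]+$/ && $(i + 2) == "at" && $(i + 3) ~ /^ng/) {
                    sym = $(i + 3)
                    # Drop the "+0x..." offset, keep the function name.
                    sub(/\+0x[0-9a-f]+$/, "", sym)
                    print $i, sym
                }
            }
        }'
}

# Example: grep "kernel:" /var/log/messages | ng_frames
```

Fed the backtrace above, this would show the path from ngthread through ng_ksocket_incoming2 and ng_snd_item into ng_l2tp_rcvdata_lower, which makes it easier to see whether different panics share the same netgraph call chain.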
Re: kernel panic with netgraph and mpd5.8
On 2016-07-10 10:49, Donald Baud via freebsd-net wrote:

> Hi
>
> I'm running an l2tp lns through mpd5.8 and it's been crashing twice in
> 24h. This is a new project replacing a cisco 7206, 700-sessions
> 800mbit/s
>
> I am not familiar with troubleshooting kernel panics. I suspect that
> the crash is happening inside the netgraph module because the crash is
> happening at the instruction pointer = 0x20:0x81c38283
>
> I included the two crash logs.
>
> I need some help to figure out what to do next.
>
> -Dbaud

On 7/10/16 5:14 PM, Hooman Fazaeli wrote:

> - Upgrade to mpd 5 (/usr/ports/net/mpd5)
> - Try below workarounds:
>   https://lists.freebsd.org/pipermail/freebsd-bugs/2014-June/056548.html
>   https://lists.freebsd.org/pipermail/freebsd-bugs/2014-June/056549.html
>   https://lists.freebsd.org/pipermail/freebsd-net/2014-June/038954.html

On 7/10/16 8:43 PM, Donald Baud via freebsd-net wrote:

> - I'm already using the latest mpd5:
>
>     > mpd5 --version
>     Version 5.8 (root@101amd64-quarterly-job-15 12:36 5-Jun-2016)
>
> - I had already reviewed those links you mentioned. Here is a summary
>   of the main suggestions in them.
>
>   * Add a "sleep 1" to up-down interface events.
>   * Revert to RELENG8 or 9
>   * boost net.graph sysctl/loader.conf
>
>     net.graph.maxdata=262140  # /boot/loader.conf
>     net.graph.maxalloc=262140 # /boot/loader.conf
>
> I was using the following tunings:
>
>     net.graph.maxdgram=524288  (via sysctl.conf default=20480)
>     net.graph.recvspace=524288 (via sysctl.conf default=20480)
>     net.graph.maxdata=65536    (via loader.conf default=4096)
>     net.graph.maxalloc=65536   (via loader.conf default=4096)
>
> I suspect that the panic might be caused by too high maxdata and
> maxalloc values: I reduced the values to 20480, and I'll report back
> on whether that reduces the occurrence of kernel panics.
>
>     vmstat -z | head -1 ; vmstat -z | grep -i graph
>     ITEM                  SIZE  LIMIT  USED  FREE         REQ  FAIL  SLEEP
>     NetGraph items:         72, 20491,    2, 1672,  467166841,    0,     0
>     NetGraph data items:    72, 20491,    0, 1643, 1240166475,    0,     0

The server crashed again this morning. It looks like it crashes somewhere in the netgraph.ko module.

Could someone please help me troubleshoot this issue? It crashes around the same location:

    instruction pointer = 0x20:0x81c3828d

The crash happens at random times, not necessarily under heavy load.

- using plain GENERIC kernel 10.3-RELEASE-p4

FreeBSD 10.3-RELEASE-p4 #0: Sat May 28 12:23:44 UTC 2016 r...@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

- # kldstat

Id Refs Address    Size    Name
 1   32 0x8020     17bc6a8 kernel
 2    2 0x81c11000 114db   ipfw.ko
 3    1 0x81c23000 d32f    dummynet.ko
 4    1 0x81c31000 3831    ng_socket.ko
 5    8 0x81c35000 ba02    netgraph.ko
 6    1 0x81c41000 2b99    ng_mppc.ko
 7    1 0x81c44000 80c     rc4.ko
 8    1 0x81c45000 23dc    vmmemctl.ko
 9    1 0x81c48000 397d    ng_l2tp.ko
10    1 0x81c4c000 4b04    ng_ksocket.ko
11    1 0x81c51000 17d6    ng_tee.ko
12    1 0x81c53000 40d2    ng_iface.ko
13    1 0x81c58000 5829    ng_ppp.ko
14    1 0x81c5e000 18b1    ng_tcpmss.ko

- /etc/rc.conf

mpd_enable="YES"
quagga_daemons="zebra ospfd"
devd_enable="NO"
ipv6_network_interfaces="none"
ip6addrctl_enable="NO"

- /etc/sysctl.conf

net.inet.ip.fastforwarding=1
hw.intr_storm_threshold=4
net.graph.maxdgram=524288
net.graph.recvspace=524288

- /boot/loader.conf

net.graph.maxdata=20480
net.graph.maxalloc=20480

- grep kernel: /var/log/messages

Jul 12 04:18:05 mybox syslogd: kernel boot file is /boot/kernel/kernel
Jul 12 04:18:05 mybox kernel:
Jul 12 04:18:05 mybox kernel:
Jul 12 04:18:05 mybox kernel: Fatal trap 9: general protection fault while in kernel mode
Jul 12 04:18:05 mybox kernel: cpuid = 0; apic id = 00
Jul 12 04:18:05 mybox kernel: instruction pointer = 0x20:0x81c3828d
Jul 12 04:18:05 mybox kernel: stack pointer = 0x28:0xfe0174da8380
Jul 12 04:18:05 mybox kernel: frame pointer = 0x28:0xfe0174da83c0
Jul 12 04:18:05 mybox kernel: code segment = base 0x0, limit 0xf, type 0x1b
Jul 12 04:18:05 mybox kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
Jul 12 04:18:05 mybox kernel: processor eflags = interrupt enabled, resume, IOPL = 0
Jul 12 04:18:05 mybox kernel: current process = 659 (ng_queue3)
Jul 12 04:18:05 mybox kernel: trap number = 9
Jul 12 04:18:05 mybox kernel: panic: general protection fault
Jul 12 04:18:05 mybox kernel: cpuid = 0
Jul 12 04:18:05 mybox kernel: KDB: stack backtrace:
Jul 12 04:18:05 mybox kernel: #0 0x8098e390 at kdb_backtrace+0x60
Jul 12 04:18:05 mybox kernel: #1 0x80951066 at vpanic+0x126
Jul 12 04:18:05 mybox kernel: #2 0x80950f33 at panic+0x43
Jul 12 04:18:05
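The `vmstat -z` check quoted in this message can be turned into a quick pass/fail summary of the NetGraph zones. This is a minimal sketch, assuming the column order shown in the output above (SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP); the `ng_zone_report` name and the "ok"/"FAIL" labels are illustrative choices, not anything from the thread.

```shell
#!/bin/sh
# Sketch: summarize NetGraph UMA zones from `vmstat -z` output on stdin.
# Flags a zone when its FAIL counter is non-zero (allocation failures).
ng_zone_report() {
    awk -F'[:,]' '
        tolower($1) ~ /netgraph/ {
            # Fields after the zone name: SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP
            limit = $3 + 0
            used = $4 + 0
            fail = $7 + 0
            status = (fail > 0) ? "FAIL" : "ok"
            printf "%s: used=%d limit=%d fail=%d %s\n", $1, used, limit, fail, status
        }'
}

# On a FreeBSD host: vmstat -z | ng_zone_report
```

With the figures posted above, both zones would report fail=0, which suggests the item limits were not being exhausted at the time the output was captured.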
kernel panic with netgraph and mpd5.8

Hi

I'm running an l2tp lns through mpd5.8 and it's been crashing twice in 24h. This is a new project replacing a cisco 7206, 700-sessions 800mbit/s

I am not familiar with troubleshooting kernel panics. I suspect that the crash is happening inside the netgraph module because the crash is happening at the instruction pointer = 0x20:0x81c38283

I included the two crash logs.

I need some help to figure out what to do next.

-Dbaud

The box is a:

# uname -a
FreeBSD mybox.example.com 10.3-RELEASE-p4 FreeBSD 10.3-RELEASE-p4 #0: Sat May 28 12:23:44 UTC 2016 r...@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

# kldstat
Id Refs Address    Size    Name
 1   34 0x8020     17bc6a8 kernel
 2    2 0x81c11000 114db   ipfw.ko
 3    1 0x81c23000 d32f    dummynet.ko
 4    1 0x81c31000 3831    ng_socket.ko
 5    9 0x81c35000 ba02    netgraph.ko
 6    1 0x81c41000 2b99    ng_mppc.ko
 7    1 0x81c44000 80c     rc4.ko
 8    1 0x81c45000 23dc    vmmemctl.ko
 9    1 0x81c48000 397d    ng_l2tp.ko
10    1 0x81c4c000 4b04    ng_ksocket.ko
11    1 0x81c51000 17d6    ng_tee.ko
12    1 0x81c53000 40d2    ng_iface.ko
13    1 0x81c58000 5829    ng_ppp.ko
14    1 0x81c5e000 18b1    ng_tcpmss.ko
15    1 0x81c6     2df7    ng_vjc.ko

=== First crash dump:

Jul 8 08:09:04 mybox syslogd: kernel boot file is /boot/kernel/kernel
Jul 8 08:09:04 mybox kernel:
Jul 8 08:09:04 mybox kernel:
Jul 8 08:09:04 mybox kernel: Fatal trap 12: page fault while in kernel mode
Jul 8 08:09:04 mybox kernel: cpuid = 1; apic id = 01
Jul 8 08:09:04 mybox kernel: fault virtual address = 0x28
Jul 8 08:09:04 mybox kernel: fault code = supervisor read data, page not present
Jul 8 08:09:04 mybox kernel: instruction pointer = 0x20:0x81c38283
Jul 8 08:09:04 mybox kernel: stack pointer = 0x28:0xfe0174d85540
Jul 8 08:09:04 mybox kernel: frame pointer = 0x28:0xfe0174d85580
Jul 8 08:09:04 mybox kernel: code segment = base 0x0, limit 0xf, type 0x1b
Jul 8 08:09:04 mybox kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
Jul 8 08:09:04 mybox kernel: processor eflags = interrupt enabled, resume, IOPL = 0
Jul 8 08:09:04 mybox kernel: current process = 628 (ng_queue2)
Jul 8 08:09:04 mybox kernel: trap number = 12
Jul 8 08:09:04 mybox kernel: panic: page fault
Jul 8 08:09:04 mybox kernel: cpuid = 1
Jul 8 08:09:04 mybox kernel: KDB: stack backtrace:
Jul 8 08:09:04 mybox kernel: #0 0x8098e390 at kdb_backtrace+0x60
Jul 8 08:09:04 mybox kernel: #1 0x80951066 at vpanic+0x126
Jul 8 08:09:04 mybox kernel: #2 0x80950f33 at panic+0x43
Jul 8 08:09:04 mybox kernel: #3 0x80d55f7b at trap_fatal+0x36b
Jul 8 08:09:04 mybox kernel: #4 0x80d5627d at trap_pfault+0x2ed
Jul 8 08:09:04 mybox kernel: #5 0x80d558fa at trap+0x47a
Jul 8 08:09:04 mybox kernel: #6 0x80d3b8d2 at calltrap+0x8
Jul 8 08:09:04 mybox kernel: #7 0x81c5e509 at ng_tcpmss_rcvdata+0x2d9
Jul 8 08:09:04 mybox kernel: #8 0x81c370ca at ng_apply_item+0x21a
Jul 8 08:09:04 mybox kernel: #9 0x81c36d1a at ng_snd_item+0x38a
Jul 8 08:09:04 mybox kernel: #10 0x81c5a1c8 at ng_ppp_comp_recv+0x148
Jul 8 08:09:04 mybox kernel: #11 0x81c370ca at ng_apply_item+0x21a
Jul 8 08:09:04 mybox kernel: #12 0x81c36d1a at ng_snd_item+0x38a
Jul 8 08:09:04 mybox kernel: #13 0x81c370ca at ng_apply_item+0x21a
Jul 8 08:09:04 mybox kernel: #14 0x81c36d1a at ng_snd_item+0x38a
Jul 8 08:09:04 mybox kernel: #15 0x81c370ca at ng_apply_item+0x21a
Jul 8 08:09:04 mybox kernel: #16 0x81c36d1a at ng_snd_item+0x38a
Jul 8 08:09:04 mybox kernel: #17 0x81c4d3e2 at ng_ksocket_incoming2+0x2f2
Jul 8 08:09:04 mybox kernel: Uptime: 5d17h47m38s
Jul 8 08:09:04 mybox kernel: Copyright (c) 1992-2016 The FreeBSD Project.
Jul 8 08:09:04 mybox kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Jul 8 08:09:04 mybox kernel: The Regents of the University of California. All rights reserved.
Jul 8 08:09:04 mybox kernel: FreeBSD is a registered trademark of The FreeBSD Foundation.
Jul 8 08:09:04 mybox kernel: FreeBSD 10.3-RELEASE-p4 #0: Sat May 28 12:23:44 UTC 2016
Jul 8 08:09:04 mybox kernel: r...@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
Jul 8 08:09:04 mybox kernel: FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
Jul 8 08:09:04 mybox kernel: CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.00-MHz K8-class CPU)
Jul 8 08:09:04 mybox kernel: Origin="GenuineIntel" Id=0x206d7 Family=0x6 Model=0x2d Stepping=7
Jul 8 08:09:04 mybox kernel: Features=0x1fa3fbff
Jul 8
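The reasoning in this message (the instruction pointer lies inside netgraph.ko) can be checked with a little arithmetic against the kldstat listing: an address belongs to a module when it is at or above the module's load address and below load address plus size. The sketch below is illustrative only; the `module_offset` name is made up, and it uses the addresses exactly as they appear in this post, assuming they fit in the shell's signed 64-bit arithmetic.

```shell
#!/bin/sh
# Sketch: test whether an instruction pointer falls inside a kernel module.
# module_offset IP BASE SIZE
#   IP and BASE are hex with a 0x prefix; SIZE is hex without a prefix,
#   as printed by kldstat. Prints the offset into the module, or fails.
module_offset() {
    ip=$(( $1 ))
    base=$(( $2 ))
    size=$(( 0x$3 ))
    if [ "$ip" -ge "$base" ] && [ "$ip" -lt $(( base + size )) ]; then
        printf '0x%x\n' $(( ip - base ))
    else
        return 1
    fi
}

# Values from the post: netgraph.ko is loaded at 0x81c35000 with size ba02.
# module_offset 0x81c38283 0x81c35000 ba02
```

For the first crash this prints 0x3283, so the faulting instruction is indeed within netgraph.ko; the same call with the ng_mppc.ko range (0x81c41000, size 2b99) fails, as expected.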