Re: help w/panic under heavy load - 5.4
Max Laier ([EMAIL PROTECTED]) wrote:
>
> Edwin, what do you have for CFLAGS? Can you try to downgrade to "-O" for now
> so that we have a better chance to get a full view?
>

Max,

I have no CFLAGS or COPTFLAGS in /etc/make.conf - this was a basic
kern-developer install on a blank PC. The only thing that's a little
different about the box that I use to compile is that it's a dual-processor
machine - but no -j# options were used in compilation of the kernel.

The compile is proceeding with the following as an example output from
make/cc:

$ grep netinet /tmp/make.DEBUG1.output | grep fastfwd
cc -c -O -pipe -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes
  -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual
  -fformat-extensions -std=c99 -g -nostdinc -I- -I. -I/usr/src/sys
  -I/usr/src/sys/contrib/dev/acpica -I/usr/src/sys/contrib/altq
  -I/usr/src/sys/contrib/ipfilter -I/usr/src/sys/contrib/pf
  -I/usr/src/sys/contrib/dev/ath -I/usr/src/sys/contrib/dev/ath/freebsd
  -I/usr/src/sys/contrib/ngatm -I/usr/src/sys/dev/twa -D_KERNEL
  -include opt_global.h -fno-common -finline-limit=8000
  --param inline-unit-growth=100 --param large-function-growth=1000
  -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow
  -mno-sse -mno-sse2 -ffreestanding -Werror
  /usr/src/sys/netinet/ip_fastfwd.c
$

Are you referring to the -fformat-extensions, -fno-common and -finline...etc
optimizations as well? Or just the -O v. -O2/-O3/-Os one? If yes to the -f*
optimizations - besides commenting out parts of the makefiles - is there a
'normal' way to disable them?

FWIW - I also had (I think) the same problem with the 5.3 release - but I
never worked it out - just other things on my plate, so I don't believe it's
a recent code change (ie. 5.4 timeframe) if it does turn out to be a code
change.

It also has something to do with the load on the box - I'm testing with
small udp packets (using iperf) - if I step up the size - I have to step up
the bandwidth in order to cause the panic.
___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: help w/panic under heavy load - 5.4
On Sunday 24 July 2005 17:42, Simon 'corecode' Schubert wrote: > On 24.07.2005, at 16:19, Edwin wrote: > > (kgdb) f 13 > > #13 0xc068f6e9 in ip_fastforward (m=0xc12e2300) at > > /usr/src/sys/netinet/ip_fastfwd.c:572 > > (kgdb) i loc > > ip = (struct ip *) 0xc12f000e > > m0 = (struct mbuf *) 0xc12f000e > > ro = {ro_rt = 0xc11ee420, ro_dst = {sa_len = 16 '\020', sa_family = 2 > > '\002', > > sa_data = "\000\000ˬ\002\005\000\000\000\000\000\000\000"}} > > dst = (struct sockaddr_in *) 0xc76bfc3c > > ia = (struct in_ifaddr *) 0x0 > > ifa = (struct ifaddr *) 0x0 > > ifp = (struct ifnet *) 0xc0f91800 > > odest = {s_addr = 84060352} > > dest = {s_addr = 84060352} > > sum = 0 > > ip_len = 0 > > error = 84060352 > > hlen = -1057417216 > > mtu = 0 > > __func__ = "ip_fastforward" > > error == 84060352 == dest.s_addr > hlen == -1057417216 == 0xc0f91800 == ifp > > > (kgdb) f 12 > > #12 0xc0692b74 in ip_fragment (ip=0xc12f000e, m_frag=0xc76bfc6c, > > mtu=-1056775680, if_hwassist_flags=0, sw_csum=1) > > at /usr/src/sys/netinet/ip_output.c:967 > > 967 m->m_next = m_copy(m0, off, len); > > (kgdb) i loc > > mhip = (struct ip *) 0xc102e240 > > m = (struct mbuf *) 0xc102e200 > > mhlen = 20 > > error = 0 > > hlen = 20 > > len = 1480 > > off = 1500 > > m0 = (struct mbuf *) 0xc12e2300 > > firstlen = 1480 > > mnext = (struct mbuf **) 0xc12e2304 > > nfrags = 1 > > mtu (parameter) == -1056775680 == 0xc102e200 == m > > your stack (or gdb) seems seriously broken Not necessarily. This can well be an effect of higher optimization levels. Edwin, what do you have for CFLAGS? Can you try to downgrade to "-O" for now so that we have a better chance to get a full view? -- /"\ Best regards, | [EMAIL PROTECTED] \ / Max Laier | ICQ #67774661 X http://pf4freebsd.love2party.net/ | [EMAIL PROTECTED] / \ ASCII Ribbon Campaign | Against HTML Mail and News pgpX2oM3GC74E.pgp Description: PGP signature
Re: help w/panic under heavy load - 5.4
On 24.07.2005, at 16:19, Edwin wrote:
> (kgdb) f 13
> #13 0xc068f6e9 in ip_fastforward (m=0xc12e2300) at
> /usr/src/sys/netinet/ip_fastfwd.c:572
> (kgdb) i loc
> ip = (struct ip *) 0xc12f000e
> m0 = (struct mbuf *) 0xc12f000e
> ro = {ro_rt = 0xc11ee420, ro_dst = {sa_len = 16 '\020', sa_family = 2
> '\002',
> sa_data = "\000\000À¨\002\005\000\000\000\000\000\000\000"}}
> dst = (struct sockaddr_in *) 0xc76bfc3c
> ia = (struct in_ifaddr *) 0x0
> ifa = (struct ifaddr *) 0x0
> ifp = (struct ifnet *) 0xc0f91800
> odest = {s_addr = 84060352}
> dest = {s_addr = 84060352}
> sum = 0
> ip_len = 0
> error = 84060352
> hlen = -1057417216
> mtu = 0
> __func__ = "ip_fastforward"

error == 84060352 == dest.s_addr
hlen == -1057417216 == 0xc0f91800 == ifp

> (kgdb) f 12
> #12 0xc0692b74 in ip_fragment (ip=0xc12f000e, m_frag=0xc76bfc6c,
> mtu=-1056775680, if_hwassist_flags=0, sw_csum=1)
> at /usr/src/sys/netinet/ip_output.c:967
> 967 m->m_next = m_copy(m0, off, len);
> (kgdb) i loc
> mhip = (struct ip *) 0xc102e240
> m = (struct mbuf *) 0xc102e200
> mhlen = 20
> error = 0
> hlen = 20
> len = 1480
> off = 1500
> m0 = (struct mbuf *) 0xc12e2300
> firstlen = 1480
> mnext = (struct mbuf **) 0xc12e2304
> nfrags = 1

mtu (parameter) == -1056775680 == 0xc102e200 == m

your stack (or gdb) seems seriously broken

cheers
  simon

-- 
Serve - BSD     +++  RENT this banner advert  +++    ASCII Ribbon   /"\
Work - Mac      +++  space for low $$$ NOW!1  +++      Campaign     \ /
Party Enjoy Relax   |   http://dragonflybsd.org      Against  HTML   \
Dude 2c 2 the max   !   http://golden-apple.biz       Mail + News   / \
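The equalities Simon points out can be checked mechanically: gdb prints int locals as signed 32-bit decimals, so a kernel pointer that ended up in an int's slot shows as a large negative number. A minimal sketch, assuming 32-bit i386 values as in this dump; the helper names as_kernel_address and show_both are made up for illustration and are not part of the kernel or gdb:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Reinterpret a signed 32-bit value, as gdb prints an int local, as the
 * unsigned 32-bit address it may really hold. C defines this conversion
 * as reduction modulo 2^32. (Hypothetical helper, not kernel code.)
 */
uint32_t as_kernel_address(int32_t printed)
{
	return (uint32_t)printed;
}

/* Print one value both ways, for comparison with the pointers in "i loc". */
void show_both(const char *name, int32_t printed)
{
	printf("%s = %ld = 0x%08lx\n", name, (long)printed,
	    (unsigned long)as_kernel_address(printed));
}
```

Calling show_both("hlen", -1057417216) gives 0xc0f91800, the value of ifp in frame 13, and show_both("mtu", -1056775680) gives 0xc102e200, the value of m in frame 12. That is consistent with the debugger reading these locals from the wrong place rather than with the variables really holding those numbers.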
Re: help w/panic under heavy load - 5.4
New kernel: ident D1-0723 (same as D1-0722 - but w/ IPFIREWALL* options removed) same traces asked for previously. Thanks again, /Edwin kgdb kernel.debug /usr/local/STORAGE/crash/vmcore.1 [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"] GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-marcel-freebsd". #0 doadump () at pcpu.h:159 159 __asm __volatile("movl %%fs:0,%0" : "=r" (td)); (kgdb) where #0 doadump () at pcpu.h:159 #1 0xc0460ef6 in db_fncall (dummy1=0, dummy2=0, dummy3=43, dummy4=0xc76bf9f4 "(úkÇ") at /usr/src/sys/ddb/db_command.c:531 #2 0xc0460d04 in db_command (last_cmdp=0xc08be624, cmd_table=0x0, aux_cmd_tablep=0xc083e324, aux_cmd_tablep_end=0xc083e340) at /usr/src/sys/ddb/db_command.c:349 #3 0xc0460dcc in db_command_loop () at /usr/src/sys/ddb/db_command.c:455 #4 0xc0462951 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:221 #5 0xc06277f2 in kdb_trap (type=3, code=0, tf=0xc76bfb30) at /usr/src/sys/kern/subr_kdb.c:468 #6 0xc07ad874 in trap (frame= {tf_fs = -949288936, tf_es = -1067319280, tf_ds = -1065287664, tf_edi = 1, tf_esi = -1065233792, tf_ebp = -949224592, tf_isp = -949224612, tf_ebx = -949224548, tf_edx = 0, tf_ecx = -1060921344, tf_eax = 18, tf_trapno = 3, tf_err = 0, tf_eip = -1067289229, tf_cs = -1065287672, tf_eflags = 658, tf_esp = -949224560, tf_ss = -1067377425}) at /usr/src/sys/i386/i386/trap.c:584 #7 0xc079deaa in calltrap () at /usr/src/sys/i386/i386/exception.s:140 #8 0xc76b0018 in ?? 
() #9 0xc0620010 in sched_runnable () at /usr/src/sys/kern/sched_4bsd.c:641 #10 0xc0611cef in panic (fmt=0xc081d280 "m_copym, offset > size of mbuf chain") at /usr/src/sys/kern/kern_shutdown.c:550 #11 0xc064172c in m_copym (m=0x0, off0=1500, len=1480, wait=1) at /usr/src/sys/kern/uipc_mbuf.c:385 #12 0xc0692b74 in ip_fragment (ip=0xc12f000e, m_frag=0xc76bfc6c, mtu=-1056775680, if_hwassist_flags=0, sw_csum=1) at /usr/src/sys/netinet/ip_output.c:967 #13 0xc068f6e9 in ip_fastforward (m=0xc12e2300) at /usr/src/sys/netinet/ip_fastfwd.c:572 #14 0xc0672759 in ether_demux (ifp=0xc0f9, m=0xc12e2300) at /usr/src/sys/net/if_ethersubr.c:770 #15 0xc06724f5 in ether_input (ifp=0xc0f9, m=0xc12e2300) at /usr/src/sys/net/if_ethersubr.c:631 #16 0xc070a9e7 in sis_rxeof (sc=0xc0f9) at /usr/src/sys/pci/if_sis.c:1636 #17 0xc070ae6f in sis_intr (arg=0xc0f9) at /usr/src/sys/pci/if_sis.c:1841 #18 0xc0600130 in ithread_loop (arg=0xc0ec6880) at /usr/src/sys/kern/kern_intr.c:547 #19 0xc05ff5a4 in fork_exit (callout=0xc06c , arg=0xc0ec6880, frame=0xc76bfd48) at /usr/src/sys/kern/kern_fork.c:791 #20 0xc079df0c in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:209 (kgdb) f 13 #13 0xc068f6e9 in ip_fastforward (m=0xc12e2300) at /usr/src/sys/netinet/ip_fastfwd.c:572 572 if (ip_fragment(ip, &m, mtu, ifp->if_hwassist, (kgdb) l 567 m->m_pkthdr.csum_flags |= CSUM_IP; 568 /* 569 * ip_fragment expects ip_len and ip_off in host byte 570 * order but returns all packets in network byte order 571 */ 572 if (ip_fragment(ip, &m, mtu, ifp->if_hwassist, 573 (~ifp->if_hwassist & CSUM_DELAY_IP))) { 574 goto drop; 575 } 576 KASSERT(m != NULL, ("null mbuf and no error")); (kgdb) i loc ip = (struct ip *) 0xc12f000e m0 = (struct mbuf *) 0xc12f000e ro = {ro_rt = 0xc11ee420, ro_dst = {sa_len = 16 '\020', sa_family = 2 '\002', sa_data = "\000\000À¨\002\005\000\000\000\000\000\000\000"}} dst = (struct sockaddr_in *) 0xc76bfc3c ia = (struct in_ifaddr *) 0x0 ifa = (struct ifaddr *) 0x0 ifp = (struct ifnet *) 
0xc0f91800 odest = {s_addr = 84060352} dest = {s_addr = 84060352} sum = 0 ip_len = 0 error = 84060352 hlen = -1057417216 mtu = 0 __func__ = "ip_fastforward" (kgdb) p *ip $1 = {ip_hl = 5, ip_v = 4, ip_tos = 0 '\0', ip_len = 10240, ip_id = 33436, ip_off = 0, ip_ttl = 64 '@', ip_p = 17 '\021', ip_sum = 59733, ip_src = {s_addr = 67479744}, ip_dst = {s_addr = 84060352}} (kgdb) p *m $2 = {m_hdr = {mh_next = 0x0, mh_nextpkt = 0x0, mh_data = 0xc12f000e "E", mh_len = 40, mh_flags = 3, mh_type = 1}, M_dat = {MH = {MH_pkthdr = {rcvif = 0xc0f9, len = 40, header = 0x0, csum_flags = 769, csum_data = 0, tags = {slh_first = 0x0}}, MH_dat = {MH_ext = {ext_buf = 0xc12f "", ext_free = 0, ext_args =
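For anyone reading the s_addr values in the output above: gdb shows the raw 32-bit word as the little-endian i386 reads it from memory, so the first octet of the address is in the low byte. A small sketch for decoding them, assuming a little-endian host; addr_octet and format_s_addr are illustrative names, not existing functions:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Return octet i (0 = first octet of the dotted quad) of an s_addr value
 * as gdb prints it on a little-endian host. (Illustrative helper.)
 */
unsigned addr_octet(uint32_t s_addr, int i)
{
	return (unsigned)((s_addr >> (8 * i)) & 0xffu);
}

/* Write the dotted-quad form of s_addr into buf (at most len bytes). */
void format_s_addr(uint32_t s_addr, char *buf, size_t len)
{
	snprintf(buf, len, "%u.%u.%u.%u",
	    addr_octet(s_addr, 0), addr_octet(s_addr, 1),
	    addr_octet(s_addr, 2), addr_octet(s_addr, 3));
}
```

With this, ip_dst = 84060352 decodes to 192.168.2.5, which matches the À¨\002\005 bytes in ro_dst.sa_data, and ip_src = 67479744 decodes to 192.168.5.4.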
Re: help w/panic under heavy load - 5.4
On Sunday 24 July 2005 04:38, Edwin wrote:
> If I understand correctly...(albeit an overly brief understanding :))
>
> 1. ethernet packet comes in - stuck into an mbuf
> 2. ether_demux calls ip_fastforward passing the mbuf struct
> 3. mbuf struct is copied/munged into ip struct by mtod
> 4. ntohs is called to change ip->ip_len to host byte order
>    incidentally - ip_len should be set to ntohs(ip->ip_len)
>    as well - it seems like neither one of those calls worked?
> 5. also - the call to set hlen to ip->ip_hl <<2 didn't work out well
>    either - right? since hlen = -1057417216, and i think it's
>    supposed to be 20 (5*4) - am I correct there as well?

4. and 5. are strange but not of too much significance. Given that we got
through the initial sanity checks and that neither is used further down, this
might just be an optimization effect. You could try to mark ip_len as
volatile.

> 6. due to ip->ip_len being in network byte order still a little
>    gremlin helps us to think we have a 10240 byte packet and we
>    need to fragment it...
> 7. in ip_fragment - ip->ip_len is still 10240 - so we assume that we
>    need to make several fragments - however, the mbuf is correct
>    (len = 40)
> 8. in ip_fragment - to create the 'second' fragment, we try to copy
>    1480 bytes @ offset 1500 out of the mbuf that only has a valid
>    data length of 40-bytes???

That's what happens, yes.

> Are we really looking for the cause of ip->ip_len not being in the correct
> order @ the right time then? - in that case - there's two possibilities
> that I see - and I don't think that ntohs not working (1) is too realistic,
> so I would suppose we are looking for what flipped it in the first place?
>
> 1. either ntohs didn't work for some reason, or
> 2. it was already in host order, and the ntohs call flipped it back to
>    network order

Neither seems very likely. My guess is really *something* along the way
messing things up - pfil is the only suspect I have, right now.

> If you feel that it's a ipfw/ipfil issue - I can easily take IPFIREWALL*
> options out of the kernel and build a new one - just give me about 15
> minutes.

Yes please and make sure it isn't loaded as a module either.

-- 
/"\  Best regards,                      | [EMAIL PROTECTED]
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | [EMAIL PROTECTED]
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News
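The numbers in steps 6 to 8 can be reproduced with a few lines of arithmetic. This is only a sketch under stated assumptions: an mtu of 1500 (what the route and interface report), a 20-byte header, and the general IP rule that fragment payloads are multiples of 8 bytes. The function names are invented for illustration, and copy_fits is a simplified stand-in for the check behind the m_copym panic, not the kernel's code:

```c
/*
 * Payload bytes per fragment: whatever fits after the header, rounded
 * down to a multiple of 8, since IP fragment offsets count 8-byte units.
 */
int frag_payload_len(int mtu, int hlen)
{
	return (mtu - hlen) & ~7;
}

/* Offset into the original packet where the second fragment's data starts. */
int second_frag_offset(int mtu, int hlen)
{
	return hlen + frag_payload_len(mtu, hlen);
}

/*
 * Simplified stand-in for the m_copym sanity check: copying len bytes
 * from offset off only works if the mbuf chain holds that much data.
 */
int copy_fits(int chain_len, int off, int len)
{
	return off >= 0 && len >= 0 && off + len <= chain_len;
}
```

With mtu = 1500 and hlen = 20 this yields len = 1480 and off = 1500, the same values kgdb shows in ip_fragment, and copy_fits(40, 1500, 1480) is false for the 40-byte mbuf, which is the situation the "offset > size of mbuf chain" panic reports.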
Re: help w/panic under heavy load - 5.4
Max/et.al.,

replies to your message in-line below...

If I understand correctly...(albeit an overly brief understanding :))

1. ethernet packet comes in - stuck into an mbuf
2. ether_demux calls ip_fastforward passing the mbuf struct
3. mbuf struct is copied/munged into ip struct by mtod
4. ntohs is called to change ip->ip_len to host byte order
   incidentally - ip_len should be set to ntohs(ip->ip_len)
   as well - it seems like neither one of those calls worked?
5. also - the call to set hlen to ip->ip_hl <<2 didn't work out well
   either - right? since hlen = -1057417216, and i think it's
   supposed to be 20 (5*4) - am I correct there as well?
6. due to ip->ip_len being in network byte order still a little
   gremlin helps us to think we have a 10240 byte packet and we
   need to fragment it...
7. in ip_fragment - ip->ip_len is still 10240 - so we assume that we
   need to make several fragments - however, the mbuf is correct
   (len = 40)
8. in ip_fragment - to create the 'second' fragment, we try to copy
   1480 bytes @ offset 1500 out of the mbuf that only has a valid
   data length of 40-bytes???

Are we really looking for the cause of ip->ip_len not being in the correct
order @ the right time then? - in that case - there's two possibilities
that I see - and I don't think that ntohs not working (1) is too realistic,
so I would suppose we are looking for what flipped it in the first place?

1. either ntohs didn't work for some reason, or
2. it was already in host order, and the ntohs call flipped it back to
   network order

If you feel that it's a ipfw/ipfil issue - I can easily take IPFIREWALL*
options out of the kernel and build a new one - just give me about 15
minutes.

cheers.
/edwin Max Laier ([EMAIL PROTECTED]) wrote: > On Saturday 23 July 2005 20:41, Edwin wrote: > > Kernel name: D1-0722 (for reference) > > > > mbsd05# kgdb kernel.debug /usr/local/STORAGE/crash/vmcore.5 > > #13 0xc06933c1 in ip_fastforward (m=0xc12e6c00) at > > /usr/src/sys/netinet/ip_fastfwd.c:572 warning: Source file is more recent > > than executable. > > Let's hope that's still correct ... > it is - result of manual patch application and removal - just the timestamp/dates on the file are different (verified by diff from clean source tree just now to make sure again. > > 572 if (ip_fragment(ip, &m, mtu, ifp->if_hwassist, > > (kgdb) l > > 567 m->m_pkthdr.csum_flags |= CSUM_IP; > > 568 /* > > 569 * ip_fragment expects ip_len and ip_off in > > host byte > > 570 * order but returns all packets in network > > byte order > > 571 */ > > 572 if (ip_fragment(ip, &m, mtu, ifp->if_hwassist, > > 573 (~ifp->if_hwassist & > > CSUM_DELAY_IP))) { > > 574 goto drop; > > 575 } > > 576 KASSERT(m != NULL, ("null mbuf and no error")); > > (kgdb) i loc > > ip = (struct ip *) 0xc12f700e > > m0 = (struct mbuf *) 0xc12f700e > > ro = {ro_rt = 0xc11f8420, ro_dst = {sa_len = 16 '\020', sa_family = 2 > > '\002', sa_data = "\000\000ˬ\002\005\000\000\000\000\000\000\000"}} > > dst = (struct sockaddr_in *) 0xc76bfc3c > > ia = (struct in_ifaddr *) 0x0 > > ifa = (struct ifaddr *) 0x0 > > ifp = (struct ifnet *) 0xc0f91800 > > odest = {s_addr = 84060352} > > dest = {s_addr = 84060352} > > sum = 0 > > ip_len = 0 > > This should not happen. ip_len is initialize from ntohs(ip->ip_len) and never > touched again. Anyway, let's look some more ... is it accurate to say that ip->ip_len is 10240 @ this point - but it should be 40? at line 542 of ip_fastfwd.c 1.17.2.7... the ip->ip_len <= mtu should eval to true and fall through to the true case - but it falls through to false (hence the ip_fragment section) - b/c it is still in network order? 
	if (ip->ip_len <= mtu ||
	    (ifp->if_hwassist & CSUM_FRAGMENT && (ip->ip_off & IP_DF) == 0)) {
		/*
		 * Restore packet header fields to original values
		 */
		ip->ip_len = htons(ip->ip_len);
		ip->ip_off = htons(ip->ip_off);
		/*
		 * Send off the packet via outgoing interface
		 */
		error = (*ifp->if_output)(ifp, m,
				(struct sockaddr *)dst, ro.ro_rt);
	} else {
		/*
		 * Handle EMSGSIZE with icmp reply needfrag for TCP MTU discovery
		 */
		if (ip->ip_off & IP_DF) {
			ipstat.ips_cantfrag++;
			icmp_error(m, ICMP_UNREACH, ICMP_UNREACH_NEEDFRAG,
				0, ifp);
			goto consumed;
		} else {
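To make the question about this branch concrete, here is a stripped-down version of the first condition only. It is a sketch, not the kernel code: the CSUM_FRAGMENT and IP_DF tests are reduced to two plain flags, and the function name is invented:

```c
/*
 * Simplified form of the first condition quoted above: the packet goes
 * out unfragmented when its host-order length fits the MTU, or when the
 * hardware can fragment and the DF bit is clear.
 * (Illustrative only; flag handling reduced to two booleans.)
 */
int fits_without_fragmenting(unsigned ip_len_host, unsigned mtu,
    int hw_can_fragment, int dont_fragment)
{
	return ip_len_host <= mtu || (hw_can_fragment && !dont_fragment);
}
```

With the values from the dump (if_hwassist = 0, ip_off = 0, mtu taken as 1500), a length of 40 takes the direct if_output path, while 10240 falls through to the else branch and, with DF clear, on to ip_fragment.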
Re: help w/panic under heavy load - 5.4
On Saturday 23 July 2005 20:41, Edwin wrote: > Kernel name: D1-0722 (for reference) > > mbsd05# kgdb kernel.debug /usr/local/STORAGE/crash/vmcore.5 > #13 0xc06933c1 in ip_fastforward (m=0xc12e6c00) at > /usr/src/sys/netinet/ip_fastfwd.c:572 warning: Source file is more recent > than executable. Let's hope that's still correct ... > 572 if (ip_fragment(ip, &m, mtu, ifp->if_hwassist, > (kgdb) l > 567 m->m_pkthdr.csum_flags |= CSUM_IP; > 568 /* > 569* ip_fragment expects ip_len and ip_off in > host byte > 570* order but returns all packets in network > byte order > 571*/ > 572 if (ip_fragment(ip, &m, mtu, ifp->if_hwassist, > 573 (~ifp->if_hwassist & > CSUM_DELAY_IP))) { > 574 goto drop; > 575 } > 576 KASSERT(m != NULL, ("null mbuf and no error")); > (kgdb) i loc > ip = (struct ip *) 0xc12f700e > m0 = (struct mbuf *) 0xc12f700e > ro = {ro_rt = 0xc11f8420, ro_dst = {sa_len = 16 '\020', sa_family = 2 > '\002', sa_data = "\000\000ˬ\002\005\000\000\000\000\000\000\000"}} > dst = (struct sockaddr_in *) 0xc76bfc3c > ia = (struct in_ifaddr *) 0x0 > ifa = (struct ifaddr *) 0x0 > ifp = (struct ifnet *) 0xc0f91800 > odest = {s_addr = 84060352} > dest = {s_addr = 84060352} > sum = 0 > ip_len = 0 This should not happen. ip_len is initialize from ntohs(ip->ip_len) and never touched again. Anyway, let's look some more ... > error = 84060352 > hlen = -1057417216 > mtu = 0 > __func__ = "ip_fastforward" > (kgdb) p *ip > $1 = {ip_hl = 5, ip_v = 4, ip_tos = 0 '\0', ip_len = 10240, ip_id = 61249, ip_len should be 40 as ip_len is supposed to be in HOST BYTE ORDER at this point. Feeding 10240 to ntohs() give the correct value, so something obviously went wrong. Let's see how we got here: 355 does the byteorder flip to host byte order 366 pfil OUT 451 pfil IN 527 first check ip_len < if_mtu etc ... Obviously, the only thing that might mess with the byte order (unless I missed something along the way) is one of the pfil consumers. *** *** What firewall(s) are you running with? 
*** > ip_off = 0, ip_ttl = 63 '?', ip_p = 17 '\021', ip_sum = 31921, ip_src = > {s_addr = 67479744}, ip_dst = {s_addr = 84060352}} (kgdb) p *m > $2 = {m_hdr = {mh_next = 0x0, mh_nextpkt = 0x0, mh_data = 0xc12f700e "E", > mh_len = 40, mh_flags = 3, mh_type = 1}, M_dat = {MH = {MH_pkthdr = {rcvif > = 0xc0f9, len = 40, header = 0x0, csum_flags = 769, csum_data = 0, tags 40, there you have it - no need to fragment at all! > /usr/src/sys/netinet/ip_output.c:967 > 967 m->m_next = m_copy(m0, off, len); > (kgdb) l > 962 len = ip->ip_len - off; > 963 m->m_flags |= M_LASTFRAG; > 964 } else > 965 mhip->ip_off |= IP_MF; > 966 mhip->ip_len = htons((u_short)(len + mhlen)); > 967 m->m_next = m_copy(m0, off, len); > 968 if (m->m_next == NULL) {/* copy failed */ > 969 m_free(m); > 970 error = ENOBUFS;/* ??? */ > 971 ipstat.ips_odropped++; Just to make sure, we didn't touch the original packet at this point so the above values are still the ones we based the (wrong) decision on. -- /"\ Best regards, | [EMAIL PROTECTED] \ / Max Laier | ICQ #67774661 X http://pf4freebsd.love2party.net/ | [EMAIL PROTECTED] / \ ASCII Ribbon Campaign | Against HTML Mail and News pgpcAgQ9uUdUi.pgp Description: PGP signature
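The relationship between 40 and 10240 is just a swap of the two bytes: 40 is 0x0028 and 10240 is 0x2800. A minimal sketch of that, assuming a little-endian host such as i386, where ntohs() and htons() both amount to this swap; swap16 is a made-up name used so the example does not depend on any header beyond stdint.h:

```c
#include <stdint.h>

/*
 * Swap the two bytes of a 16-bit value. On a little-endian host this is
 * the operation ntohs() and htons() both perform.
 */
uint16_t swap16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}
```

swap16(10240) is 40 and swap16(40) is 10240, and swapping twice returns the original value. So a length of 10240 at this point fits either a conversion that was skipped or one that was applied twice; the arithmetic alone cannot tell the two apart.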
Re: help w/panic under heavy load - 5.4
Max Laier ([EMAIL PROTECTED]) wrote: > On Saturday 23 July 2005 15:53, Edwin wrote: > > Can we see one complete picture, please. This includes: > > A trace > local vars in ip_fastforward including unfolded ip, m, ro.ro_rt and ifp. > local vars in ip_fragment(). > > Thanks. > > -- > /"\ Best regards, | [EMAIL PROTECTED] > \ / Max Laier | ICQ #67774661 > X http://pf4freebsd.love2party.net/ | [EMAIL PROTECTED] > / \ ASCII Ribbon Campaign | Against HTML Mail and News Absolutely ;) - Thanks for taking the time! cheers. /edwin Kernel name: D1-0722 (for reference) mbsd05# kgdb kernel.debug /usr/local/STORAGE/crash/vmcore.5 [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"] GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-marcel-freebsd". 
#0 doadump () at pcpu.h:159 159 __asm __volatile("movl %%fs:0,%0" : "=r" (td)); (kgdb) where #0 doadump () at pcpu.h:159 #1 0xc04611f6 in db_fncall (dummy1=0, dummy2=0, dummy3=43, dummy4=0xc76bf9f4 "(úkÇ") at /usr/src/sys/ddb/db_command.c:531 #2 0xc0461004 in db_command (last_cmdp=0xc08c9264, cmd_table=0x0, aux_cmd_tablep=0xc08483b8, aux_cmd_tablep_end=0xc08483d4) at /usr/src/sys/ddb/db_command.c:349 #3 0xc04610cc in db_command_loop () at /usr/src/sys/ddb/db_command.c:455 #4 0xc0462c51 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:221 #5 0xc0627af2 in kdb_trap (type=3, code=0, tf=0xc76bfb30) at /usr/src/sys/kern/subr_kdb.c:468 #6 0xc07b6394 in trap (frame= {tf_fs = -949288936, tf_es = -1067319280, tf_ds = -1065222128, tf_edi = 1, tf_esi = -1065197495, tf_ebp = -949224592, tf_isp = -949224612, tf_ebx = -949224548, tf_edx = 0, tf_ecx = -1060921344, tf_eax = 18, tf_trapno = 3, tf_err = 0, tf_eip = -1067288461, tf_cs = -1065222136, tf_eflags = 658, tf_esp = -949224560, tf_ss = -1067376657}) at /usr/src/sys/i386/i386/trap.c:584 #7 0xc07a69ca in calltrap () at /usr/src/sys/i386/i386/exception.s:140 #8 0xc76b0018 in ?? 
() #9 0xc0620010 in schedcpu () at /usr/src/sys/kern/sched_4bsd.c:461 #10 0xc0611fef in panic (fmt=0xc0820008 "default") at /usr/src/sys/kern/kern_shutdown.c:550 #11 0xc0641a2c in m_copym (m=0x0, off0=1500, len=1480, wait=1) at /usr/src/sys/kern/uipc_mbuf.c:385 #12 0xc069b694 in ip_fragment (ip=0xc12f700e, m_frag=0xc76bfc6c, mtu=-1056788992, if_hwassist_flags=0, sw_csum=1) at /usr/src/sys/netinet/ip_output.c:967 #13 0xc06933c1 in ip_fastforward (m=0xc12e6c00) at /usr/src/sys/netinet/ip_fastfwd.c:572 #14 0xc0672a59 in ether_demux (ifp=0xc0f9, m=0xc12e6c00) at /usr/src/sys/net/if_ethersubr.c:770 #15 0xc06727f5 in ether_input (ifp=0xc0f9, m=0xc12e6c00) at /usr/src/sys/net/if_ethersubr.c:631 #16 0xc0713507 in sis_rxeof (sc=0xc0f9) at /usr/src/sys/pci/if_sis.c:1636 #17 0xc071398f in sis_intr (arg=0xc0f9) at /usr/src/sys/pci/if_sis.c:1841 #18 0xc0600430 in ithread_loop (arg=0xc0ec6880) at /usr/src/sys/kern/kern_intr.c:547 #19 0xc05ff8a4 in fork_exit (callout=0xc060030c , arg=0xc0ec6880, frame=0xc76bfd48) at /usr/src/sys/kern/kern_fork.c:791 #20 0xc07a6a2c in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:209 (kgdb) f 13 #13 0xc06933c1 in ip_fastforward (m=0xc12e6c00) at /usr/src/sys/netinet/ip_fastfwd.c:572 warning: Source file is more recent than executable. 
572 if (ip_fragment(ip, &m, mtu, ifp->if_hwassist, (kgdb) l 567 m->m_pkthdr.csum_flags |= CSUM_IP; 568 /* 569 * ip_fragment expects ip_len and ip_off in host byte 570 * order but returns all packets in network byte order 571 */ 572 if (ip_fragment(ip, &m, mtu, ifp->if_hwassist, 573 (~ifp->if_hwassist & CSUM_DELAY_IP))) { 574 goto drop; 575 } 576 KASSERT(m != NULL, ("null mbuf and no error")); (kgdb) i loc ip = (struct ip *) 0xc12f700e m0 = (struct mbuf *) 0xc12f700e ro = {ro_rt = 0xc11f8420, ro_dst = {sa_len = 16 '\020', sa_family = 2 '\002', sa_data = "\000\000À¨\002\005\000\000\000\000\000\000\000"}} dst = (struct sockaddr_in *) 0xc76bfc3c ia = (struct in_ifaddr *) 0x0 ifa = (struct ifaddr *) 0x0 ifp = (struct ifnet *) 0xc0f91800 odest = {s_addr = 84060352} dest = {s_addr = 84060352} sum = 0 ip_len = 0 error = 84060352 hlen = -1057417216 mtu = 0 __func__ = "ip_fastforward" (kgdb) p *ip $1 = {ip_hl = 5, ip_v = 4,
Re: help w/panic under heavy load - 5.4
On Saturday 23 July 2005 15:53, Edwin wrote:

Can we see one complete picture, please. This includes:

A trace
local vars in ip_fastforward including unfolded ip, m, ro.ro_rt and ifp.
local vars in ip_fragment().

Thanks.

-- 
/"\  Best regards,                      | [EMAIL PROTECTED]
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | [EMAIL PROTECTED]
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News
Re: help w/panic under heavy load - 5.4
comments in-line. Giorgos Keramidas ([EMAIL PROTECTED]) wrote: > > This looks rather strange. ip_fastforward() should pass an mtu of 1500 > but somehow the negative strange value gets passed. It would be > interesting to see the value of ``mtu'' in frame 13 too, if you still > have this crash dump stored somewhere. included right below - i have a few other questions while i'm looking @ these - the value of hlen (looks from the code that it should be ip->ip_hl << 2 (5*4=20 right?) not some large -value - plus the value of hlen is sortof close to crazy mtu value - ip->ip_len = 10240 ??? not sure why this would be either - i think mtu should have a value of 1500 here - both the ifp struct and ro struct have values of 1500, but even if it did have a value of 1500 - since ip->ip_len = 10240 - it's still going to drop through to else of line 542 (ie. ip->ip_len <= mtu) which leaves either can't frag/drop or frag (where we are i think) - unless I'm missing something (kgdb) f 13 #13 0xc06933c1 in ip_fastforward (m=0xc12e6c00) at /usr/src/sys/netinet/ip_fastfwd.c:572 warning: Source file is more recent than executable. 572 if (ip_fragment(ip, &m, mtu, ifp->if_hwassist, (kgdb) i loc ip = (struct ip *) 0xc12f700e m0 = (struct mbuf *) 0xc12f700e ro = {ro_rt = 0xc11f8420, ro_dst = {sa_len = 16 '\020', sa_family = 2 '\002', sa_data = "\000\000�\002\005\000\000\000\000\000\000\000"}} dst = (struct sockaddr_in *) 0xc76bfc3c ia = (struct in_ifaddr *) 0x0 ifa = (struct ifaddr *) 0x0 ifp = (struct ifnet *) 0xc0f91800 odest = {s_addr = 84060352} dest = {s_addr = 84060352} sum = 0 ip_len = 0 error = 84060352 hlen = -1057417216 mtu = 0 __func__ = "ip_fastforward" (kgdb) p *ip $1 = {ip_hl = 5, ip_v = 4, ip_tos = 0 '\0', ip_len = 10240, ip_id = 61249, ip_off = 0, ip_ttl = 63 '?', ip_p = 17 '\021', ip_sum = 31921, ip_src = {s_addr = 67479744}, ip_dst = {s_addr = 84060352}} (kgdb) > > You are not running a kernel with optimization and/or architecture- > dependent optimization flags, right? 
>

ntiko - i have added CPU_GEODE/CPU_SOEKRIS to my config - but same crash on
the generic config as well..this is a soekris net4801 box (w/ geode proc -
i586). generic 'make buildkernel KERNCONF=D1-0722' command line (ie no other
make/compiler options).

mbsd05# diff /root/kernels/D1-0722 /root/kernels/GENERIC
21,22d20
< makeoptions DEBUG=-g
<
24c22
< #cpu I486_CPU
---
> cpu I486_CPU
26,27c24,25
< #cpu I686_CPU
< ident D1-0722
---
> cpu I686_CPU
> ident GENERIC
31,48d28
<
< options KDB
< options DDB
< options INVARIANTS
< options INVARIANT_SUPPORT
<
< options CPU_SOEKRIS
< options CPU_GEODE
<
< options HZ=1000
< options DEVICE_POLLING
<
< options IPFIREWALL
< options IPFIREWALL_VERBOSE
< options IPFIREWALL_VERBOSE_LIMIT
< options IPFIREWALL_DEFAULT_TO_ACCEPT
< options DUMMYNET
< options IPDIVERT
mbsd05#
Re: help w/panic under heavy load - 5.4
On 2005-07-22 17:53, Edwin <[EMAIL PROTECTED]> wrote: > > I also patched ip_fastforward.c w/ your patch - still a crash - still > same type bogus mtu value - a few lines from kgdb included @ end of > message. > (kgdb) f 13 > #13 0xc06933c1 in ip_fastforward (m=0xc12e6c00) at > /usr/src/sys/netinet/ip_fastfwd.c:572 > 572 if (ip_fragment(ip, &m, mtu, ifp->if_hwassist, > (kgdb) p ro.ro_rt->rt_rmx > $1 = {rmx_mtu = 1500, rmx_expire = 333905919, rmx_pksent = 3868} The route entry has mtu = 1500. > (kgdb) p *ifp > $3 = {if_softc = 0xc0f91800, if_link = {tqe_next = 0xc0f9, tqe_prev = > 0xc08ebe84}, > if_xname = "sis0", '\0' , if_dname = 0xc0f2ec2c "sis", > if_dunit = 0, > if_addrhead = {tqh_first = 0xc0ec, tqh_last = 0xc1040460}, if_klist = { > kl_lock = 0xc08e5a40, kl_list = {slh_first = 0x0}}, if_pcount = 0, > if_carp = 0x0, > if_bpf = 0x0, if_index = 1, if_timer = 5, if_nvlans = 0, if_flags = 34883, > if_capabilities = 72, if_capenable = 72, if_linkmib = 0x0, if_linkmiblen = > 0, > if_data = {ifi_type = 6 '\006', ifi_physical = 0 '\0', ifi_addrlen = 6 > '\006', > ifi_hdrlen = 18 '\022', ifi_link_state = 2 '\002', ifi_recvquota = 0 '\0', > ifi_xmitquota = 0 '\0', ifi_datalen = 80 'P', ifi_mtu = 1500, ifi_metric > = 0, The interface also has an mtu of 1500 (ifi_mtu in the last line above). > #10 0xc0611fef in panic (fmt=0xc0820008 "default") > at /usr/src/sys/kern/kern_shutdown.c:550 > #11 0xc0641a2c in m_copym (m=0x0, off0=1500, len=1480, wait=1) > at /usr/src/sys/kern/uipc_mbuf.c:385 > #12 0xc069b694 in ip_fragment (ip=0xc12f700e, m_frag=0xc76bfc6c, > mtu=-1056788992, > if_hwassist_flags=0, sw_csum=1) at /usr/src/sys/netinet/ip_output.c:967 > #13 0xc06933c1 in ip_fastforward (m=0xc12e6c00) at > /usr/src/sys/netinet/ip_fastfwd.c:572 This looks rather strange. ip_fastforward() should pass an mtu of 1500 but somehow the negative strange value gets passed. 
It would be interesting to see the value of ``mtu'' in frame 13 too, if you still have this crash dump stored somewhere. You are not running a kernel with optimization and/or architecture-dependent optimization flags, right?
Re: help w/panic under heavy load - 5.4
Hi Giorgos, I'm sorry - I have so many kernels I was trying - I belive I overwrote that particular kernel/kernel.debug set - so I created a new kernel as a baseline with the same options per my notes - and included the output from the crash below. It does crash in the same fashion, and the KGDB output shows an MTU of the same type value (-1056788992 v. (-1056787456). I also patched ip_fastforward.c w/ your patch - still a crash - still same type bogus mtu value - a few lines from kgdb included @ end of message. Thanks again, -Edwin the variables you were asking about from this crash. (kgdb) f 13 #13 0xc06933c1 in ip_fastforward (m=0xc12e6c00) at /usr/src/sys/netinet/ip_fastfwd.c:572 572 if (ip_fragment(ip, &m, mtu, ifp->if_hwassist, (kgdb) p ro.ro_rt->rt_rmx $1 = {rmx_mtu = 1500, rmx_expire = 333905919, rmx_pksent = 3868} (kgdb) p ifp $2 = (struct ifnet *) 0xc0f91800 (kgdb) p *ifp $3 = {if_softc = 0xc0f91800, if_link = {tqe_next = 0xc0f9, tqe_prev = 0xc08ebe84}, if_xname = "sis0", '\0' , if_dname = 0xc0f2ec2c "sis", if_dunit = 0, if_addrhead = {tqh_first = 0xc0ec, tqh_last = 0xc1040460}, if_klist = { kl_lock = 0xc08e5a40, kl_list = {slh_first = 0x0}}, if_pcount = 0, if_carp = 0x0, if_bpf = 0x0, if_index = 1, if_timer = 5, if_nvlans = 0, if_flags = 34883, if_capabilities = 72, if_capenable = 72, if_linkmib = 0x0, if_linkmiblen = 0, if_data = {ifi_type = 6 '\006', ifi_physical = 0 '\0', ifi_addrlen = 6 '\006', ifi_hdrlen = 18 '\022', ifi_link_state = 2 '\002', ifi_recvquota = 0 '\0', ifi_xmitquota = 0 '\0', ifi_datalen = 80 'P', ifi_mtu = 1500, ifi_metric = 0, ifi_baudrate = 1000, ifi_ipackets = 50, ifi_ierrors = 0, ifi_opackets = 3914, ifi_oerrors = 0, ifi_collisions = 0, ifi_ibytes = 6146, ifi_obytes = 213356, ifi_imcasts = 40, ifi_omcasts = 29, ifi_iqdrops = 0, ifi_noproto = 0, ifi_hwassist = 0, ifi_epoch = 0, ifi_lastchange = {tv_sec = 0, tv_usec = 0}}, if_multiaddrs = {tqh_first = 0xc0fab3e0, tqh_last = 0xc0fabcc0}, if_amcount = 0, if_output = 0xc0671e04 , 
if_input = 0xc0672598 , if_start = 0xc0713c10 , if_ioctl = 0xc071497c , if_watchdog = 0xc0714b04 , if_init = 0xc0713f60 , if_resolvemulti = 0xc0672e48 , if_spare1 = 0x0, if_spare2 = 0x0, if_spare3 = 0x0, if_spare_flags1 = 0, if_spare_flags2 = 0, if_snd = {ifq_head = 0x0, ifq_tail = 0x0, ifq_len = 0, ifq_maxlen = 127, ifq_drops = 0, ifq_mtx = { mtx_object = {lo_class = 0xc0880b3c, lo_name = 0xc0f9180c "sis0", lo_type = 0xc0829304 "if send queue", lo_flags = 196608, lo_list = { tqe_next = 0x0, tqe_prev = 0x0}, lo_witness = 0x0}, mtx_lock = 4, mtx_recurse = 0}, ifq_drv_head = 0x0, ifq_drv_tail = 0x0, ifq_drv_len = 0, ifq_drv_maxlen = 127, altq_type = 0, altq_flags = 1, altq_disc = 0x0, altq_ifp = 0xc0f91800, altq_enqueue = 0, altq_dequeue = 0, altq_request = 0, altq_clfier = 0x0, altq_classify = 0, altq_tbr = 0x0, altq_cdnr = 0x0}, if_broadcastaddr = 0xc07db600 "������", lltables = 0x0, if_label = 0x0, if_prefixhead = {tqh_first = 0x0, tqh_last = 0xc0f91968}, if_afdata = { 0x0 , 0xc0faaab0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, if_afdata_initialized = 1, if_afdata_mtx = {mtx_object = {lo_class = 0xc0880b3c, lo_name = 0xc08292f4 "if_afdata", lo_type = 0xc08292f4 "if_afdata", lo_flags = 196608, lo_list = {tqe_next = 0x0, tqe_prev = 0x0}, lo_witness = 0x0}, mtx_lock = 4, mtx_recurse = 0}, if_starttask = {ta_link = {stqe_next = 0x0}, ta_pending = 0, ta_priority = 0, ta_func = 0xc06711c0 , ta_context = 0xc0f91800}} (kgdb) for reference going forward - this kernel was named D1-0722, and I'm making cross correlations to save the kernels/debugs/cores. 
new kernel crash - all options compiled, sysctl ipff=1, polling not enabled fb54c# panic: m_copym, offset > size of mbuf chain KDB: enter: panic [thread pid 21 tid 100015 ] Stopped at kdb_enter+0x2b: nop db> where Tracing pid 21 tid 100015 td 0xc0ecc780 kdb_enter(c0821a6a) at kdb_enter+0x2b panic(c0826049,0,c076b79c,c102ae00,100) at panic+0xbb m_copym(0,5dc,5c8,1,14) at m_copym+0x60 ip_fragment(c12f700e,c76bfc6c,5dc,0,1) at ip_fragment+0x214 ip_fastforward(c12e6c00) at ip_fastforward+0x6ed ether_demux(c0f9,c12e6c00,3c,c0f8a8d8,a) at ether_demux+0x259 ether_input(c0f9,c12e6c00,c0f902d0,0,c08336ab) at ether_input+0x25d sis_rxeof(c0f9) at sis_rxeof+0x1ab sis_intr(c0f9) at sis_intr+0xf3 ithread_loop(c0ec6880,c76bfd48,c0ec6880,c060030c,0) at ithread_loop+0x124 fork_exit(c060030c,c0ec6880,c76bfd48) at fork_exit+0xa4 fork_trampoline() at fork_trampoline+0x8 --- trap 0x1, eip = 0, esp = 0xc76bfd7c, ebp = 0 --- db> (kgdb) where #0 doadump () at pcpu.h:159 #1 0xc04611f6 in db_fncall (dummy1=0, dummy2=0, dummy3=43, dummy4=0xc76bf9f4 "(�k�") at /usr/src/sys/ddb/db_command.c:531 #2 0xc0461004 in db_command (last_cmdp=0xc08c9264, cmd_table=0x
Re: help w/panic under heavy load - 5.4
On 2005-07-21 14:57, Giorgos Keramidas <[EMAIL PROTECTED]> wrote:
> On 2005-07-20 11:41, Edwin <[EMAIL PROTECTED]> wrote:
> > I'm trying to understand the particulars about this - I get the null pointer
> > part, but as to ip_fragment - it's fragmenting mbufs to handle ip packets
> > during switching? and its failing trying to copy data past the end of the
> > chain?
>
> ip_fastfwd() thinks that it should fragment the packet because it somehow
> calculates a bogus ``mtu'' value. See the mtu value in frame 12 of the stack
> trace below.
>
> > #10 0xc0611fef in panic (fmt=0xc0820008 "default") at
> > /usr/src/sys/kern/kern_shutdown.c:550
> > #11 0xc0641a2c in m_copym (m=0x0, off0=1500, len=1480, wait=1)
> > at /usr/src/sys/kern/uipc_mbuf.c:385
> > #12 0xc069b694 in ip_fragment (ip=0xc11bd80e, m_frag=0xc76bfc6c,
> > mtu=-1056787456,
> > if_hwassist_flags=0, sw_csum=1) at /usr/src/sys/netinet/ip_output.c:967
>
> The ``mtu'' is an extremely small integer value, which is definitely a problem
> here. Somehow, ip_fastforward() calculates a very wrong value for the
> ``mtu''.

The check for finding the right MTU in ip_output.c is a bit different, as it includes a check for RTF_UP:

    if (ip->ip_off & IP_DF) {
            error = EMSGSIZE;
            /*
             * This case can happen if the user changed the MTU
             * of an interface after enabling IP on it. Because
             * most netifs don't keep track of routes pointing to
             * them, there is no way for one to update all its
             * routes when the MTU is changed.
             */
            if ((ro->ro_rt->rt_flags & (RTF_UP | RTF_HOST)) &&
                (ro->ro_rt->rt_rmx.rmx_mtu > ifp->if_mtu)) {
                    ro->ro_rt->rt_rmx.rmx_mtu = ifp->if_mtu;
            }
            ipstat.ips_cantfrag++;
            goto bad;
    }

The check for RTF_UP doesn't exist in ip_fastfwd.c, except perhaps through the ip_findroute() call.
I'm probably confused, but it seems that ip_findroute() in ip_fastfwd.c may still return a route entry for a gateway that is not yet RTF_UP, since the check for the RTF_UP flag is not done for the dst.rt_gateway route entry too. This may be the cause of the invalid MTU value you're seeing.

Can you try the following patch for ip_fastfwd.c? The diff is also available online at:

http://people.freebsd.org/~keramida/diff/fastfwd-mtu.patch

%%%
Index: ip_fastfwd.c
===
RCS file: /home/ncvs/src/sys/netinet/ip_fastfwd.c,v
retrieving revision 1.28
diff -u -r1.28 ip_fastfwd.c
--- ip_fastfwd.c 4 May 2005 13:09:19 - 1.28
+++ ip_fastfwd.c 21 Jul 2005 14:38:35 -
@@ -537,12 +537,13 @@
 	}
 
 	/*
-	 * Check if packet fits MTU or if hardware will fragement for us
+	 * Check if packet fits MTU or if hardware will fragment for us.
+	 * If necessary, update the MTU of the route entry too.
 	 */
-	if (ro.ro_rt->rt_rmx.rmx_mtu)
-		mtu = min(ro.ro_rt->rt_rmx.rmx_mtu, ifp->if_mtu);
-	else
-		mtu = ifp->if_mtu;
+	if ((ro.ro_rt->rt_flags & (RTF_UP | RTF_HOST)) &&
+	    ro.ro_rt->rt_rmx.rmx_mtu > ifp->if_mtu)
+		ro.ro_rt->rt_rmx.rmx_mtu = ifp->if_mtu;
+	mtu = ifp->if_mtu;
 
 	if (ip->ip_len <= mtu ||
 	    (ifp->if_hwassist & CSUM_FRAGMENT && (ip->ip_off & IP_DF) == 0)) {
%%%

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
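To make the effect of the patch easier to compare, here is a minimal user-space sketch of the MTU selection before and after the change. The `sk_` structs and functions are hypothetical stand-ins (the real `struct rtentry` and `struct ifnet` are much larger), and the flag values are only meant to mirror RTF_UP and RTF_HOST; treat them as illustrative.

```c
#include <assert.h>

/* Hypothetical stand-ins for the few kernel fields involved. */
#define SK_RTF_UP   0x1   /* illustrative, mirrors RTF_UP */
#define SK_RTF_HOST 0x4   /* illustrative, mirrors RTF_HOST */

struct sk_route { int flags; unsigned long rmx_mtu; };
struct sk_ifnet { unsigned long if_mtu; };

/* Pre-patch logic: use the route MTU when set, capped by the interface MTU. */
unsigned long
sk_mtu_before(const struct sk_route *rt, const struct sk_ifnet *ifp)
{
    if (rt->rmx_mtu != 0)
        return (rt->rmx_mtu < ifp->if_mtu ? rt->rmx_mtu : ifp->if_mtu);
    return (ifp->if_mtu);
}

/* Patched logic: clamp a too-large route MTU, then always use the interface MTU. */
unsigned long
sk_mtu_after(struct sk_route *rt, const struct sk_ifnet *ifp)
{
    if ((rt->flags & (SK_RTF_UP | SK_RTF_HOST)) != 0 &&
        rt->rmx_mtu > ifp->if_mtu)
        rt->rmx_mtu = ifp->if_mtu;
    return (ifp->if_mtu);
}
```

With the values Edwin reports elsewhere in this thread (rmx_mtu = 1500, ifi_mtu = 1500), both variants yield 1500, which would be consistent with the patch not changing the outcome. Note that the two variants differ when the route MTU is smaller than the interface MTU: the pre-patch code returns the route MTU, the patched code returns the interface MTU.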
Re: help w/panic under heavy load - 5.4
On 2005-07-20 11:41, Edwin <[EMAIL PROTECTED]> wrote: > I'm trying to understand the particulars about this - I get the null pointer > part, but as to ip_fragment - it's fragmenting mbufs to handle ip packets > during switching? and its failing trying to copy data past the end of the > chain? ip_fastfwd() thinks that it should fragment the packet because it somehow calculates a bogus ``mtu'' value. See the mtu value in frame 12 of the stack trace below. > mbsd05# kgdb kernel.debug /tmp/crash/vmcore.3 > [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: > Undefined symbol "ps_pglobal_lookup"] > GNU gdb 6.1.1 [FreeBSD] > Copyright 2004 Free Software Foundation, Inc. > GDB is free software, covered by the GNU General Public License, and you are > welcome to change it and/or distribute copies of it under certain conditions. > Type "show copying" to see the conditions. > There is absolutely no warranty for GDB. Type "show warranty" for details. > This GDB was configured as "i386-marcel-freebsd". 
> #0 doadump () at pcpu.h:159 > 159 __asm __volatile("movl %%fs:0,%0" : "=r" (td)); > (kgdb) where > #0 doadump () at pcpu.h:159 > #1 0xc04611f6 in db_fncall (dummy1=0, dummy2=0, dummy3=-1, dummy4=0xc76bf9f4 > "(οΏ½kοΏ½") > at /usr/src/sys/ddb/db_command.c:531 > #2 0xc0461004 in db_command (last_cmdp=0xc08c9264, cmd_table=0x0, > aux_cmd_tablep=0xc08483b8, > aux_cmd_tablep_end=0xc08483d4) at /usr/src/sys/ddb/db_command.c:349 > #3 0xc04610cc in db_command_loop () at /usr/src/sys/ddb/db_command.c:455 > #4 0xc0462c51 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:221 > #5 0xc0627af2 in kdb_trap (type=3, code=0, tf=0xc76bfb30) at > /usr/src/sys/kern/subr_kdb.c:468 > #6 0xc07b6394 in trap (frame= > {tf_fs = -949288936, tf_es = -1067319280, tf_ds = -1065222128, tf_edi = > 1, tf_esi = -1065 > 197495, tf_ebp = -949224592, tf_isp = -949224612, tf_ebx = -949224548, tf_edx > = 0, tf_ecx = -10 > 60921344, tf_eax = 18, tf_trapno = 3, tf_err = 0, tf_eip = -1067288461, tf_cs > = -1065222136, tf_eflags = 658, tf_esp = -949224560, tf_ss = -1067376657}) at > /usr/src/sys/i386/i386/trap.c:584 > #7 0xc07a69ca in calltrap () at /usr/src/sys/i386/i386/exception.s:140 > #8 0xc76b0018 in ?? () > #9 0xc0620010 in schedcpu () at /usr/src/sys/kern/sched_4bsd.c:461 > #10 0xc0611fef in panic (fmt=0xc0820008 "default") at > /usr/src/sys/kern/kern_shutdown.c:550 > #11 0xc0641a2c in m_copym (m=0x0, off0=1500, len=1480, wait=1) > at /usr/src/sys/kern/uipc_mbuf.c:385 > #12 0xc069b694 in ip_fragment (ip=0xc11bd80e, m_frag=0xc76bfc6c, > mtu=-1056787456, > if_hwassist_flags=0, sw_csum=1) at /usr/src/sys/netinet/ip_output.c:967 The ``mtu'' is an extremely small integer value, which is definitely a problem here. Somehow, ip_fastforward() calculates a very wrong value for the ``mtu''. 
> #13 0xc06933c1 in ip_fastforward (m=0xc11ab100) at
> /usr/src/sys/netinet/ip_fastfwd.c:572

If you have this particular crash dump, can you show me a dump of the ``ro.ro_rt->rt_rmx'' and the ``ifp'' structure that ip_fastforward() is using? One of these two seems to have an invalid mtu value.
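One quick sanity check on a value like mtu=-1056787456 is to reinterpret it as an unsigned 32-bit pattern and see whether it looks like an address rather than a size. The sketch below does that; the helper names are hypothetical, and the 0xc0000000 threshold is an assumption about where the i386 kernel is mapped on this system.

```c
#include <stdint.h>

/* Reinterpret a signed 32-bit value as its unsigned bit pattern. */
uint32_t
sk_as_u32(int32_t v)
{
    return ((uint32_t)v);
}

/*
 * Hypothetical heuristic: assume kernel virtual addresses on this i386
 * box start at 0xc0000000, so anything at or above that is "pointer-like".
 */
int
sk_looks_like_kernel_addr(int32_t v)
{
    return (sk_as_u32(v) >= 0xc0000000u);
}
```

Here -1056787456 corresponds to 0xc102b400. In the DDB trace posted for the same crash, panic() is shown with c102b400 as its fourth argument and ip_fragment() with 5dc (1500) as its third, so it is at least possible that kgdb is displaying a stale stack slot instead of the real ``mtu''. That is a guess, not something the thread confirms.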
Re: help w/panic under heavy load - 5.4
Giorgos/John/et al. :)

I have compiled/tested/traced about 15 separate kernels for this, and am happy to provide crashdumps/etc. to anyone interested :)

I decided to start over: create a GENERIC kernel (w/ DDB/KDB/INVARIANTS/INVARIANT_SUPPORT) and see whether I could reproduce the problem more specifically.

Just using the GENERIC w/ debug kernel, I did make it crash, although it took some handholding: lots of throwing packets at it and running processes on the box, for about 5-10 minutes. I didn't really try to reproduce it, since it wasn't the fast panic that I was concerned about before. I've included the panic below anyhow.

What I did notice was that w/o any other options, and with ip.fastforwarding turned on via sysctl, the crash was consistently reproducible with the (pretty much) generic kernel, with basically the same kernel traces as before. I also received an 'interrupt storm' message on the console in the ip.fastforwarding run; I have seen that a few times in the past, when polling was not enabled, before it panic'd.

I welcome all comments/thoughts/directions - happy to poke/prod/compile/debug - I just really don't know where to go from here.

Thanks for your help!
/Edwin Kernel: DDB8-GENDBG (GENERIC + options DDB/KDB/INVARIANTS/INVARIANT_SUPPORT) sysctl: ip.fastforwarding=0 <--- turned off ospfd# panic: m_copym, offset > size of mbuf chain KDB: enter: panic [thread pid 27 tid 100021 ] Stopped at kdb_enter+0x2b: nop db> where Tracing pid 27 tid 100021 td 0xc0ed0180 kdb_enter(c0821a6a) at kdb_enter+0x2b panic(c0826049,0,c076b79c,c102bb00,100) at panic+0xbb m_copym(0,5dc,5c8,1,14) at m_copym+0x60 ip_fragment(c124100e,c76d1a04,5dc,0,1) at ip_fragment+0x214 ip_output(c1201200,0,c76d19d0,1,0,0) at ip_output+0x74c ip_forward(c1201200,0) at ip_forward+0x2d4 ip_input(c1201200) at ip_input+0x4a7 netisr_processqueue(c08ec138) at netisr_processqueue+0x6e swi_net(0) at swi_net+0xc2 ithread_loop(c0ec6580,c76d1d48,c0ec6580,c060030c,0) at ithread_loop+0x124 fork_exit(c060030c,c0ec6580,c76d1d48) at fork_exit+0xa4 fork_trampoline() at fork_trampoline+0x8 --- trap 0x1, eip = 0, esp = 0xc76d1d7c, ebp = 0 --- db> call doadump Dumping 128 MB 16 32 48 64 80 96 112 Dump complete 0xf db> Kernel: DDB8-GENDBG (GENERIC + options DDB/KDB/INVARIANTS/INVARIANT_SUPPORT) Sysctl: ip.fastforwarding=1 fb54c# Interrupt storm detected on "irq10: sis0 sis1+"; throttling interrupt source fb54c# fb54c# fb54c# fb54c# panic: m_copym, offset > size of mbuf chain KDB: enter: panic [thread pid 21 tid 100015 ] Stopped at kdb_enter+0x2b: nop db> where Tracing pid 21 tid 100015 td 0xc0ecc780 kdb_enter(c08165b2) at kdb_enter+0x2b panic(c081ab91,0,c0760a0c,c1028800,100) at panic+0xbb m_copym(0,5dc,5c8,1,14) at m_copym+0x60 ip_fragment(c121880e,c76bfc6c,5dc,0,1) at ip_fragment+0x214 ip_fastforward(c11f2600) at ip_fastforward+0x6ed ether_demux(c0f9,c11f2600,52,c0f8b8d8,a) at ether_demux+0x259 ether_input(c0f9,c11f2600,c0f902cc,0,c0826fc6) at ether_input+0x25d sis_rxeof(c0f9) at sis_rxeof+0x18b sis_intr(c0f9) at sis_intr+0xa3 ithread_loop(c0ec6880,c76bfd48,c0ec6880,c05feb3c,0) at ithread_loop+0x124 fork_exit(c05feb3c,c0ec6880,c76bfd48) at fork_exit+0xa4 fork_trampoline() at 
fork_trampoline+0x8 --- trap 0x1, eip = 0, esp = 0xc76bfd7c, ebp = 0 --- db> doadump No such command db> call doadump Dumping 128 MB 16 32 48 64 80 96 112 Dump complete 0xf db> reset . Giorgos Keramidas ([EMAIL PROTECTED]) wrote: > On 2005-07-19 22:03, Edwin <[EMAIL PROTECTED]> wrote: > > Hi John, > > > > Updated the kernel, same crash under load, looks like m is null, you're > > right. > > > > Not quite sure where to go from here. I'm happy to do the footwork - just > > still real > > hazy on the BSD kernel part of things. > > > > panic: m_copym, offset > size of mbuf chain > > KDB: enter: panic > > [thread pid 27 tid 100021 ] > > Stopped at kdb_enter+0x2b: nop > > db> where > > Tracing pid 27 tid 100021 td 0xc0ed0180 > > kdb_enter(c0821a6a) at kdb_enter+0x2b > > panic(c0826049,0,c076b79c,c102d600,100) at panic+0xbb > > m_copym(0,5dc,5c8,1,14) at m_copym+0x60 > > ip_fragment(c123180e,c76d1c38,5dc,0,1) at ip_fragment+0x214 > > ip_fastforward(c11fee00) at ip_fastforward+0x6ed > > ether_demux(c0f9,c11fee00,52,c0f8aad0,1f) at ether_demux+0x259 > > ether_input(c0f9,c11fee00,c0f902d0,0,c08336ab) at ether_input+0x25d > > sis_rxeof(c0f9,1,5,c08e5500,c76d1ce0) at sis_rxeof+0x1ab > > sis_poll(c0f9,0,5) at sis_poll+0x7f > > netisr_poll(0) at netisr_poll+0x188 > > swi_net(0) at swi_net+0x81 > > ithread_loop(c0ec6580,c76d1d48,c0ec6580,c060030c,0) at ithread_loop+0x124 > > fork_exit(c060030c,c0ec6580,c76d1d48) at fork_exit+0xa4 > > fork_trampoline() at fork_trampoline+0x8 > > --- trap 0x1, eip = 0, esp = 0xc76d1d7c, ebp = 0 --- > > Both tracebacks contain sis_poll() somewhere in the call stack? Are you > using POLLING? If yes, can you try without POLLING and see if the crash > can still be reproduced? > >
Re: help w/panic under heavy load - 5.4
Hi Giorgos,

Yes - I'm using polling, but it still panics even w/ polling disabled or not compiled in. Still reproducible - same scenario (high load - actually, not even really high load, just relative load - small network packets).

I did both (output included below):
- disable polling via sysctl
- re-compile new kernel w/o option

It appears to still be the same error - the traces are the same w/ the exception of sis_poll versus sis_intr.

I have tried various different options in my kernel before posting - w/ and w/o ipff, ipfw, polling - and it didn't seem to make a difference. But then again, I wasn't getting traces from DDB w/ INVARIANTS, so I'm not sure.

I'm trying to understand the particulars about this - I get the null pointer part, but as to ip_fragment - it's fragmenting mbufs to handle ip packets during switching? And it's failing trying to copy data past the end of the chain?

Thanks!
/edwin

Giorgos Keramidas ([EMAIL PROTECTED]) wrote:
> > ether_input(c0f9,c11fee00,c0f902d0,0,c08336ab) at ether_input+0x25d
> > sis_rxeof(c0f9,1,5,c08e5500,c76d1ce0) at sis_rxeof+0x1ab
> > sis_poll(c0f9,0,5) at sis_poll+0x7f
> > netisr_poll(0) at netisr_poll+0x188
> > swi_net(0) at swi_net+0x81
> > ithread_loop(c0ec6580,c76d1d48,c0ec6580,c060030c,0) at ithread_loop+0x124
> > fork_exit(c060030c,c0ec6580,c76d1d48) at fork_exit+0xa4
> > fork_trampoline() at fork_trampoline+0x8
> > --- trap 0x1, eip = 0, esp = 0xc76d1d7c, ebp = 0 ---
>
> Both tracebacks contain sis_poll() somewhere in the call stack? Are you
> using POLLING? If yes, can you try without POLLING and see if the crash
> can still be reproduced?
> > - Giorgos > DDB output from disabling polling via sysctl - trace fb54c# sysctl kern.polling.enable=0 kern.polling.enable: 1 -> 0 fb54c# panic: m_copym, offset > size of mbuf chain KDB: enter: panic [thread pid 21 tid 100015 ] Stopped at kdb_enter+0x2b: nop db> where Tracing pid 21 tid 100015 td 0xc0ecc780 kdb_enter(c0821a6a) at kdb_enter+0x2b panic(c0826049,0,c076b79c,c102b400,100) at panic+0xbb m_copym(0,5dc,5c8,1,14) at m_copym+0x60 ip_fragment(c11bd80e,c76bfc6c,5dc,0,1) at ip_fragment+0x214 ip_fastforward(c11ab100) at ip_fastforward+0x6ed ether_demux(c0f9,c11ab100,52,c0f8abc0,29) at ether_demux+0x259 ether_input(c0f9,c11ab100,c0f902d0,0,c08336ab) at ether_input+0x25d sis_rxeof(c0f9) at sis_rxeof+0x1ab sis_intr(c0f9) at sis_intr+0xf3 ithread_loop(c0ec6880,c76bfd48,c0ec6880,c060030c,0) at ithread_loop+0x124 fork_exit(c060030c,c0ec6880,c76bfd48) at fork_exit+0xa4 fork_trampoline() at fork_trampoline+0x8 --- trap 0x1, eip = 0, esp = 0xc76bfd7c, ebp = 0 --- db> call doadump Dumping 128 MB 16 32 48 64 80 96 112 Dump complete 0xf db> mbsd05# kgdb kernel.debug /tmp/crash/vmcore.3 [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"] GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-marcel-freebsd". 
#0 doadump () at pcpu.h:159 159 __asm __volatile("movl %%fs:0,%0" : "=r" (td)); (kgdb) where #0 doadump () at pcpu.h:159 #1 0xc04611f6 in db_fncall (dummy1=0, dummy2=0, dummy3=-1, dummy4=0xc76bf9f4 "(�k�") at /usr/src/sys/ddb/db_command.c:531 #2 0xc0461004 in db_command (last_cmdp=0xc08c9264, cmd_table=0x0, aux_cmd_tablep=0xc08483b8, aux_cmd_tablep_end=0xc08483d4) at /usr/src/sys/ddb/db_command.c:349 #3 0xc04610cc in db_command_loop () at /usr/src/sys/ddb/db_command.c:455 #4 0xc0462c51 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:221 #5 0xc0627af2 in kdb_trap (type=3, code=0, tf=0xc76bfb30) at /usr/src/sys/kern/subr_kdb.c:468 #6 0xc07b6394 in trap (frame= {tf_fs = -949288936, tf_es = -1067319280, tf_ds = -1065222128, tf_edi = 1, tf_esi = -1065 197495, tf_ebp = -949224592, tf_isp = -949224612, tf_ebx = -949224548, tf_edx = 0, tf_ecx = -10 60921344, tf_eax = 18, tf_trapno = 3, tf_err = 0, tf_eip = -1067288461, tf_cs = -1065222136, tf_eflags = 658, tf_esp = -949224560, tf_ss = -1067376657}) at /usr/src/sys/i386/i386/trap.c:584 #7 0xc07a69ca in calltrap () at /usr/src/sys/i386/i386/exception.s:140 #8 0xc76b0018 in ?? () #9 0xc0620010 in schedcpu () at /usr/src/sys/kern/sched_4bsd.c:461 #10 0xc0611fef in panic (fmt=0xc0820008 "default") at /usr/src/sys/kern/kern_shutdown.c:550 #11 0xc0641a2c in m_copym (m=0x0, off0=1500, len=1480, wait=1) at /usr/src/sys/kern/uipc_mbuf.c:385 #12 0xc069b694 in ip_fragment (ip=0xc11bd80e, m_frag=0xc76bfc6c, mtu=-1056787456, if_hwassist_flags=0, sw_csum=1) at /usr/src/sys/netinet/ip_output.c:967 6933c1 in ip_fastforward (m=0xc11ab100) at /usr/src/sys/netinet/ip_fastfwd.c:572 #14 0xc0672a59 in ether_de
Re: help w/panic under heavy load - 5.4
On 2005-07-19 22:03, Edwin <[EMAIL PROTECTED]> wrote:
> Hi John,
>
> Updated the kernel, same crash under load, looks like m is null, you're right.
>
> Not quite sure where to go from here. I'm happy to do the footwork - just
> still real hazy on the BSD kernel part of things.
>
> panic: m_copym, offset > size of mbuf chain
> KDB: enter: panic
> [thread pid 27 tid 100021 ]
> Stopped at kdb_enter+0x2b: nop
> db> where
> Tracing pid 27 tid 100021 td 0xc0ed0180
> kdb_enter(c0821a6a) at kdb_enter+0x2b
> panic(c0826049,0,c076b79c,c102d600,100) at panic+0xbb
> m_copym(0,5dc,5c8,1,14) at m_copym+0x60
> ip_fragment(c123180e,c76d1c38,5dc,0,1) at ip_fragment+0x214
> ip_fastforward(c11fee00) at ip_fastforward+0x6ed
> ether_demux(c0f9,c11fee00,52,c0f8aad0,1f) at ether_demux+0x259
> ether_input(c0f9,c11fee00,c0f902d0,0,c08336ab) at ether_input+0x25d
> sis_rxeof(c0f9,1,5,c08e5500,c76d1ce0) at sis_rxeof+0x1ab
> sis_poll(c0f9,0,5) at sis_poll+0x7f
> netisr_poll(0) at netisr_poll+0x188
> swi_net(0) at swi_net+0x81
> ithread_loop(c0ec6580,c76d1d48,c0ec6580,c060030c,0) at ithread_loop+0x124
> fork_exit(c060030c,c0ec6580,c76d1d48) at fork_exit+0xa4
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xc76d1d7c, ebp = 0 ---

Both tracebacks contain sis_poll() somewhere in the call stack? Are you using POLLING? If yes, can you try without POLLING and see if the crash can still be reproduced?

- Giorgos
Re: help w/panic under heavy load - 5.4
Hi John, Updated the kernel, same crash under load, looks like m is null, you're right. Not quite sure where to go from here. I'm happy to do the footwork - just still real hazy on the BSD kernel part of things. Thanks for the help! /Edwin Results from KDB/DDB/INVARIANTS/INVARIANT_SUPPORT - same crash (ddb and kdb output) panic: m_copym, offset > size of mbuf chain KDB: enter: panic [thread pid 27 tid 100021 ] Stopped at kdb_enter+0x2b: nop db> where Tracing pid 27 tid 100021 td 0xc0ed0180 kdb_enter(c0821a6a) at kdb_enter+0x2b panic(c0826049,0,c076b79c,c102d600,100) at panic+0xbb m_copym(0,5dc,5c8,1,14) at m_copym+0x60 ip_fragment(c123180e,c76d1c38,5dc,0,1) at ip_fragment+0x214 ip_fastforward(c11fee00) at ip_fastforward+0x6ed ether_demux(c0f9,c11fee00,52,c0f8aad0,1f) at ether_demux+0x259 ether_input(c0f9,c11fee00,c0f902d0,0,c08336ab) at ether_input+0x25d sis_rxeof(c0f9,1,5,c08e5500,c76d1ce0) at sis_rxeof+0x1ab sis_poll(c0f9,0,5) at sis_poll+0x7f netisr_poll(0) at netisr_poll+0x188 swi_net(0) at swi_net+0x81 ithread_loop(c0ec6580,c76d1d48,c0ec6580,c060030c,0) at ithread_loop+0x124 fork_exit(c060030c,c0ec6580,c76d1d48) at fork_exit+0xa4 fork_trampoline() at fork_trampoline+0x8 --- trap 0x1, eip = 0, esp = 0xc76d1d7c, ebp = 0 --- db> call doadump Dumping 128 MB 16 32 48 64 80 96 112 Dump complete 0xf db> reset mbsd05# kgdb kernel.debug /tmp/crash/vmcore.1 [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"] GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-marcel-freebsd". 
#0 doadump () at pcpu.h:159 159 __asm __volatile("movl %%fs:0,%0" : "=r" (td)); (kgdb) where #0 doadump () at pcpu.h:159 #1 0xc04611f6 in db_fncall (dummy1=0, dummy2=0, dummy3=43, dummy4=0xc76d19c0 "�\031m�") at /usr/src/sys/ddb/db_command.c:531 #2 0xc0461004 in db_command (last_cmdp=0xc08c9264, cmd_table=0x0, aux_cmd_tablep=0xc08483b8, aux_cmd_tablep_end=0xc08483d4) at /usr/src/sys/ddb/db_command.c:349 #3 0xc04610cc in db_command_loop () at /usr/src/sys/ddb/db_command.c:455 #4 0xc0462c51 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:221 #5 0xc0627af2 in kdb_trap (type=3, code=0, tf=0xc76d1afc) at /usr/src/sys/kern/subr_kdb.c:468 #6 0xc07b6394 in trap (frame= {tf_fs = -949157864, tf_es = -1067319280, tf_ds = -1065222128, tf_edi = 1, tf_esi = - 1065197495, tf_ebp = -949150916, tf_isp = -949150936, tf_ebx = -949150872, tf_edx = 0, tf_e cx = -1060921344, tf_eax = 18, tf_trapno = 3, tf_err = 0, tf_eip = -1067288461, tf_cs = -1065222136, tf_eflags = 646, tf_esp = -949150884, tf_ss = -1067376657}) at /usr/src/sys/i386/i386/trap.c:584 #7 0xc07a69ca in calltrap () at /usr/src/sys/i386/i386/exception.s:140 #8 0xc76d0018 in ?? 
() #9 0xc0620010 in schedcpu () at /usr/src/sys/kern/sched_4bsd.c:461 #10 0xc0611fef in panic (fmt=0xc0820008 "default") at /usr/src/sys/kern/kern_shutdown.c:550 #11 0xc0641a2c in m_copym (m=0x0, off0=1500, len=1480, wait=1) at /usr/src/sys/kern/uipc_mbuf.c:385 #12 0xc069b694 in ip_fragment (ip=0xc123180e, m_frag=0xc76d1c38, mtu=-1056778752, if_hwassist_flags=0, sw_csum=1) at /usr/src/sys/netinet/ip_output.c:967 #13 0xc06933c1 in ip_fastforward (m=0xc11fee00) at /usr/src/sys/netinet/ip_fastfwd.c:572 #14 0xc0672a59 in ether_demux (ifp=0xc0f9, m=0xc11fee00) at /usr/src/sys/net/if_ethersubr.c:770 #15 0xc06727f5 in ether_input (ifp=0xc0f9, m=0xc11fee00) at /usr/src/sys/net/if_ethersubr.c:631 #16 0xc0713507 in sis_rxeof (sc=0xc0f9) at /usr/src/sys/pci/if_sis.c:1636 #17 0xc07137cf in sis_poll (ifp=0xc0f9, cmd=POLL_ONLY, count=0) at /usr/src/sys/pci/if_sis.c:1769 #18 0xc05f8280 in netisr_poll () at /usr/src/sys/kern/kern_poll.c:384 #19 0xc0679985 in swi_net (dummy=0x0) at /usr/src/sys/net/netisr.c:338 #20 0xc0600430 in ithread_loop (arg=0xc0ec6580) at /usr/src/sys/kern/kern_intr.c:547 #21 0xc05ff8a4 in fork_exit (callout=0xc060030c , arg=0xc0ec6580, frame=0xc76d1d48) at /usr/src/sys/kern/kern_fork.c:791 #22 0xc07a6a2c in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:209 (kgdb) f 12 #12 0xc069b694 in ip_fragment (ip=0xc123180e, m_frag=0xc76d1c38, mtu=-1056778752, if_hwassist_flags=0, sw_csum=1) at /usr/src/sys/netinet/ip_output.c:967 967 m->m_next = m_copy(m0, off, len); (kgdb) f 11 #11 0xc0641a2c in m_copym (m=0x0, off0=1500, len=1480, wait=1) at /usr/src/sys/kern/uipc_mbuf.c:385 385 KASSERT(m != NULL, ("m_copym, offset > size of mbuf chain")); (kgdb) l 380 KASSERT(len >= 0, ("m_copym, negative len %d", len)); 381
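The numbers in the m_copym() frame can be cross-checked with a little arithmetic. The sketch below assumes ip_fragment() sizes fragments in the usual IPv4 way, i.e. payload per fragment is the MTU minus the IP header length, rounded down to a multiple of 8, and that the second fragment starts right after the first one's data. That is an assumption about ip_output.c, which is not quoted in this thread, and the `sk_` helper names are hypothetical.

```c
#include <assert.h>

/* Payload bytes per fragment: what fits after the IP header, rounded down to 8. */
int
sk_frag_len(int mtu, int hlen)
{
    return ((mtu - hlen) & ~7);
}

/* Offset into the original packet at which the second fragment's data starts. */
int
sk_second_frag_off(int mtu, int hlen)
{
    return (hlen + sk_frag_len(mtu, hlen));
}
```

For mtu = 1500 and a 20-byte header this gives len = 1480 and an offset of 1500, which matches off0=1500, len=1480 in the kgdb frame and 5dc, 5c8 (hex) in the DDB trace. If that reading is right, ip_fragment() was working with an MTU of 1500 and was asked to copy from offset 1500 of a chain that, for small UDP packets, is far shorter than that.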
Re: help w/panic under heavy load - 5.4
Hi John, Re-compiled with INVARIANTS/INVARIANT_SUPPORT included the gdb output below - same situation (put heavy load on the box - incidentally - small (68 byte UDP packets) - fwiw. my buildkernel kept failing on the options DDB (even tried GENERIC kernel) - so I'm sure I'm doing something wrong there - just didn't figure it out yet. I wanted to get back to you with the output for the above asap though - I'm happy to input/output whatever commands you would like if necc. Thanks! /Edwin mbsd05# kgdb kernel.debug /tmp/crash/vmcore.5 [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"] GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-marcel-freebsd". 
#0 doadump () at pcpu.h:159 159 __asm __volatile("movl %%fs:0,%0" : "=r" (td)); (kgdb) where #0 doadump () at pcpu.h:159 #1 0xc060c474 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:410 #2 0xc060c6f2 in panic (fmt=0xc081e5ad "m_copym, offset > size of mbuf chain") at /usr/src/sys/kern/kern_shutdown.c:566 #3 0xc063beb8 in m_copym (m=0x0, off0=1500, len=1480, wait=1) at /usr/src/sys/kern/uipc_mbuf.c:385 #4 0xc06996b4 in ip_fragment (ip=0xc124400e, m_frag=0xc7692c38, mtu=-1056768768, if_hwassist_flags=0, sw_csum=1) at /usr/src/sys/netinet/ip_output.c:967 #5 0xc069132e in ip_fastforward (m=0xc120e700) at /usr/src/sys/netinet/ip_fastfwd.c:572 #6 0xc066cd99 in ether_demux (ifp=0xc0f9, m=0xc120e700) at /usr/src/sys/net/if_ethersubr.c:770 #7 0xc066cb1d in ether_input (ifp=0xc0f9, m=0xc120e700) at /usr/src/sys/net/if_ethersubr.c:631 #8 0xc0711597 in sis_rxeof (sc=0xc0f9) at /usr/src/sys/pci/if_sis.c:1636 #9 0xc071185f in sis_poll (ifp=0xc0f9, cmd=POLL_ONLY, count=0) at /usr/src/sys/pci/if_sis.c:1769 #10 0xc05f2f5c in netisr_poll () at /usr/src/sys/kern/kern_poll.c:384 #11 0xc0673cc5 in swi_net (dummy=0x0) at /usr/src/sys/net/netisr.c:338 #12 0xc05fb10c in ithread_loop (arg=0xc0ec6480) at /usr/src/sys/kern/kern_intr.c:547 #13 0xc05fa580 in fork_exit (callout=0xc05fafe8 , arg=0xc0ec6480, frame=0xc7692d48) at /usr/src/sys/kern/kern_fork.c:791 #14 0xc07a26cc in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:209 (kgdb) f 4 #4 0xc06996b4 in ip_fragment (ip=0xc124400e, m_frag=0xc7692c38, mtu=-1056768768, if_hwassist_flags=0, sw_csum=1) at /usr/src/sys/netinet/ip_output.c:967 967 m->m_next = m_copy(m0, off, len); (kgdb) f 3 #3 0xc063beb8 in m_copym (m=0x0, off0=1500, len=1480, wait=1) at /usr/src/sys/kern/uipc_mbuf.c:385 385 KASSERT(m != NULL, ("m_copym, offset > size of mbuf chain")); (kgdb) quit mbsd05# John Baldwin ([EMAIL PROTECTED]) wrote: > On Monday 18 July 2005 11:42 pm, Edwin wrote: > > Hi, > > > > I have a recurring (re-producible) panic on the 5.3/5.4 
kernels and I would > > like to ask for some help in tracking it down. :) - it could be some > > misconfig on my part - but i have tried several different configs of the > > kernel - ultimately w/ polling on/off, ipfw on/off, ipfastforwarding on/off > > - although with ipff off - the box still crashes but in a different > > location - it will even crash w/ GENERIC kernel under heavy load. > > > > I'm not quite sure where to look past the below (ie. what variables/etc to > > present to the list). > > Try turning INVARIANTS and INVARIANT_SUPPORT on in your kernel and see if you > can reproduce this. Also, try to get a traceback in ddb if possible as > sometimes ddb gives more reliable stack traces. It looks like your m is > NULL, in which case the KASSERT() on the previous line should fire if > INVARIANTS is on. > > -- > John Baldwin <[EMAIL PROTECTED]> <>< http://www.FreeBSD.org/~jhb/ > "Power Users Use the Power to Serve" = http://www.FreeBSD.org
Re: help w/panic under heavy load - 5.4
On Monday 18 July 2005 11:42 pm, Edwin wrote:
> Hi,
>
> I have a recurring (re-producible) panic on the 5.3/5.4 kernels and I would
> like to ask for some help in tracking it down. :) - it could be some
> misconfig on my part - but i have tried several different configs of the
> kernel - ultimately w/ polling on/off, ipfw on/off, ipfastforwarding on/off
> - although with ipff off - the box still crashes but in a different
> location - it will even crash w/ GENERIC kernel under heavy load.
>
> I'm not quite sure where to look past the below (ie. what variables/etc to
> present to the list).

Try turning INVARIANTS and INVARIANT_SUPPORT on in your kernel and see if you can reproduce this. Also, try to get a traceback in ddb if possible as sometimes ddb gives more reliable stack traces. It looks like your m is NULL, in which case the KASSERT() on the previous line should fire if INVARIANTS is on.

--
John Baldwin <[EMAIL PROTECTED]> <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org
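As a rough illustration of why INVARIANTS matters here, the sketch below models an assertion that is only compiled in when a build option is defined, placed in a loop that skips `off` bytes into a buffer chain. Everything prefixed `sk_` or `SK_` is a hypothetical user-space stand-in: the stand-in panic only records the message instead of halting, and the struct is not the kernel's mbuf.

```c
#include <stddef.h>

/* Last message recorded by the stand-in panic; NULL means nothing fired. */
const char *sk_last_panic = NULL;

void
sk_panic(const char *msg)
{
    sk_last_panic = msg;
}

#define SK_INVARIANTS 1   /* remove to model a kernel built without INVARIANTS */

#ifdef SK_INVARIANTS
#define SK_KASSERT(exp, msg) do { if (!(exp)) sk_panic(msg); } while (0)
#else
#define SK_KASSERT(exp, msg) do { } while (0)
#endif

/* Minimal stand-in for a buffer chain: each node holds `len` bytes. */
struct sk_mbuf { struct sk_mbuf *next; int len; };

/* Return the node containing byte `off`, or NULL if `off` is past the chain. */
const struct sk_mbuf *
sk_seek(const struct sk_mbuf *m, int off)
{
    while (off > 0) {
        SK_KASSERT(m != NULL, "m_copym, offset > size of mbuf chain");
        if (m == NULL)
            return (NULL);   /* without the assertion, real code would dereference NULL here */
        if (off < m->len)
            break;
        off -= m->len;
        m = m->next;
    }
    return (m);
}
```

With the option defined, running off the end of the chain is reported with a descriptive message at the point where it is detected; without it, the first visible symptom would be a NULL dereference further along.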