Re: slab corruption in 2.6.16rc1-git4
From: Dave Jones <[EMAIL PROTECTED]>
Date: Sat, 4 Feb 2006 12:14:11 -0500

> I've hit it three times now, and every time it seems to have happened
> whilst it was under attack from junk icmp, which hopefully narrows
> it down a little to a specific set of isic parameters.

I've sent the following fix from Herbert to Linus and -stable.

diff-tree 429563d07b4feda0729f296b90c722f4d431adac (from 53ea68ecea11bcbb3451c2758ce181bd97b569a9)
Author: Herbert Xu <[EMAIL PROTECTED]>
Date:   Sat Feb 4 02:09:34 2006 -0800

    [ICMP]: Fix extra dst release when ip_options_echo fails

    When two ip_route_output_key lookups in icmp_send were combined I
    forgot to change the error path for ip_options_echo to not drop the
    dst reference since it now sits before the dst lookup. To fix it we
    simply jump past the ip_rt_put call.

    Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>
    Signed-off-by: David S. Miller <[EMAIL PROTECTED]>

diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 6bc0887..4d1c409 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -524,7 +524,7 @@ void icmp_send(struct sk_buff *skb_in, i
 						iph->tos;
 
 	if (ip_options_echo(&icmp_param.replyopts, skb_in))
-		goto ende;
+		goto out_unlock;
 
 
 	/*
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
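The error-path mistake described in the commit message can be illustrated with a small user-space sketch. This is not the kernel code: `toy_route`, `toy_route_put`, `toy_route_lookup` and `toy_icmp_send` are hypothetical names, and the sketch assumes that at the point where the options check fails, the function holds a pointer to a route it has not yet taken its own reference on. The `fixed` flag selects between the old jump target (`ende`, which drops a reference) and the new one (`out_unlock`, which skips the drop).

```c
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted route entry. */
struct toy_route {
	int refcnt;
};

/* Drop one reference (stand-in for the ip_rt_put call in the patch). */
static void toy_route_put(struct toy_route *rt)
{
	rt->refcnt--;
}

/* Hypothetical lookup: hands back the route with one more reference. */
static struct toy_route *toy_route_lookup(struct toy_route *rt)
{
	rt->refcnt++;
	return rt;
}

/*
 * Toy model of the control flow only. Returns the net change in the
 * route's reference count: 0 means balanced, -1 means one release
 * too many.
 */
static int toy_icmp_send(struct toy_route *rt, bool options_fail, bool fixed)
{
	int before = rt->refcnt;

	/* The options check now sits BEFORE the lookup takes a reference. */
	if (options_fail) {
		if (fixed)
			goto out_unlock;	/* patched: skip the put */
		goto ende;			/* unpatched: put without a get */
	}

	rt = toy_route_lookup(rt);	/* reference taken here */
	/* ... build and send the reply ... */

ende:
	toy_route_put(rt);
out_unlock:
	return rt->refcnt - before;
}
```

With a route starting at `refcnt = 1`, the unpatched failure path returns -1 (the count falls to 0 while someone else still holds the route), and the patched path returns 0.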
Re: slab corruption in 2.6.16rc1-git4
On Fri, Feb 03, 2006 at 08:27:02PM -0800, Stephen Hemminger wrote:

 > It takes about 15 minutes of over a gigabit link for me to trigger
 > on a dual Opteron with 2G of mem. Maybe Dave's Niagaras would be
 > faster.
 >
 > Although it might depend on what level of debugging is turned on.
 > What would be helpful is knowing whether it is related to code path
 > (ie input packet), or dst cache fillup/release.

I've hit it three times now, and every time it seems to have happened
whilst it was under attack from junk icmp, which hopefully narrows
it down a little to a specific set of isic parameters.

		Dave
Re: slab corruption in 2.6.16rc1-git4
On Sat, 04 Feb 2006 13:50:44 +1100
Herbert Xu <[EMAIL PROTECTED]> wrote:

> Dave Jones <[EMAIL PROTECTED]> wrote:
> > Note the first slab corruption line..
> >
> > 000: 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> >
> > has a single bit error, which _could_ be bad ram, as this box is an ancient
>
> Actually, this is exactly what would've happened if someone did a
> dst_release on a freed dst entry. So this probably ties in with
> your report about dst badness.
>
> Unfortunately, I was able to reproduce this bug exactly once with isic
> and since then no matter what I do it just works perfectly.

It takes about 15 minutes of over a gigabit link for me to trigger
on a dual Opteron with 2G of mem. Maybe Dave's Niagaras would be
faster.

Although it might depend on what level of debugging is turned on.
What would be helpful is knowing whether it is related to code path
(ie input packet), or dst cache fillup/release.
Re: slab corruption in 2.6.16rc1-git4
Dave Jones <[EMAIL PROTECTED]> wrote:
> Note the first slab corruption line..
>
> 000: 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
>
> has a single bit error, which _could_ be bad ram, as this box is an ancient

Actually, this is exactly what would've happened if someone did a
dst_release on a freed dst entry. So this probably ties in with
your report about dst badness.

Unfortunately, I was able to reproduce this bug exactly once with isic
and since then no matter what I do it just works perfectly.
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
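Herbert's point, that a release on an already-freed entry yields exactly this byte pattern, can be checked with a few lines of user-space C. This is a sketch under stated assumptions, not kernel code: it assumes the reference count is a 32-bit little-endian counter 4 bytes into the object (which matches the dump, and the live neighbouring object in the original report shows `01 00 00 00` at that offset, but the thread does not state the layout), and `stray_release_on_freed` is a hypothetical name. The value 0x6b is the byte the slab debug code writes over freed objects.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POISON_FREE	0x6b	/* fill byte for freed slab objects */
#define REFCNT_OFFSET	4	/* ASSUMPTION: counter follows one 4-byte pointer */

/* Read a 32-bit little-endian value, as the i386 box in the report would. */
static uint32_t load_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a 32-bit value back in little-endian byte order. */
static void store_le32(unsigned char *p, uint32_t v)
{
	p[0] = v & 0xff;
	p[1] = (v >> 8) & 0xff;
	p[2] = (v >> 16) & 0xff;
	p[3] = (v >> 24) & 0xff;
}

/*
 * Poison a buffer as if it had just been freed, then decrement the
 * counter inside it the way a stray release would.
 */
static void stray_release_on_freed(unsigned char *obj, size_t len)
{
	if (len < REFCNT_OFFSET + 4)
		return;		/* too small to hold the assumed counter */

	memset(obj, POISON_FREE, len);
	store_le32(obj + REFCNT_OFFSET, load_le32(obj + REFCNT_OFFSET) - 1);
}
```

Decrementing 0x6b6b6b6b gives 0x6b6b6b6a, so only the low byte at offset 4 changes, from 6b to 6a. That is the same single-bit difference Dave saw, which is why it can look like bad RAM.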
slab corruption in 2.6.16rc1-git4
I've had a box being tortured with random junk packets (created with isic)
for a few days, and it spat this out last night..

Feb 1 04:28:09 trogdor kernel: Slab corruption: (Not tainted) start=cefc8a9c, len=244
Feb 1 04:28:09 trogdor kernel: Redzone: 0x5a2cf071/0x5a2cf071.
Feb 1 04:28:09 trogdor kernel: Last user: [](dst_destroy+0x7f/0xab)
Feb 1 04:28:09 trogdor kernel: [] check_poison_obj+0x73/0x16a [] cache_alloc_debugcheck_after+0x22/0xf9
Feb 1 04:28:09 trogdor kernel: [] kmem_cache_alloc+0x7d/0x86 [] dst_alloc+0x27/0x7b
Feb 1 04:28:09 trogdor kernel: [] dst_alloc+0x27/0x7b [] __ip_route_output_key+0x5a2/0x843
Feb 1 04:28:09 trogdor kernel: [] issue_and_wait+0x28/0x93 [3c59x] [] boomerang_start_xmit+0x31c/0x335 [3c59x]
Feb 1 04:28:09 trogdor kernel: [] dev_queue_xmit+0x208/0x20f [] ip_route_output_flow+0x13/0x57
Feb 1 04:28:09 trogdor kernel: [] ip_route_output_key+0x9/0xb [] icmp_send+0x282/0x397
Feb 1 04:28:09 trogdor kernel: [] ip_route_input+0x3b/0xc6a [] _spin_lock_irqsave+0x9/0xd
Feb 1 04:28:09 trogdor kernel: [] ip_options_compile+0x3da/0x3f3 [] ip_rcv+0x322/0x478
Feb 1 04:28:09 trogdor kernel: [] netif_receive_skb+0x211/0x259 [] process_backlog+0x7a/0x100
Feb 1 04:28:09 trogdor kernel: [] net_rx_action+0x99/0x170 [] __do_softirq+0x58/0xc2
Feb 1 04:28:09 trogdor kernel: [] do_softirq+0x46/0x4e<0> ===
Feb 1 04:28:09 trogdor kernel: [] do_IRQ+0x72/0x7b
Feb 1 04:28:09 trogdor kernel: [] common_interrupt+0x1a/0x20 [] default_idle+0x0/0x55
Feb 1 04:28:09 trogdor kernel: [] default_idle+0x2c/0x55 [] cpu_idle+0x8f/0xa8
Feb 1 04:28:09 trogdor kernel: [] start_kernel+0x301/0x307 <3>000: 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Feb 1 04:28:09 trogdor kernel: Prev obj: start=cefc899c, len=244
Feb 1 04:28:09 trogdor kernel: Redzone: 0x5a2cf071/0x5a2cf071.
Feb 1 04:28:09 trogdor kernel: Last user: [](dst_destroy+0x7f/0xab)
Feb 1 04:28:09 trogdor kernel: 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Feb 1 04:28:09 trogdor kernel: 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Feb 1 04:28:09 trogdor kernel: Next obj: start=cefc8b9c, len=244
Feb 1 04:28:09 trogdor kernel: Redzone: 0x170fc2a5/0x170fc2a5.
Feb 1 04:28:09 trogdor kernel: Last user: [](dst_alloc+0x27/0x7b)
Feb 1 04:28:09 trogdor kernel: 000: 7c dd 63 cf 01 00 00 00 a6 a2 00 00 00 00 00 00
Feb 1 04:28:09 trogdor kernel: 010: 00 b7 37 c0 00 00 02 00 01 00 00 00 56 a9 1b 02

Note the first slab corruption line..

000: 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b

has a single bit error, which _could_ be bad ram, as this box is an ancient
4-way pentium pro, so its days may be numbered. I'll give it a spin with
memtest86 next time I'm at the office, but I wanted to report this just in
case, as the last few days I've been seeing a number of slab corruption
issues on different boxes, some of which I know are definitely ok wrt
hardware problems.

		Dave
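For readers unfamiliar with how a "Slab corruption" report like the one above comes about, the following is a minimal user-space sketch of a poison check. It is not the kernel's `check_poison_obj`: `first_bad_poison` and `poison_bit_diff` are hypothetical helpers, and the sketch compares every byte against 0x6b, ignoring any special handling of the object's final byte that the real checker may apply.

```c
#include <stddef.h>

#define POISON_FREE	0x6b	/* fill byte for freed slab objects */

/*
 * Return the offset of the first byte that no longer holds the poison
 * value, or -1 if the whole range is intact.
 */
static long first_bad_poison(const unsigned char *obj, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (obj[i] != POISON_FREE)
			return (long)i;
	return -1;
}

/* Count how many bits of a byte differ from the poison value. */
static int poison_bit_diff(unsigned char byte)
{
	unsigned char x = byte ^ POISON_FREE;
	int bits = 0;

	while (x) {
		bits += x & 1;
		x >>= 1;
	}
	return bits;
}
```

Run over the 16 bytes of the first dump line, `first_bad_poison` returns offset 4 and `poison_bit_diff` reports one differing bit for the byte 6a, which is the "single bit error" noted in the report.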