tcp_sacktag_one() WARNING (was Re: 2.6.24-rc4-mm1)
Cedric Le Goater wrote: Ilpo Järvinen wrote: On Wed, 5 Dec 2007, Andrew Morton wrote: On Thu, 06 Dec 2007 17:59:37 +1100 Reuben Farrelly [EMAIL PROTECTED] wrote: This non fatal oops which I have just noticed may be related to this change then - certainly looks networking related. yep, but it isn't e1000. It's core TCP. WARNING: at net/ipv4/tcp_input.c:2518 tcp_fastretrans_alert() Pid: 0, comm: swapper Not tainted 2.6.24-rc4-mm1 #1 Ilpo, Reuben's kernel is talking to you ;) ...Please try the patch below. Andrew, this probably fixes your problem (the packets = tp-packets_out) as well. nah. I got the WARNINGs again with this patch. I got this new one on a 2.6.24-rc5-mm1. It looked similar ? C. WARNING: at /home/legoater/linux/2.6.24-rc5-mm1/net/ipv4/tcp_input.c:1280 tcp_sacktag_one() Pid: 0, comm: swapper Not tainted 2.6.24-rc5-mm1 #1 Call Trace: IRQ [80410e0e] tcp_sacktag_walk+0x2bc/0x62a [80411711] tcp_sacktag_write_queue+0x595/0xa7c [8028ce66] kfree+0xd4/0xe0 [80411e9f] tcp_ack+0x2a7/0xfc7 [80252ca1] mark_held_locks+0x47/0x6a [80252e5c] trace_hardirqs_on+0xfe/0x139 [80415d59] tcp_rcv_established+0x66a/0x76d [8041bd35] tcp_v4_do_rcv+0x37/0x3aa [8041e623] tcp_v4_rcv+0x9a9/0xa76 [80401832] ip_local_deliver_finish+0x161/0x23c [80401d47] ip_local_deliver+0x72/0x77 [8040168d] ip_rcv_finish+0x371/0x3b5 [80401ca1] ip_rcv+0x292/0x2c6 [803e2aae] netif_receive_skb+0x267/0x340 [8806eff4] :tg3:tg3_poll+0x5d2/0x89e [803e505c] net_rx_action+0xd5/0x1ad [8023b0b9] __do_softirq+0x5f/0xe3 [8020c8ec] call_softirq+0x1c/0x28 [8020e7b9] do_softirq+0x39/0x9f [8023b058] irq_exit+0x4e/0x50 [8020e900] do_IRQ+0xb7/0xd7 [8020a892] mwait_idle+0x0/0x52 [8020bbe6] ret_from_intr+0x0/0xf EOI [8024d0cb] __atomic_notifier_call_chain+0x20/0x83 [8020a8da] mwait_idle+0x48/0x52 [80209e79] enter_idle+0x22/0x24 [8020a822] cpu_idle+0xa1/0xc5 [8021e755] start_secondary+0x3b9/0x3c5 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo 
info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.24-rc4-mm1 - BUG in tcp_fragment
Andrew Morton wrote: Temporarily at http://userweb.kernel.org/~akpm/2.6.24-rc4-mm1/ Will appear later at ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc4/2.6.24-rc4-mm1/ I got this one while compiling on NFS. C. kernel BUG at /home/legoater/linux/2.6.24-rc4-mm1/include/net/tcp.h:1480! invalid opcode: [1] SMP last sysfs file: /sys/devices/pci:00/:00:1e.0/:01:01.0/local_cpus CPU 1 Modules linked in: autofs4 nfs lockd sunrpc tg3 sg joydev ext3 jbd ehci_hcd ohci_hcd uhci_hcd Pid: 0, comm: swapper Not tainted 2.6.24-rc4-mm1 #3 RIP: 0010:[80418d93] [80418d93] tcp_fragment+0x5ee/0x6f7 RSP: 0018:810147c9f9e0 EFLAGS: 00010217 RAX: 1526c311 RBX: 8100c2ce1d00 RCX: 810143cc6aa0 RDX: 0001 RSI: 810102b37b00 RDI: 810102b37b50 RBP: 810147c9fa50 R08: 004a R09: 0001 R10: 0b50 R11: 0001 R12: 81013a575700 R13: R14: 810143cc6400 R15: 81013a575750 FS: () GS:810147c57140() knlGS: CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b CR2: 2ad5d294b000 CR3: bd11b000 CR4: 06e0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Process swapper (pid: 0, threadinfo 810147c98000, task 810147c89040) Stack: 810147c9fa00 05a843cc6400 810143cc6400 810147c9fa70 8100c2ce1d50 810143cc6590 810143cc6aa0 15265421 810143cc6400 810143cc6400 81013a575700 Call Trace: IRQ [804190c7] tcp_retransmit_skb+0xd6/0x713 [804197d4] tcp_xmit_retransmit_queue+0xd0/0x330 [8041209b] tcp_fastretrans_alert+0xb92/0xbf2 [80413f30] tcp_ack+0xdf3/0xfbe [80417295] tcp_rcv_established+0x66a/0x76d [8041d285] tcp_v4_do_rcv+0x37/0x3aa [8041fb73] tcp_v4_rcv+0x9a9/0xa76 [80402e4e] ip_local_deliver_finish+0x161/0x23c [80403363] ip_local_deliver+0x72/0x77 [80402ca9] ip_rcv_finish+0x371/0x3b5 [804032bd] ip_rcv+0x292/0x2c6 [803e3dcc] netif_receive_skb+0x267/0x340 [8806eff4] :tg3:tg3_poll+0x5d2/0x89e [803e639d] net_rx_action+0xd5/0x1ad [8023b605] __do_softirq+0x5f/0xe3 [8020c86c] call_softirq+0x1c/0x28 [8020e739] do_softirq+0x39/0x9f [8023b5a4] irq_exit+0x4e/0x50 [8020e880] do_IRQ+0xb7/0xd7 [8020a803] mwait_idle+0x0/0x55 [8020bb66] 
ret_from_intr+0x0/0xf EOI [8024d623] __atomic_notifier_call_chain+0x20/0x83 [8020a84b] mwait_idle+0x48/0x55 [80209e79] enter_idle+0x22/0x24 [8020a793] cpu_idle+0xa1/0xc5 [8021dfd5] start_secondary+0x3b9/0x3c5 Code: 0f 0b eb fe 48 85 f6 74 08 8b 46 6c 3b 41 68 75 55 48 8d 41 RIP [80418d93] tcp_fragment+0x5ee/0x6f7 RSP 810147c9f9e0 Kernel panic - not syncing: Aiee, killing interrupt handler!
Re: 2.6.24-rc4-mm1 - BUG in tcp_fragment
Ilpo Järvinen wrote:

On Thu, 13 Dec 2007, Cedric Le Goater wrote:

I got this one while compiling on NFS. C.

kernel BUG at /home/legoater/linux/2.6.24-rc4-mm1/include/net/tcp.h:1480!

I'm not exactly sure which patches you have applied and which you have not; with rc4-mm1 there are two patches (the first one was incomplete, I assume you had at least that one based on your other mail) to really fix the issues in (__|)tcp_reset_fack_counts(...).

Yes, I only have the first patch you sent on lkml on top of 2.6.24-rc4-mm1, attached below. I didn't see the second one on lkml ?

However, there seems to be so much breakage that I have a bit of trouble deciding where to start... The situation seems a bit scary :-).

my n/w environment seems to reproduce these issues quite easily. if you need some testing, just ping me.

Cheers, C.

So, I might soon prepare a revert patch for most of the questionable TCP parts and ask Dave to apply it (and drop them fully during the next rebase) unless I suddenly figure something out which explains all/most of the problems, then return to the drawing board. ...As it seems that the cumulative ACK processing problem discovered later on (having a rather cumbersome solution with skbs only) will make part of the work that's currently in net-2.6.25 quite useless/duplicate effort. But thanks anyway for reporting these.

Subject: [PATCH] [TCP]: Fix fack_count miscountings (multiple places)

1) Fack_count is set incorrectly if the highest sent skb is already
   sacked (the skb->prev won't return it because it's on the other
   list already). These manifest as fackets_out counting error later
   on, the second-order effects are very hard to track, so it may fix
   all out-standing TCP bug reports.
2) Prev == NULL check was wrong way around
3) Last skb's fack count was incorrectly skipped while() {} loop

Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED]
---
 include/net/tcp.h |   22 ++++++++++++++++------
 1 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 9dbed0b..11a7e3e 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1337,10 +1337,20 @@ static inline struct sk_buff *tcp_send_head(struct sock *sk)
 static inline void tcp_advance_send_head(struct sock *sk, struct sk_buff *skb)
 {
 	struct sk_buff *prev = tcp_write_queue_prev(sk, skb);
+	unsigned int fc = 0;
+
+	if (prev == (struct sk_buff *)&sk->sk_write_queue)
+		prev = NULL;
+	else if (!tcp_skb_adjacent(sk, prev, skb))
+		prev = NULL;
 
-	if (prev != (struct sk_buff *)&sk->sk_write_queue)
-		TCP_SKB_CB(skb)->fack_count = TCP_SKB_CB(prev)->fack_count +
-					      tcp_skb_pcount(prev);
+	if ((prev == NULL) && !__tcp_write_queue_empty(sk, TCP_WQ_SACKED))
+		prev = __tcp_write_queue_tail(sk, TCP_WQ_SACKED);
+
+	if (prev != NULL)
+		fc = TCP_SKB_CB(prev)->fack_count + tcp_skb_pcount(prev);
+
+	TCP_SKB_CB(skb)->fack_count = fc;
 
 	sk->sk_send_head = tcp_write_queue_next(sk, skb);
 	if (sk->sk_send_head == (struct sk_buff *)&sk->sk_write_queue)
@@ -1464,7 +1474,7 @@ static inline struct sk_buff *__tcp_reset_fack_counts(struct sock *sk,
 {
 	unsigned int fc = 0;
 
-	if (prev == NULL)
+	if (prev != NULL)
 		fc = TCP_SKB_CB(*prev)->fack_count + tcp_skb_pcount(*prev);
 
 	BUG_ON((*prev != NULL) && !tcp_skb_adjacent(sk, *prev, skb));
@@ -1521,7 +1531,7 @@ static inline void tcp_reset_fack_counts(struct sock *sk, struct sk_buff *inskb)
 		skb[otherq] = prev->next;
 	}
 
-	while (skb[queue] != __tcp_write_queue_tail(sk, queue)) {
+	do {
 		/* Lazy find for the other queue */
 		if (skb[queue] == NULL) {
 			skb[queue] = tcp_write_queue_find(sk, TCP_SKB_CB(prev)->seq,
@@ -1535,7 +1545,7 @@ static inline void tcp_reset_fack_counts(struct sock *sk, struct sk_buff *inskb)
 			break;
 
 		queue ^= TCP_WQ_SACKED;
-	}
+	} while (skb[queue] != __tcp_write_queue_tail(sk, queue));
 }
 
 static inline void __tcp_insert_write_queue_after(struct sk_buff *skb,
-- 
1.5.0.6
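Items 2 and 3 of the patch description above can be illustrated outside the kernel. The following is only a toy sketch: `struct toy_skb`, `toy_reset_buggy()` and `toy_reset_fixed()` are hypothetical names invented here, and a single linked list stands in for the two write queues the real code walks. It shows why a loop that tests for the tail before running its body never writes the tail's count, while a loop that tests afterwards does.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an skb on the write queue (hypothetical, not the kernel struct). */
struct toy_skb {
	unsigned int pcount;      /* packets covered by this segment */
	unsigned int fack_count;  /* packets in all segments before this one */
	struct toy_skb *next;
};

/* Shape of the problem in item 3: the tail test comes first, so the
 * tail segment itself never gets its count written. */
static void toy_reset_buggy(struct toy_skb *prev, struct toy_skb *skb,
			    struct toy_skb *tail)
{
	unsigned int fc = 0;

	assert(skb != NULL && tail != NULL);
	if (prev != NULL)	/* item 2: only read prev when it exists */
		fc = prev->fack_count + prev->pcount;

	while (skb != tail) {
		skb->fack_count = fc;
		fc += skb->pcount;
		skb = skb->next;
	}
}

/* The tail test comes after the body, so the tail is processed as well.
 * Assumes tail is reachable from skb. */
static void toy_reset_fixed(struct toy_skb *prev, struct toy_skb *skb,
			    struct toy_skb *tail)
{
	unsigned int fc = 0;
	struct toy_skb *last;

	assert(skb != NULL && tail != NULL);
	if (prev != NULL)
		fc = prev->fack_count + prev->pcount;

	do {
		skb->fack_count = fc;
		fc += skb->pcount;
		last = skb;
		skb = skb->next;
	} while (last != tail);
}
```

With three segments covering 1, 2 and 3 packets, the fixed variant yields counts 0, 1 and 3, while the buggy variant leaves the last segment's count untouched.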
Re: 2.6.24-rc4-mm1
Ilpo Järvinen wrote: On Wed, 5 Dec 2007, David Miller wrote: From: Reuben Farrelly [EMAIL PROTECTED] Date: Thu, 06 Dec 2007 17:59:37 +1100 On 5/12/2007 4:17 PM, Andrew Morton wrote: - Lots of device IDs have been removed from the e1000 driver and moved over to e1000e. So if your e1000 stops working, you forgot to set CONFIG_E1000E. This non fatal oops which I have just noticed may be related to this change then - certainly looks networking related. WARNING: at net/ipv4/tcp_input.c:2518 tcp_fastretrans_alert() Pid: 0, comm: swapper Not tainted 2.6.24-rc4-mm1 #1 Call Trace: IRQ [8046e038] tcp_fastretrans_alert+0x229/0xe63 [80470975] tcp_ack+0xa3f/0x127d [804747b7] tcp_rcv_established+0x55f/0x7f8 [8047b1aa] tcp_v4_do_rcv+0xdb/0x3a7 [881148a8] :nf_conntrack:nf_ct_deliver_cached_events+0x75/0x99 No, it's from TCP assertions and changes added by Ilpo to the net-2.6.25 tree recently. Yeah, this (very likely) due to the new SACK processing (in net-2.6.25). I'll look what could go wrong with fack_count calculations, most likely it's the reason (I've found earlier one out-of-place retransmission segment in one of my test case which already indicated that there's something incorrect with them but didn't have time to debug it yet). Thanks for report. Some info about how easily you can reproduce couple of sentences about the test case might be useful later on when evaluating the fix. I also got plenty of these when untaring a tarball on NFS. C. 
WARNING: at /home/legoater/linux/2.6.24-rc4-mm1/net/ipv4/tcp_input.c:2518 tcp_fastretrans_alert() Pid: 0, comm: swapper Not tainted 2.6.24-rc4-mm1 #2 Call Trace: IRQ [804115bf] tcp_fastretrans_alert+0xb6/0xbf2 [80413f30] tcp_ack+0xdf3/0xfbe [803da8fb] sk_reset_timer+0x17/0x23 [80416d1e] tcp_rcv_established+0xf3/0x76d [8041d231] tcp_v4_do_rcv+0x37/0x3aa [8041fb1f] tcp_v4_rcv+0x9a9/0xa76 [80402e4e] ip_local_deliver_finish+0x161/0x23c [80403363] ip_local_deliver+0x72/0x77 [80402ca9] ip_rcv_finish+0x371/0x3b5 [804032bd] ip_rcv+0x292/0x2c6 [803e3dcc] netif_receive_skb+0x267/0x340 [8806eff4] :tg3:tg3_poll+0x5d2/0x89e [803e639d] net_rx_action+0xd5/0x1ad [8023b605] __do_softirq+0x5f/0xe3 [8020c86c] call_softirq+0x1c/0x28 [8020e739] do_softirq+0x39/0x9f [8023b5a4] irq_exit+0x4e/0x50 [8020e880] do_IRQ+0xb7/0xd7 [8020a803] mwait_idle+0x0/0x55 [8020bb66] ret_from_intr+0x0/0xf EOI [8024d623] __atomic_notifier_call_chain+0x20/0x83 [8020a84b] mwait_idle+0x48/0x55 [80209e79] enter_idle+0x22/0x24 [8020a793] cpu_idle+0xa1/0xc5 [8021dfd5] start_secondary+0x3b9/0x3c5 WARNING: at /home/legoater/linux/2.6.24-rc4-mm1/net/ipv4/tcp_input.c:2518 tcp_fastretrans_alert() Pid: 0, comm: swapper Not tainted 2.6.24-rc4-mm1 #2 Call Trace: IRQ [804115bf] tcp_fastretrans_alert+0xb6/0xbf2 [80413f30] tcp_ack+0xdf3/0xfbe [804153b8] tcp_data_queue+0x5da/0xb0a [80416d1e] tcp_rcv_established+0xf3/0x76d [8041d231] tcp_v4_do_rcv+0x37/0x3aa [8041fb1f] tcp_v4_rcv+0x9a9/0xa76 [80402e4e] ip_local_deliver_finish+0x161/0x23c [80403363] ip_local_deliver+0x72/0x77 [80402ca9] ip_rcv_finish+0x371/0x3b5 [804032bd] ip_rcv+0x292/0x2c6 [803e3dcc] netif_receive_skb+0x267/0x340 [8806eff4] :tg3:tg3_poll+0x5d2/0x89e [803e639d] net_rx_action+0xd5/0x1ad [8023b605] __do_softirq+0x5f/0xe3 [8020c86c] call_softirq+0x1c/0x28 [8020e739] do_softirq+0x39/0x9f [8023b5a4] irq_exit+0x4e/0x50 [8020e880] do_IRQ+0xb7/0xd7 [8020a803] mwait_idle+0x0/0x55 [8020bb66] ret_from_intr+0x0/0xf EOI [8024d623] __atomic_notifier_call_chain+0x20/0x83 
[8020a84b] mwait_idle+0x48/0x55 [80209e79] enter_idle+0x22/0x24 [8020a793] cpu_idle+0xa1/0xc5 [8021dfd5] start_secondary+0x3b9/0x3c5 WARNING: at /home/legoater/linux/2.6.24-rc4-mm1/net/ipv4/tcp_input.c:2518 tcp_fastretrans_alert() Pid: 0, comm: swapper Not tainted 2.6.24-rc4-mm1 #2 Call Trace: IRQ [804115bf] tcp_fastretrans_alert+0xb6/0xbf2 [80413f30] tcp_ack+0xdf3/0xfbe [804153b8] tcp_data_queue+0x5da/0xb0a [80416d1e] tcp_rcv_established+0xf3/0x76d [8041d231] tcp_v4_do_rcv+0x37/0x3aa [8041fb1f] tcp_v4_rcv+0x9a9/0xa76 [80402e4e] ip_local_deliver_finish+0x161/0x23c [80403363] ip_local_deliver+0x72/0x77 [80402ca9] ip_rcv_finish+0x371/0x3b5 [804032bd] ip_rcv+0x292/0x2c6 [803e3dcc] netif_receive_skb+0x267/0x340 [8806eff4] :tg3:tg3_poll+0x5d2/0x89e [803e639d] net_rx_action+0xd5/0x1ad [8023b605] __do_softirq+0x5f/0xe3 [8020c86c]
Re: 2.6.24-rc4-mm1
Ilpo Järvinen wrote:

On Wed, 5 Dec 2007, Andrew Morton wrote:

On Thu, 06 Dec 2007 17:59:37 +1100 Reuben Farrelly [EMAIL PROTECTED] wrote:

This non fatal oops which I have just noticed may be related to this change then - certainly looks networking related.

yep, but it isn't e1000. It's core TCP.

WARNING: at net/ipv4/tcp_input.c:2518 tcp_fastretrans_alert()
Pid: 0, comm: swapper Not tainted 2.6.24-rc4-mm1 #1

Ilpo, Reuben's kernel is talking to you ;)

...Please try the patch below. Andrew, this probably fixes your problem (the packets = tp->packets_out) as well.

nah. I got the WARNINGs again with this patch.

C.

Dave, please include this one to net-2.6.25.
Re: 2.6.23-mm1 s390 driver problem
that helped going a little further in the boot process but we then have a network issue when bringing the network interface up :

please cc netdev on network issues.

yes.

Bringing up interface eth0:
[ cut here ]
Kernel BUG at 0002 [verbose debug info unavailable]
illegal operation: 0001 [#1]
Modules linked in:
CPU: 0 Not tainted
Process ip (pid: 1167, task: 01d46038, ksp: 025efb28)
Krnl PSW : 070420018000 0002 (0x2)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0 EA:3
Krnl GPRS: 0241f600 01c8d 86dd 01eb6d70 01c8d 01eb6d40 003abc28 01eb6d00 025ef 0241f600 003b6d18 002b33d2 025ef
Krnl Code: 0002: unknown 0004: unknown 0006: unknown 0008: unknown 000a: unknown 000c: unknown 000e: unknown 0010: unknown
Call Trace:
([002b3352] neigh_connected_output+0x76/0x138)
[00325402] ip6_output2+0x2da/0x470
[00326ea6] ip6_output+0x816/0x1064
[00338e46] __ndisc_send+0x416/0x6a8
[00339330] ndisc_send_rs+0x58/0x68
[0032cbf4] addrconf_dad_completed+0xbc/0x100
[0032d2de] addrconf_dad_start+0xa2/0x14c
[0032d408] addrconf_add_linklocal+0x80/0xa8
[0032fa7e] addrconf_notify+0x2de/0x8d4
[00383990] notifier_call_chain+0x5c/0x98
[00063bca] __raw_notifier_call_chain+0x26/0x34
[00063c06] raw_notifier_call_chain+0x2e/0x3c
[002aa54c] call_netdevice_notifiers+0x34/0x44
[002ad1aa] dev_open+0x9e/0xe0
[002ad80a] dev_change_flags+0x9e/0x1cc
[00302c74] devinet_ioctl+0x650/0x73c
[003050ba] inet_ioctl+0xde/0xf4
[0029a8d0] sock_ioctl+0x1cc/0x2dc
[000cb844] do_ioctl+0xb8/0xcc
[000cb8f2] vfs_ioctl+0x9a/0x3ec
[000cbc96] sys_ioctl+0x52/0x7c
[00022484] sysc_noemu+0x10/0x16
[0210df12] 0x210df12

that's a network issue ;)

002b3352 gives : include/linux/netdevice.h:819

I have a feeling that we fixed this. But there's no BUG at 2.6.23-mm1's include/linux/netdevice.h:819.

but dev->header_ops is bogus. right ?

How about setting CONFIG_DEBUG_BUGVERBOSE=y?

it is set :(

C.
Re: 2.6.23-mm1 s390 driver problem
Martin Schwidefsky wrote:

On Fri, 2007-10-19 at 11:16 +0200, Cedric Le Goater wrote:

This is the vmlinux.lds.S problem. The cleanup patch from Sam Ravnborg moved the __initramfs_start and __initramfs_end symbols into the .init.ramfs section. This is in itself not a problem, but it surfaced a bug: the linker script has *(.init.initramfs), which matches no section; it needs to be *(.init.ramfs). I corrected this in the upstream patch, but 2.6.23-mm1 has the older one that still causes the "Cannot open root device" failure. For 2.6.23-mm1 use the patch below.

thanks martin, that helped going a little further in the boot process but we then have a network issue when bringing the network interface up :

See http://marc.info/?l=linux-kernel&m=119270398931208&w=2

hmm, that doesn't fix the oops. /me looking.

Thanks, C.
Re: [PATCH net-2.6.24 0/3]: More TCP fixes
Ilpo Järvinen wrote:

On Wed, 3 Oct 2007, Cedric Le Goater wrote:

Cedric Le Goater wrote:

Below are the messages I got on 2) right after running ketchup (which does a wget www.kernel.org)

Oops, those tcp_fragment WARNINGs in the other mail were due to a bug in the debug patch, as it called verify too early in there (before the queue was adjusted, no wonder it finds the state inconsistent at that point, fixed that)... ...So please discard all old debug patches, they're all broken in this respect... :-(

not a warning on 1) with your extra verbose patch.

bummer, I got this one on 1) :(

WARNING: at /home/legoater/linux/net-2.6.24.git/net/ipv4/tcp_input.c:2325 tcp_fastretrans_alert()
Call Trace: IRQ
[8022ddb6] __wake_up+0x1f/0x4c
[803fd9d3] tcp_ack+0xcee/0x18ac
[80400764] tcp_rcv_established+0x61f/0x6df

...I just wonder why that's the first place where it occurs... Can you try the debug patch below (fixed verify place in tcp_fragment/collapse, added some of them to narrow it down, and handled GSO in a more user-friendly way in the printout). Put it on top of those three patches (mm should be fine :-)). ...I wish the verify triggers way before the fastretrans trap (for some reason it didn't do that in the quoted trace, maybe I had some verifys missing in that old patch or something)...

so here are the results on a net-2.6.24 kernel. I've put the patchset here to make sure it's correct:

http://legoater.free.fr/patches/2.6.23/net-2.6.24.git-tcp_fastretrans/

and plenty of logs :

http://legoater.free.fr/patches/2.6.23/net-2.6.24.git-tcp_fastretrans.messages

FYI, config is here :

http://legoater.free.fr/patches/2.6.23/net-2.6.24.git-tcp_fastretrans.config

C.
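The "verify called too early" problem mentioned above is a general hazard with consistency checks: if the check runs while a multi-step update is only half done, it reports an inconsistency that is not really there. The sketch below is a toy model under stated assumptions, not the debug patch itself; `struct toy_queue`, `toy_verify()` and `toy_fragment()` are hypothetical names, and the queue is reduced to an array of per-segment packet counts with a cached total.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_SEGS 8

/* Hypothetical queue model: per-segment packet counts plus a cached total. */
struct toy_queue {
	unsigned int segs[TOY_MAX_SEGS];
	size_t len;
	unsigned int packets_out;	/* cached sum of segs[0..len) */
};

/* Returns 1 when the cached total matches the queue contents. */
static int toy_verify(const struct toy_queue *q)
{
	unsigned int sum = 0;
	size_t i;

	for (i = 0; i < q->len; i++)
		sum += q->segs[i];
	return sum == q->packets_out;
}

/* Split segment idx so that its first part keeps 'first' packets.
 * With check_early set, the result of the check taken between the two
 * update steps is returned; otherwise the check runs after both steps. */
static int toy_fragment(struct toy_queue *q, size_t idx, unsigned int first,
			int check_early)
{
	unsigned int rest;
	int early_ok;
	size_t i;

	assert(idx < q->len && q->len < TOY_MAX_SEGS);
	assert(first > 0 && first < q->segs[idx]);

	rest = q->segs[idx] - first;
	q->segs[idx] = first;		/* step 1: shrink the original segment */
	early_ok = toy_verify(q);	/* queue is only half-updated here */

	for (i = q->len; i > idx + 1; i--)	/* step 2: make room and */
		q->segs[i] = q->segs[i - 1];	/* insert the remainder  */
	q->segs[idx + 1] = rest;
	q->len++;

	return check_early ? early_ok : toy_verify(q);
}
```

In this model the early check fails on every split even though the queue ends up consistent, which is the kind of false alarm described for the old debug patches; running the check after the queue is adjusted reports only genuine mismatches.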
Re: [PATCH net-2.6.24 0/3]: More TCP fixes
Hello Ilpo !

Ilpo Järvinen wrote:

Hi Dave,

Sacktag fastpath_cnt_hint seems to be very tricky to get right... I suppose this one fixes Cedric's case. I cannot say for sure until there is something more definite indication of tcp_retrans_try_collapse origin than what the simple late WARN_ON gave for us. ...Especially since it's non-trivial to have skb hint correctly positioned in the write_queue while still ending up calling that function. However, considering how difficult it seems to be for Cedric to reproduce, it might well be this one.

In addition, I noticed another reset which wasn't previously converted to WARN_ON, so doing that now.

Boot + simple xfer tested. Please apply to net-2.6.24.

I'm dropping the previous patches you sent me and switching to this patchset. right ?

Thanks, C.
Re: [PATCH net-2.6.24 0/3]: More TCP fixes
Ilpo Järvinen wrote:

On Wed, 3 Oct 2007, Cedric Le Goater wrote:

Ilpo Järvinen wrote:

Sacktag fastpath_cnt_hint seems to be very tricky to get right... I suppose this one fixes Cedric's case. I cannot say for sure until there is something more definite indication of tcp_retrans_try_collapse origin than what the simple late WARN_ON gave for us. ...Especially since it's non-trivial to have skb hint correctly positioned in the write_queue while still ending up calling that function. However, considering how difficult it seems to be for Cedric to reproduce, it might well be this one.

In addition, I noticed another reset which wasn't previously converted to WARN_ON, so doing that now.

Boot + simple xfer tested. Please apply to net-2.6.24.

I'm dropping the previous patches you sent me and switching to this patchset. right ?

Yes you can do that... However, there are two ways forward:

1) Drop and test with this patchset long enough to verify it's gone...
2) No dropping and get the more exact trace by reproducing, which can point to tcp_retrans_try_collapse, confirming the source of the bug or revealing yet another bug...

The first one has one drawback, it cannot prove the fix very well since the bug could just not occur by chance... Path 2 would clearly show the place from where the problem originates because we will know that it got triggered! I personally would prefer path 2 but whether you want to go for that depends on the time you want to invest in it...

...I rediffed the tcp_verify_fackets patch too (below) just in case it would be something else in your case and you choose path 1 (put it on top of this patchset, applies with some offsets). In case the problem is gone, it shouldn't trigger and if it does, we'll have another bug caught.

I have a spare node so I'm starting 2) with the 3 patches you sent and that last one which applied fine. all of them on a fresh git pull of net-2.6.24

Anyway, thanks for ccing right persons and netdev right from the beginning.

thanks to git ! :)

C.
Re: [PATCH net-2.6.24 0/3]: More TCP fixes
Ilpo Järvinen wrote:

On Wed, 3 Oct 2007, Cedric Le Goater wrote:

Ilpo Järvinen wrote:

On Wed, 3 Oct 2007, Cedric Le Goater wrote:

I'm dropping the previous patches you sent me and switching to this patchset. right ?

Yes you can do that... However, there are two ways forward:

1) Drop and test with this patchset long enough to verify it's gone...
2) No dropping and get the more exact trace by reproducing, which can point to tcp_retrans_try_collapse, confirming the source of the bug or revealing yet another bug...

The first one has one drawback, it cannot prove the fix very well since the bug could just not occur by chance... Path 2 would clearly show the place from where the problem originates because we will know that it got triggered! I personally would prefer path 2 but whether you want to go for that depends on the time you want to invest in it...

...I rediffed the tcp_verify_fackets patch too (below) just in case it would be something else in your case and you choose path 1 (put it on top of this patchset, applies with some offsets). In case the problem is gone, it shouldn't trigger and if it does, we'll have another bug caught.

I have a spare node so I'm starting 2) with the 3 patches you sent and that last one which applied fine.

Ah, that's path 1) then... Since you seem to have enough time, I would say that path 1 is good as well and bugs unrelated to the fix will show up there too...

arg. yes. sorry for the confusion.

I should have stated explicitly that with path 2 those 3 patches should not be applied, because the aim is not a fix but reproduction. Path 2 was intentionally left without the potential fix, as then a nice backtrace informs us when we can stop trying (which would hopefully occur pretty soon) :-). But let's discard that path 2...

I have 2 spare nodes so i'll run both. 1) is on already without any issues, i'm just compiling 2)

I usually work on -mm, so what would be interesting for me is to have what you need in net-2.6.24 which is getting pulled in -mm by andrew. then, if you need an extra patch for verbosity, that's fine, i'll include it in my usual patchset.

Cheers, C.

all of them on a fresh git pull of net-2.6.24

That's fine, they're pretty well in sync (mm and net-2.6.24, and soon 2.6.24-rcs too).
Re: [PATCH net-2.6.24 0/3]: More TCP fixes
Ilpo Järvinen wrote:

On Wed, 3 Oct 2007, Cedric Le Goater wrote:

Ilpo Järvinen wrote:

Ah, that's path 1) then... Since you seem to have enough time, I would say that path 1 is good as well and bugs unrelated to the fix will show up there too...

arg. yes. sorry for the confusion.

I should have stated explicitly that with path 2 those 3 patches should not be applied, because the aim is not a fix but reproduction. Path 2 was intentionally left without the potential fix, as then a nice backtrace informs us when we can stop trying (which would hopefully occur pretty soon) :-). But let's discard that path 2...

I have 2 spare nodes so i'll run both. 1) is on already without any issues, i'm just compiling 2)

Below are the messages I got on 2) right after running ketchup (which does a wget www.kernel.org)

not a warning on 1) with your extra verbose patch.

I usually work on -mm, so what would be interesting for me is to have what you need in net-2.6.24 which is getting pulled in -mm by andrew. then, if you need an extra patch for verbosity, that's fine, i'll include it in my usual patchset.

Ah, I'm sorry about the subject and the extra work it caused,

no problem, that was a comment for the future patchset.

it was meant for DaveM only, didn't realize at that time it would be meaningful to you as well, thus couldn't warn you back then... Testing on top of mm would be (/ have been) fine as well... From my point of view both mm and net-2.6.24 are pretty much the same (I even verified that those patches apply fine on top of rc8-mm2 since I thought that you might want to use that one).

He, you might have solved it with 1). If not, I'm keeping the hardware for you.

Cheers, C.
WARNING: at /home/legoater/linux/2.6.23-rc8-mm2-tcp_fastretrans/net/ipv4/tcp_ipv4.c:193 tcp_verify_fackets() Call Trace: IRQ [8041aa86] tcp_verify_fackets+0x119/0x237 [80416e57] tcp_fragment+0x468/0x4b8 [804184a5] tcp_retransmit_skb+0xcf/0x2f4 [8041878d] tcp_xmit_retransmit_queue+0xc3/0x31e [8041220a] tcp_fastretrans_alert+0xb36/0xb43 [80412f0f] tcp_ack+0x5d3/0x71b [80415229] tcp_rcv_established+0x61f/0x6df [8025419a] __lock_acquire+0x8a1/0xf1b [8041c7ff] tcp_v4_do_rcv+0x3e/0x394 [8041d171] tcp_v4_rcv+0x61c/0x9a9 [804017e3] ip_local_deliver+0x1da/0x2a4 [8040214e] ip_rcv+0x583/0x5c9 [8046fe43] packet_rcv_spkt+0x19a/0x1a8 [803e2e1c] netif_receive_skb+0x2cf/0x2f5 [88042505] :tg3:tg3_poll+0x65d/0x8a4 [803e2fe8] net_rx_action+0xb8/0x191 [8023a9b7] __do_softirq+0x5f/0xe0 [8020c98c] call_softirq+0x1c/0x28 [8020e9c3] do_softirq+0x3b/0xb8 [8023aaae] irq_exit+0x4e/0x50 [8020e7df] do_IRQ+0xbd/0xd7 [80209cb9] mwait_idle+0x0/0x4d [8020bce6] ret_from_intr+0x0/0xf EOI [80209cfc] mwait_idle+0x43/0x4d [802099fb] enter_idle+0x22/0x24 [80209c4f] cpu_idle+0x9d/0xc0 [80479591] rest_init+0x55/0x57 [8063681f] start_kernel+0x2e0/0x2ec [80636134] _sinittext+0x134/0x13b WARNING: at /home/legoater/linux/2.6.23-rc8-mm2-tcp_fastretrans/net/ipv4/tcp_ipv4.c:198 tcp_verify_fackets() Call Trace: IRQ [80323bf6] vgacon_set_cursor_size+0x39/0xd5 [8041aad0] tcp_verify_fackets+0x163/0x237 [80416e57] tcp_fragment+0x468/0x4b8 [804184a5] tcp_retransmit_skb+0xcf/0x2f4 [8041878d] tcp_xmit_retransmit_queue+0xc3/0x31e [8041220a] tcp_fastretrans_alert+0xb36/0xb43 [80412f0f] tcp_ack+0x5d3/0x71b [80415229] tcp_rcv_established+0x61f/0x6df [8025419a] __lock_acquire+0x8a1/0xf1b [8041c7ff] tcp_v4_do_rcv+0x3e/0x394 [8041d171] tcp_v4_rcv+0x61c/0x9a9 [804017e3] ip_local_deliver+0x1da/0x2a4 [8040214e] ip_rcv+0x583/0x5c9 [8046fe43] packet_rcv_spkt+0x19a/0x1a8 [803e2e1c] netif_receive_skb+0x2cf/0x2f5 [88042505] :tg3:tg3_poll+0x65d/0x8a4 [803e2fe8] net_rx_action+0xb8/0x191 [8023a9b7] __do_softirq+0x5f/0xe0 [8020c98c] 
call_softirq+0x1c/0x28 [8020e9c3] do_softirq+0x3b/0xb8 [8023aaae] irq_exit+0x4e/0x50 [8020e7df] do_IRQ+0xbd/0xd7 [80209cb9] mwait_idle+0x0/0x4d [8020bce6] ret_from_intr+0x0/0xf EOI [80209cfc] mwait_idle+0x43/0x4d [802099fb] enter_idle+0x22/0x24 [80209c4f] cpu_idle+0x9d/0xc0 [80479591] rest_init+0x55/0x57 [8063681f] start_kernel+0x2e0/0x2ec [80636134] _sinittext+0x134/0x13b TCP wq(s) -S--SSS TCP wq(i) hf s4 f9 (47) p9 seq: su3460595874 hs3460607374 sn3460659962 (3460608822) WARNING: at /home/legoater/linux/2.6.23-rc8-mm2-tcp_fastretrans/net/ipv4/tcp_ipv4.c:193 tcp_verify_fackets() Call Trace: IRQ [80323bf6] vgacon_set_cursor_size+0x39/0xd5
Re: [PATCH net-2.6.24 0/3]: More TCP fixes
Cedric Le Goater wrote: Ilpo Järvinen wrote: On Wed, 3 Oct 2007, Cedric Le Goater wrote: Ilpo Järvinen wrote: Ah, that's path 1) then... Since you seem to have enough time, I would say that the path 1 is good as well and bugs unrelated to the fix will show up there too... arg. yes. sorry for the confusion. I should have stated it explicitly that with path 2 those 3 patches should not be applied because the aim is not a fix but reproducal. Path 2 was intentionally left without the potentional fix as then nice backtrace informs when we can stop trying (which would hopefully occurred pretty soon) :-). But lets discard that path 2... I have 2 spare nodes so i'll run both. 1) is on already without any issues i'm just compiling 2) Below are the messages I got on 2) right after running ketchup (which does a wget www.kernel.org) and a second run of ketchup gave the following. cheers, C. WARNING: at /home/legoater/linux/2.6.23-rc8-mm2-tcp_fastretrans/net/ipv4/tcp_ipv4.c:193 tcp_verify_fackets() Call Trace: IRQ [8041aa86] tcp_verify_fackets+0x119/0x237 [80416e57] tcp_fragment+0x468/0x4b8 [804184a5] tcp_retransmit_skb+0xcf/0x2f4 [8041878d] tcp_xmit_retransmit_queue+0xc3/0x31e [8041220a] tcp_fastretrans_alert+0xb36/0xb43 [80412f0f] tcp_ack+0x5d3/0x71b [80415229] tcp_rcv_established+0x61f/0x6df [8025419a] __lock_acquire+0x8a1/0xf1b [8041c7ff] tcp_v4_do_rcv+0x3e/0x394 [8041d171] tcp_v4_rcv+0x61c/0x9a9 [804017e3] ip_local_deliver+0x1da/0x2a4 [8040214e] ip_rcv+0x583/0x5c9 [8046fe43] packet_rcv_spkt+0x19a/0x1a8 [803e2e1c] netif_receive_skb+0x2cf/0x2f5 [88042505] :tg3:tg3_poll+0x65d/0x8a4 [803e2fe8] net_rx_action+0xb8/0x191 [8023a9b7] __do_softirq+0x5f/0xe0 [8020c98c] call_softirq+0x1c/0x28 [8020e9c3] do_softirq+0x3b/0xb8 [8023aaae] irq_exit+0x4e/0x50 [8020e7df] do_IRQ+0xbd/0xd7 [80209cb9] mwait_idle+0x0/0x4d [8020bce6] ret_from_intr+0x0/0xf EOI [80209cfc] mwait_idle+0x43/0x4d [802099fb] enter_idle+0x22/0x24 [80209c4f] cpu_idle+0x9d/0xc0 [80479591] rest_init+0x55/0x57 [8063681f] 
start_kernel+0x2e0/0x2ec [80636134] _sinittext+0x134/0x13b TCP wq(s) --S-- TCP wq(i) h s1 f5 (14) p6 seq: su110259658 hs110265450 sn110278722 (0) WARNING: at /home/legoater/linux/2.6.23-rc8-mm2-tcp_fastretrans/net/ipv4/tcp_ipv4.c:193 tcp_verify_fackets() Call Trace: IRQ [803250aa] vgacon_scroll+0x188/0x1dd [8041aa86] tcp_verify_fackets+0x119/0x237 [80416e57] tcp_fragment+0x468/0x4b8 [804184a5] tcp_retransmit_skb+0xcf/0x2f4 [8041878d] tcp_xmit_retransmit_queue+0xc3/0x31e [8041220a] tcp_fastretrans_alert+0xb36/0xb43 [80412f0f] tcp_ack+0x5d3/0x71b [80415229] tcp_rcv_established+0x61f/0x6df [8025419a] __lock_acquire+0x8a1/0xf1b [8041c7ff] tcp_v4_do_rcv+0x3e/0x394 [8041d171] tcp_v4_rcv+0x61c/0x9a9 [804017e3] ip_local_deliver+0x1da/0x2a4 [8040214e] ip_rcv+0x583/0x5c9 [8046fe43] packet_rcv_spkt+0x19a/0x1a8 [803e2e1c] netif_receive_skb+0x2cf/0x2f5 [88042505] :tg3:tg3_poll+0x65d/0x8a4 [803e2fe8] net_rx_action+0xb8/0x191 [8023a9b7] __do_softirq+0x5f/0xe0 [8020c98c] call_softirq+0x1c/0x28 [8020e9c3] do_softirq+0x3b/0xb8 [8023aaae] irq_exit+0x4e/0x50 [8020e7df] do_IRQ+0xbd/0xd7 [80209cb9] mwait_idle+0x0/0x4d [8020bce6] ret_from_intr+0x0/0xf EOI [80209cfc] mwait_idle+0x43/0x4d [802099fb] enter_idle+0x22/0x24 [80209c4f] cpu_idle+0x9d/0xc0 [80479591] rest_init+0x55/0x57 [8063681f] start_kernel+0x2e0/0x2ec [80636134] _sinittext+0x134/0x13b WARNING: at /home/legoater/linux/2.6.23-rc8-mm2-tcp_fastretrans/net/ipv4/tcp_ipv4.c:198 tcp_verify_fackets() Call Trace: IRQ [803250aa] vgacon_scroll+0x188/0x1dd [8041aad0] tcp_verify_fackets+0x163/0x237 [80416e57] tcp_fragment+0x468/0x4b8 [804184a5] tcp_retransmit_skb+0xcf/0x2f4 [8041878d] tcp_xmit_retransmit_queue+0xc3/0x31e [8041220a] tcp_fastretrans_alert+0xb36/0xb43 [80412f0f] tcp_ack+0x5d3/0x71b [80415229] tcp_rcv_established+0x61f/0x6df [8025419a] __lock_acquire+0x8a1/0xf1b [8041c7ff] tcp_v4_do_rcv+0x3e/0x394 [8041d171] tcp_v4_rcv+0x61c/0x9a9 [804017e3] ip_local_deliver+0x1da/0x2a4 [8040214e] ip_rcv+0x583/0x5c9 [8046fe43] 
packet_rcv_spkt+0x19a/0x1a8 [803e2e1c] netif_receive_skb+0x2cf/0x2f5 [88042505] :tg3:tg3_poll+0x65d/0x8a4 [803e2fe8] net_rx_action+0xb8/0x191 [8023a9b7] __do_softirq+0x5f/0xe0 [8020c98c] call_softirq+0x1c/0x28 [8020e9c3] do_softirq+0x3b/0xb8
Re: [PATCH net-2.6.24 0/3]: More TCP fixes
Cedric Le Goater wrote: Ilpo Järvinen wrote: On Wed, 3 Oct 2007, Cedric Le Goater wrote: Ilpo Järvinen wrote: Ah, that's path 1) then... Since you seem to have enough time, I would say that path 1 is good as well, and bugs unrelated to the fix will show up there too... arg. yes. sorry for the confusion. I should have stated explicitly that with path 2 those 3 patches should not be applied, because the aim is not a fix but reproduction. Path 2 was intentionally left without the potential fix, as then a nice backtrace informs us when we can stop trying (which would hopefully occur pretty soon) :-). But let's discard that path 2... I have 2 spare nodes so I'll run both. 1) is on already without any issues; I'm just compiling 2). Below are the messages I got on 2) right after running ketchup (which does a wget www.kernel.org) Not a warning on 1) with your extra verbose patch. bummer, I got this one on 1) :( C. WARNING: at /home/legoater/linux/net-2.6.24.git/net/ipv4/tcp_input.c:2325 tcp_fastretrans_alert() Call Trace: IRQ [8022ddb6] __wake_up+0x1f/0x4c [803fd9d3] tcp_ack+0xcee/0x18ac [80400764] tcp_rcv_established+0x61f/0x6df [8024e8d8] __lock_acquire+0x8a1/0xf1b [8040795b] tcp_v4_do_rcv+0x3e/0x394 [804082d5] tcp_v4_rcv+0x624/0x9b1 [803ecfa3] ip_local_deliver+0x1da/0x2a4 [803ed900] ip_rcv+0x57c/0x5c4 [8045ae53] packet_rcv_spkt+0x19a/0x1a8 [803ce78e] netif_receive_skb+0x2ba/0x2de [88044505] :tg3:tg3_poll+0x65d/0x8a4 [803ce958] net_rx_action+0xb8/0x191 [802385cb] __do_softirq+0x5f/0xe0 [8020c97c] call_softirq+0x1c/0x28 [8020e672] do_softirq+0x3b/0xb9 [802386c2] irq_exit+0x4e/0x50 [8020e48e] do_IRQ+0xbe/0xd8 [80209cb9] mwait_idle+0x0/0x4d [8020bcc6] ret_from_intr+0x0/0xf EOI [80464e10] __sched_text_start+0x5f0/0x62b [80464e10] __sched_text_start+0x5f0/0x62b [80209cfc] mwait_idle+0x43/0x4d [802099fb] enter_idle+0x22/0x24 [80209c4f] cpu_idle+0x9d/0xc0 [80464513] rest_init+0x57/0x59 [8060c82a] start_kernel+0x2d1/0x2dd [8060c14e] _sinittext+0x14e/0x155 WARNING: at 
/home/legoater/linux/net-2.6.24.git/net/ipv4/tcp_input.c:2325 tcp_fastretrans_alert() Call Trace: IRQ [8022ddb6] __wake_up+0x1f/0x4c [803fd9d3] tcp_ack+0xcee/0x18ac [80400764] tcp_rcv_established+0x61f/0x6df [8024e8d8] __lock_acquire+0x8a1/0xf1b [8040795b] tcp_v4_do_rcv+0x3e/0x394 [804082d5] tcp_v4_rcv+0x624/0x9b1 [803ecfa3] ip_local_deliver+0x1da/0x2a4 [803ed900] ip_rcv+0x57c/0x5c4 [8045ae53] packet_rcv_spkt+0x19a/0x1a8 [803ce78e] netif_receive_skb+0x2ba/0x2de [88044505] :tg3:tg3_poll+0x65d/0x8a4 [803ce958] net_rx_action+0xb8/0x191 [802385cb] __do_softirq+0x5f/0xe0 [8020c97c] call_softirq+0x1c/0x28 [8020e672] do_softirq+0x3b/0xb9 [802386c2] irq_exit+0x4e/0x50 [8020e48e] do_IRQ+0xbe/0xd8 [80209cb9] mwait_idle+0x0/0x4d [8020bcc6] ret_from_intr+0x0/0xf EOI [80464e10] __sched_text_start+0x5f0/0x62b [80464e10] __sched_text_start+0x5f0/0x62b [80209cfc] mwait_idle+0x43/0x4d [802099fb] enter_idle+0x22/0x24 [80209c4f] cpu_idle+0x9d/0xc0 [80464513] rest_init+0x57/0x59 [8060c82a] start_kernel+0x2d1/0x2dd - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.23-rc8-mm2 - tcp_fastretrans_alert() WARNING
Ilpo Järvinen wrote: On Sat, 29 Sep 2007, Cedric Le Goater wrote: Ilpo Järvinen wrote: On Fri, 28 Sep 2007, Ilpo Järvinen wrote: On Fri, 28 Sep 2007, Cedric Le Goater wrote: I just found that warning in my logs. It seems that it's been happening since rc7-mm1 at least. WARNING: at /home/legoater/linux/2.6.23-rc8-mm2/net/ipv4/tcp_input.c:2314 tcp_fastretrans_alert() Call Trace: IRQ [8040fdc3] tcp_ack+0xcd6/0x1894 ...snip... ...Thanks for the report, I'll have a look at what could still break fackets_out... I think this one is now clear to me: tcp_fragment/collapse adjusts fackets_out (incorrectly) also for reno flows when there were some dupACKs that made sacked_out != 0. Could you please check whether the patch below proves all of them to be of non-SACK origin... In case that's true, it's rather harmless, and I'll send a fix on Monday or so (this would be needed anyway)... If you find out that they occur with a SACK-enabled flow, that would be more interesting and would require more digging... I'm trying now to reproduce this WARNING. It seems that the n/w behaves differently during the weekends. Probably taking a break. Thanks. Of course there are other means too to determine whether TCP flows negotiate SACK or not. Depending on your test case (which is fully unknown to me) they may or may not be usable... At least the value of the tcp_sack sysctl on both systems, or tcpdump catching SYN packets, should give that detail. ...If you know which hosts TCP could be connected to (and active with) while the WARNING triggers, it's really easy to test what is being negotiated, as it's unlikely to change at short notice and any TCP flow to that host will give us the same information, though the WARNING would not be triggered with it at this time. Obviously if at least one of the remotes is not known, or the set ends up being a mixture of reno and SACK flows, then we'll just have to wait and see which fish we get... got it !
r3-06.test.meiosys.com login: WARNING: at /home/legoater/linux/2.6.23-rc8-mm2/net/ipv4/tcp_input.c:2314 tcp_fastretrans_alert() Call Trace: IRQ [8040fdc3] tcp_ack+0xcd6/0x18af [80412b6f] tcp_rcv_established+0x61f/0x6df [80254146] __lock_acquire+0x8a1/0xf1b [80419d19] tcp_v4_do_rcv+0x3e/0x394 [8041a68b] tcp_v4_rcv+0x61c/0x9a9 [803ff1e3] ip_local_deliver+0x1da/0x2a4 [803ffb4e] ip_rcv+0x583/0x5c9 [8046d35b] packet_rcv_spkt+0x19a/0x1a8 [803e081c] netif_receive_skb+0x2cf/0x2f5 [88042505] :tg3:tg3_poll+0x65d/0x8a4 [803e09e8] net_rx_action+0xb8/0x191 [8023a927] __do_softirq+0x5f/0xe0 [8020c98c] call_softirq+0x1c/0x28 [8020e9c3] do_softirq+0x3b/0xb8 [8023aa1e] irq_exit+0x4e/0x50 [8020e7df] do_IRQ+0xbd/0xd7 [80209cb9] mwait_idle+0x0/0x4d [8020bce6] ret_from_intr+0x0/0xf EOI [80209cfc] mwait_idle+0x43/0x4d [802099fb] enter_idle+0x22/0x24 [80209c4f] cpu_idle+0x9d/0xc0 [80476aa1] rest_init+0x55/0x57 [80630815] start_kernel+0x2d6/0x2e2 [80630134] _sinittext+0x134/0x13b TCP 0
I wasn't doing any particular test on n/w so it took me a while to figure out how I was triggering the WARNING. Apparently, this is happening when I run ketchup, but not always. This test machine is behind many firewall routers so that might be a reason. tcpdump gave me this output for a wget on kernel.org :
10:51:14.835981 IP r3-06.test.meiosys.com.40322 > pub2.kernel.org.http: S 737836267:737836267(0) win 5840 <mss 1460,sackOK,timestamp 1309245 0,nop,wscale 7>
10:51:14.975153 IP pub2.kernel.org.http > r3-06.test.meiosys.com.40321: F 524:524(0) ack 166 win 5840
10:51:14.975177 IP r3-06.test.meiosys.com.40321 > pub2.kernel.org.http: . ack 525 win 7504
I'm trying to get the WARNING and the tcpdump output for it but for the moment, it seems it's beyond my reach :/ Hope it helps ! C.
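As a side note for anyone sifting through similar captures: a minimal C sketch of the "check the SYN for SACK" step suggested in this message. It assumes the classic tcpdump text format (a standalone `S` flags token and `sackOK` in the option list); the function name `syn_offers_sack` is made up for illustration and is not part of any tool mentioned in the thread.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if one line of tcpdump text output describes a SYN
 * segment that advertises the SACK-permitted option.
 *
 * Assumption: the classic tcpdump text format, where the flags field
 * follows the destination as ": S " and the option list prints
 * "sackOK". Other tcpdump versions may format this differently.
 */
bool syn_offers_sack(const char *line)
{
	/* Only SYN (or SYN-ACK) segments carry the SACK-permitted option. */
	if (strstr(line, ": S ") == NULL)
		return false;

	return strstr(line, "sackOK") != NULL;
}
```

SACK is only used on a flow when both the SYN and the SYN-ACK carry the option, so both directions of the handshake would need to pass this check before a WARNING can be attributed to a SACK flow.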
Re: 2.6.23-rc8-mm2 - tcp_fastretrans_alert() WARNING
Ilpo Järvinen wrote: On Fri, 28 Sep 2007, Ilpo Järvinen wrote: On Fri, 28 Sep 2007, Cedric Le Goater wrote: I just found that warning in my logs. It seems that it's been happening since rc7-mm1 at least. WARNING: at /home/legoater/linux/2.6.23-rc8-mm2/net/ipv4/tcp_input.c:2314 tcp_fastretrans_alert() Call Trace: IRQ [8040fdc3] tcp_ack+0xcd6/0x1894 ...snip... ...Thanks for the report, I'll have a look at what could still break fackets_out... I think this one is now clear to me: tcp_fragment/collapse adjusts fackets_out (incorrectly) also for reno flows when there were some dupACKs that made sacked_out != 0. Could you please check whether the patch below proves all of them to be of non-SACK origin... In case that's true, it's rather harmless, and I'll send a fix on Monday or so (this would be needed anyway)... If you find out that they occur with a SACK-enabled flow, that would be more interesting and would require more digging... I'm trying now to reproduce this WARNING. It seems that the n/w behaves differently during the weekends. Probably taking a break. C.
Re: 2.6.23-rc8-mm2 - tcp_fastretrans_alert() WARNING
Hello ! Andrew Morton wrote: ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc8/2.6.23-rc8-mm2/ I just found that warning in my logs. It seems that it's been happening since rc7-mm1 at least. Thanks ! C. WARNING: at /home/legoater/linux/2.6.23-rc8-mm2/net/ipv4/tcp_input.c:2314 tcp_fastretrans_alert() Call Trace: IRQ [8040fdc3] tcp_ack+0xcd6/0x1894 [80411c79] tcp_data_queue+0x5be/0xae7 [80412b54] tcp_rcv_established+0x61f/0x6df [80254146] __lock_acquire+0x8a1/0xf1b [80419cfd] tcp_v4_do_rcv+0x3e/0x394 [8041a66f] tcp_v4_rcv+0x61c/0x9a9 [803ff1e3] ip_local_deliver+0x1da/0x2a4 [803ffb4e] ip_rcv+0x583/0x5c9 [8046d33f] packet_rcv_spkt+0x19a/0x1a8 [803e081c] netif_receive_skb+0x2cf/0x2f5 [88042505] :tg3:tg3_poll+0x65d/0x8a4 [803e09e8] net_rx_action+0xb8/0x191 [8023a927] __do_softirq+0x5f/0xe0 [8020c98c] call_softirq+0x1c/0x28 [8020e9c3] do_softirq+0x3b/0xb8 [8023aa1e] irq_exit+0x4e/0x50 [8020e7df] do_IRQ+0xbd/0xd7 [80209cb9] mwait_idle+0x0/0x4d [8020bce6] ret_from_intr+0x0/0xf EOI [80209cfc] mwait_idle+0x43/0x4d [802099fb] enter_idle+0x22/0x24 [80209c4f] cpu_idle+0x9d/0xc0 [80476a91] rest_init+0x55/0x57 [80630815] start_kernel+0x2d6/0x2e2 [80630134] _sinittext+0x134/0x13b - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] net: Add network namespace clone unshare support.
Andrew Morton wrote: On Fri, 28 Sep 2007 11:12:13 +0200 Cedric Le Goater [EMAIL PROTECTED] wrote: Cedric made a good point that we will have conflicts of code being added to the same place in nsproxy.c and the like. So I copied Andrew to give him a heads up. here's a suggestion: we could keep the net namespace unshare patch out of david's tree, let andrew merge and release a new -mm and then send the net namespace unshare patch to andrew. that should keep nsproxy out of andrew's merge challenge. But david's tree will miss the unshare part for a while. This patch only generates two rejects against the current -mm poop pile. That's insignificant. We don't need to do anything special to merge a little patch like this one. Thanks Andrew. C.
Re: [PATCH] net: Add network namespace clone unshare support.
Eric W. Biederman wrote: David Miller [EMAIL PROTECTED] writes: Eric, pick an appropriate new non-conflicting number NOW. Done. My apologies for the confusion. I thought that, the way Cedric and the IBM guys were testing, someone would have shouted at me long before now. This adds unnecessary extra work for Andrew Morton, which he has enough of already. Cedric made a good point that we will have conflicts of code being added to the same place in nsproxy.c and the like. So I copied Andrew to give him a heads up. here's a suggestion: we could keep the net namespace unshare patch out of david's tree, let andrew merge and release a new -mm and then send the net namespace unshare patch to andrew. that should keep nsproxy out of andrew's merge challenge. But david's tree will miss the unshare part for a while. As for the clone flags, the values *must not* conflict but the patches probably will. C. I will gladly do what I can to help. Working against 3 trees at the moment is a bit of a development challenge. Eric ___ Containers mailing list [EMAIL PROTECTED] https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH] net: Add network namespace clone unshare support.
Eric W. Biederman wrote:

This patch allows you to create a new network namespace using sys_clone, or sys_unshare. As the network namespace is still experimental and under development, clone and unshare support is only made available when CONFIG_NET_NS is selected at compile time. As this patch introduces network namespace support into code paths that exist when CONFIG_NET is not selected, there are a few additions made to net_namespace.h to allow a few more functions to be used when the networking stack is not compiled in.

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
---
 include/linux/sched.h       |    1 +
 include/net/net_namespace.h |   18 ++
 kernel/fork.c               |    3 ++-
 kernel/nsproxy.c            |   15 +--
 net/Kconfig                 |    8
 net/core/net_namespace.c    |   43 +--
 6 files changed, 83 insertions(+), 5 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index a01ac6d..e10a0a8 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -27,6 +27,7 @@
 #define CLONE_NEWUTS 0x0400 /* New utsname group? */
 #define CLONE_NEWIPC 0x0800 /* New ipcs */
 #define CLONE_NEWUSER 0x1000 /* New user namespace */
+#define CLONE_NEWNET 0x2000 /* New network namespace */

This new flag is going to conflict with the pid namespace flag CLONE_NEWPID in -mm. It might be worth changing it to:

#define CLONE_NEWNET 0x4000

The changes in nsproxy.c and fork.c will also conflict but I don't think we can do much about it for now.

C.

 /*
  * Scheduling policies
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index ac8f830..3ea4194 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -38,11 +38,23 @@ extern struct net init_net;
 extern struct list_head net_namespace_list;
 
+#ifdef CONFIG_NET
+extern struct net *copy_net_ns(unsigned long flags, struct net *net_ns);
+#else
+static inline struct net *copy_net_ns(unsigned long flags, struct net *net_ns)
+{
+	/* There is nothing to copy so this is a noop */
+	return net_ns;
+}
+#endif
+
 extern void __put_net(struct net *net);
 
 static inline struct net *get_net(struct net *net)
 {
+#ifdef CONFIG_NET
 	atomic_inc(&net->count);
+#endif
 	return net;
 }
 
@@ -60,19 +72,25 @@ static inline struct net *maybe_get_net(struct net *net)
 
 static inline void put_net(struct net *net)
 {
+#ifdef CONFIG_NET
 	if (atomic_dec_and_test(&net->count))
 		__put_net(net);
+#endif
 }
 
 static inline struct net *hold_net(struct net *net)
 {
+#ifdef CONFIG_NET
 	atomic_inc(&net->use_count);
+#endif
 	return net;
 }
 
 static inline void release_net(struct net *net)
 {
+#ifdef CONFIG_NET
 	atomic_dec(&net->use_count);
+#endif
 }
 
 extern void net_lock(void);
diff --git a/kernel/fork.c b/kernel/fork.c
index 33f12f4..5e67f90 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1608,7 +1608,8 @@ asmlinkage long sys_unshare(unsigned long unshare_flags)
 	err = -EINVAL;
 	if (unshare_flags & ~(CLONE_THREAD|CLONE_FS|CLONE_NEWNS|CLONE_SIGHAND|
 				CLONE_VM|CLONE_FILES|CLONE_SYSVSEM|
-				CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWUSER))
+				CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWUSER|
+				CLONE_NEWNET))
 		goto bad_unshare_out;
 
 	if ((err = unshare_thread(unshare_flags)))
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index a4fb7d4..f1decd2 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -20,6 +20,7 @@
 #include <linux/mnt_namespace.h>
 #include <linux/utsname.h>
 #include <linux/pid_namespace.h>
+#include <net/net_namespace.h>
 
 static struct kmem_cache *nsproxy_cachep;
 
@@ -98,8 +99,17 @@ static struct nsproxy *create_new_namespaces(unsigned long flags,
 		goto out_user;
 	}
 
+	new_nsp->net_ns = copy_net_ns(flags, tsk->nsproxy->net_ns);
+	if (IS_ERR(new_nsp->net_ns)) {
+		err = PTR_ERR(new_nsp->net_ns);
+		goto out_net;
+	}
+
 	return new_nsp;
 
+out_net:
+	if (new_nsp->user_ns)
+		put_user_ns(new_nsp->user_ns);
 out_user:
 	if (new_nsp->pid_ns)
 		put_pid_ns(new_nsp->pid_ns);
@@ -132,7 +142,7 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk)
 
 	get_nsproxy(old_ns);
 
-	if (!(flags & (CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC | CLONE_NEWUSER)))
+	if (!(flags & (CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC | CLONE_NEWUSER | CLONE_NEWNET)))
 		return 0;
 
 	if (!capable(CAP_SYS_ADMIN)) {
@@ -164,6 +174,7 @@ void free_nsproxy(struct nsproxy *ns)
 		put_pid_ns(ns->pid_ns);
 	if (ns->user_ns)
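To make the flag-conflict concern in this message concrete, here is a small userspace sketch in C. It is only an illustration: the `DEMO_*` names are hypothetical, and the bit values are assumptions that follow the 32-bit clone flag layout of later mainline kernels rather than the (apparently truncated) numbers shown in the quoted patch. `check_unshare_flags()` mimics the shape of the mask test in `sys_unshare()`; it is not the kernel function.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for clone flags; values are assumed for illustration. */
#define DEMO_NEWUSER      0x10000000UL
#define DEMO_NEWPID       0x20000000UL
#define DEMO_NEWNET_CLASH 0x20000000UL /* same bit as DEMO_NEWPID: a conflict */
#define DEMO_NEWNET       0x40000000UL /* distinct bit: no conflict */

/* Two flags conflict when they share at least one bit. */
bool flags_collide(unsigned long a, unsigned long b)
{
	return (a & b) != 0;
}

/*
 * Reject any request that sets a bit outside the allowed mask,
 * in the same style as the "unshare_flags & ~(...)" test in the patch.
 */
int check_unshare_flags(unsigned long flags, unsigned long allowed)
{
	if (flags & ~allowed)
		return -EINVAL;
	return 0;
}
```

With a colliding value, a request for a new network namespace would be indistinguishable from a request for a new pid namespace, which is why the values must not conflict even if the patches themselves do.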
Re: [PATCH 0/12] L2 network namespace (v3)
Dmitry Mishin wrote: This is an update of the L2 network namespaces patches. They are applicable to Cedric's 2.6.20-rc4-mm1-lxc2 tree.
Changes:
- updated to 2.6.20-rc4-mm1-lxc2
- current network context is per-CPU now
- fixed compilation without CONFIG_NET_NS
The changed current context definition should fix all the issues mentioned by Cedric:
- the nsproxy backpointer is unnecessary now, thus removed;
- the push_net_ns() and pop_net_ns() use a per-CPU variable now;
- there is no race on ->nsproxy between push_net_ns() and exit_task_namespaces() because they deal with different pointers.
great ! Will integrate ASAP with daniel's l3 patches and resend. thanks, C.
Re: [RFC] network namespaces
Eric W. Biederman wrote: This family of containers is also used for HPC (high performance computing) and for distributed checkpoint/restart. The cluster runs hundreds of jobs, spawning them on different hosts inside an application container. Usually the jobs communicate with broadcast and multicast. Application containers do not care about having different MAC addresses and rely on a layer 3 approach. Ok, I think to understand this we need some precise definitions. In the normal case it is an error for a job to communicate with a different job. hmm ? What about an MPI application ? I would expect each MPI task to be run in its own container, on different nodes or on the same node. These individual tasks _communicate_ with each other through the MPI layer (not only TCP btw) to complete a large calculation. The basic advantage of a different MAC is that you can find out who the intended recipient is sooner in the networking stack, and you have truly separate network devices, allowing for a cleaner implementation. Changing the MAC after migration is likely to be fine. indeed. C.
Re: [RFC] network namespaces
Kir Kolyshkin wrote: [snip] I am not sure about network isolation (used by Linux-VServer), but when it comes to level 2 vs. level 3 virtualization, I see a need for both. Here is an easy-to-understand comparison which can shed some light: http://wiki.openvz.org/Differences_between_venet_and_veth
thanks kir,
Here are a couple of examples:
* Do we want to let the container's owner (i.e. root) add/remove IP addresses? Most probably not, but in some cases we want that.
* Do we want to be able to run a DHCP server and/or DHCP client inside a container? Sometimes... but not always.
* Do we want to let the container's owner create/manage his own set of iptables? In half of the cases we do.
The problem here is that a single solution will not cover all those scenarios.
some would argue that there is one single solution : Xen or similar. IMO, containers should try to leverage their difference, performance, and not try to simulate a real hardware environment. Restricting the network environment of a container should be considered acceptable if this is for the sake of performance. The network interface(s) could be pre-configured and provided to the container. Protocol(s) could be forbidden. Now, if you need more network power in a container, you will need a real or a virtualized interface. But let's consider both alternatives. thanks, C.
Re: strict isolation of net interfaces
Serge E. Hallyn wrote: The last one in your diagram confuses me - why foo0:1? I would have thought it'd be

just thinking aloud. I thought that any kind/type of interface could be mapped from host to guest.

host          | guest 0   | guest 1 | guest2
--------------+-----------+---------+--------
  |           |           |         |
  |- l0 ------+- lo0 ...  | lo0     | lo0
  |           |           |         |
  |- eth0     |           |         |
  |           |           |         |
  |- veth0 ---+- eth0     |         |
  |           |           |         |
  |- veth1 ---+-----------+---------+- eth0
  |           |           |         |
  |- veth2 ---+-----------+- eth0   |

I think we should avoid using device aliases, as trying to do something like giving eth0:1 to guest1 and eth0:2 to guest2 while hiding eth0:1 from guest2 requires some uglier code (as I recall) than working with full devices. In other words, if a namespace can see eth0, and eth0:2 exists, it should always see eth0:2. So conceptually using a full virtual net device per container certainly seems cleaner to me, and it seems like it should be simpler by way of statistics gathering etc, but are there actually any real gains? Or is the support for multiple IPs per device actually enough? Herbert, is this basically how ngnet is supposed to work?
Re: Network namespaces a path to mergable code.
Eric W. Biederman wrote: Despite what it might look like, unix domain sockets do not live in the filesystem. They store a cookie in the filesystem that roughly corresponds to the port number of an AF_INET socket. When you open a socket the lookup is done by the cookie retrieved from the filesystem. unix domain socket lookup uses a path_lookup for sockets in the filesystem namespace and a find_by_name for sockets in the abstract namespace. So except for their cookies unix domain sockets are always in the network stack. what is that cookie ? the file dentry and mnt ref ? so, ok, the resulting struct sock is part of the network namespace, but there is a bridge with the filesystem namespace which does not prevent other namespaces from doing a lookup. the lookup routine needs to be changed; this is necessary anyway for the abstract namespace. I think we're reaching the limits of namespaces. It would be much easier with a container id in each kernel object we want to isolate. C.
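For reference, the two kinds of unix socket names discussed here look like this from userspace. This is a hedged sketch, not kernel code: `make_unix_addr` is a hypothetical helper that only fills in a `struct sockaddr_un`, using the Linux convention that an abstract name starts with a NUL byte in `sun_path`, while a filesystem name is an ordinary path.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Fill addr with either a filesystem name (abstract == 0) or a Linux
 * abstract-namespace name (abstract != 0). Returns the address length
 * to pass to bind()/connect(), or 0 if the name is empty or too long.
 * The size check is slightly conservative for abstract names.
 */
socklen_t make_unix_addr(struct sockaddr_un *addr, const char *name, int abstract)
{
	size_t len = strlen(name);
	size_t off = abstract ? 1 : 0; /* abstract names begin with a NUL byte */

	if (len == 0 || off + len >= sizeof(addr->sun_path))
		return 0;

	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	memcpy(addr->sun_path + off, name, len);

	/*
	 * Filesystem names count the terminating NUL; abstract names count
	 * the leading NUL instead. Either way: header + one NUL + name.
	 */
	return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + len);
}
```

A filesystem name is resolved through a path lookup, so it is subject to whatever the mount namespace exposes; an abstract name never touches the filesystem, which is why its lookup has to be scoped separately.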
Re: [RFC] Network namespaces a path to mergable code.
Hello, Eric W. Biederman wrote: Thinking about this, I am going to suggest a slightly different direction to get a patchset we can merge. First we concentrate on the fundamentals.
- How we mark a device as belonging to a specific network namespace.
- How we mark a socket as belonging to a specific network namespace.
As part of the fundamentals we add a patch to the generic socket code that by default will disable it for protocol families that do not indicate support for handling network namespaces, on a non-default network namespace. I think that gives us a path that will allow us to convert the network stack one protocol family at a time instead of in one big lump. Stubbing off the sysfs and sysctl interfaces in the first round for the non-default namespaces, as you have done, should be good enough. The reason for the suggestion is that most of the work for the protocol stacks (ipv4, ipv6, af_packet, af_unix) is largely noise, and simple replacement without real design work happening. Mostly it is just tweaking the code to remove global variables, and doing a couple of lookups.
How does that proposal differ from Daniel's initial patchset ? how far was that patchset from reaching a similar agreement ? OK, i wear blue socks :), but I'm not advocating one patchset more than another; i'm just looking for a shorter path. thanks, C.
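The "disable by default for unconverted families" idea can be modelled in a few lines of userspace C. Everything below is a hypothetical sketch: the `demo_*` types, the `netns_ok` field and `demo_sock_create()` are invented for illustration and do not correspond to any structure or function in the patches under discussion.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a network namespace. */
struct demo_net {
	int id;
};

/* Hypothetical protocol family descriptor. */
struct demo_family {
	int family;
	bool netns_ok; /* true once the family has been converted */
};

/* Each socket records the namespace it belongs to. */
struct demo_sock {
	const struct demo_net *net;
	int family;
};

/* The default (initial) namespace, where every family keeps working. */
const struct demo_net demo_init_net = { 0 };

/*
 * Generic creation path: families that have not declared namespace
 * support are refused outside the default namespace.
 */
int demo_sock_create(const struct demo_net *net,
		     const struct demo_family *fam,
		     struct demo_sock *sk)
{
	if (net != &demo_init_net && !fam->netns_ok)
		return -EAFNOSUPPORT;

	sk->net = net;
	sk->family = fam->family;
	return 0;
}
```

The appeal of this shape is that each family can flip its flag independently once its global state has been made per-namespace, so conversion can proceed one family at a time.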