net-2.6.24 rebased
Available as usual at:

	kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.24.git

I resolved the SCTP and network driver conflicts that come standard with a rebase to Linus's current tree.

We're up to 700 changesets and an 8.7 MB patch, w00t!

I've been using it for an hour or so on my workstation so something works. It also passes an allmodconfig build on sparc64.

Either later tonight or some time tomorrow I'll start hitting the patch backlog in my inbox.

Thanks.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch 3/3] git-net: sctp build fix (not for applying)
From: Vlad Yasevich <[EMAIL PROTECTED]>
Date: Wed, 03 Oct 2007 09:50:55 -0400

> [EMAIL PROTECTED] wrote:
> > From: Andrew Morton <[EMAIL PROTECTED]>
> >
> > net/sctp/sm_statetable.c:551: error: 'sctp_sf_tabort_8_4_8' undeclared here (not in a function)
> >
>
> Andrew, is this a result of the merge of net-2.6.24 with net-2.6?

Actually, it is a result of merging with Linus's tree since your SCTP bits were there already, that's why Andrew hit this.

> That's the only way I see this happening.

Right. I'll resolve this cleanly as I rebase net-2.6.24 today.
Re: [PATCH] net: fix race in process_backlog
From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Wed, 3 Oct 2007 15:05:19 -0700

> On Wed, 03 Oct 2007 14:58:07 -0700 (PDT)
> David Miller <[EMAIL PROTECTED]> wrote:
>
> > From: Peter Zijlstra <[EMAIL PROTECTED]>
> > Date: Wed, 03 Oct 2007 17:44:53 +0200
> >
> > > Index: linux-2.6/net/core/dev.c
> > > ===
> > > --- linux-2.6.orig/net/core/dev.c
> > > +++ linux-2.6/net/core/dev.c
> > > @@ -2095,11 +2095,11 @@ static int process_backlog(struct napi_s
> > >
> > >  		local_irq_disable();
> > >  		skb = __skb_dequeue(&queue->input_pkt_queue);
> > > -		local_irq_enable();
> > >  		if (!skb) {
> > > -			napi_complete(napi);
> > > +			__napi_complete(napi);
> > >  			break;
> > >  		}
> > > +		local_irq_enable();
> >
> > What re-enables interrupts in the !skb path?
>
> This looks like a better fix. the irq_enable is needed in both cases.

Yep, applied, thanks Peter and Stephen.
[PATCH] r8169: revert part of 6dccd16b7c2703e8bbf8bca62b5cf248332afbe2
The 8169/8110SC currently announces itself as:

[...]
eth0: RTL8169sc/8110sc at 0x, ..:..:..:..:..:.., XID 1800 IRQ ..

It uses RTL_GIGA_MAC_VER_05 and this part of the changeset can cut its performance by a factor of 2~2.5 as reported by Timo.

(the driver includes code just before the hunk to write the ChipCmd register when mac_version == RTL_GIGA_MAC_VER_0[1-4])

Signed-off-by: Francois Romieu <[EMAIL PROTECTED]>
Cc: Timo Jantunen <[EMAIL PROTECTED]>
---
 drivers/net/r8169.c |   16 +---
 1 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index c921ec3..c76dd29 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -1918,7 +1918,11 @@ static void rtl_hw_start_8169(struct net_device *dev)

 	rtl_set_rx_max_size(ioaddr);

-	rtl_set_rx_tx_config_registers(tp);
+	if ((tp->mac_version == RTL_GIGA_MAC_VER_01) ||
+	    (tp->mac_version == RTL_GIGA_MAC_VER_02) ||
+	    (tp->mac_version == RTL_GIGA_MAC_VER_03) ||
+	    (tp->mac_version == RTL_GIGA_MAC_VER_04))
+		rtl_set_rx_tx_config_registers(tp);

 	tp->cp_cmd |= rtl_rw_cpluscmd(ioaddr) | PCIMulRW;

@@ -1941,6 +1945,14 @@ static void rtl_hw_start_8169(struct net_device *dev)

 	rtl_set_rx_tx_desc_registers(tp, ioaddr);

+	if ((tp->mac_version != RTL_GIGA_MAC_VER_01) &&
+	    (tp->mac_version != RTL_GIGA_MAC_VER_02) &&
+	    (tp->mac_version != RTL_GIGA_MAC_VER_03) &&
+	    (tp->mac_version != RTL_GIGA_MAC_VER_04)) {
+		RTL_W8(ChipCmd, CmdTxEnb | CmdRxEnb);
+		rtl_set_rx_tx_config_registers(tp);
+	}
+
 	RTL_W8(Cfg9346, Cfg9346_Lock);

 	/* Initially a 10 us delay. Turned it into a PCI commit. - FR */
@@ -1955,8 +1967,6 @@ static void rtl_hw_start_8169(struct net_device *dev)

 	/* Enable all known interrupts by setting the interrupt mask. */
 	RTL_W16(IntrMask, tp->intr_event);
-
-	RTL_W8(ChipCmd, CmdTxEnb | CmdRxEnb);
 }

 static void rtl_hw_start_8168(struct net_device *dev)
-- 
1.5.3.2
Re: [PATCH v4] qe: miscellaneous code improvements and fixes to the QE library
On Oct 3, 2007, at 1:00 PM, Timur Tabi wrote:

> Stephen Hemminger wrote:
>
>> Separate the changes into individual patches to allow for better comment/review and bisection in case of regression.
>
> That would be too difficult. Some of the changes are single lines, and this patch has already been approved -- I just cross-posted to netdev because I made a few ucc_geth changes that can't be decoupled from the powerpc changes. A series of 18 patches would just be convoluted.

Normally I would agree, but at this point I'm not going to gripe too much about it.

- k
Re: [git patches] net driver updates
From: Jeff Garzik <[EMAIL PROTECTED]>
Date: Wed, 3 Oct 2007 14:39:16 -0400

>
> Normally I wait a day or two between pushes, to queue up patches and
> also to avoid annoying my upstream :) But this includes a couple fixes
> I felt should be upstreamed sooner rather than later.
>
> Please pull from 'upstream' branch of
> master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git upstream

Pulled, thanks Jeff!
Re: [PATCH] net: fix race in process_backlog
On Wed, 03 Oct 2007 14:58:07 -0700 (PDT)
David Miller <[EMAIL PROTECTED]> wrote:

> From: Peter Zijlstra <[EMAIL PROTECTED]>
> Date: Wed, 03 Oct 2007 17:44:53 +0200
>
> > Index: linux-2.6/net/core/dev.c
> > ===
> > --- linux-2.6.orig/net/core/dev.c
> > +++ linux-2.6/net/core/dev.c
> > @@ -2095,11 +2095,11 @@ static int process_backlog(struct napi_s
> >
> >  		local_irq_disable();
> >  		skb = __skb_dequeue(&queue->input_pkt_queue);
> > -		local_irq_enable();
> >  		if (!skb) {
> > -			napi_complete(napi);
> > +			__napi_complete(napi);
> >  			break;
> >  		}
> > +		local_irq_enable();
>
> What re-enables interrupts in the !skb path?

This looks like a better fix. the irq_enable is needed in both cases.

--- a/net/core/dev.c	2007-09-27 07:19:10.0 -0700
+++ b/net/core/dev.c	2007-10-03 15:03:54.0 -0700
@@ -2077,12 +2077,14 @@ static int process_backlog(struct napi_s

 		local_irq_disable();
 		skb = __skb_dequeue(&queue->input_pkt_queue);
-		local_irq_enable();
 		if (!skb) {
-			napi_complete(napi);
+			__napi_complete(napi);
+			local_irq_enable();
 			break;
 		}

+		local_irq_enable();
+
 		dev = skb->dev;

 		netif_receive_skb(skb);

-- 
Stephen Hemminger <[EMAIL PROTECTED]>
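
The invariant behind the revised patch above (the empty-queue check and the completion happen with interrupts off, and interrupts come back on along both exits) can be sketched in userspace. This is only a rough model, not kernel code: every model_* name is made up for illustration, a plain flag stands in for the CPU interrupt state, and a counter stands in for input_pkt_queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel primitives (assumed names, not real APIs). */
static bool irqs_enabled = true;

static void model_irq_disable(void) { irqs_enabled = false; }
static void model_irq_enable(void)  { irqs_enabled = true; }
static bool model_irqs_enabled(void) { return irqs_enabled; }

/* Plays the role of __napi_complete(): caller must already have IRQs off. */
static void model_napi_complete_locked(bool *scheduled)
{
	assert(!irqs_enabled);
	*scheduled = false;
}

struct model_queue {
	int pending;	/* packets waiting in the backlog */
};

/* Returns the number of packets processed, at most quota. */
static int model_process_backlog(struct model_queue *q, bool *scheduled, int quota)
{
	int work = 0;

	while (work < quota) {
		bool have_pkt;

		model_irq_disable();
		have_pkt = q->pending > 0;
		if (have_pkt)
			q->pending--;
		if (!have_pkt) {
			/* Empty check and completion are one atomic step. */
			model_napi_complete_locked(scheduled);
			model_irq_enable();	/* the !skb path re-enables too */
			break;
		}
		model_irq_enable();
		work++;			/* stands in for netif_receive_skb() */
	}
	return work;
}
```

In this model, dropping the model_irq_enable() call in the empty branch reproduces the problem raised in the review question: the function would return with interrupts still disabled.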
Re: [PATCH] net: fix race in process_backlog
From: Peter Zijlstra <[EMAIL PROTECTED]>
Date: Wed, 03 Oct 2007 17:44:53 +0200

> Index: linux-2.6/net/core/dev.c
> ===
> --- linux-2.6.orig/net/core/dev.c
> +++ linux-2.6/net/core/dev.c
> @@ -2095,11 +2095,11 @@ static int process_backlog(struct napi_s
>
>  		local_irq_disable();
>  		skb = __skb_dequeue(&queue->input_pkt_queue);
> -		local_irq_enable();
>  		if (!skb) {
> -			napi_complete(napi);
> +			__napi_complete(napi);
>  			break;
>  		}
> +		local_irq_enable();

What re-enables interrupts in the !skb path?
Re: Please pull 'upstream-davem' branch of wireless-2.6
From: "John W. Linville" <[EMAIL PROTECTED]>
Date: Wed, 3 Oct 2007 10:10:51 -0400

> So I'm not sure what happened for you. But I think it must have been
> some other anomaly.

Ok, I'll take some detailed notes next time it happens so we can figure out why :-)
Re: tcp bw in 2.6
On Wed, Oct 03, 2007 at 02:23:58PM -0700, Larry McVoy wrote:
> > A few notes to the discussion. I've seen one e1000 "bug" that ended up being
> > a crappy AMD pre-opteron SMP chipset with a totally useless PCI bus
> > implementation, which limited performance quite a bit-totally depending on
> > what you plugged in and in which slot. 10e milk-and-bread-store
> > 32/33 gige nics actually were better than server-class e1000's
> > in those, but weren't that great either.
>
> That could well be my problem, this is a dual processor (not core) athlon
> (not opteron) tyan motherboard if I recall correctly.

If it's AMD760/768MPX, here's some relevant discussion:

http://lkml.org/lkml/2002/7/18/292
http://www.ussg.iu.edu/hypermail/linux/kernel/0307.1/1109.html
http://www.ussg.iu.edu/hypermail/linux/kernel/0307.1/1154.html
http://www.ussg.iu.edu/hypermail/linux/kernel/0307.1/1212.html
http://forums.2cpu.com/showthread.php?s=&threadid=31211

> > Check your interrupt rates for the interface. You shouldn't be getting
> > anywhere near 1 interrupt/packet. If you are, something is badly wrong :).
>
> The acks (because I'm sending) are about 1.5 packets/interrupt.
> When this box is receiving it's moving about 3x as much data
> and has a _lower_ (absolute, not per packet) interrupt load.

Probably not a problem then, since those acks probably cover many sent packets. Current interrupt mitigation schemes are pretty dynamic, balancing between latency and bulk performance, so the acks might be fine (thousands vs. tens of thousands/sec).

-- 
Pekka Pietikainen
Re: tcp bw in 2.6
On Tue, Oct 02, 2007 at 02:21:32PM -0700, Larry McVoy wrote:
> More data, sky2 works fine (really really fine, like 79MB/sec) between
> Linux dylan.bitmover.com 2.6.18.1 #5 SMP Mon Oct 23 17:36:00 PDT 2006 i686
> Linux steele 2.6.20-16-generic #2 SMP Sun Sep 23 18:31:23 UTC 2007 x86_64
>
> So this is looking like a e1000 bug. I'll try to upgrade the kernel on
> the ia64 box and see what happens.

A few notes to the discussion. I've seen one e1000 "bug" that ended up being a crappy AMD pre-opteron SMP chipset with a totally useless PCI bus implementation, which limited performance quite a bit, depending entirely on what you plugged in and in which slot. 10e milk-and-bread-store 32/33 gige nics actually were better than server-class e1000's in those, but weren't that great either.

One thing worth trying is recv(.., MSG_TRUNC) on the receiver; that tests the theoretical sender maximum performance much better (but memory bandwidth vs. GigE is much higher these days than it was in 2001, so maybe it's not that useful anymore).

Check your interrupt rates for the interface. You shouldn't be getting anywhere near 1 interrupt/packet. If you are, something is badly wrong :).

Running getsockopt(...TCP_INFO) every few secs on the socket and printing that out can be useful too. That gives you both sides' idea on what the tcp windows etc. are.

My favourite tool is a home-made thing called yantt btw. (http://www.ee.oulu.fi/~pp/yantt.tgz). It needs lots of cleanup love: it mucks with the window sizes by default, since in the 2.4 days you really had to do that to get any kind of performance, and the help text is wrong. But it's pretty easy to hack to try out new ideas, use sendfile/MSG_TRUNC/TCP_INFO etc.

Netperf is the kitchen sink of network benchmark tools. But trying out a few tiny things with it is not fun at all, I tried and quickly decided to write my own tool for my master's thesis work ;-)

Oh. Don't measure CPU usage with top. Use a cyclesoaker (google for cyclesoak, I included akpm's with yantt) :-)

And yes. TCP stacks do have bugs, especially when things get outside the equipment most people have. Having a dedicated transatlantic 2.5Gbps connection found a really fun one a long time ago ;)

-- 
Pekka Pietikainen
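
For the getsockopt(...TCP_INFO) suggestion above, a minimal Linux-only sketch could look like the following. The helper names read_tcp_info() and print_tcp_info() are made up for this example; struct tcp_info and its field names are the ones glibc exposes in <netinet/tcp.h>, and which fields are worth printing is a judgment call rather than anything the message prescribes.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Fill *out for TCP socket fd; returns 0 on success, -1 on error (errno set). */
static int read_tcp_info(int fd, struct tcp_info *out)
{
	socklen_t len = sizeof(*out);

	assert(out != NULL);
	memset(out, 0, sizeof(*out));
	return getsockopt(fd, IPPROTO_TCP, TCP_INFO, out, &len);
}

/* Print a few fields that matter when chasing throughput problems. */
static void print_tcp_info(const struct tcp_info *ti)
{
	printf("rtt=%uus rttvar=%uus snd_cwnd=%u snd_mss=%u rcv_space=%u retrans=%u\n",
	       ti->tcpi_rtt, ti->tcpi_rttvar, ti->tcpi_snd_cwnd,
	       ti->tcpi_snd_mss, ti->tcpi_rcv_space, ti->tcpi_total_retrans);
}
```

Calling read_tcp_info() on the benchmark socket every few seconds on both hosts and comparing the printed lines gives each side's view of the congestion window and RTT.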
Re: tcp bw in 2.6
> A few notes to the discussion. I've seen one e1000 "bug" that ended up being
> a crappy AMD pre-opteron SMP chipset with a totally useless PCI bus
> implementation, which limited performance quite a bit-totally depending on
> what you plugged in and in which slot. 10e milk-and-bread-store
> 32/33 gige nics actually were better than server-class e1000's
> in those, but weren't that great either.

That could well be my problem, this is a dual processor (not core) athlon (not opteron) tyan motherboard if I recall correctly.

> Check your interrupt rates for the interface. You shouldn't be getting
> anywhere near 1 interrupt/packet. If you are, something is badly wrong :).

The acks (because I'm sending) are about 1.5 packets/interrupt. When this box is receiving it's moving about 3x as much data and has a _lower_ (absolute, not per packet) interrupt load.

-- 
---
Larry McVoy                lm at bitmover.com           http://www.bitkeeper.com
Re: lockdep report from bonding.
On Wed, Oct 03, 2007 at 01:05:14PM -0400, Dave Jones wrote:
> Reported by a Fedora user this morning.
>
> Ethernet Channel Bonding Driver: v3.1.3 (June 13, 2007)
> bonding: MII link monitoring set to 100 ms
> ADDRCONF(NETDEV_UP): bond0: link is not ready
> bonding: bond0: Adding slave eth0.
> e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex
> bonding: bond0: making interface eth0 the new active one.
> bonding: bond0: enslaving eth0 as an active interface with an up link.
>
> =
> [ INFO: inconsistent lock state ]
> 2.6.23-0.214.rc8.git2.fc8 #1
> -
> inconsistent {softirq-on-W} -> {in-softirq-W} usage.
> events/1/10 [HC0[0]:SC1[1]:HE1:SE0] takes:
>  (&(bond_info->tx_hashtbl_lock)){-+..}, at: [] tlb_clear_slave+0x1d/0x9a [bonding]
> {softirq-on-W} state was registered at:
>   [] __lock_acquire+0x4ff/0xc67
>   [] lock_acquire+0x7b/0x9e
>   [] _spin_lock+0x2e/0x58
>   [] bond_alb_initialize+0x64/0x18e [bonding]
>   [] bond_open+0x33/0x178 [bonding]
>   [] dev_open+0x31/0x6c
>   [] dev_change_flags+0xa3/0x156
>   [] devinet_ioctl+0x207/0x50e
>   [] inet_ioctl+0x86/0xa4
>   [] sock_ioctl+0x1ac/0x1c9
>   [] do_ioctl+0x22/0x68
>   [] vfs_ioctl+0x249/0x25c
>   [] sys_ioctl+0x49/0x64
>   [] syscall_call+0x7/0xb
>   [] 0x
> irq event stamp: 40878
> hardirqs last enabled at (40878): [] _spin_unlock_irq+0x22/0x2f
> hardirqs last disabled at (40877): [] _spin_lock_irq+0x19/0x67
> softirqs last enabled at (40872): [] rt_run_flush+0x6e/0x97
> softirqs last disabled at (40873): [] do_softirq+0x74/0xf7
>
> other info that might help us debug this:
> 3 locks held by events/1/10:
>  #0:  (rtnl_mutex){--..}, at: [] mutex_lock+0x21/0x24
>  #1:  (&bond->lock){-.-+}, at: [] bond_alb_monitor+0x16/0x26e [bonding]
>  #2:  (&bond->curr_slave_lock){..-+}, at: [] bond_alb_monitor+0xa9/0x26e [bonding]
>
> stack backtrace:
>  [] show_trace_log_lvl+0x1a/0x2f
>  [] show_trace+0x12/0x14
>  [] dump_stack+0x16/0x18
>  [] print_usage_bug+0x141/0x14b
>  [] mark_lock+0x12f/0x472
>  [] __lock_acquire+0x487/0xc67
>  [] lock_acquire+0x7b/0x9e
>  [] _spin_lock+0x2e/0x58
>  [] tlb_clear_slave+0x1d/0x9a [bonding]
>  [] bond_alb_monitor+0xc3/0x26e [bonding]
>  [] run_timer_softirq+0x127/0x18f
>  [] __do_softirq+0x78/0xff
>  [] do_softirq+0x74/0xf7
>  ===
> ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
> bonding: bond0: Adding slave eth1.
>

This isn't surprising and rears its head every once in a while. It probably becomes more apparent on faster, multiprocessor systems. The big bonding-workqueue conversion patch that Jay and I have been testing for a while should resolve this one too (since it moves the monitoring out of softirq context *and* it switches the hashtbl locks to _bh ones along with a bunch of other changes).
Re: [PATCH] Fix rose.ko oops on unload
On Wed, Oct 03, 2007 at 03:04:20PM -0400, Jeff Garzik wrote:
> Alexey Dobriyan wrote:
> >Quick'n'dirty fix to 100% oops on "rmmod rose". Do you want me to
> >properly unwind everything before .24?
> >---
> >Commit a3d384029aa304f8f3f5355d35f0ae274454f7cd aka
> >"[AX.25]: Fix unchecked rose_add_loopback_neigh uses"
> >transformed rose_loopback_neigh var into statically allocated one.
> >However, on unload it will be kfree'd which can't work.
>
> I'm definitely missing something... assuming your patch is applied,
> where is the kfree() for rose_loopback_neigh ?

AFAICS, it will be glued to rose_neigh_list in rose_add_loopback_neigh(). On unload, it will be found on that list and rose_remove_neigh() will free it.
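
The point of the exchange above is that a generic teardown loop can only free what came from the allocator, which is why the patch turns the loopback neighbour back into a heap object that sits on the same list as every other entry. A rough userspace sketch of that arrangement follows; struct model_neigh and the model_* helpers are hypothetical simplifications, not the real rose_neigh code, and malloc()/free() stand in for kmalloc()/kfree().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, much simplified stand-in for struct rose_neigh. */
struct model_neigh {
	struct model_neigh *next;
	int loopback;
};

static struct model_neigh *model_neigh_list;

/* Allocate a neighbour on the heap and link it at the head of the list. */
static struct model_neigh *model_add_neigh(int loopback)
{
	struct model_neigh *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	n->loopback = loopback;
	n->next = model_neigh_list;
	model_neigh_list = n;
	return n;
}

/* Unload path: every entry came from malloc(), so one loop frees them all. */
static int model_free_all(void)
{
	int freed = 0;

	while (model_neigh_list) {
		struct model_neigh *n = model_neigh_list;

		model_neigh_list = n->next;
		free(n);
		freed++;
	}
	assert(model_neigh_list == NULL);
	return freed;
}
```

If the loopback entry were a static object linked into the same list, the free() in the loop would be handed an address that never came from malloc(), which is the userspace analogue of the kfree() oops in the report.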
Re: [PATCH] Fix rose.ko oops on unload
Alexey Dobriyan wrote:
> Quick'n'dirty fix to 100% oops on "rmmod rose". Do you want me to properly unwind everything before .24?
> ---
> Commit a3d384029aa304f8f3f5355d35f0ae274454f7cd aka "[AX.25]: Fix unchecked rose_add_loopback_neigh uses" transformed rose_loopback_neigh var into statically allocated one. However, on unload it will be kfree'd which can't work.

I'm definitely missing something... assuming your patch is applied, where is the kfree() for rose_loopback_neigh ?

	Jeff
Re: [PATCH] sky2: jumbo frame regression fix
On Wed, 2007-10-03 at 14:04 -0400, Bill Davidsen wrote:
> Ian Kumlien wrote:
> > On Tue, 2007-10-02 at 18:02 -0700, Stephen Hemminger wrote:
> >> Remove unneeded check that caused problems with jumbo frame sizes.
> >> The check was recently added and is wrong.
> >> When using jumbo frames the sky2 driver does fragmentation, so
> >> rx_data_size is less than mtu.
> >
> > Confirmed working.
> >
> > Now running with 9k mtu with no errors, =)
>
> Have you verified that you are actually getting jumbo packets out of the
> NIC? I had one machine which did standard packets silently using sky2
> and jumbo using sk98lin. I was looking for something else with tcpdump
> and got one of those WTF moments when I saw all the tiny packets.

20:27:06.542461 IP pi.local > blue.local: ICMP echo request, id 27173, seq 42, length 8008
20:27:06.543136 IP blue.local > pi.local: ICMP echo reply, id 27173, seq 42, length 8008

That should solve it for us, right? =)

-- 
Ian Kumlien -- http://pomac.netswarm.net
[PATCH] Fix rose.ko oops on unload
Quick'n'dirty fix to 100% oops on "rmmod rose". Do you want me to properly unwind everything before .24?
---
Commit a3d384029aa304f8f3f5355d35f0ae274454f7cd aka "[AX.25]: Fix unchecked rose_add_loopback_neigh uses" transformed rose_loopback_neigh var into statically allocated one. However, on unload it will be kfree'd which can't work.

Steps to reproduce:

	modprobe rose
	rmmod rose

BUG: unable to handle kernel NULL pointer dereference at virtual address 0008
 printing eip:
c014c664
*pde =
Oops: [#1]
PREEMPT DEBUG_PAGEALLOC
Modules linked in: rose ax25 fan ufs loop usbhid rtc snd_intel8x0 snd_ac97_codec ehci_hcd ac97_bus uhci_hcd thermal usbcore button processor evdev sr_mod cdrom
CPU:    0
EIP:    0060:[]    Not tainted VLI
EFLAGS: 00210086   (2.6.23-rc9 #3)
EIP is at kfree+0x48/0xa1
eax: 0556   ebx: c1734aa0   ecx: f6a5e000   edx: f7082000
esi:    edi: f9a55d20   ebp: 00200287   esp: f6a5ef28
ds: 007b   es: 007b   fs:   gs: 0033  ss: 0068
Process rmmod (pid: 1823, ti=f6a5e000 task=f7082000 task.ti=f6a5e000)
Stack: f9a55d20 f9a5200c f6a5e000 f9a5200c f9a55a00 bf818cf0 f9a51f3f
       f9a55a00 c0132c60 65736f72 f69f9630 f69f9528 c014244a f6a4e900
       00200246 f7082000 c01025e6
Call Trace:
 [] rose_rt_free+0x1d/0x49 [rose]
 [] rose_rt_free+0x1d/0x49 [rose]
 [] rose_exit+0x4c/0xd5 [rose]
 [] sys_delete_module+0x15e/0x186
 [] remove_vma+0x40/0x45
 [] sysenter_past_esp+0x8f/0x99
 [] trace_hardirqs_on+0x118/0x13b
 [] sysenter_past_esp+0x5f/0x99
 ===
Code: 05 03 1d 80 db 5b c0 8b 03 25 00 40 02 00 3d 00 40 02 00 75 03 8b 5b 0c 8b 73 10 8b 44 24 18 89 44 24 04 9c 5d fa e8 77 df fd ff <8b> 56 08 89 f8 e8 84 f4 fd ff e8 bd 32 06 00 3b 5c 86 60 75 0f
EIP: [] kfree+0x48/0xa1 SS:ESP 0068:f6a5ef28

Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>
---

 include/net/rose.h       |    2 +-
 net/rose/rose_loopback.c |    4 ++--
 net/rose/rose_route.c    |   15 ++-
 3 files changed, 13 insertions(+), 8 deletions(-)

--- a/include/net/rose.h
+++ b/include/net/rose.h
@@ -188,7 +188,7 @@ extern void rose_kick(struct sock *);
 extern void rose_enquiry_response(struct sock *);

 /* rose_route.c */
-extern struct rose_neigh rose_loopback_neigh;
+extern struct rose_neigh *rose_loopback_neigh;
 extern const struct file_operations rose_neigh_fops;
 extern const struct file_operations rose_nodes_fops;
 extern const struct file_operations rose_routes_fops;
--- a/net/rose/rose_loopback.c
+++ b/net/rose/rose_loopback.c
@@ -79,7 +79,7 @@ static void rose_loopback_timer(unsigned long param)

 		skb_reset_transport_header(skb);

-		sk = rose_find_socket(lci_o, &rose_loopback_neigh);
+		sk = rose_find_socket(lci_o, rose_loopback_neigh);
 		if (sk) {
 			if (rose_process_rx_frame(sk, skb) == 0)
 				kfree_skb(skb);
@@ -88,7 +88,7 @@ static void rose_loopback_timer(unsigned long param)

 		if (frametype == ROSE_CALL_REQUEST) {
 			if ((dev = rose_dev_get(dest)) != NULL) {
-				if (rose_rx_call_request(skb, dev, &rose_loopback_neigh, lci_o) == 0)
+				if (rose_rx_call_request(skb, dev, rose_loopback_neigh, lci_o) == 0)
 					kfree_skb(skb);
 			} else {
 				kfree_skb(skb);
--- a/net/rose/rose_route.c
+++ b/net/rose/rose_route.c
@@ -45,7 +45,7 @@ static DEFINE_SPINLOCK(rose_neigh_list_lock);
 static struct rose_route *rose_route_list;
 static DEFINE_SPINLOCK(rose_route_list_lock);

-struct rose_neigh rose_loopback_neigh;
+struct rose_neigh *rose_loopback_neigh;

 /*
  *	Add a new route to a node, and in the process add the node and the
@@ -362,7 +362,12 @@ out:
  */
 void rose_add_loopback_neigh(void)
 {
-	struct rose_neigh *sn = &rose_loopback_neigh;
+	struct rose_neigh *sn;
+
+	rose_loopback_neigh = kmalloc(sizeof(struct rose_neigh), GFP_KERNEL);
+	if (!rose_loopback_neigh)
+		return;
+	sn = rose_loopback_neigh;

 	sn->callsign  = null_ax25_address;
 	sn->digipeat  = NULL;
@@ -417,13 +422,13 @@ int rose_add_loopback_node(rose_address *address)
 	rose_node->mask         = 10;
 	rose_node->count        = 1;
 	rose_node->loopback     = 1;
-	rose_node->neighbour[0] = &rose_loopback_neigh;
+	rose_node->neighbour[0] = rose_loopback_neigh;

 	/* Insert at the head of list. Address is always mask=10 */
 	rose_node->next = rose_node_list;
 	rose_node_list  = rose_node;

-	rose_loopback_neigh.count++;
+	rose_loopback_neigh->count++;

 out:
 	spin_unlock_bh(&rose_node_list_lock);
@@ -454,7 +459,7 @@ void rose_del_loopback_node(rose_address *address)

 	rose_remove_node(rose_node);

-
Re: InfiniBand/RDMA merge plans for 2.6.24
Roland Dreier <[EMAIL PROTECTED]> wrote on 09/17/2007 02:47:42 PM:

> > > IPoIB CM handles this properly by gathering together single pages in
> > > skbs' fragment lists.
>
> > Then can we reuse IPoIB CM code here?
>
> Yes, if possible, refactoring things so that the rx skb allocation
> code becomes common between CM and non-CM would definitely make sense.

IPoIB-CM rx skb allocation is not generic enough to be used by UD: it allocates more buffers than needed if the MTU is not 64K, and it doesn't query the real max_num_sg from the device. I am thinking of having a generic skb allocation in IPoIB based on a matrix of (ipoib-mtu-size, page-size, max_num_sg, head-size).

Thanks
Shirley
[git patches] net driver fixes
sky2 is really the only important fix, the others are trivial.

Please pull from 'upstream-linus' branch of
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git upstream-linus

to receive the following updates:

 drivers/net/sky2.c                          |    3 ---
 drivers/net/wireless/bcm43xx/bcm43xx_wx.c   |    2 +-
 net/ieee80211/softmac/ieee80211softmac_wx.c |    2 +-
 3 files changed, 2 insertions(+), 5 deletions(-)

Joe Perches (1):
      bcm43xx: Correct printk with PFX before KERN_

Richard Knutsson (1):
      softmac: Fix compiler-warning

Stephen Hemminger (1):
      sky2: jumbo frame regression fix

diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index 162489b..ea117fc 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -2163,9 +2163,6 @@ static struct sk_buff *sky2_receive(struct net_device *dev,
 	sky2->rx_next = (sky2->rx_next + 1) % sky2->rx_pending;
 	prefetch(sky2->rx_ring + sky2->rx_next);

-	if (length < ETH_ZLEN || length > sky2->rx_data_size)
-		goto len_error;
-
 	/* This chip has hardware problems that generates bogus status.
 	 * So do only marginal checking and expect higher level protocols
 	 * to handle crap frames.
diff --git a/drivers/net/wireless/bcm43xx/bcm43xx_wx.c b/drivers/net/wireless/bcm43xx/bcm43xx_wx.c
index d6d9413..6acfdc4 100644
--- a/drivers/net/wireless/bcm43xx/bcm43xx_wx.c
+++ b/drivers/net/wireless/bcm43xx/bcm43xx_wx.c
@@ -444,7 +444,7 @@ static int bcm43xx_wx_set_xmitpower(struct net_device *net_dev,
 	u16 maxpower;

 	if ((data->txpower.flags & IW_TXPOW_TYPE) != IW_TXPOW_DBM) {
-		printk(PFX KERN_ERR "TX power not in dBm.\n");
+		printk(KERN_ERR PFX "TX power not in dBm.\n");
 		return -EOPNOTSUPP;
 	}

diff --git a/net/ieee80211/softmac/ieee80211softmac_wx.c b/net/ieee80211/softmac/ieee80211softmac_wx.c
index 442b987..5742dc8 100644
--- a/net/ieee80211/softmac/ieee80211softmac_wx.c
+++ b/net/ieee80211/softmac/ieee80211softmac_wx.c
@@ -114,7 +114,7 @@ check_assoc_again:
 	sm->associnfo.associating = 1;
 	/* queue lower level code to do work (if necessary) */
 	schedule_delayed_work(&sm->associnfo.work, 0);
-out:
+
 	mutex_unlock(&sm->associnfo.mutex);

 	return 0;
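
The bcm43xx hunk above only swaps two adjacent string literals, but the order matters because the log level marker is recognised only at the very start of the message. A small userspace sketch of that rule follows. It assumes the 2.6-era encoding in which KERN_ERR expands to "<3>", and it assumes a PFX value of "bcm43xx: "; the MODEL_* macros and model_parse_loglevel() are illustrative names, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-ins: KERN_ERR was "<3>" in 2.6-era kernels; the PFX value is a guess. */
#define MODEL_KERN_ERR "<3>"
#define MODEL_PFX      "bcm43xx: "

/* Return the level encoded at the very start of msg, or -1 if there is none. */
static int model_parse_loglevel(const char *msg)
{
	assert(msg != NULL);
	if (msg[0] == '<' && msg[1] >= '0' && msg[1] <= '7' && msg[2] == '>')
		return msg[1] - '0';
	return -1;
}
```

With these definitions, MODEL_KERN_ERR MODEL_PFX "..." concatenates to a string that begins with "<3>", whereas MODEL_PFX MODEL_KERN_ERR "..." buries the marker after the prefix, so it would be treated as message text and the default level would apply.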
[git patches] net driver updates
Normally I wait a day or two between pushes, to queue up patches and also to avoid annoying my upstream :) But this includes a couple fixes I felt should be upstreamed sooner rather than later.

Please pull from 'upstream' branch of
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git upstream

to receive the following updates:

Jeff Garzik (1):
      drivers/net/qla3xxx: trim trailing whitespace

Olof Johansson (11):
      pasemi_mac: basic error checking
      pasemi_mac: fix bug in receive buffer dma mapping
      pasemi_mac: rework ring management
      pasemi_mac: implement sg support
      pasemi_mac: workaround for erratum 5971
      pasemi_mac: add local skb alignment
      pasemi_mac: further performance tweaks
      pasemi_mac: update todo list
      pasemi_mac: clear out old errors on interface open
      pasemi_mac: use buffer index pointer in clean_rx()
      pasemi_mac: enable iommu support

trem (1):
      ipg.c doesn't compile with with CONFIG_HIGHMEM64G

[EMAIL PROTECTED] (1):
      Fix typo in new EMAC driver.

 arch/powerpc/platforms/pasemi/Kconfig |   10
 arch/powerpc/platforms/pasemi/iommu.c |   15
 drivers/net/ibm_newemac/core.c        |    4
 drivers/net/ipg.c                     |   10
 drivers/net/pasemi_mac.c              |  595 ++
 drivers/net/pasemi_mac.h              |   67 ++-
 drivers/net/qla3xxx.c                 |  128 +++
 drivers/net/qla3xxx.h                 |    6
 8 files changed, 527 insertions(+), 308 deletions(-)

diff --git a/arch/powerpc/platforms/pasemi/Kconfig b/arch/powerpc/platforms/pasemi/Kconfig
index 95cd90f..e95261e 100644
--- a/arch/powerpc/platforms/pasemi/Kconfig
+++ b/arch/powerpc/platforms/pasemi/Kconfig
@@ -18,6 +18,16 @@ config PPC_PASEMI_IOMMU
 	help
 	  IOMMU support for PA6T-1682M

+config PPC_PASEMI_IOMMU_DMA_FORCE
+	bool "Force DMA engine to use IOMMU"
+	depends on PPC_PASEMI_IOMMU
+	help
+	  This option forces the use of the IOMMU also for the
+	  DMA engine. Otherwise the kernel will use it only when
+	  running under a hypervisor.
+
+	  If in doubt, say "N".
+
 config PPC_PASEMI_MDIO
 	depends on PHYLIB
 	tristate "MDIO support via GPIO"
diff --git a/arch/powerpc/platforms/pasemi/iommu.c b/arch/powerpc/platforms/pasemi/iommu.c
index 9014d55..ab5 100644
--- a/arch/powerpc/platforms/pasemi/iommu.c
+++ b/arch/powerpc/platforms/pasemi/iommu.c
@@ -25,6 +25,7 @@
 #include
 #include
 #include
+#include

 #define IOBMAP_PAGE_SHIFT	12

@@ -175,13 +176,17 @@ static void pci_dma_dev_setup_pasemi(struct pci_dev *dev)
 {
 	pr_debug("pci_dma_dev_setup, dev %p (%s)\n", dev, pci_name(dev));

-	/* DMA device is untranslated, but all other PCI-e goes through
-	 * the IOMMU
+#if !defined(CONFIG_PPC_PASEMI_IOMMU_DMA_FORCE)
+	/* For non-LPAR environment, don't translate anything for the DMA
+	 * engine. The exception to this is if the user has enabled
+	 * CONFIG_PPC_PASEMI_IOMMU_DMA_FORCE at build time.
 	 */
-	if (dev->vendor == 0x1959 && dev->device == 0xa007)
+	if (dev->vendor == 0x1959 && dev->device == 0xa007 &&
+	    !firmware_has_feature(FW_FEATURE_LPAR))
 		dev->dev.archdata.dma_ops = &dma_direct_ops;
-	else
-		dev->dev.archdata.dma_data = &iommu_table_iobmap;
+#endif
+
+	dev->dev.archdata.dma_data = &iommu_table_iobmap;
 }

 static void pci_dma_bus_setup_null(struct pci_bus *b) { }
diff --git a/drivers/net/ibm_newemac/core.c b/drivers/net/ibm_newemac/core.c
index 653bfdc..ce127b9 100644
--- a/drivers/net/ibm_newemac/core.c
+++ b/drivers/net/ibm_newemac/core.c
@@ -1232,9 +1232,9 @@ static inline int emac_xmit_finish(struct emac_instance *dev, int len)
 	 * instead
 	 */
 	if (emac_has_feature(dev, EMAC_FTR_EMAC4))
-		out_be32(&p->tmr0, EMAC_TMR0_XMIT);
-	else
 		out_be32(&p->tmr0, EMAC4_TMR0_XMIT);
+	else
+		out_be32(&p->tmr0, EMAC_TMR0_XMIT);

 	if (unlikely(++dev->tx_cnt == NUM_TX_BUFF)) {
 		netif_stop_queue(ndev);
diff --git a/drivers/net/ipg.c b/drivers/net/ipg.c
index dfdc96f..59898ce 100644
--- a/drivers/net/ipg.c
+++ b/drivers/net/ipg.c
@@ -25,6 +25,8 @@
 #include
 #include

+#include
+
 #define IPG_RX_RING_BYTES	(sizeof(struct ipg_rx) * IPG_RFDLIST_LENGTH)
 #define IPG_TX_RING_BYTES	(sizeof(struct ipg_tx) * IPG_TFDLIST_LENGTH)
 #define IPG_RESET_MASK \
@@ -836,10 +838,14 @@ static void ipg_nic_txfree(struct net_device *dev)
 {
 	struct ipg_nic_private *sp = netdev_priv(dev);
 	void __iomem *ioaddr = sp->ioaddr;
-	const unsigned int curr = ipg_r32(TFD_LIST_PTR_0) -
-		(sp->txd_map / sizeof(struct ipg_tx)) - 1;
+	unsigned int curr;
+	u64 txd_map;
 	unsigned int released, pending;

+	txd_map = (u64)sp->txd_map;
+	curr = ipg_r32(TFD_LIST_PTR_0) -
+
Re: [PATCH RESEND] [11/11] pasemi_mac: enable iommu support
Olof Johansson wrote:
> pasemi_mac: enable iommu support
>
> Enable IOMMU support for pasemi_mac, but avoid using it on non-partitioned systems for performance reasons. The user can override this by selecting the PPC_PASEMI_IOMMU_DMA_FORCE configuration option.
>
> Signed-off-by: Olof Johansson <[EMAIL PROTECTED]>

applied
Re: [PATCH] [9/11] pasemi_mac: clear out old errors on interface open
Olof Johansson wrote:
> On Wed, Oct 03, 2007 at 01:46:16PM -0400, Jeff Garzik wrote:
>> Olof Johansson wrote:
>>> pasemi_mac: clear out old errors on interface open
>>>
>>> Clear out any pending errors when an interface is brought up. Since the
>>> bits are sticky, they might be from interface shutdown time after
>>> firmware has used it, etc.
>>>
>>> Signed-off-by: Olof Johansson <[EMAIL PROTECTED]>
>>
>> In general, interface-open should completely reset and initialize the
>> hardware. does pasemi_mac not do that?
>
> There's no explicit way to reset just one interface besides disabling it
> (which we do at close, and re-enable at open). It seems that some of the
> error bits are sticky across disable/enable, which is why this was needed.
> Also, they're RW1C, so writing 0 doesn't remove them (need to write 1 to
> clear).

OK just making sure, thanks.

> The only other dependency from firmware at this time is the setting of
> mac addresses, something that will be taken care of once we allow override
> of them via ethtool, since we'd need to program them from the driver then
> no matter what. Right now we assume that firmware has programmed it.

Standard procedure for this is

* upon module-load, obtain the MAC address from
* upon interface-up, program dev->dev_addr[] into chip's RX filter
  (aka MAC address) registers

That permits the admin to override the MAC address via ifconfig.
(ethtool doesn't support that, but you basically had the right idea)

	Jeff
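The load-time/open-time split described above can be sketched as a small userspace model. Everything here is hypothetical: `struct nic_model`, `nic_probe()`, `nic_set_mac()` and `nic_open()` are illustration names, not pasemi_mac or netdev API, and where the initial address comes from is left to the caller because the message does not say.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical model of a NIC, not a real driver structure. */
struct nic_model {
	uint8_t dev_addr[ETH_ALEN];  /* address the stack believes is in use */
	uint8_t rx_filter[ETH_ALEN]; /* stand-in for the chip's MAC registers */
};

/* Module load: remember the initial address, touch no hardware state. */
static void nic_probe(struct nic_model *n, const uint8_t initial[ETH_ALEN])
{
	assert(n != NULL && initial != NULL);
	memcpy(n->dev_addr, initial, ETH_ALEN);
}

/* Admin override: only dev_addr changes until the next open. */
static void nic_set_mac(struct nic_model *n, const uint8_t mac[ETH_ALEN])
{
	assert(n != NULL && mac != NULL);
	memcpy(n->dev_addr, mac, ETH_ALEN);
}

/* Interface up: always program the filter from dev_addr. */
static void nic_open(struct nic_model *n)
{
	assert(n != NULL);
	memcpy(n->rx_filter, n->dev_addr, ETH_ALEN);
}
```

Because the filter is always written from dev_addr at open time, an override made while the interface is down takes effect on the next open without depending on what firmware left in the registers.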
Re: [PATCH v4] qe: miscellaneous code improvements and fixes to the QE library
Stephen Hemminger wrote:
> Separate the changes into individual patches to allow for better
> comment/review and bisection in case of regression.

That would be too difficult. Some of the changes are single lines, and this
patch has already been approved -- I just cross-posted to netdev because I
made a few ucc_geth changes that can't be decoupled from the powerpc changes.
A series of 18 patches would just be convoluted.

--
Timur Tabi
Linux Kernel Developer @ Freescale
[PATCH RESEND] [11/11] pasemi_mac: enable iommu support
pasemi_mac: enable iommu support Enable IOMMU support for pasemi_mac, but avoid using it on non-partitioned systems for performance reasons. The user can override this by selecting the PPC_PASEMI_IOMMU_DMA_FORCE configuration option. Signed-off-by: Olof Johansson <[EMAIL PROTECTED]> --- On Wed, Oct 03, 2007 at 01:47:17PM -0400, Jeff Garzik wrote: > You sent patch #10 against as patch #11 :) Oops! Here's the real copy. -Olof Index: k.org/arch/powerpc/platforms/pasemi/iommu.c === --- k.org.orig/arch/powerpc/platforms/pasemi/iommu.c +++ k.org/arch/powerpc/platforms/pasemi/iommu.c @@ -25,6 +25,7 @@ #include #include #include +#include #define IOBMAP_PAGE_SHIFT 12 @@ -175,13 +176,17 @@ static void pci_dma_dev_setup_pasemi(str { pr_debug("pci_dma_dev_setup, dev %p (%s)\n", dev, pci_name(dev)); - /* DMA device is untranslated, but all other PCI-e goes through -* the IOMMU +#if !defined(CONFIG_PPC_PASEMI_IOMMU_DMA_FORCE) + /* For non-LPAR environment, don't translate anything for the DMA +* engine. The exception to this is if the user has enabled +* CONFIG_PPC_PASEMI_IOMMU_DMA_FORCE at build time. 
*/ - if (dev->vendor == 0x1959 && dev->device == 0xa007) + if (dev->vendor == 0x1959 && dev->device == 0xa007 && + !firmware_has_feature(FW_FEATURE_LPAR)) dev->dev.archdata.dma_ops = &dma_direct_ops; - else - dev->dev.archdata.dma_data = &iommu_table_iobmap; +#endif + + dev->dev.archdata.dma_data = &iommu_table_iobmap; } static void pci_dma_bus_setup_null(struct pci_bus *b) { } Index: k.org/drivers/net/pasemi_mac.c === --- k.org.orig/drivers/net/pasemi_mac.c +++ k.org/drivers/net/pasemi_mac.c @@ -34,6 +34,7 @@ #include #include +#include #include "pasemi_mac.h" @@ -89,6 +90,15 @@ MODULE_PARM_DESC(debug, "PA Semi MAC bit static struct pasdma_status *dma_status; +static int translation_enabled(void) +{ +#if defined(CONFIG_PPC_PASEMI_IOMMU_DMA_FORCE) + return 1; +#else + return firmware_has_feature(FW_FEATURE_LPAR); +#endif +} + static void write_iob_reg(struct pasemi_mac *mac, unsigned int reg, unsigned int val) { @@ -193,6 +203,7 @@ static int pasemi_mac_setup_rx_resources struct pasemi_mac_rxring *ring; struct pasemi_mac *mac = netdev_priv(dev); int chan_id = mac->dma_rxch; + unsigned int cfg; ring = kzalloc(sizeof(*ring), GFP_KERNEL); @@ -232,20 +243,28 @@ static int pasemi_mac_setup_rx_resources PAS_DMA_RXCHAN_BASEU_BRBH(ring->dma >> 32) | PAS_DMA_RXCHAN_BASEU_SIZ(RX_RING_SIZE >> 3)); - write_dma_reg(mac, PAS_DMA_RXCHAN_CFG(chan_id), - PAS_DMA_RXCHAN_CFG_HBU(2)); + cfg = PAS_DMA_RXCHAN_CFG_HBU(2); + + if (translation_enabled()) + cfg |= PAS_DMA_RXCHAN_CFG_CTR; + + write_dma_reg(mac, PAS_DMA_RXCHAN_CFG(chan_id), cfg); write_dma_reg(mac, PAS_DMA_RXINT_BASEL(mac->dma_if), - PAS_DMA_RXINT_BASEL_BRBL(__pa(ring->buffers))); + PAS_DMA_RXINT_BASEL_BRBL(ring->buf_dma)); write_dma_reg(mac, PAS_DMA_RXINT_BASEU(mac->dma_if), - PAS_DMA_RXINT_BASEU_BRBH(__pa(ring->buffers) >> 32) | + PAS_DMA_RXINT_BASEU_BRBH(ring->buf_dma >> 32) | PAS_DMA_RXINT_BASEU_SIZ(RX_RING_SIZE >> 3)); - write_dma_reg(mac, PAS_DMA_RXINT_CFG(mac->dma_if), - PAS_DMA_RXINT_CFG_DHL(3) | PAS_DMA_RXINT_CFG_L2 
| - PAS_DMA_RXINT_CFG_LW | PAS_DMA_RXINT_CFG_RBP | - PAS_DMA_RXINT_CFG_HEN); + cfg = PAS_DMA_RXINT_CFG_DHL(3) | PAS_DMA_RXINT_CFG_L2 | + PAS_DMA_RXINT_CFG_LW | PAS_DMA_RXINT_CFG_RBP | + PAS_DMA_RXINT_CFG_HEN; + + if (translation_enabled()) + cfg |= PAS_DMA_RXINT_CFG_ITRR | PAS_DMA_RXINT_CFG_ITR; + + write_dma_reg(mac, PAS_DMA_RXINT_CFG(mac->dma_if), cfg); ring->next_to_fill = 0; ring->next_to_clean = 0; @@ -275,6 +294,7 @@ static int pasemi_mac_setup_tx_resources u32 val; int chan_id = mac->dma_txch; struct pasemi_mac_txring *ring; + unsigned int cfg; ring = kzalloc(sizeof(*ring), GFP_KERNEL); if (!ring) @@ -304,11 +324,15 @@ static int pasemi_mac_setup_tx_resources write_dma_reg(mac, PAS_DMA_TXCHAN_BASEU(chan_id), val); - write_dma_reg(mac, PAS_DMA_TXCHAN_CFG(chan_id), - PAS_DMA_TXCHAN_CFG_TY_IFACE | - PAS_DMA_TXCHAN_CFG_TATTR(mac->dma_if) | - PAS_DMA_TXCHAN_CFG_UP | - PAS_DMA_TXCHAN_CFG_WT(2)); + cfg = PAS_DMA_TXCHAN_C
Re: [PATCH] sky2: jumbo frame regression fix
Ian Kumlien wrote:
> On Tue, 2007-10-02 at 18:02 -0700, Stephen Hemminger wrote:
>> Remove unneeded check that caused problems with jumbo frame sizes.
>> The check was recently added and is wrong. When using jumbo frames
>> the sky2 driver does fragmentation, so rx_data_size is less than mtu.
>
> Confirmed working. Now running with 9k mtu with no errors, =)

Have you verified that you are actually getting jumbo packets out of the
NIC? I had one machine which did standard packets silently using sky2 and
jumbo using sk98lin. I was looking for something else with tcpdump and got
one of those WTF moments when I saw all the tiny packets.

--
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
  the machinations of the wicked." - from Slashdot
Re: [PATCH] [9/11] pasemi_mac: clear out old errors on interface open
On Wed, Oct 03, 2007 at 01:46:16PM -0400, Jeff Garzik wrote:
> Olof Johansson wrote:
>> pasemi_mac: clear out old errors on interface open
>>
>> Clear out any pending errors when an interface is brought up. Since the
>> bits are sticky, they might be from interface shutdown time after
>> firmware has used it, etc.
>>
>> Signed-off-by: Olof Johansson <[EMAIL PROTECTED]>
>
> In general, interface-open should completely reset and initialize the
> hardware. does pasemi_mac not do that?

There's no explicit way to reset just one interface besides disabling it
(which we do at close, and re-enable at open). It seems that some of the
error bits are sticky across disable/enable, which is why this was needed.
Also, they're RW1C, so writing 0 doesn't remove them (need to write 1 to
clear).

The only other dependency from firmware at this time is the setting of
mac addresses, something that will be taken care of once we allow override
of them via ethtool, since we'd need to program them from the driver then
no matter what. Right now we assume that firmware has programmed it.

-Olof
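The RW1C (write-1-to-clear) behaviour mentioned above can be modelled in plain C. This is only a sketch under assumptions: the bit names and the helpers `rw1c_write()` and `clear_pending_errors()` are made up for illustration and do not reflect the real pasemi_mac register layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sticky error bits, not the real hardware layout. */
#define ERR_RX_OVERRUN  (1u << 0)
#define ERR_TX_UNDERRUN (1u << 1)

/*
 * Model of a write to an RW1C register: every 1 bit in 'val' clears the
 * matching register bit, every 0 bit leaves it untouched. Writing 0
 * therefore changes nothing.
 */
static void rw1c_write(uint32_t *reg, uint32_t val)
{
	assert(reg != NULL);
	*reg &= ~val;
}

/*
 * Clear whatever is pending by writing the value just read back to the
 * register. Returns the bits that were pending so they can be logged.
 */
static uint32_t clear_pending_errors(uint32_t *reg)
{
	uint32_t pending;

	assert(reg != NULL);
	pending = *reg;
	rw1c_write(reg, pending);
	return pending;
}
```

Reading the register and writing the same value back is the usual idiom for this kind of register, since it clears exactly the bits that were set at the time of the read.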
Re: [PATCH] [1/11] pasemi_mac: basic error checking
Olof Johansson wrote:
> pasemi_mac: basic error checking
>
> Add some rudimentary error checking to pasemi_mac.
>
> Signed-off-by: Olof Johansson <[EMAIL PROTECTED]>

applied 1-10
Re: [patch 2/3] ipg.c doesn't compile with CONFIG_HIGHMEM64G
applied
Re: [PATCH] Fix typo in new EMAC driver.
Valentine Barshak (by way of Josh Boyer <[EMAIL PROTECTED]>) wrote:
> Fix an obvious typo in emac_xmit_finish.
>
> Signed-off-by: Valentine Barshak <[EMAIL PROTECTED]>
> ---
>  drivers/net/ibm_newemac/core.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)

applied
Re: [PATCH 5/7] CAN: Add virtual CAN netdevice driver
David Miller wrote:
> From: Stephen Hemminger <[EMAIL PROTECTED]>
> Date: Tue, 2 Oct 2007 14:52:36 -0700
>
>> Please consider using netif_msg_xxx() and module parameter to set
>> default message level, like other real network drivers already do.
>
> I keep seeing this recommendation, but the two supposedly most mature
> and actively used drivers in the tree, tg3 and e1000 and e1000e, all do
> not use this scheme.
>
> In fact there are tons of drivers that even hook up the ethtool
> msg_level setting function and never even use the value.
>
> If people aren't using netif_msg_xxx() and the ethtool msg_level
> facilities properly, it's because there is a severe dearth of good
> example drivers to learn about it from.

The currently available CAN netdevice drivers have neither a common debug
concept nor any runtime control mechanism for this debugging. So
netif_msg_xxx() is definitely worth looking at, instead of creating anything
new in this direction, before posting any 'real' CAN network driver here.

Thanks very much for that hint!

Oliver
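For readers who have not seen the scheme being discussed, here is a userspace model of the idea: a per-device bitmask gates each class of message, and a module parameter seeds that bitmask. The names `msg_level_init()`, `msg_link_enabled()` and `struct vcan_priv_model` are hypothetical, and the bit values are assumed to mirror the first few NETIF_MSG_* flags; the kernel headers are the authority on the real definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror the first NETIF_MSG_* bits; illustration only. */
enum {
	MSG_DRV   = 0x0001,
	MSG_PROBE = 0x0002,
	MSG_LINK  = 0x0004,
};

/* Hypothetical private data of a driver using the scheme. */
struct vcan_priv_model {
	uint32_t msg_enable;
};

/*
 * Turn a "debug" module parameter into a bitmask:
 *   negative or out of range -> use the driver's defaults
 *   0                        -> no messages
 *   n                        -> the lowest n message classes
 */
static uint32_t msg_level_init(int debug, uint32_t defaults)
{
	if (debug < 0 || debug >= 32)
		return defaults;
	if (debug == 0)
		return 0;
	return (1u << debug) - 1;
}

/* Gate for link-state messages, in the spirit of netif_msg_link(). */
static int msg_link_enabled(const struct vcan_priv_model *p)
{
	assert(p != NULL);
	return (p->msg_enable & MSG_LINK) != 0;
}
```

A driver built this way would wrap each printk in the matching gate, and an ethtool msglvl handler would simply read or write `msg_enable` at run time.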
Re: [PATCH] [11/11] pasemi_mac: enable iommu support
Olof Johansson wrote:
> pasemi_mac: use buffer index pointer in clean_rx()
>
> Use the new features in B0 for buffer ring index on the receive side. This
> means we no longer have to search in the ring for where the buffer came
> from.
>
> Also cleanup the RX cleaning side a little, while I was at it.
>
> Note: Pre-B0 hardware is no longer supported, and needs a pile of other
> workarounds that are not being submitted for mainline inclusion. So the
> fact that this breaks old hardware is not a problem at this time.
>
> Signed-off-by: Olof Johansson <[EMAIL PROTECTED]>

You sent patch #10 again as patch #11 :)
Re: [PATCH] [9/11] pasemi_mac: clear out old errors on interface open
Olof Johansson wrote:
> pasemi_mac: clear out old errors on interface open
>
> Clear out any pending errors when an interface is brought up. Since the
> bits are sticky, they might be from interface shutdown time after
> firmware has used it, etc.
>
> Signed-off-by: Olof Johansson <[EMAIL PROTECTED]>

In general, interface-open should completely reset and initialize the
hardware. does pasemi_mac not do that?

	Jeff
Re: [PATCH] sky2: jumbo frame regression fix
Stephen Hemminger wrote:
> Remove unneeded check that caused problems with jumbo frame sizes.
> The check was recently added and is wrong. When using jumbo frames
> the sky2 driver does fragmentation, so rx_data_size is less than mtu.
>
> Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>

applied
Re: Please pull 'fixes-jgarzik' branch of wireless-2.6
John W. Linville wrote:
> The following changes since commit 3146b39c185f8a436d430132457e84fa1d8f8208:
>   Linus Torvalds (1):
>         Linux 2.6.23-rc9
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git fixes-jgarzik
>
> Joe Perches (1):
>       bcm43xx: Correct printk with PFX before KERN_
>
> Richard Knutsson (1):
>       softmac: Fix compiler-warning

pulled
lockdep report from bonding.
Reported by a Fedora user this morning.

Ethernet Channel Bonding Driver: v3.1.3 (June 13, 2007)
bonding: MII link monitoring set to 100 ms
ADDRCONF(NETDEV_UP): bond0: link is not ready
bonding: bond0: Adding slave eth0.
e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex
bonding: bond0: making interface eth0 the new active one.
bonding: bond0: enslaving eth0 as an active interface with an up link.

=================================
[ INFO: inconsistent lock state ]
2.6.23-0.214.rc8.git2.fc8 #1
---------------------------------
inconsistent {softirq-on-W} -> {in-softirq-W} usage.
events/1/10 [HC0[0]:SC1[1]:HE1:SE0] takes:
 (&(bond_info->tx_hashtbl_lock)){-+..}, at: [] tlb_clear_slave+0x1d/0x9a [bonding]
{softirq-on-W} state was registered at:
  [] __lock_acquire+0x4ff/0xc67
  [] lock_acquire+0x7b/0x9e
  [] _spin_lock+0x2e/0x58
  [] bond_alb_initialize+0x64/0x18e [bonding]
  [] bond_open+0x33/0x178 [bonding]
  [] dev_open+0x31/0x6c
  [] dev_change_flags+0xa3/0x156
  [] devinet_ioctl+0x207/0x50e
  [] inet_ioctl+0x86/0xa4
  [] sock_ioctl+0x1ac/0x1c9
  [] do_ioctl+0x22/0x68
  [] vfs_ioctl+0x249/0x25c
  [] sys_ioctl+0x49/0x64
  [] syscall_call+0x7/0xb
  [] 0x
irq event stamp: 40878
hardirqs last  enabled at (40878): [] _spin_unlock_irq+0x22/0x2f
hardirqs last disabled at (40877): [] _spin_lock_irq+0x19/0x67
softirqs last  enabled at (40872): [] rt_run_flush+0x6e/0x97
softirqs last disabled at (40873): [] do_softirq+0x74/0xf7

other info that might help us debug this:
3 locks held by events/1/10:
 #0:  (rtnl_mutex){--..}, at: [] mutex_lock+0x21/0x24
 #1:  (&bond->lock){-.-+}, at: [] bond_alb_monitor+0x16/0x26e [bonding]
 #2:  (&bond->curr_slave_lock){..-+}, at: [] bond_alb_monitor+0xa9/0x26e [bonding]

stack backtrace:
 [] show_trace_log_lvl+0x1a/0x2f
 [] show_trace+0x12/0x14
 [] dump_stack+0x16/0x18
 [] print_usage_bug+0x141/0x14b
 [] mark_lock+0x12f/0x472
 [] __lock_acquire+0x487/0xc67
 [] lock_acquire+0x7b/0x9e
 [] _spin_lock+0x2e/0x58
 [] tlb_clear_slave+0x1d/0x9a [bonding]
 [] bond_alb_monitor+0xc3/0x26e [bonding]
 [] run_timer_softirq+0x127/0x18f
 [] __do_softirq+0x78/0xff
 [] do_softirq+0x74/0xf7
 =======================
ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
bonding: bond0: Adding slave eth1.

--
http://www.codemonkey.org.uk
Re: [PATCH v4] qe: miscellaneous code improvements and fixes to the QE library
On Wed, 3 Oct 2007 11:34:59 -0500
Timur Tabi <[EMAIL PROTECTED]> wrote:

> This patch makes numerous miscellaneous code improvements to the QE library.
>
> 1. Remove struct ucc_common and merge ucc_init_guemr() into ucc_set_type()
>    (every caller of ucc_init_guemr() also calls ucc_set_type()). Modify all
>    callers of ucc_set_type() accordingly.
>
> 2. Remove the unused enum ucc_pram_initial_offset.
>
> 3. Refactor qe_setbrg(), also implement work-around for errata QE_General4.
>
> 4. Several printk() calls were missing the terminating \n.
>
> 5. Add __iomem where needed, and change u16 to __be16 and u32 to __be32 where
>    appropriate.
>
> 6. In ucc_slow_init() the RBASE and TBASE registers in the PRAM were
>    programmed with the wrong value.
>
> 7. Add the protocol type to struct us_info and updated ucc_slow_init() to
>    use it, instead of always programming QE_CR_PROTOCOL_UNSPECIFIED.
>
> 8. Rename ucc_slow_restart_x() to ucc_slow_restart_tx()
>
> 9. Add several macros in qe.h (mostly for slow UCC support, but also to
>    standardize some naming convention) and remove several unused macros.
>
> 10. Update ucc_geth.c to use the new macros.
>
> 11. Add ucc_slow_info.protocol to specify which QE_CR_PROTOCOL_xxx protocol
>     to use when initializing the UCC in ucc_slow_init().
>
> 12. Rename ucc_slow_pram.rfcr to rbmr and ucc_slow_pram.tfcr to tbmr, since
>     these are the real names of the registers.
>
> 13. Use the setbits, clrbits, and clrsetbits where appropriate.
>
> 14. Refactor ucc_set_qe_mux_rxtx().
>
> 15. Remove all instances of 'volatile'.
>
> 16. Simplify get_cmxucr_reg();
>
> 17. Replace qe_mux.cmxucrX with qe_mux.cmxucr[].
>
> 18. Updated struct ucc_geth because struct ucc_fast is not padded any more.
>
> Signed-off-by: Timur Tabi <[EMAIL PROTECTED]>
> ---
>

Separate the changes into individual patches to allow for better
comment/review and bisection in case of regression.
[PATCH v4] qe: miscellaneous code improvements and fixes to the QE library
This patch makes numerous miscellaneous code improvements to the QE library.

1. Remove struct ucc_common and merge ucc_init_guemr() into ucc_set_type()
   (every caller of ucc_init_guemr() also calls ucc_set_type()). Modify all
   callers of ucc_set_type() accordingly.

2. Remove the unused enum ucc_pram_initial_offset.

3. Refactor qe_setbrg(), also implement work-around for errata QE_General4.

4. Several printk() calls were missing the terminating \n.

5. Add __iomem where needed, and change u16 to __be16 and u32 to __be32 where
   appropriate.

6. In ucc_slow_init() the RBASE and TBASE registers in the PRAM were
   programmed with the wrong value.

7. Add the protocol type to struct us_info and update ucc_slow_init() to
   use it, instead of always programming QE_CR_PROTOCOL_UNSPECIFIED.

8. Rename ucc_slow_restart_x() to ucc_slow_restart_tx().

9. Add several macros in qe.h (mostly for slow UCC support, but also to
   standardize some naming convention) and remove several unused macros.

10. Update ucc_geth.c to use the new macros.

11. Add ucc_slow_info.protocol to specify which QE_CR_PROTOCOL_xxx protocol
    to use when initializing the UCC in ucc_slow_init().

12. Rename ucc_slow_pram.rfcr to rbmr and ucc_slow_pram.tfcr to tbmr, since
    these are the real names of the registers.

13. Use the setbits, clrbits, and clrsetbits where appropriate.

14. Refactor ucc_set_qe_mux_rxtx().

15. Remove all instances of 'volatile'.

16. Simplify get_cmxucr_reg().

17. Replace qe_mux.cmxucrX with qe_mux.cmxucr[].

18. Update struct ucc_geth because struct ucc_fast is not padded any more.

Signed-off-by: Timur Tabi <[EMAIL PROTECTED]>
---

Add fix 18.

arch/powerpc/sysdev/qe_lib/qe.c | 36 +++-- arch/powerpc/sysdev/qe_lib/qe_ic.c|2 - arch/powerpc/sysdev/qe_lib/qe_io.c| 35 ++--- arch/powerpc/sysdev/qe_lib/ucc.c | 270 ++--- arch/powerpc/sysdev/qe_lib/ucc_fast.c | 127 arch/powerpc/sysdev/qe_lib/ucc_slow.c | 48 +++--- drivers/net/ucc_geth.c|2 +- drivers/net/ucc_geth.h|1 + include/asm-powerpc/immap_qe.h| 30 ++--- include/asm-powerpc/qe.h | 243 - include/asm-powerpc/ucc.h | 40 ++ include/asm-powerpc/ucc_slow.h|9 +- 12 files changed, 431 insertions(+), 412 deletions(-) diff --git a/arch/powerpc/sysdev/qe_lib/qe.c b/arch/powerpc/sysdev/qe_lib/qe.c index 90f8740..3d57d38 100644 --- a/arch/powerpc/sysdev/qe_lib/qe.c +++ b/arch/powerpc/sysdev/qe_lib/qe.c @@ -141,7 +141,7 @@ EXPORT_SYMBOL(qe_issue_cmd); * 16 BRGs, which can be connected to the QE channels or output * as clocks. The BRGs are in two different block of internal * memory mapped space. - * The baud rate clock is the system clock divided by something. + * The BRG clock is the QE clock divided by 2. * It was set up long ago during the initial boot phase and is * is given to us. * Baud rate clocks are zero-based in the driver code (as that maps @@ -165,28 +165,38 @@ unsigned int get_brg_clk(void) return brg_clk; } -/* This function is used by UARTS, or anything else that uses a 16x - * oversampled clock. +/* Program the BRG to the given sampling rate and multiplier + * + * @brg: the BRG, 1-16 + * @rate: the desired sampling rate + * @multiplier: corresponds to the value programmed in GUMR_L[RDCR] or + * GUMR_L[TDCR]. E.g., if this BRG is the RX clock, and GUMR_L[RDCR]=01, + * then 'multiplier' should be 8. + * + * Also note that the value programmed into the BRGC register must be even. 
*/ -void qe_setbrg(u32 brg, u32 rate) +void qe_setbrg(unsigned int brg, unsigned int rate, unsigned int multiplier) { - volatile u32 *bp; u32 divisor, tempval; - int div16 = 0; + u32 div16 = 0; - bp = &qe_immr->brg.brgc[brg]; + divisor = get_brg_clk() / (rate * multiplier); - divisor = (get_brg_clk() / rate); if (divisor > QE_BRGC_DIVISOR_MAX + 1) { - div16 = 1; + div16 = QE_BRGC_DIV16; divisor /= 16; } - tempval = ((divisor - 1) << QE_BRGC_DIVISOR_SHIFT) | QE_BRGC_ENABLE; - if (div16) - tempval |= QE_BRGC_DIV16; + /* Errata QE_General4, which affects some MPC832x and MPC836x SOCs, says + that the BRG divisor must be even if you're not using divide-by-16 + mode. */ + if (!div16 && (divisor & 1)) + divisor++; + + tempval = ((divisor - 1) << QE_BRGC_DIVISOR_SHIFT) | + QE_BRGC_ENABLE | div16; - out_be32(bp, tempval); + out_be32(&qe_immr->brg.brgc[brg - 1], tempval); } /* Initialize SNUMs (thread serial numbers) according to diff --git a/arch/powerpc/sysdev/qe_lib/qe_ic.c b/arch/powerpc/sysdev/qe_lib/qe_ic.c index 55e6f39..9a2d1ed 100644 --- a/arch/powerpc/sysdev/qe_lib/qe_ic.c +++ b/arch/powerpc/sysdev/qe_lib/qe_ic.c @@ -405,8 +405,6
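The divisor arithmetic in the reworked qe_setbrg() above can be checked in isolation. The following is a sketch, not the kernel code: `brgc_value()` is a hypothetical helper that returns the word qe_setbrg() would write, and the QE_BRGC_* values below are assumptions for illustration that should be checked against include/asm-powerpc/qe.h.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; the real ones live in qe.h. */
#define QE_BRGC_ENABLE        0x00010000u
#define QE_BRGC_DIVISOR_SHIFT 1
#define QE_BRGC_DIVISOR_MAX   0xFFFu
#define QE_BRGC_DIV16         1u

/*
 * Compute the BRGC register value for a given BRG input clock, sampling
 * rate and multiplier, following the steps in the patch:
 *   1. divisor = brg_clk / (rate * multiplier)
 *   2. if it does not fit, switch to divide-by-16 mode
 *   3. outside divide-by-16 mode, round an odd divisor up to even
 *      (the QE_General4 erratum work-around)
 */
static uint32_t brgc_value(uint32_t brg_clk, uint32_t rate, uint32_t multiplier)
{
	uint32_t divisor, div16 = 0;

	assert(rate != 0 && multiplier != 0);
	divisor = brg_clk / (rate * multiplier);

	if (divisor > QE_BRGC_DIVISOR_MAX + 1) {
		div16 = QE_BRGC_DIV16;
		divisor /= 16;
	}

	if (!div16 && (divisor & 1))
		divisor++;

	return ((divisor - 1) << QE_BRGC_DIVISOR_SHIFT) |
		QE_BRGC_ENABLE | div16;
}
```

With a made-up 100 MHz BRG clock and a multiplier of 16, 9600 baud gives a raw divisor of 651, which the erratum work-around bumps to 652, while 300 baud overflows the divisor field and falls back to divide-by-16 mode.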
Re: [PATCH] net: fix race in process_backlog
On Wed, 03 Oct 2007 17:44:53 +0200
Peter Zijlstra <[EMAIL PROTECTED]> wrote:

> Subject: net: fix race in process_backlog
>
> The recent NAPI rework (4fa57c9ea9f36f9ca852f3a88ca5d2f1aebbc960)
> introduced a race between netif_rx() and process_backlog() which
> resulted in softirq processing to drop dead.
>
>     netif_rx()              process_backlog()
>
>                             irq_disable();
>                             skb = __skb_dequeue();
>                             irq_enable();
>
>     irq_disable();
>     __skb_queue_tail();
>     napi_schedule();
>     irq_enable();
>
>                             if (!skb)
>                               napi_complete(); <-- oops!
>
> we cleared the napi bit, even though there is data to process.
>
> Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>

Acked-by: Stephen Hemminger <[EMAIL PROTECTED]>

--
Stephen Hemminger <[EMAIL PROTECTED]>
[PATCH] net: fix race in process_backlog
Subject: net: fix race in process_backlog

The recent NAPI rework (4fa57c9ea9f36f9ca852f3a88ca5d2f1aebbc960)
introduced a race between netif_rx() and process_backlog() which
resulted in softirq processing to drop dead.

    netif_rx()              process_backlog()

                            irq_disable();
                            skb = __skb_dequeue();
                            irq_enable();

    irq_disable();
    __skb_queue_tail();
    napi_schedule();
    irq_enable();

                            if (!skb)
                              napi_complete(); <-- oops!

we cleared the napi bit, even though there is data to process.

Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>
---
 net/core/dev.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux-2.6/net/core/dev.c
===================================================================
--- linux-2.6.orig/net/core/dev.c
+++ linux-2.6/net/core/dev.c
@@ -2095,11 +2095,11 @@ static int process_backlog(struct napi_s
 
 		local_irq_disable();
 		skb = __skb_dequeue(&queue->input_pkt_queue);
-		local_irq_enable();
 		if (!skb) {
-			napi_complete(napi);
+			__napi_complete(napi);
 			break;
 		}
+		local_irq_enable();
 
 		dev = skb->dev;
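The interleaving in the diagram above can be replayed in a single-threaded model. This is a sketch under stated assumptions: `struct backlog`, `model_netif_rx()` and `model_poll_empty_queue()` are hypothetical names, an "interrupt" is just a function call placed where interrupts would be enabled, and the fixed ordering re-enables interrupts after completing, as the follow-up discussion in the thread says is needed on both paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the per-CPU backlog state. */
struct backlog {
	int queued;     /* packets sitting in input_pkt_queue */
	bool scheduled; /* models the NAPI "scheduled" bit */
};

/* netif_rx() side: queue a packet and (re)schedule polling. */
static void model_netif_rx(struct backlog *b)
{
	b->queued++;
	b->scheduled = true;
}

/*
 * One pass of the poll loop that finds the queue empty, with a packet
 * arriving at the first point where interrupts are enabled.
 * Returns true if the backlog ends up stalled: data queued, nothing
 * scheduled to process it.
 */
static bool model_poll_empty_queue(struct backlog *b, bool fixed)
{
	bool got_skb;

	assert(b != NULL && b->queued == 0);
	got_skb = false; /* dequeue with irqs off finds nothing */

	if (!fixed)
		model_netif_rx(b); /* old code: irqs already back on here */

	if (!got_skb)
		b->scheduled = false; /* napi_complete() */

	if (fixed)
		model_netif_rx(b); /* irqs come back on only after completion */

	return b->queued > 0 && !b->scheduled;
}
```

In the old ordering the packet arrives between the dequeue and the completion, so the scheduled bit is cleared with one packet still queued. In the fixed ordering the same packet arrives after completion and schedules a fresh poll.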
Re: [PATCH net-2.6.24 0/3]: More TCP fixes
Cedric Le Goater wrote: > Ilpo Järvinen wrote: >> On Wed, 3 Oct 2007, Cedric Le Goater wrote: >> >>> Ilpo Järvinen wrote: Ah, that's path 1) then... Since you seem to have enough time, I would say that the path 1 is good as well and bugs unrelated to the fix will show up there too... >>> arg. yes. sorry for the confusion. >>> I should have stated it explicitly that with path 2 those 3 patches should not be applied because the aim is not a fix but reproducal. Path 2 was intentionally left without the potentional fix as then nice backtrace informs when we can stop trying (which would hopefully occurred pretty soon) :-). But lets discard that path 2... >>> I have 2 spare nodes so i'll run both. 1) is on already without any issues >>> i'm just compiling 2) > > Below are the messages I got on 2) right after running ketchup (which does > a wget www.kernel.org) > > not a warning on 1) with your extra verbose patch. bummer, I got this one on 1) :( C. WARNING: at /home/legoater/linux/net-2.6.24.git/net/ipv4/tcp_input.c:2325 tcp_fastretrans_alert() Call Trace: [] __wake_up+0x1f/0x4c [] tcp_ack+0xcee/0x18ac [] tcp_rcv_established+0x61f/0x6df [] __lock_acquire+0x8a1/0xf1b [] tcp_v4_do_rcv+0x3e/0x394 [] tcp_v4_rcv+0x624/0x9b1 [] ip_local_deliver+0x1da/0x2a4 [] ip_rcv+0x57c/0x5c4 [] packet_rcv_spkt+0x19a/0x1a8 [] netif_receive_skb+0x2ba/0x2de [] :tg3:tg3_poll+0x65d/0x8a4 [] net_rx_action+0xb8/0x191 [] __do_softirq+0x5f/0xe0 [] call_softirq+0x1c/0x28 [] do_softirq+0x3b/0xb9 [] irq_exit+0x4e/0x50 [] do_IRQ+0xbe/0xd8 [] mwait_idle+0x0/0x4d [] ret_from_intr+0x0/0xf [] __sched_text_start+0x5f0/0x62b [] __sched_text_start+0x5f0/0x62b [] mwait_idle+0x43/0x4d [] enter_idle+0x22/0x24 [] cpu_idle+0x9d/0xc0 [] rest_init+0x57/0x59 [] start_kernel+0x2d1/0x2dd [] _sinittext+0x14e/0x155 WARNING: at /home/legoater/linux/net-2.6.24.git/net/ipv4/tcp_input.c:2325 tcp_fastretrans_alert() Call Trace: [] __wake_up+0x1f/0x4c [] tcp_ack+0xcee/0x18ac [] tcp_rcv_established+0x61f/0x6df [] 
__lock_acquire+0x8a1/0xf1b [] tcp_v4_do_rcv+0x3e/0x394 [] tcp_v4_rcv+0x624/0x9b1 [] ip_local_deliver+0x1da/0x2a4 [] ip_rcv+0x57c/0x5c4 [] packet_rcv_spkt+0x19a/0x1a8 [] netif_receive_skb+0x2ba/0x2de [] :tg3:tg3_poll+0x65d/0x8a4 [] net_rx_action+0xb8/0x191 [] __do_softirq+0x5f/0xe0 [] call_softirq+0x1c/0x28 [] do_softirq+0x3b/0xb9 [] irq_exit+0x4e/0x50 [] do_IRQ+0xbe/0xd8 [] mwait_idle+0x0/0x4d [] ret_from_intr+0x0/0xf [] __sched_text_start+0x5f0/0x62b [] __sched_text_start+0x5f0/0x62b [] mwait_idle+0x43/0x4d [] enter_idle+0x22/0x24 [] cpu_idle+0x9d/0xc0 [] rest_init+0x57/0x59 [] start_kernel+0x2d1/0x2dd - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH net-2.6.24 0/3]: More TCP fixes
Ilpo Järvinen wrote: > On Wed, 3 Oct 2007, Cedric Le Goater wrote: > >> Ilpo Järvinen wrote: >>> Ah, that's path 1) then... Since you seem to have enough time, I would say >>> that the path 1 is good as well and bugs unrelated to the fix will show up >>> there too... >> arg. yes. sorry for the confusion. >> >>> I should have stated it explicitly that with path 2 those 3 patches should >>> not be applied because the aim is not a fix but reproducal. Path 2 was >>> intentionally left without the potentional fix as then nice backtrace >>> informs when we can stop trying (which would hopefully occurred >>> pretty soon) :-). But lets discard that path 2... >> I have 2 spare nodes so i'll run both. 1) is on already without any issues >> i'm just compiling 2) Below are the messages I got on 2) right after running ketchup (which does a wget www.kernel.org) not a warning on 1) with your extra verbose patch. >> I usually work on -mm, so what would be interesting for me is to have what >> you >> need in net-2.6.24 which is getting pulled in -mm by andrew. then, if >> you need an extra patch for verbosity, that's fine, i'll include it in >> my usual patchset. > > Ah, I'm sorry about the subject and the extra work it caused, no problem, that was a comment for the futur patchset. > it was meant for DaveM only, didn't realize at that time it would be > meaningful to you as well, thus couldn't warn you back then... Testing on > top of mm would be (/ have been) fine as well... From my point of view > both mm and net-2.6.24 are pretty much the same (I even verified that > those patches apply fine on top of rc8-mm2 since I thought that you might > want to use that one). He, you might have solved it with 1). If not, I'm keeping the hardware for you. Cheers, C. 
WARNING: at /home/legoater/linux/2.6.23-rc8-mm2-tcp_fastretrans/net/ipv4/tcp_ipv4.c:193 tcp_verify_fackets() Call Trace: [] tcp_verify_fackets+0x119/0x237 [] tcp_fragment+0x468/0x4b8 [] tcp_retransmit_skb+0xcf/0x2f4 [] tcp_xmit_retransmit_queue+0xc3/0x31e [] tcp_fastretrans_alert+0xb36/0xb43 [] tcp_ack+0x5d3/0x71b [] tcp_rcv_established+0x61f/0x6df [] __lock_acquire+0x8a1/0xf1b [] tcp_v4_do_rcv+0x3e/0x394 [] tcp_v4_rcv+0x61c/0x9a9 [] ip_local_deliver+0x1da/0x2a4 [] ip_rcv+0x583/0x5c9 [] packet_rcv_spkt+0x19a/0x1a8 [] netif_receive_skb+0x2cf/0x2f5 [] :tg3:tg3_poll+0x65d/0x8a4 [] net_rx_action+0xb8/0x191 [] __do_softirq+0x5f/0xe0 [] call_softirq+0x1c/0x28 [] do_softirq+0x3b/0xb8 [] irq_exit+0x4e/0x50 [] do_IRQ+0xbd/0xd7 [] mwait_idle+0x0/0x4d [] ret_from_intr+0x0/0xf [] mwait_idle+0x43/0x4d [] enter_idle+0x22/0x24 [] cpu_idle+0x9d/0xc0 [] rest_init+0x55/0x57 [] start_kernel+0x2e0/0x2ec [] _sinittext+0x134/0x13b WARNING: at /home/legoater/linux/2.6.23-rc8-mm2-tcp_fastretrans/net/ipv4/tcp_ipv4.c:198 tcp_verify_fackets() Call Trace: [] vgacon_set_cursor_size+0x39/0xd5 [] tcp_verify_fackets+0x163/0x237 [] tcp_fragment+0x468/0x4b8 [] tcp_retransmit_skb+0xcf/0x2f4 [] tcp_xmit_retransmit_queue+0xc3/0x31e [] tcp_fastretrans_alert+0xb36/0xb43 [] tcp_ack+0x5d3/0x71b [] tcp_rcv_established+0x61f/0x6df [] __lock_acquire+0x8a1/0xf1b [] tcp_v4_do_rcv+0x3e/0x394 [] tcp_v4_rcv+0x61c/0x9a9 [] ip_local_deliver+0x1da/0x2a4 [] ip_rcv+0x583/0x5c9 [] packet_rcv_spkt+0x19a/0x1a8 [] netif_receive_skb+0x2cf/0x2f5 [] :tg3:tg3_poll+0x65d/0x8a4 [] net_rx_action+0xb8/0x191 [] __do_softirq+0x5f/0xe0 [] call_softirq+0x1c/0x28 [] do_softirq+0x3b/0xb8 [] irq_exit+0x4e/0x50 [] do_IRQ+0xbd/0xd7 [] mwait_idle+0x0/0x4d [] ret_from_intr+0x0/0xf [] mwait_idle+0x43/0x4d [] enter_idle+0x22/0x24 [] cpu_idle+0x9d/0xc0 [] rest_init+0x55/0x57 [] start_kernel+0x2e0/0x2ec [] _sinittext+0x134/0x13b TCP wq(s) -S--SSS< TCP wq(i) hf < s4 f9 (47) p9 seq: su3460595874 hs3460607374 sn3460659962 (3460608822) WARNING: at 
/home/legoater/linux/2.6.23-rc8-mm2-tcp_fastretrans/net/ipv4/tcp_ipv4.c:193 tcp_verify_fackets() Call Trace: [] vgacon_set_cursor_size+0x39/0xd5 [] tcp_verify_fackets+0x119/0x237 [] tcp_fragment+0x468/0x4b8 [] tcp_retransmit_skb+0xcf/0x2f4 [] tcp_xmit_retransmit_queue+0xc3/0x31e [] tcp_fastretrans_alert+0xb36/0xb43 [] tcp_ack+0x5d3/0x71b [] tcp_rcv_established+0x61f/0x6df [] __lock_acquire+0x8a1/0xf1b [] tcp_v4_do_rcv+0x3e/0x394 [] tcp_v4_rcv+0x61c/0x9a9 [] ip_local_deliver+0x1da/0x2a4 [] ip_rcv+0x583/0x5c9 [] packet_rcv_spkt+0x19a/0x1a8 [] netif_receive_skb+0x2cf/0x2f5 [] :tg3:tg3_poll+0x65d/0x8a4 [] net_rx_action+0xb8/0x191 [] __do_softirq+0x5f/0xe0 [] call_softirq+0x1c/0x28 [] do_softirq+0x3b/0xb8 [] irq_exit+0x4e/0x50 [] do_IRQ+0xbd/0xd7 [] mwait_idle+0x0/0x4d [] ret_from_intr+0x0/0xf [] mwait_idle+0x43/0x4d [] enter_idle+0x22/0x24 [] cpu_idle+0x9d/0xc0 [] rest_init+0x55/0x57 [] start_kernel+0x2e0/0x2ec [] _sinittext+0x134/0x13b WARNING: at /home/legoater/linux/2.6.23-rc8-mm2-tcp_fastretrans/net/ipv4/tcp_ipv4.c:198 tcp_v
Re: [PATCH net-2.6.24 0/3]: More TCP fixes
Cedric Le Goater wrote:
> Ilpo Järvinen wrote:
>> On Wed, 3 Oct 2007, Cedric Le Goater wrote:
>>
>>> Ilpo Järvinen wrote:
>>>> Ah, that's path 1) then... Since you seem to have enough time, I would say
>>>> that the path 1 is good as well and bugs unrelated to the fix will show up
>>>> there too...
>>> arg. yes. sorry for the confusion.
>>>
>>>> I should have stated it explicitly that with path 2 those 3 patches should
>>>> not be applied because the aim is not a fix but reproducal. Path 2 was
>>>> intentionally left without the potentional fix as then nice backtrace
>>>> informs when we can stop trying (which would hopefully occurred
>>>> pretty soon) :-). But lets discard that path 2...
>>> I have 2 spare nodes so i'll run both. 1) is on already without any issues
>>> i'm just compiling 2)
>
> Below are the messages I got on 2) right after running ketchup (which does
> a wget www.kernel.org)

and a second run of ketchup gave the following.

cheers,

C.

WARNING: at /home/legoater/linux/2.6.23-rc8-mm2-tcp_fastretrans/net/ipv4/tcp_ipv4.c:193 tcp_verify_fackets()

Call Trace:
 [] tcp_verify_fackets+0x119/0x237
 [] tcp_fragment+0x468/0x4b8
 [] tcp_retransmit_skb+0xcf/0x2f4
 [] tcp_xmit_retransmit_queue+0xc3/0x31e
 [] tcp_fastretrans_alert+0xb36/0xb43
 [] tcp_ack+0x5d3/0x71b
 [] tcp_rcv_established+0x61f/0x6df
 [] __lock_acquire+0x8a1/0xf1b
 [] tcp_v4_do_rcv+0x3e/0x394
 [] tcp_v4_rcv+0x61c/0x9a9
 [] ip_local_deliver+0x1da/0x2a4
 [] ip_rcv+0x583/0x5c9
 [] packet_rcv_spkt+0x19a/0x1a8
 [] netif_receive_skb+0x2cf/0x2f5
 [] :tg3:tg3_poll+0x65d/0x8a4
 [] net_rx_action+0xb8/0x191
 [] __do_softirq+0x5f/0xe0
 [] call_softirq+0x1c/0x28
 [] do_softirq+0x3b/0xb8
 [] irq_exit+0x4e/0x50
 [] do_IRQ+0xbd/0xd7
 [] mwait_idle+0x0/0x4d
 [] ret_from_intr+0x0/0xf
 [] mwait_idle+0x43/0x4d
 [] enter_idle+0x22/0x24
 [] cpu_idle+0x9d/0xc0
 [] rest_init+0x55/0x57
 [] start_kernel+0x2e0/0x2ec
 [] _sinittext+0x134/0x13b

TCP wq(s) --S--<
TCP wq(i) h <
s1 f5 (14) p6 seq: su110259658 hs110265450 sn110278722 (0)

WARNING: at /home/legoater/linux/2.6.23-rc8-mm2-tcp_fastretrans/net/ipv4/tcp_ipv4.c:193 tcp_verify_fackets()

Call Trace:
 [] vgacon_scroll+0x188/0x1dd
 [] tcp_verify_fackets+0x119/0x237
 [] tcp_fragment+0x468/0x4b8
 [] tcp_retransmit_skb+0xcf/0x2f4
 [] tcp_xmit_retransmit_queue+0xc3/0x31e
 [] tcp_fastretrans_alert+0xb36/0xb43
 [] tcp_ack+0x5d3/0x71b
 [] tcp_rcv_established+0x61f/0x6df
 [] __lock_acquire+0x8a1/0xf1b
 [] tcp_v4_do_rcv+0x3e/0x394
 [] tcp_v4_rcv+0x61c/0x9a9
 [] ip_local_deliver+0x1da/0x2a4
 [] ip_rcv+0x583/0x5c9
 [] packet_rcv_spkt+0x19a/0x1a8
 [] netif_receive_skb+0x2cf/0x2f5
 [] :tg3:tg3_poll+0x65d/0x8a4
 [] net_rx_action+0xb8/0x191
 [] __do_softirq+0x5f/0xe0
 [] call_softirq+0x1c/0x28
 [] do_softirq+0x3b/0xb8
 [] irq_exit+0x4e/0x50
 [] do_IRQ+0xbd/0xd7
 [] mwait_idle+0x0/0x4d
 [] ret_from_intr+0x0/0xf
 [] mwait_idle+0x43/0x4d
 [] enter_idle+0x22/0x24
 [] cpu_idle+0x9d/0xc0
 [] rest_init+0x55/0x57
 [] start_kernel+0x2e0/0x2ec
 [] _sinittext+0x134/0x13b

WARNING: at /home/legoater/linux/2.6.23-rc8-mm2-tcp_fastretrans/net/ipv4/tcp_ipv4.c:198 tcp_verify_fackets()

Call Trace:
 [] vgacon_scroll+0x188/0x1dd
 [] tcp_verify_fackets+0x163/0x237
 [] tcp_fragment+0x468/0x4b8
 [] tcp_retransmit_skb+0xcf/0x2f4
 [] tcp_xmit_retransmit_queue+0xc3/0x31e
 [] tcp_fastretrans_alert+0xb36/0xb43
 [] tcp_ack+0x5d3/0x71b
 [] tcp_rcv_established+0x61f/0x6df
 [] __lock_acquire+0x8a1/0xf1b
 [] tcp_v4_do_rcv+0x3e/0x394
 [] tcp_v4_rcv+0x61c/0x9a9
 [] ip_local_deliver+0x1da/0x2a4
 [] ip_rcv+0x583/0x5c9
 [] packet_rcv_spkt+0x19a/0x1a8
 [] netif_receive_skb+0x2cf/0x2f5
 [] :tg3:tg3_poll+0x65d/0x8a4
 [] net_rx_action+0xb8/0x191
 [] __do_softirq+0x5f/0xe0
 [] call_softirq+0x1c/0x28
 [] do_softirq+0x3b/0xb8
 [] irq_exit+0x4e/0x50
 [] do_IRQ+0xbd/0xd7
 [] mwait_idle+0x0/0x4d
 [] ret_from_intr+0x0/0xf
 [] mwait_idle+0x43/0x4d
 [] enter_idle+0x22/0x24
 [] cpu_idle+0x9d/0xc0
 [] rest_init+0x55/0x57
 [] start_kernel+0x2e0/0x2ec
 [] _sinittext+0x134/0x13b

TCP wq(s) ---SSS-<
TCP wq(i) hf<
s3 f7 (14) p7 seq: su110259658 hs110268346 sn110278722 (110269794)

C.
[IPv6] Fix ICMPv6 redirect handling with target multicast address, try 3
When the ICMPv6 Target address is multicast, Linux processes the redirect instead of dropping it. The problem is in this code in ndisc_redirect_rcv():

	if (ipv6_addr_equal(dest, target)) {
		on_link = 1;
	} else if (!(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) {
		ND_PRINTK2(KERN_WARNING
			   "ICMPv6 Redirect: target address is not link-local.\n");
		return;
	}

This second check will succeed if the Target address is, for example, FF02::1, because it has link-local scope. Instead, it should be checking whether it is a unicast link-local address, as stated in RFC 2461/4861 Section 8.1:

   - The ICMP Target Address is either a link-local address (when
     redirected to a router) or the same as the ICMP Destination
     Address (when redirected to the on-link destination).

I know this doesn't explicitly say unicast link-local address, but it's implied.

This bug is preventing Linux kernels from achieving IPv6 Logo Phase II certification because of a recent error that was found in the TAHI test suite: Neighbor Discovery suite test 206 (v6LC.2.3.6_G) had the multicast address in the Destination field instead of the Target field, so we were passing the test. This won't be the case anymore.

The patch below fixes this problem, and also fixes ndisc_send_redirect() to not send an invalid redirect with a multicast address in the Target field. I re-ran the TAHI Neighbor Discovery section to make sure Linux passes all 245 tests now.
-Brian

Signed-off-by: Brian Haley <[EMAIL PROTECTED]>
Acked-by: David L Stevens <[EMAIL PROTECTED]>

diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 74c4d8d..b761dbe 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1267,9 +1267,10 @@ static void ndisc_redirect_rcv(struct sk_buff *skb)
 	if (ipv6_addr_equal(dest, target)) {
 		on_link = 1;
-	} else if (!(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) {
+	} else if (ipv6_addr_type(target) !=
+		   (IPV6_ADDR_UNICAST|IPV6_ADDR_LINKLOCAL)) {
 		ND_PRINTK2(KERN_WARNING
-			   "ICMPv6 Redirect: target address is not link-local.\n");
+			   "ICMPv6 Redirect: target address is not link-local unicast.\n");
 		return;
 	}
@@ -1343,9 +1344,9 @@ void ndisc_send_redirect(struct sk_buff *skb, struct neighbour *neigh,
 	}
 	if (!ipv6_addr_equal(&ipv6_hdr(skb)->daddr, target) &&
-	    !(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) {
+	    ipv6_addr_type(target) != (IPV6_ADDR_UNICAST|IPV6_ADDR_LINKLOCAL)) {
 		ND_PRINTK2(KERN_WARNING
-			   "ICMPv6 Redirect: target address is not link-local.\n");
+			   "ICMPv6 Redirect: target address is not link-local unicast.\n");
 		return;
 	}
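
To illustrate why the bitmask test lets FF02::1 through while the exact comparison does not, here is a minimal userspace sketch. It is not kernel code: the DEMO_* flags and demo_addr_type() are hypothetical stand-ins for IPV6_ADDR_* and ipv6_addr_type(), and the classifier only distinguishes unicast from multicast and link-local from everything else. It assumes a POSIX system providing the IN6_IS_ADDR_* macros and inet_pton().

```c
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical flag values for this sketch; not the kernel's definitions. */
#define DEMO_ADDR_UNICAST   0x01
#define DEMO_ADDR_MULTICAST 0x02
#define DEMO_ADDR_LINKLOCAL 0x20

/* Simplified classifier: one unicast/multicast bit plus a link-local scope bit. */
static int demo_addr_type(const struct in6_addr *a)
{
	if (IN6_IS_ADDR_MULTICAST(a)) {
		int type = DEMO_ADDR_MULTICAST;

		if (IN6_IS_ADDR_MC_LINKLOCAL(a))
			type |= DEMO_ADDR_LINKLOCAL;
		return type;
	}
	if (IN6_IS_ADDR_LINKLOCAL(a))
		return DEMO_ADDR_UNICAST | DEMO_ADDR_LINKLOCAL;
	return DEMO_ADDR_UNICAST;
}

/* Old style: accept any target that carries the link-local scope bit. */
static int old_check_accepts(const struct in6_addr *target)
{
	return (demo_addr_type(target) & DEMO_ADDR_LINKLOCAL) != 0;
}

/* New style: accept only targets that are exactly unicast + link-local. */
static int new_check_accepts(const struct in6_addr *target)
{
	return demo_addr_type(target) ==
	       (DEMO_ADDR_UNICAST | DEMO_ADDR_LINKLOCAL);
}

/* Parse 'text' and apply 'check'; returns -1 if the text is not IPv6. */
static int check_text(const char *text,
		      int (*check)(const struct in6_addr *))
{
	struct in6_addr a;

	if (inet_pton(AF_INET6, text, &a) != 1)
		return -1;
	return check(&a);
}
```

With these helpers, check_text("ff02::1", old_check_accepts) accepts the multicast target, while check_text("ff02::1", new_check_accepts) rejects it; fe80::1 is accepted by both.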
Re: Please pull 'upstream-davem' branch of wireless-2.6
On Tue, Oct 02, 2007 at 07:01:56PM -0700, David Miller wrote:
> From: "John W. Linville" <[EMAIL PROTECTED]>
> Date: Tue, 2 Oct 2007 21:25:52 -0400
>
> > git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git
> > upstream-davem
>
> This doesn't pull cleanly.
>
> Probably you used a recently cloned Linus tree, pulled
> net-2.6.24 into that (and resolved the conflicts), and
> then put your patches in.

No, in fact I'm quite conscious of that. I follow a procedure identical to what you outlined. I even leave my 'master-davem' branch available as a reference, and create the initial 'upstream-davem' branch as a checkout from it. :-)

As an experiment, I cloned your current tree (which has the patches applied already, thanks!) and created a branch which backed-out the patches from me you had already applied by hand. I then did a pull from my tree, and the results were quite clean.

[linville]:> git checkout -b jwltest fc26d79bb258b5fdb3dee940bea12d6ef7c217c5
Switched to a new branch "jwltest"
[linville]:> git pull git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git upstream-davem
remote: Generating pack...
remote: Done counting 257 objects.
remote: Result has 199 objects.
remote: Deltifying 199 objects...
remote:  100% (199/199) done
Indexing 199 objects...
remote: Total 199 (delta 150), reused 143 (delta 115)
 100% (199/199) done
Resolving 150 deltas...
 100% (150/150) done
32 objects were added to complete this thin pack.
Removed drivers/net/wireless/zd1211rw/zd_util.c
Removed drivers/net/wireless/zd1211rw/zd_util.h
Merge made by recursive.
 Documentation/networking/mac80211-injection.txt | 32 ++-
 drivers/net/wireless/adm8211.c | 8 +-
 drivers/net/wireless/b43/Kconfig | 12 +
 drivers/net/wireless/b43/Makefile | 5 +-
 drivers/net/wireless/b43/b43.h | 11 +-
 drivers/net/wireless/b43/leds.c | 399 ++-
 drivers/net/wireless/b43/leds.h | 63 ++--
 drivers/net/wireless/b43/main.c | 205
 drivers/net/wireless/b43/phy.c | 13 +-
 drivers/net/wireless/b43/phy.h | 2 +-
 drivers/net/wireless/b43/rfkill.c | 184 +++
 drivers/net/wireless/b43/rfkill.h | 58
 drivers/net/wireless/hostap/hostap.h | 2 +-
 drivers/net/wireless/hostap/hostap_hw.c | 2 +-
 drivers/net/wireless/hostap/hostap_main.c | 19 +-
 drivers/net/wireless/iwlwifi/iwl3945-base.c | 4 -
 drivers/net/wireless/iwlwifi/iwl4965-base.c | 4 -
 drivers/net/wireless/p54common.c | 4 +-
 drivers/net/wireless/p54pci.c | 4 +-
 drivers/net/wireless/rt2x00/rt2x00.h | 2 +-
 drivers/net/wireless/zd1211rw/Makefile | 2 +-
 drivers/net/wireless/zd1211rw/zd_chip.c | 1 -
 drivers/net/wireless/zd1211rw/zd_mac.c | 4 +-
 drivers/net/wireless/zd1211rw/zd_usb.c | 1 -
 drivers/net/wireless/zd1211rw/zd_util.c | 82 -
 drivers/net/wireless/zd1211rw/zd_util.h | 29 --
 include/linux/rfkill.h | 24 ++
 include/net/mac80211.h | 46 +++-
 net/mac80211/cfg.c | 75 -
 net/mac80211/ieee80211.c | 189 +---
 net/mac80211/ieee80211_i.h | 17 +-
 net/mac80211/ieee80211_iface.c | 68 +
 net/mac80211/ieee80211_ioctl.c | 31 +-
 net/mac80211/ieee80211_led.c | 67 +++-
 net/mac80211/ieee80211_led.h | 6 +
 net/mac80211/ieee80211_rate.c | 3 +-
 net/mac80211/ieee80211_rate.h | 2 -
 net/mac80211/ieee80211_sta.c | 7 +-
 net/mac80211/key.c | 1 -
 net/mac80211/rx.c | 122 +++-
 net/mac80211/sta_info.c | 13 +-
 net/mac80211/tx.c | 211 ++--
 net/mac80211/wme.c | 10 +-
 net/rfkill/Kconfig | 7 +
 net/rfkill/rfkill.c | 49 +++-
 45 files changed, 1022 insertions(+), 1078 deletions(-)
 create mode 100644 drivers/net/wireless/b43/rfkill.c
 create mode 100644 dr
Re: [PATCH net-2.6.24 0/3]: More TCP fixes
On Wed, 3 Oct 2007, Cedric Le Goater wrote:

> Ilpo Järvinen wrote:
> >
> > Ah, that's path 1) then... Since you seem to have enough time, I would say
> > that the path 1 is good as well and bugs unrelated to the fix will show up
> > there too...
>
> arg. yes. sorry for the confusion.
>
> > I should have stated it explicitly that with path 2 those 3 patches should
> > not be applied because the aim is not a fix but reproducal. Path 2 was
> > intentionally left without the potentional fix as then nice backtrace
> > informs when we can stop trying (which would hopefully occurred
> > pretty soon) :-). But lets discard that path 2...
>
> I have 2 spare nodes so i'll run both. 1) is on already without any issues
> i'm just compiling 2)

Thanks a lot. :-)

> I usually work on -mm, so what would be interesting for me is to have what
> you need in net-2.6.24 which is getting pulled in -mm by andrew. then, if
> you need an extra patch for verbosity, that's fine, i'll include it in
> my usual patchset.

Ah, I'm sorry about the subject and the extra work it caused. It was meant for DaveM only; I didn't realize at the time that it would be meaningful to you as well, so I couldn't warn you back then...

Testing on top of mm would be (/ have been) fine as well... From my point of view both mm and net-2.6.24 are pretty much the same (I even verified that those patches apply fine on top of rc8-mm2 since I thought that you might want to use that one).

-- 
 i.
Re: [PATCH net-2.6.24 0/3]: More TCP fixes
Ilpo Järvinen wrote:
> On Wed, 3 Oct 2007, Cedric Le Goater wrote:
>
>> Ilpo Järvinen wrote:
>>> On Wed, 3 Oct 2007, Cedric Le Goater wrote:
>>>
>>>> I'm dropping the previous patches you sent me and switching to this
>>>> patchset. right ?
>>> Yes you can do that... However, there are two ways forward:
>>>
>>> 1) Drop and test with this patchset long enough to verify it's gone...
>>> 2) No dropping and get the more exact trace by reproducing, which can
>>>    point out to tcp_retrans_try_collapse confirming the source of the
>>>    bug or revealing yet another bug...
>>>
>>> The first one has one drawback, it cannot prove the fix very well since
>>> the bug could just not occur by chance... Path 2 would clearly show the
>>> place from where the problem originates because we will know that it got
>>> triggered! I personally would prefer path 2 but whether you want to go for
>>> that depends on the time you want to invest in it...
>>>
>>> ...I rediffed the tcp_verify_fackets patch too (below) just in case it
>>> would be something else in you case and you choose path 1 (put it on top
>>> of this patchset, applies with some offsets). In case the problem is gone,
>>> it shouldn't trigger and if it does, we'll have another bug caught.
>> I have a spare node so I'm starting 2) with the 3 patches you sent and that
>> last one which applied fine.
>
> Ah, that's path 1) then... Since you seem to have enough time, I would say
> that the path 1 is good as well and bugs unrelated to the fix will show up
> there too...

arg. yes. sorry for the confusion.

> I should have stated it explicitly that with path 2 those 3 patches should
> not be applied because the aim is not a fix but reproducal. Path 2 was
> intentionally left without the potentional fix as then nice backtrace
> informs when we can stop trying (which would hopefully occurred
> pretty soon) :-). But lets discard that path 2...

I have 2 spare nodes so I'll run both. 1) is already running without any issues; I'm just compiling 2).

I usually work on -mm, so what would be interesting for me is to have what you need in net-2.6.24, which is getting pulled into -mm by Andrew. Then, if you need an extra patch for verbosity, that's fine, I'll include it in my usual patchset.

Cheers,

C.

>> all of them on a fresh git pull of net-2.6.24
>
> That's fine, they're pretty well in sync (mm and net-2.6.24, and
> soon 2.6.24-rcs too).
>
Re: [patch 3/3] git-net: sctp build fix (not for applying)
[EMAIL PROTECTED] wrote:
> From: Andrew Morton <[EMAIL PROTECTED]>
>
> net/sctp/sm_statetable.c:551: error: 'sctp_sf_tabort_8_4_8' undeclared here
> (not in a function)
>

Andrew, is this a result of the merge of net-2.6.24 with net-2.6? That's the only way I see this happening.

> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
> ---
>
>  net/sctp/sm_statetable.c |    2 --
>  1 file changed, 2 deletions(-)
>
> diff -puN net/sctp/sm_statetable.c~git-net-sctp-hack net/sctp/sm_statetable.c
> --- a/net/sctp/sm_statetable.c~git-net-sctp-hack
> +++ a/net/sctp/sm_statetable.c
> @@ -527,8 +527,6 @@ static const sctp_sm_table_entry_t prsct
>  	/* SCTP_STATE_EMPTY */ \
>  	TYPE_SCTP_FUNC(sctp_sf_ootb), \
>  	/* SCTP_STATE_CLOSED */ \
> -	TYPE_SCTP_FUNC(sctp_sf_tabort_8_4_8), \

That should be changed to sctp_sf_ootb and then it'll compile. As is, the patch is wrong.

Thanks
-vlad

> -	/* SCTP_STATE_COOKIE_WAIT */ \
>  	TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \
>  	/* SCTP_STATE_COOKIE_ECHOED */ \
>  	TYPE_SCTP_FUNC(sctp_sf_eat_auth), \
> _
Re: [PATCH] Fallback to ipv4 if we try to add join IPv4 multicast group via ipv4-mapped address.
Hello,

David Stevens wrote:
> Dmitry,
> Good catch; a couple comments:

Thank you for the response.

> 	struct ipv6_pinfo *np = inet6_sk(sk);
> 	int err;
> +	int addr_type = ipv6_addr_type(addr);
> +
> +	if (addr_type == IPV6_ADDR_MAPPED) {
> +		__be32 v4addr = addr->s6_addr32[3];
> +		struct ip_mreqn mreq;
> +		mreq.imr_multiaddr.s_addr = v4addr;
> +		mreq.imr_address.s_addr = INADDR_ANY;
> +		mreq.imr_ifindex = ifindex;
> +
> +		return ip_mc_join_group(sk, &mreq);
> +	}
>
> ipv6_addr_type() returns a bitmask, so you should use:
>
> 	if (addr_type & IPV6_ADDR_MAPPED) {

I just c'n'pasted the code that checks for mapped addresses. In most cases it's just ==, not a bitmask operation.

> Also, you should have a blank line after the "mreq" declaration.

ok.

> Ditto for both in ipv6_mc_sock_drop().
>
> I don't expect the multicast source filtering interface will behave well
> for mapped addresses, either. The mapped multicast address won't appear
> to be a multicast address (and return error there), and all the source
> filters would have to be v4mapped addresses and modify the v4 source
> filters for this to do as you expect. So, there's more to it (and it may
> be a bit messy) to support mapped multicast addresses fully. I'll think
> about that part some more.

I didn't have time to test it thoroughly. I've only checked that the call succeeds and that all necessary IGMP messages are sent. I hope I'll have more time to check this weekend.

-- 
With best wishes
Dmitry Baryshkov
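
For readers following the mapped-address discussion, here is a small userspace sketch of the idea in the quoted patch: detect an IPv4-mapped IPv6 address (::ffff:a.b.c.d) and pull out the embedded IPv4 group. It is only an illustration, not the kernel code path: mapped_to_v4() and is_mapped_v4_multicast() are hypothetical helper names, and the sketch assumes a POSIX/glibc system that provides inet_pton(), IN6_IS_ADDR_V4MAPPED() and IN_MULTICAST().

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/*
 * If 'text' is an IPv4-mapped IPv6 address, store the embedded IPv4
 * address in *v4 and return 1. Return 0 for any other valid IPv6
 * address and -1 if the text does not parse as IPv6.
 */
static int mapped_to_v4(const char *text, struct in_addr *v4)
{
	struct in6_addr a6;

	if (inet_pton(AF_INET6, text, &a6) != 1)
		return -1;
	if (!IN6_IS_ADDR_V4MAPPED(&a6))
		return 0;
	/* The IPv4 address is the last 32 bits, already in network order. */
	memcpy(&v4->s_addr, &a6.s6_addr[12], sizeof(v4->s_addr));
	return 1;
}

/* Return 1 if 'text' is a mapped address embedding an IPv4 multicast group. */
static int is_mapped_v4_multicast(const char *text)
{
	struct in_addr v4;

	if (mapped_to_v4(text, &v4) != 1)
		return 0;
	return IN_MULTICAST(ntohl(v4.s_addr)) ? 1 : 0;
}
```

Note that ::ffff:224.0.0.1 is not an IPv6 multicast address by its prefix, which is the difficulty raised above for the source-filtering interface: the multicast test has to be applied to the embedded IPv4 address instead.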
Re: [PATCH 2/3][NET_BATCH] net core use batching
On Wed, 2007-03-10 at 01:29 -0400, Bill Fink wrote:

> It does sound sensible. My own decidedly non-expert speculation
> was that the big 30 % performance hit right at 4 KB may be related
> to memory allocation issues or having to split the skb across
> multiple 4 KB pages.

Plausible. But i also worry it could be 10 other things; for example, could it be the driver used? I noted in my udp test an oddity that turned out to be tx coal parameter related. In any case, I will attempt to run those tests later.

> And perhaps it only affected the single
> process case because with multiple processes lock contention may
> be a bigger issue and the xmit batching changes would presumably
> help with that. I am admittedly a novice when it comes to the
> detailed internals of TCP/skb processing, although I have been
> slowly slogging my way through parts of the TCP kernel code to
> try and get a better understanding, so I don't know if these
> thoughts have any merit.

You do bring up issues that need to be looked into and i will run those tests. Note, the effectiveness of batching becomes evident as the number of flows grows. Actually, scratch that: it becomes evident if you can keep the tx path busy, to which multiple running users contribute. If i can have a user per CPU with lots of traffic to send, i can create that condition. It's a little boring in the scenario where the bottleneck is the wire, but it needs to be checked.

> BTW does anyone know of a good book they would recommend that has
> substantial coverage of the Linux kernel TCP code, that's fairly
> up-to-date and gives both an overall view of the code and packet
> flow as well as details on individual functions and algorithms,
> and hopefully covers basic issues like locking and synchronization,
> concurrency of different parts of the stack, and memory allocation.
> I have several books already on Linux kernel and networking internals,
> but they seem to only cover the IP (and perhaps UDP) portions of the
> network stack, and none have more than a cursory reference to TCP.
> The most useful documentation on the Linux TCP stack that I have
> found thus far is some of Dave Miller's excellent web pages and
> a few other web references, but overall it seems fairly skimpy
> for such an important part of the Linux network code.

Reading books or magazines may end up keeping you busy for small gains of knowledge at the end. They tend to be outdated fast. My advice is to start with a focus on one thing, watch the patches that fly around in that area and learn that way. Read the code to further understand things, then ask questions when it's not clear. Other folks may have different views. The other way to do it is to pick yourself some task to either add or improve something and get your hands dirty that way.

> It would be good to see some empirical evidence that there aren't
> any unforeseen gotchas for larger packet sizes, that at least the
> same level of performance can be obtained with no greater CPU
> utilization.

Reasonable - I will try with 9K after i move over to the new tree from Dave and make sure nothing else broke in the previous tests. And when all looks good, i will move to TCP.

> > [1] On average i spend 10x more time performance testing and analysing
> > results than writting code.
>
> As you have written previously, and I heartily agree with, this is a
> very good practice for developing performance enhancement patches.

To give you a perspective, the results i posted were each run 10 iterations per packet size per kernel. Each run is 60 seconds long. I think i am past that stage for resolving or fixing anything for UDP or pktgen, but i need to keep checking for any new regressions when Dave updates his tree. Now multiply that by 5 packet sizes (I am going to add 2 more) and multiply that by 3-4 kernels. Then add the time it takes to sift through the data, collect it, analyze it and go back to the drawing table when something doesn't look right. Essentially, it needs a weekend ;->

cheers,
jamal
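
As a rough back-of-the-envelope check of the run time described above, the sketch below multiplies out the test matrix. It counts only the raw 60-second runs, not setup, data collection or analysis; matrix_seconds() is a hypothetical helper written for this note, and the two example calls simply plug in the figures from the message (5 sizes and 3 kernels, then 7 sizes and 4 kernels).

```c
/* Total wall-clock seconds spent in raw test runs for one test matrix. */
static long matrix_seconds(int iterations, int run_seconds,
			   int packet_sizes, int kernels)
{
	return (long)iterations * run_seconds * packet_sizes * kernels;
}

/*
 * matrix_seconds(10, 60, 5, 3) is  9000 s, i.e. 2.5 hours.
 * matrix_seconds(10, 60, 7, 4) is 16800 s, i.e. 4 hours 40 minutes.
 */
```

So the runs alone take somewhere between a few hours and most of a working day before any analysis starts, which is consistent with the "it needs a weekend" estimate.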
Re: [PATCH net-2.6.24 0/3]: More TCP fixes
On Wed, 3 Oct 2007, Cedric Le Goater wrote:

> Ilpo Järvinen wrote:
> > On Wed, 3 Oct 2007, Cedric Le Goater wrote:
> >
> >> I'm dropping the previous patches you sent me and switching to this
> >> patchset.
> >> right ?
> >
> > Yes you can do that... However, there are two ways forward:
> >
> > 1) Drop and test with this patchset long enough to verify it's gone...
> > 2) No dropping and get the more exact trace by reproducing, which can
> >    point out to tcp_retrans_try_collapse confirming the source of the
> >    bug or revealing yet another bug...
> >
> > The first one has one drawback, it cannot prove the fix very well since
> > the bug could just not occur by chance... Path 2 would clearly show the
> > place from where the problem originates because we will know that it got
> > triggered! I personally would prefer path 2 but whether you want to go for
> > that depends on the time you want to invest in it...
> >
> > ...I rediffed the tcp_verify_fackets patch too (below) just in case it
> > would be something else in you case and you choose path 1 (put it on top
> > of this patchset, applies with some offsets). In case the problem is gone,
> > it shouldn't trigger and if it does, we'll have another bug caught.
>
> I have a spare node so I'm starting 2) with the 3 patches you sent and that
> last one which applied fine.

Ah, that's path 1) then... Since you seem to have enough time, I would say that path 1 is good as well, and bugs unrelated to the fix will show up there too...

I should have stated explicitly that with path 2 those 3 patches should not be applied, because the aim is not a fix but reproduction. Path 2 was intentionally left without the potential fix so that a nice backtrace tells us when we can stop trying (which would hopefully occur pretty soon) :-). But let's discard path 2...

> all of them on a fresh git pull of net-2.6.24

That's fine, they're pretty well in sync (mm and net-2.6.24, and soon 2.6.24-rcs too).

-- 
 i.
Re: [PATCH][E1000E] some cleanups
On Tue, 2007-02-10 at 10:43 -0700, Kok, Auke wrote:

> the description of this patch is rather misleading, and the title certainly
> too.

That was fast - you said weeks, not days ;->

> Can you resend this with a bit more elaborate explanation as to why the cb
> code is relevant to use here? Not only do I need to understand this, but
> others might want to as well later on ;)

I am probably repeating something you've seen/know already. The cleanup is to break up the code so it is functionally more readable from the perspective of the 4 distinct parts in ->hard_start_xmit():

a) packet formatting (example: vlan, mss, descriptor counting, etc.)
b) chip-specific formatting
c) enqueueing the packet on a DMA ring
d) IO operations to complete packet transmit, tell DMA engine to chew on,
   tx completion interrupts, set last tx time, etc.

Each of those steps sits in a different function and accumulates state that is used in the next steps. cb stores this state because it is a scratchpad the driver owns. You could create some other structure and pass it around the iteration, but why waste more bytes?

I could stop there with the explanation, but let me go on .. ;->

From a secondary angle, remember i am pulling these patches out of my batching work. That's how we started this discussion ;-> Once the driver is converted to remove LLTX, I would like to do #a without holding the tx lock. This stands on its own even without batching. Then of course, once all this is in such good shape, it becomes easier to add the batching code because i can reuse the now functionalized steps.

I hope that provides a reasonable and good explanation ;->

cheers,
jamal
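
The staged transmit path described above can be mocked up in userspace as follows. This is only a sketch of the idea, not e1000e code: every demo_* name, the 48-byte scratchpad size and the 4096-byte buffer size are assumptions made for the illustration, and steps b) and c) are folded into one function for brevity. State is copied in and out of the scratchpad with memcpy() to keep the example strictly portable; a driver would typically just cast the cb pointer.

```c
#include <string.h>

#define DEMO_CB_SIZE      48    /* assumed scratchpad size */
#define DEMO_BUF_SIZE     4096  /* assumed bytes covered by one descriptor */
#define DEMO_TX_FLAG_VLAN 0x1

/* Stand-in for an skb: just the fields this sketch needs. */
struct demo_skb {
	unsigned int len;               /* payload length in bytes */
	int has_vlan;                   /* nonzero if a VLAN tag is needed */
	unsigned char cb[DEMO_CB_SIZE]; /* driver-owned scratchpad */
};

/* Per-packet transmit state carried from one step to the next. */
struct demo_tx_cbdata {
	unsigned int desc_count;
	unsigned int tx_flags;
};

struct demo_ring {
	unsigned int size;   /* total descriptors */
	unsigned int used;   /* descriptors currently occupied */
	unsigned int kicked; /* value of 'used' last announced to the "hardware" */
};

_Static_assert(sizeof(struct demo_tx_cbdata) <= DEMO_CB_SIZE,
	       "tx state must fit in the scratchpad");

/* Step a: packet formatting. Touches only the skb, so needs no ring lock. */
static void demo_xmit_prep(struct demo_skb *skb)
{
	struct demo_tx_cbdata cb;

	cb.desc_count = (skb->len + DEMO_BUF_SIZE - 1) / DEMO_BUF_SIZE;
	cb.tx_flags = skb->has_vlan ? DEMO_TX_FLAG_VLAN : 0;
	memcpy(skb->cb, &cb, sizeof(cb));
}

/*
 * Steps b and c: read back the state saved by step a and place the packet
 * on the ring. Returns 0 on success, -1 if the ring lacks free descriptors.
 */
static int demo_xmit_enqueue(struct demo_ring *ring, const struct demo_skb *skb)
{
	struct demo_tx_cbdata cb;

	memcpy(&cb, skb->cb, sizeof(cb));
	if (ring->size - ring->used < cb.desc_count)
		return -1;
	ring->used += cb.desc_count;
	return 0;
}

/* Step d: the IO part, here reduced to publishing the new ring position. */
static void demo_xmit_kick(struct demo_ring *ring)
{
	ring->kicked = ring->used;
}
```

Splitting the path this way means a caller could run demo_xmit_prep() on several packets up front, enqueue them one after another, and call demo_xmit_kick() once at the end, which is the kind of reuse the batching work is after.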
Re: [PATCH][TG3]Some cleanups
On Tue, 2007-02-10 at 16:33 -0700, Michael Chan wrote:

> Seems ok to me. I think we should make it more clear that we're
> skipping over the VLAN tag:
>
> (struct tg3_tx_cbdata *)&((__skb)->cb[sizeof(struct vlan_skb_tx_cookie)])
>

Will do - thanks Michael.

cheers,
jamal
Re: [PATCH net-2.6.24 0/3]: More TCP fixes
Ilpo Järvinen wrote:
> On Wed, 3 Oct 2007, Cedric Le Goater wrote:
>
>> Ilpo Järvinen wrote:
>>> Sacktag fastpath_cnt_hint seems to be very tricky to get right...
>>> I suppose this one fixes Cedric's case. I cannot say for sure
>>> until there is something more definite indication of
>>> tcp_retrans_try_collapse origin than what the simple late WARN_ON
>>> gave for us. ...Especially since it's non-trivial to have skb
>>> hint "correctly" positioned in the write_queue while still ending
>>> up calling that function. However, considering how difficult it
>>> seems to be for Cedric to reproduce, it might well be this one.
>>>
>>> In addition, I noticed another reset which wasn't previously
>>> converted to WARN_ON, so doing that now. Boot + simple xfer
>>> tested. Please apply to net-2.6.24.
>> I'm dropping the previous patches you sent me and switching to this
>> patchset.
>> right ?
>
> Yes you can do that... However, there are two ways forward:
>
> 1) Drop and test with this patchset long enough to verify it's gone...
> 2) No dropping and get the more exact trace by reproducing, which can
>    point out to tcp_retrans_try_collapse confirming the source of the
>    bug or revealing yet another bug...
>
> The first one has one drawback, it cannot prove the fix very well since
> the bug could just not occur by chance... Path 2 would clearly show the
> place from where the problem originates because we will know that it got
> triggered! I personally would prefer path 2 but whether you want to go for
> that depends on the time you want to invest in it...
>
> ...I rediffed the tcp_verify_fackets patch too (below) just in case it
> would be something else in you case and you choose path 1 (put it on top
> of this patchset, applies with some offsets). In case the problem is gone,
> it shouldn't trigger and if it does, we'll have another bug caught.

I have a spare node so I'm starting 2) with the 3 patches you sent and that last one which applied fine.

all of them on a fresh git pull of net-2.6.24

> Anyway, thanks for ccing right persons and netdev right from the
> beginning.

thanks to git ! :)

C.
Re: [PATCH net-2.6.24 0/3]: More TCP fixes
On Wed, 3 Oct 2007, Cedric Le Goater wrote:

> Ilpo Järvinen wrote:
> > Sacktag fastpath_cnt_hint seems to be very tricky to get right...
> > I suppose this one fixes Cedric's case. I cannot say for sure
> > until there is something more definite indication of
> > tcp_retrans_try_collapse origin than what the simple late WARN_ON
> > gave for us. ...Especially since it's non-trivial to have skb
> > hint "correctly" positioned in the write_queue while still ending
> > up calling that function. However, considering how difficult it
> > seems to be for Cedric to reproduce, it might well be this one.
> >
> > In addition, I noticed another reset which wasn't previously
> > converted to WARN_ON, so doing that now. Boot + simple xfer
> > tested. Please apply to net-2.6.24.
>
> I'm dropping the previous patches you sent me and switching to this patchset.
> right ?

Yes you can do that... However, there are two ways forward:

1) Drop and test with this patchset long enough to verify it's gone...
2) No dropping and get the more exact trace by reproducing, which can
   point out to tcp_retrans_try_collapse confirming the source of the
   bug or revealing yet another bug...

The first one has one drawback: it cannot prove the fix very well, since the bug could just not occur by chance... Path 2 would clearly show the place from where the problem originates because we will know that it got triggered! I personally would prefer path 2, but whether you want to go for that depends on the time you want to invest in it...

...I rediffed the tcp_verify_fackets patch too (below) just in case it is something else in your case and you choose path 1 (put it on top of this patchset, applies with some offsets). In case the problem is gone, it shouldn't trigger, and if it does, we'll have another bug caught.

Anyway, thanks for ccing the right persons and netdev right from the beginning.

-- 
 i.
 include/net/tcp.h     |    3 +
 net/ipv4/tcp_input.c  |   25 +---
 net/ipv4/tcp_ipv4.c   |  103 +
 net/ipv4/tcp_output.c |    6 ++-
 4 files changed, 130 insertions(+), 7 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 991ccdc..54a0d91 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -43,6 +43,9 @@
 #include
 
+extern void tcp_verify_fackets(struct sock *sk);
+extern void tcp_print_queue(struct sock *sk);
+
 extern struct inet_hashinfo tcp_hashinfo;
 
 extern atomic_t tcp_orphan_count;
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 87c9ef5..93bdc20 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1140,7 +1140,7 @@ static int tcp_check_dsack(struct tcp_sock *tp, struct sk_buff *ack_skb,
 	return dup_sack;
 }
 
-static int
+int
 tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_una)
 {
 	const struct inet_connection_sock *icsk = inet_csk(sk);
@@ -1160,8 +1160,10 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
 	int first_sack_index;
 
 	if (!tp->sacked_out) {
-		if (WARN_ON(tp->fackets_out))
+		if (WARN_ON(tp->fackets_out)) {
 			tp->fackets_out = 0;
+			tcp_print_queue(sk);
+		}
 		tp->highest_sack = tp->snd_una;
 	}
 	prior_fackets = tp->fackets_out;
@@ -1421,6 +1423,7 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
 			}
 		}
 	}
+	tcp_verify_fackets(sk);
 
 	/* Check for lost retransmit. This superb idea is
 	 * borrowed from "ratehalving". Event "C".
@@ -1633,13 +1636,14 @@ void tcp_enter_frto(struct sock *sk)
 	tcp_set_ca_state(sk, TCP_CA_Disorder);
 	tp->high_seq = tp->snd_nxt;
 	tp->frto_counter = 1;
+	tcp_verify_fackets(sk);
 }
 
 /* Enter Loss state after F-RTO was applied. Dupack arrived after RTO,
  * which indicates that we should follow the traditional RTO recovery,
  * i.e. mark everything lost and do go-back-N retransmission.
  */
-static void tcp_enter_frto_loss(struct sock *sk, int allowed_segments, int flag)
+void tcp_enter_frto_loss(struct sock *sk, int allowed_segments, int flag)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct sk_buff *skb;
@@ -1676,6 +1680,7 @@ static void tcp_enter_frto_loss(struct sock *sk, int allowed_segments, int flag)
 		}
 	}
 	tcp_verify_left_out(tp);
+	tcp_verify_fackets(sk);
 
 	tp->snd_cwnd = tcp_packets_in_flight(tp) + allowed_segments;
 	tp->snd_cwnd_cnt = 0;
@@ -1754,6 +1759,7 @@ void tcp_enter_loss(struct sock *sk, int how)
 		}
 	}
 	tcp_verify_left_out(tp);
+	tcp_verify_fackets(sk);
 
 	tp->reordering = min_t(unsigned int, tp->reordering,
 			       sysctl_tcp_reordering);
@@ -2309,7 +2315,7 @@ static void tcp_mtu
Re: [PATCH net-2.6.24 0/3]: More TCP fixes
Hello Ilpo !

Ilpo Järvinen wrote:
> Hi Dave,
>
> Sacktag fastpath_cnt_hint seems to be very tricky to get right...
> I suppose this one fixes Cedric's case. I cannot say for sure
> until there is something more definite indication of
> tcp_retrans_try_collapse origin than what the simple late WARN_ON
> gave for us. ...Especially since it's non-trivial to have skb
> hint "correctly" positioned in the write_queue while still ending
> up calling that function. However, considering how difficult it
> seems to be for Cedric to reproduce, it might well be this one.
>
> In addition, I noticed another reset which wasn't previously
> converted to WARN_ON, so doing that now. Boot + simple xfer
> tested. Please apply to net-2.6.24.

I'm dropping the previous patches you sent me and switching to this patchset.
right ?

Thanks,

C.
[PATCH net-2.6.24 0/3]: More TCP fixes
Hi Dave,

Sacktag fastpath_cnt_hint seems to be very tricky to get right...
I suppose this one fixes Cedric's case. I cannot say for sure
until there is something more definite indication of
tcp_retrans_try_collapse origin than what the simple late WARN_ON
gave for us. ...Especially since it's non-trivial to have skb
hint "correctly" positioned in the write_queue while still ending
up calling that function. However, considering how difficult it
seems to be for Cedric to reproduce, it might well be this one.

In addition, I noticed another reset which wasn't previously
converted to WARN_ON, so doing that now. Boot + simple xfer
tested. Please apply to net-2.6.24.

-- 
 i.
[PATCH 3/3] [TCP]: "Annotate" another fackets_out state reset
This reset should no longer be necessary because fackets_out is
accurate. If it ever triggers, it indicates a bug elsewhere, thus
report it.

Signed-off-by: Ilpo Järvinen <[EMAIL PROTECTED]>
---
 net/ipv4/tcp_input.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index e22ffe7..87c9ef5 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1160,7 +1160,8 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
 	int first_sack_index;
 
 	if (!tp->sacked_out) {
-		tp->fackets_out = 0;
+		if (WARN_ON(tp->fackets_out))
+			tp->fackets_out = 0;
 		tp->highest_sack = tp->snd_una;
 	}
 	prior_fackets = tp->fackets_out;
-- 
1.5.0.6
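Editor's sketch: the warn-then-repair idiom used in this patch can be modelled in user space roughly as follows. This is only an illustration under stated assumptions, not kernel code: `struct toy_tp`, `toy_warn_on()` and `toy_sacktag_entry()` are hypothetical names, and the `highest_sack` update from the real function is left out.

```c
#include <stdio.h>

/* Hypothetical stand-in for the two tcp_sock counters the patch touches. */
struct toy_tp {
	unsigned int sacked_out;
	unsigned int fackets_out;
};

/* Toy WARN_ON: report when cond is true and hand the truth value back,
 * so it can sit inside an if () the way WARN_ON does in the patch. */
static int toy_warn_on(int cond, const char *what)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond != 0;
}

/* Models the patched branch: with nothing SACKed, fackets_out should
 * already be zero. If it is not, warn and repair. Returns 1 if it warned. */
static int toy_sacktag_entry(struct toy_tp *tp)
{
	int warned = 0;

	if (!tp->sacked_out) {
		if (toy_warn_on(tp->fackets_out != 0,
				"fackets_out != 0 while sacked_out == 0")) {
			tp->fackets_out = 0;
			warned = 1;
		}
	}
	return warned;
}
```

The point of the idiom is that a consistent state passes silently, while an inconsistent one is both reported and corrected, so the stale counter cannot propagate further.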
[PATCH 1/3] [TCP]: Fix two off-by-one errors in fackets_out adjusting logic
1) Passing wrong skb to tcp_adjust_fackets_out could corrupt
   fastpath_cnt_hint as tcp_skb_pcount(next_skb) is not included
   in it if hint points exactly to the next_skb (it's lagging
   behind, see sacktag).

2) When fastpath_skb_hint is put backwards to avoid dangling
   skb reference, the skb's pcount must also be removed from
   count (not included like above).

Reported by Cedric Le Goater <[EMAIL PROTECTED]>

Signed-off-by: Ilpo Järvinen <[EMAIL PROTECTED]>
---
 net/ipv4/tcp_output.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 6199abe..5329675 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1755,14 +1755,16 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *skb, int m
 		if (tcp_is_reno(tp) && tp->sacked_out)
 			tcp_dec_pcount_approx(&tp->sacked_out, next_skb);
 
-		tcp_adjust_fackets_out(tp, skb, tcp_skb_pcount(next_skb));
+		tcp_adjust_fackets_out(tp, next_skb, tcp_skb_pcount(next_skb));
 		tp->packets_out -= tcp_skb_pcount(next_skb);
 
 		/* changed transmit queue under us so clear hints */
 		tcp_clear_retrans_hints_partial(tp);
 		/* manually tune sacktag skb hint */
-		if (tp->fastpath_skb_hint == next_skb)
+		if (tp->fastpath_skb_hint == next_skb) {
 			tp->fastpath_skb_hint = skb;
+			tp->fastpath_cnt_hint -= tcp_skb_pcount(skb);
+		}
 
 		sk_stream_free_skb(sk, next_skb);
 	}
-- 
1.5.0.6
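Editor's sketch: the two off-by-one cases can be illustrated with a small user-space model. Everything here is an assumption made for illustration: the write queue is an array of per-segment packet counts, `toy_queue`, `toy_count_before()` and `toy_collapse()` are hypothetical names, and the merged segment is assumed to keep its own pcount. The invariant modelled is the one the patch describes, namely that the count hint excludes the hinted segment's own pcount.

```c
#include <stddef.h>

#define TOY_QLEN_MAX 16

/* Toy write queue: pcount[k] is the packet count of segment k. */
struct toy_queue {
	int pcount[TOY_QLEN_MAX];
	size_t len;
	size_t hint;	/* index of the hinted segment */
	int cnt_hint;	/* packets in segments before hint (lags by pcount[hint]) */
};

/* Reference value the hint must match: packets strictly before idx. */
static int toy_count_before(const struct toy_queue *q, size_t idx)
{
	int sum = 0;

	for (size_t k = 0; k < idx; k++)
		sum += q->pcount[k];
	return sum;
}

/* Collapse segment i + 1 into segment i and keep the hint consistent. */
static void toy_collapse(struct toy_queue *q, size_t i)
{
	size_t next = i + 1;

	if (next >= q->len)
		return;

	if (q->hint > next) {
		/* Hint lies beyond the removed segment, so its count
		 * included that segment's packets: take them out. */
		q->cnt_hint -= q->pcount[next];
		q->hint--;	/* array indexes shift down by one */
	} else if (q->hint == next) {
		/* Case 1: the removed segment's packets were never in the
		 * count, so nothing is subtracted for it.
		 * Case 2: the hint moves back to segment i, whose own
		 * packets must now leave the count. */
		q->hint = i;
		q->cnt_hint -= q->pcount[i];
	}

	for (size_t k = next; k + 1 < q->len; k++)
		q->pcount[k] = q->pcount[k + 1];
	q->len--;
}
```

After any collapse, `q->cnt_hint == toy_count_before(q, q->hint)` should still hold; subtracting the removed segment's pcount in the `hint == next` branch, or forgetting the second adjustment, breaks that equality by exactly one segment's worth of packets.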
[PATCH 2/3] [TCP]: Comment fastpath_cnt_hint off-by-one trap
Signed-off-by: Ilpo Järvinen <[EMAIL PROTECTED]>
---
 include/linux/tcp.h |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index f8cf090..9ff456e 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -343,7 +343,8 @@ struct tcp_sock {
 	struct sk_buff *forward_skb_hint;
 	struct sk_buff *fastpath_skb_hint;
 
-	int     fastpath_cnt_hint;
+	int     fastpath_cnt_hint;	/* Lags behind by current skb's pcount
+					 * compared to respective fackets_out */
 	int     lost_cnt_hint;
 	int     retransmit_cnt_hint;
 
-- 
1.5.0.6
net-2.6.24 plans
I'm a bit behind after investigating the TCP performance issues
that turned out to be HW specific problems. It's a bit of a
disappointment, I thought maybe there was a cool bug to fix in
TCP :-)

Anyways, that means there are patches backlogged in my inbox and
it is also about time to do the hopefully last rebase of the
net-2.6.24 tree.

I merged in Jeff Garzik's and John Linville's latest and I've been
running the current tree on my workstation most of today with
good results so far.

Linus should release the final 2.6.23 very soon, let's kind of
assume it will happen over the next 3 or 4 days. That means
we need to bear down for the merge.

I plan to commit my Neptune driver in its current state, and
that's the last new feature going in.

You can help make the merge go swimmingly by picking some nagging
issue you noticed and tracking it down. If you can figure out why
something happens but can't or don't have time to come up with
a fix, report what you've discovered. If you can provide the
fix too, all the better.

That's how I get backlogged, I'm working on A and notice some
problem with B, then I refuse to go back to A until I bring
closure to B. :)
Re: kernel 2.4 vs 2.6 Traffic Controller performance
On 10/3/07, Eric Dumazet <[EMAIL PROTECTED]> wrote:
> Sonny wrote:
> > Hello
> > This is a repost, there seems to have been a misunderstanding before.
> >
> > I hope this is the right place to ask this. Does anyone know if there is a
> > substantial difference in the performance of the traffic controller
> > between kernel 2.4 and 2.6. We tested it using 1 iperf server and use
> > 250 and 500 clients, altering the burst.
> >
> > This is the set-up:
> > iperf client - router (w/ traffic controller) - iperf server
> >
> > We use the top command inside the router to check the idle time of our
> > router to see this. The results we got from the 2.4 kernel shows
> > around 65-70% idle time while the 2.6 shows
> > 60-65% idle time. We tried to use MRTG and we're not getting any
> > results either. We want to know if we could improve the bandwidth by
> > upgrading the kernel, else we would have to get a new bandwidth
> > manager. Has anyone performed a similar test or can suggest a better
> > way to do this. Thanks in advance.
>
> Hi Sonny
>
> I am not sure what you are asking here. 65-70% idle time (or 60-65%) is fine.
>
> 2.6 is also not very meaningful, there are a lot of changes between 2.6.0 and
> 2.6.23 :)

we're using 2.6.22

> Why should you upgrade kernel ?

we would like to test the difference between the two kernels' performance

> What bandwidth do you handle ?

10 mbps

> What kind of platform is it ? (a new kernel won't help much if it's a real old
> machine, or old NICs)

it's a P IV 2.8 GHz HT with 512 MB

> You seem to have some bandwidth problem but focus on cpu affairs...

Bandwidth is not a problem, we can get 10mbps without a hitch. But we
would like to know the scalability on the CPU vs the number of
clients. So far, for both kernels, we're getting 50% CPU utilization
using 500 clients and 384 burst kbps each.
Re: [PATCH] sky2: jumbo frame regression fix
On Tue, Oct 02, 2007 at 09:59:14PM -0700, Stephen Hemminger wrote:
> On Wed, 03 Oct 2007 03:34:34 +0200
> Ian Kumlien <[EMAIL PROTECTED]> wrote:
>
> > On Tue, 2007-10-02 at 18:02 -0700, Stephen Hemminger wrote:
> > > Remove unneeded check that caused problems with jumbo frame sizes.
> > > The check was recently added and is wrong.
> > > When using jumbo frames the sky2 driver does fragmentation, so
> > > rx_data_size is less than mtu.
> >
> > Confirmed working.
> >
> > Now running with 9k mtu with no errors, =)
> >
> > It also seems that the FIFO bug was the one that affected me before,
> > damn odd race that one.
>
> Does the workaround (forced reset) work? Ian, you are the first person to
> report triggering it. I haven't found a way to make it happen.
> What combination of flow control and speeds are you using?

I forgot to add, last time was -rc8-git2 or 3 and using Westwood flow
control.

> --
> Stephen Hemminger <[EMAIL PROTECTED]>
Re: tcp bw in 2.6
From: [EMAIL PROTECTED] (Larry McVoy)
Date: Tue, 2 Oct 2007 15:36:44 -0700

> On Tue, Oct 02, 2007 at 03:32:16PM -0700, David Miller wrote:
> > I'm starting to have a theory about what the bad case might
> > be.
> >
> > A strong sender going to an even stronger receiver which can
> > pull out packets into the process as fast as they arrive.
> > This might be part of what keeps the receive window from
> > growing.
>
> I can back you up on that. When I straced the receiving side that goes
> slowly, all the reads were short, like 1-2K. The way that works the
> reads were a lot larger as I recall.

My issue turns out to be hardware specific too.

The two Broadcom 5714 onboard NICs on my Niagara t1000 give bad packet
receive performance for some reason, the other two which are Broadcom
5704's are perfectly fine. I'll figure out what the problem is,
probably some misprogrammed register in either the chip or the bridge
it's behind.

The UDP stream test of netperf is great for isolating TCP/TSO vs.
hardware issues. If you can't saturate the pipe or the cpu with the
UDP stream test, it's likely a hardware issue.

The cpu utilization and service demand numbers provided, on both send
and receive, are really useful for diagnosing problems like this.

Rick deserves several beers for his work on this cool toy. :)
Re: [PATCH] sky2: jumbo frame regression fix
On Tue, 2007-10-02 at 21:59 -0700, Stephen Hemminger wrote:
> On Wed, 03 Oct 2007 03:34:34 +0200
> Ian Kumlien <[EMAIL PROTECTED]> wrote:
>
> > On Tue, 2007-10-02 at 18:02 -0700, Stephen Hemminger wrote:
> > > Remove unneeded check that caused problems with jumbo frame sizes.
> > > The check was recently added and is wrong.
> > > When using jumbo frames the sky2 driver does fragmentation, so
> > > rx_data_size is less than mtu.
> >
> > Confirmed working.
> >
> > Now running with 9k mtu with no errors, =)
> >
> > It also seems that the FIFO bug was the one that affected me before,
> > damn odd race that one.
>
> Does the workaround (forced reset) work? Ian, you are the first person to
> report triggering it. I haven't found a way to make it happen.
> What combination of flow control and speeds are you using?

Yes it works, it's the problem I had all along =)

As to how to make it happen, that's a bit harder...

To me it seems like it's a combination of several connections and
somewhat high bandwidth, but you have to send data for it to happen...

To me it usually happens when seeding files via Bittorrent, but it seems
like it has to be somewhat special circumstances to actually trigger it.

I use jumbo frames, my lan is gigabit, to my firewall. From the firewall
it's common 1500 mtu 100mbit and I doubt that this has anything to do
with it (if it's not a 'number of frames that can be stored' problem and
thus the mtu limits it to a really small value making it easier to
trigger).

Well, that's my thoughts at least, but then I just got up after having
slept 5 hours, so =)

-- 
Ian Kumlien  -- http://pomac.netswarm.net
Re: tcp bw in 2.6
Tangential aside:

On Tue, 02 Oct 2007, Rick Jones wrote:

> *) depending on the quantity of CPU around, and the type of test one is
> running, results can be better/worse depending on the CPU to which you bind
> the application. Latency tends to be best when running on the same core as
> takes interrupts from the NIC, bulk transfer can be better when running on a
> different core, although generally better when a different core on the same
> chip. These days the throughput stuff is more easily seen on 10G, but the
> netperf service demand changes are still visible on 1G.

Interesting. I was going to say that I've generally had the opposite
experience when it comes to bulk data transfers, which is what I would
expect due to CPU caching effects, but that perhaps it's motherboard/NIC/
driver dependent. But in testing I just did I discovered it's even
MTU dependent (most of my normal testing is always with 9000-byte
jumbo frames).

With Myricom 10-GigE NICs, NIC interrupts on CPU 0 and nuttcp app
running on CPU 1 (both transmit and receive sides), and using 9000-byte
jumbo frames:

[EMAIL PROTECTED] ~]# nuttcp -w10m 192.168.88.16
10078.5000 MB /  10.02 sec = 8437.5396 Mbps 100 %TX 99 %RX

With Myricom 10-GigE NICs, and both NIC interrupts and nuttcp app
on CPU 0 (both transmit and receive sides), again using 9000-byte
jumbo frames:

[EMAIL PROTECTED] ~]# nuttcp -w10m 192.168.88.16
11817.8750 MB /  10.00 sec = 9909.7537 Mbps 100 %TX 74 %RX

Same tests repeated with standard 1500-byte Ethernet MTU:

With Myricom 10-GigE NICs, NIC interrupts on CPU 0 and nuttcp app
running on CPU 1 (both transmit and receive sides), and using
standard 1500-byte Ethernet MTU:

[EMAIL PROTECTED] ~]# nuttcp -M1460 -w10m 192.168.88.16
 5685.9375 MB /  10.00 sec = 4768.0951 Mbps 99 %TX 98 %RX

With Myricom 10-GigE NICs, and both NIC interrupts and nuttcp app
on CPU 0 (both transmit and receive sides), again using standard
1500-byte Ethernet MTU:

[EMAIL PROTECTED] ~]# nuttcp -M1460 -w10m 192.168.88.16
 4974.0625 MB /  10.03 sec = 4161.6015 Mbps 100 %TX 100 %RX

Now back to your regularly scheduled programming. :-)

						-Bill
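Editor's sketch: the Mbps column in the nuttcp results above can be cross-checked from the MB and seconds columns. The helper below is hypothetical (`toy_mbps()` is not part of nuttcp) and assumes that "MB" means 2^20 bytes and "Mbps" means 10^6 bits per second, which is an inference from the figures rather than something stated in the message.

```c
/* Convert a transfer size and elapsed time to a rate.
 * Assumption: 1 MB = 1024 * 1024 bytes, 1 Mbps = 1e6 bits per second. */
static double toy_mbps(double mbytes, double seconds)
{
	const double bytes_per_mb = 1024.0 * 1024.0;

	return mbytes * bytes_per_mb * 8.0 / seconds / 1e6;
}
```

Under that assumption, feeding in the four MB / sec pairs reproduces the reported rates to within roughly 0.1 percent; the small differences are consistent with the elapsed time being printed to only two decimal places.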
Re: [PATCH 5/7] CAN: Add virtual CAN netdevice driver
David Miller wrote:
> From: Arnaldo Carvalho de Melo <[EMAIL PROTECTED]>
> Date: Tue, 2 Oct 2007 18:43:25 -0300
>
> > I think that helping ctags to find the definition for the debug variable
> > to see, for instance, if it is a bitmask or a boolean without having to
> > choose from tons of 'debug' variables is a good thing.
>
> I completely agree.

OK. No problem if it's helpful.
We'll change debug to vcan_debug.

Thanks.

Oliver
Re: kernel 2.4 vs 2.6 Traffic Controller performance
Sonny wrote:
> Hello
> This is a repost, there seems to have been a misunderstanding before.
>
> I hope this is the right place to ask this. Does anyone know if there is a
> substantial difference in the performance of the traffic controller
> between kernel 2.4 and 2.6. We tested it using 1 iperf server and use
> 250 and 500 clients, altering the burst.
>
> This is the set-up:
> iperf client - router (w/ traffic controller) - iperf server
>
> We use the top command inside the router to check the idle time of our
> router to see this. The results we got from the 2.4 kernel shows
> around 65-70% idle time while the 2.6 shows
> 60-65% idle time. We tried to use MRTG and we're not getting any
> results either. We want to know if we could improve the bandwidth by
> upgrading the kernel, else we would have to get a new bandwidth
> manager. Has anyone performed a similar test or can suggest a better
> way to do this. Thanks in advance.

Hi Sonny

I am not sure what you are asking here. 65-70% idle time (or 60-65%) is fine.

2.6 is also not very meaningful, there are a lot of changes between 2.6.0 and
2.6.23 :)

Why should you upgrade kernel ?

What bandwidth do you handle ?

What kind of platform is it ? (a new kernel won't help much if it's a real old
machine, or old NICs)

You seem to have some bandwidth problem but focus on cpu affairs...