Re: [patch 12/19] fix irq problem with NAPI + NETPOLL
On Thu, 08 Mar 2007 10:35:13 +0900 (JST), Atsushi Nemoto [EMAIL PROTECTED] wrote:

netpoll_rx() should be invokable from hardware interrupt context. What is the crash you are seeing?

The problem is not netpoll_rx(). It is supposed to be callable from irq context. The problem is that netif_receive_skb() is called from irq context even though it does not seem to be designed for that. Unfortunately I could not reproduce the crash, but IIRC it happened in an upper protocol layer, in hardware interrupt context. Anyway, I think the main path of netif_receive_skb() should not be executed in hardware interrupt context. Is that wrong?

It looks like perhaps the kfree_skb() calls need to be modified in __netpoll_rx().

Well, that seems to be another netpoll bug. I suppose these kfree_skb() calls in __netpoll_rx() should be dev_kfree_skb_any().

And I found another misuse that is unrelated to netpoll. netif_rx() calls kfree_skb() at its end. netif_rx() should be callable from hardware interrupt context, so that call should be changed to dev_kfree_skb_any(). Is that right?

---
Atsushi Nemoto
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
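The difference between the two free routines discussed above can be sketched in userspace. This is only a model, assuming that dev_kfree_skb_any() takes the deferred, IRQ-safe path when running in hard-IRQ context and the direct path otherwise; every `mock_` name is hypothetical and none of this is kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff; only what the sketch needs. */
struct mock_skb {
	struct mock_skb *next;
	bool freed;
};

/* Simulated execution context: true while "inside" a hardware interrupt. */
static bool mock_in_hardirq;

/* Simulated completion queue that is drained later, outside hard-IRQ context. */
static struct mock_skb *mock_completion_queue;

/* Direct free: models kfree_skb(), which does the full release work now. */
static void mock_kfree_skb(struct mock_skb *skb)
{
	skb->freed = true;
}

/* Deferred free: models dev_kfree_skb_irq(), which only queues the buffer. */
static void mock_kfree_skb_irq(struct mock_skb *skb)
{
	skb->next = mock_completion_queue;
	mock_completion_queue = skb;
}

/* Models dev_kfree_skb_any(): pick the path from the current context. */
static void mock_kfree_skb_any(struct mock_skb *skb)
{
	if (mock_in_hardirq)
		mock_kfree_skb_irq(skb);
	else
		mock_kfree_skb(skb);
}

/* Models the later softirq pass that frees everything queued from IRQ context.
 * Returns the number of buffers released. */
static int mock_drain_completion_queue(void)
{
	int n = 0;

	while (mock_completion_queue) {
		struct mock_skb *skb = mock_completion_queue;

		mock_completion_queue = skb->next;
		mock_kfree_skb(skb);
		n++;
	}
	return n;
}
```

In this model a buffer released from "interrupt context" is not actually freed until the queue is drained, which is the property a caller such as netif_rx() would rely on.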
Re: Removal of multipath cached (was Re: [PATCH] [REVISED] net/ipv4/multipath_wrandom.c: check kmalloc() return value.)
On Mon, Mar 12, 2007 at 10:22:36PM -0800, Andrew Morton wrote:
> On Mon, 12 Mar 2007 13:53:11 -0700 (PDT) David Miller [EMAIL PROTECTED] wrote:
...
> > And there are absolutely no negotiations about this, I've held back
> > on this for nearly 2 years, and nothing has happened, this code is
> > not maintained, nobody cares enough to fix the bugs, and even no
> > distributions enable it because it causes crashes.
>
> Good stuff. I suggest you put a big printk explaining the above into 2.6.21.

Plus the official way: Documentation/feature-removal-schedule.txt in the next rc-git.

Jarek P.
Re: Extensible hashing and RCU
On Tuesday 13 March 2007 10:32, Evgeniy Polyakov wrote:
> On Fri, Mar 02, 2007 at 11:52:47AM +0300, Evgeniy Polyakov ([EMAIL PROTECTED]) wrote:
>
> So, I ask network developers about testing environment for socket lookup
> benchmarking. What would be the best test case to determine performance
> of the lookup algo? Is it enough to replace algo and locking and create
> say one million of connections and try to run trivial web server (that
> is what I'm going to test if there will not be any better suggestion,
> but I only have single-core athlon 64 with 1gb of ram as a test bed and
> two core duo machines as generators, probably I can use one of them as a
> test machine too. They have gigabit adapters and are connected over
> gigabit switch)?

One million concurrent sockets on your machines will be tricky :)

$ egrep "(filp|dent|^TCP|sock_inode_cache)" /proc/slabinfo |cut -c1-40
TCP                   12     14   1152
sock_inode_cache     423    430    384
dentry_cache       36996  47850    132
filp                4081   4680    192

that means at the minimum 1860 bytes of LOWMEM per tcp socket on 32bit kernel,
(2512 bytes on a 64bit kernel)

I had one bench program but apparently I lost it :(
It was able to open long lived sockets, (one million if enough memory), and
was generating kind of random traffic on all sockets. damned.

The 'server' side had to listen to many (16) ports because of the 65536 limit.
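The arithmetic behind the 1860-byte figure can be checked with a short sketch. The four object sizes are the ones in the slabinfo output above (1152, 384, 132 and 192 bytes); the function names are hypothetical, and reading the "65536 limit" as the 16-bit source port space available per client address and server port is an assumption, not something the message spells out.

```c
#include <stdint.h>

/* Per-object slab sizes in bytes, as reported by the slabinfo output. */
enum {
	TCP_SOCK_SIZE   = 1152,
	SOCK_INODE_SIZE = 384,
	DENTRY_SIZE     = 132,
	FILP_SIZE       = 192,
};

/* Minimum kernel memory consumed by one TCP socket (32-bit figures). */
static uint64_t bytes_per_socket(void)
{
	return TCP_SOCK_SIZE + SOCK_INODE_SIZE + DENTRY_SIZE + FILP_SIZE;
}

/* Kernel memory needed to keep nsockets sockets open at once. */
static uint64_t lowmem_needed(uint64_t nsockets)
{
	return nsockets * bytes_per_socket();
}

/* Upper bound on connections from one client address to one server address,
 * assuming 65536 source ports per listening port (an assumption). */
static uint64_t max_connections(unsigned int listen_ports)
{
	return (uint64_t)listen_ports * 65536;
}
```

Under these assumptions one million sockets need about 1.86 GB of kernel memory on a 32-bit kernel, and 16 listening ports is the smallest count that allows more than one million connections from a single client address.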
Re: Extensible hashing and RCU
On Tue, Mar 13, 2007 at 11:08:27AM +0100, Eric Dumazet ([EMAIL PROTECTED]) wrote:
> On Tuesday 13 March 2007 10:32, Evgeniy Polyakov wrote:
> > On Fri, Mar 02, 2007 at 11:52:47AM +0300, Evgeniy Polyakov ([EMAIL PROTECTED]) wrote:
> >
> > So, I ask network developers about testing environment for socket lookup
> > benchmarking. What would be the best test case to determine performance
> > of the lookup algo? Is it enough to replace algo and locking and create
> > say one million of connections and try to run trivial web server (that
> > is what I'm going to test if there will not be any better suggestion,
> > but I only have single-core athlon 64 with 1gb of ram as a test bed and
> > two core duo machines as generators, probably I can use one of them as a
> > test machine too. They have gigabit adapters and are connected over
> > gigabit switch)?
>
> One million concurrent sockets on your machines will be tricky :)
>
> $ egrep "(filp|dent|^TCP|sock_inode_cache)" /proc/slabinfo |cut -c1-40
> TCP                   12     14   1152
> sock_inode_cache     423    430    384
> dentry_cache       36996  47850    132
> filp                4081   4680    192
>
> that means at the minimum 1860 bytes of LOWMEM per tcp socket on 32bit kernel,
> (2512 bytes on a 64bit kernel)
>
> I had one bench program but apparently I lost it :(
> It was able to open long lived sockets, (one million if enough memory), and
> was generating kind of random traffic on all sockets. damned.
>
> The 'server' side had to listen to many (16) ports because of the 65536 limit.

Yep, I was too optimistic about my hardware - given the size of a tcp
socket, it is impossible even to create that many of them with 1 or 2 gb
of ram.

Well, I can run additional tests in userspace (ideally with hugetlb
support, but given that both socket hash table and my algo use
essentially the same amount of ram it should not matter) with more
precise analysis... And just send a patch with detailed description.
-- Evgeniy Polyakov
Re: [PATCH 1/5] NetXen: Fix softlock seen on some machines during hardware writes
On Friday 09 March 2007 21:56, Stephen Hemminger wrote:
> Linsys Contractor Mithlesh Thukral wrote:
> > NetXen: This will fix a softlock seen on some machines.
> > The reason was too much time was spent waiting for writes to go through.
> >
> > Signed-off by: Mithlesh Thukral [EMAIL PROTECTED]
> > ---
> >
> >  drivers/net/netxen/netxen_nic.h         |    1 +
> >  drivers/net/netxen/netxen_nic_ethtool.c |    1 +
> >  drivers/net/netxen/netxen_nic_init.c    |   11 +++++++++--
> >  3 files changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/netxen/netxen_nic.h b/drivers/net/netxen/netxen_nic.h
> > index 38d7409..c85c2cb 100644
> > --- a/drivers/net/netxen/netxen_nic.h
> > +++ b/drivers/net/netxen/netxen_nic.h
> > @@ -236,6 +236,7 @@ #define MPORT_MULTI_FUNCTION_MODE 0x
> >
> >  #include "netxen_nic_phan_reg.h"
> >  extern unsigned long long netxen_dma_mask;
> > +extern unsigned long last_schedule_time;
> >
> >  /*
> >   * NetXen host-peg signal message structure
> > diff --git a/drivers/net/netxen/netxen_nic_ethtool.c b/drivers/net/netxen/netxen_nic_ethtool.c
> > index 3752d2a..d49a7d8 100644
> > --- a/drivers/net/netxen/netxen_nic_ethtool.c
> > +++ b/drivers/net/netxen/netxen_nic_ethtool.c
> > @@ -455,6 +455,7 @@ netxen_nic_set_eeprom(struct net_device
> >  	}
> >  	printk(KERN_INFO "%s: flash unlocked. \n",
> >  		netxen_nic_driver_name);
> > +	last_schedule_time = jiffies;
> >  	ret = netxen_flash_erase_secondary(adapter);
> >  	if (ret != FLASH_SUCCESS) {
> >  		printk(KERN_ERR "%s: Flash erase failed.\n",
> > diff --git a/drivers/net/netxen/netxen_nic_init.c b/drivers/net/netxen/netxen_nic_init.c
> > index b2e776f..53ca21e 100644
> > --- a/drivers/net/netxen/netxen_nic_init.c
> > +++ b/drivers/net/netxen/netxen_nic_init.c
> > @@ -42,6 +42,8 @@ struct crb_addr_pair {
> >  	u32 data;
> >  };
> >
> > +unsigned long last_schedule_time;
> > +
> >  #define NETXEN_MAX_CRB_XFORM 60
> >  static unsigned int crb_addr_xform[NETXEN_MAX_CRB_XFORM];
> >  #define NETXEN_ADDR_ERROR (0x)
> > @@ -404,9 +406,14 @@ static inline int do_rom_fast_write(stru
> >  static inline int
> >  do_rom_fast_read(struct netxen_adapter *adapter, int addr, int *valp)
> >  {
> > +	if (jiffies > (last_schedule_time + (8 * HZ))) {
> > +		last_schedule_time = jiffies;
> > +		schedule();
> > +	}
> > +
> >  	netxen_nic_reg_write(adapter, NETXEN_ROMUSB_ROM_ADDRESS, addr);
> >  	netxen_nic_reg_write(adapter, NETXEN_ROMUSB_ROM_ABYTE_CNT, 3);
> > -	udelay(70);		/* prevent bursting on CRB */
> > +	udelay(100);	/* prevent bursting on CRB */
>
> To prevent PCI write posting issues, you should always do a dummy read
> before any delay.

This is a good suggestion. I have code in place that does a dummy read of a
hardware location before the delay, but so far I have tested it only on some
machines. I would like to test it on as many hardware configurations as
possible before submitting it. Along with that, I am also trying to reduce
the delay as much as possible.
Until then, this patch will make the code work on all hardware platforms
(including the ones which require more delay) as well as prevent a softlockup
from occurring.

Thanks,
Mithlesh Thukral
[PATCH 0/2] NetXen: Bug fixes
Hi All,

I will be sending bug fixes to NetXen: 1G/10G Ethernet driver in subsequent mails. The patches are with respect to netdev#upstream-fixes.

Regards,
Mithlesh Thukral
[PATCH 1/2] NetXen: Bug fix for Jumbo frames on XG card
NetXen: Set the MTU for the right port depending upon the port number for XG cards.

Signed-off by: Mithlesh Thukral [EMAIL PROTECTED]
---
 drivers/net/netxen/netxen_nic_hw.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/netxen/netxen_nic_hw.c b/drivers/net/netxen/netxen_nic_hw.c
index 1be5570..6537574 100644
--- a/drivers/net/netxen/netxen_nic_hw.c
+++ b/drivers/net/netxen/netxen_nic_hw.c
@@ -822,7 +822,10 @@ int netxen_nic_set_mtu_xgb(struct netxen
 {
 	struct netxen_adapter *adapter = port->adapter;
 	new_mtu += NETXEN_NIU_HDRSIZE + NETXEN_NIU_TLRSIZE;
-	netxen_nic_write_w0(adapter, NETXEN_NIU_XGE_MAX_FRAME_SIZE, new_mtu);
+	if (port->portnum == 0)
+		netxen_nic_write_w0(adapter, NETXEN_NIU_XGE_MAX_FRAME_SIZE, new_mtu);
+	else if (port->portnum == 1)
+		netxen_nic_write_w0(adapter, NETXEN_NIU_XG1_MAX_FRAME_SIZE, new_mtu);
 	return 0;
 }
Re: bridge: faster compare for link local addresses
David Miller [EMAIL PROTECTED] writes:

> From: Rick Jones [EMAIL PROTECTED]
> Date: Mon, 12 Mar 2007 17:05:39 -0700
>
> > Being paranoid - are there no worries about the alignment of dest?
>
> If it's an issue, it's an issue elsewhere too, as the places
> where Stephen took this idiomatic code from is the code ethernet
> handling and that runs on every input packet via eth_type_trans().

As a quick note -- when you tell gcc the expected alignment by using
correct types, then modern gcc should generate fast inline code for
memcpy/memcmp/etc. by itself. It only falls back to a slow generic
function when it cannot figure out the alignment or the size.

So I expect just using u32 * instead of char * should have the same effect
and would be somewhat cleaner, and the memcmp could be kept.

-Andi
Re: bridge: faster compare for link local addresses
On Tuesday 13 March 2007 15:01, Andi Kleen wrote:
> David Miller [EMAIL PROTECTED] writes:
> > From: Rick Jones [EMAIL PROTECTED]
> > Date: Mon, 12 Mar 2007 17:05:39 -0700
> >
> > > Being paranoid - are there no worries about the alignment of dest?
> >
> > If it's an issue, it's an issue elsewhere too, as the places
> > where Stephen took this idiomatic code from is the code ethernet
> > handling and that runs on every input packet via eth_type_trans().
>
> As a quick note -- when you tell gcc the expected alignment by using
> correct types, then modern gcc should generate fast inline code for
> memcpy/memcmp/etc. by itself. It only falls back to a slow generic
> function when it cannot figure out the alignment or the size.
>
> So I expect just using u32 * instead of char * should have the same effect
> and would be somewhat cleaner, and the memcmp could be kept.

For memcpy() yes, you can have some optimizations.

But memcmp() has a strong semantic (in libc). memcmp(a, b, 6) should do 6 byte
compares and conditional branches, regardless of a/b alignment.
Or use the x86 rep cmpsb instruction, which basically has the same cost.

The trick we use in compare_ether_addr() reduces to some arithmetic and
one test.

	return ((a[0] ^ b[0]) | (a[1] ^ b[1]) | (a[2] ^ b[2])) != 0;

I find this line as clean as memcmp(a, b, 6)

(On x86_64, where alignment is not mandatory, we could do:

	((*(long *)a ^ *(long *)b) << 16) != 0

only if we can always read two extra bytes without faulting, of course :) )
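A userspace sketch of the XOR/OR comparison described above, for readers who want to try it. The function name is hypothetical. Unlike the kernel helper, which reads the addresses through 16-bit pointers and therefore needs 2-byte alignment, this version copies the bytes into local 16-bit arrays first so that it is safe for any alignment; that copy is an addition for portability, not part of the trick being discussed.

```c
#include <stdint.h>
#include <string.h>

/* Returns nonzero when two 6-byte Ethernet addresses differ.
 * Three 16-bit XORs are OR-ed together, so a single test at the end
 * replaces up to six byte compares and branches. */
static int ether_addr_differs(const uint8_t *addr1, const uint8_t *addr2)
{
	uint16_t a[3], b[3];

	/* Copy into aligned storage; avoids unaligned or aliasing accesses. */
	memcpy(a, addr1, sizeof(a));
	memcpy(b, addr2, sizeof(b));

	return ((a[0] ^ b[0]) | (a[1] ^ b[1]) | (a[2] ^ b[2])) != 0;
}
```

The result only says whether the addresses differ; it carries no ordering information, which is the part of the memcmp() contract the trick gives up.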
Re: [PATCH] natsemi: netpoll fixes
Hello.

Mark Brown wrote:

> Subject: natsemi: Fix NAPI for interrupt sharing
>
> The interrupt status register for the natsemi chips is clear on read and
> was read unconditionally from both the interrupt and from the NAPI poll
> routine, meaning that if the interrupt service routine was called (for
> example, due to a shared interrupt) while a NAPI poll was scheduled
> interrupts could be missed. This patch fixes that by ensuring that the
> interrupt status register is only read by the interrupt handler when
> interrupts are enabled from the chip.
>
> It also reverts a workaround for this problem from the netpoll hook and
> improves the trace for interrupt events.
>
> Thanks to Sergei Shtylyov [EMAIL PROTECTED] for spotting the
> issue, Mark Huth [EMAIL PROTECTED] for a simpler method and Simon
> Blake [EMAIL PROTECTED] for testing resources.
>
> Signed-Off-By: Mark Brown [EMAIL PROTECTED]
>
> Index: linux-2.6/drivers/net/natsemi.c
> ===================================================================
> --- linux-2.6.orig/drivers/net/natsemi.c	2007-03-11 02:32:43.0 +
> +++ linux-2.6/drivers/net/natsemi.c	2007-03-13 00:12:29.0 +
[...]
> @@ -2131,17 +2133,23 @@
>  		       dev->name, np->intr_status,
>  		       readl(ioaddr + IntrMask));
>
> -	if (!np->intr_status)
> -		return IRQ_NONE;
> -
> -	prefetch(&np->rx_skbuff[np->cur_rx % RX_RING_SIZE]);
> +	if (np->intr_status) {
> +		prefetch(&np->rx_skbuff[np->cur_rx % RX_RING_SIZE]);
>
> -	if (netif_rx_schedule_prep(dev)) {
>  		/* Disable interrupts and register for poll */
> -		natsemi_irq_disable(dev);
> -		__netif_rx_schedule(dev);
> +		if (netif_rx_schedule_prep(dev)) {
> +			natsemi_irq_disable(dev);
> +			__netif_rx_schedule(dev);
> +		} else
> +			printk(KERN_WARNING
> +			       "%s: Ignoring interrupt, status %#08x, mask %#08x.\n",
> +			       dev->name, np->intr_status,
> +			       readl(ioaddr + IntrMask));
> +
> +		return IRQ_HANDLED;
>  	}
> -	return IRQ_HANDLED;
> +
> +	return IRQ_NONE;
>  }

   The only complaint I have is that this restructuring seems unnecessary: the
only real change it makes is the addition of an else to the if statement.
WBR, Sergei
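The clear-on-read hazard described in the patch description can be modelled in a few lines of userspace C. This is a sketch with hypothetical `mock_` names, assuming only what the description states: reading the status register clears it, and the fixed handler reads it only while interrupts are enabled from the chip.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the chip state relevant to the bug. */
struct mock_chip {
	uint32_t intr_status;	/* pending events, cleared by every read */
	bool intr_enabled;	/* false while a NAPI poll is scheduled */
};

/* Clear-on-read register access: returns the events and forgets them. */
static uint32_t mock_read_status(struct mock_chip *chip)
{
	uint32_t events = chip->intr_status;

	chip->intr_status = 0;
	return events;
}

/* Old behaviour: the handler reads the register on every invocation,
 * including calls caused by another device on a shared IRQ line. */
static uint32_t mock_irq_unconditional(struct mock_chip *chip)
{
	return mock_read_status(chip);
}

/* Fixed behaviour: leave the register alone while the chip's interrupts
 * are disabled, so the pending poll still sees the events. */
static uint32_t mock_irq_guarded(struct mock_chip *chip)
{
	if (!chip->intr_enabled)
		return 0;
	return mock_read_status(chip);
}
```

With interrupts masked and one event pending, the unconditional handler consumes the event and a later poll reads zero, while the guarded handler leaves it for the poll.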
[PATCH] Shrink struct dst_entry a bit
The ICMP rate limiting state can be stored in shorts, since we don't send that many ICMPs.
Change flags to a short and reorder the fields by size to avoid holes.
Move cold fields towards the end.

Signed-off-by: Andi Kleen [EMAIL PROTECTED]

Index: linux-2.6.21-rc3-net/include/net/dst.h
===================================================================
--- linux-2.6.21-rc3-net.orig/include/net/dst.h
+++ linux-2.6.21-rc3-net/include/net/dst.h
@@ -40,26 +40,24 @@ struct dst_entry
 	struct rcu_head		rcu_head;
 	struct dst_entry	*child;
 	struct net_device	*dev;
-	short			error;
-	short			obsolete;
-	int			flags;
+	unsigned long		expires;
+	short			flags;
 #define DST_HOST		1
 #define DST_NOXFRM		2
 #define DST_NOPOLICY		4
 #define DST_NOHASH		8
 #define DST_BALANCED		0x10
-	unsigned long		expires;
+	short			error;
+	short			obsolete;
 
 	unsigned short		header_len;	/* more space at head required */
 	unsigned short		nfheader_len;	/* more non-fragment space at head required */
 	unsigned short		trailer_len;	/* space to reserve at tail */
 
-	u32			metrics[RTAX_MAX];
-	struct dst_entry	*path;
-
-	unsigned long		rate_last;	/* rate limiting for ICMP */
-	unsigned long		rate_tokens;
+	unsigned short		rate_last;	/* rate limiting for ICMP */
+	unsigned short		rate_tokens;
 
+	struct dst_entry	*path;
 	struct neighbour	*neighbour;
 	struct hh_cache		*hh;
 	struct xfrm_state	*xfrm;
@@ -67,10 +65,6 @@ struct dst_entry
 	int			(*input)(struct sk_buff*);
 	int			(*output)(struct sk_buff*);
 
-#ifdef CONFIG_NET_CLS_ROUTE
-	__u32			tclassid;
-#endif
-
 	struct dst_ops		*ops;
 
 	unsigned long		lastuse;
@@ -82,6 +76,13 @@ struct dst_entry
 		struct rt6_info		*rt6_next;
 		struct dn_route		*dn_next;
 	};
+
+	u32			metrics[RTAX_MAX];
+
+#ifdef CONFIG_NET_CLS_ROUTE
+	__u32			tclassid;
+#endif
+
 	char			info[0];
 };
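The effect of sorting fields by size can be seen with a reduced, hypothetical pair of structs. These are not the real struct dst_entry, only four of its field names reused for illustration; exact sizes depend on the ABI.

```c
#include <stddef.h>

/* Hypothetical layout with small and large fields interleaved:
 * each short is followed by padding so the next long is aligned. */
struct mock_dst_interleaved {
	short		error;
	unsigned long	expires;
	short		obsolete;
	unsigned long	lastuse;
};

/* Same fields, largest first: the two shorts share one slot at the end. */
struct mock_dst_sorted {
	unsigned long	expires;
	unsigned long	lastuse;
	short		error;
	short		obsolete;
};

/* Bytes lost to padding in a struct made of two longs and two shorts. */
static size_t mock_padding(size_t struct_size)
{
	return struct_size - (2 * sizeof(unsigned long) + 2 * sizeof(short));
}
```

On a typical LP64 ABI such as x86_64 Linux the interleaved layout usually occupies 32 bytes and the sorted one 24; tools like pahole report the same kind of holes for real kernel structures.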
Re: [PATCH] natsemi: netpoll fixes
Hello.

Mark Brown wrote:

>>> Moving netdev_rx() would fix that one but there's some others too -
>>> there's one in the timer routine if the chip crashes. In the case you

>>   Erm, sorry, I'm not seeing it -- could you point with finger please? :-)

> In netdev_timer() when the device is using PORT_TP if the DspCfg read
> back from the chip differs from the one we think we programmed into it
> then the driver thinks the PHY fell over. It then goes through an init
> sequence, including init_registers() which will reset IntrEnable among
> other things.

   What's more important for us, it will also clear IntrStatus (and ignore all
pending interrupts). Well, it will also reinit the whole TX/RX rings, so
all packets will be lost...

>>> describe above the consequences shouldn't be too bad since it tends to
>>> only occur at high volume so further traffic will tend to occur and
>>> cause things to recover - all the testing of that patch was done with
>>> the bug present and no ill effects.

>>   Oversized packets occur only at high volume? Is it some errata?

> It's an errata - AN 1287 which you can get from the National web site.
> It's not actually that chip that's getting oversized packets, what
> happens is that the state machine which reads data off the wire gets
> confused and eventually locks up. Before locking up it will usually
> report one or more oversized packets so this is a useful hint that we
> should reset the receive state machine in order to recover from this.

   That's all good, but why do we need to completely lose TX and other
interrupts in the meantime? High inbound traffic doesn't necessarily mean
high outbound traffic, does it?

WBR, Sergei
[PATCH 1/4] PPPoE: miscellaneous smaller cleanups
below is a patch that just removes dead code/initializers without any
effect (first access is an assignment) that I stumbled across while
reading the source.

Signed-off-by: Florian Zumbiehl [EMAIL PROTECTED]
Acked-by: Michal Ostrowski [EMAIL PROTECTED]

---
 drivers/net/pppoe.c |   21 ++++++++-------------
 1 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index ebfa296..ec4e67d 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -207,7 +207,7 @@ static inline struct pppox_sock *get_item(unsigned long sid,
 
 static inline struct pppox_sock *get_item_by_addr(struct sockaddr_pppox *sp)
 {
-	struct net_device *dev = NULL;
+	struct net_device *dev;
 	int ifindex;
 
 	dev = dev_get_by_name(sp->sa_addr.pppoe.dev);
@@ -222,9 +222,6 @@ static inline int set_item(struct pppox_sock *po)
 {
 	int i;
 
-	if (!po)
-		return -EINVAL;
-
 	write_lock_bh(&pppoe_hash_lock);
 	i = __set_item(po);
 	write_unlock_bh(&pppoe_hash_lock);
@@ -344,7 +341,7 @@ static struct notifier_block pppoe_notifier = {
 static int pppoe_rcv_core(struct sock *sk, struct sk_buff *skb)
 {
 	struct pppox_sock *po = pppox_sk(sk);
-	struct pppox_sock *relay_po = NULL;
+	struct pppox_sock *relay_po;
 
 	if (sk->sk_state & PPPOX_BOUND) {
 		struct pppoe_hdr *ph = (struct pppoe_hdr *) skb->nh.raw;
@@ -514,7 +511,6 @@ static int pppoe_release(struct socket *sock)
 {
 	struct sock *sk = sock->sk;
 	struct pppox_sock *po;
-	int error = 0;
 
 	if (!sk)
 		return 0;
@@ -543,7 +539,7 @@ static int pppoe_release(struct socket *sock)
 	skb_queue_purge(&sk->sk_receive_queue);
 	sock_put(sk);
 
-	return error;
+	return 0;
 }
 
 
@@ -762,10 +758,10 @@ static int pppoe_ioctl(struct socket *sock, unsigned int cmd,
 static int pppoe_sendmsg(struct kiocb *iocb, struct socket *sock,
 		  struct msghdr *m, size_t total_len)
 {
-	struct sk_buff *skb = NULL;
+	struct sk_buff *skb;
 	struct sock *sk = sock->sk;
 	struct pppox_sock *po = pppox_sk(sk);
-	int error = 0;
+	int error;
 	struct pppoe_hdr hdr;
 	struct pppoe_hdr *ph;
 	struct net_device *dev;
@@ -929,10 +925,10 @@ static int pppoe_recvmsg(struct kiocb *iocb, struct socket *sock,
 		  struct msghdr *m, size_t total_len, int flags)
 {
 	struct sock *sk = sock->sk;
-	struct sk_buff *skb = NULL;
+	struct sk_buff *skb;
 	int error = 0;
 	int len;
-	struct pppoe_hdr *ph = NULL;
+	struct pppoe_hdr *ph;
 
 	if (sk->sk_state & PPPOX_BOUND) {
 		error = -EIO;
@@ -949,7 +945,6 @@ static int pppoe_recvmsg(struct kiocb *iocb, struct socket *sock,
 	m->msg_namelen = 0;
 
 	if (skb) {
-		error = 0;
 		ph = (struct pppoe_hdr *) skb->nh.raw;
 		len = ntohs(ph->length);
 
@@ -991,7 +986,7 @@ out:
 
 static __inline__ struct pppox_sock *pppoe_get_idx(loff_t pos)
 {
-	struct pppox_sock *po = NULL;
+	struct pppox_sock *po;
 	int i = 0;
 
 	for (; i < PPPOE_HASH_SIZE; i++) {
-- 
1.5.0.g78e90
[PATCH 3/4] PPPOE: memory leak when socket is release()d before PPPIOCGCHAN has been called on it
below you find a patch that fixes a memory leak when a PPPoE socket is
release()d after it has been connect()ed, but before the PPPIOCGCHAN ioctl
ever has been called on it.

This is somewhat of a security problem, too, since PPPoE sockets can be
created by any user, so any user can easily allocate all the machine's
RAM to non-swappable address space and thus DoS the system.

Is there any specific reason for PPPoE sockets being available to any
unprivileged process, BTW? After all, you need a packet socket for the
discovery stage anyway, so it's unlikely that any unprivileged process
will ever need to create a PPPoE socket, no? Allocating all session IDs
for a known AC is a kind of DoS, too, after all - with Juniper ERXes,
this is really easy, actually, since they don't ever assign session ids
above 8000 ...

Signed-off-by: Florian Zumbiehl [EMAIL PROTECTED]
Acked-by: Michal Ostrowski [EMAIL PROTECTED]

---
 drivers/net/pppox.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/pppox.c b/drivers/net/pppox.c
index 9315046..3f8115d 100644
--- a/drivers/net/pppox.c
+++ b/drivers/net/pppox.c
@@ -58,7 +58,7 @@ void pppox_unbind_sock(struct sock *sk)
 {
 	/* Clear connection to ppp device, if attached. */
 
-	if (sk->sk_state & (PPPOX_BOUND | PPPOX_ZOMBIE)) {
+	if (sk->sk_state & (PPPOX_BOUND | PPPOX_CONNECTED | PPPOX_ZOMBIE)) {
 		ppp_unregister_channel(&pppox_sk(sk)->chan);
 		sk->sk_state = PPPOX_DEAD;
 	}
-- 
1.5.0.g78e90
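The one-line fix above is a bitmask test, and its effect is easy to check in isolation. In this sketch the `MOCK_` constants are hypothetical stand-ins whose numeric values are chosen for illustration only (the real PPPOX_* definitions live in the kernel headers); what matters is that each state is a distinct bit.

```c
/* Hypothetical state bits, one bit per state; values are illustrative. */
enum {
	MOCK_PPPOX_CONNECTED = 1,
	MOCK_PPPOX_BOUND     = 2,
	MOCK_PPPOX_ZOMBIE    = 8,
};

/* Old condition: a socket that is only CONNECTED is not unbound. */
static int mock_should_unbind_old(int sk_state)
{
	return (sk_state & (MOCK_PPPOX_BOUND | MOCK_PPPOX_ZOMBIE)) != 0;
}

/* New condition: CONNECTED sockets are unbound too, so a socket released
 * after connect() but before PPPIOCGCHAN no longer skips the cleanup. */
static int mock_should_unbind_new(int sk_state)
{
	return (sk_state & (MOCK_PPPOX_BOUND | MOCK_PPPOX_CONNECTED |
			    MOCK_PPPOX_ZOMBIE)) != 0;
}
```

The two predicates agree for every state except one where CONNECTED is the only relevant bit set, which is exactly the leaking case described in the message.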
[PATCH 2/4] PPPOE: race between interface going down and connect()
below you find a patch that (hopefully) fixes a race between an interface
going down and a connect() to a peer on that interface. Before,
connect() would determine that an interface is up, then the interface
could go down and all entries referring to that interface in the
item_hash_table would be marked as ZOMBIEs and their references to the
device would be freed, and after that, connect() would put a new entry into
the hash table referring to the device that meanwhile is down already -
which also would cause unregister_netdevice() to wait until the socket
has been release()d.

This patch does not suffice if we are not allowed to accept connect()s
referring to a device that we already acked a NETDEV_GOING_DOWN for
(that is: all references are only guaranteed to be freed after
NETDEV_DOWN has been acknowledged, not necessarily after the
NETDEV_GOING_DOWN already). And if we are allowed to, we could avoid
looking through the hash table upon NETDEV_GOING_DOWN completely and
only do that once we get the NETDEV_DOWN ...

mostrows:
pppoe_flush_dev is called on NETDEV_GOING_DOWN and NETDEV_DOWN to deal with
this late connect issue. Ideally one would hope to notify users at the
NETDEV_GOING_DOWN phase (just to pretend to be nice). However, it is the
NETDEV_DOWN scan that takes all the responsibility for ensuring nobody is
hanging around at that time.

Signed-off-by: Florian Zumbiehl [EMAIL PROTECTED]
Acked-by: Michal Ostrowski [EMAIL PROTECTED]

---
 drivers/net/pppoe.c |   19 ++++++-------------
 1 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index ec4e67d..4e878c9 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -218,17 +218,6 @@ static inline struct pppox_sock *get_item_by_addr(struct sockaddr_pppox *sp)
 	return get_item(sp->sa_addr.pppoe.sid, sp->sa_addr.pppoe.remote, ifindex);
 }
 
-static inline int set_item(struct pppox_sock *po)
-{
-	int i;
-
-	write_lock_bh(&pppoe_hash_lock);
-	i = __set_item(po);
-	write_unlock_bh(&pppoe_hash_lock);
-
-	return i;
-}
-
 static inline struct pppox_sock *delete_item(unsigned long sid, char *addr, int ifindex)
 {
 	struct pppox_sock *ret;
@@ -595,14 +584,18 @@ static int pppoe_connect(struct socket *sock, struct sockaddr *uservaddr,
 		po->pppoe_dev = dev;
 		po->pppoe_ifindex = dev->ifindex;
 
-		if (!(dev->flags & IFF_UP))
+		write_lock_bh(&pppoe_hash_lock);
+		if (!(dev->flags & IFF_UP)){
+			write_unlock_bh(&pppoe_hash_lock);
 			goto err_put;
+		}
 
 		memcpy(&po->pppoe_pa,
 		       &sp->sa_addr.pppoe,
 		       sizeof(struct pppoe_addr));
 
-		error = set_item(po);
+		error = __set_item(po);
+		write_unlock_bh(&pppoe_hash_lock);
 		if (error < 0)
 			goto err_put;
 
-- 
1.5.0.g78e90
[PATCH 4/4] PPPOE: Fix device tear-down notification.
pppoe_flush_dev() kicks all sockets bound to a device that is going down.
In doing so, locks must be taken in the right order consistently (sock lock,
followed by the pppoe_hash_lock). However, the scan process is based on
us holding the sock lock. So, when something is found in the scan we must
release the lock we're holding and grab the sock lock.

This patch fixes race conditions between this code and pppoe_release(),
both of which perform similar functions but would naturally prefer to grab
locks in opposing orders. Both code paths are now going after these locks
in a consistent manner.

pppoe_hash_lock protects the contents of the pppox_sock objects that reside
inside the hash. Thus, NULL'ing out the pppoe_dev field should be done
under the protection of this lock.

Signed-off-by: Michal Ostrowski [EMAIL PROTECTED]

---
 drivers/net/pppoe.c |   93 +--
 1 files changed, 53 insertions(+), 40 deletions(-)

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 4e878c9..0961bf9 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -241,54 +241,53 @@ static inline struct pppox_sock *delete_item(unsigned long sid, char *addr, int
 static void pppoe_flush_dev(struct net_device *dev)
 {
 	int hash;
-
 	BUG_ON(dev == NULL);
 
-	read_lock_bh(&pppoe_hash_lock);
+	write_lock_bh(&pppoe_hash_lock);
 	for (hash = 0; hash < PPPOE_HASH_SIZE; hash++) {
 		struct pppox_sock *po = item_hash_table[hash];
 
 		while (po != NULL) {
-			if (po->pppoe_dev == dev) {
-				struct sock *sk = sk_pppox(po);
-
-				sock_hold(sk);
-				po->pppoe_dev = NULL;
-
-				/* We hold a reference to SK, now drop the
-				 * hash table lock so that we may attempt
-				 * to lock the socket (which can sleep).
-				 */
-				read_unlock_bh(&pppoe_hash_lock);
-
-				lock_sock(sk);
-
-				if (sk->sk_state &
-				    (PPPOX_CONNECTED | PPPOX_BOUND)) {
-					pppox_unbind_sock(sk);
-					dev_put(dev);
-					sk->sk_state = PPPOX_ZOMBIE;
-					sk->sk_state_change(sk);
-				}
-
-				release_sock(sk);
+			struct sock *sk = sk_pppox(po);
+			if (po->pppoe_dev != dev) {
+				po = po->next;
+				continue;
+			}
+			po->pppoe_dev = NULL;
+			dev_put(dev);
+
+
+			/* We always grab the socket lock, followed by the
+			 * pppoe_hash_lock, in that order.  Since we should
+			 * hold the sock lock while doing any unbinding,
+			 * we need to release the lock we're holding.
+			 * Hold a reference to the sock so it doesn't disappear
+			 * as we're jumping between locks.
+			 */
 
-				sock_put(sk);
+			sock_hold(sk);
 
-				read_lock_bh(&pppoe_hash_lock);
+			write_unlock_bh(&pppoe_hash_lock);
+			lock_sock(sk);
 
-				/* Now restart from the beginning of this
-				 * hash chain.  We always NULL out pppoe_dev
-				 * so we are guaranteed to make forward
-				 * progress.
-				 */
-				po = item_hash_table[hash];
-				continue;
+			if (sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND)) {
+				pppox_unbind_sock(sk);
+				sk->sk_state = PPPOX_ZOMBIE;
+				sk->sk_state_change(sk);
 			}
-			po = po->next;
+
+			release_sock(sk);
+			sock_put(sk);
+
+			/* Restart scan at the beginning of this hash chain.
+			 * While the lock was dropped the chain contents may
+			 * have changed.
+			 */
+			write_lock_bh(&pppoe_hash_lock);
+			po = item_hash_table[hash];
 		}
 	}
-	read_unlock_bh(&pppoe_hash_lock);
+	write_unlock_bh(&pppoe_hash_lock);
 }
 
 static int pppoe_device_event(struct notifier_block *this,
@@ -504,28 +503,42 @@
Re: [PATCH] Shrink struct dst_entry a bit
On Tuesday 13 March 2007 15:10, Eric Dumazet wrote:
> On Tuesday 13 March 2007 14:48, Andi Kleen wrote:
> > The ICMP rate limiting state can be shorts, we don't send that many ICMPs.
> > Changing flags to short and reorder fields to be sorted by size to avoid
> > holes. Move cold fields towards the end.
>
> Nope, you cannot break the reordering I've done one month ago.

Ok. When you do such changes you should always add a comment, otherwise
it will always be destroyed by the next change.

But it seems highly fragile to me anyway, because it depends on the
exact value of RTAX_MAX, which tends to change regularly when someone
invents a new attribute. You should probably have moved next out of
the dst entry.

Anyway, here's a new patch with next still at the end and a comment.

-Andi

Shrink dst_entry a bit.

The ICMP rate limiting state can be shorts, we don't send that many ICMPs.
Changing flags to short and reorder fields to be sorted by size to avoid holes.
Move cold fields towards the end.

Signed-off-by: Andi Kleen [EMAIL PROTECTED]

Index: linux-2.6.21-rc3-net/include/net/dst.h
===================================================================
--- linux-2.6.21-rc3-net.orig/include/net/dst.h
+++ linux-2.6.21-rc3-net/include/net/dst.h
@@ -40,26 +40,24 @@ struct dst_entry
 	struct rcu_head		rcu_head;
 	struct dst_entry	*child;
 	struct net_device	*dev;
-	short			error;
-	short			obsolete;
-	int			flags;
+	unsigned long		expires;
+	short			flags;
 #define DST_HOST		1
 #define DST_NOXFRM		2
 #define DST_NOPOLICY		4
 #define DST_NOHASH		8
 #define DST_BALANCED		0x10
-	unsigned long		expires;
+	short			error;
+	short			obsolete;
 
 	unsigned short		header_len;	/* more space at head required */
 	unsigned short		nfheader_len;	/* more non-fragment space at head required */
 	unsigned short		trailer_len;	/* space to reserve at tail */
 
-	u32			metrics[RTAX_MAX];
-	struct dst_entry	*path;
-
-	unsigned long		rate_last;	/* rate limiting for ICMP */
-	unsigned long		rate_tokens;
+	unsigned short		rate_last;	/* rate limiting for ICMP */
+	unsigned short		rate_tokens;
 
+	struct dst_entry	*path;
 	struct neighbour	*neighbour;
 	struct hh_cache		*hh;
 	struct xfrm_state	*xfrm;
@@ -67,21 +65,26 @@ struct dst_entry
 	int			(*input)(struct sk_buff*);
 	int			(*output)(struct sk_buff*);
 
-#ifdef CONFIG_NET_CLS_ROUTE
-	__u32			tclassid;
-#endif
-
 	struct dst_ops		*ops;
 
 	unsigned long		lastuse;
 	atomic_t		__refcnt;	/* client references	*/
 	int			__use;
+	u32			metrics[RTAX_MAX];
+
+#ifdef CONFIG_NET_CLS_ROUTE
+	__u32			tclassid;
+#endif
+
+	/* Should be at the end to be on the same cache line as
+	   the flow information in rtable. */
 	union {
 		struct dst_entry	*next;
 		struct rtable		*rt_next;
 		struct rt6_info		*rt6_next;
 		struct dn_route		*dn_next;
 	};
+
 	char			info[0];
 };
Re: [PATCH] Shrink struct dst_entry a bit
On Tuesday 13 March 2007 15:31, Andi Kleen wrote:
> Ok. When you do such changes you should always add a comment, otherwise it will be always destroyed with the next change. But it seems highly fragile to me anyways because it depends on the exact value of RTAX_MAX which tends to change regularly when someone invents a new attribute. You should probably have moved next out of the dst entry.

Not an option, unfortunately. But yes, a comment is needed. (Before my February patches, the 'next' pointer was forced to be the first field of dst.)

> Anyways here's a new patch with next still at the end and a comment.

Andi, did you actually test your patch? Unless I am missing something obvious, rate_last is supposed to store jiffies.

net/ipv4/route.c:1313: if (time_after(jiffies, rt->u.dst.rate_last + ip_rt_redirect_silence))

So you *cannot* convert it to 'unsigned short'. Really.

However, you could convert it to a u32 and use a helper function:

static inline u32 get_jiffies_32(void) { return (u32)jiffies; }

and change the appropriate code using rate_last.

Also, 'lastuse' could use a u32 too, I even had a patch for this one...
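To make the width argument concrete, here is a small user-space sketch, not kernel code. The function names are hypothetical; the 32-bit comparison follows the same signed-difference idea as time_after(). With HZ=1000 a 16-bit tick field wraps every 65.536 seconds, so it cannot hold a jiffies timestamp that is compared against intervals longer than that.

```c
#include <stdint.h>

/* Wrap-safe "a is after b" for 32-bit tick counters (signed difference). */
static int tick32_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Elapsed ticks since 'then'; unsigned arithmetic keeps this correct
 * across one wrap of the 32-bit counter. */
static uint32_t tick32_delta(uint32_t now, uint32_t then)
{
	return now - then;
}

/* What a 16-bit field would report for the same two instants. */
static uint16_t tick16_delta(uint32_t now, uint32_t then)
{
	return (uint16_t)((uint16_t)now - (uint16_t)then);
}
```

For two instants 70000 ticks apart, the 32-bit delta is 70000 while the 16-bit delta has silently wrapped to 4464.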
Re: [PATCH] Shrink struct dst_entry a bit
On Tuesday 13 March 2007 15:44, Eric Dumazet wrote: Also, 'lastuse' could use a u32 too, I even had a patch for this one... Here is the patch I have here for lastuse u32 conversion, not for inclusion yet because not yet tested (only compiled) [PATCH] NET : abstract lastuse (from struct dst_entry) and convert it to u32 This saves 4 bytes (possibly 8) on 64 bit archs Signed-off-by: Eric Dumazet [EMAIL PROTECTED] diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h index c080f61..fb23951 100644 --- a/include/linux/jiffies.h +++ b/include/linux/jiffies.h @@ -89,7 +89,16 @@ static inline u64 get_jiffies_64(void) return (u64)jiffies; } #endif - +/* + * On 64bit archs, storing timestamps in 'unsigned long' vars + * may consume unecessary memory. Using u32 is ok when deltas + * between current jiffie and past timestamps are known to + * fit in 32-1 bits. Even with HZ=1000, thats 24 days. + */ +static inline u32 get_jiffies_32(void) +{ + return (u32)jiffies; +} /* * These inlines deal with timer wrapping correctly. 
You are * strongly encouraged to use them diff --git a/include/net/dst.h b/include/net/dst.h index e12a8ce..70366b7 100644 --- a/include/net/dst.h +++ b/include/net/dst.h @@ -67,13 +67,13 @@ #define DST_BALANCED0x10 int (*input)(struct sk_buff*); int (*output)(struct sk_buff*); + + struct dst_ops *ops; #ifdef CONFIG_NET_CLS_ROUTE __u32 tclassid; #endif - - struct dst_ops *ops; - unsigned long lastuse; + u32 __lastuse; atomic_t__refcnt; /* client references*/ int __use; union { @@ -103,11 +103,23 @@ struct dst_ops int entry_size; atomic_tentries; - struct kmem_cache *kmem_cachep; + struct kmem_cache *kmem_cachep; }; #ifdef __KERNEL__ +static inline void +dst_lastuse_set(struct dst_entry *dst) +{ + dst-__lastuse = get_jiffies_32(); +} + +static inline unsigned long +dst_lastuse_delta(const struct dst_entry *dst) +{ + return get_jiffies_32() - dst-__lastuse; +} + static inline u32 dst_metric(const struct dst_entry *dst, int metric) { diff --git a/net/core/dst.c b/net/core/dst.c index 764bccb..6c0a023 100644 --- a/net/core/dst.c +++ b/net/core/dst.c @@ -136,7 +136,7 @@ void * dst_alloc(struct dst_ops * ops) return NULL; atomic_set(dst-__refcnt, 0); dst-ops = ops; - dst-lastuse = jiffies; + dst_lastuse_set(dst); dst-path = dst; dst-input = dst_discard_in; dst-output = dst_discard_out; diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 6055074..c3f5264 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -215,7 +215,7 @@ int rtnl_put_cacheinfo(struct sk_buff *s u32 ts, u32 tsage, long expires, u32 error) { struct rta_cacheinfo ci = { - .rta_lastuse = jiffies_to_clock_t(jiffies - dst-lastuse), + .rta_lastuse = jiffies_to_clock_t(dst_lastuse_delta(dst)), .rta_used = dst-__use, .rta_clntref = atomic_read((dst-__refcnt)), .rta_error = error, diff --git a/net/decnet/dn_route.c b/net/decnet/dn_route.c index 32a7db3..3d5c46c 100644 --- a/net/decnet/dn_route.c +++ b/net/decnet/dn_route.c @@ -166,7 +166,7 @@ static void dn_dst_check_expire(unsigned 
spin_lock(dn_rt_hash_table[i].lock); while((rt=*rtp) != NULL) { if (atomic_read(rt-u.dst.__refcnt) || - (now - rt-u.dst.lastuse) expire) { + dst_lastuse_delta(rt-u.dst) expire) { rtp = rt-u.dst.dn_next; continue; } @@ -187,7 +187,6 @@ static int dn_dst_gc(void) { struct dn_route *rt, **rtp; int i; - unsigned long now = jiffies; unsigned long expire = 10 * HZ; for(i = 0; i = dn_rt_hash_mask; i++) { @@ -197,7 +196,7 @@ static int dn_dst_gc(void) while((rt=*rtp) != NULL) { if (atomic_read(rt-u.dst.__refcnt) || - (now - rt-u.dst.lastuse) expire) { + dst_lastuse_delta(rt-u.dst) expire) { rtp = rt-u.dst.dn_next; continue; } @@ -278,7 +277,6 @@ static inline int compare_keys(struct fl static int dn_insert_route(struct dn_route *rt, unsigned hash, struct dn_route **rp) { struct dn_route *rth, **rthp; - unsigned long now = jiffies; rthp = dn_rt_hash_table[hash].chain; @@ -293,7 +291,7 @@
[PATCH 1/3] PPPoE: improved hashing routine
Hi,

I'm not sure whether this is really worth it, but it looked so extremely inefficient that I couldn't resist - so let's hope providers will keep PPPoE around for a while, at least until terabit DSL ;-)

The new code produces the same results as the old version and is ~ 3 to 6 times faster for 4-bit hashes on the CPUs I tested.

Florian

---
Signed-off-by: Florian Zumbiehl [EMAIL PROTECTED]

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 9e51fcc..954328c 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -108,19 +108,24 @@ static inline int cmp_addr(struct pppoe_addr *a, unsigned long sid, char *addr)
 		(memcmp(a->remote,addr,ETH_ALEN) == 0));
 }
 
-static int hash_item(unsigned long sid, unsigned char *addr)
+#if 8%PPPOE_HASH_BITS
+#error 8 must be a multiple of PPPOE_HASH_BITS
+#endif
+
+static int hash_item(unsigned int sid, unsigned char *addr)
 {
-	char hash = 0;
-	int i, j;
+	unsigned char hash = 0;
+	unsigned int i;
 
-	for (i = 0; i < ETH_ALEN ; ++i) {
-		for (j = 0; j < 8/PPPOE_HASH_BITS ; ++j) {
-			hash ^= addr[i] >> ( j * PPPOE_HASH_BITS );
-		}
+	for (i = 0 ; i < ETH_ALEN ; i++) {
+		hash ^= addr[i];
+	}
+	for (i = 0 ; i < sizeof(sid_t)*8 ; i += 8 ){
+		hash ^= sid>>i;
+	}
+	for (i = 8 ; (i>>=1) >= PPPOE_HASH_BITS ; ) {
+		hash ^= hash>>i;
 	}
-
-	for (i = 0; i < (sizeof(unsigned long)*8) / PPPOE_HASH_BITS ; ++i)
-		hash ^= sid >> (i*PPPOE_HASH_BITS);
 
 	return hash & ( PPPOE_HASH_SIZE - 1 );
 }
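The claim that both versions produce the same result can be checked in user space with a sketch like the following. It is a stand-alone approximation, not the driver code: the names hash_slow and hash_fold, the HASH_BITS constant and the 16-bit session id type are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN  6
#define HASH_BITS 4			/* must divide 8 */
#define HASH_SIZE (1u << HASH_BITS)

/* Old approach: XOR every HASH_BITS-wide chunk of every input byte. */
static unsigned int hash_slow(uint16_t sid, const unsigned char *addr)
{
	unsigned int hash = 0;
	size_t i, j;

	for (i = 0; i < ETH_ALEN; i++)
		for (j = 0; j < 8 / HASH_BITS; j++)
			hash ^= (unsigned int)addr[i] >> (j * HASH_BITS);
	for (i = 0; i < sizeof(sid) * 8 / HASH_BITS; i++)
		hash ^= (unsigned int)sid >> (i * HASH_BITS);
	return hash & (HASH_SIZE - 1);
}

/* New approach: XOR whole bytes first, then fold the single result byte
 * in halves until only HASH_BITS significant bits remain. */
static unsigned int hash_fold(uint16_t sid, const unsigned char *addr)
{
	unsigned char hash = 0;
	unsigned int i;

	for (i = 0; i < ETH_ALEN; i++)
		hash ^= addr[i];
	for (i = 0; i < sizeof(sid) * 8; i += 8)
		hash ^= (unsigned char)(sid >> i);
	for (i = 8; (i >>= 1) >= HASH_BITS; )
		hash ^= hash >> i;
	return hash & (HASH_SIZE - 1);
}
```

Because XOR is associative and commutative, folding once at the end visits the same chunks as the nested loops, just with far fewer shift operations per input byte.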
[PATCH 2/3] PPPoX/E: return ENOTTY on unknown ioctl requests
Hi,

here is another patch for the PPPoX/E code that makes sure that ENOTTY is returned for unknown ioctl requests rather than 0 (and removes another unneeded initializer which I didn't bother creating a separate patch for).

Florian

---
Signed-off-by: Florian Zumbiehl [EMAIL PROTECTED]

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 954328c..9554924 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -669,8 +669,8 @@ static int pppoe_ioctl(struct socket *sock, unsigned int cmd,
 {
 	struct sock *sk = sock->sk;
 	struct pppox_sock *po = pppox_sk(sk);
-	int val = 0;
-	int err = 0;
+	int val;
+	int err;
 
 	switch (cmd) {
 	case PPPIOCGMRU:
@@ -759,8 +759,9 @@ static int pppoe_ioctl(struct socket *sock, unsigned int cmd,
 		err = 0;
 		break;
 
-	default:;
-	};
+	default:
+		err = -ENOTTY;
+	}
 
 	return err;
 }
diff --git a/drivers/net/pppox.c b/drivers/net/pppox.c
index 3f8115d..51de561 100644
--- a/drivers/net/pppox.c
+++ b/drivers/net/pppox.c
@@ -72,7 +72,7 @@ int pppox_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 {
 	struct sock *sk = sock->sk;
 	struct pppox_sock *po = pppox_sk(sk);
-	int rc = 0;
+	int rc;
 
 	lock_sock(sk);
 
@@ -93,12 +93,9 @@ int pppox_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 		break;
 	}
 	default:
-		if (pppox_protos[sk->sk_protocol]->ioctl)
-			rc = pppox_protos[sk->sk_protocol]->ioctl(sock, cmd,
-								   arg);
-
-		break;
-	};
+		rc = pppox_protos[sk->sk_protocol]->ioctl ?
+			pppox_protos[sk->sk_protocol]->ioctl(sock, cmd, arg) : -ENOTTY;
+	}
 
 	release_sock(sk);
 	return rc;
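The pattern the patch moves to can be sketched outside the kernel as follows. The command numbers and the demo_ioctl() function are hypothetical and exist only for this example; the point is that the result variable needs no initializer once every branch, including the default one, assigns it.

```c
#include <errno.h>

/* Hypothetical command numbers, for illustration only. */
enum { DEMO_GET_MRU = 1, DEMO_SET_MRU = 2 };

/* Handle the known commands; report -ENOTTY for anything else. */
static int demo_ioctl(unsigned int cmd, int *mru, int *arg)
{
	int err;	/* no initializer: every path below assigns it */

	switch (cmd) {
	case DEMO_GET_MRU:
		*arg = *mru;
		err = 0;
		break;
	case DEMO_SET_MRU:
		if (*arg <= 0) {
			err = -EINVAL;
			break;
		}
		*mru = *arg;
		err = 0;
		break;
	default:
		err = -ENOTTY;	/* unknown request, as in the patch */
	}
	return err;
}
```

A side effect of dropping the initializers is that the compiler can warn if a later change adds a branch that forgets to set the result.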
[PATCH 3/3] PPPoE: move lock_sock() in pppoe_sendmsg() to the right location
Hi,

and the last one for now: Acquire the sock lock in pppoe_sendmsg() before accessing the sock - and in particular avoid releasing the lock even though it hasn't been acquired.

Florian

---
Signed-off-by: Florian Zumbiehl [EMAIL PROTECTED]

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 9554924..eef8a5b 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -779,6 +779,7 @@ static int pppoe_sendmsg(struct kiocb *iocb, struct socket *sock,
 	struct net_device *dev;
 	char *start;
 
+	lock_sock(sk);
 	if (sock_flag(sk, SOCK_DEAD) || !(sk->sk_state & PPPOX_CONNECTED)) {
 		error = -ENOTCONN;
 		goto end;
@@ -789,8 +790,6 @@ static int pppoe_sendmsg(struct kiocb *iocb, struct socket *sock,
 	hdr.code = 0;
 	hdr.sid = po->num;
 
-	lock_sock(sk);
-
 	dev = po->pppoe_dev;
 
 	error = -EMSGSIZE;
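The control-flow problem can be shown with a toy model. Everything below is hypothetical (a counter stands in for the socket lock, and demo_sendmsg() is not the driver function); it only demonstrates that taking the lock before the first state check keeps lock and release balanced on the early-exit path.

```c
#include <errno.h>

/* Toy socket: lock_depth counts outstanding lock calls. */
struct demo_sock {
	int lock_depth;
	int connected;
};

static void demo_lock(struct demo_sock *s)    { s->lock_depth++; }
static void demo_release(struct demo_sock *s) { s->lock_depth--; }

/* The lock is taken before any state is examined, so the shared exit
 * label never releases a lock that was not acquired. */
static int demo_sendmsg(struct demo_sock *s)
{
	int error;

	demo_lock(s);
	if (!s->connected) {
		error = -ENOTCONN;
		goto end;
	}
	error = 0;
end:
	demo_release(s);
	return error;
}
```

If demo_lock() were called after the connected check, the error path would leave lock_depth at -1, which corresponds to the unbalanced release described in the message above.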
[PATCH] tc35815: Fix an usage of streaming DMA API.
The tc35815 driver lacks a call to pci_dma_sync_single_for_device() on receiving. A recent fix of MIPS dma_sync_single_for_cpu() revealed this bug.

Signed-off-by: Atsushi Nemoto [EMAIL PROTECTED]
---
This patch can be applied to netdev-2.6 tree or 2.6.21-rc3-mm2.

diff --git a/drivers/net/tc35815.c b/drivers/net/tc35815.c
index ec888db..eed78b5 100644
--- a/drivers/net/tc35815.c
+++ b/drivers/net/tc35815.c
@@ -58,12 +58,13 @@
  *	1.34	Fix netpoll locking. BH rule for NAPI is not enough with
  *		netpoll, hard_start_xmit might be called from irq context.
  *		PM support.
+ *	1.35	Fix an usage of streaming DMA API.
  */
 
 #ifdef TC35815_NAPI
-#define DRV_VERSION	"1.34-NAPI"
+#define DRV_VERSION	"1.35-NAPI"
 #else
-#define DRV_VERSION	"1.34"
+#define DRV_VERSION	"1.35"
 #endif
 static const char *version = "tc35815.c:v" DRV_VERSION "\n";
 #define MODNAME			"tc35815"
@@ -1551,6 +1552,11 @@ tc35815_rx(struct net_device *dev)
 							    PCI_DMA_FROMDEVICE);
 #endif
 				memcpy(data + offset, rxbuf, len);
+#ifdef TC35815_DMA_SYNC_ONDEMAND
+				pci_dma_sync_single_for_device(lp->pci_dev,
+							       dma, len,
+							       PCI_DMA_FROMDEVICE);
+#endif
 				offset += len;
 				cur_bd++;
 			}
[NET] AX.25 Kconfig and docs updates and fixes
o The AX.25 Howto is unmaintained since several years. I've replaced it with a wiki at http://www.linux-ax25.org which provides more uptodate information. o Change default for AX25_DAMA_SLAVE to Y. AX25_DAMA_SLAVE only compiles in support for DAMA but doesn't activate it. I hope this gets Linux distributions to ship their AX.25 kernels with AX25_DAMA_SLAVE enabled. The price for this would be very small. o Delete historic changelog from comments, that's what SCM systems are meant to do. o ---help--- in Kconfig looks so yellingly eye insulting. Use just help. o Rewrite the commented out piece of old Linux 2.4 configuration language to Kconfig for consistency. o Fixup dependencies. Signed-off-by: Ralf Baechle [EMAIL PROTECTED] diff --git a/Documentation/networking/ax25.txt b/Documentation/networking/ax25.txt index 37c25b0..8257dbf 100644 --- a/Documentation/networking/ax25.txt +++ b/Documentation/networking/ax25.txt @@ -1,16 +1,10 @@ To use the amateur radio protocols within Linux you will need to get a -suitable copy of the AX.25 Utilities. More detailed information about these -and associated programs can be found on http://zone.pspt.fi/~jsn/. - -For more information about the AX.25, NET/ROM and ROSE protocol stacks, see -the AX25-HOWTO written by Terry Dawson [EMAIL PROTECTED] -who is also the AX.25 Utilities maintainer. +suitable copy of the AX.25 Utilities. More detailed information about +AX.25, NET/ROM and ROSE, associated programs and and utilities can be +found on http://www.linux-ax25.org. There is an active mailing list for discussing Linux amateur radio matters -called linux-hams. To subscribe to it, send a message to +called [EMAIL PROTECTED] To subscribe to it, send a message to [EMAIL PROTECTED] with the words subscribe linux-hams in the body -of the message, the subject field is ignored. - -Jonathan G4KLX - [EMAIL PROTECTED] +of the message, the subject field is ignored. 
You don't need to be +subscribed to post but of course that means you might miss an answer. diff --git a/net/ax25/Kconfig b/net/ax25/Kconfig index a8993a0..43dd86f 100644 --- a/net/ax25/Kconfig +++ b/net/ax25/Kconfig @@ -1,30 +1,27 @@ # # Amateur Radio protocols and AX.25 device configuration # -# 19971130 Now in an own category to make correct compilation of the -# AX.25 stuff easier... -# Joerg Reuter DL1BKE [EMAIL PROTECTED] -# 19980129 Moved to net/ax25/Config.in, sourcing device drivers. menuconfig HAMRADIO depends on NET bool Amateur Radio support help If you want to connect your Linux box to an amateur radio, answer Y - here. You want to read http://www.tapr.org/tapr/html/pkthome.html and - the AX25-HOWTO, available from http://www.tldp.org/docs.html#howto. + here. You want to read http://www.tapr.org/tapr/html/pkthome.html + and more specifically about AX.25 on Linux + http://www.linux-ax25.org/. Note that the answer to this question won't directly affect the kernel: saying N will just cause the configurator to skip all the questions about amateur radio. comment Packet Radio protocols - depends on HAMRADIO NET + depends on HAMRADIO config AX25 tristate Amateur Radio AX.25 Level 2 protocol - depends on HAMRADIO NET - ---help--- + depends on HAMRADIO + help This is the protocol used for computer communication over amateur radio. It is either used by itself for point-to-point links, or to carry other protocols such as tcp/ip. To use it, you need a device @@ -52,6 +49,7 @@ config AX25 config AX25_DAMA_SLAVE bool AX.25 DAMA Slave support + default y depends on AX25 help DAMA is a mechanism to prevent collisions when doing AX.25 @@ -59,23 +57,38 @@ config AX25_DAMA_SLAVE from clients (called slaves) and redistributes it to other slaves. If you say Y here, your Linux box will act as a DAMA slave; this is transparent in that you don't have to do any special DAMA - configuration. (Linux cannot yet act as a DAMA server.) If unsure, - say N. + configuration. 
Linux cannot yet act as a DAMA server. This option + only compiles DAMA slave support into the kernel. It still needs to + be enabled at runtime. For more about DAMA see + http://www.linux-ax25.org. If unsure, say Y. + +# placeholder until implemented +config AX25_DAMA_MASTER + bool 'AX.25 DAMA Master support' + depends on AX25_DAMA_SLAVE BROKEN + help + DAMA is a mechanism to prevent collisions when doing AX.25 + networking. A DAMA server (called master) accepts incoming traffic + from clients (called slaves) and redistributes it to other slaves. + If you say Y here, your Linux box will act as a DAMA master; this is +
Re: SWS for rcvbuf < MTU
Alex Sidorenko wrote:
> Here are the values from live kernel (obtained with 'crash') when the host was in SWS state:
>
> full_space=708 full_space/2=354 free_space=393 window=76
>
> In this case the test from my original fix, (window < full_space/2), succeeds. But John's test
>
> free_space > window + full_space/2
> 393 > 430
>
> does not. So I suspect that the new fix will not always work. From tcpdump traces we can see that both hosts exchange 76-byte packets for a long time. From customer's application log we see that it continues to read 76-byte chunks per each read() call - even though more than that is available in the receive buffer. Technically it's OK for read() to return even after reading one byte, so if sk->receive_queue contains multiple 76-byte skbuffs we may return after processing just one skbuff (but we don't understand the details of why this happens on customer's system).
>
> Are there any particular reasons why you want to postpone window update until free_space becomes > window + full_space/2 and not as soon as free_space > full_space/2? As the only real-life occurrence of SWS shows free_space oscillating slightly above full_space/2, I created the fix specifically to match this phenomenon as seen on customer's host. We reach the modified section only when (free_space > full_space/2) so it should be OK to update the window at this point if mss==full_space.
>
> So yes, we can test John's fix on customer's host but I doubt it will work for the reasons mentioned above, in brief:
>
> 'window = free_space' instead of 'window=full_space/2' is OK, but the test 'free_space > window + full_space/2' is not for the specific pattern customer sees on his hosts.

Sorry for the long delay in response, I've been on vacation. I'm okay with your patch, and I can't think of any real problem with it, except that the behavior is non-standard. Then again, Linux acking in general is non-standard, which has created the bug in the first place. :) The only case I can think of where it might still ack too often is if free_space frequently drops just below full_space/2 for a bit then rises above full_space/2.

I've also attached a corrected version of my earlier patch that I think solves the problem you noted.

Thanks,
-John

Do full receiver-side SWS avoidance when rcvbuf < mss.

Signed-off-by: John Heffner [EMAIL PROTECTED]
---
commit f4333661026621e15549fb75b37be785e4a1c443
tree 30d46b64ea19634875fdd4656d33f76db526a313
parent 562aa1d4c6a874373f9a48ac184f662fbbb06a04
author John Heffner [EMAIL PROTECTED] Tue, 13 Mar 2007 14:17:03 -0400
committer John Heffner [EMAIL PROTECTED] Tue, 13 Mar 2007 14:17:03 -0400

 net/ipv4/tcp_output.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index dc15113..e621a63 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1605,8 +1605,15 @@ u32 __tcp_select_window(struct sock *sk)
 		 * We also don't do any window rounding when the free space
 		 * is too small.
 		 */
-		if (window <= free_space - mss || window > free_space)
+		if (window <= free_space - mss || window > free_space) {
 			window = (free_space/mss)*mss;
+		} else if (mss == full_space) {
+			/* Do full receive-side SWS avoidance
+			 * when rcvbuf <= mss */
+			window = tcp_receive_window(tp);
+			if (free_space > window + full_space/2)
+				window = free_space;
+		}
 	}
 
 	return window;
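The arithmetic in Alex's report can be replayed with a small sketch. The function names are hypothetical, and the comparison operators are an assumption: they appear to have been lost from the archived text, and '<' and '>' are the readings consistent with the reported outcome (one test succeeds, the other does not).

```c
/* Test used in the corrected patch: enlarge the window only when free
 * space exceeds the current window by more than half the buffer. */
static int test_window_plus_half(int free_space, int window, int full_space)
{
	return free_space > window + full_space / 2;
}

/* Test from the earlier proposal as read from the thread
 * (operator assumed to be '<'). */
static int test_window_below_half(int window, int full_space)
{
	return window < full_space / 2;
}
```

With full_space=708, free_space=393 and window=76, the second test holds (76 < 354) while the first does not (393 is not greater than 76 + 354 = 430), which matches the behaviour described for the customer's host.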
Re: [PATCH] tc35815: Fix an usage of streaming DMA API.
On Wed, 14 Mar 2007 01:02:20 +0900 (JST) Atsushi Nemoto [EMAIL PROTECTED] wrote:
> The tc35815 driver lacks a call to pci_dma_sync_single_for_device() on receiving. Recent fix of MIPS dma_sync_single_for_cpu() reveal this bug.
>
> Signed-off-by: Atsushi Nemoto [EMAIL PROTECTED]
> ---
> This patch can be applied to netdev-2.6 tree or 2.6.21-rc3-mm2.
>
> diff --git a/drivers/net/tc35815.c b/drivers/net/tc35815.c
> index ec888db..eed78b5 100644
> --- a/drivers/net/tc35815.c
> +++ b/drivers/net/tc35815.c
> @@ -58,12 +58,13 @@
>   *	1.34	Fix netpoll locking. BH rule for NAPI is not enough with
>   *		netpoll, hard_start_xmit might be called from irq context.
>   *		PM support.
> + *	1.35	Fix an usage of streaming DMA API.
>   */

Please don't use comments as a changelog anymore. It gets out of date. The use of change control systems has made this practice obsolete.

--
Stephen Hemminger [EMAIL PROTECTED]
Re: [patch 4/4] tcp: statistics not read_mostly
Stephen Hemminger [EMAIL PROTECTED] writes:
> /*
>  * FIXME: On x86 and some other CPUs the split into user and softirq parts
>  * is not needed because addl $1,memory is atomic against interrupts (but
>  * atomic_inc would be overkill because of the lock cycles). Wants new
>  * nonlocked_atomic_inc() primitives -AK
>  */

That exists now as local_t. And in fact the generic (non x86) local_t is implemented in the same way as the current network statistics (although I'm not convinced that's the best portable way to do this).

-Andi
Re: [PATCH] natsemi: netpoll fixes
On Tue, Mar 13, 2007 at 04:53:54PM +0300, Sergei Shtylyov wrote:
> Mark Brown wrote:
> > confused and eventually locks up. Before locking up it will usually report one or more oversized packets so this is a useful hint that we should reset the receive state machine in order to recover from this.
>
> That's all good but why do we need to completely lose TX and other interrupts in the meantime? High inbound traffic doesn't necessarily mean a high outbound one, does it?

While the code in the driver can cope if the chip takes a while to respond to the reset, as far as I have been able to tell in testing it does so close enough to immediately to avoid repeating the loop at all. The effect on transmit processing should be minimal.

--
You grabbed my hand and we fell into it, like a daydream - or a fever.
Re: bridge: faster compare for link local addresses
From: Eric Dumazet [EMAIL PROTECTED]
Date: Tue, 13 Mar 2007 14:38:32 +0100

> But memcmp() has a strong semantic (in libc). memcmp(a, b, 6) should do 6 byte compares and conditional branches, regardless of a/b alignment. Or use the x86 rep cmpsb instruction that basically has the same cost.

Yep, that's the issue, gcc won't make the reductions necessary here to get it down to one comparison and one branch.
Re: wireless extensions vs. 64-bit architectures
On Mon, 2007-03-12 at 10:56 -0700, Jean Tourrilhes wrote:
> I did that in the e-mail to Jouni. The problem is that most people are unfamiliar with decoding iwevents, so can't grasp the explanation.
> Basically, for iwpoint, we have an outer length and an inner length. If they don't match, we have an alignment issue and just need to pick the payload 8 bytes after the expected location.
> For other events, they have a well known size. If the outer length is not the expected size, but is expected+4, you just pick the payload 4 bytes after the expected location.

Ok. So the plan now is to put this document up somewhere maybe with some graphics or whatever, and then send this to distros so they know what happens when people hit this bug.

Does your new version work without padding even on 64-bit arches? Then in a few years we can actually remove the padding completely in the kernel, right?

johannes
Re: bridge: faster compare for link local addresses
On Tue, 13 Mar 2007 12:39:54 -0700 (PDT) David Miller [EMAIL PROTECTED] wrote:
> From: Eric Dumazet [EMAIL PROTECTED]
> Date: Tue, 13 Mar 2007 14:38:32 +0100
>
> > But memcmp() has a strong semantic (in libc). memcmp(a, b, 6) should do 6 byte compares and conditional branches, regardless of a/b alignment. Or use the x86 rep cmpsb instruction that basically has the same cost.
>
> Yep, that's the issue, gcc won't make the reductions necessary here to get it down to one comparison and one branch.

Also, for our usage we only care about equality, not greater/less than return value.

--
Stephen Hemminger [EMAIL PROTECTED]
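When only equality matters, a 6-byte address compare can be written with two wide loads per address and a single final test. The following user-space sketch shows one way to do that; mac_equal() is a hypothetical name, not the helper from the bridge patch, and the fixed-size memcpy() calls are used so the code stays valid for unaligned input.

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Returns nonzero when the two 6-byte addresses are identical.
 * The address is read as one 32-bit and one 16-bit chunk; XOR leaves a
 * nonzero bit wherever the inputs differ, so one test covers all bytes. */
static int mac_equal(const unsigned char *a, const unsigned char *b)
{
	uint32_t a4, b4;
	uint16_t a2, b2;

	memcpy(&a4, a, 4);
	memcpy(&b4, b, 4);
	memcpy(&a2, a + 4, 2);
	memcpy(&b2, b + 4, 2);

	return ((a4 ^ b4) | (uint32_t)(a2 ^ b2)) == 0;
}
```

Unlike memcmp(), this gives no ordering information, which is acceptable for the equality-only use mentioned above.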
[RFC] Get rid of netdev_nit
It isn't any faster to test a boolean global variable than do a simple check for empty list.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
---
 net/core/dev.c |   18 +++++-------------
 1 files changed, 5 insertions(+), 13 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 3a8590c..f2ae2c9 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -226,12 +226,6 @@ #endif
 ***/
 
 /*
- *	For efficiency
- */
-
-static int netdev_nit;
-
-/*
  *	Add a protocol ID to the list. Now that the input handler is
  *	smarter we can dispense with all the messy stuff that used to be
  *	here.
@@ -265,10 +259,9 @@ void dev_add_pack(struct packet_type *pt
 	int hash;
 
 	spin_lock_bh(&ptype_lock);
-	if (pt->type == htons(ETH_P_ALL)) {
-		netdev_nit++;
+	if (pt->type == htons(ETH_P_ALL))
 		list_add_rcu(&pt->list, &ptype_all);
-	} else {
+	else {
 		hash = ntohs(pt->type) & 15;
 		list_add_rcu(&pt->list, &ptype_base[hash]);
 	}
@@ -295,10 +288,9 @@ void __dev_remove_pack(struct packet_typ
 
 	spin_lock_bh(&ptype_lock);
 
-	if (pt->type == htons(ETH_P_ALL)) {
-		netdev_nit--;
+	if (pt->type == htons(ETH_P_ALL))
 		head = &ptype_all;
-	} else
+	else
 		head = &ptype_base[ntohs(pt->type) & 15];
 
 	list_for_each_entry(pt1, head, list) {
@@ -1333,7 +1325,7 @@ static int dev_gso_segment(struct sk_buf
 int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	if (likely(!skb->next)) {
-		if (netdev_nit)
+		if (!list_empty(&ptype_all))
 			dev_queue_xmit_nit(skb, dev);
 
 		if (netif_needs_gso(dev, skb)) {
--
1.4.1
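The reasoning can be illustrated with a minimal circular doubly-linked list in user space. The type and function names below are hypothetical stand-ins, not the kernel's list.h; the point is that the emptiness check is a single pointer comparison, so a separate counter adds bookkeeping without making the test cheaper.

```c
/* Circular doubly-linked list node; an empty list points at itself. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

/* One load and one compare: no separate element counter needed. */
static int list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

static void list_add_head(struct list_node *node, struct list_node *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

static void list_remove(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
}
```

A counter kept next to such a list also has to be updated on every add and remove, which is exactly the code the patch deletes.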
Re: wireless extensions vs. 64-bit architectures
On Tue, Mar 13, 2007 at 08:42:05PM +0100, Johannes Berg wrote:
> On Mon, 2007-03-12 at 10:56 -0700, Jean Tourrilhes wrote:
> > I did that in the e-mail to Jouni. The problem is that most people are unfamiliar with decoding iwevents, so can't grasp the explanation.
> > Basically, for iwpoint, we have an outer length and an inner length. If they don't match, we have an alignment issue and just need to pick the payload 8 bytes after the expected location.
> > For other events, they have a well known size. If the outer length is not the expected size, but is expected+4, you just pick the payload 4 bytes after the expected location.
>
> Ok. So the plan now is to put this document up somewhere maybe with some graphics or whatever, and then send this to distros so they know what happens when people hit this bug.
>
> Does your new version work without padding even on 64-bit arches? Then in a few years we can actually remove the padding completely in the kernel, right?

You are too smart ;-) Yes, the second version in pre16 does exactly that. That's why I had to change the constants.

> johannes

Jean
Re: [IPROUTE2][GENERAL] nl_mgrp to crap if base multicast groups exceeded
On Sun, 25 Feb 2007 12:02:23 -0500 jamal [EMAIL PROTECTED] wrote:
> cheers,
> jamal

applied both patches

--
Stephen Hemminger [EMAIL PROTECTED]
Re: [RFC IPROUTE 00/08]: Time cleanups + nano-second clock resolution support
On Sun, 4 Mar 2007 20:14:53 +0100 (MET) Patrick McHardy [EMAIL PROTECTED] wrote:
> This patchset consists of four parts:
>
> - minor TBF time conversion fix
> - consolidation of time calculations: consolidate commonly used expressions with the goal of making it easier to audit for integer overflows when increasing the internally used clock resolution.
> - support for detecting the clock resolution used by the kernel and converting time values as necessary.
> - finally, increase the internally used clock resolution to nano-seconds
>
> These patches have been tested (well, TBF and HFSC) with both old kernels and patched kernels using nano-second resolution.
>
>  tc/m_estimator.c  |  4
>  tc/m_police.c     |  2
>  tc/q_cbq.c        | 15
>  tc/q_hfsc.c       | 18
>  tc/q_htb.c        |  4
>  tc/q_netem.c      | 14
>  tc/q_tbf.c        | 22
>  tc/tc_cbq.c       |  8
>  tc/tc_core.c      | 61
>  tc/tc_core.h      | 13
>  tc/tc_estimator.c |  2
>  tc/tc_red.c       |  2
>  tc/tc_util.c      | 40
>  tc/tc_util.h      |  7
>  14 files changed, 125 insertions(+), 87 deletions(-)
>
> Patrick McHardy:
>   [IPROUTE]: tbf: fix latency printing
>   [IPROUTE]: Use tc_calc_xmittime() where appropriate
>   [IPROUTE]: Introduce tc_calc_xmitsize and use where appropriate
>   [IPROUTE]: Introduce TIME_UNITS_PER_SEC to represent internal clock resolution
>   [IPROUTE]: Replace usec by time in function names
>   [IPROUTE]: Add sprint_ticks() function and use in CBQ
>   [IPROUTE]: Handle different kernel clock resolutions
>   [IPROUTE]: Increase internal clock resolution to nsec

applied all

--
Stephen Hemminger [EMAIL PROTECTED]
[PATCHES 0/15] skb->h is now a one member union
Hi David,

Please consider pulling from:

master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6.22

We're getting close...

Thanks a lot!

- Arnaldo
[PATCH 01/15] [SK_BUFF]: Introduce skb_reset_transport_header(skb)
For the common, open coded 'skb-h.raw = skb-data' operation, so that we can later turn skb-h.raw into a offset, reducing the size of struct sk_buff in 64bit land while possibly keeping it as a pointer on 32bit. This one touches just the most simple cases: skb-h.raw = skb-data; skb-h.raw = {skb_push|[__]skb_pull}() The next ones will handle the slightly more complex cases. Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] --- drivers/infiniband/hw/cxgb3/iwch_cm.c |6 +++--- drivers/net/appletalk/cops.c|2 +- drivers/net/appletalk/ltpc.c|4 ++-- drivers/net/cxgb3/sge.c |2 +- include/linux/dccp.h|6 +++--- include/linux/skbuff.h |5 + net/appletalk/aarp.c|6 +++--- net/appletalk/ddp.c |4 ++-- net/ax25/af_ax25.c |4 ++-- net/ax25/ax25_in.c |8 net/bluetooth/af_bluetooth.c|2 +- net/bluetooth/hci_core.c|9 + net/bluetooth/hci_sock.c|2 +- net/core/dev.c |2 +- net/core/netpoll.c |2 +- net/decnet/dn_nsp_in.c |2 +- net/decnet/dn_nsp_out.c |2 +- net/decnet/dn_route.c |4 ++-- net/ipv4/af_inet.c |6 -- net/ipv4/ah4.c |3 ++- net/ipv4/ip_input.c |2 +- net/ipv4/ip_output.c|2 +- net/ipv4/ipmr.c |2 +- net/ipv4/udp.c |3 ++- net/ipv4/xfrm4_mode_transport.c |2 +- net/ipv6/ip6_input.c|2 +- net/ipv6/ip6_output.c |8 net/ipv6/ipv6_sockglue.c|4 ++-- net/ipv6/netfilter/nf_conntrack_reasm.c |2 +- net/ipv6/reassembly.c |2 +- net/ipv6/xfrm6_mode_transport.c |2 +- net/ipx/af_ipx.c|2 +- net/ipx/ipx_route.c |2 +- net/irda/af_irda.c |4 ++-- net/irda/irlap_frame.c |2 +- net/iucv/af_iucv.c |2 +- net/key/af_key.c|2 +- net/llc/llc_sap.c |2 +- net/netlink/af_netlink.c|2 +- net/netrom/af_netrom.c |6 +++--- net/netrom/nr_in.c |2 +- net/netrom/nr_loopback.c|2 +- net/rose/af_rose.c |2 +- net/rose/rose_loopback.c|2 +- net/rose/rose_route.c |2 +- net/unix/af_unix.c |2 +- net/x25/af_x25.c|3 +-- net/x25/x25_dev.c |2 +- net/x25/x25_in.c|2 +- 49 files changed, 82 insertions(+), 73 deletions(-) From 410c353531e314f7c3642471b2f1b61bd8fc4ef7 Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo [EMAIL PROTECTED] 
Date: Tue, 13 Mar 2007 13:06:52 -0300 Subject: [PATCH 01/15] [SK_BUFF]: Introduce skb_reset_transport_header(skb) For the common, open coded 'skb-h.raw = skb-data' operation, so that we can later turn skb-h.raw into a offset, reducing the size of struct sk_buff in 64bit land while possibly keeping it as a pointer on 32bit. This one touches just the most simple cases: skb-h.raw = skb-data; skb-h.raw = {skb_push|[__]skb_pull}() The next ones will handle the slightly more complex cases. Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] --- drivers/infiniband/hw/cxgb3/iwch_cm.c |6 +++--- drivers/net/appletalk/cops.c|2 +- drivers/net/appletalk/ltpc.c|4 ++-- drivers/net/cxgb3/sge.c |2 +- include/linux/dccp.h|6 +++--- include/linux/skbuff.h |5 + net/appletalk/aarp.c|6 +++--- net/appletalk/ddp.c |4 ++-- net/ax25/af_ax25.c |4 ++-- net/ax25/ax25_in.c |8 net/bluetooth/af_bluetooth.c|2 +- net/bluetooth/hci_core.c|9 + net/bluetooth/hci_sock.c|2 +- net/core/dev.c |2 +- net/core/netpoll.c |2 +- net/decnet/dn_nsp_in.c |2 +- net/decnet/dn_nsp_out.c |2 +- net/decnet/dn_route.c |4 ++-- net/ipv4/af_inet.c |6 -- net/ipv4/ah4.c |3 ++- net/ipv4/ip_input.c |2 +- net/ipv4/ip_output.c|2 +- net/ipv4/ipmr.c |2 +- net/ipv4/udp.c |3 ++- net/ipv4/xfrm4_mode_transport.c |2 +-
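The motivation for wrapping 'skb->h.raw = skb->data' in a helper can be sketched with two toy buffer types. Both structs and all function names are hypothetical simplifications, not the sk_buff definition: one stores the transport header as a pointer, the other as an offset from the buffer head, and callers that only use the helpers cannot tell the difference.

```c
#include <stddef.h>

/* Variant 1: transport header kept as a pointer. */
struct demo_skb_ptr {
	unsigned char *head;	/* start of the allocated buffer */
	unsigned char *data;	/* start of the current payload */
	unsigned char *h_raw;	/* transport header */
};

/* Variant 2: transport header kept as an offset from head, which can
 * be narrower than a pointer on 64-bit machines. */
struct demo_skb_off {
	unsigned char *head;
	unsigned char *data;
	unsigned int h_off;
};

static void ptr_reset_transport_header(struct demo_skb_ptr *skb)
{
	skb->h_raw = skb->data;
}

static unsigned char *ptr_transport_header(const struct demo_skb_ptr *skb)
{
	return skb->h_raw;
}

static void off_reset_transport_header(struct demo_skb_off *skb)
{
	skb->h_off = (unsigned int)(skb->data - skb->head);
}

static unsigned char *off_transport_header(const struct demo_skb_off *skb)
{
	return skb->head + skb->h_off;
}
```

Once every open-coded assignment goes through such a helper, switching the field from pointer to offset is a change to the helper bodies rather than to each of the files in the diffstat.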
[PATCH 02/15] [SK_BUFF]: Introduce skb_transport_offset()
From 9e8e523e2f63bdb7f93f50990817d82a94d276e1 Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
Date: Tue, 13 Mar 2007 13:24:15 -0300
Subject: [PATCH 02/15] [SK_BUFF]: Introduce skb_transport_offset()

For the quite common 'skb->h.raw - skb->data' sequence.

Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] --- drivers/net/atl1/atl1_main.c | 10 +- drivers/net/cassini.c |6 ++ drivers/net/cxgb3/sge.c|7 --- drivers/net/e1000/e1000_main.c | 10 +- drivers/net/ixgb/ixgb_main.c |8 drivers/net/myri10ge/myri10ge.c|5 +++-- drivers/net/netxen/netxen_nic_hw.c |2 +- drivers/net/sk98lin/skge.c |4 ++-- drivers/net/skge.c |2 +- drivers/net/sky2.c |2 +- drivers/net/sungem.c |6 ++ drivers/net/sunhme.c |6 ++ include/linux/skbuff.h |5 + include/net/udplite.h |6 +++--- net/core/dev.c |2 +- net/core/skbuff.c |2 +- net/ipv4/esp4.c|2 +- net/ipv4/udp.c |2 +- net/ipv6/esp6.c|9 +++-- net/ipv6/exthdrs.c | 12 +++- net/ipv6/ip6_input.c |2 +- net/ipv6/ipcomp6.c |4 +--- net/ipv6/mip6.c|5 +++-- net/ipv6/raw.c |4 ++-- net/ipv6/reassembly.c |3 ++- net/sctp/input.c |2 +- 26 files changed, 64 insertions(+), 64 deletions(-) diff --git a/drivers/net/atl1/atl1_main.c b/drivers/net/atl1/atl1_main.c index 5d69178..c5ac46f 100644 --- a/drivers/net/atl1/atl1_main.c +++ b/drivers/net/atl1/atl1_main.c @@ -1326,8 +1326,8 @@ static int atl1_tx_csum(struct atl1_adapter *adapter, struct sk_buff *skb, u8 css, cso; if (likely(skb-ip_summed == CHECKSUM_PARTIAL)) { - cso = skb-h.raw - skb-data; - css = (skb-h.raw + skb-csum) - skb-data; + cso = skb_transport_offset(skb); + css = cso + skb-csum; if (unlikely(cso 0x1)) { printk(KERN_DEBUG %s: payload offset != even number\n, atl1_driver_name); @@ -1369,8 +1369,8 @@ static void atl1_tx_map(struct atl1_adapter *adapter, if (tcp_seg) { /* TSO/GSO */ - proto_hdr_len = - ((skb-h.raw - skb-data) + (skb-h.th-doff 2)); + proto_hdr_len = (skb_transport_offset(skb) + + (skb-h.th-doff 2)); buffer_info-length = proto_hdr_len; page = virt_to_page(skb-data); offset = (unsigned long)skb-data ~PAGE_MASK; @@ -1562,7 +1562,7 @@ static int atl1_xmit_frame(struct sk_buff *skb, struct net_device *netdev) mss = skb_shinfo(skb)-gso_size; if (mss) { if (skb-protocol == ntohs(ETH_P_IP)) { - proto_hdr_len = ((skb-h.raw - skb-data) + + 
proto_hdr_len = (skb_transport_offset(skb) + (skb-h.th-doff 2)); if (unlikely(proto_hdr_len len)) { dev_kfree_skb_any(skb); diff --git a/drivers/net/cassini.c b/drivers/net/cassini.c index 68e37a6..bd3ab64 100644 --- a/drivers/net/cassini.c +++ b/drivers/net/cassini.c @@ -2821,10 +2821,8 @@ static inline int cas_xmit_tx_ringN(struct cas *cp, int ring, ctrl = 0; if (skb-ip_summed == CHECKSUM_PARTIAL) { - u64 csum_start_off, csum_stuff_off; - - csum_start_off = (u64) (skb-h.raw - skb-data); - csum_stuff_off = csum_start_off + skb-csum_offset; + const u64 csum_start_off = skb_transport_offset(skb); + const u64 csum_stuff_off = csum_start_off + skb-csum_offset; ctrl =
[PATCH 03/15] [SK_BUFF]: Introduce skb_set_transport_header
From 275275bab5d9e1887f1227fddbee875eaec82c6f Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
Date: Tue, 13 Mar 2007 13:51:52 -0300
Subject: [PATCH 03/15] [SK_BUFF]: Introduce skb_set_transport_header

For the cases where the transport header is being set to a offset from
skb->data.

Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 include/linux/skbuff.h | 6 ++
 net/ax25/af_ax25.c | 20
 net/ax25/ax25_in.c | 2 +-
 net/ipv4/esp4.c | 3 ++-
 net/ipv4/ip_output.c | 19 ---
 net/ipv4/tcp_input.c | 2 +-
 net/ipv6/ah6.c | 2 +-
 net/ipv6/esp6.c | 4 ++--
 net/ipv6/netfilter/nf_conntrack_reasm.c | 2 +-
 net/ipv6/xfrm6_mode_beet.c | 2 +-
 net/ipv6/xfrm6_mode_ro.c | 2 +-
 net/ipv6/xfrm6_mode_transport.c | 2 +-
 12 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f721fab..12bd740 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -960,6 +960,12 @@ static inline void skb_reset_transport_header(struct sk_buff *skb)
 	skb->h.raw = skb->data;
 }
 
+static inline void skb_set_transport_header(struct sk_buff *skb,
+					    const int offset)
+{
+	skb->h.raw = skb->data + offset;
+}
+
 static inline int skb_transport_offset(const struct sk_buff *skb)
 {
 	return skb->h.raw - skb->data;
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index 14db01a..75d4d69 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -1425,7 +1425,6 @@ static int ax25_sendmsg(struct kiocb *iocb,
struct socket *sock, struct sockaddr_ax25 sax; struct sk_buff *skb; ax25_digi dtmp, *dp; - unsigned char *asmptr; ax25_cb *ax25; size_t size; int lv, err, addr_len = msg-msg_namelen; @@ -1551,10 +1550,8 @@ static int ax25_sendmsg(struct kiocb *iocb, struct socket *sock, skb_reset_network_header(skb); /* Add the PID if one is not supplied by the user in the skb */ - if (!ax25-pidincl) { - asmptr = skb_push(skb, 1); - *asmptr = sk-sk_protocol; - } + if (!ax25-pidincl) + *skb_push(skb, 1) = sk-sk_protocol; SOCK_DEBUG(sk, AX.25: Transmitting buffer\n); @@ -1573,7 +1570,7 @@ static int ax25_sendmsg(struct kiocb *iocb, struct socket *sock, goto out; } - asmptr = skb_push(skb, 1 + ax25_addr_size(dp)); + skb_push(skb, 1 + ax25_addr_size(dp)); SOCK_DEBUG(sk, Building AX.25 Header (dp=%p).\n, dp); @@ -1581,17 +1578,16 @@ static int ax25_sendmsg(struct kiocb *iocb, struct socket *sock, SOCK_DEBUG(sk, Num digipeaters=%d\n, dp-ndigi); /* Build an AX.25 header */ - asmptr += (lv = ax25_addr_build(asmptr, ax25-source_addr, - sax.sax25_call, dp, - AX25_COMMAND, AX25_MODULUS)); + lv = ax25_addr_build(skb-data, ax25-source_addr, sax.sax25_call, + dp, AX25_COMMAND, AX25_MODULUS); SOCK_DEBUG(sk, Built header (%d bytes)\n,lv); - skb-h.raw = asmptr; + skb_set_transport_header(skb, lv); - SOCK_DEBUG(sk, base=%p pos=%p\n, skb-data, asmptr); + SOCK_DEBUG(sk, base=%p pos=%p\n, skb-data, skb-h.raw); - *asmptr = AX25_UI; + *skb-h.raw = AX25_UI; /* Datagram frames go straight out of the door as UI */ ax25_queue_xmit(skb, ax25-ax25_dev-dev); diff --git a/net/ax25/ax25_in.c b/net/ax25/ax25_in.c index 724ad5c..31c5938 100644 --- a/net/ax25/ax25_in.c +++ b/net/ax25/ax25_in.c @@ -233,7 +233,7 @@ static int ax25_rcv(struct sk_buff *skb, struct net_device *dev, /* UI frame - bypass LAPB processing */ if ((*skb-data ~0x10) == AX25_UI dp.lastrepeat + 1 == dp.ndigi) { - skb-h.raw = skb-data + 2; /* skip control and pid */ + skb_set_transport_header(skb, 2); /* skip control and pid */ 
ax25_send_to_raw(dest, skb, skb-data[1]); diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c index 9576745..82543ee 100644 --- a/net/ipv4/esp4.c +++ b/net/ipv4/esp4.c @@ -261,7 +261,8 @@ static int esp_input(struct xfrm_state *x, struct sk_buff *skb) iph-protocol = nexthdr[1]; pskb_trim(skb,
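The setter added in this patch and the skb_transport_offset() helper from the previous one are inverses of each other, which a small userspace sketch can show. This is only an illustration under stated assumptions: struct toy_skb and the toy_* functions are hypothetical stand-ins that mirror the two one-line helpers from the skbuff.h hunk, not the kernel definitions themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct sk_buff; not kernel code. */
struct toy_skb {
	unsigned char *data;   /* start of the current payload */
	unsigned char *h_raw;  /* transport header, mirrors skb->h.raw */
};

/* Mirrors skb_set_transport_header(): header = data + offset. */
static inline void toy_set_transport_header(struct toy_skb *skb,
					    const int offset)
{
	skb->h_raw = skb->data + offset;
}

/* Mirrors skb_transport_offset(): offset = header - data. */
static inline int toy_transport_offset(const struct toy_skb *skb)
{
	return (int)(skb->h_raw - skb->data);
}
```

Setting the header two bytes past data, as the ax25_rcv() hunk does to skip the control and PID bytes, and then asking for the offset gives back 2.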
[PATCH 04/15] [SCTP]: Introduce sctp_hdr()
From c483c1c1cc2c8e79c8eb06ef8f1e9fe8b5860721 Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
Date: Tue, 13 Mar 2007 13:59:32 -0300
Subject: [PATCH 04/15] [SCTP]: Introduce sctp_hdr()

For consistency with all the other skb->h.raw accessors.

Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 include/linux/sctp.h | 9 +
 net/sctp/input.c | 14 +-
 net/sctp/ipv6.c | 4 ++--
 net/sctp/protocol.c | 10 --
 4 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/include/linux/sctp.h b/include/linux/sctp.h
index d4f8656..d76767d 100644
--- a/include/linux/sctp.h
+++ b/include/linux/sctp.h
@@ -63,6 +63,15 @@ typedef struct sctphdr {
 	__be32 checksum;
 } __attribute__((packed)) sctp_sctphdr_t;
 
+#ifdef __KERNEL__
+#include <linux/skbuff.h>
+
+static inline struct sctphdr *sctp_hdr(const struct sk_buff *skb)
+{
+	return (struct sctphdr *)skb->h.raw;
+}
+#endif
+
 /* Section 3.2. Chunk Field Descriptions. */
 typedef struct sctp_chunkhdr {
 	__u8 type;
diff --git a/net/sctp/input.c b/net/sctp/input.c
index 9311b5d..3a322c5 100644
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -79,14 +79,10 @@ static void sctp_add_backlog(struct sock *sk, struct sk_buff *skb);
 /* Calculate the SCTP checksum of an SCTP packet. */
 static inline int sctp_rcv_checksum(struct sk_buff *skb)
 {
-	struct sctphdr *sh;
-	__u32 cmp, val;
 	struct sk_buff *list = skb_shinfo(skb)->frag_list;
-
-	sh = (struct sctphdr *) skb->h.raw;
-	cmp = ntohl(sh->checksum);
-
-	val = sctp_start_cksum((__u8 *)sh, skb_headlen(skb));
+	struct sctphdr *sh = sctp_hdr(skb);
+	__u32 cmp = ntohl(sh->checksum);
+	__u32 val = sctp_start_cksum((__u8 *)sh, skb_headlen(skb));
 
 	for (; list; list = list->next)
 		val = sctp_update_cksum((__u8 *)list->data, skb_headlen(list),
@@ -138,7 +134,7 @@ int sctp_rcv(struct sk_buff *skb)
 	if (skb_linearize(skb))
 		goto discard_it;
 
-	sh = (struct sctphdr *) skb->h.raw;
+	sh = sctp_hdr(skb);
 
 	/* Pull up the IP and SCTP headers. */
 	__skb_pull(skb, skb_transport_offset(skb));
@@ -905,7 +901,7 @@ static struct sctp_association *__sctp_rcv_init_lookup(struct sk_buff *skb,
 	struct sctp_association *asoc;
 	union sctp_addr addr;
 	union sctp_addr *paddr = &addr;
-	struct sctphdr *sh = (struct sctphdr *) skb->h.raw;
+	struct sctphdr *sh = sctp_hdr(skb);
 	sctp_chunkhdr_t *ch;
 	union sctp_params params;
 	sctp_init_chunk_t *init;
diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index e1bfc50..dff72e0 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -390,7 +390,7 @@ static void sctp_v6_from_skb(union sctp_addr *addr,struct sk_buff *skb,
 	addr->v6.sin6_flowinfo = 0; /* FIXME */
 	addr->v6.sin6_scope_id = ((struct inet6_skb_parm *)skb->cb)->iif;
 
-	sh = (struct sctphdr *) skb->h.raw;
+	sh = sctp_hdr(skb);
 	if (is_saddr) {
 		*port = sh->source;
 		from = &ipv6_hdr(skb)->saddr;
@@ -765,7 +765,7 @@ static void sctp_inet6_skb_msgname(struct sk_buff *skb, char *msgname,
 	if (msgname) {
 		sctp_inet6_msgname(msgname, addr_len);
 		sin6 = (struct sockaddr_in6 *)msgname;
-		sh = (struct sctphdr *)skb->h.raw;
+		sh = sctp_hdr(skb);
 		sin6->sin6_port = sh->source;
 
 		/* Map ipv4 address into v4-mapped-on-v6 address. */
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 08f92ba..7c28c9b 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -235,7 +235,7 @@ static void sctp_v4_from_skb(union sctp_addr *addr, struct sk_buff *skb,
 	port = &addr->v4.sin_port;
 	addr->v4.sin_family = AF_INET;
 
-	sh = (struct sctphdr *) skb->h.raw;
+	sh = sctp_hdr(skb);
 	if (is_saddr) {
 		*port = sh->source;
 		from = &ip_hdr(skb)->saddr;
@@ -731,13 +731,11 @@ static void sctp_inet_event_msgname(struct sctp_ulpevent *event, char *msgname,
 /* Initialize and copy out a msgname from an inbound skb. */
 static void sctp_inet_skb_msgname(struct sk_buff *skb, char *msgname, int *len)
 {
-	struct sctphdr *sh;
-	struct sockaddr_in *sin;
-
 	if (msgname) {
+		struct sctphdr *sh = sctp_hdr(skb);
+		struct sockaddr_in *sin = (struct sockaddr_in *)msgname;
+
 		sctp_inet_msgname(msgname, len);
-		sin = (struct sockaddr_in *)msgname;
-		sh = (struct sctphdr *)skb->h.raw;
 		sin->sin_port = sh->source;
 		sin->sin_addr.s_addr = ip_hdr(skb)->saddr;
 	}
-- 
1.5.0.2
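The typed-accessor pattern this patch applies to SCTP (and that the igmp_hdr(), udp_hdr(), icmp_hdr() and ipip_hdr() patches in this series repeat) can be sketched in userspace as follows. The names struct toy_sctphdr, struct toy_skb and toy_sctp_hdr() are hypothetical, and the header layout is simplified for illustration; only the shape of the accessor follows the sctp.h hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified header layout for illustration only. */
struct toy_sctphdr {
	uint16_t source;
	uint16_t dest;
	uint32_t vtag;
	uint32_t checksum;
};

/* Hypothetical stand-in for struct sk_buff: only the transport pointer. */
struct toy_skb {
	unsigned char *h_raw;  /* mirrors skb->h.raw */
};

/*
 * The one place that turns the raw transport pointer into a typed one.
 * Callers no longer repeat the cast, so the representation of h_raw
 * can change later without editing every call site.
 */
static inline struct toy_sctphdr *toy_sctp_hdr(const struct toy_skb *skb)
{
	return (struct toy_sctphdr *)skb->h_raw;
}
```

A call site then reads toy_sctp_hdr(&skb)->source instead of casting skb.h_raw by hand, which is the same substitution the hunks above make.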
[PATCH 09/15] [TCP]: Introduce tcp_hdrlen() and tcp_optlen()
From 89d23ed26a2c62b8d8c0e2159c6cc9c6fb47a491 Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
Date: Tue, 13 Mar 2007 15:04:14 -0300
Subject: [PATCH 09/15] [TCP]: Introduce tcp_hdrlen() and tcp_optlen()

The ip_hdrlen() buddy, created to reduce the number of skb->h.th-> uses and to
avoid the longer, open coded equivalent.

Ditched a no-op in bnx2 in the process.

I wonder if we should have a BUG_ON(skb->h.th->doff < 5) in tcp_optlen()...

Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] --- drivers/net/atl1/atl1_main.c |7 +++ drivers/net/bnx2.c |7 +++ drivers/net/e1000/e1000_main.c |4 ++-- drivers/net/ehea/ehea_main.c |2 +- drivers/net/ixgb/ixgb_main.c |2 +- drivers/net/myri10ge/myri10ge.c |3 +-- drivers/net/netxen/netxen_nic_hw.c |3 +-- drivers/net/netxen/netxen_nic_main.c |2 +- drivers/net/sky2.c |2 +- drivers/net/tg3.c|4 ++-- drivers/s390/net/qeth_eddp.c |8 include/linux/tcp.h | 10 ++ net/ipv4/tcp_ipv4.c |2 +- net/ipv6/tcp_ipv6.c |2 +- 14 files changed, 32 insertions(+), 26 deletions(-) diff --git a/drivers/net/atl1/atl1_main.c b/drivers/net/atl1/atl1_main.c index c5ac46f..0912d2a 100644 --- a/drivers/net/atl1/atl1_main.c +++ b/drivers/net/atl1/atl1_main.c @@ -1307,7 +1307,7 @@ static int atl1_tso(struct atl1_adapter *adapter, struct sk_buff *skb, tso-tsopl |= (iph-ihl CSUM_PARAM_IPHL_MASK) CSUM_PARAM_IPHL_SHIFT; - tso-tsopl |= ((skb-h.th-doff 2) + tso-tsopl |= (tcp_hdrlen(skb) TSO_PARAM_TCPHDRLEN_MASK) TSO_PARAM_TCPHDRLEN_SHIFT; tso-tsopl |= (skb_shinfo(skb)-gso_size TSO_PARAM_MSS_MASK) TSO_PARAM_MSS_SHIFT; @@ -1369,8 +1369,7 @@ static void atl1_tx_map(struct atl1_adapter *adapter, if (tcp_seg) { /* TSO/GSO */ - proto_hdr_len = (skb_transport_offset(skb) + - (skb-h.th-doff 2)); + proto_hdr_len = skb_transport_offset(skb) + tcp_hdrlen(skb); buffer_info-length = proto_hdr_len; page = virt_to_page(skb-data); offset = (unsigned long)skb-data ~PAGE_MASK; @@ -1563,7 +1562,7 @@ static int atl1_xmit_frame(struct sk_buff *skb, struct net_device *netdev) if (mss) { if (skb-protocol == ntohs(ETH_P_IP)) { proto_hdr_len = (skb_transport_offset(skb) + - (skb-h.th-doff 2)); + tcp_hdrlen(skb)); if (unlikely(proto_hdr_len len)) { dev_kfree_skb_any(skb); return NETDEV_TX_OK; diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c index 01ea2ba..f948918 100644 --- a/drivers/net/bnx2.c +++ b/drivers/net/bnx2.c @@ -4520,13 +4520,12 @@ bnx2_start_xmit(struct sk_buff *skb, struct net_device *dev) return 
NETDEV_TX_OK; } - tcp_opt_len = ((skb-h.th-doff - 5) * 4); vlan_tag_flags |= TX_BD_FLAGS_SW_LSO; tcp_opt_len = 0; - if (skb-h.th-doff 5) { - tcp_opt_len = (skb-h.th-doff - 5) 2; - } + if (skb-h.th-doff 5) + tcp_opt_len = tcp_optlen(skb); + ip_tcp_len = ip_hdrlen(skb) + sizeof(struct tcphdr); iph = ip_hdr(skb); diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c index 79988b7..40e18c2 100644 --- a/drivers/net/e1000/e1000_main.c +++ b/drivers/net/e1000/e1000_main.c @@ -2887,7 +2887,7 @@ e1000_tso(struct e1000_adapter *adapter, struct e1000_tx_ring *tx_ring, return err; } - hdr_len = (skb_transport_offset(skb) + (skb-h.th-doff 2)); + hdr_len = skb_transport_offset(skb) + tcp_hdrlen(skb); mss = skb_shinfo(skb)-gso_size; if (skb-protocol == htons(ETH_P_IP)) { struct iphdr *iph = ip_hdr(skb); @@ -3292,7 +3292,7 @@ e1000_xmit_frame(struct sk_buff *skb, struct net_device *netdev) /* TSO Workaround for 82571/2/3 Controllers -- if skb-data * points to just header, pull a few bytes of payload from * frags into skb-data */ - hdr_len = (skb_transport_offset(skb) +
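The arithmetic behind the two helpers can be read off the replaced expressions in the driver hunks: the data offset field counts 32-bit words, so the header length is doff * 4 and the option length is whatever exceeds the 5-word fixed header. A minimal userspace sketch, assuming a hypothetical struct toy_tcphdr that carries only doff (the real helpers take an skb, and their definitions in include/linux/tcp.h are not shown in this mail):

```c
#include <assert.h>

/* Hypothetical stand-in: only the TCP data offset, in 32-bit words. */
struct toy_tcphdr {
	unsigned int doff;
};

/* Total TCP header length in bytes; same value as (doff << 2). */
static inline unsigned int toy_tcp_hdrlen(const struct toy_tcphdr *th)
{
	return th->doff * 4;
}

/*
 * Bytes of TCP options beyond the fixed 20-byte header.
 * A doff below 5 is malformed and would wrap around in this unsigned
 * arithmetic, which is the case the BUG_ON() remark is about.
 */
static inline unsigned int toy_tcp_optlen(const struct toy_tcphdr *th)
{
	return (th->doff - 5) * 4;
}
```

For a header without options (doff of 5) this gives 20 and 0 bytes; with doff of 8 it gives 32 and 12.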
[PATCH 11/15] [SK_BUFF]: Introduce ipip_hdr(), remove skb->h.ipiph
From 13b28cb035592e5bfeb3da05550850b0c825e01e Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
Date: Tue, 13 Mar 2007 15:52:43 -0300
Subject: [PATCH 11/15] [SK_BUFF]: Introduce ipip_hdr(), remove skb->h.ipiph

Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 drivers/net/sk98lin/skge.c | 4 ++--
 drivers/net/skge.c | 2 +-
 include/linux/ip.h | 5 +
 include/linux/skbuff.h | 1 -
 net/ipv4/xfrm4_mode_tunnel.c | 6 +++---
 net/ipv6/xfrm6_mode_tunnel.c | 2 +-
 6 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/net/sk98lin/skge.c b/drivers/net/sk98lin/skge.c
index e4ab7a8..b987a5c 100644
--- a/drivers/net/sk98lin/skge.c
+++ b/drivers/net/sk98lin/skge.c
@@ -1565,7 +1565,7 @@ struct sk_buff *pMessage) /* pointer to send-message */
 		u16 hdrlen = skb_transport_offset(pMessage);
 		u16 offset = hdrlen + pMessage->csum_offset;
 
-		if ((pMessage->h.ipiph->protocol == IPPROTO_UDP ) &&
+		if ((ipip_hdr(pMessage)->protocol == IPPROTO_UDP) &&
 			(pAC->GIni.GIChipRev == 0) &&
 			(pAC->GIni.GIChipId == CHIP_ID_YUKON)) {
 			pTxd->TBControl = BMU_TCP_CHECK;
@@ -1691,7 +1691,7 @@ struct sk_buff *pMessage) /* pointer to send-message */
 		** opcode for udp is not working in the hardware yet
 		** (Revision 2.0)
 		*/
-		if ((pMessage->h.ipiph->protocol == IPPROTO_UDP ) &&
+		if ((ipip_hdr(pMessage)->protocol == IPPROTO_UDP) &&
 			(pAC->GIni.GIChipRev == 0) &&
 			(pAC->GIni.GIChipId == CHIP_ID_YUKON)) {
 			Control |= BMU_TCP_CHECK;
diff --git a/drivers/net/skge.c b/drivers/net/skge.c
index 609cdb4..26b0fe0 100644
--- a/drivers/net/skge.c
+++ b/drivers/net/skge.c
@@ -2626,7 +2626,7 @@ static int skge_xmit_frame(struct sk_buff *skb, struct net_device *dev)
 		/* This seems backwards, but it is what the sk98lin
 		 * does. Looks like hardware is wrong?
 		 */
-		if (skb->h.ipiph->protocol == IPPROTO_UDP
+		if (ipip_hdr(skb)->protocol == IPPROTO_UDP
 		    && hw->chip_rev == 0 && hw->chip_id == CHIP_ID_YUKON)
 			control = BMU_TCP_CHECK;
 		else
diff --git a/include/linux/ip.h b/include/linux/ip.h
index f2f26db..1957844 100644
--- a/include/linux/ip.h
+++ b/include/linux/ip.h
@@ -111,6 +111,11 @@ static inline struct iphdr *ip_hdr(const struct sk_buff *skb)
 {
 	return (struct iphdr *)skb_network_header(skb);
 }
+
+static inline struct iphdr *ipip_hdr(const struct sk_buff *skb)
+{
+	return (struct iphdr *)skb->h.raw;
+}
 #endif
 
 struct ip_auth_hdr {
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index c5407b7..2b1e188 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -239,7 +239,6 @@ struct sk_buff {
 	struct net_device *input_dev;
 
 	union {
-		struct iphdr *ipiph;
 		struct ipv6hdr *ipv6h;
 		unsigned char *raw;
 	} h;
diff --git a/net/ipv4/xfrm4_mode_tunnel.c b/net/ipv4/xfrm4_mode_tunnel.c
index edba756..521e52f 100644
--- a/net/ipv4/xfrm4_mode_tunnel.c
+++ b/net/ipv4/xfrm4_mode_tunnel.c
@@ -17,7 +17,7 @@
 static inline void ipip_ecn_decapsulate(struct sk_buff *skb)
 {
 	struct iphdr *outer_iph = ip_hdr(skb);
-	struct iphdr *inner_iph = skb->h.ipiph;
+	struct iphdr *inner_iph = ipip_hdr(skb);
 
 	if (INET_ECN_is_ce(outer_iph->tos))
 		IP_ECN_set_ce(inner_iph);
@@ -47,7 +47,7 @@ static int xfrm4_tunnel_output(struct xfrm_state *x, struct sk_buff *skb)
 	int flags;
 
 	iph = ip_hdr(skb);
-	skb->h.ipiph = iph;
+	skb->h.raw = skb->nh.raw;
 
 	skb_push(skb, x->props.header_len);
 	skb_reset_network_header(skb);
@@ -116,7 +116,7 @@ static int xfrm4_tunnel_input(struct xfrm_state *x, struct sk_buff *skb)
 	iph = ip_hdr(skb);
 	if (iph->protocol == IPPROTO_IPIP) {
 		if (x->props.flags & XFRM_STATE_DECAP_DSCP)
-			ipv4_copy_dscp(iph, skb->h.ipiph);
+			ipv4_copy_dscp(iph, ipip_hdr(skb));
 		if (!(x->props.flags & XFRM_STATE_NOECN))
 			ipip_ecn_decapsulate(skb);
 	}
diff --git a/net/ipv6/xfrm6_mode_tunnel.c b/net/ipv6/xfrm6_mode_tunnel.c
index 28f36b3..9d3bd33 100644
--- a/net/ipv6/xfrm6_mode_tunnel.c
+++ b/net/ipv6/xfrm6_mode_tunnel.c
@@ -28,7 +28,7 @@ static inline void ipip6_ecn_decapsulate(struct sk_buff *skb)
 static inline void ip6ip_ecn_decapsulate(struct sk_buff *skb)
 {
 	if (INET_ECN_is_ce(ipv6_get_dsfield(ipv6_hdr(skb))))
-		IP_ECN_set_ce(skb->h.ipiph);
+		IP_ECN_set_ce(ipip_hdr(skb));
 }
 
 /* Add encapsulation header.
-- 
1.5.0.2
[PATCH 06/15] [SK_BUFF]: Introduce igmp_hdr() friends, remove skb->h.igmph
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] --- include/linux/igmp.h | 21 + include/linux/skbuff.h |1 - net/ipv4/igmp.c| 22 +++--- net/ipv4/ipmr.c|2 +- 4 files changed, 33 insertions(+), 13 deletions(-) From 515b800d7c7cc7f224ebc24525be83d1e0f956ed Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo [EMAIL PROTECTED] Date: Tue, 13 Mar 2007 14:19:23 -0300 Subject: [PATCH 06/15] [SK_BUFF]: Introduce igmp_hdr() friends, remove skb-h.igmph Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] --- include/linux/igmp.h | 21 + include/linux/skbuff.h |1 - net/ipv4/igmp.c| 22 +++--- net/ipv4/ipmr.c|2 +- 4 files changed, 33 insertions(+), 13 deletions(-) diff --git a/include/linux/igmp.h b/include/linux/igmp.h index a113fe6..ca28552 100644 --- a/include/linux/igmp.h +++ b/include/linux/igmp.h @@ -80,6 +80,27 @@ struct igmpv3_query { __be32 srcs[0]; }; +#ifdef __KERNEL__ +#include linux/skbuff.h + +static inline struct igmphdr *igmp_hdr(const struct sk_buff *skb) +{ + return (struct igmphdr *)skb-h.raw; +} + +static inline struct igmpv3_report * + igmpv3_report_hdr(const struct sk_buff *skb) +{ + return (struct igmpv3_report *)skb-h.raw; +} + +static inline struct igmpv3_query * + igmpv3_query_hdr(const struct sk_buff *skb) +{ + return (struct igmpv3_query *)skb-h.raw; +} +#endif + #define IGMP_HOST_MEMBERSHIP_QUERY 0x11 /* From RFC1112 */ #define IGMP_HOST_MEMBERSHIP_REPORT 0x12 /* Ditto */ #define IGMP_DVMRP 0x13 /* DVMRP routing */ diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 12bd740..a60d1e5 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -242,7 +242,6 @@ struct sk_buff { struct tcphdr *th; struct udphdr *uh; struct icmphdr *icmph; - struct igmphdr *igmph; struct iphdr *ipiph; struct ipv6hdr *ipv6h; unsigned char *raw; diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c index 596eaaa..56a838c 100644 --- a/net/ipv4/igmp.c +++ b/net/ipv4/igmp.c @@ -333,8 +333,8 @@ static struct sk_buff *igmpv3_newpack(struct 
net_device *dev, int size) ((u8*)pip[1])[2] = 0; ((u8*)pip[1])[3] = 0; - pig =(struct igmpv3_report *)skb_put(skb, sizeof(*pig)); - skb-h.igmph = (struct igmphdr *)pig; + skb-h.raw = skb_put(skb, sizeof(*pig)); + pig = igmpv3_report_hdr(skb); pig-type = IGMPV3_HOST_MEMBERSHIP_REPORT; pig-resv1 = 0; pig-csum = 0; @@ -346,13 +346,13 @@ static struct sk_buff *igmpv3_newpack(struct net_device *dev, int size) static int igmpv3_sendpack(struct sk_buff *skb) { struct iphdr *pip = ip_hdr(skb); - struct igmphdr *pig = skb-h.igmph; + struct igmphdr *pig = igmp_hdr(skb); const int iplen = skb-tail - skb-nh.raw; const int igmplen = skb-tail - skb-h.raw; pip-tot_len = htons(iplen); ip_send_check(pip); - pig-csum = ip_compute_csum(skb-h.igmph, igmplen); + pig-csum = ip_compute_csum(igmp_hdr(skb), igmplen); return NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, skb-dev, dst_output); @@ -379,7 +379,7 @@ static struct sk_buff *add_grhead(struct sk_buff *skb, struct ip_mc_list *pmc, pgr-grec_auxwords = 0; pgr-grec_nsrcs = 0; pgr-grec_mca = pmc-multiaddr; - pih = (struct igmpv3_report *)skb-h.igmph; + pih = igmpv3_report_hdr(skb); pih-ngrec = htons(ntohs(pih-ngrec)+1); *ppgr = pgr; return skb; @@ -412,7 +412,7 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ip_mc_list *pmc, if (!*psf_list) goto empty_source; - pih = skb ? (struct igmpv3_report *)skb-h.igmph : NULL; + pih = skb ? 
igmpv3_report_hdr(skb) : NULL; /* EX and TO_EX get a fresh packet, if needed */ if (truncate) { @@ -829,8 +829,8 @@ static void igmp_heard_report(struct in_device *in_dev, __be32 group) static void igmp_heard_query(struct in_device *in_dev, struct sk_buff *skb, int len) { - struct igmphdr *ih = skb-h.igmph; - struct igmpv3_query *ih3 = (struct igmpv3_query *)ih; + struct igmphdr *ih = igmp_hdr(skb); + struct igmpv3_query *ih3 = igmpv3_query_hdr(skb); struct ip_mc_list *im; __be32 group = ih-group; int max_delay; @@ -863,12 +863,12 @@ static void igmp_heard_query(struct in_device *in_dev, struct sk_buff *skb, if (!pskb_may_pull(skb, sizeof(struct igmpv3_query))) return; - ih3 = (struct igmpv3_query *) skb-h.raw; + ih3 = igmpv3_query_hdr(skb); if (ih3-nsrcs) { if (!pskb_may_pull(skb, sizeof(struct igmpv3_query) + ntohs(ih3-nsrcs)*sizeof(__be32))) return; - ih3 = (struct igmpv3_query *) skb-h.raw; + ih3 = igmpv3_query_hdr(skb); } max_delay = IGMPV3_MRC(ih3-code)*(HZ/IGMP_TIMER_SCALE); @@ -945,7 +945,7 @@ int igmp_rcv(struct sk_buff *skb) goto drop; } - ih = skb-h.igmph; + ih = igmp_hdr(skb); switch (ih-type) { case IGMP_HOST_MEMBERSHIP_QUERY: igmp_heard_query(in_dev, skb, len); diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c index 03869d9..05bc270 100644 --- a/net/ipv4/ipmr.c
[PATCH 07/15] [SK_BUFF]: Introduce udp_hdr(), remove skb->h.uh
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] --- drivers/net/gianfar.c |4 ++-- drivers/net/ioc3-eth.c|2 +- drivers/net/mv643xx_eth.c |2 +- include/linux/skbuff.h|1 - include/linux/udp.h |9 + include/net/udplite.h |2 +- net/core/netpoll.c|4 +++- net/core/pktgen.c |4 ++-- net/ipv4/udp.c| 12 ++-- net/ipv6/udp.c| 10 +- net/rxrpc/connection.c|4 ++-- net/rxrpc/transport.c |4 ++-- 12 files changed, 34 insertions(+), 24 deletions(-) From 2b56798afa3a60638188e29d3794263f07ef594c Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo [EMAIL PROTECTED] Date: Tue, 13 Mar 2007 14:28:48 -0300 Subject: [PATCH 07/15] [SK_BUFF]: Introduce udp_hdr(), remove skb-h.uh Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] --- drivers/net/gianfar.c |4 ++-- drivers/net/ioc3-eth.c|2 +- drivers/net/mv643xx_eth.c |2 +- include/linux/skbuff.h|1 - include/linux/udp.h |9 + include/net/udplite.h |2 +- net/core/netpoll.c|4 +++- net/core/pktgen.c |4 ++-- net/ipv4/udp.c| 12 ++-- net/ipv6/udp.c| 10 +- net/rxrpc/connection.c|4 ++-- net/rxrpc/transport.c |4 ++-- 12 files changed, 34 insertions(+), 24 deletions(-) diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c index c9abc96..b9f4460 100644 --- a/drivers/net/gianfar.c +++ b/drivers/net/gianfar.c @@ -944,9 +944,9 @@ static inline void gfar_tx_checksum(struct sk_buff *skb, struct txfcb *fcb) /* And provide the already calculated phcs */ if (ip_hdr(skb)-protocol == IPPROTO_UDP) { flags |= TXFCB_UDP; - fcb-phcs = skb-h.uh-check; + fcb-phcs = udp_hdr(skb)-check; } else - fcb-phcs = skb-h.th-check; + fcb-phcs = udp_hdr(skb)-check; /* l3os is the distance between the start of the * frame (skb-data) and the start of the IP hdr. 
diff --git a/drivers/net/ioc3-eth.c b/drivers/net/ioc3-eth.c index d375e78..ba012e1 100644 --- a/drivers/net/ioc3-eth.c +++ b/drivers/net/ioc3-eth.c @@ -1422,7 +1422,7 @@ static int ioc3_start_xmit(struct sk_buff *skb, struct net_device *dev) csoff = ETH_HLEN + (ih-ihl 2); if (proto == IPPROTO_UDP) { csoff += offsetof(struct udphdr, check); - skb-h.uh-check = csum; + udp_hdr(skb)-check = csum; } if (proto == IPPROTO_TCP) { csoff += offsetof(struct tcphdr, check); diff --git a/drivers/net/mv643xx_eth.c b/drivers/net/mv643xx_eth.c index 92ecf76..af99068 100644 --- a/drivers/net/mv643xx_eth.c +++ b/drivers/net/mv643xx_eth.c @@ -1164,7 +1164,7 @@ static void eth_tx_submit_descs_for_skb(struct mv643xx_private *mp, switch (ip_hdr(skb)-protocol) { case IPPROTO_UDP: cmd_sts |= ETH_UDP_FRAME; - desc-l4i_chk = skb-h.uh-check; + desc-l4i_chk = udp_hdr(skb)-check; break; case IPPROTO_TCP: desc-l4i_chk = skb-h.th-check; diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index a60d1e5..a5d1087 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -240,7 +240,6 @@ struct sk_buff { union { struct tcphdr *th; - struct udphdr *uh; struct icmphdr *icmph; struct iphdr *ipiph; struct ipv6hdr *ipv6h; diff --git a/include/linux/udp.h b/include/linux/udp.h index 7e08c07..1f58503 100644 --- a/include/linux/udp.h +++ b/include/linux/udp.h @@ -26,6 +26,15 @@ struct udphdr { __sum16 check; }; +#ifdef __KERNEL__ +#include linux/skbuff.h + +static inline struct udphdr *udp_hdr(const struct sk_buff *skb) +{ + return (struct udphdr *)skb-h.raw; +} +#endif + /* UDP socket options */ #define UDP_CORK 1 /* Never send partially complete segments */ #define UDP_ENCAP 100 /* Set the socket to accept encapsulated packets */ diff --git a/include/net/udplite.h b/include/net/udplite.h index 7650320..635b0ea 100644 --- a/include/net/udplite.h +++ b/include/net/udplite.h @@ -101,7 +101,7 @@ static inline int udplite_sender_cscov(struct udp_sock *up, struct udphdr *uh) static inline 
__wsum udplite_csum_outgoing(struct sock *sk, struct sk_buff *skb) { - int cscov = udplite_sender_cscov(udp_sk(sk), skb-h.uh); + int cscov = udplite_sender_cscov(udp_sk(sk), udp_hdr(skb)); __wsum csum = 0; skb-ip_summed = CHECKSUM_NONE; /* no HW support for checksumming */ diff --git a/net/core/netpoll.c b/net/core/netpoll.c index cac0279..ae63087 100644 --- a/net/core/netpoll.c +++ b/net/core/netpoll.c @@ -296,7 +296,9 @@ void netpoll_send_udp(struct netpoll *np, const char *msg, int len) memcpy(skb-data, msg, len); skb-len += len; - skb-h.uh = udph = (struct udphdr *) skb_push(skb, sizeof(*udph)); + skb_push(skb, sizeof(*udph)); + skb_reset_transport_header(skb); + udph = udp_hdr(skb); udph-source = htons(np-local_port); udph-dest = htons(np-remote_port); udph-len = htons(udp_len); diff --git a/net/core/pktgen.c b/net/core/pktgen.c index 6389693..2d49dbc 100644 ---
[PATCH 08/15] [SK_BUFF]: Introduce icmp_hdr(), remove skb->h.icmph
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] --- include/linux/icmp.h |9 + include/linux/skbuff.h |1 - net/dccp/ipv4.c|4 ++-- net/ipv4/ah4.c |4 ++-- net/ipv4/esp4.c|4 ++-- net/ipv4/icmp.c| 14 +++--- net/ipv4/ip_gre.c | 12 ++-- net/ipv4/ip_sockglue.c |6 +++--- net/ipv4/ipcomp.c |4 ++-- net/ipv4/ipip.c| 12 ++-- net/ipv4/raw.c |6 +++--- net/ipv4/tcp_ipv4.c|4 ++-- net/ipv4/udp.c |4 ++-- net/ipv6/sit.c | 12 ++-- net/sctp/input.c |4 ++-- 15 files changed, 54 insertions(+), 46 deletions(-) From 5750acd8b55567ec9e8839f352f60af0f492f509 Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo [EMAIL PROTECTED] Date: Tue, 13 Mar 2007 14:43:18 -0300 Subject: [PATCH 08/15] [SK_BUFF]: Introduce icmp_hdr(), remove skb-h.icmph Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] --- include/linux/icmp.h |9 + include/linux/skbuff.h |1 - net/dccp/ipv4.c|4 ++-- net/ipv4/ah4.c |4 ++-- net/ipv4/esp4.c|4 ++-- net/ipv4/icmp.c| 14 +++--- net/ipv4/ip_gre.c | 12 ++-- net/ipv4/ip_sockglue.c |6 +++--- net/ipv4/ipcomp.c |4 ++-- net/ipv4/ipip.c| 12 ++-- net/ipv4/raw.c |6 +++--- net/ipv4/tcp_ipv4.c|4 ++-- net/ipv4/udp.c |4 ++-- net/ipv6/sit.c | 12 ++-- net/sctp/input.c |4 ++-- 15 files changed, 54 insertions(+), 46 deletions(-) diff --git a/include/linux/icmp.h b/include/linux/icmp.h index 24da4fb..cd3017a 100644 --- a/include/linux/icmp.h +++ b/include/linux/icmp.h @@ -82,6 +82,15 @@ struct icmphdr { } un; }; +#ifdef __KERNEL__ +#include linux/skbuff.h + +static inline struct icmphdr *icmp_hdr(const struct sk_buff *skb) +{ + return (struct icmphdr *)skb-h.raw; +} +#endif + /* * constants for (set|get)sockopt */ diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index a5d1087..eea512a 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -240,7 +240,6 @@ struct sk_buff { union { struct tcphdr *th; - struct icmphdr *icmph; struct iphdr *ipiph; struct ipv6hdr *ipv6h; unsigned char *raw; diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c index b85437d..718f2fa 
100644 --- a/net/dccp/ipv4.c +++ b/net/dccp/ipv4.c @@ -207,8 +207,8 @@ static void dccp_v4_err(struct sk_buff *skb, u32 info) (iph-ihl 2)); struct dccp_sock *dp; struct inet_sock *inet; - const int type = skb-h.icmph-type; - const int code = skb-h.icmph-code; + const int type = icmp_hdr(skb)-type; + const int code = icmp_hdr(skb)-code; struct sock *sk; __u64 seq; int err; diff --git a/net/ipv4/ah4.c b/net/ipv4/ah4.c index ebcc797..e1bb9e0 100644 --- a/net/ipv4/ah4.c +++ b/net/ipv4/ah4.c @@ -198,8 +198,8 @@ static void ah4_err(struct sk_buff *skb, u32 info) struct ip_auth_hdr *ah = (struct ip_auth_hdr*)(skb-data+(iph-ihl2)); struct xfrm_state *x; - if (skb-h.icmph-type != ICMP_DEST_UNREACH || - skb-h.icmph-code != ICMP_FRAG_NEEDED) + if (icmp_hdr(skb)-type != ICMP_DEST_UNREACH || + icmp_hdr(skb)-code != ICMP_FRAG_NEEDED) return; x = xfrm_state_lookup((xfrm_address_t *)iph-daddr, ah-spi, IPPROTO_AH, AF_INET); diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c index 82543ee..de019f9 100644 --- a/net/ipv4/esp4.c +++ b/net/ipv4/esp4.c @@ -304,8 +304,8 @@ static void esp4_err(struct sk_buff *skb, u32 info) struct ip_esp_hdr *esph = (struct ip_esp_hdr*)(skb-data+(iph-ihl2)); struct xfrm_state *x; - if (skb-h.icmph-type != ICMP_DEST_UNREACH || - skb-h.icmph-code != ICMP_FRAG_NEEDED) + if (icmp_hdr(skb)-type != ICMP_DEST_UNREACH || + icmp_hdr(skb)-code != ICMP_FRAG_NEEDED) return; x = xfrm_state_lookup((xfrm_address_t *)iph-daddr, esph-spi, IPPROTO_ESP, AF_INET); diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c index 4d70c21..8372f8b 100644 --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@ -355,7 +355,7 @@ static void icmp_push_reply(struct icmp_bxm *icmp_param, ipc, rt, MSG_DONTWAIT) 0) ip_flush_pending_frames(icmp_socket-sk); else if ((skb = skb_peek(icmp_socket-sk-sk_write_queue)) != NULL) { - struct icmphdr *icmph = skb-h.icmph; + struct icmphdr *icmph = icmp_hdr(skb); __wsum csum = 0; struct sk_buff *skb1; @@ -613,7 +613,7 @@ static void icmp_unreach(struct sk_buff *skb) if 
(!pskb_may_pull(skb, sizeof(struct iphdr))) goto out_err; - icmph = skb-h.icmph; + icmph = icmp_hdr(skb); iph = (struct iphdr *)skb-data; if (iph-ihl 5) /* Mangled header, drop. */ @@ -743,7 +743,7 @@ static void icmp_redirect(struct sk_buff *skb) iph = (struct iphdr *)skb-data; - switch (skb-h.icmph-code 7) { + switch (icmp_hdr(skb)-code 7) { case ICMP_REDIR_NET: case ICMP_REDIR_NETTOS: /* @@ -752,7 +752,7 @@ static void icmp_redirect(struct sk_buff *skb) case ICMP_REDIR_HOST:
[PATCH 10/15] [SK_BUFF]: Introduce tcp_hdr(), remove skb->h.th
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] --- drivers/net/atl1/atl1_main.c |7 --- drivers/net/bnx2.c |8 drivers/net/chelsio/sge.c |2 +- drivers/net/cxgb3/sge.c|2 +- drivers/net/e1000/e1000_main.c | 11 ++- drivers/net/ioc3-eth.c |2 +- drivers/net/ixgb/ixgb_main.c |7 --- drivers/net/mv643xx_eth.c |2 +- drivers/net/tg3.c | 15 +++ drivers/s390/net/qeth_eddp.c |2 +- drivers/s390/net/qeth_tso.h|4 ++-- include/linux/skbuff.h |1 - include/linux/tcp.h|9 +++-- include/net/tcp.h |2 +- include/net/tcp_ecn.h |6 +++--- net/ipv4/ip_output.c |4 ++-- net/ipv4/syncookies.c | 36 ++-- net/ipv4/tcp.c | 22 +++--- net/ipv4/tcp_input.c | 28 +++- net/ipv4/tcp_ipv4.c| 32 net/ipv4/tcp_minisocks.c |9 + net/ipv4/tcp_output.c | 13 - net/ipv6/tcp_ipv6.c| 32 23 files changed, 134 insertions(+), 122 deletions(-) From 084fb923f59f068a3bdb95e7ae772811b7b09793 Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo [EMAIL PROTECTED] Date: Tue, 13 Mar 2007 15:37:34 -0300 Subject: [PATCH 10/15] [SK_BUFF]: Introduce tcp_hdr(), remove skb-h.th Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] --- drivers/net/atl1/atl1_main.c |7 --- drivers/net/bnx2.c |8 drivers/net/chelsio/sge.c |2 +- drivers/net/cxgb3/sge.c|2 +- drivers/net/e1000/e1000_main.c | 11 ++- drivers/net/ioc3-eth.c |2 +- drivers/net/ixgb/ixgb_main.c |7 --- drivers/net/mv643xx_eth.c |2 +- drivers/net/tg3.c | 15 +++ drivers/s390/net/qeth_eddp.c |2 +- drivers/s390/net/qeth_tso.h|4 ++-- include/linux/skbuff.h |1 - include/linux/tcp.h|9 +++-- include/net/tcp.h |2 +- include/net/tcp_ecn.h |6 +++--- net/ipv4/ip_output.c |4 ++-- net/ipv4/syncookies.c | 36 ++-- net/ipv4/tcp.c | 22 +++--- net/ipv4/tcp_input.c | 28 +++- net/ipv4/tcp_ipv4.c| 32 net/ipv4/tcp_minisocks.c |9 + net/ipv4/tcp_output.c | 13 - net/ipv6/tcp_ipv6.c| 32 23 files changed, 134 insertions(+), 122 deletions(-) diff --git a/drivers/net/atl1/atl1_main.c b/drivers/net/atl1/atl1_main.c index 0912d2a..cc131d3 100644 --- a/drivers/net/atl1/atl1_main.c +++ 
b/drivers/net/atl1/atl1_main.c @@ -1298,9 +1298,10 @@ static int atl1_tso(struct atl1_adapter *adapter, struct sk_buff *skb, iph-tot_len = 0; iph-check = 0; - skb-h.th-check = ~csum_tcpudp_magic(iph-saddr, - iph-daddr, 0, - IPPROTO_TCP, 0); + tcp_hdr(skb)-check = ~csum_tcpudp_magic(iph-saddr, + iph-daddr, 0, + IPPROTO_TCP, + 0); ipofst = skb_network_offset(skb); if (ipofst != ENET_HEADER_SIZE) /* 802.3 frame */ tso-tsopl |= 1 TSO_PARAM_ETHTYPE_SHIFT; diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c index f948918..5658b46 100644 --- a/drivers/net/bnx2.c +++ b/drivers/net/bnx2.c @@ -4523,7 +4523,7 @@ bnx2_start_xmit(struct sk_buff *skb, struct net_device *dev) vlan_tag_flags |= TX_BD_FLAGS_SW_LSO; tcp_opt_len = 0; - if (skb-h.th-doff 5) + if (tcp_hdr(skb)-doff 5) tcp_opt_len = tcp_optlen(skb); ip_tcp_len = ip_hdrlen(skb) + sizeof(struct tcphdr); @@ -4531,9 +4531,9 @@ bnx2_start_xmit(struct sk_buff *skb, struct net_device *dev) iph = ip_hdr(skb); iph-check = 0; iph-tot_len = htons(mss + ip_tcp_len + tcp_opt_len); - skb-h.th-check = ~csum_tcpudp_magic(iph-saddr, iph-daddr, - 0, IPPROTO_TCP, 0); - + tcp_hdr(skb)-check = ~csum_tcpudp_magic(iph-saddr, + iph-daddr, 0, + IPPROTO_TCP, 0); if (tcp_opt_len || (iph-ihl 5)) { vlan_tag_flags |= ((iph-ihl - 5) + (tcp_opt_len 2)) 8; diff --git a/drivers/net/chelsio/sge.c b/drivers/net/chelsio/sge.c index a4204df..43e92f9 100644 --- a/drivers/net/chelsio/sge.c +++ b/drivers/net/chelsio/sge.c @@ -1872,7 +1872,7 @@ int t1_start_xmit(struct sk_buff *skb, struct net_device *dev) hdr-opcode = CPL_TX_PKT_LSO; hdr-ip_csum_dis = hdr-l4_csum_dis = 0; hdr-ip_hdr_words = ip_hdr(skb)-ihl; - hdr-tcp_hdr_words = skb-h.th-doff; + hdr-tcp_hdr_words = tcp_hdr(skb)-doff; hdr-eth_type_mss = htons(MK_ETH_TYPE_MSS(eth_type, skb_shinfo(skb)-gso_size)); hdr-len = htonl(skb-len - sizeof(*hdr)); diff --git a/drivers/net/cxgb3/sge.c
[PATCH 12/15] [SK_BUFF]: Introduce ipipv6_hdr(), remove skb->h.ipv6h
From bb1c2a7d91b74b6d7952dda388bbafc2341f563f Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
Date: Tue, 13 Mar 2007 16:15:37 -0300
Subject: [PATCH 12/15] [SK_BUFF]: Introduce ipipv6_hdr(), remove skb->h.ipv6h

Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 include/linux/ipv6.h         |    5 +
 include/linux/skbuff.h       |    1 -
 net/ipv6/xfrm6_mode_beet.c   |    4 ++--
 net/ipv6/xfrm6_mode_tunnel.c |    8
 4 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 096dcd2..df48c96 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -225,6 +225,11 @@ static inline struct ipv6hdr *ipv6_hdr(const struct sk_buff *skb)
 	return (struct ipv6hdr *)skb_network_header(skb);
 }
 
+static inline struct ipv6hdr *ipipv6_hdr(const struct sk_buff *skb)
+{
+	return (struct ipv6hdr *)skb->h.raw;
+}
+
 /*
    This structure contains results of exthdrs parsing
    as offsets from skb->nh.
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 2b1e188..f69a06d 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -239,7 +239,6 @@ struct sk_buff {
 	struct net_device	*input_dev;
 
 	union {
-		struct ipv6hdr	*ipv6h;
 		unsigned char	*raw;
 	} h;
 
diff --git a/net/ipv6/xfrm6_mode_beet.c b/net/ipv6/xfrm6_mode_beet.c
index abac094..0cc96ec 100644
--- a/net/ipv6/xfrm6_mode_beet.c
+++ b/net/ipv6/xfrm6_mode_beet.c
@@ -47,8 +47,8 @@ static int xfrm6_beet_output(struct xfrm_state *x, struct sk_buff *skb)
 	skb_reset_network_header(skb);
 	top_iph = ipv6_hdr(skb);
 
-	skb->nh.raw = &top_iph->nexthdr;
-	skb->h.ipv6h = top_iph + 1;
+	skb->h.raw = skb->nh.raw + sizeof(struct ipv6hdr);
+	skb->nh.raw += offsetof(struct ipv6hdr, nexthdr);
 
 	ipv6_addr_copy(&top_iph->saddr, (struct in6_addr *)&x->props.saddr);
 	ipv6_addr_copy(&top_iph->daddr, (struct in6_addr *)&x->id.daddr);
diff --git a/net/ipv6/xfrm6_mode_tunnel.c b/net/ipv6/xfrm6_mode_tunnel.c
index 9d3bd33..21d65df 100644
--- a/net/ipv6/xfrm6_mode_tunnel.c
+++ b/net/ipv6/xfrm6_mode_tunnel.c
@@ -19,7 +19,7 @@
 static inline void ipip6_ecn_decapsulate(struct sk_buff *skb)
 {
 	struct ipv6hdr *outer_iph = ipv6_hdr(skb);
-	struct ipv6hdr *inner_iph = skb->h.ipv6h;
+	struct ipv6hdr *inner_iph = ipipv6_hdr(skb);
 
 	if (INET_ECN_is_ce(ipv6_get_dsfield(outer_iph)))
 		IP6_ECN_set_ce(inner_iph);
@@ -55,8 +55,8 @@ static int xfrm6_tunnel_output(struct xfrm_state *x, struct sk_buff *skb)
 	skb_reset_network_header(skb);
 	top_iph = ipv6_hdr(skb);
 
-	skb->nh.raw = &top_iph->nexthdr;
-	skb->h.ipv6h = top_iph + 1;
+	skb->h.raw = skb->nh.raw + sizeof(struct ipv6hdr);
+	skb->nh.raw += offsetof(struct ipv6hdr, nexthdr);
 
 	top_iph->version = 6;
 	if (xdst->route->ops->family == AF_INET6) {
@@ -102,7 +102,7 @@ static int xfrm6_tunnel_input(struct xfrm_state *x, struct sk_buff *skb)
 	nh = skb_network_header(skb);
 	if (nh[IP6CB(skb)->nhoff] == IPPROTO_IPV6) {
 		if (x->props.flags & XFRM_STATE_DECAP_DSCP)
-			ipv6_copy_dscp(ipv6_hdr(skb), skb->h.ipv6h);
+			ipv6_copy_dscp(ipv6_hdr(skb), ipipv6_hdr(skb));
 		if (!(x->props.flags & XFRM_STATE_NOECN))
 			ipip6_ecn_decapsulate(skb);
 	} else {
-- 
1.5.0.2
[PATCH 13/15] [SK_BUFF]: More skb_reset_transport_header conversions
From 7684a24cb0328e519298a3674e08db4bca6861c3 Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
Date: Tue, 13 Mar 2007 17:10:43 -0300
Subject: [PATCH 13/15] [SK_BUFF]: More skb_reset_transport_header conversions

These are a bit more subtle, they are of this type:

-	skb->h.raw = payload;
 	__skb_pull(skb, payload - skb->data);
+	skb_reset_transport_header(skb);

__skb_pull results in:

	skb->data = skb->data + payload - skb->data;
	skb->data = payload;

So after __skb_pull we have skb->data pointing to payload and we can just call skb_reset_transport_header(skb), that will do:

	skb->h.raw = payload;

The others are similar, allowing us to get rid of some more cases where a pointer was being attributed to the layer headers.

Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/ipv4/ip_sockglue.c |   12 +++-
 net/ipv6/datagram.c    |    4 ++--
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index ccdc59d..fcb35cd 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -278,10 +278,12 @@ void ip_icmp_error(struct sock *sk, struct sk_buff *skb, int err,
 				  skb_network_header(skb);
 	serr->port = port;
 
-	skb->h.raw = payload;
-	if (!skb_pull(skb, payload - skb->data) ||
-	    sock_queue_err_skb(sk, skb))
-		kfree_skb(skb);
+	if (skb_pull(skb, payload - skb->data) != NULL) {
+		skb_reset_transport_header(skb);
+		if (sock_queue_err_skb(sk, skb) == 0)
+			return;
+	}
+	kfree_skb(skb);
 }
 
 void ip_local_error(struct sock *sk, int err, __be32 daddr, __be16 port, u32 info)
@@ -314,8 +316,8 @@ void ip_local_error(struct sock *sk, int err, __be32 daddr, __be16 port, u32 inf
 	serr->addr_offset = (u8 *)&iph->daddr - skb_network_header(skb);
 	serr->port = port;
 
-	skb->h.raw = skb->tail;
 	__skb_pull(skb, skb->tail - skb->data);
+	skb_reset_transport_header(skb);
 
 	if (sock_queue_err_skb(sk, skb))
 		kfree_skb(skb);
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index feba6b1..f16f4f0 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -231,8 +231,8 @@ void ipv6_icmp_error(struct sock *sk, struct sk_buff *skb, int err,
 				  skb_network_header(skb);
 	serr->port = port;
 
-	skb->h.raw = payload;
 	__skb_pull(skb, payload - skb->data);
+	skb_reset_transport_header(skb);
 
 	if (sock_queue_err_skb(sk, skb))
 		kfree_skb(skb);
@@ -268,8 +268,8 @@ void ipv6_local_error(struct sock *sk, int err, struct flowi *fl, u32 info)
 	serr->addr_offset = (u8 *)&iph->daddr - skb_network_header(skb);
 	serr->port = fl->fl_ip_dport;
 
-	skb->h.raw = skb->tail;
 	__skb_pull(skb, skb->tail - skb->data);
+	skb_reset_transport_header(skb);
 
 	if (sock_queue_err_skb(sk, skb))
 		kfree_skb(skb);
-- 
1.5.0.2
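The equivalence argued in the commit message of patch 13 can be checked outside the kernel. What follows is a minimal userspace sketch, not kernel code: struct mock_skb and the mock_* helpers are hypothetical stand-ins that only model the data pointer, the length and the transport header pointer. It assumes, as the commit message states, that a pull advances the data pointer by the given number of bytes and that resetting the transport header copies the data pointer into it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-reduced stand-in for struct sk_buff (not the kernel layout). */
struct mock_skb {
	unsigned char *data;	/* first byte not yet consumed */
	unsigned int len;	/* bytes available from data onwards */
	unsigned char *h_raw;	/* transport header, skb->h.raw in the patch */
};

/* Advance data by n bytes, as the commit message describes for __skb_pull(). */
static unsigned char *mock_skb_pull(struct mock_skb *skb, unsigned int n)
{
	assert(n <= skb->len);
	skb->len -= n;
	skb->data += n;
	return skb->data;
}

/* Point the transport header at the current data pointer. */
static void mock_skb_reset_transport_header(struct mock_skb *skb)
{
	skb->h_raw = skb->data;
}

/* Old ordering: assign the header pointer directly, then pull. */
static unsigned char *old_style(struct mock_skb *skb, unsigned char *payload)
{
	skb->h_raw = payload;
	mock_skb_pull(skb, (unsigned int)(payload - skb->data));
	return skb->h_raw;
}

/* New ordering: pull first, then derive the header from data. */
static unsigned char *new_style(struct mock_skb *skb, unsigned char *payload)
{
	mock_skb_pull(skb, (unsigned int)(payload - skb->data));
	mock_skb_reset_transport_header(skb);
	return skb->h_raw;
}
```

Run on two identical mock buffers, both orderings should leave the header pointer at payload; the new one simply never stores a raw pointer chosen by the caller, which is the point of the conversion.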
[PATCH 14/15] [SCTP]: Eliminate some pointer attributions to the skb layer headers
From b9c0a34313240c6f74bcc5587a496493480daf7d Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
Date: Tue, 13 Mar 2007 17:17:10 -0300
Subject: [PATCH 14/15] [SCTP]: Eliminate some pointer attributions to the skb layer headers

Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/sctp/input.c |    8
 net/sctp/ipv6.c  |    5 ++---
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/net/sctp/input.c b/net/sctp/input.c
index 40d0df8..f38e91b 100644
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -506,7 +506,7 @@ void sctp_err_finish(struct sock *sk, struct sctp_association *asoc)
 void sctp_v4_err(struct sk_buff *skb, __u32 info)
 {
 	struct iphdr *iph = (struct iphdr *)skb->data;
-	struct sctphdr *sh = (struct sctphdr *)(skb->data + (iph->ihl << 2));
+	const int ihlen = iph->ihl * 4;
 	const int type = icmp_hdr(skb)->type;
 	const int code = icmp_hdr(skb)->code;
 	struct sock *sk;
@@ -516,7 +516,7 @@ void sctp_v4_err(struct sk_buff *skb, __u32 info)
 	char *saveip, *savesctp;
 	int err;
 
-	if (skb->len < ((iph->ihl << 2) + 8)) {
+	if (skb->len < ihlen + 8) {
 		ICMP_INC_STATS_BH(ICMP_MIB_INERRORS);
 		return;
 	}
@@ -525,8 +525,8 @@ void sctp_v4_err(struct sk_buff *skb, __u32 info)
 	saveip = skb->nh.raw;
 	savesctp = skb->h.raw;
 	skb_reset_network_header(skb);
-	skb->h.raw = (char *)sh;
-	sk = sctp_err_lookup(AF_INET, skb, sh, &asoc, &transport);
+	skb_set_transport_header(skb, ihlen);
+	sk = sctp_err_lookup(AF_INET, skb, sctp_hdr(skb), &asoc, &transport);
 	/* Put back, the original pointers. */
 	skb->nh.raw = saveip;
 	skb->h.raw = savesctp;
diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index dff72e0..6cad0f4 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -122,7 +122,6 @@ SCTP_STATIC void sctp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 			     int type, int code, int offset, __be32 info)
 {
 	struct inet6_dev *idev;
-	struct sctphdr *sh = (struct sctphdr *)(skb->data + offset);
 	struct sock *sk;
 	struct sctp_association *asoc;
 	struct sctp_transport *transport;
@@ -136,8 +135,8 @@ SCTP_STATIC void sctp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 	saveip = skb->nh.raw;
 	savesctp = skb->h.raw;
 	skb_reset_network_header(skb);
-	skb->h.raw = (char *)sh;
-	sk = sctp_err_lookup(AF_INET6, skb, sh, &asoc, &transport);
+	skb_set_transport_header(skb, offset);
+	sk = sctp_err_lookup(AF_INET6, skb, sctp_hdr(skb), &asoc, &transport);
 	/* Put back, the original pointers. */
 	skb->nh.raw = saveip;
 	skb->h.raw = savesctp;
-- 
1.5.0.2
[PATCH 15/15] [SK_BUFF]: Introduce skb_transport_header(skb)
From eab29f0961397cd4e0f54b9448c7f7a4be39b3e9 Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
Date: Tue, 13 Mar 2007 17:20:39 -0300
Subject: [PATCH 15/15] [SK_BUFF]: Introduce skb_transport_header(skb)

For the places where we need a pointer to the transport header, it is still legal to touch skb->h.raw directly if just adding to, subtracting from or setting it to another layer header.

Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 drivers/net/appletalk/ltpc.c    |    7 +--
 drivers/net/cxgb3/sge.c         |    8 +---
 drivers/s390/net/qeth_eddp.c    |    4 ++--
 include/linux/atalk.h           |    4 ++--
 include/linux/dccp.h            |   19 ---
 include/linux/icmp.h            |    2 +-
 include/linux/icmpv6.h          |    2 +-
 include/linux/igmp.h            |    6 +++---
 include/linux/ip.h              |    2 +-
 include/linux/ipv6.h            |    2 +-
 include/linux/sctp.h            |    2 +-
 include/linux/skbuff.h          |    5 +
 include/linux/tcp.h             |    2 +-
 include/linux/udp.h             |    2 +-
 include/net/ipx.h               |    2 +-
 include/net/pkt_cls.h           |    2 +-
 include/net/udp.h               |    4 ++--
 net/802/psnap.c                 |    2 +-
 net/ax25/af_ax25.c              |    5 +++--
 net/bluetooth/hci_core.c        |    4 ++--
 net/core/dev.c                  |    6 +++---
 net/econet/af_econet.c          |    2 +-
 net/ipv4/igmp.c                 |    2 +-
 net/ipv4/ip_gre.c               |    2 +-
 net/ipv4/ip_output.c            |    6 --
 net/ipv4/ipconfig.c             |    4 ++--
 net/ipv4/ipmr.c                 |    8 +---
 net/ipv4/tcp.c                  |   12 +++-
 net/ipv4/tcp_input.c            |   13 +++--
 net/ipv4/xfrm4_mode_beet.c      |    2 +-
 net/ipv4/xfrm4_mode_transport.c |    5 +++--
 net/ipv6/ah6.c                  |    2 +-
 net/ipv6/esp6.c                 |    2 +-
 net/ipv6/exthdrs.c              |   21 ++---
 net/ipv6/ipcomp6.c              |    2 +-
 net/ipv6/mcast.c                |   16 +---
 net/ipv6/mip6.c                 |    8
 net/ipv6/ndisc.c                |   17 +
 net/ipv6/raw.c                  |    2 +-
 net/ipv6/reassembly.c           |    2 +-
 net/ipv6/xfrm6_mode_transport.c |    5 +++--
 net/xfrm/xfrm_input.c           |    6 +++---
 42 files changed, 129 insertions(+), 102 deletions(-)

diff --git a/drivers/net/appletalk/ltpc.c b/drivers/net/appletalk/ltpc.c
index dc3bce9..43c17c8 100644
--- a/drivers/net/appletalk/ltpc.c
+++ b/drivers/net/appletalk/ltpc.c
@@ -917,6 +917,7 @@ static int ltpc_xmit(struct sk_buff *skb, struct net_device *dev)
 	int i;
 	struct lt_sendlap cbuf;
+
Re: [PATCH 09/15] [TCP]: Introduce tcp_hdrlen() and tcp_optlen()
On 3/13/07, Arnaldo Carvalho de Melo [EMAIL PROTECTED] wrote:
> Introduce tcp_hdrlen() and tcp_optlen():
>
> The ip_hdrlen() buddy, created to reduce the number of skb->h.th-> uses
> and to avoid the longer, open coded equivalent.
>
> +static inline unsigned int tcp_hdrlen(const struct sk_buff *skb)
> +{
> +	return skb->h.th->doff * 4;
> +}
> +
> +static inline unsigned int tcp_optlen(const struct sk_buff *skb)
> +{
> +	return (skb->h.th->doff - 5) * 4;
> +}

acme, good stuff, but does the * 4 generate equivalent assembly with gcc 3/4 as << 2 ? I could assume that the compiler would be smart enough, but every time I assume I know what the compiler is doing I get myself in trouble.
Re: [PATCH 09/15] [TCP]: Introduce tcp_hdrlen() and tcp_optlen()
On 3/13/07, Jesse Brandeburg [EMAIL PROTECTED] wrote:
> acme, good stuff, but does the * 4 generate equivalent assembly with
> gcc 3/4 as << 2 ? I could assume that the compiler would be smart
> enough, but every time I assume I know what the compiler is doing I get
> myself in trouble.

nevermind, I wrote a program myself to test it (which I should have done first). with x86-64 gcc 3.4.6 or 4.1.0 it always comes out to shl 2, %eax etc, sorry for the noise.
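The test program mentioned above was not posted. As a rough idea of what such a check could look like, here is a small sketch; all four helper names are hypothetical, and doff stands in for the 4-bit TCP data offset field. It only confirms that the multiply and shift forms return the same values for every possible doff; it says nothing about the instructions a given compiler emits.

```c
#include <assert.h>

/* Hypothetical helpers: doff models the 4-bit TCP data offset (0..15). */
static unsigned int hdrlen_mul(unsigned int doff)   { return doff * 4; }
static unsigned int hdrlen_shift(unsigned int doff) { return doff << 2; }
static unsigned int optlen_mul(unsigned int doff)   { return (doff - 5) * 4; }
static unsigned int optlen_shift(unsigned int doff) { return (doff - 5) << 2; }

/* Count the doff values on which the two spellings disagree. */
static unsigned int count_disagreements(void)
{
	unsigned int doff, bad = 0;

	for (doff = 0; doff < 16; doff++) {
		if (hdrlen_mul(doff) != hdrlen_shift(doff))
			bad++;
		/* doff < 5 wraps around in unsigned arithmetic in both forms alike */
		if (optlen_mul(doff) != optlen_shift(doff))
			bad++;
	}
	return bad;
}

/* Self-check: abort if the forms ever differ in value. */
static void self_check(void)
{
	assert(count_disagreements() == 0);
}
```

To compare the generated code rather than the values, one option is to compile a file like this with gcc -O2 -S and read the assembly for the two hdrlen functions side by side.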
Re: [PATCH] tc35815: Fix an usage of streaming DMA API.
On Tue, 13 Mar 2007 12:04:18 -0700, Stephen Hemminger [EMAIL PROTECTED] wrote:
> > + * 1.35	Fix an usage of streaming DMA API.
> >   */
>
> Please don't use comments as changelog anymore. It gets out of date.
> The use of change control systems has made this practice obsolete.

OK, Jeff, should I send a revised patch dropping this line?

---
Atsushi Nemoto
[2/6] 2.6.21-rc3: known regressions
This email lists some known regressions in Linus' tree compared to 2.6.20.

If you find your name in the Cc header, you are either the submitter of one of the bugs, the maintainer of an affected subsystem or driver, the author of a patch that caused a breakage, or someone I consider possibly involved with one or more of these issues in some other way.

Due to the huge number of recipients, please trim the Cc when answering.

Subject    : ipv6 crash
References : http://lkml.org/lkml/2007/3/10/2
Submitter  : Len Brown [EMAIL PROTECTED]
Status     : unknown

Subject    : ThinkPad X60: bluetooth hardlocks
References : http://lkml.org/lkml/2007/3/2/85
Submitter  : Pavel Machek [EMAIL PROTECTED]
Handled-By : Marcel Holtmann [EMAIL PROTECTED]
Status     : unknown

Subject    : forcedeth: skb_over_panic
References : http://bugzilla.kernel.org/show_bug.cgi?id=8058
Submitter  : Albert Hopkins [EMAIL PROTECTED]
Handled-By : Ayaz Abdulla [EMAIL PROTECTED]
Status     : problem is being debugged
Re: [PATCH 1/2] avoid OPEN_MAX in SCM_MAX_FD
On Tue, Mar 13, 2007 at 01:39:12AM -0700, Roland McGrath wrote:
> The OPEN_MAX constant is an arbitrary number with no useful relation to
> anything. Nothing should be using it. This patch changes SCM_MAX_FD to
> use NR_OPEN instead of OPEN_MAX. This increases the size of the struct
> scm_fp_list type fourfold, to make it big enough to contain as many file
> descriptors as could be asked of it. This size increase may not be very
> worthwhile, but at any rate if an arbitrary limit unrelated to anything
> else is being defined it should be done explicitly here with:
>
> -#define SCM_MAX_FD	(OPEN_MAX-1)
> +#define SCM_MAX_FD	(NR_OPEN-1)

This is a bad idea. From <linux/fs.h>:

#undef NR_OPEN
#define NR_OPEN (1024*1024)	/* Absolute upper limit on fd num */

There isn't anything I can see guaranteeing that net/scm.h is included before fs.h. This affects networking and should really be Cc'd to netdev@vger.kernel.org, which will raise the issue that if SCM_MAX_FD is raised, the resulting simple kmalloc() must be changed. That said, I doubt SCM_MAX_FD really needs to be raised, as applications using many file descriptors are unlikely to try to send their entire file table to another process in one go -- they have to handle the limits imposed by SCM_MAX_FD anyways.

		-ben
-- 
Time is of no importance, Mr. President, only life is important.
Don't Email: [EMAIL PROTECTED].
Re: [patch 4/4] [TULIP] Rev tulip version
On Mon, Mar 12, 2007 at 10:07:33AM -0400, Jeff Garzik wrote:
> Pekka Enberg wrote:
> > Hi,
> >
> > On 3/12/07, Valerie Henson [EMAIL PROTECTED] wrote:
> > > --- tulip-2.6-mm-linux.orig/drivers/net/tulip/tulip_core.c
> > > +++ tulip-2.6-mm-linux/drivers/net/tulip/tulip_core.c
> > > @@ -17,11 +17,11 @@
> > >  #define DRV_NAME	"tulip"
> > >  #ifdef CONFIG_TULIP_NAPI
> > > -#define DRV_VERSION	"1.1.14-NAPI" /* Keep at least for test */
> > > +#define DRV_VERSION	"1.1.15-NAPI" /* Keep at least for test */
> > >  #else
> > > -#define DRV_VERSION	"1.1.14"
> > > +#define DRV_VERSION	"1.1.15"
> > >  #endif
> > > -#define DRV_RELDATE	"May 11, 2002"
> > > +#define DRV_RELDATE	"Feb 27, 2007"
> >
> > Why not just drop this? What purpose does a per-module revision have
> > for in-kernel drivers anyway?
>
> It's the maintainer's call. Sometimes it eases parsing bug reports, and
> tracking changes as your drivers get backported to various enterprise
> operating systems(tm). Sometimes it just gets in the way.

It's good to keep this type of information in drivers. I've been thinking lately that it would be nice to even expand it a little bit (maybe include the commit sum) so it's easier to help those who aren't running the latest upstream kernels on their boxes.
Re: [PATCH 1/2] avoid OPEN_MAX in SCM_MAX_FD
> > -#define SCM_MAX_FD	(OPEN_MAX-1)
> > +#define SCM_MAX_FD	(NR_OPEN-1)
>
> This is a bad idea. [...]

Ok. My only agenda is to get rid of OPEN_MAX. I then propose the following instead.

Thanks,
Roland

---
[PATCH] avoid OPEN_MAX in SCM_MAX_FD

The OPEN_MAX constant is an arbitrary number with no useful relation to anything. Nothing should be using it. SCM_MAX_FD is just an arbitrary constant and it should be clear that its value is chosen in net/scm.h and not actually derived from anything else meaningful in the system.

Signed-off-by: Roland McGrath [EMAIL PROTECTED]
---
 include/net/scm.h |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/include/net/scm.h b/include/net/scm.h
index 5637d5e..2240690 100644
--- a/include/net/scm.h
+++ b/include/net/scm.h
@@ -8,7 +8,7 @@
 /* Well, we should have at least one descriptor open
  * to accept passed FDs 8)
  */
-#define SCM_MAX_FD	(OPEN_MAX-1)
+#define SCM_MAX_FD	255
 
 struct scm_fp_list
 {
Re: [PATCH] tcp_cubic: use 32 bit math
Hi Stephen,

On Mon, Mar 12, 2007 at 02:11:56PM -0700, Stephen Hemminger wrote:
> > Oh BTW, I have a newer version with a first approximation of the
> > cbrt() before the div64_64, which allows us to reduce from 3 div64 to
> > only 2 div64. This results in a version which is twice as fast as the
> > initial one (ncubic), but with slightly less accuracy (0.286% compared
> > to 0.247%). But I see that other functions such as hcbrt() had a 1.5%
> > avg error, so I think this is not dramatic.
>
> Ignore my hcbrt() it was a less accurate version of andi's stuff.

OK.

> > Also, I managed to remove all other divides, to be kind with CPUs
> > having a slow divide instruction or no divide at all. Since we compute
> > on limited range (22 bits), we can multiply then shift right. It shows
> > me even slightly better time on pentium-m and athlon, with a slightly
> > higher avg error (0.297% compared to 0.286%), and slightly smaller
> > code.
>
> What does the code look like?

Well, I have cleaned it a little bit, there were more comments and ifdefs than code! I've appended it to the end of this mail.

I have changed it a bit, because I noticed that integer divide precision was so coarse that there were other possibilities to play with the bits. I have experimented with combinations of several methods:

  - replace integer divides with multiplies/shifts where possible.

  - compensate for divide imprecision by adding/removing small values before/after the divides. Often, the integer result of 1/(x*(x-1)) is closer to (float)1/(float)x^2 than 1/(x*x). This is because the divide always truncates the result.

  - use direct result lookup for small values. Small inputs give small outputs which have very few moving bits. Many different values fit in a 32bit integer, so we use a shift offset to look up the value. I used this in an fls function I wrote a while ago, that I should also post because it is up to twice as fast as the kernel's. Sometimes it seems faster to look the value up from memory, sometimes it is faster from an immediate value. Maybe more visible differences would show up on RISC CPUs where loading a 32 bit immediate needs two instructions. I don't know yet, I've not tested on my sparc yet.

  - use small lookup tables (64 bytes) with 6 bits of input and at least as many on output. We only look up the 6 MSB and return the 2-3 MSB of the result.

  - iterative search and manual refinement of the lookup tables for best accuracy. The avg error rate can easily be halved this way.

I have tried several functions with 0, 1, 2 and 3 divides. Several of them offer better accuracy than what we currently have, in fewer cycles. Others offer faster results (up to 5 times) with slightly less accuracy.

There is one function which is not to be used, but is just here for comparison (ncubic_0div). It does no divide but has an awful avg error.

But one which is interesting is ncubic_tab0. It does not use any divide at all, not even a div64. It shows a 0.6% avg error, which I'm not sure is enough or not. It is 6.7 times faster than the initial ncubic() with less accuracy, and 4 times smaller. I suspect that it can differ more on architectures which have no divide instruction.

If a 0.6% avg error rate is too much, ncubic_tab1() uses one single div64 and is twice as slow (still nearly 3 times faster than ncubic). It shows a 0.195% avg error, which is better than the initial ncubic. I think that it is a good tradeoff.

If best accuracy is an absolute requirement, then I have a variation of ncubic (ncubic_3div) which does 0.17% in 2/3 of the time (compared to 0.247%), and which is slightly smaller.

I have also added a size column, indicating the approximate function size, provided that the compiler does not reorder the code. On gcc 3.4, it's OK, but 4.1 returns garbage. That does not matter, it's just a rough estimate anyway.
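The "multiply then shift right" idea from the list above can be illustrated in isolation. The sketch below is not taken from the code attached to this mail; it is a generic example under stated assumptions: the divisor 3, the constant 0xAAAAAAAB and the shift count 33 are chosen purely for illustration, and both function names are hypothetical. It replaces x / 3 by a 64-bit multiply and a shift, then checks the result exhaustively over a limited input range such as the 22 bits mentioned earlier.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Divide by 3 without a divide instruction: 0xAAAAAAAB is (2^33 + 1) / 3,
 * so multiplying by it and dropping the low 33 bits truncates like x / 3.
 */
static uint32_t div3_mulshift(uint32_t x)
{
	return (uint32_t)(((uint64_t)x * 0xAAAAAAABu) >> 33);
}

/* Count inputs below 2^bits where the shortcut differs from a real divide. */
static uint32_t count_mismatches(unsigned int bits)
{
	uint32_t x, limit, bad = 0;

	assert(bits <= 22);	/* keep the exhaustive scan short */
	limit = (uint32_t)1 << bits;
	for (x = 0; x < limit; x++)
		if (div3_mulshift(x) != x / 3)
			bad++;
	return bad;
}
```

The exhaustive scan is the useful part of the pattern: on a range this small, every input can be compared against the real divide before the shortcut is trusted.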
Here are the results classed by speed :

/* Sample output on a Pentium-M 600 MHz :

Function      clocks  mean(us)  max(us)  std(us)  Avg err  size
ncubic_tab0       79      0.66     7.20     1.04   0.613%   160
ncubic_0div       84      0.70     7.64     1.57   4.521%   192
ncubic_1div      178      1.48    16.27     1.81   0.443%   336
ncubic_tab1      179      1.49    16.34     1.85   0.195%   320
ncubic_ndiv3     263      2.18    24.04     3.59   0.250%   512
ncubic_2div      270      2.24    24.70     2.77   0.187%   512
ncubic32_1       359      2.98    32.81     3.59   0.238%   544
ncubic_3div      361      2.99    33.08     3.79   0.170%   656
ncubic32         364      3.02    33.29     3.51   0.247%   544
ncubic           529      4.39    48.39     4.92   0.247%   720
hcbrt            539      4.47    49.25     5.98   1.580%    96
ocubic           732      4.93    61.83     7.22   0.274%   320
acbrt            842      6.98    76.73     8.55   0.275%   192
bictcp          1032      6.95    86.30     9.04   0.172%   768

And now by avg error :

ncubic_3div
Re: [PATCH 1/2] avoid OPEN_MAX in SCM_MAX_FD
On Tue, 13 Mar 2007, Roland McGrath wrote: The OPEN_MAX constant is an arbitrary number with no useful relation to anything. Nothing should be using it. SCM_MAX_FD is just an arbitrary constant and it should be clear that its value is chosen in net/scm.h and not actually derived from anything else meaningful in the system. I'd actually prefer this as part of the remove OPEN_MAX patch. It's certainly nice to have small independent patches in a series, but two one-liners that really aren't all that independent either in practice or in goals doesn't make much sense to me. Much better to just be up-front about things and say: remove OPEN_MAX, and to do so, just rewrite that other arbitrary constant to not need it any more. That said, it actually worries me that you should call _SC_OPEN_MAX. I think the whole POSIX config method is way over-designed (anybody who thinks you should ever have used _SC_HZ or whatever it was called was just crazy), but more importantly, and independently of that worry, I just suspect a lot of programs simply _don't_do_it_. For example, I know perfectly well that I should use _SC_PATH_MAX, but a *lot* of code simply doesn't care. In git, I used PATH_MAX, and the reason is that - I want a constant for arrays - I don't care that much about the exact value, I just want a reasonable value for sizing an array for some random path - _SC_PATH_MAX is practically unportable and simply not *useful*. .. in short, I'm not a big believer in programs should do Xyz according to some paper standard. Paper standards are written by committees, not programmers, and seldom take issues other than politics into account. So, what's the likelihood that this will break some old programs? I realize that modern distributions don't put the kernel headers in their user-visible includes any more, but the breakage is most likely exactly for old programs and older distributions. 
Linus
Re: [PATCH 1/2] avoid OPEN_MAX in SCM_MAX_FD
> I'd actually prefer this as part of the "remove OPEN_MAX" patch.

Ok. (But now you're going to argue with me about "remove OPEN_MAX", and you haven't said you have any problem with changing SCM_MAX_FD, so why make it wait?)

> That said, it actually worries me that you should call _SC_OPEN_MAX.
> [...]
> For example, I know perfectly well that I should use _SC_PATH_MAX, but a
> *lot* of code simply doesn't care. In git, I used PATH_MAX, and the
> reason [...]

Ok, fine. But PATH_MAX is a real constant that has some meaning in the kernel. It's perfectly correct to use PATH_MAX as a constant on a system like Linux that defines it and means what it says. Conversely, OPEN_MAX has no useful relationship with anything the kernel is doing at all.

> So, what's the likelihood that this will break some old programs? I
> realize that modern distributions don't put the kernel headers in their
> user-visible includes any more, but the breakage is most likely exactly
> for old programs and older distributions.

Well, I don't know for sure. It doesn't seem all that likely to me (not like PATH_MAX), as there has been getdtablesize() since before there was OPEN_MAX by that name (not to mention before there was Linux). If things use OPEN_MAX as a constant for arrays, they're already broken unless they call setrlimit to constrain themselves. Getting things fixed has to start somewhere.

Thanks,
Roland
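The point about setrlimit can be sketched as follows. This is an illustration under stated assumptions, not code from any program discussed in the thread: `fd_soft_limit` and `constrain_fd_limit` are hypothetical helper names. The first reads the current soft limit on open descriptors at run time; the second lowers that limit so that a table with a fixed number of entries can never be indexed out of bounds by a descriptor number.

```c
#include <assert.h>
#include <sys/resource.h>  /* getrlimit, setrlimit, RLIMIT_NOFILE */
#include <unistd.h>

/* Hypothetical helper: current soft limit on open file descriptors,
 * or -1 if it cannot be read or is unlimited. */
long fd_soft_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return -1;              /* no fixed bound to report */
    return (long)rl.rlim_cur;
}

/* Hypothetical helper: make a fixed-size table of `cap` entries safe by
 * lowering the soft limit to `cap` when it is currently higher.
 * Only ever lowers the limit, so no privilege is required. */
int constrain_fd_limit(rlim_t cap)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (rl.rlim_cur > cap) {
        rl.rlim_cur = cap;
        return setrlimit(RLIMIT_NOFILE, &rl);
    }
    return 0;
}
```

A program that declares `struct state table[256];` indexed by descriptor would call `constrain_fd_limit(256)` once at startup; a program that instead sizes its table from `fd_soft_limit()` needs no compile-time constant at all.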
Re: [PATCH 1/2] avoid OPEN_MAX in SCM_MAX_FD
On Tue, 13 Mar 2007, Roland McGrath wrote:

> Ok, fine. But PATH_MAX is a real constant that has some meaning in the
> kernel. It's perfectly correct to use PATH_MAX as a constant on a system
> like Linux that defines it and means what it says. Conversely, OPEN_MAX
> has no useful relationship with anything the kernel is doing at all.

Sure. I'm just saying that some people may use OPEN_MAX the way I know people use PATH_MAX - whether it's what you're supposed to or not.

I do agree that PATH_MAX is much more appropriate to be used that way, and is more likely to have real meaning, I just worry.

Linus
[ANNOUNCE] iproute2 2.6.20-070313
This is an experimental update to the iproute2 command set. The version number includes the kernel version to denote what features are supported. The same source should build on older systems, but obviously the newer kernel features won't be available. As much as possible, this package tries to be source compatible across releases.

It can be downloaded from:
  http://developer.osdl.org/dev/iproute2/download/iproute2-2.6.20-070313.tar.gz

Repository:
  git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git

For more info on iproute2 see:
  http://linux-net.osdl.org/index.php/Iproute2

Changes:

Jamal Hadi Salim:
  update rest to use nl_mgrp
  nl_mgrp to crap if base multicast groups exceeded
  Old bug on tc

Mike Frysinger:
  do not ignore build failures in subdirs of iproute2

Noriaki TAKAMIYA:
  enabled to manipulate the flags of IFA_F_HOMEADDRESS or IFA_F_NODAD from ip.

Patrick McHardy:
  tbf: fix latency printing
  Use tc_calc_xmittime() where appropriate
  Introduce tc_calc_xmitsize and use where appropriate
  Introduce TIME_UNITS_PER_SEC to represent internal clock resolution
  Replace usec by time in function names
  Add sprint_ticks() function and use in CBQ
  Handle different kernel clock resolutions
  Increase internal clock resolution to nsec

Stephen Hemminger:
  netem use read/write for changes
  fix tc-pfifo and tc-bfifo man pages
  iptables library fix
  TC bfifo man page
  Use kernel headers from 2.6.20.y

Thomas Hisch:
  Fixes use of uninitialized string
Re: 2.6.21-rc3-git4 ata1.00: qc timeout (cmd 0xef) (crashdump kernel)
On 12/03/07, Tejun Heo [EMAIL PROTECTED] wrote:
> Stephen Hemminger wrote:
>> On Tue, 13 Mar 2007 04:03:00 +0900 Tejun Heo [EMAIL PROTECTED] wrote:
>>> Stephen Hemminger wrote:
>>>>> 1. the controller has IRQ stuck high (infrequent but possible)
>>>>> 2. the IRQ is already requested by another device
>>>>> 3. the IRQ gets disabled due to screaming interrupts at the moment
>>>>>    ata_piix does pci_enable_device().
>>>>>
>>>>> I think we can be much more resilient to screaming interrupts if we
>>>>> enable device with IRQ disabled and enable it after the device is
>>>>> initialized to some level, possibly when requesting IRQ.
>>>>
>>>> The first thing the skge driver does is do a chip reset, and that
>>>> should cause IRQ to be disabled and cleared. The driver has no chance
>>>> to fix it if the BIOS left the IRQ screaming...
>>>
>>> What if we do something like...
>>>
>>>   pci_intx(pdev, 0);
>>>   pci_enable_device(pdev);
>>>   /* initialize */
>>>   request_irq(blah blah...);
>>>   pci_intx(pdev, 1);
>>>
>>> Would this work for skge?
>>
>> Okay for testing, but any change like this should be done in the base
>> PCI layer, not one off in a particular driver.
>
> Yeap, it was a proof-of-concept pseudo code. I attached a patch to do
> above in skge. Please point out if it is broken (e.g. intx needs to be
> enabled earlier). Michal, can you apply the attached patch and see
> whether it fixes the problem.

I think that problem is solved. Thanks.

> Thanks.
>
> --
> tejun
>
> diff --git a/drivers/net/skge.c b/drivers/net/skge.c
> index eea75a4..2c990f2 100644
> --- a/drivers/net/skge.c
> +++ b/drivers/net/skge.c
> @@ -3585,6 +3585,7 @@ static int __devinit skge_probe(struct pci_dev *pdev,
>  	struct skge_hw *hw;
>  	int err, using_dac = 0;
>  
> +	pci_intx(pdev, 0);
>  	err = pci_enable_device(pdev);
>  	if (err) {
>  		dev_err(&pdev->dev, "cannot enable PCI device\n");
> @@ -3669,6 +3670,7 @@ static int __devinit skge_probe(struct pci_dev *pdev,
>  			dev->name, pdev->irq);
>  		goto err_out_unregister;
>  	}
> +	pci_intx(pdev, 1);
>  	skge_show_addr(dev);
>  
>  	if (hw->ports > 1 && (dev1 = skge_devinit(hw, 1, using_dac))) {

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)