Re: Doubt in kernel packet generation
On Fri, Aug 12, 2005 at 09:33:13AM +0530, varun ([EMAIL PROTECTED]) wrote:
> Hi all,
> I have a question about how to generate my own ICMP packet from kernel space. I am aware of raw sockets and packet sockets, but those are used from user space. I want one of my kernel modules to build a packet in an skb and probably add it to the transmit queue. Can anyone help me with this?
> I am new to this group, so if the question is off-topic here, please let me know where I can post it to get an answer.

net/core/pktgen.c has an excellent example of building a network packet in kernel space.

> Varun

--
Evgeniy Polyakov
-
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
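As a rough illustration of the packet-building step (not taken from pktgen.c), the sketch below fills in an ICMP echo request header and its RFC 1071 checksum in a plain byte buffer. It is user-space C so it can be compiled and checked on its own; in a kernel module the same bytes would be written into space reserved in an skb instead. The names inet_checksum and build_icmp_echo are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* RFC 1071 Internet checksum: one's-complement sum of 16-bit big-endian words. */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	if (len & 1)				/* odd trailing byte is padded with zero */
		sum += (uint32_t)data[len - 1] << 8;
	while (sum >> 16)			/* fold carries back into the low 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/*
 * Write an ICMP echo request (type 8, code 0) plus payload into buf.
 * Returns the number of bytes written, or 0 if buf is too small.
 * Hypothetical helper, for illustration only.
 */
static size_t build_icmp_echo(uint8_t *buf, size_t cap, uint16_t id,
			      uint16_t seq, const void *payload,
			      size_t payload_len)
{
	size_t total = 8 + payload_len;		/* 8-byte ICMP echo header */
	uint16_t csum;

	if (cap < total)
		return 0;
	buf[0] = 8;				/* type: echo request */
	buf[1] = 0;				/* code */
	buf[2] = 0;				/* checksum, zero while summing */
	buf[3] = 0;
	buf[4] = (uint8_t)(id >> 8);
	buf[5] = (uint8_t)(id & 0xff);
	buf[6] = (uint8_t)(seq >> 8);
	buf[7] = (uint8_t)(seq & 0xff);
	if (payload_len)
		memcpy(buf + 8, payload, payload_len);
	csum = inet_checksum(buf, total);
	buf[2] = (uint8_t)(csum >> 8);
	buf[3] = (uint8_t)(csum & 0xff);
	return total;
}
```

A receiver can validate the result by running inet_checksum over the whole message including the checksum field; a correct packet yields 0.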
[PATCH] TCP Offload (TOE) - Chelsio
OPEN TOE submission from Chelsio Communications.

The following items have been addressed:
- cleaned up indentation.
- cleaned up comments.
- cleaned up c-styles.
- using EXPORT_SYMBOL_GPL instead of EXPORT_SYMBOL
- removed 2.4 compatibility.
- created TCP_OFFLOAD config option.
- moved #defines to appropriate files.
- removed obfuscating macros.
- included necessary definitions instead of struct.
- made IS_OFFLOADED an inline function instead of macro.

The following items are currently being worked on:
- use sysfs instead of procfs.
- addressing the use of semaphores in 'register_tom'.
- use RCU, need to look at this.
- use inline function instead of TOEDEV macro, requires some work.

Comments:
- static was removed from functions '__tcp_inherit_port' '__tcp_v4_hash' because these are called outside of tcp_ipv4.c from the TOM driver.

Signed-off-by: Scott Bardone [EMAIL PROTECTED]

diff -Naur linux-2.6.13-rc6-git3/include/linux/netdevice.h linux-2.6.13-rc6-git3.patched/include/linux/netdevice.h
--- linux-2.6.13-rc6-git3/include/linux/netdevice.h	2005-08-07 11:18:56.0 -0700
+++ linux-2.6.13-rc6-git3.patched/include/linux/netdevice.h	2005-08-11 21:28:36.0 -0700
@@ -408,6 +408,9 @@
 #define NETIF_F_VLAN_CHALLENGED	1024	/* Device cannot handle VLAN packets */
 #define NETIF_F_TSO		2048	/* Can offload TCP/IP segmentation */
 #define NETIF_F_LLTX		4096	/* LockLess TX */
+#ifdef CONFIG_TCP_OFFLOAD
+#define NETIF_F_TCPIP_OFFLOAD	65536	/* Can offload TCP/IP */
+#endif
 
 	/* Called after device is detached from network. */
 	void			(*uninit)(struct net_device *dev);
diff -Naur linux-2.6.13-rc6-git3/include/linux/tcp_diag.h linux-2.6.13-rc6-git3.patched/include/linux/tcp_diag.h
--- linux-2.6.13-rc6-git3/include/linux/tcp_diag.h	2005-08-07 11:18:56.0 -0700
+++ linux-2.6.13-rc6-git3.patched/include/linux/tcp_diag.h	2005-08-11 21:28:36.0 -0700
@@ -4,6 +4,11 @@
 /* Just some random number */
 #define TCPDIAG_GETSOCK 18
 
+/* TOE API */
+#ifdef CONFIG_TCP_OFFLOAD
+#define TCPDIAG_OFFLOAD 5
+#endif
+
 /* Socket identity */
 struct tcpdiag_sockid
 {
diff -Naur linux-2.6.13-rc6-git3/include/linux/tcp.h linux-2.6.13-rc6-git3.patched/include/linux/tcp.h
--- linux-2.6.13-rc6-git3/include/linux/tcp.h	2005-08-07 11:18:56.0 -0700
+++ linux-2.6.13-rc6-git3.patched/include/linux/tcp.h	2005-08-11 21:28:36.0 -0700
@@ -235,6 +235,10 @@
 	return (struct tcp_request_sock *)req;
 }
 
+#ifdef CONFIG_TCP_OFFLOAD
+struct toe_funcs;
+#endif
+
 struct tcp_sock {
 	/* inet_sock has to be the first member of tcp_sock */
 	struct inet_sock	inet;
@@ -342,6 +346,10 @@
 
 	struct tcp_func		*af_specific;	/* Operations which are AF_INET{4,6} specific	*/
 
+#ifdef CONFIG_TCP_OFFLOAD
+	struct toe_funcs	*toe_specific;	/* Operations overriden by TOEs */
+#endif
+
 	__u32	rcv_wnd;	/* Current receiver window		*/
 	__u32	rcv_wup;	/* rcv_nxt on last window update sent	*/
 	__u32	write_seq;	/* Tail(+1) of data held in tcp send buffer */
diff -Naur linux-2.6.13-rc6-git3/include/linux/toedev.h linux-2.6.13-rc6-git3.patched/include/linux/toedev.h
--- linux-2.6.13-rc6-git3/include/linux/toedev.h	1969-12-31 16:00:00.0 -0800
+++ linux-2.6.13-rc6-git3.patched/include/linux/toedev.h	2005-08-11 22:37:03.94780 -0700
@@ -0,0 +1,126 @@
+/*
+ *                                                                           *
+ * File:                                                                     *
+ *  toedev.h                                                                 *
+ *                                                                           *
+ * Description:                                                              *
+ *  TOE device definitions.                                                  *
+ *                                                                           *
+ * This program is free software; you can redistribute it and/or modify      *
+ * it under the terms of the GNU General Public License, version 2, as       *
+ * published by the Free Software Foundation.                                *
+ *                                                                           *
+ * You should have received a copy of the GNU General Public License along   *
+ * with this program; if not, write to the Free Software Foundation, Inc.,   *
+ * 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.                 *
+ *                                                                           *
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED    *
+ * WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF      *
+ * MERCHANTABILITY AND FITNESS
Re: [PATCH] TCP Offload (TOE) - Chelsio
From: Scott Bardone [EMAIL PROTECTED]
Date: Thu, 11 Aug 2005 23:16:14 -0700

> - static was removed from functions '__tcp_inherit_port' '__tcp_v4_hash' because these are called outside of tcp_ipv4.c from the TOM driver.

There is no way you're going to be allowed to call such deep TCP internals from your driver. This would mean that every time we wish to change the data structures and interfaces for TCP socket lookup, your drivers would need to change.

This is all looking exactly like the deep dark dungeon I feared TOE support would be.
Re: [PATCH] TCP Offload (TOE) - Chelsio
The networking gurus can comment on the internals of your patch better than I can. Just a few style notes though:

> +#ifdef CONFIG_TCP_OFFLOAD
> +#define NETIF_F_TCPIP_OFFLOAD	65536	/* Can offload TCP/IP */
> +#endif

No need to protect this inside a CONFIG_* option.

> +/* TOE API */
> +#ifdef CONFIG_TCP_OFFLOAD
> +#define TCPDIAG_OFFLOAD 5
> +#endif

Ditto.

> +#ifdef CONFIG_TCP_OFFLOAD
> +struct toe_funcs;
> +#endif

Ditto.

> +#ifdef CONFIG_TCP_OFFLOAD
> +#include <linux/toedev.h>
> +#endif

Include <linux/toedev.h> unconditionally. Have it handle the !CONFIG_TCP_OFFLOAD case itself by declaring no-op macros for things like toe_neigh_update(). This way you can remove a lot of the #ifdefs you've sprinkled all over the .c files.

> +#define boot_phase 0

Some explanation here? It looks like something left over from development.

> +#ifndef __raise_softirq_irqoff
> +#define __raise_softirq_irqoff(nr) __cpu_raise_softirq(smp_processor_id(), nr)
> +#endif

What is this needed for?

> +static int toedev_init(void);

This forward declaration seems to be needed only for the boot_phase thing above, so if that goes this can go as well.

> +/*
> + * Allocate a unique index for a TOE device. We keep the index within 30 bits

Maybe look at lib/idr.c to handle this?

> +	struct toedev *dev = kmalloc(sizeof(struct toedev), GFP_KERNEL);
> +
> +	if (dev) {
> +		memset(dev, 0, sizeof(struct toedev));

Minor nitpick (that some might disagree with)... I usually prefer:

	struct toedev *dev = kmalloc(sizeof(*dev), GFP_KERNEL);

> +int toe_receive_skb(struct toedev *dev, struct sk_buff **skb, int n)
> +{
> +	int i;

n and i should probably be unsigned int.

> +#ifdef CONFIG_TCP_OFFLOAD
> +	tcp_listen_offload(sk);
> +#endif

Another example of something that could be an empty macro in a .h file for the !CONFIG_TCP_OFFLOAD case.

> +#ifndef CONFIG_TCP_OFFLOAD
> +static
> +#endif

Don't do this... just make it non-static unconditionally. It's not worth the ugliness. Same applies to other places.

> +#ifndef CONFIG_TCP_OFFLOAD
> +static
> +#endif
> +__inline__ void __tcp_inherit_port(struct sock *sk, struct sock *child)
>  {
>  	struct tcp_bind_hashbucket *head = &tcp_bhash[tcp_bhashfn(inet_sk(child)->num)];
> @@ -351,7 +357,10 @@
>  	}
>  }

Things that are inline and are now going to be shared really need to just remain static inline and probably move to a header file.

> +#ifdef CONFIG_TCP_OFFLOAD
> +	if (tcp_connect_offload(sk))
> +		return 0;
> +#endif

Just another example of the kind of #ifdef that doesn't belong in the .c files. If the !CONFIG_TCP_OFFLOAD case just had

	#define tcp_connect_offload(sk) (0)

then you can skip the #ifdef.

> +#ifndef CONFIG_TCP_OFFLOAD
>  	LIMIT_NETDEBUG(printk(KERN_DEBUG "TCP: drop open request from %u.%u.%u.%u/%u\n", NIPQUAD(saddr), ntohs(skb->h.th->source)));
> +#else
> +	NETDEBUG(if (net_ratelimit()) \
> +		printk(KERN_DEBUG "TCP: drop open request from %u.%u.%u.%u/%u\n", \
> +			NIPQUAD(saddr), ntohs(skb->h.th->source)));
> +#endif

Huh? What about TOE requires changes to printk ratelimiting?

-Mitch
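The header-stub approach suggested in this review can be sketched roughly as follows. This is not code from the patch: it uses static inline stubs rather than the macros mentioned above (a substitution made here for type checking), struct sock is left opaque, and connect_sketch and listen_sketch are hypothetical callers. With CONFIG_TCP_OFFLOAD undefined the offload hooks compile to nothing and the callers need no #ifdef.

```c
#include <stddef.h>

struct sock;	/* opaque in this sketch; the real definition lives in the kernel */

/* What a header such as <linux/toedev.h> might provide (assumed layout). */
#ifdef CONFIG_TCP_OFFLOAD
int tcp_connect_offload(struct sock *sk);
void tcp_listen_offload(struct sock *sk);
#else
/* No-op stubs: "never offloaded", so callers take the software path. */
static inline int tcp_connect_offload(struct sock *sk)
{
	(void)sk;
	return 0;
}

static inline void tcp_listen_offload(struct sock *sk)
{
	(void)sk;
}
#endif

/* Hypothetical caller: no #ifdef needed at the call site. */
static int connect_sketch(struct sock *sk)
{
	if (tcp_connect_offload(sk))
		return 0;	/* connection was handed to the TOE */
	return 1;		/* stand-in for continuing with the software stack */
}

/* Hypothetical caller for the listen hook. */
static void listen_sketch(struct sock *sk)
{
	tcp_listen_offload(sk);	/* compiles to nothing without offload support */
}
```

The design point is that the conditional lives in one header, so changes to the option do not require touching every .c file that calls the hooks.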
Re: [PATCH] TCP Offload (TOE) - Chelsio
David S. Miller wrote:
> From: Scott Bardone [EMAIL PROTECTED]
> Date: Thu, 11 Aug 2005 23:16:14 -0700
>
>> - static was removed from functions '__tcp_inherit_port' '__tcp_v4_hash' because these are called outside of tcp_ipv4.c from the TOM driver.
>
> There is no way you're going to be allowed to call such deep TCP internals from your driver. This would mean that every time we wish to change the data structures and interfaces for TCP socket lookup, your drivers would need to change.
>
> This is all looking exactly like the deep dark dungeon I feared TOE support would be.

Although I keep an open mind, I really don't see how any TOE solution will ever overcome my own conceptual merge objections:

1) RFC compliance differs based on whether you use a TOE NIC, or Linux software stack. What Linux am I talking to, today? Linux is consistently the most RFC-compliant net stack in existence, AFAIK. TOE suddenly leaves all that open to question.

2) Security updates. We can deploy a net stack security fix very rapidly, and know that we have solved the issue(s). With TOE, security fixes no longer cover all users. One has to either wait on multiple TOE vendors to deploy firmware fixes, or deploy the software fix and leave TOE users exposed. Once again... What Linux am I talking to, today?

3) Netfilter. Either a TOE NIC (a) doesn't support netfilter, (b) needs far-reaching packet mangling hooks, or (c) includes its own custom netfilter [clone], with attendant bugs and maintenance issues.

4) Configuration. Either a TOE NIC needs deep net stack hooks, or needs its own netlink/ifconfig configuration interfaces.

5) As we see in this thread -- upper layer (TCP, IP) changes in the net stack require touching a bunch of low-level drivers. Brand new maintenance issue, which slows down upper layer development.

So far, I haven't seen a TOE NIC that satisfies even half of these objections. About the only TOE situation I could imagine which -would- would be where the TOE firmware source code is included in the Linux kernel source code, but even then, all the hooks would be nasty.

Jeff
Re: atheros driver - desc
On Sun, Aug 07, 2005 at 05:01:34PM +0200, Harald Welte wrote:
> I will consult my legal counsel about this. My current naive position on this is that only the actual process of the re-engineering matters, not the result.

Which countries is this advice valid for? Does someone need to chase this inside the US in parallel?
Re: [PATCH] TCP Offload (TOE) - Chelsio
I'm fairly pessimistic about full TOE also; I just want to see the patch cleaned up a bit so we can see the exact impact it would have. The RX optimization work presented in the Neterion and Intel papers at OLS sounds a lot more interesting to me though.

However, I do want to comment on one statement of yours:

Jeff Garzik wrote:
> 3) Netfilter. Either a TOE NIC (a) doesn't support netfilter, (b) needs far-reaching packet mangling hooks, or (c) includes its own custom netfilter [clone], with attendant bugs and maintenance issues.

I don't think netfilter is a big deal. The kernel could still check the TCP handshake packets (or, if needed, faked-up versions with the same data) at accept()/connect() time. If those pass muster, it's a pretty good bet that the other 100,000 packets making up that TCP connection would also. Of course this limitation would need to be documented, but I doubt most netfilter users would mind too much. There are obviously edge cases where you can lose: for example, if you update the netfilter rules, you ideally want to revalidate all the currently open connections.

Since TOE hardware is designed to help the TCP end point, you probably don't have to worry about NAT or other fancy mangling on these interfaces.

-Mitch
[PATCH] Fix NET/ROM queue length
NET/ROM uses virtual interfaces so setting a queue length is wrong.

Signed-off-by: Ralf Baechle DL5RB [EMAIL PROTECTED]

 net/netrom/nr_dev.c |    1 -
 1 files changed, 1 deletion(-)

Index: linux-cvs/net/netrom/nr_dev.c
===================================================================
--- linux-cvs.orig/net/netrom/nr_dev.c
+++ linux-cvs/net/netrom/nr_dev.c
@@ -187,7 +187,6 @@ void nr_setup(struct net_device *dev)
 	dev->hard_header_len	= NR_NETWORK_LEN + NR_TRANSPORT_LEN;
 	dev->addr_len		= AX25_ADDR_LEN;
 	dev->type		= ARPHRD_NETROM;
-	dev->tx_queue_len	= 40;
 	dev->rebuild_header	= nr_rebuild_header;
 	dev->set_mac_address	= nr_set_mac_address;
[PATCH 1/6][IPV6] Generalise the tcp_v6_lookup routines
Hi David, Please consider pulling from: rsync://rsync.kernel.org/pub/scm/linux/kernel/git/acme/net-2.6.14.git/ This is based on the discussions we had on the [EMAIL PROTECTED] about fully generalising tcp_diag, that is accomplished in this series of changesets without breaking userspace ABI, it breaks source code if users move from the previous tcp_diag.h to inet_diag.h, which is expected but only required if wanting to support this generalised infrastructure, the work required is basically a big sed, I'll do this later today/tomorrow. Best Regards, - Arnaldo tree 78f33e1b9c74aa4e1586326e0918db068a967676 parent ccd176a23975b634cbdd89ffa190fb9da107c34e author Arnaldo Carvalho de Melo [EMAIL PROTECTED] 1123816755 -0300 committer Arnaldo Carvalho de Melo [EMAIL PROTECTED] 1123816755 -0300 [IPV6] Generalise the tcp_v6_lookup routines In the same way as was done with the v4 counterparts, this will be moved to inet6_hashtables.c. Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] -- include/linux/ipv6.h |5 + include/net/inet6_hashtables.h | 26 +++ net/ipv4/Kconfig |3 net/ipv4/tcp_diag.c| 40 +-- net/ipv6/tcp_ipv6.c| 139 + 5 files changed, 122 insertions(+), 91 deletions(-) -- diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h --- a/include/linux/ipv6.h +++ b/include/linux/ipv6.h @@ -193,6 +193,11 @@ struct inet6_skb_parm { #define IP6CB(skb) ((struct inet6_skb_parm*)((skb)-cb)) +static inline int inet6_iif(const struct sk_buff *skb) +{ + return IP6CB(skb)-iif; +} + struct tcp6_request_sock { struct tcp_request_sock req; struct in6_addr loc_addr; diff --git a/include/net/inet6_hashtables.h b/include/net/inet6_hashtables.h new file mode 100644 --- /dev/null +++ b/include/net/inet6_hashtables.h @@ -0,0 +1,26 @@ +/* + * INETAn implementation of the TCP/IP protocol suite for the LINUX + * operating system. INET is implemented using the BSD Socket + * interface as the means of communication with the user level. 
+ * + * Authors:Lotsa people, from code originally in tcp + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifndef _INET6_HASHTABLES_H +#define _INET6_HASHTABLES_H + +#include linux/types.h + +struct in6_addr; +struct inet_hashinfo; + +extern struct sock *inet6_lookup(struct inet_hashinfo *hashinfo, +const struct in6_addr *saddr, const u16 sport, +const struct in6_addr *daddr, const u16 dport, +const int dif); +#endif /* _INET6_HASHTABLES_H */ diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig --- a/net/ipv4/Kconfig +++ b/net/ipv4/Kconfig @@ -425,9 +425,6 @@ config IP_TCPDIAG If unsure, say Y. -config IP_TCPDIAG_IPV6 - def_bool (IP_TCPDIAG=y IPV6=y) || (IP_TCPDIAG=m IPV6) - config IP_TCPDIAG_DCCP def_bool (IP_TCPDIAG=y IP_DCCP=y) || (IP_TCPDIAG=m IP_DCCP) diff --git a/net/ipv4/tcp_diag.c b/net/ipv4/tcp_diag.c --- a/net/ipv4/tcp_diag.c +++ b/net/ipv4/tcp_diag.c @@ -24,6 +24,10 @@ #include net/tcp.h #include net/ipv6.h #include net/inet_common.h +#include net/inet_connection_sock.h +#include net/inet_hashtables.h +#include net/inet_timewait_sock.h +#include net/inet6_hashtables.h #include linux/inet.h #include linux/stddef.h @@ -102,7 +106,7 @@ static int tcpdiag_fill(struct sk_buff * r-tcpdiag_wqueue = 0; r-tcpdiag_uid = 0; r-tcpdiag_inode = 0; -#ifdef CONFIG_IP_TCPDIAG_IPV6 +#if defined(CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) if (r-tcpdiag_family == AF_INET6) { const struct tcp6_timewait_sock *tcp6tw = tcp6_twsk(sk); @@ -121,7 +125,7 @@ static int tcpdiag_fill(struct sk_buff * r-id.tcpdiag_src[0] = inet-rcv_saddr; r-id.tcpdiag_dst[0] = inet-daddr; -#ifdef CONFIG_IP_TCPDIAG_IPV6 +#if defined(CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) if (r-tcpdiag_family == AF_INET6) { struct ipv6_pinfo *np = inet6_sk(sk); @@ -196,19 +200,6 @@ nlmsg_failure: 
return -1; } -#ifdef CONFIG_IP_TCPDIAG_IPV6 -extern struct sock *tcp_v6_lookup(struct in6_addr *saddr, u16 sport, - struct in6_addr *daddr, u16 dport, - int dif); -#else -static inline struct sock *tcp_v6_lookup(struct in6_addr *saddr, u16 sport, -
[PATCH 2/6][INET6_HASHTABLES] Move inet6_lookup functions to net/ipv4/inet6_hashtables.c
Hi David,

Please consider pulling from:

rsync://rsync.kernel.org/pub/scm/linux/kernel/git/acme/net-2.6.14.git/

This is based on the discussions we had on the [EMAIL PROTECTED] about fully generalising tcp_diag. That is accomplished in this series of changesets without breaking the userspace ABI. It does break source code for users who move from the previous tcp_diag.h to inet_diag.h, which is expected, but the move is only required for those wanting to support this generalised infrastructure. The work required is basically a big sed; I'll do this later today/tomorrow.

Best Regards,

- Arnaldo

tree 321162afae4bc318a868c1294d79be04cef31ad8
parent b0e1ef9a964a4d4ef3510d6820db759bc4821e44
author Arnaldo Carvalho de Melo [EMAIL PROTECTED] 1123817709 -0300
committer Arnaldo Carvalho de Melo [EMAIL PROTECTED] 1123817709 -0300

[INET6_HASHTABLES] Move inet6_lookup functions to net/ipv4/inet6_hashtables.c

Doing this we allow tcp_diag to support IPV6 even if tcp_diag is compiled statically and IPV6 is compiled as a module, removing the previous restriction while not building any IPV6 code if it is not selected.

Now to work on the tcpdiag_register infrastructure and then to rename the whole thing to inetdiag, reflecting its by then completely generic nature.

Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
Signed-off-by: David S. Miller [EMAIL PROTECTED]

--
 include/net/inet6_hashtables.h |  106 +++-
 net/ipv4/Kconfig               |    4 -
 net/ipv4/Makefile              |    2
 net/ipv4/inet6_hashtables.c    |   81 +
 net/ipv6/tcp_ipv6.c            |  154 -
 5 files changed, 190 insertions(+), 157 deletions(-)
--

diff --git a/include/net/inet6_hashtables.h b/include/net/inet6_hashtables.h
--- a/include/net/inet6_hashtables.h
+++ b/include/net/inet6_hashtables.h
@@ -14,13 +14,117 @@
 #ifndef _INET6_HASHTABLES_H
 #define _INET6_HASHTABLES_H
 
+#include <linux/config.h>
+
+#if defined(CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE)
+#include <linux/in6.h>
+#include <linux/ipv6.h>
 #include <linux/types.h>
 
-struct in6_addr;
+#include <net/ipv6.h>
+
 struct inet_hashinfo;
 
+/* I have no idea if this is a good hash for v6 or not. -DaveM */
+static inline int inet6_ehashfn(const struct in6_addr *laddr, const u16 lport,
+				const struct in6_addr *faddr, const u16 fport,
+				const int ehash_size)
+{
+	int hashent = (lport ^ fport);
+
+	hashent ^= (laddr->s6_addr32[3] ^ faddr->s6_addr32[3]);
+	hashent ^= hashent >> 16;
+	hashent ^= hashent >> 8;
+	return (hashent & (ehash_size - 1));
+}
+
+static inline int inet6_sk_ehashfn(const struct sock *sk, const int ehash_size)
+{
+	const struct inet_sock *inet = inet_sk(sk);
+	const struct ipv6_pinfo *np = inet6_sk(sk);
+	const struct in6_addr *laddr = &np->rcv_saddr;
+	const struct in6_addr *faddr = &np->daddr;
+	const __u16 lport = inet->num;
+	const __u16 fport = inet->dport;
+	return inet6_ehashfn(laddr, lport, faddr, fport, ehash_size);
+}
+
+/*
+ * Sockets in TCP_CLOSE state are _always_ taken out of the hash, so
+ * we need not check it for TCP lookups anymore, thanks Alexey. -DaveM
+ *
+ * The sockhash lock must be held as a reader here.
+ */
+static inline struct sock *
+		__inet6_lookup_established(struct inet_hashinfo *hashinfo,
+					   const struct in6_addr *saddr,
+					   const u16 sport,
+					   const struct in6_addr *daddr,
+					   const u16 hnum,
+					   const int dif)
+{
+	struct sock *sk;
+	const struct hlist_node *node;
+	const __u32 ports = INET_COMBINED_PORTS(sport, hnum);
+	/* Optimize here for direct hit, only listening connections can
+	 * have wildcards anyways.
+	 */
+	const int hash = inet6_ehashfn(daddr, hnum, saddr, sport,
+				       hashinfo->ehash_size);
+	struct inet_ehash_bucket *head = &hashinfo->ehash[hash];
+
+	read_lock(&head->lock);
+	sk_for_each(sk, node, &head->chain) {
+		/* For IPV6 do the cheaper port and family tests first. */
+		if (INET6_MATCH(sk, saddr, daddr, ports, dif))
+			goto hit; /* You sunk my battleship! */
+	}
+	/* Must check for a TIME_WAIT'er before going to listener hash. */
+	sk_for_each(sk, node, &(head + hashinfo->ehash_size)->chain) {
+		const struct inet_timewait_sock *tw = inet_twsk(sk);
+
+		if(*((__u32 *)&(tw->tw_dport))	== ports	&&
+		   sk->sk_family		== PF_INET6) {
+
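A user-space sketch of the folding done by inet6_ehashfn may help when reading the lookup code: XOR the two ports with the low 32 bits of each address, fold the upper bits down, then mask to the table size. This sketch uses unsigned arithmetic instead of the patch's int, takes plain integers rather than struct in6_addr pointers, and assumes ehash_size is a power of two; the name ehashfn_sketch is made up for illustration.

```c
#include <stdint.h>

/*
 * Illustrative re-statement of the established-hash fold.
 * laddr_low / faddr_low stand in for s6_addr32[3] of the local and
 * foreign addresses; ehash_size must be a power of two for the mask
 * to select a valid bucket.
 */
static inline uint32_t ehashfn_sketch(uint32_t laddr_low, uint16_t lport,
				      uint32_t faddr_low, uint16_t fport,
				      uint32_t ehash_size)
{
	uint32_t hashent = (uint32_t)(lport ^ fport);

	hashent ^= laddr_low ^ faddr_low;	/* mix in the address words */
	hashent ^= hashent >> 16;		/* fold high half into low half */
	hashent ^= hashent >> 8;		/* fold again into the low byte */
	return hashent & (ehash_size - 1);	/* bucket index */
}
```

Only the last 32 bits of each address feed the hash, which is presumably why the original comment questions how good a hash it is for IPv6.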
[PATCH 3/6][TCPDIAG] Introduce inet_diag_{register,unregister}
Hi David,

Please consider pulling from:

rsync://rsync.kernel.org/pub/scm/linux/kernel/git/acme/net-2.6.14.git/

This is based on the discussions we had on the [EMAIL PROTECTED] about fully generalising tcp_diag. That is accomplished in this series of changesets without breaking the userspace ABI. It does break source code for users who move from the previous tcp_diag.h to inet_diag.h, which is expected, but the move is only required for those wanting to support this generalised infrastructure. The work required is basically a big sed; I'll do this later today/tomorrow.

Best Regards,

- Arnaldo

tree 068b3f880dfe76b8bae940ede403d6b8f5dd5c8c
parent 790164673413c8cfebc910d6b99ab2ae6ae2c9bb
author Arnaldo Carvalho de Melo [EMAIL PROTECTED] 1123829138 -0300
committer Arnaldo Carvalho de Melo [EMAIL PROTECTED] 1123829138 -0300

[TCPDIAG] Introduce inet_diag_{register,unregister}

Next changeset will rename tcp_diag to inet_diag and move the tcp_diag code out of it and into a new tcp_diag.c, similar to the net/dccp/diag.c introduced in this changeset, completing the transition to a generic inet_diag infrastructure.

Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
Signed-off-by: David S. Miller [EMAIL PROTECTED]

--
 include/linux/tcp_diag.h |   19 +
 net/dccp/Kconfig         |    5 +
 net/dccp/Makefile        |    4 +
 net/dccp/diag.c          |   47 ++
 net/ipv4/Kconfig         |    3
 net/ipv4/tcp_diag.c      |  153 ++-
 6 files changed, 186 insertions(+), 45 deletions(-)
--

diff --git a/include/linux/tcp_diag.h b/include/linux/tcp_diag.h
--- a/include/linux/tcp_diag.h
+++ b/include/linux/tcp_diag.h
@@ -5,6 +5,8 @@
 #define TCPDIAG_GETSOCK 18
 #define DCCPDIAG_GETSOCK 19
 
+#define INET_DIAG_GETSOCK_MAX 24
+
 /* Socket identity */
 struct tcpdiag_sockid
 {
@@ -125,4 +127,21 @@ struct tcpvegas_info {
 	__u32	tcpv_minrtt;
 };
 
+#ifdef __KERNEL__
+struct sock;
+struct inet_hashinfo;
+
+struct inet_diag_handler {
+	struct inet_hashinfo	*idiag_hashinfo;
+	void			(*idiag_get_info)(struct sock *sk,
+						  struct tcpdiagmsg *r,
+						  void *info);
+	__u16			idiag_info_size;
+	__u16			idiag_type;
+};
+
+extern int  inet_diag_register(const struct inet_diag_handler *handler);
+extern void inet_diag_unregister(const struct inet_diag_handler *handler);
+#endif /* __KERNEL__ */
+
 #endif /* _TCP_DIAG_H_ */
diff --git a/net/dccp/Kconfig b/net/dccp/Kconfig
--- a/net/dccp/Kconfig
+++ b/net/dccp/Kconfig
@@ -19,6 +19,11 @@ config IP_DCCP
 
 	  If in doubt, say N.
 
+config IP_DCCP_DIAG
+	depends on IP_DCCP && IP_TCPDIAG
+	def_tristate y if (IP_DCCP = y && IP_TCPDIAG = y)
+	def_tristate m
+
 source "net/dccp/ccids/Kconfig"
 
 endmenu
diff --git a/net/dccp/Makefile b/net/dccp/Makefile
--- a/net/dccp/Makefile
+++ b/net/dccp/Makefile
@@ -3,4 +3,8 @@ obj-$(CONFIG_IP_DCCP) += dccp.o
 dccp-y := ccid.o input.o ipv4.o minisocks.o options.o output.o proto.o \
 	  timer.o packet_history.o
 
+obj-$(CONFIG_IP_DCCP_DIAG) += dccp_diag.o
+
 obj-y += ccids/
+
+dccp_diag-y := diag.o
diff --git a/net/dccp/diag.c b/net/dccp/diag.c
new file mode 100644
--- /dev/null
+++ b/net/dccp/diag.c
@@ -0,0 +1,47 @@
+/*
+ *  net/dccp/diag.c
+ *
+ *  An implementation of the DCCP protocol
+ *  Arnaldo Carvalho de Melo [EMAIL PROTECTED]
+ *
+ *	This program is free software; you can redistribute it and/or modify it
+ *	under the terms of the GNU General Public License version 2 as
+ *	published by the Free Software Foundation.
+ */
+
+#include <linux/config.h>
+
+#include <linux/module.h>
+#include <linux/tcp_diag.h>
+
+#include "dccp.h"
+
+static void dccp_diag_get_info(struct sock *sk, struct tcpdiagmsg *r,
+			       void *_info)
+{
+	r->tcpdiag_rqueue = r->tcpdiag_wqueue = 0;
+}
+
+static struct inet_diag_handler dccp_diag_handler = {
+	.idiag_hashinfo	 = &dccp_hashinfo,
+	.idiag_get_info	 = dccp_diag_get_info,
+	.idiag_type	 = DCCPDIAG_GETSOCK,
+	.idiag_info_size = 0,
+};
+
+static int __init dccp_diag_init(void)
+{
+	return inet_diag_register(&dccp_diag_handler);
+}
+
+static void __exit dccp_diag_fini(void)
+{
+	inet_diag_unregister(&dccp_diag_handler);
+}
+
+module_init(dccp_diag_init);
+module_exit(dccp_diag_fini);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Arnaldo Carvalho de Melo [EMAIL PROTECTED]");
+MODULE_DESCRIPTION("DCCP inet_diag handler");
diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
--- a/net/ipv4/Kconfig
+++ b/net/ipv4/Kconfig
@@ -423,9 +423,6 @@ config IP_TCPDIAG
 
 	  If unsure, say Y.
 
-config IP_TCPDIAG_DCCP
-	def_bool (IP_TCPDIAG=y
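The body of inet_diag_register is not visible in this excerpt, so the following is only a guess at the general shape of such a registry: a table indexed by message type, where each protocol installs one handler. It is user-space C with made-up names (diag_handler, diag_table, diag_register, diag_unregister), no locking, and error codes chosen for this sketch; the real function may behave differently.

```c
#include <errno.h>
#include <stddef.h>

#define DIAG_GETSOCK_MAX 24	/* mirrors INET_DIAG_GETSOCK_MAX from the patch */

/* Cut-down stand-in for struct inet_diag_handler. */
struct diag_handler {
	unsigned short type;		/* e.g. 18 for TCP, 19 for DCCP */
	unsigned short info_size;
};

/* One slot per message type; NULL means no handler registered. */
static const struct diag_handler *diag_table[DIAG_GETSOCK_MAX];

/* Install a handler; the error values are assumptions of this sketch. */
static int diag_register(const struct diag_handler *h)
{
	if (h->type >= DIAG_GETSOCK_MAX)
		return -EINVAL;		/* type outside the table */
	if (diag_table[h->type] != NULL)
		return -EEXIST;		/* slot already taken */
	diag_table[h->type] = h;
	return 0;
}

/* Remove a handler, but only if it is the one currently installed. */
static void diag_unregister(const struct diag_handler *h)
{
	if (h->type < DIAG_GETSOCK_MAX && diag_table[h->type] == h)
		diag_table[h->type] = NULL;
}
```

A dispatcher would then look up diag_table[nlmsg_type] for each request and call the protocol's callback, which is what lets net/dccp/diag.c plug in without the generic code knowing about DCCP.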
[PATCH 4/6][TCPDIAG] Just rename everything to inet_diag
Hi David, Please consider pulling from: rsync://rsync.kernel.org/pub/scm/linux/kernel/git/acme/net-2.6.14.git/ This is based on the discussions we had on the [EMAIL PROTECTED] about fully generalising tcp_diag, that is accomplished in this series of changesets without breaking userspace ABI, it breaks source code if users move from the previous tcp_diag.h to inet_diag.h, which is expected but only required if wanting to support this generalised infrastructure, the work required is basically a big sed, I'll do this later today/tomorrow. Best Regards, - Arnaldo tree c121114a797d3c2a5b0fe3bfcc7e26a83a3c1c55 parent e708bc5b8898bc13af6daa55a022272c70e6a747 author Arnaldo Carvalho de Melo [EMAIL PROTECTED] 1123836219 -0300 committer Arnaldo Carvalho de Melo [EMAIL PROTECTED] 1123836219 -0300 [TCPDIAG] Just rename everything to inet_diag Next changeset will rename tcp_diag.[ch] to inet_diag.[ch]. I'm taking this longer route so as to easy review, making clear the changes made all along the way. Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] Signed-off-by: David S. 
Miller [EMAIL PROTECTED] -- include/linux/netlink.h |2 include/linux/tcp_diag.h | 135 - include/net/tcp.h|2 net/dccp/Kconfig |4 net/dccp/diag.c |4 net/ipv4/Kconfig | 10 net/ipv4/Makefile|2 net/ipv4/tcp_diag.c | 395 +-- net/ipv4/tcp_vegas.c |4 net/ipv4/tcp_westwood.c |4 security/selinux/hooks.c |4 security/selinux/include/av_inherit.h|2 security/selinux/include/av_perm_to_string.h |4 security/selinux/include/av_permissions.h| 48 +-- security/selinux/include/class_to_string.h |2 security/selinux/include/flask.h |2 security/selinux/nlmsgtab.c | 11 17 files changed, 314 insertions(+), 321 deletions(-) -- diff --git a/include/linux/netlink.h b/include/linux/netlink.h --- a/include/linux/netlink.h +++ b/include/linux/netlink.h @@ -8,7 +8,7 @@ #define NETLINK_W1 1 /* 1-wire subsystem */ #define NETLINK_USERSOCK 2 /* Reserved for user mode socket protocols */ #define NETLINK_FIREWALL 3 /* Firewalling hook */ -#define NETLINK_TCPDIAG4 /* TCP socket monitoring */ +#define NETLINK_INET_DIAG 4 /* INET socket monitoring */ #define NETLINK_NFLOG 5 /* netfilter/iptables ULOG */ #define NETLINK_XFRM 6 /* ipsec */ #define NETLINK_SELINUX7 /* SELinux event notifications */ diff --git a/include/linux/tcp_diag.h b/include/linux/tcp_diag.h --- a/include/linux/tcp_diag.h +++ b/include/linux/tcp_diag.h @@ -1,5 +1,5 @@ -#ifndef _TCP_DIAG_H_ -#define _TCP_DIAG_H_ 1 +#ifndef _INET_DIAG_H_ +#define _INET_DIAG_H_ 1 /* Just some random number */ #define TCPDIAG_GETSOCK 18 @@ -8,39 +8,36 @@ #define INET_DIAG_GETSOCK_MAX 24 /* Socket identity */ -struct tcpdiag_sockid -{ - __u16 tcpdiag_sport; - __u16 tcpdiag_dport; - __u32 tcpdiag_src[4]; - __u32 tcpdiag_dst[4]; - __u32 tcpdiag_if; - __u32 tcpdiag_cookie[2]; -#define TCPDIAG_NOCOOKIE (~0U) +struct inet_diag_sockid { + __u16 idiag_sport; + __u16 idiag_dport; + __u32 idiag_src[4]; + __u32 idiag_dst[4]; + __u32 idiag_if; + __u32 idiag_cookie[2]; +#define INET_DIAG_NOCOOKIE (~0U) }; /* Request structure */ -struct tcpdiagreq -{ - 
__u8tcpdiag_family; /* Family of addresses. */ - __u8tcpdiag_src_len; - __u8tcpdiag_dst_len; - __u8tcpdiag_ext;/* Query extended information */ +struct inet_diag_req { + __u8idiag_family; /* Family of addresses. */ + __u8idiag_src_len; + __u8idiag_dst_len; + __u8idiag_ext; /* Query extended information */ - struct tcpdiag_sockid id; + struct inet_diag_sockid id; - __u32 tcpdiag_states; /* States to dump */ - __u32 tcpdiag_dbs;/* Tables to dump (NI) */ + __u32 idiag_states; /* States to dump */ + __u32 idiag_dbs; /* Tables to dump (NI) */ }; -enum -{ - TCPDIAG_REQ_NONE, - TCPDIAG_REQ_BYTECODE, +enum { + INET_DIAG_REQ_NONE, + INET_DIAG_REQ_BYTECODE, }; -#define TCPDIAG_REQ_MAX
[PATCH 5/6][INET_DIAG] Rename tcp_diag.[ch] to inet_diag.[ch]
Hi David, Please consider pulling from: rsync://rsync.kernel.org/pub/scm/linux/kernel/git/acme/net-2.6.14.git/ This is based on the discussions we had on the [EMAIL PROTECTED] about fully generalising tcp_diag, that is accomplished in this series of changesets without breaking userspace ABI, it breaks source code if users move from the previous tcp_diag.h to inet_diag.h, which is expected but only required if wanting to support this generalised infrastructure, the work required is basically a big sed, I'll do this later today/tomorrow. Best Regards, - Arnaldo tree 5ade244ad9d4220137112a4ea75325c652a66e03 parent 415f7316a38f275e121cc1962565cb7077cd188e author Arnaldo Carvalho de Melo [EMAIL PROTECTED] 1123837525 -0300 committer Arnaldo Carvalho de Melo [EMAIL PROTECTED] 1123837525 -0300 [INET_DIAG] Rename tcp_diag.[ch] to inet_diag.[ch] Next changeset will introduce net/ipv4/tcp_diag.c, moving the code that was put transitioanlly in inet_diag.c. Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] Signed-off-by: David S. 
Miller [EMAIL PROTECTED] -- b/include/linux/inet_diag.h | 138 ++ b/net/dccp/diag.c |2 b/net/ipv4/Makefile |2 b/net/ipv4/inet_diag.c| 893 ++ b/net/ipv4/tcp_vegas.c|2 b/net/ipv4/tcp_westwood.c |2 b/security/selinux/nlmsgtab.c |2 include/linux/tcp_diag.h | 138 -- net/ipv4/tcp_diag.c | 892 - 9 files changed, 1036 insertions(+), 1035 deletions(-) -- diff --git a/include/linux/inet_diag.h b/include/linux/inet_diag.h new file mode 100644 --- /dev/null +++ b/include/linux/inet_diag.h @@ -0,0 +1,138 @@ +#ifndef _INET_DIAG_H_ +#define _INET_DIAG_H_ 1 + +/* Just some random number */ +#define TCPDIAG_GETSOCK 18 +#define DCCPDIAG_GETSOCK 19 + +#define INET_DIAG_GETSOCK_MAX 24 + +/* Socket identity */ +struct inet_diag_sockid { + __u16 idiag_sport; + __u16 idiag_dport; + __u32 idiag_src[4]; + __u32 idiag_dst[4]; + __u32 idiag_if; + __u32 idiag_cookie[2]; +#define INET_DIAG_NOCOOKIE (~0U) +}; + +/* Request structure */ + +struct inet_diag_req { + __u8idiag_family; /* Family of addresses. */ + __u8idiag_src_len; + __u8idiag_dst_len; + __u8idiag_ext; /* Query extended information */ + + struct inet_diag_sockid id; + + __u32 idiag_states; /* States to dump */ + __u32 idiag_dbs; /* Tables to dump (NI) */ +}; + +enum { + INET_DIAG_REQ_NONE, + INET_DIAG_REQ_BYTECODE, +}; + +#define INET_DIAG_REQ_MAX INET_DIAG_REQ_BYTECODE + +/* Bytecode is sequence of 4 byte commands followed by variable arguments. + * All the commands identified by code are conditional jumps forward: + * to offset cc+yes or to offset cc+no. yes is supposed to be + * length of the command and its arguments. 
+ */ + +struct inet_diag_bc_op { + unsigned char code; + unsigned char yes; + unsigned short no; +}; + +enum { + INET_DIAG_BC_NOP, + INET_DIAG_BC_JMP, + INET_DIAG_BC_S_GE, + INET_DIAG_BC_S_LE, + INET_DIAG_BC_D_GE, + INET_DIAG_BC_D_LE, + INET_DIAG_BC_AUTO, + INET_DIAG_BC_S_COND, + INET_DIAG_BC_D_COND, +}; + +struct inet_diag_hostcond { + __u8family; + __u8prefix_len; + int port; + __u32 addr[0]; +}; + +/* Base info structure. It contains socket identity (addrs/ports/cookie) + * and, alas, the information shown by netstat. */ +struct inet_diag_msg { + __u8idiag_family; + __u8idiag_state; + __u8idiag_timer; + __u8idiag_retrans; + + struct inet_diag_sockid id; + + __u32 idiag_expires; + __u32 idiag_rqueue; + __u32 idiag_wqueue; + __u32 idiag_uid; + __u32 idiag_inode; +}; + +/* Extensions */ + +enum { + INET_DIAG_NONE, + INET_DIAG_MEMINFO, + INET_DIAG_INFO, + INET_DIAG_VEGASINFO, + INET_DIAG_CONG, +}; + +#define INET_DIAG_MAX INET_DIAG_CONG + + +/* INET_DIAG_MEM */ + +struct inet_diag_meminfo { + __u32 idiag_rmem; + __u32 idiag_wmem; + __u32 idiag_fmem; + __u32 idiag_tmem; +}; + +/* INET_DIAG_VEGASINFO */ + +struct tcpvegas_info { + __u32 tcpv_enabled; + __u32 tcpv_rttcnt; + __u32 tcpv_rtt; + __u32 tcpv_minrtt; +}; + +#ifdef __KERNEL__ +struct sock; +struct inet_hashinfo; + +struct inet_diag_handler { + struct inet_hashinfo*idiag_hashinfo; + void(*idiag_get_info)(struct sock *sk, + struct inet_diag_msg *r, +
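The header comment in the patch above describes the filter bytecode as a sequence of 4-byte commands that are all conditional forward jumps: advance by `yes` when the condition holds, by `no` otherwise. The following is a minimal userspace sketch of how such a program could be walked, not the kernel's implementation. The names `bc_op`, `bc_run` and the `BC_*` codes are hypothetical, only a source-port comparison is modelled, and storing the comparison port in the `no` field of the op that follows is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Same shape as inet_diag_bc_op in the patch: 4 bytes per command. */
struct bc_op {
	unsigned char  code;
	unsigned char  yes;	/* bytes to skip when the condition holds */
	unsigned short no;	/* bytes to skip when it does not */
};

/* Hypothetical subset of opcodes, for illustration only. */
enum { BC_NOP, BC_JMP, BC_S_GE, BC_S_LE };

/*
 * Walk a program of `len` bytes against a source port.
 * Returns 1 if execution lands exactly on the end of the program
 * (accept), 0 if it jumps past the end or the program is malformed.
 */
static int bc_run(const struct bc_op *prog, size_t len, uint16_t sport)
{
	size_t off = 0;

	while (off < len) {
		const struct bc_op *op = &prog[off / sizeof(*op)];
		int yes = 1;
		size_t step;

		switch (op->code) {
		case BC_NOP:
			break;
		case BC_JMP:		/* unconditional: always take "no" */
			yes = 0;
			break;
		case BC_S_GE:
		case BC_S_LE:
			/* Assumption: the argument lives in the next op. */
			if (off + 2 * sizeof(*op) > len)
				return 0;
			yes = (op->code == BC_S_GE) ? sport >= op[1].no
						    : sport <= op[1].no;
			break;
		default:
			return 0;
		}

		step = yes ? op->yes : op->no;
		/* Jumps only go forward, in whole commands. */
		if (step < sizeof(*op) || step % sizeof(*op) != 0)
			return 0;
		off += step;
	}
	return off == len;
}
```

In this sketch a rejecting branch simply jumps past the end of the program, so the final `off == len` test distinguishes accept from reject without a separate "fail" opcode.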
[PATCH 6/6][INET_DIAG] Move the tcp_diag interface to the proper place
Hi David,

Please consider pulling from:

rsync://rsync.kernel.org/pub/scm/linux/kernel/git/acme/net-2.6.14.git/

This is based on the discussions we had on the [EMAIL PROTECTED] about fully generalising tcp_diag. That is accomplished in this series of changesets without breaking the userspace ABI. It does break source code if users move from the previous tcp_diag.h to inet_diag.h, which is expected, but the move is only required for those wanting to support this generalised infrastructure. The work required is basically a big sed; I'll do this later today/tomorrow.

Best Regards,

- Arnaldo

tree 34a82c300ebcf262b22e607a303158c758967760
parent b2245293d4c6b27d66812567d524e7eea4e91c25
author Arnaldo Carvalho de Melo [EMAIL PROTECTED] 1123840162 -0300
committer Arnaldo Carvalho de Melo [EMAIL PROTECTED] 1123840162 -0300

[INET_DIAG] Move the tcp_diag interface to the proper place

With this the previous setup is back, i.e. tcp_diag can be built as a module, as can dccp_diag, and both share the infrastructure available in inet_diag.

If one selects CONFIG_INET_DIAG as a module, CONFIG_INET_TCP_DIAG will also be built as a module, as will CONFIG_INET_DCCP_DIAG if CONFIG_IP_DCCP was selected, whether statically or as a module. If CONFIG_INET_DIAG is y, i.e. statically linked, CONFIG_INET_TCP_DIAG will follow suit and CONFIG_INET_DCCP_DIAG will be built in the same manner as CONFIG_IP_DCCP.

Now to aim at UDP, converting it to use inet_hashinfo, so that we can use iproute2 for UDP sockets as well.

Ah, just to show an example of this new infrastructure working for DCCP :-)

[EMAIL PROTECTED] ~]# ./ss -dane
State      Recv-Q Send-Q  Local Address:Port  Peer Address:Port
LISTEN     0      0       *:5001              *:*               ino:942 sk:cfd503a0
ESTAB      0      0       127.0.0.1:5001      127.0.0.1:32770   ino:943 sk:cfd50a60
ESTAB      0      0       127.0.0.1:32770     127.0.0.1:5001    ino:947 sk:cfd50700
TIME-WAIT  0      0       127.0.0.1:32769     127.0.0.1:5001    timer:(timewait,3.430ms,0) ino:0 sk:cf209620

Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
Signed-off-by: David S.
Miller [EMAIL PROTECTED] -- include/net/tcp.h|2 - net/dccp/Kconfig |6 ++--- net/dccp/Makefile|6 ++--- net/ipv4/Kconfig |8 +-- net/ipv4/Makefile|3 +- net/ipv4/inet_diag.c | 27 - net/ipv4/tcp_diag.c | 54 +++ 7 files changed, 70 insertions(+), 36 deletions(-) -- diff --git a/include/net/tcp.h b/include/net/tcp.h --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -479,7 +479,7 @@ static inline void tcp_clear_xmit_timers extern unsigned int tcp_sync_mss(struct sock *sk, u32 pmtu); extern unsigned int tcp_current_mss(struct sock *sk, int large); -/* tcp_diag.c */ +/* tcp.c */ extern void tcp_get_info(struct sock *, struct tcp_info *); /* Read 'sendfile()'-style from a TCP socket */ diff --git a/net/dccp/Kconfig b/net/dccp/Kconfig --- a/net/dccp/Kconfig +++ b/net/dccp/Kconfig @@ -19,9 +19,9 @@ config IP_DCCP If in doubt, say N. -config IP_DCCP_DIAG - depends on IP_DCCP IP_INET_DIAG - def_tristate y if (IP_DCCP = y IP_INET_DIAG = y) +config INET_DCCP_DIAG + depends on IP_DCCP INET_DIAG + def_tristate y if (IP_DCCP = y INET_DIAG = y) def_tristate m source net/dccp/ccids/Kconfig diff --git a/net/dccp/Makefile b/net/dccp/Makefile --- a/net/dccp/Makefile +++ b/net/dccp/Makefile @@ -3,8 +3,8 @@ obj-$(CONFIG_IP_DCCP) += dccp.o dccp-y := ccid.o input.o ipv4.o minisocks.o options.o output.o proto.o \ timer.o packet_history.o -obj-$(CONFIG_IP_DCCP_DIAG) += dccp_diag.o - -obj-y += ccids/ +obj-$(CONFIG_INET_DCCP_DIAG) += dccp_diag.o dccp_diag-y := diag.o + +obj-y += ccids/ diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig --- a/net/ipv4/Kconfig +++ b/net/ipv4/Kconfig @@ -413,8 +413,8 @@ config INET_TUNNEL If unsure, say Y. -config IP_INET_DIAG - tristate IP: INET socket monitoring interface +config INET_DIAG + tristate INET: socket monitoring interface default y ---help--- Support for INET (TCP, DCCP, etc) socket monitoring interface used by @@ -423,6 +423,10 @@ config IP_INET_DIAG If unsure, say Y. 
+config INET_TCP_DIAG + depends on INET_DIAG + def_tristate INET_DIAG + config TCP_CONG_ADVANCED bool TCP: advanced congestion control ---help--- diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile --- a/net/ipv4/Makefile +++ b/net/ipv4/Makefile @@ -30,8 +30,9 @@ obj-$(CONFIG_IP_ROUTE_MULTIPATH_WRANDOM) obj-$(CONFIG_IP_ROUTE_MULTIPATH_DRR) += multipath_drr.o obj-$(CONFIG_NETFILTER)+= netfilter/ obj-$(CONFIG_IP_VS) += ipvs/ -obj-$(CONFIG_IP_INET_DIAG) +=
Re: argh... ;/
On Thu, Aug 11, 2005 at 10:36:34PM -0700, Chris Wedgwood wrote:
> On Fri, Aug 05, 2005 at 01:20:59PM -0400, John W. Linville wrote:
> > Yes. Opening attachments makes them harder to review.
>
> Lots of people can't inline patches because they are inflicted with crappy MUAs --- I would much prefer patches as attachments in those cases versus mangled patches.

Don't use crappy MUAs?

> Also, I would argue any sane MUA would make dealing with reading/opening patches for sensible mime types trivial.

Any sane MUA wouldn't mangle the patches...

John
--
John W. Linville [EMAIL PROTECTED]
Re: [PATCH] add new iptables ipt_connbytes match
Andi Kleen wrote:
> David S. Miller [EMAIL PROTECTED] writes:
> > Won't work in x86 -- x86_64 compat environments.
>
> Thanks for catching it. The aligned u64 trick probably will
>
>   #define aligned_u64 unsigned long long __attribute__((aligned(8)))
>
> It just forces i386 to be aligned too. Then use aligned_u64 instead of u64/__u64/u_int64_t in all user visible places. Similar for signed types.

Unfortunately one of the iptables structures which is needed to get the ruleset in the kernel (ipt_replace) is differently sized when compiled for 32/64 bit. IIRC it doesn't work at all currently.
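To make the alignment point concrete, here is a small sketch of how an 8-byte-aligned 64-bit type pins a struct layout so that 32-bit and 64-bit builds agree. It assumes a GCC-compatible compiler; the typedef name `aligned_u64_t` and the two example structs are hypothetical and only mirror the macro proposed in the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical typedef mirroring the proposed aligned_u64 macro. */
typedef uint64_t __attribute__((aligned(8))) aligned_u64_t;

/*
 * With a plain 64-bit member, the i386 ABI only requires 4-byte
 * alignment, so `bytes` can sit at offset 4 there but at offset 8
 * on x86_64: the two builds disagree on the layout.
 */
struct counter_plain {
	uint32_t flags;
	uint64_t bytes;
};

/* Forcing 8-byte alignment gives the same layout on both. */
struct counter_aligned {
	uint32_t      flags;
	aligned_u64_t bytes;
};

/* These hold regardless of whether the target is 32 or 64 bit. */
_Static_assert(offsetof(struct counter_aligned, bytes) == 8,
	       "aligned member always starts at offset 8");
_Static_assert(sizeof(struct counter_aligned) == 16,
	       "struct is padded to a multiple of 8");
```

Building the same file with and without `-m32` and comparing `offsetof(struct counter_plain, bytes)` is a quick way to see the mismatch that the aligned type avoids.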
Re: atheros driver - desc
On Fri, Aug 12, 2005 at 12:37:55AM -0700, Chris Wedgwood wrote: On Sun, Aug 07, 2005 at 05:01:34PM +0200, Harald Welte wrote: I will consult my legal counsel about this. My current naive position on this is that only the actuall process of the re-engineering matters, not the result. Which countries is this advice valid for? Does someone need to chase this inside the US in parallel? I'll see whether I can get Eben Moglen to comment on that matter. -- - Harald Welte [EMAIL PROTECTED] http://gnumonks.org/ Privacy in residential applications is a desirable marketing option. (ETSI EN 300 175-7 Ch. A6) pgpEOdNBXlpM1.pgp Description: PGP signature
Re: [PATCH] add new iptables ipt_connbytes match
On Fri, Aug 12, 2005 at 04:52:49AM +0200, Patrick McHardy wrote:
> This function looks broken.

I feared it...

> Divisor and dividend are mixed up, the shifted result variable is not used in the actual division, the "first bit has to be 32" assumption is wrong and num_shift is calculated incorrectly. To find a 32-bit divisor consisting of the most-significant 32 bits we need to find the highest bit set and subtract 32 from this, then right-shift by that value if it is larger than 0. I can send a fixed patch tomorrow but I'm too tired now.

Thanks.

> > +case IPT_CONNBYTES_WHAT_PKTS:
>
> I would really prefer the name IPT_CONNBYTES_PKTS :)

I _think_ it's safe to change it, since we don't include ipt_connbytes.h in the iptables package. Just send two incremental patches to Dave.

Cheers,
Harald
--
- Harald Welte [EMAIL PROTECTED] http://netfilter.org/
Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed. -- Paul Vixie
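The divisor reduction described in this message (find the highest set bit, subtract 32, right-shift if the result is positive) can be sketched in userspace C as follows. This is not the connbytes code; the function names are hypothetical, and shifting the dividend by the same amount to keep the quotient in scale is an assumption of this sketch. Kernel code would typically perform the final 64-by-32 division with do_div() rather than the `/` operator.

```c
#include <stdint.h>

/* 1-based position of the highest set bit; 0 when x is 0. */
static int highest_bit(uint64_t x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/*
 * Approximate dividend / divisor using only a 32-bit divisor.
 * If the divisor needs more than 32 bits, keep its 32 most
 * significant bits and shift the dividend by the same amount.
 */
static uint64_t div64_by_64_approx(uint64_t dividend, uint64_t divisor)
{
	int shift = highest_bit(divisor) - 32;

	if (shift > 0) {
		divisor >>= shift;
		dividend >>= shift;
	}
	if (divisor == 0)
		return 0;	/* arbitrary choice for this sketch */

	return dividend / (uint32_t)divisor;
}
```

The result is exact whenever the divisor already fits in 32 bits; for wider divisors the discarded low bits introduce a small error, which is acceptable for an average-packet-size style calculation.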
Re: [PATCH 2/6][INET6_HASHTABLES] Move inet6_lookup functions to net/ipv4/inet6_hashtables.c
On Fri, Aug 12, 2005 at 09:09:53PM +0900, YOSHIFUJI Hideaki / 吉藤英明 wrote:
> In article [EMAIL PROTECTED] (at Fri, 12 Aug 2005 08:40:24 -0300), [EMAIL PROTECTED] (Arnaldo Carvalho de Melo) says:
> > [INET6_HASHTABLES] Move inet6_lookup functions to net/ipv4/inet6_hashtables.c
> >
> > Doing this we allow tcp_diag to support IPV6 even if tcp_diag is compiled statically and IPV6 is compiled as a module, removing the previous restriction while not building any IPV6 code if it is not selected.
>
> Please put this into net/ipv6 and list it in obj-y in net/ipv6/Makefile, like for net/ipv6/exthdrs_core.c.
>
> --yoshfuji

Humm, was not aware of this, lemme test this and then recreate the tree...

- Arnaldo
Re: [PATCH 2/6][INET6_HASHTABLES] Move inet6_lookup functions to net/ipv4/inet6_hashtables.c
On Fri, Aug 12, 2005 at 09:12:36AM -0300, Arnaldo Carvalho de Melo wrote:
> On Fri, Aug 12, 2005 at 09:09:53PM +0900, YOSHIFUJI Hideaki / 吉藤英明 wrote:
> > In article [EMAIL PROTECTED] (at Fri, 12 Aug 2005 08:40:24 -0300), [EMAIL PROTECTED] (Arnaldo Carvalho de Melo) says:
> > > [INET6_HASHTABLES] Move inet6_lookup functions to net/ipv4/inet6_hashtables.c
> > >
> > > Doing this we allow tcp_diag to support IPV6 even if tcp_diag is compiled statically and IPV6 is compiled as a module, removing the previous restriction while not building any IPV6 code if it is not selected.
> >
> > Please put this into net/ipv6 and list it in obj-y in net/ipv6/Makefile, like for net/ipv6/exthdrs_core.c.
> >
> > --yoshfuji
>
> Humm, was not aware of this, lemme test this and then recreate the tree...

Done, the mirrors should pick it from master.kernel.org shortly, thank you Yoshifuji-san.

- Arnaldo
Re: Strange uses of netif_start_queue
* Ralf Baechle [EMAIL PROTECTED] 2005-08-12 14:39
On Fri, Aug 12, 2005 at 02:27:59PM +0100, Ralf Baechle wrote:

Something I noticed doing the tty work. The 6pack driver calls netif_start_queue() before it calls register_netdev. I'm curious if this is allowed?

As part of adding support for extended 6pack which is required by the PR 430 I've recently fixed that. It was looking suspect enough that I fixed it though I don't see any way this could do harm.

To answer the fundamental question, I think netif_start_queue / netif_stop_queue should be allowed in case the driver for some reason has the desire to stop queueing of packets immediately after register_netdev. The statement simply has no effect, because the queue cannot be woken up at this point. If it could, that would be a bug anyway due to uninitialized spinlocks, regardless of the prior call to netif_start_queue().
Re: argh... ;/
On Fri, Aug 12, 2005 at 07:44:28AM -0400, John W. Linville wrote:
> Don't use crappy MUAs?

Well, plenty of people do. It's almost the norm so "crappy" probably isn't very fair. It does seem that most of the GUI-based MUAs though by default have problematic settings (Mozilla, Thunderbird, Evolution, Outlook all have problems at times).

People also like to cut & paste patches from xterms or similar into MUAs which usually doesn't work very well either.
[OT] Re: [PATCH] xfrm: do not use large arrays in BSS
On Thu, Aug 11, 2005 at 02:44:11PM +0200, Balazs Scheidler wrote:
> On Thu, 2005-08-11 at 22:31 +1000, Herbert Xu wrote:
> > Balazs Scheidler [EMAIL PROTECTED] wrote:
> > > I've attached a revised patch, this time with complete error checking, and propagating the error code to the caller. Please apply.
> >
> > Sorry, but it seems that you've left out the bits that check the return value from xfrm_init()?
>
> Damn. I still have to get used to git. Thanks for the hint. Anyone know a git description that tells me how to follow a tree and maintain my own set of patches on top?

I create one local branch (head) for every feature/patchset by doing something like

  cp .git/refs/heads/master .git/refs/heads/foo

Then you can switch to the foo head by

  ln -sf refs/heads/foo .git/HEAD; cg-reset

I then apply the patch (cg-patch) and commit (cg-commit).

Whenever I want to sync the upstream tree, I

  ln -sf .git/refs/heads/master .git/HEAD; cg-update origin
  ln -sf .git/refs/heads/foo .git/HEAD; cg-reset; cg-merge master

(and iterate over all other heads and do the same).

To get a diff to your local master, you can then do

  cg-diff -r master:foo

It's not nice, but has been working for me through the last weeks.

--
- Harald Welte [EMAIL PROTECTED] http://gnumonks.org/
Privacy in residential applications is a desirable marketing option. (ETSI EN 300 175-7 Ch. A6)
Re: [PATCH 4/6][TCPDIAG] Just rename everything to inet_diag
On Fri, 12 Aug 2005, Arnaldo Carvalho de Melo wrote: Please do NOT apply these changes to the SELinux code. These values are automatically generated and must be synchronized with userland policy. diff --git a/security/selinux/include/av_inherit.h b/security/selinux/include/av_inherit.h --- a/security/selinux/include/av_inherit.h +++ b/security/selinux/include/av_inherit.h @@ -21,7 +21,7 @@ S_(SECCLASS_SHM, ipc, 0x0200UL) S_(SECCLASS_NETLINK_ROUTE_SOCKET, socket, 0x0040UL) S_(SECCLASS_NETLINK_FIREWALL_SOCKET, socket, 0x0040UL) - S_(SECCLASS_NETLINK_TCPDIAG_SOCKET, socket, 0x0040UL) + S_(SECCLASS_NETLINK_INET_DIAG_SOCKET, socket, 0x0040UL) S_(SECCLASS_NETLINK_NFLOG_SOCKET, socket, 0x0040UL) S_(SECCLASS_NETLINK_XFRM_SOCKET, socket, 0x0040UL) S_(SECCLASS_NETLINK_SELINUX_SOCKET, socket, 0x0040UL) etc. At this stage, I suggest only updating the SELinux code so that it recognizes the DCCPDIAG_GETSOCK message. We need to work out how to transition SELinux policy from a netlink_tcpdiag_socket class to netlink_inetdiag_socket. i.e. whether to even bother changing the name of the class, or aliasing it somehow. - James -- James Morris [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/6][TCPDIAG] Just rename everything to inet_diag
Em Fri, Aug 12, 2005 at 11:42:11AM -0400, James Morris escreveu: On Fri, 12 Aug 2005, Arnaldo Carvalho de Melo wrote: Please do NOT apply these changes to the SELinux code. These values are automatically generated and must be synchronized with userland policy. diff --git a/security/selinux/include/av_inherit.h b/security/selinux/include/av_inherit.h --- a/security/selinux/include/av_inherit.h +++ b/security/selinux/include/av_inherit.h @@ -21,7 +21,7 @@ S_(SECCLASS_SHM, ipc, 0x0200UL) S_(SECCLASS_NETLINK_ROUTE_SOCKET, socket, 0x0040UL) S_(SECCLASS_NETLINK_FIREWALL_SOCKET, socket, 0x0040UL) - S_(SECCLASS_NETLINK_TCPDIAG_SOCKET, socket, 0x0040UL) + S_(SECCLASS_NETLINK_INET_DIAG_SOCKET, socket, 0x0040UL) S_(SECCLASS_NETLINK_NFLOG_SOCKET, socket, 0x0040UL) S_(SECCLASS_NETLINK_XFRM_SOCKET, socket, 0x0040UL) S_(SECCLASS_NETLINK_SELINUX_SOCKET, socket, 0x0040UL) etc. At this stage, I suggest only updating the SELinux code so that it recognizes the DCCPDIAG_GETSOCK message. We need to work out how to transition SELinux policy from a netlink_tcpdiag_socket class to netlink_inetdiag_socket. i.e. whether to even bother changing the name of the class, or aliasing it somehow. Here I go regenerating the tree, at least this one is closer to the end of the series... I'll just remove _all_ of the selinux related bits, OK? Lesson learned :-) - Arnaldo - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/6][TCPDIAG] Just rename everything to inet_diag
On Fri, 12 Aug 2005, Arnaldo Carvalho de Melo wrote:
> Here I go regenerating the tree, at least this one is closer to the end of the series... I'll just remove _all_ of the selinux related bits, OK? Lesson learned :-)

Ok, and I'll send a patch to make SELinux compile again :-)

- James
--
James Morris [EMAIL PROTECTED]
Re: [PATCH] add new iptables ipt_connbytes match
On Fri, Aug 12, 2005 at 02:03:20PM +0200, Andi Kleen wrote:
> > Unfortunately one of the iptables structures which is needed to get the ruleset in the kernel (ipt_replace) is differently sized when compiled for 32/64 bit. IIRC it doesn't work at all currently.
>
> Yes that's the old bug and cannot be fixed without breaking compatibility. But we hope that ctnetlink will not repeat that mistake. That is why I'm suggesting to use aligned_u64 in all new interfaces

I'll soon push a patch for all nfnetlink_{conntrack,queue,log} stuff for net-2.6.14. Don't worry about that.

But getting back to the original connbytes issue. Is it worth fixing it, if the core iptables doesn't even work (the old bug)?

I don't think that we're ever going to fix that bug in the old {get,set}sockopt interface, but rather introduce a netlink interface when pkt_tables matures.

--
- Harald Welte [EMAIL PROTECTED] http://netfilter.org/
Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed. -- Paul Vixie
Fw: [Bug 5050] New: KERNEL: assertion (cnt <= tp->packets_out) failed at net/ipv4/tcp_input.c (1476)
Begin forwarded message:

Date: Fri, 12 Aug 2005 06:14:57 -0700
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: [Bug 5050] New: KERNEL: assertion (cnt <= tp->packets_out) failed at net/ipv4/tcp_input.c (1476)

http://bugzilla.kernel.org/show_bug.cgi?id=5050

Summary: KERNEL: assertion (cnt <= tp->packets_out) failed at net/ipv4/tcp_input.c (1476)
Kernel Version: 2.6.13-rc6
Status: NEW
Severity: normal
Owner: [EMAIL PROTECTED]
Submitter: [EMAIL PROTECTED]

Distribution: Debian
Hardware Environment: P4 3.2 GHz 2048MB RAM 4xscsi disk
Software Environment: squid + netfilter

Problem Description:
KERNEL: assertion (cnt <= tp->packets_out) failed at net/ipv4/tcp_input.c (1476)

Steps to reproduce:
The system runs for 2 days and then produces this message:

KERNEL: assertion (cnt <= tp->packets_out) failed at net/ipv4/tcp_input.c (1476)
KERNEL: assertion (cnt <= tp->packets_out) failed at net/ipv4/tcp_input.c (1476)

--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.
Re: [OT] Re: [PATCH] xfrm: do not use large arrays in BSS
On Fri, Aug 12, 2005 at 06:14:39PM +0200, Balazs Scheidler wrote:
> Whenever I want to sync the upstream tree, I
>
>   ln -sf .git/refs/heads/master .git/HEAD; cg-update origin

Sorry, there is a cg-reset missing between the ln and the cg-update.

--
- Harald Welte [EMAIL PROTECTED] http://gnumonks.org/
Privacy in residential applications is a desirable marketing option. (ETSI EN 300 175-7 Ch. A6)
Re: [PATCH] TCP Offload (TOE) - Chelsio
From: Dimitris Michailidis [EMAIL PROTECTED]
Date: Fri, 12 Aug 2005 10:22:47 -0700

> This is true. There is nothing fundamentally preventing both passive and active opens to check netfilter before OKing a connection. Once a connection is established, it's rather impractical to run each of its packets through netfilter, this is 10G after all. You'd probably not lose much functionality that you could have otherwise used at these speeds.

People don't use netfilter just for state tracking and filtering, they also use it to some extent for rate limiting, packet logging, and similar things.

And as busses and cpus get faster, your "this is 10G after all" argument becomes null and void.

Note that this TOE mess also makes the packet scheduler, queueing disciplines, and packet classifiers totally unusable as well.

Essentially, half of the Linux networking stack's features are turned uncontrollably _OFF_ in the presence of TOE. It is this, along with many other reasons, why the Linux networking community, in general, are so against TOE.
Re: [PATCH] add new iptables ipt_connbytes match
> I don't think that we're ever going to fix that bug in the old {get,set}sockopt interface, but rather introduce a netlink interface when pkt_tables matures.

All new interfaces should be emulation clean, so that if the old interface is replaced later it should eventually work. The best way to do that is to use aligned_u64. Should probably put that into linux/types.h

-Andi
Re: [PATCH] TCP Offload (TOE) - Chelsio
From: Dimitris Michailidis [EMAIL PROTECTED]
Date: Fri, 12 Aug 2005 10:00:12 -0700

> On 8/12/05, David S. Miller [EMAIL PROTECTED] wrote:
> > This would mean that every time we wish to change the data structures and interfaces for TCP socket lookup, your drivers would need to change.
>
> I think using TCP's own functions was done exactly to avoid this problem.

That doesn't achieve the desired result. I do plan to merge in IBM's move of the TCP hash tables over to RCU style locking, and that will require knowledge of the locking at the call sites to the functions you have exported to the TOE drivers. The TOE drivers would break as a result.

You are creating a maintenance headache for us as well. Once this stuff gets exported to drivers, it becomes nearly impossible to change. And I absolutely reserve the right to create restrictions of use that increase the flexibility we have to change interfaces, data structures, and locking strategies in the future.
Re: [PATCH 4/6][TCPDIAG] Just rename everything to inet_diag
From: [EMAIL PROTECTED] (Arnaldo Carvalho de Melo)
Date: Fri, 12 Aug 2005 13:17:36 -0300

> Just checked:
>
> rsync://rsync.kernel.org/pub/scm/linux/kernel/git/acme/net-2.6.14.git/
>
> Has the reworked, not touching selinux tree

We might have to renege on changing things from tcpdiag to inetdiag; James's conflict was one I did not anticipate. Let me think about this over the weekend before we commit to doing things one way or the other.
Re: [PATCH] add new iptables ipt_connbytes match
On Fri, Aug 12, 2005 at 08:23:55PM +0200, Andi Kleen wrote: I don't think that we're ever going to fix that bug in the old {get,set}sockopt interface, but rather introduce a netlink interface when pkt_tables matures. All new interfaces should be emulation clean, so that if the old interface is replaced later it should eventually work. The best way to do that is to use aligned_u64. Should probably put that into linux/types.h Ok, I hope everyone is fine with this patch: -- - Harald Welte [EMAIL PROTECTED] http://netfilter.org/ Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed.-- Paul Vixie [NETFILTER] introduce and use aligned_u64 data type As proposed by Andi Kleen, this is required esp. for x86_64 architecture, where 64bit code needs 8byte aligned 64bit data types, but 32bit userspace apps will only align to 4bytes. Signed-off-by: Harald Welte [EMAIL PROTECTED] --- commit 30da9a3da187af74b2e2d00becf2d9cab3624ddd tree 7666f6ce67e96beedc8884f1aba18ea80a20e2b1 parent 7c249f391a3b9bc86ec07d734959c532a3c7a3f6 author Harald Welte [EMAIL PROTECTED] Fr, 12 Aug 2005 21:00:28 +0200 committer Harald Welte [EMAIL PROTECTED] Fr, 12 Aug 2005 21:00:28 +0200 include/linux/netfilter/nfnetlink_log.h |5 +++-- include/linux/netfilter/nfnetlink_queue.h|5 +++-- include/linux/netfilter_ipv4/ipt_connbytes.h |4 ++-- include/linux/types.h|3 +++ 4 files changed, 11 insertions(+), 6 deletions(-) diff --git a/include/linux/netfilter/nfnetlink_log.h b/include/linux/netfilter/nfnetlink_log.h --- a/include/linux/netfilter/nfnetlink_log.h +++ b/include/linux/netfilter/nfnetlink_log.h @@ -5,6 +5,7 @@ * and not any kind of function definitions. It is shared between kernel and * userspace. 
Don't put kernel specific stuff in here */ +#include linux/types.h #include linux/netfilter/nfnetlink.h enum nfulnl_msg_types { @@ -27,8 +28,8 @@ struct nfulnl_msg_packet_hw { } __attribute__ ((packed)); struct nfulnl_msg_packet_timestamp { - u_int64_t sec; - u_int64_t usec; + aligned_u64 sec; + aligned_u64 usec; } __attribute__ ((packed)); #define NFULNL_PREFIXLEN 30 /* just like old log target */ diff --git a/include/linux/netfilter/nfnetlink_queue.h b/include/linux/netfilter/nfnetlink_queue.h --- a/include/linux/netfilter/nfnetlink_queue.h +++ b/include/linux/netfilter/nfnetlink_queue.h @@ -1,6 +1,7 @@ #ifndef _NFNETLINK_QUEUE_H #define _NFNETLINK_QUEUE_H +#include linux/types.h #include linux/netfilter/nfnetlink.h enum nfqnl_msg_types { @@ -24,8 +25,8 @@ struct nfqnl_msg_packet_hw { } __attribute__ ((packed)); struct nfqnl_msg_packet_timestamp { - u_int64_t sec; - u_int64_t usec; + aligned_u64 sec; + aligned_u64 usec; } __attribute__ ((packed)); enum nfqnl_attr_type { diff --git a/include/linux/netfilter_ipv4/ipt_connbytes.h b/include/linux/netfilter_ipv4/ipt_connbytes.h --- a/include/linux/netfilter_ipv4/ipt_connbytes.h +++ b/include/linux/netfilter_ipv4/ipt_connbytes.h @@ -16,8 +16,8 @@ enum ipt_connbytes_direction { struct ipt_connbytes_info { struct { - u_int64_t from; /* count to be matched */ - u_int64_t to; /* count to be matched */ + aligned_u64 from; /* count to be matched */ + aligned_u64 to; /* count to be matched */ } count; u_int8_t what; /* ipt_connbytes_what */ u_int8_t direction; /* ipt_connbytes_direction */ diff --git a/include/linux/types.h b/include/linux/types.h --- a/include/linux/types.h +++ b/include/linux/types.h @@ -123,6 +123,9 @@ typedef __u64 u_int64_t; typedef__s64 int64_t; #endif +/* this is a special 64bit data type that is 8-byte aligned */ +#define aligned_u64 unsigned long long __attribute__((aligned(8))) + /* * The type used for indexing onto a disc or disc partition. 
* If required, asm/types.h can override it and define pgpTd8fpnsZCU.pgp Description: PGP signature
Re: [PATCH 4/6][TCPDIAG] Just rename everything to inet_diag
On 8/12/05, David S. Miller [EMAIL PROTECTED] wrote: From: [EMAIL PROTECTED] (Arnaldo Carvalho de Melo) Date: Fri, 12 Aug 2005 13:17:36 -0300 Just checked: rsync://rsync.kernel.org/pub/scm/linux/kernel/git/acme/net-2.6.14.git/ Has the reworked, not touching selinux tree We might have to reneg on changing things from tcpdiag to inetdiag, James's conflict was one I did not anticipate. Let me think about this over the weekend before we commit to doing things one way or the other. Take your time but as far as I understood from talking to James it was just a matter of rerunning some sort of userspace tool to regenerate those files, something he said he would be doing after I submitted the non-touching SELinux parts. He seems to be interested in eventually reflecting the fact that inet_diag uses the same netlink sock for several inet transport level protocols, but I'd say that for a start he could as well apply the current, pre-inet_diag rules for just TCP to all the protocols now using this kernel communication channel. - Arnaldo - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] add new iptables ipt_connbytes match
From: Harald Welte [EMAIL PROTECTED]
Date: Fri, 12 Aug 2005 21:03:43 +0200

> Ok, I hope everyone is fine with this patch:

It is, but I did not add the connbytes patch into my tree so I can't use this patch as-is. That's why I replied "this is broken, fix u64 alignment" to the connbytes patch instead of "applied, thanks" :-)

Please untangle this stuff. This is how we end up with a big mess of noise changesets in the tree, due to how we have been putting half-working changes in first then a bunch of fixup patches. I'd like to avoid that, because I then spend a lot of time redoing things when I rebase the tree later.

So in this case, send me the aligned_u64 patch separately which doesn't assume connbytes is in the tree. Then another patch which adds connbytes with the proper usage of aligned_u64.

Thanks Harald.
[PATCH 2.6.13-rc6 1/2]netdevice ethtool: Add support for getting the permanent hardware address (resend)
Adds a new field to net device to hold the permanent hardware address, and adds a new generic ethtool_op function to get that address.

Signed-off-by: Jon Wetzel [EMAIL PROTECTED]
Signed-off-by: John W. Linville [EMAIL PROTECTED]

--- linux-2.6.13-rc6/include/linux/netdevice.h 2005-08-12 13:10:12.0 -0500
+++ linux-2.6.13-rc6-jw/include/linux/netdevice.h 2005-08-12 13:35:18.0 -0500
@@ -336,6 +336,7 @@
 	/* Interface address info. */
 	unsigned char		broadcast[MAX_ADDR_LEN];	/* hw bcast add */
 	unsigned char		dev_addr[MAX_ADDR_LEN];	/* hw address */
+	unsigned char		perm_addr[MAX_ADDR_LEN];	/* permanent hw address */
 	unsigned char		addr_len;	/* hardware address length */
 	unsigned short		dev_id;		/* for shared network cards */
--- linux-2.6.13-rc6/include/linux/ethtool.h 2005-08-05 02:04:37.0 -0500
+++ linux-2.6.13-rc6-jw/include/linux/ethtool.h 2005-08-12 13:42:28.0 -0500
@@ -250,6 +250,12 @@
 	u64	data[0];
 };
 
+struct ethtool_perm_addr {
+	u32	cmd;	/* ETHTOOLGPERMADDR */
+	int	size;
+	char	data[0];
+}
+
 struct net_device;
 
 /* Some generic methods drivers may use in their ethtool_ops */
@@ -261,6 +267,8 @@
 int ethtool_op_set_sg(struct net_device *dev, u32 data);
 u32 ethtool_op_get_tso(struct net_device *dev);
 int ethtool_op_set_tso(struct net_device *dev, u32 data);
+int ethtool_op_get_perm_addr(struct net_device *dev, int len,
+			     struct ethtool_addr *addr);
 
 /**
  * ethtool_ops - Alter and report network device settings
--- linux-2.6.13-rc6/net/core/ethtool.c 2005-08-05 02:04:37.0 -0500
+++ linux-2.6.13-rc6-jw/net/core/ethtool.c 2005-08-12 13:43:35.0 -0500
@@ -81,6 +81,16 @@
 	return 0;
 }
 
+int ethtool_op_get_perm_addr(struct net_device *dev, int len, struct ethtool_addr *addr)
+{
+	if ( len < MAX_ADDR_LEN )
+		return -ETOOSMALL;
+
+	memcpy(addr->data, dev->perm_addr, MAX_ADDR_LEN);
+	return 0;
+}
+
+
 /* Handlers for each ethtool command */
 
 static int ethtool_get_settings(struct net_device *dev, void __user *useraddr)
@@ -826,6 +836,7 @@
 EXPORT_SYMBOL(dev_ethtool);
 EXPORT_SYMBOL(ethtool_op_get_link);
+EXPORT_SYMBOL_GPL(ethtool_op_get_perm_addr);
 EXPORT_SYMBOL(ethtool_op_get_sg);
 EXPORT_SYMBOL(ethtool_op_get_tso);
 EXPORT_SYMBOL(ethtool_op_get_tx_csum);
@@ -833,3 +844,4 @@
 EXPORT_SYMBOL(ethtool_op_set_tso);
 EXPORT_SYMBOL(ethtool_op_set_tx_csum);
 EXPORT_SYMBOL(ethtool_op_set_tx_hw_csum);
+
Re: [PATCH 2.6.13-rc6 1/2]netdevice ethtool: Add support for getting the permanent hardware address (resend)
From: Jon Wetzel [EMAIL PROTECTED]
Date: Fri, 12 Aug 2005 15:52:28 -0500

> Adds a new field to net device to hold the permanent hardware address, and adds a new generic ethtool_op function to get that address.
>
> Signed-off-by: Jon Wetzel [EMAIL PROTECTED]
> Signed-off-by: John W. Linville [EMAIL PROTECTED]

I think I'll put this stuff in for 2.6.14, it's too late in the devel cycle to stick it into 2.6.13.

Thanks.
[PATCH 2.6.13-rc6 2/2]e1000: Add support for getting the permanent hardware address (correction)
Accidentally sent an old version of this patch. This is the current one.

e1000 driver updated to fill in the new field in netdevice and use the new ethtool, get_perm_addr.

Signed-off-by: Jon Wetzel [EMAIL PROTECTED]
Signed-off-by: John W. Linville [EMAIL PROTECTED]

--- linux-2.6.13-rc6/drivers/net/e1000/e1000_ethtool.c	2005-08-12 13:09:16.0 -0500
+++ linux-2.6.13-rc6-jw/drivers/net/e1000/e1000_ethtool.c	2005-08-12 13:36:09.0 -0500
@@ -1739,6 +1739,7 @@
 	.phys_id		= e1000_phys_id,
 	.get_stats_count	= e1000_get_stats_count,
 	.get_ethtool_stats	= e1000_get_ethtool_stats,
+	.get_perm_addr		= ethtool_op_get_perm_addr,
 };
 
 void e1000_set_ethtool_ops(struct net_device *netdev)
--- linux-2.6.13-rc6/drivers/net/e1000/e1000_main.c	2005-08-12 13:09:17.0 -0500
+++ linux-2.6.13-rc6-jw/drivers/net/e1000/e1000_main.c	2005-08-12 13:36:09.0 -0500
@@ -614,8 +614,9 @@
 	if(e1000_read_mac_addr(&adapter->hw))
 		DPRINTK(PROBE, ERR, "EEPROM Read Error\n");
 	memcpy(netdev->dev_addr, adapter->hw.mac_addr, netdev->addr_len);
+	memcpy(netdev->perm_addr, adapter->hw.mac_addr, netdev->addr_len);
 
-	if(!is_valid_ether_addr(netdev->dev_addr)) {
+	if(!is_valid_ether_addr(netdev->perm_addr)) {
 		DPRINTK(PROBE, ERR, "Invalid MAC Address\n");
 		err = -EIO;
 		goto err_eeprom;
Re: [PATCH 4/6][TCPDIAG] Just rename everything to inet_diag
On 8/12/05, David S. Miller [EMAIL PROTECTED] wrote:
> From: James Morris [EMAIL PROTECTED]
> Date: Fri, 12 Aug 2005 15:00:49 -0400 (EDT)
>
> > Just do what you think is right for the core networking and we'll adjust SELinux accordingly.
>
> Ok, I've pulled in Arnaldo's changes, as-is.

Thanks!

- Arnaldo
Re: skb->pkt_type
From: Patrick McHardy [EMAIL PROTECTED]
Date: Wed, 10 Aug 2005 02:18:46 +0200

> BTW, an idea to make room for ipvs_property would be to place the three nfctinfo bits in the lower three bits of the nfct pointer. I'm not sure if it guarantees 8 byte alignment, which would be required for this to work ..

It turns out that we need a two-bit state for the fast SKB cloning patch I'm working on with Thomas Graf, which perfectly combines with the now-3-bit pkt_type field. So that's the plan for the time being.
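The idea quoted above, storing three flag bits in the low bits of a pointer, can be sketched in userspace C as follows. This is only an illustration under the assumption that the pointed-to object is at least 8-byte aligned; the names tag_ptr, untag_ptr and ptr_tag are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_BITS 3u
#define TAG_MASK ((uintptr_t)((1u << TAG_BITS) - 1))	/* 0x7 */

/* Pack a 3-bit tag into the low bits of an 8-byte-aligned pointer. */
static inline uintptr_t tag_ptr(void *p, unsigned int tag)
{
	/* The low three bits are only free if the object is 8-byte aligned. */
	assert(((uintptr_t)p & TAG_MASK) == 0);
	assert(tag <= TAG_MASK);
	return (uintptr_t)p | tag;
}

/* Recover the original pointer by masking the tag bits off again. */
static inline void *untag_ptr(uintptr_t v)
{
	return (void *)(v & ~TAG_MASK);
}

/* Extract the 3-bit tag. */
static inline unsigned int ptr_tag(uintptr_t v)
{
	return (unsigned int)(v & TAG_MASK);
}
```

The alignment assertion is the userspace counterpart of the concern raised in the quoted mail: without a guaranteed 8-byte alignment the tag would overwrite real address bits.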
Re: Fw: [Bug 5050] New: KERNEL: assertion (cnt <= tp->packets_out) failed at net/ipv4/tcp_input.c (1476)
On Fri, Aug 12, 2005 at 09:15:44AM -0700, Stephen Hemminger wrote:
>
> Steps to reproduce:
> System is running 2 days and after that time produce this message
> KERNEL: assertion (cnt <= tp->packets_out) failed at net/ipv4/tcp_input.c (1476)
> KERNEL: assertion (cnt <= tp->packets_out) failed at net/ipv4/tcp_input.c (1476)

We believe that this bug may have been fixed by the following patch which was applied after rc6. Please apply only the debugging patch and let us know what it prints out so that we can confirm that this is indeed the problem.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1370,15 +1370,21 @@ int tcp_retransmit_skb(struct sock *sk, 
 
 	if (skb->len > cur_mss) {
 		int old_factor = tcp_skb_pcount(skb);
-		int new_factor;
+		int diff;
 
 		if (tcp_fragment(sk, skb, cur_mss, cur_mss))
 			return -ENOMEM; /* We'll try again later. */
 
 		/* New SKB created, account for it. */
-		new_factor = tcp_skb_pcount(skb);
-		tp->packets_out -= old_factor - new_factor;
-		tp->packets_out += tcp_skb_pcount(skb->next);
+		diff = old_factor - tcp_skb_pcount(skb) -
+		       tcp_skb_pcount(skb->next);
+		tp->packets_out -= diff;
+
+		if (diff > 0) {
+			tp->fackets_out -= diff;
+			if ((int)tp->fackets_out < 0)
+				tp->fackets_out = 0;
+		}
 	}
 
 	/* Collapse two adjacent packets if worthwhile and we can. */
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1474,6 +1474,10 @@ static void tcp_mark_head_lost(struct so
 	int cnt = packets;
 
 	BUG_TRAP(cnt <= tp->packets_out);
+	if (unlikely(cnt > tp->packets_out)) {
+		printk("packets_out = %d, fackets_out = %d, reordering = %d, sack_ok = 0x%x, mss_cache=%d\n", tp->packets_out, tp->fackets_out, tp->reordering, tp->rx_opt.sack_ok, tp->mss_cache);
+		dump_stack();
+	}
 
 	sk_stream_for_retrans_queue(skb, sk) {
 		cnt -= tcp_skb_pcount(skb);
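The counter arithmetic in the tcp_output.c hunk above can be tried outside the kernel. The sketch below is a userspace stand-in, not the kernel code: struct counters and account_fragment are hypothetical names, and the three segment counts are passed in directly instead of being read from an skb with tcp_skb_pcount().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two counters the fix touches. */
struct counters {
	unsigned int packets_out;
	unsigned int fackets_out;
};

/*
 * old_pcount:  segments the buffer counted for before it was fragmented
 * head_pcount: segments in the first piece afterwards
 * tail_pcount: segments in the newly created second piece
 */
static void account_fragment(struct counters *c, int old_pcount,
			     int head_pcount, int tail_pcount)
{
	int diff = old_pcount - head_pcount - tail_pcount;

	assert(c != NULL);

	/* diff may be negative, in which case packets_out grows. */
	c->packets_out -= (unsigned int)diff;

	if (diff > 0) {
		/* Same effect as the clamp in the patch, without relying on
		 * an unsigned-to-int conversion of a wrapped value. */
		if (c->fackets_out < (unsigned int)diff)
			c->fackets_out = 0;
		else
			c->fackets_out -= (unsigned int)diff;
	}
}
```

For example, with packets_out = 10 and fackets_out = 3, a buffer that counted for 4 segments and now splits into 1 + 1 gives diff = 2, so packets_out becomes 8 and fackets_out becomes 1.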
[RFC NETLINK 0/8]: Support dynamic number of groups
Hi,

besides a small bugfix, this patchset adds support for a dynamic number of groups to netlink. To support an arbitrary number of groups a couple of changes had to be made; I'll explain them below. The patches are only sent to netdev to avoid spamming your inboxes.

The destination groups of a packet are currently stored in the cb as a bitmask. To avoid being limited by the size of the cb, support for broadcasting to multiple groups using a single call to netlink_broadcast is removed and only a single destination group is supported, which is stored as an integer in the cb. No users in the kernel used more than a single destination group.

The subscribed groups bitmask in struct netlink_sock is only 32 bits wide; it is changed to be dynamically allocated. Currently binding to a group is possible before a kernel socket for a protocol exists. To avoid guessing the group number and dealing with reallocations this is changed, and sockets for a protocol can only be created when a kernel socket exists. Herbert and Thomas agreed that pure userspace communication is not a good idea with current netlink and the change should be ok.

For compatibility, userspace can still subscribe to the lower 32 groups using bind and see which groups a socket is subscribed to using getsockname; to subscribe/unsubscribe groups in the extended range two setsockopt options are provided. struct nl_addr can only contain up to 32 groups; to get the destination group of a packet for the extended range a nl_pktinfo control message can be enabled using another setsockopt option.
[NETLINK]: Fix module refcounting problems
[NETLINK]: Remove unused groups member from struct netlink_skb_parms
[NETLINK]: Use group numbers instead of bitmasks internally
[NETLINK]: Convert netlink users to use group numbers instead of bitmasks
[NETLINK]: Return -EPROTONOSUPPORT in netlink_create() if no kernel socket is registered
[NETLINK]: Support dynamic number of multicast groups per netlink family
[NETLINK]: Add set/getsockopt options to support more than 32 groups
[NETLINK]: Add groups argument to netlink_kernel_create
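The two representations the cover letter contrasts, a fixed 32-bit mask versus group numbers backed by a dynamically sized bitmap, can be sketched in userspace C roughly as below. All names here (struct group_set, group_set_init, group_subscribe, group_is_member, group_to_legacy_mask) are hypothetical illustrations and not the functions added by the patchset; group numbers are assumed to start at 1, with 0 meaning "no group".

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical per-socket subscription set. */
struct group_set {
	unsigned int ngroups;	/* highest valid group number */
	unsigned long *bits;	/* one bit per group, sized at creation */
};

/* Allocate a zeroed bitmap large enough for ngroups groups. */
static int group_set_init(struct group_set *gs, unsigned int ngroups)
{
	size_t words = (ngroups + BITS_PER_WORD - 1) / BITS_PER_WORD;

	assert(gs != NULL);
	gs->bits = calloc(words ? words : 1, sizeof(unsigned long));
	if (!gs->bits)
		return -1;
	gs->ngroups = ngroups;
	return 0;
}

/* Subscribe to one group by number; rejects 0 and out-of-range groups. */
static int group_subscribe(struct group_set *gs, unsigned int group)
{
	if (group == 0 || group > gs->ngroups)
		return -1;
	gs->bits[(group - 1) / BITS_PER_WORD] |=
		1UL << ((group - 1) % BITS_PER_WORD);
	return 0;
}

/* Return 1 if the set contains the group, 0 otherwise. */
static int group_is_member(const struct group_set *gs, unsigned int group)
{
	if (group == 0 || group > gs->ngroups)
		return 0;
	return (int)((gs->bits[(group - 1) / BITS_PER_WORD] >>
		      ((group - 1) % BITS_PER_WORD)) & 1UL);
}

/* Legacy 32-bit mask for a group number; groups above 32 have no bit. */
static uint32_t group_to_legacy_mask(unsigned int group)
{
	return (group >= 1 && group <= 32) ? (uint32_t)1 << (group - 1) : 0;
}
```

The last helper shows why the legacy bind/getsockname interface can only cover the lower 32 groups: a group number above 32 simply has no representation in a 32-bit mask.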
[NETLINK 1/8]: Fix module refcounting problems
[NETLINK]: Fix module refcounting problems

Use-after-free: the struct proto_ops containing the module pointer is freed when a socket with pid=0 is released, which besides for kernel sockets is true for all unbound sockets.

Module refcount leak: when the kernel socket is closed before all user sockets have been closed the proto_ops struct for this family is replaced by the generic one and the module refcount can't be dropped.

The second problem can't be solved cleanly using module refcounting in the generic socket code, so this patch adds explicit refcounting to netlink_create/netlink_release.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 1f74632caaf6f2bf31cf02ac28c5087e4224b02e
tree c63e3fcfef8d10a928ac7a03fd2ba66ea12479cf
parent 036b419a397e294a5a8ca37845e3023f979976fc
author Patrick McHardy [EMAIL PROTECTED] Sat, 13 Aug 2005 00:21:23 +0200
committer Patrick McHardy [EMAIL PROTECTED] Sat, 13 Aug 2005 00:21:23 +0200

 net/netlink/af_netlink.c |  100 --
 1 files changed, 35 insertions(+), 65 deletions(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -73,8 +73,12 @@ struct netlink_sock {
 	struct netlink_callback	*cb;
 	spinlock_t		cb_lock;
 	void			(*data_ready)(struct sock *sk, int bytes);
+	struct module		*module;
+	u32			flags;
 };
 
+#define NETLINK_KERNEL_SOCKET	0x1
+
 static inline struct netlink_sock *nlk_sk(struct sock *sk)
 {
 	return (struct netlink_sock *)sk;
@@ -97,7 +101,7 @@ struct netlink_table {
 	struct nl_pid_hash hash;
 	struct hlist_head mc_list;
 	unsigned int nl_nonroot;
-	struct proto_ops *p_ops;
+	struct module *module;
 };
 
 static struct netlink_table *nl_table;
@@ -338,6 +342,7 @@ static int netlink_create(struct socket 
 {
 	struct sock *sk;
 	struct netlink_sock *nlk;
+	struct module *module;
 
 	sock->state = SS_UNCONNECTED;
 
@@ -347,30 +352,36 @@ static int netlink_create(struct socket 
 	if (protocol<0 || protocol >= MAX_LINKS)
 		return -EPROTONOSUPPORT;
 
-	netlink_table_grab();
+	netlink_lock_table();
 	if (!nl_table[protocol].hash.entries) {
 #ifdef CONFIG_KMOD
 		/* We do 'best effort'.  If we find a matching module,
 		 * it is loaded.  If not, we don't return an error to
 		 * allow pure userspace<->userspace communication. -HW
 		 */
-		netlink_table_ungrab();
+		netlink_unlock_table();
 		request_module("net-pf-%d-proto-%d", PF_NETLINK, protocol);
-		netlink_table_grab();
+		netlink_lock_table();
 #endif
 	}
-	netlink_table_ungrab();
+	module = nl_table[protocol].module;
+	if (!try_module_get(module))
+		module = NULL;
+	netlink_unlock_table();
 
-	sock->ops = nl_table[protocol].p_ops;
+	sock->ops = &netlink_ops;
 
 	sk = sk_alloc(PF_NETLINK, GFP_KERNEL, &netlink_proto, 1);
-	if (!sk)
+	if (!sk) {
+		module_put(module);
 		return -ENOMEM;
+	}
 
 	sock_init_data(sock, sk);
 
 	nlk = nlk_sk(sk);
 
+	nlk->module = module;
 	spin_lock_init(&nlk->cb_lock);
 	init_waitqueue_head(&nlk->wait);
 	sk->sk_destruct = netlink_sock_destruct;
@@ -415,22 +426,15 @@ static int netlink_release(struct socket
 		notifier_call_chain(&netlink_chain, NETLINK_URELEASE, &n);
 	}
 
-	/* When this is a kernel socket, we need to remove the owner pointer,
-	 * since we don't know whether the module will be dying at any given
-	 * point - HW
-	 */
-	if (!nlk->pid) {
-		struct proto_ops *p_tmp;
+	if (nlk->module)
+		module_put(nlk->module);
 
+	if (nlk->flags & NETLINK_KERNEL_SOCKET) {
 		netlink_table_grab();
-		p_tmp = nl_table[sk->sk_protocol].p_ops;
-		if (p_tmp != &netlink_ops) {
-			nl_table[sk->sk_protocol].p_ops = &netlink_ops;
-			kfree(p_tmp);
-		}
+		nl_table[sk->sk_protocol].module = NULL;
 		netlink_table_ungrab();
 	}
-
+
 	sock_put(sk);
 	return 0;
 }
@@ -1061,9 +1065,9 @@ static void netlink_data_ready(struct so
 struct sock *
 netlink_kernel_create(int unit, void (*input)(struct sock *sk, int len), struct module *module)
 {
-	struct proto_ops *p_ops;
 	struct socket *sock;
 	struct sock *sk;
+	struct netlink_sock *nlk;
 
 	if (!nl_table)
 		return NULL;
@@ -1071,64 +1075,32 @@ netlink_kernel_create(int unit, void (*i
 	if (unit<0 || unit>=MAX_LINKS)
 		return NULL;
 
-	/* Do a quick check, to make us not go down to netlink_insert()
-	 * if protocol already has kernel socket.
-	 */
-	sk = netlink_lookup(unit, 0);
-	if (unlikely(sk)) {
-		sock_put(sk);
-		return NULL;
-	}
-
 	if (sock_create_lite(PF_NETLINK, SOCK_DGRAM, unit, &sock))
 		return NULL;
 
-	sk = NULL;
-	if (module) {
-		/* Every registering protocol implemented in a module needs
-		 * it's own p_ops, since the socket code cannot deal with
-		 * module refcounting otherwise.  -HW
-		 */
-		p_ops = kmalloc(sizeof(*p_ops), GFP_KERNEL);
-		if (!p_ops)
-			goto out_sock_release;
-
-		memcpy(p_ops, &netlink_ops, sizeof(*p_ops));
-		p_ops->owner = module;
-	} else
-		p_ops = &netlink_ops;
-
-	netlink_table_grab();
-	nl_table[unit].p_ops = p_ops;
-	netlink_table_ungrab();
-
-	if (netlink_create(sock, unit) < 0)
[NETLINK 4/8]: Convert netlink users to use group numbers instead of bitmasks
[NETLINK]: Convert netlink users to use group numbers instead of bitmasks

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit a8a8c74ef1b37254f920103a6ce70237a6a55dab
tree c8decf70f15805fc7c23bee441b2ce8b14e7b264
parent 5c34a3fbc1e62fc90db80f148e07ea7817013dca
author Patrick McHardy [EMAIL PROTECTED] Sat, 13 Aug 2005 00:56:59 +0200
committer Patrick McHardy [EMAIL PROTECTED] Sat, 13 Aug 2005 00:56:59 +0200

 include/linux/netfilter/nfnetlink.h       |   23 +++-
 include/linux/netfilter_decnet.h          |   14 ++
 include/linux/rtnetlink.h                 |   42 +++--
 include/linux/xfrm.h                      |   18
 net/bridge/netfilter/ebt_ulog.c           |    4 +--
 net/core/neighbour.c                      |    8 +++---
 net/core/rtnetlink.c                      |    6 ++--
 net/core/wireless.c                       |    4 +--
 net/decnet/dn_dev.c                       |    8 +++---
 net/decnet/dn_table.c                     |    4 +--
 net/decnet/netfilter/dn_rtmsg.c           |    6 ++--
 net/ipv4/devinet.c                        |    7 ++---
 net/ipv4/fib_frontend.c                   |    2 +
 net/ipv4/fib_semantics.c                  |    4 +--
 net/ipv4/netfilter/ip_conntrack_netlink.c |   12
 net/ipv4/netfilter/ipt_ULOG.c             |    8 +++---
 net/ipv6/addrconf.c                       |   24 -
 net/ipv6/route.c                          |    8 +++---
 net/netfilter/nfnetlink.c                 |    2 +
 net/sched/act_api.c                       |    8 +++---
 net/sched/cls_api.c                       |    2 +
 net/sched/sch_api.c                       |    4 +--
 net/xfrm/xfrm_user.c                      |   23 +++-
 23 files changed, 163 insertions(+), 78 deletions(-)

diff --git a/include/linux/netfilter/nfnetlink.h b/include/linux/netfilter/nfnetlink.h
--- a/include/linux/netfilter/nfnetlink.h
+++ b/include/linux/netfilter/nfnetlink.h
@@ -2,13 +2,34 @@
 #define _NFNETLINK_H
 #include <linux/types.h>
 
-/* nfnetlink groups: Up to 32 maximum */
+#ifndef __KERNEL__
+/* nfnetlink groups: Up to 32 maximum - backwards compatibility for userspace */
 #define NF_NETLINK_CONNTRACK_NEW 		0x00000001
 #define NF_NETLINK_CONNTRACK_UPDATE		0x00000002
 #define NF_NETLINK_CONNTRACK_DESTROY		0x00000004
 #define NF_NETLINK_CONNTRACK_EXP_NEW		0x00000008
 #define NF_NETLINK_CONNTRACK_EXP_UPDATE		0x00000010
 #define NF_NETLINK_CONNTRACK_EXP_DESTROY	0x00000020
+#endif
+
+enum nfnetlink_groups {
+	NFNLGRP_NONE,
+#define NFNLGRP_NONE			NFNLGRP_NONE
+	NFNLGRP_CONNTRACK_NEW,
+#define NFNLGRP_CONNTRACK_NEW		NFNLGRP_CONNTRACK_NEW
+	NFNLGRP_CONNTRACK_UPDATE,
+#define NFNLGRP_CONNTRACK_UPDATE	NFNLGRP_CONNTRACK_UPDATE
+	NFNLGRP_CONNTRACK_DESTROY,
+#define NFNLGRP_CONNTRACK_DESTROY	NFNLGRP_CONNTRACK_DESTROY
+	NFNLGRP_CONNTRACK_EXP_NEW,
+#define NFNLGRP_CONNTRACK_EXP_NEW	NFNLGRP_CONNTRACK_EXP_NEW
+	NFNLGRP_CONNTRACK_EXP_UPDATE,
+#define NFNLGRP_CONNTRACK_EXP_UPDATE	NFNLGRP_CONNTRACK_EXP_UPDATE
+	NFNLGRP_CONNTRACK_EXP_DESTROY,
+#define NFNLGRP_CONNTRACK_EXP_DESTROY	NFNLGRP_CONNTRACK_EXP_DESTROY
+	__NFNLGRP_MAX,
+};
+#define NFNLGRP_MAX	(__NFNLGRP_MAX - 1)
 
 /* Generic structure for encapsulation optional netfilter information.
  * It is reminiscent of sockaddr, but with sa_family replaced
diff --git a/include/linux/netfilter_decnet.h b/include/linux/netfilter_decnet.h
--- a/include/linux/netfilter_decnet.h
+++ b/include/linux/netfilter_decnet.h
@@ -56,7 +56,21 @@ struct nf_dn_rtmsg {
 
 #define NFDN_RTMSG(r) ((unsigned char *)(r) + NLMSG_ALIGN(sizeof(struct nf_dn_rtmsg)))
 
+#ifndef __KERNEL__
+/* backwards compatibility for userspace */
 #define DNRMG_L1_GROUP 0x01
 #define DNRMG_L2_GROUP 0x02
+#endif
+
+enum {
+	DNRNG_NLGRP_NONE,
+#define DNRNG_NLGRP_NONE	DNRNG_NLGRP_NONE
+	DNRNG_NLGRP_L1,
+#define DNRNG_NLGRP_L1		DNRNG_NLGRP_L1
+	DNRNG_NLGRP_L2,
+#define DNRNG_NLGRP_L2		DNRNG_NLGRP_L2
+	__DNRNG_NLGRP_MAX
+};
+#define DNRNG_NLGRP_MAX	(__DNRNG_NLGRP_MAX - 1)
 
 #endif /*__LINUX_DECNET_NETFILTER_H*/
diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -826,9 +826,8 @@ enum
 #define TCA_RTA(r)  ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct tcmsg))))
 #define TCA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct tcmsg))
 
-
-/* RTnetlink multicast groups */
-
+#ifndef __KERNEL__
+/* RTnetlink multicast groups - backwards compatibility for userspace */
 #define RTMGRP_LINK		1
 #define RTMGRP_NOTIFY		2
 #define RTMGRP_NEIGH		4
@@ -847,6 +846,43 @@ enum
 
 #define RTMGRP_DECnet_ROUTE	0x4000
 
 #define RTMGRP_IPV6_PREFIX	0x20000
 
+#endif
+
+/* RTnetlink multicast groups */
+enum rtnetlink_groups {
+	RTNLGRP_NONE,
+#define RTNLGRP_NONE		RTNLGRP_NONE
+	RTNLGRP_LINK,
+#define RTNLGRP_LINK		RTNLGRP_LINK
+	RTNLGRP_NOTIFY,
+#define RTNLGRP_NOTIFY		RTNLGRP_NOTIFY
+	RTNLGRP_NEIGH,
+#define RTNLGRP_NEIGH		RTNLGRP_NEIGH
+	RTNLGRP_TC,
+#define
[NETLINK 2/8]: Remove unused groups member from struct netlink_skb_parms
[NETLINK]: Remove unused groups member from struct netlink_skb_parms

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 910f9b156d87a1d9d013985ce3973b9a0d27dbd6
tree a430be569a7d7c79088d7b830a57e31c98f95060
parent 1f74632caaf6f2bf31cf02ac28c5087e4224b02e
author Patrick McHardy [EMAIL PROTECTED] Sat, 13 Aug 2005 00:30:12 +0200
committer Patrick McHardy [EMAIL PROTECTED] Sat, 13 Aug 2005 00:30:12 +0200

 include/linux/netlink.h  |    1 -
 net/ipv4/fib_frontend.c  |    1 -
 net/netlink/af_netlink.c |    1 -
 3 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/include/linux/netlink.h b/include/linux/netlink.h
--- a/include/linux/netlink.h
+++ b/include/linux/netlink.h
@@ -106,7 +106,6 @@ struct netlink_skb_parms
 {
 	struct ucred		creds;		/* Skb credentials	*/
 	__u32			pid;
-	__u32			groups;
 	__u32			dst_pid;
 	__u32			dst_groups;
 	kernel_cap_t		eff_cap;
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -558,7 +558,6 @@ static void nl_fib_input(struct sock *sk
 	nl_fib_lookup(frn, tb);
 
 	pid = nlh->nlmsg_pid;		/*pid of sending process */
-	NETLINK_CB(skb).groups = 0;	/* not in mcast group */
 	NETLINK_CB(skb).pid = 0;	/* from kernel */
 	NETLINK_CB(skb).dst_pid = pid;
 	NETLINK_CB(skb).dst_groups = 0;	/* unicast */
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -954,7 +954,6 @@ static int netlink_sendmsg(struct kiocb 
 		goto out;
 
 	NETLINK_CB(skb).pid	= nlk->pid;
-	NETLINK_CB(skb).groups	= nlk->groups;
 	NETLINK_CB(skb).dst_pid = dst_pid;
 	NETLINK_CB(skb).dst_groups = dst_groups;
 	NETLINK_CB(skb).loginuid = audit_get_loginuid(current->audit_context);
[NETLINK 6/8]: Support dynamic number of multicast groups per netlink family
[NETLINK]: Support dynamic number of multicast groups per netlink family

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit a5314b2c777dc032b93f4f068ab1759f5610999f
tree 68571754baf232d5c76b15ec7e270b4af058867a
parent 2b1cc05d6484d70aae14d869730f8ce959ed7bdd
author Patrick McHardy [EMAIL PROTECTED] Sat, 13 Aug 2005 01:16:52 +0200
committer Patrick McHardy [EMAIL PROTECTED] Sat, 13 Aug 2005 01:16:52 +0200

 net/netlink/af_netlink.c |   66 +-
 1 files changed, 48 insertions(+), 18 deletions(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -60,21 +60,24 @@
 #include <net/scm.h>
 
 #define Nprintk(a...)
+#define NLGRPSZ(x)	(ALIGN(x, sizeof(unsigned long) * 8) / 8)
 
 struct netlink_sock {
 	/* struct sock has to be the first member of netlink_sock */
 	struct sock		sk;
 	u32			pid;
-	unsigned int		groups;
 	u32			dst_pid;
 	u32			dst_group;
+	u32			flags;
+	u32			subscriptions;
+	u32			ngroups;
+	unsigned long		*groups;
 	unsigned long		state;
 	wait_queue_head_t	wait;
 	struct netlink_callback	*cb;
 	spinlock_t		cb_lock;
 	void			(*data_ready)(struct sock *sk, int bytes);
 	struct module		*module;
-	u32			flags;
 };
 
 #define NETLINK_KERNEL_SOCKET	0x1
@@ -101,6 +104,7 @@ struct netlink_table {
 	struct nl_pid_hash hash;
 	struct hlist_head mc_list;
 	unsigned int nl_nonroot;
+	unsigned int groups;
 	struct module *module;
 	int registered;
 };
@@ -138,6 +142,7 @@ static void netlink_sock_destruct(struct
 	BUG_TRAP(!atomic_read(&sk->sk_rmem_alloc));
 	BUG_TRAP(!atomic_read(&sk->sk_wmem_alloc));
 	BUG_TRAP(!nlk_sk(sk)->cb);
+	BUG_TRAP(!nlk_sk(sk)->groups);
 }
 
 /* This lock without WQ_FLAG_EXCLUSIVE is good on UP and it is _very_ bad on SMP.
@@ -333,7 +338,7 @@ static void netlink_remove(struct sock *
 	netlink_table_grab();
 	if (sk_del_node_init(sk))
 		nl_table[sk->sk_protocol].hash.entries--;
-	if (nlk_sk(sk)->groups)
+	if (nlk_sk(sk)->subscriptions)
 		__sk_del_bind_node(sk);
 	netlink_table_ungrab();
 }
@@ -369,6 +374,8 @@ static int __netlink_create(struct socke
 static int netlink_create(struct socket *sock, int protocol)
 {
 	struct module *module = NULL;
+	struct netlink_sock *nlk;
+	unsigned int groups;
 	int err = 0;
 
 	sock->state = SS_UNCONNECTED;
@@ -392,15 +399,23 @@ static int netlink_create(struct socket 
 		module = nl_table[protocol].module;
 	else
 		err = -EPROTONOSUPPORT;
+	groups = nl_table[protocol].groups;
 	netlink_unlock_table();
 
-	if (err)
-		goto out;
+	if (err || (err = __netlink_create(sock, protocol) < 0))
+		goto out_module;
+
+	nlk = nlk_sk(sock->sk);
 
-	if ((err = __netlink_create(sock, protocol) < 0))
+	nlk->groups = kmalloc(NLGRPSZ(groups), GFP_KERNEL);
+	if (nlk->groups == NULL) {
+		err = -ENOMEM;
 		goto out_module;
+	}
+	memset(nlk->groups, 0, NLGRPSZ(groups));
+	nlk->ngroups = groups;
 
-	nlk_sk(sock->sk)->module = module;
+	nlk->module = module;
 out:
 	return err;
 
@@ -437,7 +452,7 @@ static int netlink_release(struct socket
 
 	skb_queue_purge(&sk->sk_write_queue);
 
-	if (nlk->pid && !nlk->groups) {
+	if (nlk->pid && !nlk->subscriptions) {
 		struct netlink_notify n = {
 						.protocol = sk->sk_protocol,
 						.pid = nlk->pid,
@@ -455,6 +470,7 @@ static int netlink_release(struct socket
 		netlink_table_ungrab();
 	}
 
+	kfree(nlk->groups);
 	sock_put(sk);
 	return 0;
 }
@@ -503,6 +519,18 @@ static inline int netlink_capable(struct
 	       capable(CAP_NET_ADMIN);
 }
 
+static void
+netlink_update_subscriptions(struct sock *sk, unsigned int subscriptions)
+{
+	struct netlink_sock *nlk = nlk_sk(sk);
+
+	if (nlk->subscriptions && !subscriptions)
+		__sk_del_bind_node(sk);
+	else if (!nlk->subscriptions && subscriptions)
+		sk_add_bind_node(sk, &nl_table[sk->sk_protocol].mc_list);
+	nlk->subscriptions = subscriptions;
+}
+
 static int netlink_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
 {
 	struct sock *sk = sock->sk;
@@ -528,15 +556,14 @@ static int netlink_bind(struct socket *s
 			return err;
 	}
 
-	if (!nladdr->nl_groups && !nlk->groups)
+	if (!nladdr->nl_groups && !(u32)nlk->groups[0])
 		return 0;
 
 	netlink_table_grab();
-	if (nlk->groups && !nladdr->nl_groups)
-		__sk_del_bind_node(sk);
-	else if (!nlk->groups && nladdr->nl_groups)
-		sk_add_bind_node(sk, &nl_table[sk->sk_protocol].mc_list);
-	nlk->groups = nladdr->nl_groups;
+	netlink_update_subscriptions(sk, nlk->subscriptions +
+	                                 hweight32(nladdr->nl_groups) -
+	                                 hweight32(nlk->groups[0]));
+	*(u32 *)nlk->groups = nladdr->nl_groups;
 	netlink_table_ungrab();
 
 	return 0;
@@ -590,7 +617,7 @@ static int netlink_getname(struct socket
 		nladdr->nl_groups = netlink_group_mask(nlk->dst_group);
 	} else {
 		nladdr->nl_pid = nlk->pid;
-		nladdr->nl_groups = nlk->groups;
+		nladdr->nl_groups = nlk->groups[0];
 	}
 	return 0;
 }
@@ -791,7 +818,8 @@ static inline int do_one_broadcast(struc
 	if (p->exclude_sk == sk)
 		goto out;
 
-	if (nlk->pid == p->pid ||
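The subscription bookkeeping in the netlink_bind hunk above (old total, plus the bits set in the new 32-bit mask, minus the bits that were set in the old one) can be checked with a small userspace sketch. popcount32 is a portable stand-in for the kernel's hweight32(), and rebind_subscriptions is a hypothetical helper name used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Portable population count; stands in for the kernel's hweight32(). */
static unsigned int popcount32(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * New subscription total after bind() replaces the low 32 group bits:
 * drop the bits that were set before and add the bits set now.
 * Subscriptions to groups above 32 are left untouched.
 */
static unsigned int rebind_subscriptions(unsigned int subscriptions,
					 uint32_t old_mask, uint32_t new_mask)
{
	/* The running total must already account for the old mask bits. */
	assert(subscriptions >= popcount32(old_mask));
	return subscriptions + popcount32(new_mask) - popcount32(old_mask);
}
```

For instance, a socket with three subscriptions, two of them in the low mask 0x3 and one in the extended range, that rebinds with mask 0x10 ends up with 3 + 1 - 2 = 2 subscriptions.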
[NETLINK 8/8]: Add groups argument to netlink_kernel_create
[NETLINK]: Add groups argument to netlink_kernel_create

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 5719d60b114683e7c1bf1aa9a553efb641184e1b
tree c6a56c893ae404e6767f3cefbebd2a88a2981775
parent c366740a65d35924ee4efce970db8a738dd4b384
author Patrick McHardy [EMAIL PROTECTED] Sat, 13 Aug 2005 01:50:00 +0200
committer Patrick McHardy [EMAIL PROTECTED] Sat, 13 Aug 2005 01:50:00 +0200

 drivers/w1/w1_int.c             |    2 +-
 include/linux/netlink.h         |    2 +-
 kernel/audit.c                  |    2 +-
 lib/kobject_uevent.c            |    2 +-
 net/bridge/netfilter/ebt_ulog.c |    3 ++-
 net/core/rtnetlink.c            |    3 ++-
 net/decnet/netfilter/dn_rtmsg.c |    4 ++--
 net/ipv4/fib_frontend.c         |    2 +-
 net/ipv4/netfilter/ip_queue.c   |    2 +-
 net/ipv4/netfilter/ipt_ULOG.c   |    3 ++-
 net/ipv4/tcp_diag.c             |    2 +-
 net/ipv6/netfilter/ip6_queue.c  |    3 ++-
 net/netfilter/nfnetlink.c       |    4 ++--
 net/netlink/af_netlink.c        |    6 --
 net/xfrm/xfrm_user.c            |    4 ++--
 15 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/drivers/w1/w1_int.c b/drivers/w1/w1_int.c
--- a/drivers/w1/w1_int.c
+++ b/drivers/w1/w1_int.c
@@ -88,7 +88,7 @@ static struct w1_master * w1_alloc_dev(u
 
 	dev->groups = 23;
 	dev->seq = 1;
-	dev->nls = netlink_kernel_create(NETLINK_W1, NULL, THIS_MODULE);
+	dev->nls = netlink_kernel_create(NETLINK_W1, 1, NULL, THIS_MODULE);
 	if (!dev->nls) {
 		printk(KERN_ERR "Failed to create new netlink socket(%u) for w1 master %s.\n",
 			NETLINK_NFLOG, dev->dev.bus_id);
diff --git a/include/linux/netlink.h b/include/linux/netlink.h
--- a/include/linux/netlink.h
+++ b/include/linux/netlink.h
@@ -125,7 +125,7 @@ struct netlink_skb_parms
 #define NETLINK_CREDS(skb)	(&NETLINK_CB((skb)).creds)
 
 
-extern struct sock *netlink_kernel_create(int unit, void (*input)(struct sock *sk, int len), struct module *module);
+extern struct sock *netlink_kernel_create(int unit, unsigned int groups, void (*input)(struct sock *sk, int len), struct module *module);
 extern void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr *nlh, int err);
 extern int netlink_unicast(struct sock *ssk, struct sk_buff *skb, __u32 pid, int nonblock);
 extern int netlink_broadcast(struct sock *ssk, struct sk_buff *skb, __u32 pid,
diff --git a/kernel/audit.c b/kernel/audit.c
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -514,7 +514,7 @@ static int __init audit_init(void)
 {
 	printk(KERN_INFO "audit: initializing netlink socket (%s)\n",
 	       audit_default ? "enabled" : "disabled");
-	audit_sock = netlink_kernel_create(NETLINK_AUDIT, audit_receive,
+	audit_sock = netlink_kernel_create(NETLINK_AUDIT, 0, audit_receive,
 					   THIS_MODULE);
 	if (!audit_sock)
 		audit_panic("cannot initialize netlink socket");
diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c
--- a/lib/kobject_uevent.c
+++ b/lib/kobject_uevent.c
@@ -153,7 +153,7 @@ EXPORT_SYMBOL_GPL(kobject_uevent_atomic)
 
 static int __init kobject_uevent_init(void)
 {
-	uevent_sock = netlink_kernel_create(NETLINK_KOBJECT_UEVENT, NULL,
+	uevent_sock = netlink_kernel_create(NETLINK_KOBJECT_UEVENT, 1, NULL,
 					    THIS_MODULE);
 
 	if (!uevent_sock) {
diff --git a/net/bridge/netfilter/ebt_ulog.c b/net/bridge/netfilter/ebt_ulog.c
--- a/net/bridge/netfilter/ebt_ulog.c
+++ b/net/bridge/netfilter/ebt_ulog.c
@@ -258,7 +258,8 @@ static int __init init(void)
 		spin_lock_init(&ulog_buffers[i].lock);
 	}
 
-	ebtulognl = netlink_kernel_create(NETLINK_NFLOG, NULL, THIS_MODULE);
+	ebtulognl = netlink_kernel_create(NETLINK_NFLOG, EBT_ULOG_MAXNLGROUPS,
+					  NULL, THIS_MODULE);
 	if (!ebtulognl)
 		ret = -ENOMEM;
 	else if ((ret = ebt_register_watcher(&ulog)))
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -708,7 +708,8 @@ void __init rtnetlink_init(void)
 	if (!rta_buf)
 		panic("rtnetlink_init: cannot allocate rta_buf\n");
 
-	rtnl = netlink_kernel_create(NETLINK_ROUTE, rtnetlink_rcv, THIS_MODULE);
+	rtnl = netlink_kernel_create(NETLINK_ROUTE, RTNLGRP_MAX, rtnetlink_rcv,
+	                             THIS_MODULE);
 	if (rtnl == NULL)
 		panic("rtnetlink_init: cannot initialize rtnetlink\n");
 	netlink_set_nonroot(NETLINK_ROUTE, NL_NONROOT_RECV);
diff --git a/net/decnet/netfilter/dn_rtmsg.c b/net/decnet/netfilter/dn_rtmsg.c
--- a/net/decnet/netfilter/dn_rtmsg.c
+++ b/net/decnet/netfilter/dn_rtmsg.c
@@ -138,8 +138,8 @@ static int __init init(void)
 {
 	int rv = 0;
 
-	dnrmg = netlink_kernel_create(NETLINK_DNRTMSG, dnrmg_receive_user_sk,
-				      THIS_MODULE);
+	dnrmg = netlink_kernel_create(NETLINK_DNRTMSG, DNRNG_NLGRP_MAX,
+	                              dnrmg_receive_user_sk, THIS_MODULE);
 	if (dnrmg == NULL) {
 		printk(KERN_ERR "dn_rtmsg: Cannot create netlink socket");
 		return -ENOMEM;
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@
[NETLINK 5/8]: Return -EPROTONOSUPPORT in netlink_create() if no kernel socket is registered
[NETLINK]: Return -EPROTONOSUPPORT in netlink_create() if no kernel socket is registered

This is necessary for dynamic number of netlink groups to make sure we know the number of possible groups before bind() is called. With this change pure userspace communication using unused netlink protocols becomes impossible.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 2b1cc05d6484d70aae14d869730f8ce959ed7bdd
tree 3c1145ba97171ef0652f80d7e35a54de1b0be4bf
parent a8a8c74ef1b37254f920103a6ce70237a6a55dab
author Patrick McHardy [EMAIL PROTECTED] Sat, 13 Aug 2005 01:05:49 +0200
committer Patrick McHardy [EMAIL PROTECTED] Sat, 13 Aug 2005 01:05:49 +0200

 net/netlink/af_netlink.c |   72 --
 1 files changed, 44 insertions(+), 28 deletions(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -102,6 +102,7 @@ struct netlink_table {
 	struct hlist_head mc_list;
 	unsigned int nl_nonroot;
 	struct module *module;
+	int registered;
 };
 
 static struct netlink_table *nl_table;
@@ -343,11 +344,32 @@ static struct proto netlink_proto = {
 	.obj_size = sizeof(struct netlink_sock),
 };
 
-static int netlink_create(struct socket *sock, int protocol)
+static int __netlink_create(struct socket *sock, int protocol)
 {
 	struct sock *sk;
 	struct netlink_sock *nlk;
-	struct module *module;
+
+	sock->ops = &netlink_ops;
+
+	sk = sk_alloc(PF_NETLINK, GFP_KERNEL, &netlink_proto, 1);
+	if (!sk)
+		return -ENOMEM;
+
+	sock_init_data(sock, sk);
+
+	nlk = nlk_sk(sk);
+	spin_lock_init(&nlk->cb_lock);
+	init_waitqueue_head(&nlk->wait);
+
+	sk->sk_destruct = netlink_sock_destruct;
+	sk->sk_protocol = protocol;
+	return 0;
+}
+
+static int netlink_create(struct socket *sock, int protocol)
+{
+	struct module *module = NULL;
+	int err = 0;
 
 	sock->state = SS_UNCONNECTED;
 
@@ -358,41 +380,33 @@ static int netlink_create(struct socket 
 		return -EPROTONOSUPPORT;
 
 	netlink_lock_table();
-	if (!nl_table[protocol].hash.entries) {
 #ifdef CONFIG_KMOD
-		/* We do 'best effort'.  If we find a matching module,
-		 * it is loaded.  If not, we don't return an error to
-		 * allow pure userspace<->userspace communication. -HW
-		 */
+	if (!nl_table[protocol].registered) {
 		netlink_unlock_table();
 		request_module("net-pf-%d-proto-%d", PF_NETLINK, protocol);
 		netlink_lock_table();
-#endif
 	}
-	module = nl_table[protocol].module;
-	if (!try_module_get(module))
-		module = NULL;
+#endif
+	if (nl_table[protocol].registered &&
+	    try_module_get(nl_table[protocol].module))
+		module = nl_table[protocol].module;
+	else
+		err = -EPROTONOSUPPORT;
 	netlink_unlock_table();
 
-	sock->ops = &netlink_ops;
-
-	sk = sk_alloc(PF_NETLINK, GFP_KERNEL, &netlink_proto, 1);
-	if (!sk) {
-		module_put(module);
-		return -ENOMEM;
-	}
-
-	sock_init_data(sock, sk);
+	if (err)
+		goto out;
 
-	nlk = nlk_sk(sk);
+	if ((err = __netlink_create(sock, protocol) < 0))
+		goto out_module;
 
-	nlk->module = module;
-	spin_lock_init(&nlk->cb_lock);
-	init_waitqueue_head(&nlk->wait);
-	sk->sk_destruct = netlink_sock_destruct;
+	nlk_sk(sock->sk)->module = module;
+out:
+	return err;
 
-	sk->sk_protocol = protocol;
-	return 0;
+out_module:
+	module_put(module);
+	goto out;
 }
 
 static int netlink_release(struct socket *sock)
@@ -437,6 +451,7 @@ static int netlink_release(struct socket
 	if (nlk->flags & NETLINK_KERNEL_SOCKET) {
 		netlink_table_grab();
 		nl_table[sk->sk_protocol].module = NULL;
+		nl_table[sk->sk_protocol].registered = 0;
 		netlink_table_ungrab();
 	}
 
@@ -1082,7 +1097,7 @@ netlink_kernel_create(int unit, void (*i
 	if (sock_create_lite(PF_NETLINK, SOCK_DGRAM, unit, &sock))
 		return NULL;
 
-	if (netlink_create(sock, unit) < 0)
+	if (__netlink_create(sock, unit) < 0)
 		goto out_sock_release;
 
 	sk = sock->sk;
@@ -1098,6 +1113,7 @@ netlink_kernel_create(int unit, void (*i
 
 	netlink_table_grab();
 	nl_table[unit].module = module;
+	nl_table[unit].registered = 1;
 	netlink_table_ungrab();
 
 	return sk;
skb->stamp conversion missing from latest net-2.6.14
Hi Dave,

I just wanted to make the patch to break compilation for unconverted code for the skb->stamp change and noticed that the patch is missing from your latest net-2.6.14 tree. Is this deliberate or did it get lost?