date:20070322

PROBLEM: can't initiate empty e100 eeprom

2007-03-22 Thread Seved Torstendahl



When we receive new CPU boards, the EEPROM for one Ethernet port is unprogrammed,
i.e. it contains only FF. The first thing we do is run some acceptance tests
and set the MAC address on this port using the tool eepro100-diag.
This worked with earlier kernels, but now the e100.c driver refuses to do this.

The new module parameter eeprom_bad_csum_allow solves half the problem, but
the check is_valid_ether_addr(...) at line 2669 still fails and the device
is then removed (the line number is from our tree, which has other changes).

Suggestion: let the module parameter control this test as in the patch below.
We don't want to use a DOS tool!
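
For reference, the reason a blank EEPROM trips this check can be sketched in user space. An all-FF address is the broadcast address, which has the multicast bit set, and the kernel's is_valid_ether_addr() rejects multicast and all-zero addresses. The helper names below (mac_is_valid, probe_should_abort) and the allow_blank_eeprom flag are illustrative stand-ins, not the driver's code; the flag plays the role of eeprom_bad_csum_allow in the suggested change.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAC_LEN 6 /* same value as the kernel's ETH_ALEN */

/* Multicast and broadcast addresses have the lowest bit of the first octet set. */
bool mac_is_multicast(const uint8_t *addr)
{
    return (addr[0] & 0x01) != 0;
}

/* True when every octet is zero. */
bool mac_is_zero(const uint8_t *addr)
{
    for (int i = 0; i < MAC_LEN; i++)
        if (addr[i] != 0)
            return false;
    return true;
}

/* Mirrors the rule: usable as a station address only if unicast and non-zero. */
bool mac_is_valid(const uint8_t *addr)
{
    return !mac_is_multicast(addr) && !mac_is_zero(addr);
}

/* Hypothetical probe decision: abort only when the override is off. */
bool probe_should_abort(const uint8_t *addr, bool allow_blank_eeprom)
{
    return !allow_blank_eeprom && !mac_is_valid(addr);
}
```

With the override off, probe_should_abort() is true for ff:ff:ff:ff:ff:ff; with it on, the probe would continue so that the address can be programmed afterwards.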

Regards
Seved Torstendahl

--
Seved Torstendahl   Net Insight AB
Senior SW System Architect  Box 42093, S-126 14 Stockholm, SWEDEN
tel: + 46 8 685 04 38   Visiting Address: Västberga Allé 9, Hägersten
fax: + 46 8 685 04 20   http://www.netinsight.net


Index: e100.c
===
RCS file: /cvs/new-sw/kernel/kernel/drivers/net/e100.c,v
retrieving revision 1.2
diff -u -r1.2 e100.c
--- e100.c  23 Oct 2006 13:58:36 -  1.2
+++ e100.c  22 Mar 2007 10:09:22 -
@@ -2666,7 +2666,8 @@

memcpy(netdev->dev_addr, nic->eeprom, ETH_ALEN);
memcpy(netdev->perm_addr, nic->eeprom, ETH_ALEN);
-   if(!is_valid_ether_addr(netdev->perm_addr)) {
+   /* invalid MAC address (ff:ff:ff:ff:ff:ff) accepted, changed at installation */
+   if (!eeprom_bad_csum_allow && (!is_valid_ether_addr(netdev->perm_addr))) {
DPRINTK(PROBE, ERR, "Invalid MAC address from "
"EEPROM, aborting.\n");
err = -EAGAIN;



-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: iproute2-2.6.20-070313 bug ?

2007-03-22 Thread Patrick McHardy

Denys wrote:
> Possibly I discovered a bug, but maybe it is specific to my setup.
>
> In your sources (tc/tc_core.h) I noticed
> #define TIME_UNITS_PER_SEC10 
> When i change it to 
> #define TIME_UNITS_PER_SEC  100.0 
> (the value that was in the sources before)
> everything works fine. Otherwise tbf does not work at all; it drops all
> packets.
> 
> Did anyone test new iproute2 with tbf?

Yes, please send the commands you use.

Re: [PATCH 1/1][PKT_CLS] Avoid multiple tree locks

2007-03-22 Thread Patrick McHardy

jamal wrote:
> On Wed, 2007-21-03 at 15:04 +0100, Patrick McHardy wrote:
> 
>>These (compile tested) patches demonstrate the idea. 
>>
>>The first one
>>lets netlink_kernel_create users specify a mutex that should be
>>held during dump callbacks, the second one uses this for rtnetlink
>>and changes inet_dump_ifaddr for demonstration.
>>
>>A complete patch would allow us to simplify locking in lots of
>>spots, all rtnetlink users currently need to implement extra
>>locking just for the dump functions, and a number of them
>>already get it wrong and seem to rely on the rtnl.
>>
> 
> 
> The mutex is certainly a cleaner approach;
> and a lot of the RCU protection would go away. I like it.

Not as much as I initially thought, but at least we would have
consistent locking for the dump callbacks.

> Knowing you, I sense there's something clever in there that I am
> missing. I don't see how you could get rid of the tree locking,
> since we still need to protect against the data path, no?
> Or are you looking at that as a separate effort?

We can remove qdisc_tree_lock since with this patch all changes
and all tree walking happen under the RTNL. We still need to keep
dev->queue_lock for the data path.

I'll update the patches to include all rtnetlink users and repost
in a day or two.

Re: iproute2-2.6.20-070313 bug ?

2007-03-22 Thread Denys

Dear Sir

I already sent them, but I will copy them here as well.

Normal "patched by me" iproute2 

/sbin/tc qdisc del dev ppp0 root 
/sbin/tc qdisc add dev ppp0 root handle 1: prio 
/sbin/tc qdisc add dev ppp0 parent 1:1 handle 2: tbf buffer 1024kb latency 500ms rate 128kbit peakrate 256kbit minburst 16384
/sbin/tc filter add dev ppp0 parent 1:0 protocol ip prio 10 u32 match ip dst 0.0.0.0/0 flowid 2:1

tc(patched) monitor output 
deleted qdisc prio 1: dev ppp0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc prio 1: dev ppp0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc tbf 2: dev ppp0 parent 1:1 rate 128000bit burst 1024Kb peakrate 256000bit minburst 16Kb lat 500.0ms
filter dev ppp0 parent 1: protocol ip pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 2:1
 match / at 16

VISP-Office ~ #cat /proc/net/psched 
0001 0001 000f4240 03e8 

Now running tc2, which is the "stock" version

/sbin/tc2 qdisc del dev ppp0 root 
/sbin/tc2 qdisc add dev ppp0 root handle 1: prio 
/sbin/tc2 qdisc add dev ppp0 parent 1:1 handle 2: tbf buffer 1024kb latency 500ms rate 128kbit peakrate 256kbit minburst 16384
/sbin/tc2 filter add dev ppp0 parent 1:0 protocol ip prio 10 u32 match ip dst 0.0.0.0/0 flowid 2:1

Monitor output: 
deleted qdisc prio 1: dev ppp0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc prio 1: dev ppp0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc tbf 2: dev ppp0 parent 1:1 rate 128000bit burst 1024Kb peakrate 256000bit minburst 16Kb lat 500.0ms
filter dev ppp0 parent 1: protocol ip pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 2:1
 match / at 16

VISP-Office ~ #cat /proc/net/psched 
0001 0001 000f4240 03e8 

When I run tc2, the stats show that traffic has stopped (tc is the normal
build, tc2 the buggy one):
VISP-Office ~ #tc -s qdisc show dev ppp0 
qdisc ingress :  
Sent 184893 bytes 2311 pkt (dropped 0, overlimits 0 requeues 0) 
rate 0bit 0pps backlog 0b 0p requeues 0 
qdisc prio 1: bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 
Sent 7765 bytes 64 pkt (dropped 0, overlimits 0 requeues 0) 
rate 0bit 0pps backlog 0b 64p requeues 0 
qdisc tbf 2: parent 1:1 rate 128000bit burst 4294932937b peakrate 256000bit minburst 16Kb lat 4.2s
Sent 7765 bytes 64 pkt (dropped 0, overlimits 64 requeues 0) 
rate 0bit 0pps backlog 0b 64p requeues 0 
VISP-Office ~ #tc2 -s qdisc show dev ppp0 
qdisc ingress :  
Sent 186423 bytes 2324 pkt (dropped 0, overlimits 0 requeues 0) 
rate 0bit 0pps backlog 0b 0p requeues 0 
qdisc prio 1: bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 
Sent 8677 bytes 77 pkt (dropped 0, overlimits 0 requeues 0) 
rate 0bit 0pps backlog 0b 77p requeues 0 
qdisc tbf 2: parent 1:1 rate 128000bit burst 4294932937b peakrate 256000bit minburst 16Kb lat 4.2s
Sent 8677 bytes 77 pkt (dropped 0, overlimits 77 requeues 0) 
rate 0bit 0pps backlog 0b 77p requeues 0 

I hope this is enough information. Thanks for your help!
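
A possible reading of the burst and latency values close to 2^32 above, offered as an assumption rather than a diagnosis, is a 32-bit overflow in the time calculation. The sketch below only shows the arithmetic: the 1024kb buffer (taken as 1048576 bytes) at 128kbit (16000 bytes per second) takes about 65.5 seconds to send, which fits in 32 bits when counted in microseconds but not when counted in nanoseconds. The function names and both unit scales are hypothetical; they are not taken from tc.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Time needed to send size_bytes at rate_bytes_per_sec, expressed in
 * units of 1/units_per_sec seconds. Computed in 64 bits so the result
 * itself cannot wrap for the values used here.
 */
uint64_t xmit_time_units(uint64_t size_bytes, uint64_t rate_bytes_per_sec,
                         uint64_t units_per_sec)
{
    return size_bytes * units_per_sec / rate_bytes_per_sec;
}

/* True when the value can be stored in a 32-bit unsigned field. */
bool fits_in_u32(uint64_t value)
{
    return value <= UINT32_MAX;
}
```

For the configuration above, xmit_time_units(1048576, 16000, 1000000) gives 65536000, which fits in 32 bits, while the same call with 1000000000 units per second gives 65536000000, which does not.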



On Thu, 22 Mar 2007 12:23:03 +0100, Patrick McHardy wrote
> Denys wrote:
> > Possibly I discovered a bug, but maybe it is specific to my setup.
> >
> > In your sources (tc/tc_core.h) I noticed
> > #define TIME_UNITS_PER_SEC10 
> > When i change it to 
> > #define TIME_UNITS_PER_SEC  100.0 
> > (the value that was in the sources before)
> > everything works fine. Otherwise tbf does not work at all; it drops all
> > packets.
> > 
> > Did anyone test new iproute2 with tbf?
> 
> Yes, please send the commands you use.


--
Virtual ISP S.A.L.


[PATCH 00/12] [RESEND] rtnetlink message handler registration interface

2007-03-22 Thread Thomas Graf

The existing function names seem to have sentimental value to some
people. Same patches but without changes to the function names.

Introduces an interface to register rtnetlink message handlers
and converts all users of rtnl_links[].


[PATCH 01/12] [RTNL]: Message handler registration interface

2007-03-22 Thread Thomas Graf

This patch adds a new interface to register rtnetlink message
handlers replacing the exported rtnl_links[] array which
required many message handlers to be exported unnecessarily.
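
The idea can be sketched in user space as follows. This is a simplified model under stated assumptions, not the kernel code: the names handler_register and handler_lookup, the table sizes, and the plain int handler signature are all made up for illustration. It shows a per-protocol table allocated on first registration, indexed by message type, with lookups falling back to the unspecified protocol entry.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sizes and constants, chosen only for this sketch. */
#define NPROTO        32
#define PROTO_UNSPEC  0
#define MSG_BASE      16
#define NR_MSGTYPES   8

typedef int (*doit_fn)(int request);

struct handler {
    doit_fn doit;
};

/* One lazily allocated handler table per protocol family. */
static struct handler *handlers[NPROTO];

/* Map a message type to a table index, or -1 if out of range. */
static int msg_index(int msgtype)
{
    int idx = msgtype - MSG_BASE;

    return (idx < 0 || idx >= NR_MSGTYPES) ? -1 : idx;
}

/* Register a handler; returns 0 on success, -1 on bad input or no memory. */
int handler_register(int protocol, int msgtype, doit_fn doit)
{
    int idx = msg_index(msgtype);
    struct handler *tab;

    if (protocol < 0 || protocol >= NPROTO || idx < 0 || doit == NULL)
        return -1;

    tab = handlers[protocol];
    if (tab == NULL) {
        tab = calloc(NR_MSGTYPES, sizeof(*tab));
        if (tab == NULL)
            return -1;
        handlers[protocol] = tab;
    }

    tab[idx].doit = doit;
    return 0;
}

/* Look up a handler, falling back to the PROTO_UNSPEC table. */
doit_fn handler_lookup(int protocol, int msgtype)
{
    int idx = msg_index(msgtype);
    struct handler *tab;

    if (protocol < 0 || protocol >= NPROTO || idx < 0)
        return NULL;

    tab = handlers[protocol];
    if (tab == NULL || tab[idx].doit == NULL)
        tab = handlers[PROTO_UNSPEC];

    return tab ? tab[idx].doit : NULL;
}

/* Two sample handlers used to exercise the table. */
int sample_generic(int request) { return request + 1; }
int sample_inet(int request) { return request * 2; }
```

Because callers only pass function pointers to the registration call, the handlers themselves can stay static in their own files, which is the point of replacing the exported array.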

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/include/net/rtnetlink.h
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ net-2.6.22/include/net/rtnetlink.h  2007-03-22 12:48:20.0 +0100
@@ -0,0 +1,18 @@
+#ifndef __NET_RTNETLINK_H
+#define __NET_RTNETLINK_H
+
+#include 
+#include 
+
+typedef int (*rtnl_doit_func)(struct sk_buff *, struct nlmsghdr *, void *);
+typedef int (*rtnl_dumpit_func)(struct sk_buff *, struct netlink_callback *);
+
+extern int __rtnl_register(int protocol, int msgtype,
+   rtnl_doit_func, rtnl_dumpit_func);
+extern void rtnl_register(int protocol, int msgtype,
+ rtnl_doit_func, rtnl_dumpit_func);
+extern int rtnl_unregister(int protocol, int msgtype);
+extern void rtnl_unregister_all(int protocol);
+extern int rtnl_dump_all(struct sk_buff *skb, struct netlink_callback *cb);
+
+#endif
Index: net-2.6.22/net/core/rtnetlink.c
===
--- net-2.6.22.orig/net/core/rtnetlink.c 2007-03-22 12:48:05.0 +0100
+++ net-2.6.22/net/core/rtnetlink.c 2007-03-22 12:48:20.0 +0100
@@ -50,12 +50,18 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #ifdef CONFIG_NET_WIRELESS_RTNETLINK
 #include 
 #include 
 #endif /* CONFIG_NET_WIRELESS_RTNETLINK */
 
+struct rtnl_link
+{
+   rtnl_doit_func  doit;
+   rtnl_dumpit_func dumpit;
+};
+
 static DEFINE_MUTEX(rtnl_mutex);
 static struct sock *rtnl;
 
@@ -95,7 +101,151 @@ int rtattr_parse(struct rtattr *tb[], in
return 0;
 }
 
-struct rtnetlink_link * rtnetlink_links[NPROTO];
+struct rtnl_link *rtnl_msg_handlers[NPROTO];
+
+static inline int rtm_msgindex(int msgtype)
+{
+   int msgindex = msgtype - RTM_BASE;
+
+   /*
+* msgindex < 0 implies someone tried to register a netlink
+* control code. msgindex >= RTM_NR_MSGTYPES may indicate that
+* the message type has not been added to linux/rtnetlink.h
+*/
+   BUG_ON(msgindex < 0 || msgindex >= RTM_NR_MSGTYPES);
+
+   return msgindex;
+}
+
+static rtnl_doit_func rtnl_get_doit(int protocol, int msgindex)
+{
+   struct rtnl_link *tab;
+
+   tab = rtnl_msg_handlers[protocol];
+   if (tab == NULL || tab[msgindex].doit == NULL)
+       tab = rtnl_msg_handlers[PF_UNSPEC];
+
+   return tab ? tab[msgindex].doit : NULL;
+}
+
+static rtnl_dumpit_func rtnl_get_dumpit(int protocol, int msgindex)
+{
+   struct rtnl_link *tab;
+
+   tab = rtnl_msg_handlers[protocol];
+   if (tab == NULL || tab[msgindex].dumpit == NULL)
+       tab = rtnl_msg_handlers[PF_UNSPEC];
+
+   return tab ? tab[msgindex].dumpit : NULL;
+}
+
+/**
+ * __rtnl_register - Register a rtnetlink message type
+ * @protocol: Protocol family or PF_UNSPEC
+ * @msgtype: rtnetlink message type
+ * @doit: Function pointer called for each request message
+ * @dumpit: Function pointer called for each dump request (NLM_F_DUMP) message
+ *
+ * Registers the specified function pointers (at least one of them has
+ * to be non-NULL) to be called whenever a request message for the
+ * specified protocol family and message type is received.
+ *
+ * The special protocol family PF_UNSPEC may be used to define fallback
+ * function pointers for the case when no entry for the specific protocol
+ * family exists.
+ *
+ * Returns 0 on success or a negative error code.
+ */
+int __rtnl_register(int protocol, int msgtype,
+   rtnl_doit_func doit, rtnl_dumpit_func dumpit)
+{
+   struct rtnl_link *tab;
+   int msgindex;
+
+   BUG_ON(protocol < 0 || protocol >= NPROTO);
+   msgindex = rtm_msgindex(msgtype);
+
+   tab = rtnl_msg_handlers[protocol];
+   if (tab == NULL) {
+   tab = kcalloc(RTM_NR_MSGTYPES, sizeof(*tab), GFP_KERNEL);
+   if (tab == NULL)
+   return -ENOBUFS;
+
+   rtnl_msg_handlers[protocol] = tab;
+   }
+
+   if (doit)
+   tab[msgindex].doit = doit;
+
+   if (dumpit)
+   tab[msgindex].dumpit = dumpit;
+
+   return 0;
+}
+
+EXPORT_SYMBOL_GPL(__rtnl_register);
+
+/**
+ * rtnl_register - Register a rtnetlink message type
+ *
+ * Identical to __rtnl_register() but panics on failure. This is useful
+ * as failure of this function is very unlikely, it can only happen due
+ * to lack of memory when allocating the chain to store all message
+ * handlers for a protocol. Meant for use in init functions where lack
+ * of memory implies no sense in continuing.
+ */
+void rtnl_register(int protocol, int msgtype,
+  rtnl_doit_func doit, rtnl_dumpit_func dumpit)
+{
+   if (__rtnl_register(protocol, msgtype, do

[PATCH 02/12] [NET] link: Use rtnl registration interface

2007-03-22 Thread Thomas Graf

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/net/core/rtnetlink.c
===
--- net-2.6.22.orig/net/core/rtnetlink.c 2007-03-22 12:48:20.0 +0100
+++ net-2.6.22/net/core/rtnetlink.c 2007-03-22 12:49:31.0 +0100
@@ -960,9 +960,6 @@ static void rtnetlink_rcv(struct sock *s
 
 static struct rtnetlink_link link_rtnetlink_table[RTM_NR_MSGTYPES] =
 {
-   [RTM_GETLINK - RTM_BASE] = { .doit   = rtnl_getlink,
-.dumpit = rtnl_dump_ifinfo  },
-   [RTM_SETLINK - RTM_BASE] = { .doit   = rtnl_setlink  },
[RTM_GETADDR - RTM_BASE] = { .dumpit = rtnl_dump_all },
[RTM_GETROUTE- RTM_BASE] = { .dumpit = rtnl_dump_all },
[RTM_NEWNEIGH- RTM_BASE] = { .doit   = neigh_add },
@@ -1023,8 +1020,9 @@ void __init rtnetlink_init(void)
panic("rtnetlink_init: cannot initialize rtnetlink\n");
netlink_set_nonroot(NETLINK_ROUTE, NL_NONROOT_RECV);
register_netdevice_notifier(&rtnetlink_dev_notifier);
-   rtnetlink_links[PF_UNSPEC] = link_rtnetlink_table;
-   rtnetlink_links[PF_PACKET] = link_rtnetlink_table;
+
+   rtnl_register(PF_UNSPEC, RTM_GETLINK, rtnl_getlink, rtnl_dump_ifinfo);
+   rtnl_register(PF_UNSPEC, RTM_SETLINK, rtnl_setlink, NULL);
 }
 
 EXPORT_SYMBOL(__rta_fill);

--


[PATCH 03/12] [NEIGH]: Use rtnl registration interface

2007-03-22 Thread Thomas Graf

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/include/net/neighbour.h
===
--- net-2.6.22.orig/include/net/neighbour.h 2007-03-22 12:48:05.0 +0100
+++ net-2.6.22/include/net/neighbour.h  2007-03-22 12:49:47.0 +0100
@@ -24,6 +24,7 @@
 
 #include 
 #include 
+#include 
 
 #define NUD_IN_TIMER   (NUD_INCOMPLETE|NUD_REACHABLE|NUD_DELAY|NUD_PROBE)
 #define NUD_VALID   (NUD_PERMANENT|NUD_NOARP|NUD_REACHABLE|NUD_PROBE|NUD_STALE|NUD_DELAY)
@@ -213,16 +214,7 @@ extern void pneigh_enqueue(struct neig
 extern struct pneigh_entry *pneigh_lookup(struct neigh_table *tbl, const void *key, struct net_device *dev, int creat);
 extern int pneigh_delete(struct neigh_table *tbl, const void *key, struct net_device *dev);
 
-struct netlink_callback;
-struct nlmsghdr;
-extern int neigh_dump_info(struct sk_buff *skb, struct netlink_callback *cb);
-extern int neigh_add(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg);
-extern int neigh_delete(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg);
 extern void neigh_app_ns(struct neighbour *n);
-
-extern int neightbl_dump_info(struct sk_buff *skb, struct netlink_callback *cb);
-extern int neightbl_set(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg);
-
 extern void neigh_for_each(struct neigh_table *tbl, void (*cb)(struct neighbour *, void *), void *cookie);
 extern void __neigh_for_each_release(struct neigh_table *tbl, int (*cb)(struct neighbour *));
 extern void pneigh_for_each(struct neigh_table *tbl, void (*cb)(struct pneigh_entry *));
Index: net-2.6.22/net/core/neighbour.c
===
--- net-2.6.22.orig/net/core/neighbour.c 2007-03-22 12:48:05.0 +0100
+++ net-2.6.22/net/core/neighbour.c 2007-03-22 12:51:12.0 +0100
@@ -1435,7 +1435,7 @@ int neigh_table_clear(struct neigh_table
return 0;
 }
 
-int neigh_delete(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
+static int neigh_delete(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 {
struct ndmsg *ndm;
struct nlattr *dst_attr;
@@ -1500,7 +1500,7 @@ out:
return err;
 }
 
-int neigh_add(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
+static int neigh_add(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 {
struct ndmsg *ndm;
struct nlattr *tb[NDA_MAX+1];
@@ -1780,7 +1780,7 @@ static struct nla_policy nl_ntbl_parm_po
[NDTPA_LOCKTIME]= { .type = NLA_U64 },
 };
 
-int neightbl_set(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
+static int neightbl_set(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 {
struct neigh_table *tbl;
struct ndtmsg *ndtmsg;
@@ -1904,7 +1904,7 @@ errout:
return err;
 }
 
-int neightbl_dump_info(struct sk_buff *skb, struct netlink_callback *cb)
+static int neightbl_dump_info(struct sk_buff *skb, struct netlink_callback *cb)
 {
int family, tidx, nidx = 0;
int tbl_skip = cb->args[0];
@@ -2028,7 +2028,7 @@ out:
return rc;
 }
 
-int neigh_dump_info(struct sk_buff *skb, struct netlink_callback *cb)
+static int neigh_dump_info(struct sk_buff *skb, struct netlink_callback *cb)
 {
struct neigh_table *tbl;
int t, family, s_t;
@@ -2737,14 +2737,26 @@ void neigh_sysctl_unregister(struct neig
 
 #endif /* CONFIG_SYSCTL */
 
+static int __init neigh_init(void)
+{
+   rtnl_register(PF_UNSPEC, RTM_NEWNEIGH, neigh_add, NULL);
+   rtnl_register(PF_UNSPEC, RTM_DELNEIGH, neigh_delete, NULL);
+   rtnl_register(PF_UNSPEC, RTM_GETNEIGH, NULL, neigh_dump_info);
+
+   rtnl_register(PF_UNSPEC, RTM_GETNEIGHTBL, NULL, neightbl_dump_info);
+   rtnl_register(PF_UNSPEC, RTM_SETNEIGHTBL, neightbl_set, NULL);
+
+   return 0;
+}
+
+subsys_initcall(neigh_init);
+
 EXPORT_SYMBOL(__neigh_event_send);
 EXPORT_SYMBOL(neigh_changeaddr);
 EXPORT_SYMBOL(neigh_compat_output);
 EXPORT_SYMBOL(neigh_connected_output);
 EXPORT_SYMBOL(neigh_create);
-EXPORT_SYMBOL(neigh_delete);
 EXPORT_SYMBOL(neigh_destroy);
-EXPORT_SYMBOL(neigh_dump_info);
 EXPORT_SYMBOL(neigh_event_ns);
 EXPORT_SYMBOL(neigh_ifdown);
 EXPORT_SYMBOL(neigh_lookup);
Index: net-2.6.22/net/core/rtnetlink.c
===
--- net-2.6.22.orig/net/core/rtnetlink.c 2007-03-22 12:49:31.0 +0100
+++ net-2.6.22/net/core/rtnetlink.c 2007-03-22 12:49:47.0 +0100
@@ -962,16 +962,11 @@ static struct rtnetlink_link link_rtnetl
 {
[RTM_GETADDR - RTM_BASE] = { .dumpit = rtnl_dump_all },
[RTM_GETROUTE- RTM_BASE] = { .dumpit = rtnl_dump_all },
-   [RTM_NEWNEIGH- RTM_BASE] = { .doit   = neigh_add },
-   [RTM_DELNEIGH- RTM_BASE] = { .doit   = neigh_delete  },
-   [RTM_GETNEIGH- RTM_BASE] = { .dumpit = neigh_d

[PATCH 04/12] [NET] rules: Use rtnl registration interface

2007-03-22 Thread Thomas Graf

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/net/core/fib_rules.c
===
--- net-2.6.22.orig/net/core/fib_rules.c 2007-03-22 12:48:05.0 +0100
+++ net-2.6.22/net/core/fib_rules.c 2007-03-22 12:52:34.0 +0100
@@ -152,7 +152,7 @@ out:
 
 EXPORT_SYMBOL_GPL(fib_rules_lookup);
 
-int fib_nl_newrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
+static int fib_nl_newrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
 {
struct fib_rule_hdr *frh = nlmsg_data(nlh);
struct fib_rules_ops *ops = NULL;
@@ -239,7 +239,7 @@ errout:
return err;
 }
 
-int fib_nl_delrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
+static int fib_nl_delrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
 {
struct fib_rule_hdr *frh = nlmsg_data(nlh);
struct fib_rules_ops *ops = NULL;
@@ -471,6 +471,10 @@ static struct notifier_block fib_rules_n
 
 static int __init fib_rules_init(void)
 {
+   rtnl_register(PF_UNSPEC, RTM_NEWRULE, fib_nl_newrule, NULL);
+   rtnl_register(PF_UNSPEC, RTM_DELRULE, fib_nl_delrule, NULL);
+   rtnl_register(PF_UNSPEC, RTM_GETRULE, NULL, rtnl_dump_all);
+
return register_netdevice_notifier(&fib_rules_notifier);
 }
 
Index: net-2.6.22/include/net/fib_rules.h
===
--- net-2.6.22.orig/include/net/fib_rules.h 2007-03-22 12:48:05.0 +0100
+++ net-2.6.22/include/net/fib_rules.h  2007-03-22 12:51:31.0 +0100
@@ -5,7 +5,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 struct fib_rule
 {
@@ -98,10 +98,6 @@ extern int   fib_rules_lookup(struct fib
 struct flowi *, int flags,
 struct fib_lookup_arg *);
 
-extern int fib_nl_newrule(struct sk_buff *,
-  struct nlmsghdr *, void *);
-extern int fib_nl_delrule(struct sk_buff *,
-  struct nlmsghdr *, void *);
 extern int fib_rules_dump(struct sk_buff *,
   struct netlink_callback *, int);
 #endif
Index: net-2.6.22/net/core/rtnetlink.c
===
--- net-2.6.22.orig/net/core/rtnetlink.c 2007-03-22 12:49:47.0 +0100
+++ net-2.6.22/net/core/rtnetlink.c 2007-03-22 12:51:31.0 +0100
@@ -962,11 +962,6 @@ static struct rtnetlink_link link_rtnetl
 {
[RTM_GETADDR - RTM_BASE] = { .dumpit = rtnl_dump_all },
[RTM_GETROUTE- RTM_BASE] = { .dumpit = rtnl_dump_all },
-#ifdef CONFIG_FIB_RULES
-   [RTM_NEWRULE - RTM_BASE] = { .doit   = fib_nl_newrule},
-   [RTM_DELRULE - RTM_BASE] = { .doit   = fib_nl_delrule},
-#endif
-   [RTM_GETRULE - RTM_BASE] = { .dumpit = rtnl_dump_all },
 };
 
 static int rtnetlink_event(struct notifier_block *this, unsigned long event, void *ptr)

--


[PATCH 05/12] [IPv4]: Use rtnl registration interface

2007-03-22 Thread Thomas Graf

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/include/net/ip_fib.h
===
--- net-2.6.22.orig/include/net/ip_fib.h 2007-03-22 12:48:05.0 +0100
+++ net-2.6.22/include/net/ip_fib.h 2007-03-22 12:53:03.0 +0100
@@ -215,10 +215,6 @@ extern void fib_select_default(const str
 /* Exported by fib_frontend.c */
 extern struct nla_policy rtm_ipv4_policy[];
 extern void ip_fib_init(void);
-extern int inet_rtm_delroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg);
-extern int inet_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg);
-extern int inet_rtm_getroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg);
-extern int inet_dump_fib(struct sk_buff *skb, struct netlink_callback *cb);
 extern int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif,
                                struct net_device *dev, __be32 *spec_dst, u32 *itag);
 extern void fib_select_multipath(const struct flowi *flp, struct fib_result *res);
@@ -235,8 +231,6 @@ extern __be32  __fib_res_prefsrc(struct 
 extern struct fib_table *fib_hash_init(u32 id);
 
 #ifdef CONFIG_IP_MULTIPLE_TABLES
-extern int fib4_rules_dump(struct sk_buff *skb, struct netlink_callback *cb);
-
 extern void __init fib4_rules_init(void);
 
 #ifdef CONFIG_NET_CLS_ROUTE
Index: net-2.6.22/net/ipv4/devinet.c
===
--- net-2.6.22.orig/net/ipv4/devinet.c  2007-03-22 12:48:05.0 +0100
+++ net-2.6.22/net/ipv4/devinet.c   2007-03-22 12:54:49.0 +0100
@@ -48,7 +48,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -62,7 +61,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 struct ipv4_devconf ipv4_devconf = {
.accept_redirects = 1,
@@ -1241,19 +1240,6 @@ errout:
rtnl_set_sk_err(RTNLGRP_IPV4_IFADDR, err);
 }
 
-static struct rtnetlink_link inet_rtnetlink_table[RTM_NR_MSGTYPES] = {
-   [RTM_NEWADDR  - RTM_BASE] = { .doit = inet_rtm_newaddr, },
-   [RTM_DELADDR  - RTM_BASE] = { .doit = inet_rtm_deladdr, },
-   [RTM_GETADDR  - RTM_BASE] = { .dumpit   = inet_dump_ifaddr, },
-   [RTM_NEWROUTE - RTM_BASE] = { .doit = inet_rtm_newroute,},
-   [RTM_DELROUTE - RTM_BASE] = { .doit = inet_rtm_delroute,},
-   [RTM_GETROUTE - RTM_BASE] = { .doit = inet_rtm_getroute,
- .dumpit   = inet_dump_fib,},
-#ifdef CONFIG_IP_MULTIPLE_TABLES
-   [RTM_GETRULE  - RTM_BASE] = { .dumpit   = fib4_rules_dump,  },
-#endif
-};
-
 #ifdef CONFIG_SYSCTL
 
 void inet_forward_change(void)
@@ -1636,7 +1622,10 @@ void __init devinet_init(void)
 {
register_gifconf(PF_INET, inet_gifconf);
register_netdevice_notifier(&ip_netdev_notifier);
-   rtnetlink_links[PF_INET] = inet_rtnetlink_table;
+
+   rtnl_register(PF_INET, RTM_NEWADDR, inet_rtm_newaddr, NULL);
+   rtnl_register(PF_INET, RTM_DELADDR, inet_rtm_deladdr, NULL);
+   rtnl_register(PF_INET, RTM_GETADDR, NULL, inet_dump_ifaddr);
 #ifdef CONFIG_SYSCTL
devinet_sysctl.sysctl_header =
register_sysctl_table(devinet_sysctl.devinet_root_dir);
Index: net-2.6.22/net/ipv4/fib_rules.c
===
--- net-2.6.22.orig/net/ipv4/fib_rules.c 2007-03-22 12:48:05.0 +0100
+++ net-2.6.22/net/ipv4/fib_rules.c 2007-03-22 13:23:53.0 +0100
@@ -277,7 +277,7 @@ nla_put_failure:
return -ENOBUFS;
 }
 
-int fib4_rules_dump(struct sk_buff *skb, struct netlink_callback *cb)
+static int fib4_rule_dump(struct sk_buff *skb, struct netlink_callback *cb)
 {
return fib_rules_dump(skb, cb, AF_INET);
 }
@@ -329,4 +329,6 @@ void __init fib4_rules_init(void)
list_add_tail(&default_rule.common.list, &fib4_rules);
 
fib_rules_register(&fib4_rules_ops);
+
+   rtnl_register(PF_INET, RTM_GETRULE, NULL, fib4_rule_dump);
 }
Index: net-2.6.22/net/ipv4/fib_frontend.c
===
--- net-2.6.22.orig/net/ipv4/fib_frontend.c 2007-03-22 12:48:05.0 +0100
+++ net-2.6.22/net/ipv4/fib_frontend.c  2007-03-22 12:57:04.0 +0100
@@ -34,7 +34,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
@@ -46,6 +45,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define FFprint(a...) printk(KERN_DEBUG a)
 
@@ -535,7 +535,7 @@ errout:
return err;
 }
 
-int inet_rtm_delroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
+static int inet_rtm_delroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
 {
struct fib_config cfg;
struct fib_table *tb;
@@ -556,7 +556,7 @@ errout:
return err;
 }
 
-int inet_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
+static int inet_rtm_newroute(struct sk_buff

[PATCH 06/12] [PKT_SCHED] qdisc: Use rtnl registration interface

2007-03-22 Thread Thomas Graf

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/include/net/sch_generic.h
===
--- net-2.6.22.orig/include/net/sch_generic.h   2007-03-22 13:23:14.0 +0100
+++ net-2.6.22/include/net/sch_generic.h    2007-03-22 13:23:57.0 +0100
@@ -5,10 +5,10 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
+#include 
 
 struct Qdisc_ops;
 struct qdisc_walker;
Index: net-2.6.22/net/sched/sch_api.c
===
--- net-2.6.22.orig/net/sched/sch_api.c 2007-03-22 13:23:14.0 +0100
+++ net-2.6.22/net/sched/sch_api.c  2007-03-22 13:23:57.0 +0100
@@ -27,7 +27,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -1239,29 +1238,17 @@ static const struct file_operations psch
 
 static int __init pktsched_init(void)
 {
-   struct rtnetlink_link *link_p;
-
-   link_p = rtnetlink_links[PF_UNSPEC];
-
-   /* Setup rtnetlink links. It is made here to avoid
-  exporting large number of public symbols.
-*/
-
-   if (link_p) {
-   link_p[RTM_NEWQDISC-RTM_BASE].doit = tc_modify_qdisc;
-   link_p[RTM_DELQDISC-RTM_BASE].doit = tc_get_qdisc;
-   link_p[RTM_GETQDISC-RTM_BASE].doit = tc_get_qdisc;
-   link_p[RTM_GETQDISC-RTM_BASE].dumpit = tc_dump_qdisc;
-   link_p[RTM_NEWTCLASS-RTM_BASE].doit = tc_ctl_tclass;
-   link_p[RTM_DELTCLASS-RTM_BASE].doit = tc_ctl_tclass;
-   link_p[RTM_GETTCLASS-RTM_BASE].doit = tc_ctl_tclass;
-   link_p[RTM_GETTCLASS-RTM_BASE].dumpit = tc_dump_tclass;
-   }
-
register_qdisc(&pfifo_qdisc_ops);
register_qdisc(&bfifo_qdisc_ops);
proc_net_fops_create("psched", 0, &psched_fops);
 
+   rtnl_register(PF_UNSPEC, RTM_NEWQDISC, tc_modify_qdisc, NULL);
+   rtnl_register(PF_UNSPEC, RTM_DELQDISC, tc_get_qdisc, NULL);
+   rtnl_register(PF_UNSPEC, RTM_GETQDISC, tc_get_qdisc, tc_dump_qdisc);
+   rtnl_register(PF_UNSPEC, RTM_NEWTCLASS, tc_ctl_tclass, NULL);
+   rtnl_register(PF_UNSPEC, RTM_DELTCLASS, tc_ctl_tclass, NULL);
+   rtnl_register(PF_UNSPEC, RTM_GETTCLASS, tc_ctl_tclass, tc_dump_tclass);
+
return 0;
 }
 

--


[PATCH 07/12] [PKT_SCHED] cls: Use rtnl registration interface

2007-03-22 Thread Thomas Graf

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/net/sched/cls_api.c
===
--- net-2.6.22.orig/net/sched/cls_api.c 2007-03-22 13:23:14.0 +0100
+++ net-2.6.22/net/sched/cls_api.c  2007-03-22 13:23:58.0 +0100
@@ -29,7 +29,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -616,18 +615,11 @@ rtattr_failure: __attribute__ ((unused))
 
 static int __init tc_filter_init(void)
 {
-   struct rtnetlink_link *link_p = rtnetlink_links[PF_UNSPEC];
+   rtnl_register(PF_UNSPEC, RTM_NEWTFILTER, tc_ctl_tfilter, NULL);
+   rtnl_register(PF_UNSPEC, RTM_DELTFILTER, tc_ctl_tfilter, NULL);
+   rtnl_register(PF_UNSPEC, RTM_GETTFILTER, tc_ctl_tfilter,
+tc_dump_tfilter);
 
-   /* Setup rtnetlink links. It is made here to avoid
-  exporting large number of public symbols.
-*/
-
-   if (link_p) {
-   link_p[RTM_NEWTFILTER-RTM_BASE].doit = tc_ctl_tfilter;
-   link_p[RTM_DELTFILTER-RTM_BASE].doit = tc_ctl_tfilter;
-   link_p[RTM_GETTFILTER-RTM_BASE].doit = tc_ctl_tfilter;
-   link_p[RTM_GETTFILTER-RTM_BASE].dumpit = tc_dump_tfilter;
-   }
return 0;
 }
 

--


[PATCH 08/12] [PKT_SCHED] act: Use rtnl registration interface

2007-03-22 Thread Thomas Graf

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/net/sched/act_api.c
===
--- net-2.6.22.orig/net/sched/act_api.c 2007-03-22 13:23:13.0 +0100
+++ net-2.6.22/net/sched/act_api.c  2007-03-22 13:23:59.0 +0100
@@ -25,7 +25,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -1077,14 +1076,9 @@ nlmsg_failure:
 
 static int __init tc_action_init(void)
 {
-   struct rtnetlink_link *link_p = rtnetlink_links[PF_UNSPEC];
-
-   if (link_p) {
-   link_p[RTM_NEWACTION-RTM_BASE].doit = tc_ctl_action;
-   link_p[RTM_DELACTION-RTM_BASE].doit = tc_ctl_action;
-   link_p[RTM_GETACTION-RTM_BASE].doit = tc_ctl_action;
-   link_p[RTM_GETACTION-RTM_BASE].dumpit = tc_dump_action;
-   }
+   rtnl_register(PF_UNSPEC, RTM_NEWACTION, tc_ctl_action, NULL);
+   rtnl_register(PF_UNSPEC, RTM_DELACTION, tc_ctl_action, NULL);
+   rtnl_register(PF_UNSPEC, RTM_GETACTION, tc_ctl_action, tc_dump_action);
 
return 0;
 }

--


[PATCH 09/12] [DECNet]: Use rtnl registration interface

2007-03-22 Thread Thomas Graf

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/include/net/dn_fib.h
===
--- net-2.6.22.orig/include/net/dn_fib.h 2007-03-22 13:23:13.0 +0100
+++ net-2.6.22/include/net/dn_fib.h 2007-03-22 13:24:01.0 +0100
@@ -148,17 +148,8 @@ extern void dn_fib_rules_cleanup(void);
 extern unsigned dnet_addr_type(__le16 addr);
 extern int dn_fib_lookup(struct flowi *fl, struct dn_fib_res *res);
 
-/*
- * rtnetlink interface
- */
-extern int dn_fib_rtm_delroute(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg);
-extern int dn_fib_rtm_newroute(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg);
 extern int dn_fib_dump(struct sk_buff *skb, struct netlink_callback *cb);
 
-extern int dn_fib_rtm_delrule(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg);
-extern int dn_fib_rtm_newrule(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg);
-extern int dn_fib_dump_rules(struct sk_buff *skb, struct netlink_callback *cb);
-
 extern void dn_fib_free_info(struct dn_fib_info *fi);
 
 static inline void dn_fib_info_put(struct dn_fib_info *fi)
Index: net-2.6.22/net/decnet/dn_rules.c
===
--- net-2.6.22.orig/net/decnet/dn_rules.c 2007-03-22 13:23:13.0 +0100
+++ net-2.6.22/net/decnet/dn_rules.c 2007-03-22 13:24:01.0 +0100
@@ -241,7 +241,7 @@ static u32 dn_fib_rule_default_pref(void
return 0;
 }
 
-int dn_fib_dump_rules(struct sk_buff *skb, struct netlink_callback *cb)
+static int dn_fib_dump_rules(struct sk_buff *skb, struct netlink_callback *cb)
 {
return fib_rules_dump(skb, cb, AF_DECnet);
 }
@@ -265,10 +265,12 @@ void __init dn_fib_rules_init(void)
 {
list_add_tail(&default_rule.common.list, &dn_fib_rules);
fib_rules_register(&dn_fib_rules_ops);
+   rtnl_register(PF_DECnet, RTM_GETRULE, NULL, dn_fib_dump_rules);
 }
 
 void __exit dn_fib_rules_cleanup(void)
 {
+   rtnl_unregister(PF_DECnet, RTM_GETRULE);
fib_rules_unregister(&dn_fib_rules_ops);
 }
 
Index: net-2.6.22/net/decnet/dn_fib.c
===
--- net-2.6.22.orig/net/decnet/dn_fib.c 2007-03-22 13:23:13.0 +0100
+++ net-2.6.22/net/decnet/dn_fib.c  2007-03-22 13:24:01.0 +0100
@@ -501,7 +501,7 @@ static int dn_fib_check_attr(struct rtms
return 0;
 }
 
-int dn_fib_rtm_delroute(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
+static int dn_fib_rtm_delroute(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 {
struct dn_fib_table *tb;
struct rtattr **rta = arg;
@@ -517,7 +517,7 @@ int dn_fib_rtm_delroute(struct sk_buff *
return -ESRCH;
 }
 
-int dn_fib_rtm_newroute(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
+static int dn_fib_rtm_newroute(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 {
struct dn_fib_table *tb;
struct rtattr **rta = arg;
@@ -745,11 +745,13 @@ void __exit dn_fib_cleanup(void)
 
 void __init dn_fib_init(void)
 {
-
dn_fib_table_init();
dn_fib_rules_init();
 
register_dnaddr_notifier(&dn_fib_dnaddr_notifier);
+
+   rtnl_register(PF_DECnet, RTM_NEWROUTE, dn_fib_rtm_newroute, NULL);
+   rtnl_register(PF_DECnet, RTM_DELROUTE, dn_fib_rtm_delroute, NULL);
 }
 
 
Index: net-2.6.22/net/decnet/af_decnet.c
===
--- net-2.6.22.orig/net/decnet/af_decnet.c 2007-03-22 13:23:13.0 +0100
+++ net-2.6.22/net/decnet/af_decnet.c   2007-03-22 13:24:01.0 +0100
@@ -2413,6 +2413,7 @@ module_init(decnet_init);
 static void __exit decnet_exit(void)
 {
sock_unregister(AF_DECnet);
+   rtnl_unregister_all(PF_DECnet);
dev_remove_pack(&dn_dix_packet_type);
 
dn_unregister_sysctl();
Index: net-2.6.22/net/decnet/dn_dev.c
===
--- net-2.6.22.orig/net/decnet/dn_dev.c 2007-03-22 13:23:13.0 +0100
+++ net-2.6.22/net/decnet/dn_dev.c  2007-03-22 13:24:01.0 +0100
@@ -1447,24 +1447,6 @@ static const struct file_operations dn_d
 
 #endif /* CONFIG_PROC_FS */
 
-static struct rtnetlink_link dnet_rtnetlink_table[RTM_NR_MSGTYPES] =
-{
-   [RTM_NEWADDR  - RTM_BASE] = { .doit = dn_nl_newaddr,},
-   [RTM_DELADDR  - RTM_BASE] = { .doit = dn_nl_deladdr,},
-   [RTM_GETADDR  - RTM_BASE] = { .dumpit   = dn_nl_dump_ifaddr,},
-#ifdef CONFIG_DECNET_ROUTER
-   [RTM_NEWROUTE - RTM_BASE] = { .doit = dn_fib_rtm_newroute,  },
-   [RTM_DELROUTE - RTM_BASE] = { .doit = dn_fib_rtm_delroute,  },
-   [RTM_GETROUTE - RTM_BASE] = { .doit = dn_cache_getroute,
- .dumpit   = dn_fib_dump,  },
-   [RTM_GETRULE  - RTM_BASE] = { .dumpit   = dn_fib_dump_rules,},
-#else
-   [RTM_GETROUTE - RTM_BASE] = {

[PATCH 10/12] [IPv6]: Use rtnl registration interface

2007-03-22 Thread Thomas Graf

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/include/net/ip6_fib.h
===
--- net-2.6.22.orig/include/net/ip6_fib.h 2007-03-22 13:23:12.0 +0100
+++ net-2.6.22/include/net/ip6_fib.h 2007-03-22 13:24:02.0 +0100
@@ -218,8 +218,6 @@ extern void fib6_init(void);
 
 extern void fib6_rules_init(void);
 extern void fib6_rules_cleanup(void);
-extern int fib6_rules_dump(struct sk_buff *,
-   struct netlink_callback *);
 
 #endif
 #endif
Index: net-2.6.22/net/ipv6/addrconf.c
===
--- net-2.6.22.orig/net/ipv6/addrconf.c 2007-03-22 13:23:12.0 +0100
+++ net-2.6.22/net/ipv6/addrconf.c  2007-03-22 13:24:02.0 +0100
@@ -3607,23 +3607,6 @@ errout:
rtnl_set_sk_err(RTNLGRP_IPV6_PREFIX, err);
 }
 
-static struct rtnetlink_link inet6_rtnetlink_table[RTM_NR_MSGTYPES] = {
-   [RTM_GETLINK - RTM_BASE] = { .dumpit= inet6_dump_ifinfo, },
-   [RTM_NEWADDR - RTM_BASE] = { .doit  = inet6_rtm_newaddr, },
-   [RTM_DELADDR - RTM_BASE] = { .doit  = inet6_rtm_deladdr, },
-   [RTM_GETADDR - RTM_BASE] = { .doit  = inet6_rtm_getaddr,
-.dumpit= inet6_dump_ifaddr, },
-   [RTM_GETMULTICAST - RTM_BASE] = { .dumpit = inet6_dump_ifmcaddr, },
-   [RTM_GETANYCAST - RTM_BASE] = { .dumpit = inet6_dump_ifacaddr, },
-   [RTM_NEWROUTE - RTM_BASE] = { .doit = inet6_rtm_newroute, },
-   [RTM_DELROUTE - RTM_BASE] = { .doit = inet6_rtm_delroute, },
-   [RTM_GETROUTE - RTM_BASE] = { .doit = inet6_rtm_getroute,
- .dumpit   = inet6_dump_fib, },
-#ifdef CONFIG_IPV6_MULTIPLE_TABLES
-   [RTM_GETRULE  - RTM_BASE] = { .dumpit   = fib6_rules_dump,   },
-#endif
-};
-
 static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp)
 {
inet6_ifa_notify(event ? : RTM_NEWADDR, ifp);
@@ -4135,7 +4118,18 @@ int __init addrconf_init(void)
register_netdevice_notifier(&ipv6_dev_notf);
 
addrconf_verify(0);
-   rtnetlink_links[PF_INET6] = inet6_rtnetlink_table;
+
+   err = __rtnl_register(PF_INET6, RTM_GETLINK, NULL, inet6_dump_ifinfo);
+   if (err < 0)
+   goto errout;
+
+   /* Only the first call to __rtnl_register can fail */
+   __rtnl_register(PF_INET6, RTM_NEWADDR, inet6_rtm_newaddr, NULL);
+   __rtnl_register(PF_INET6, RTM_DELADDR, inet6_rtm_deladdr, NULL);
+   __rtnl_register(PF_INET6, RTM_GETADDR, inet6_rtm_getaddr, inet6_dump_ifaddr);
+   __rtnl_register(PF_INET6, RTM_GETMULTICAST, NULL, inet6_dump_ifmcaddr);
+   __rtnl_register(PF_INET6, RTM_GETANYCAST, NULL, inet6_dump_ifacaddr);
+
 #ifdef CONFIG_SYSCTL
addrconf_sysctl.sysctl_header =
register_sysctl_table(addrconf_sysctl.addrconf_root_dir);
@@ -4143,6 +4137,10 @@ int __init addrconf_init(void)
 #endif
 
return 0;
+errout:
+   unregister_netdevice_notifier(&ipv6_dev_notf);
+
+   return err;
 }
 
 void __exit addrconf_cleanup(void)
@@ -4154,7 +4152,6 @@ void __exit addrconf_cleanup(void)
 
unregister_netdevice_notifier(&ipv6_dev_notf);
 
-   rtnetlink_links[PF_INET6] = NULL;
 #ifdef CONFIG_SYSCTL
addrconf_sysctl_unregister(&ipv6_devconf_dflt);
addrconf_sysctl_unregister(&ipv6_devconf);
Index: net-2.6.22/net/ipv6/fib6_rules.c
===
--- net-2.6.22.orig/net/ipv6/fib6_rules.c 2007-03-22 13:23:12.0 +0100
+++ net-2.6.22/net/ipv6/fib6_rules.c 2007-03-22 13:24:02.0 +0100
@@ -221,7 +221,7 @@ nla_put_failure:
return -ENOBUFS;
 }
 
-int fib6_rules_dump(struct sk_buff *skb, struct netlink_callback *cb)
+static int fib6_rules_dump(struct sk_buff *skb, struct netlink_callback *cb)
 {
return fib_rules_dump(skb, cb, AF_INET6);
 }
@@ -259,9 +259,11 @@ void __init fib6_rules_init(void)
list_add_tail(&main_rule.common.list, &fib6_rules);
 
fib_rules_register(&fib6_rules_ops);
+   __rtnl_register(PF_INET6, RTM_GETRULE, NULL, fib6_rules_dump);
 }
 
 void fib6_rules_cleanup(void)
 {
+   rtnl_unregister(PF_INET6, RTM_GETRULE);
fib_rules_unregister(&fib6_rules_ops);
 }
Index: net-2.6.22/include/net/ip6_route.h
===
--- net-2.6.22.orig/include/net/ip6_route.h 2007-03-22 13:23:12.0 +0100
+++ net-2.6.22/include/net/ip6_route.h  2007-03-22 13:24:02.0 +0100
@@ -116,12 +116,7 @@ extern void
rt6_pmtu_discovery(struct 
   struct net_device *dev,
   u32 pmtu);
 
-struct nlmsghdr;
 struct netlink_callback;
-extern int

[PATCH 11/12] [BRIDGE]: Use rtnl registration interface

2007-03-22 Thread Thomas Graf

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/net/bridge/br.c
===
--- net-2.6.22.orig/net/bridge/br.c 2007-03-22 13:23:11.0 +0100
+++ net-2.6.22/net/bridge/br.c  2007-03-22 13:24:03.0 +0100
@@ -47,7 +47,10 @@ static int __init br_init(void)
if (err)
goto err_out2;
 
-   br_netlink_init();
+   err = br_netlink_init();
+   if (err)
+   goto err_out3;
+
brioctl_set(br_ioctl_deviceless_stub);
br_handle_frame_hook = br_handle_frame;
 
@@ -55,7 +58,8 @@ static int __init br_init(void)
br_fdb_put_hook = br_fdb_put;
 
return 0;
-
+err_out3:
+   unregister_netdevice_notifier(&br_device_notifier);
 err_out2:
br_netfilter_fini();
 err_out1:
Index: net-2.6.22/net/bridge/br_netlink.c
===
--- net-2.6.22.orig/net/bridge/br_netlink.c 2007-03-22 13:23:11.0 +0100
+++ net-2.6.22/net/bridge/br_netlink.c  2007-03-22 13:24:03.0 +0100
@@ -11,8 +11,7 @@
  */
 
 #include 
-#include 
-#include 
+#include 
 #include "br_private.h"
 
 static inline size_t br_nlmsg_size(void)
@@ -179,18 +178,19 @@ static int br_rtm_setlink(struct sk_buff
 }
 
 
-static struct rtnetlink_link bridge_rtnetlink_table[RTM_NR_MSGTYPES] = {
-   [RTM_GETLINK - RTM_BASE] = { .dumpit= br_dump_ifinfo, },
-   [RTM_SETLINK - RTM_BASE] = { .doit  = br_rtm_setlink, },
-};
-
-void __init br_netlink_init(void)
+int __init br_netlink_init(void)
 {
-   rtnetlink_links[PF_BRIDGE] = bridge_rtnetlink_table;
+   if (__rtnl_register(PF_BRIDGE, RTM_GETLINK, NULL, br_dump_ifinfo))
+   return -ENOBUFS;
+
+   /* Only the first call to __rtnl_register can fail */
+   __rtnl_register(PF_BRIDGE, RTM_SETLINK, br_rtm_setlink, NULL);
+
+   return 0;
 }
 
 void __exit br_netlink_fini(void)
 {
-   rtnetlink_links[PF_BRIDGE] = NULL;
+   rtnl_unregister_all(PF_BRIDGE);
 }
 
Index: net-2.6.22/net/bridge/br_private.h
===
--- net-2.6.22.orig/net/bridge/br_private.h 2007-03-22 13:23:11.0 +0100
+++ net-2.6.22/net/bridge/br_private.h  2007-03-22 13:24:03.0 +0100
@@ -235,7 +235,7 @@ extern void (*br_fdb_put_hook)(struct ne
 
 
 /* br_netlink.c */
-extern void br_netlink_init(void);
+extern int br_netlink_init(void);
 extern void br_netlink_fini(void);
 extern void br_ifinfo_notify(int event, struct net_bridge_port *port);
 

--


[PATCH 12/12] [RTNL]: Use rtnl registration interface for dump-all aliases

2007-03-22 Thread Thomas Graf

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/net/core/rtnetlink.c
===
--- net-2.6.22.orig/net/core/rtnetlink.c 2007-03-22 13:23:11.0 +0100
+++ net-2.6.22/net/core/rtnetlink.c 2007-03-22 13:24:04.0 +0100
@@ -958,12 +958,6 @@ static void rtnetlink_rcv(struct sock *s
} while (qlen);
 }
 
-static struct rtnetlink_link link_rtnetlink_table[RTM_NR_MSGTYPES] =
-{
-   [RTM_GETADDR - RTM_BASE] = { .dumpit = rtnl_dump_all },
-   [RTM_GETROUTE- RTM_BASE] = { .dumpit = rtnl_dump_all },
-};
-
 static int rtnetlink_event(struct notifier_block *this, unsigned long event, void *ptr)
 {
struct net_device *dev = ptr;
@@ -1013,6 +1007,9 @@ void __init rtnetlink_init(void)
 
rtnl_register(PF_UNSPEC, RTM_GETLINK, rtnl_getlink, rtnl_dump_ifinfo);
rtnl_register(PF_UNSPEC, RTM_SETLINK, rtnl_setlink, NULL);
+
+   rtnl_register(PF_UNSPEC, RTM_GETADDR, NULL, rtnl_dump_all);
+   rtnl_register(PF_UNSPEC, RTM_GETROUTE, NULL, rtnl_dump_all);
 }
 
 EXPORT_SYMBOL(__rta_fill);

--


[RESEND] [NET] rules: Unified rules dumping

2007-03-22 Thread Thomas Graf

Rediffed based on new rtnl registration patches.

Implements a unified, protocol-independent rules dumping function
which is capable of dumping either a specific protocol family or
all of them. This speeds up dumping, as fewer lookups are required.

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/net/core/fib_rules.c
===
--- net-2.6.22.orig/net/core/fib_rules.c 2007-03-22 13:23:10.0 +0100
+++ net-2.6.22/net/core/fib_rules.c 2007-03-22 13:24:07.0 +0100
@@ -363,19 +363,15 @@ nla_put_failure:
return -EMSGSIZE;
 }
 
-int fib_rules_dump(struct sk_buff *skb, struct netlink_callback *cb, int family)
+static int dump_rules(struct sk_buff *skb, struct netlink_callback *cb,
+ struct fib_rules_ops *ops)
 {
int idx = 0;
struct fib_rule *rule;
-   struct fib_rules_ops *ops;
-
-   ops = lookup_rules_ops(family);
-   if (ops == NULL)
-   return -EAFNOSUPPORT;
 
rcu_read_lock();
list_for_each_entry(rule, ops->rules_list, list) {
-   if (idx < cb->args[0])
+   if (idx < cb->args[1])
goto skip;
 
if (fib_nl_fill_rule(skb, rule, NETLINK_CB(cb->skb).pid,
@@ -386,13 +382,44 @@ skip:
idx++;
}
rcu_read_unlock();
-   cb->args[0] = idx;
+   cb->args[1] = idx;
rules_ops_put(ops);
 
return skb->len;
 }
 
-EXPORT_SYMBOL_GPL(fib_rules_dump);
+static int fib_nl_dumprule(struct sk_buff *skb, struct netlink_callback *cb)
+{
+   struct fib_rules_ops *ops;
+   int idx = 0, family;
+
+   family = rtnl_msg_family(cb->nlh);
+   if (family != AF_UNSPEC) {
+   /* Protocol specific dump request */
+   ops = lookup_rules_ops(family);
+   if (ops == NULL)
+   return -EAFNOSUPPORT;
+
+   return dump_rules(skb, cb, ops);
+   }
+
+   rcu_read_lock();
+   list_for_each_entry_rcu(ops, &rules_ops, list) {
+   if (idx < cb->args[0] || !try_module_get(ops->owner))
+   goto skip;
+
+   if (dump_rules(skb, cb, ops) < 0)
+   break;
+
+   cb->args[1] = 0;
+   skip:
+   idx++;
+   }
+   rcu_read_unlock();
+   cb->args[0] = idx;
+
+   return skb->len;
+}
 
 static void notify_rule_change(int event, struct fib_rule *rule,
   struct fib_rules_ops *ops, struct nlmsghdr *nlh,
@@ -473,7 +500,7 @@ static int __init fib_rules_init(void)
 {
rtnl_register(PF_UNSPEC, RTM_NEWRULE, fib_nl_newrule, NULL);
rtnl_register(PF_UNSPEC, RTM_DELRULE, fib_nl_delrule, NULL);
-   rtnl_register(PF_UNSPEC, RTM_GETRULE, NULL, rtnl_dump_all);
+   rtnl_register(PF_UNSPEC, RTM_GETRULE, NULL, fib_nl_dumprule);
 
return register_netdevice_notifier(&fib_rules_notifier);
 }
Index: net-2.6.22/include/net/fib_rules.h
===
--- net-2.6.22.orig/include/net/fib_rules.h 2007-03-22 13:23:10.0 +0100
+++ net-2.6.22/include/net/fib_rules.h  2007-03-22 13:24:07.0 +0100
@@ -97,7 +97,4 @@ extern int fib_rules_unregister(struct
 extern int fib_rules_lookup(struct fib_rules_ops *,
 struct flowi *, int flags,
 struct fib_lookup_arg *);
-
-extern int fib_rules_dump(struct sk_buff *,
-  struct netlink_callback *, int);
 #endif
Index: net-2.6.22/net/decnet/dn_rules.c
===
--- net-2.6.22.orig/net/decnet/dn_rules.c 2007-03-22 13:24:01.0 +0100
+++ net-2.6.22/net/decnet/dn_rules.c 2007-03-22 13:24:07.0 +0100
@@ -241,11 +241,6 @@ static u32 dn_fib_rule_default_pref(void
return 0;
 }
 
-static int dn_fib_dump_rules(struct sk_buff *skb, struct netlink_callback *cb)
-{
-   return fib_rules_dump(skb, cb, AF_DECnet);
-}
-
 static struct fib_rules_ops dn_fib_rules_ops = {
.family = AF_DECnet,
.rule_size  = sizeof(struct dn_fib_rule),
@@ -265,12 +260,10 @@ void __init dn_fib_rules_init(void)
 {
list_add_tail(&default_rule.common.list, &dn_fib_rules);
fib_rules_register(&dn_fib_rules_ops);
-   rtnl_register(PF_DECnet, RTM_GETRULE, NULL, dn_fib_dump_rules);
 }
 
 void __exit dn_fib_rules_cleanup(void)
 {
-   rtnl_unregister(PF_DECnet, RTM_GETRULE);
fib_rules_unregister(&dn_fib_rules_ops);
 }
 
Index: net-2.6.22/net/ipv4/fib_rules.c
===
--- net-2.6.22.orig/net/ipv4/fib_rules.c 2007-03-22 13:23:53.0 +0100
+++ net-2.6.22/net/ipv4/fib_rules.c 2007-0

Re: iproute2-2.6.20-070313 bug ?

2007-03-22 Thread Patrick McHardy

Denys wrote:
> /sbin/tc2 qdisc del dev ppp0 root 
> /sbin/tc2 qdisc add dev ppp0 root handle 1: prio 
> /sbin/tc2 qdisc add dev ppp0 parent 1:1 handle 2: tbf buffer 1024kb latency 
> 500ms rate 128kbit peakrate 256kbit minburst 16384 
> /sbin/tc2 filter add dev ppp0 parent 1:0 protocol ip prio 10 u32 match ip dst 
> 0.0.0.0/0 flowid 2:1 

That is an incredibly huge buffer value.

> qdisc tbf 2: parent 1:1 rate 128000bit burst 4294932937b peakrate 256000bit minburst 16Kb lat 4.2s

And it causes an overflow.

The limit for the TBF burst value with nanosecond resolution is
~ 4 * rate (10^9 * burst / rate < 2^32 needs to hold), resulting
in a worst-case latency of 4 seconds. I think this limit is in the
reasonable range. Your configuration results in a worst-case
queuing delay of 64s, and I doubt that you really want that.
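
To make the arithmetic concrete, here is a rough sketch in C. It is not code
from tc or the kernel, and the helper names are made up for illustration. It
assumes tc's unit conventions (1024kb = 1048576 bytes, 128kbit = 16000 bytes/s)
and simply checks whether 10^9 * burst / rate still fits in 32 bits:

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical helper: time in ns needed to send `burst` bytes at `rate` bytes/s. */
static uint64_t tbf_burst_time_ns(uint64_t burst, uint64_t rate)
{
	if (rate == 0)
		return UINT64_MAX;	/* no rate, the burst never drains */
	return NSEC_PER_SEC * burst / rate;
}

/* True if the burst time can be stored in an unsigned 32-bit field. */
static bool tbf_burst_fits_u32(uint64_t burst, uint64_t rate)
{
	return tbf_burst_time_ns(burst, rate) <= UINT32_MAX;
}

/* Largest burst (bytes) whose time still fits in 32 bits at `rate` bytes/s. */
static uint64_t tbf_max_burst_bytes(uint64_t rate)
{
	if (rate == 0)
		return 0;
	/* burst < 2^32 * rate / 10^9, rounded down to a whole byte */
	return ((1ULL << 32) * rate - 1) / NSEC_PER_SEC;
}
```

With burst = 1048576 and rate = 16000 the burst time comes out as
65536000000 ns (about 65.5 s), far beyond the 2^32 ns (about 4.29 s) that a
32-bit value can hold. At this rate the largest burst that still fits would be
68719 bytes.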

Obviously it's not good to break existing configurations, but I
would argue that this configuration is broken.


Re: iproute2-2.6.20-070313 bug ?

2007-03-22 Thread Patrick McHardy

Please don't remove CCs.

Denys wrote:
> 1024kb (if i am not wrong 1Mbyte) is huge?
> 
> For me it is ok, as soon as i have RAM.

It's not about the memory, it's about the resulting queueing delay.
If you buffer packets for 64 seconds the sender will retransmit
them and you end up wasting bandwidth.

> Another thing, it is working well 
> with old tc. Just really if i have plenty of RAM's and i want 32second 
> buffer, why i cannot have that, and if i see it is really possible before?

I know it worked before. But I can't think of a reason why anyone
would want a buffer that large. Why do you want to queue packets
for up to 64 seconds?

Re: iproute2-2.6.20-070313 bug ?

2007-03-22 Thread Denys

Dear sir,

Sorry, I forgot to CC the other members of the discussion.

Is 1024kb (1 Mbyte, if I am not wrong) huge?

For me it is OK, as long as I have the RAM. Also, it works well with the
old tc. If I have plenty of RAM and I want a 32-second buffer, why can't I
have that, when I can see it was possible before?

Possibly I am misunderstanding something...
In the real world I find it reasonable to have much bigger buffers, especially
if resources (RAM, timer resolution, CPU) are not a problem. For example, as
I remember, we had a failure on one of our STM-1 links, and the Ciscos at
Teleglobe were buffering about 20-30 seconds of data without major packet loss.

Another point is why I was using the buffer, and possibly I am using it wrong:
For example, a customer has a 128Kbit/s account, and I want to give him a
burst (256Kbit/s) so that web pages open fast; but if he uses the bandwidth
non-stop, he will exhaust this buffer and be throttled back to 128Kbit/s.
Now it seems I cannot offer such functionality.

On Thu, 22 Mar 2007 14:09:06 +0100, Patrick McHardy wrote
> Denys wrote:
> > /sbin/tc2 qdisc del dev ppp0 root 
> > /sbin/tc2 qdisc add dev ppp0 root handle 1: prio 
> > /sbin/tc2 qdisc add dev ppp0 parent 1:1 handle 2: tbf buffer 1024kb latency 
> > 500ms rate 128kbit peakrate 256kbit minburst 16384 
> > /sbin/tc2 filter add dev ppp0 parent 1:0 protocol ip prio 10 u32 match ip dst 
> > 0.0.0.0/0 flowid 2:1
> 
> That is an incredibly huge buffer value.
> 
> > qdisc tbf 2: parent 1:1 rate 128000bit burst 4294932937b peakrate 256000bit minburst 16Kb lat 4.2s
> 
> And it causes an overflow.
> 
> The limit for the TBF burst value with nanosecond resolution is
> ~ 4 * rate (10^9 * burst / rate < 2^32 needs to hold), resulting
> in a worst-case latency of 4 seconds. I think this limit is in the
> reasonable range. Your configuration results in a worst-case
> queuing delay of 64s, and I doubt that you really want that.
> 
> Obviously it's not good to break existing configurations, but I
> would argue that this configuration is broken.

--
Virtual ISP S.A.L.


Re: iproute2-2.6.20-070313 bug ?

2007-03-22 Thread Denys

On Thu, 22 Mar 2007 14:23:01 +0100, Patrick McHardy wrote
> Please don't remove CCs.
> 
> Denys wrote:
> > 1024kb (if i am not wrong 1Mbyte) is huge?
> > 
> > For me it is ok, as soon as i have RAM.
> 
> It's not about the memory, it's about the resulting queueing delay.
> If you buffer packets for 64 seconds the sender will retransmit
> them and you end up wasting bandwidth.
> 
> > Another thing, it is working well 
> > with old tc. Just really if i have plenty of RAM's and i want 32second 
> > buffer, why i cannot have that, and if i see it is really possible before?
> 
> I know it worked before. But I can't think of a reason why anyone
> would want a buffer that large. Why do you want to queue packets
> for up to 64 seconds?
It seems I misunderstand how it works. If I am not wrong, as long as the
buffer is available, bandwidth is given at "peakrate" speed, and once the
buffer is empty, at "rate" speed. Am I wrong?
At least it was working like this before.

--
Virtual ISP S.A.L.


Re: iproute2-2.6.20-070313 bug ?

2007-03-22 Thread Patrick McHardy

Denys wrote:
>>>Another thing, it is working well 
>>>with old tc. Just really if i have plenty of RAM's and i want 32second 
>>>buffer, why i cannot have that, and if i see it is really possible before?
>>
>>I know it worked before. But I can't think of a reason why anyone
>>would want a buffer that large. Why do you want to queue packets
>>for up to 64 seconds?
> 
> Seems i misunderstand how it works. If i am not wrong, till buffer available, 
> bandwidth will be given on "peakrate" speed, and when buffer is empty - on 
> "rate" speed. I am wrong?

No, I got confused, sorry about that. Your configuration allows bursts
up to 64 seconds long. I guess there's nothing wrong with that.
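
As a rough illustration of where that figure comes from, here is a simplified
single-bucket model in C. It is not the actual sch_tbf code, which also keeps a
second bucket for peakrate, and the function name is made up. The assumption is
that a full bucket drains at peakrate while it is refilled at rate:

```c
#include <stdint.h>

/*
 * Hypothetical helper: how long (in ms) a full bucket of `bucket_bytes`
 * lasts when the sender transmits at `peak` bytes/s while tokens are
 * refilled at `rate` bytes/s. Returns UINT64_MAX if the bucket never
 * drains (peak <= rate).
 */
static uint64_t peak_burst_ms(uint64_t bucket_bytes, uint64_t rate, uint64_t peak)
{
	if (peak <= rate)
		return UINT64_MAX;
	return bucket_bytes * 1000 / (peak - rate);
}
```

For buffer 1024kb (1048576 bytes), rate 128kbit (16000 bytes/s) and peakrate
256kbit (32000 bytes/s) this gives 65536 ms, which is the roughly 64 second
burst mentioned above.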

I already asked Stephen to revert that patch, it was not meant to
be included yet, unfortunately it made it into the release. Even
more unfortunate is that it looks like we need larger types in the
ABI to properly support nano-second resolution.


VIA Velocity VLAN vexation

2007-03-22 Thread linux

I have a machine (x86-32, 2.6.20.3) with two ethernet interfaces:
a 100M Tulip and a 1G VIA Velocity.  Both are connected to a common
VLAN-capable switch.  The eventually desired configuration is VLAN
support on the Gbit interface.

If I set the Tulip's switch port to tagged, and configure a VLAN on the
Tulip interface appropriately, packets flow as expected.

But if I try the same configuration on the Velocity interface, things
don't work.
I can see tagged ICMP pings go out, but no responses come back.
I can see ARP requests and responses on the target machine.
If I manually configure the ARP caches, I can see the pings and responses
on the target machine.
If I kludge the target's ARP cache to point back to the source's Tulip
interface, I can see the ping responses on the Tulip interface.

But I don't see the ping responses on the Velocity interface.

The VLAN interface name and address are the same in both cases, so it
can't be firewall rules distinguishing between them.

I have tried various ping sizes from 0 to 1472.


Is this likely to be a problem with the via-velocity driver?
Is anyone working on it?  Or should I just get a different gigabit card?

Thanks for any advice!

00:09.0 Ethernet controller [0200]: VIA Technologies, Inc. VT6120/VT6121/VT6122 Gigabit Ethernet Adapter [1106:3119] (rev 11)
00:0d.0 PCI bridge [0604]: Digital Equipment Corporation DECchip 21152 [1011:0024] (rev 03)
02:04.0 Ethernet controller [0200]: Digital Equipment Corporation DECchip 21142/43 [1011:0019] (rev 41)
02:05.0 Ethernet controller [0200]: Digital Equipment Corporation DECchip 21142/43 [1011:0019] (rev 41)
02:06.0 Ethernet controller [0200]: Digital Equipment Corporation DECchip 21142/43 [1011:0019] (rev 41)
02:07.0 Ethernet controller [0200]: Digital Equipment Corporation DECchip 21142/43 [1011:0019] (rev 41)

Re: iproute2-2.6.20-070313 bug ?

2007-03-22 Thread Denys

Hi again

On Thu, 22 Mar 2007 14:43:43 +0100, Patrick McHardy wrote
> Denys wrote:
> >>>Another thing, it is working well 
> >>>with old tc. Just really if i have plenty of RAM's and i want 32second 
> >>>buffer, why i cannot have that, and if i see it is really possible before?
> >>
> >>I know it worked before. But I can't think of a reason why anyone
> >>would want a buffer that large. Why do you want to queue packets
> >>for up to 64 seconds?
> > 
> > Seems i misunderstand how it works. If i am not wrong, till buffer available, 
> > bandwidth will be given on "peakrate" speed, and when buffer is empty - on 
> > "rate" speed. I am wrong?
> 
> No, I got confused, sorry about that. Your configuration allows 
> bursts up to 64 seconds long. I guess there's nothing wrong with that.
> 
> I already asked Stephen to revert that patch, it was not meant to
> be included yet, unfortunately it made it into the release. Even
> more unfortunate is that it looks like we need larger types in the
> ABI to properly support nano-second resolution.

Thank you, sir, for your great help and great software, and sorry that I
took up your time by not explaining well enough.

--
Virtual ISP S.A.L.


Re: [PATCH 09/12] [DECNet]: Use rtnl registration interface

2007-03-22 Thread Steven Whitehouse

Hi,

On Thu, Mar 22, 2007 at 02:00:04PM +0100, Thomas Graf wrote:
> Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>
>
Acked-by: Steven Whitehouse <[EMAIL PROTECTED]>

for all the DECnet bits & also the DECnet changes in the other patch I saw
from you relating to the routing rules,

Steve.

Re: [PATCH] netxen: enum and #define cleanups

2007-03-22 Thread Amit Kale

On Thursday 22 March 2007 00:46, Andy Gospodarek wrote:
> This patch cleans up some rather generically named items in the netxen
> driver.  It seems bad to use names like USER_START and FLASH_TOTAL_SIZE,
> so I added a NETXEN_ to the front of them.
>
> This has been compile tested.
>
> Signed-off-by: Andy Gospodarek <[EMAIL PROTECTED]>
> ---
>
>  netxen_nic.h         |   51 ++-
>  netxen_nic_ethtool.c |    8 
>  netxen_nic_hw.c      |   10 +-
>  netxen_nic_init.c    |   23 ---
>  4 files changed, 47 insertions(+), 45 deletions(-)
>
> diff --git a/drivers/net/netxen/netxen_nic.h b/drivers/net/netxen/netxen_nic.h
> index dd8ce35..8310584 100644
> --- a/drivers/net/netxen/netxen_nic.h
> +++ b/drivers/net/netxen/netxen_nic.h
> @@ -65,12 +65,13 @@
>
>  #define _NETXEN_NIC_LINUX_MAJOR 3
>  #define _NETXEN_NIC_LINUX_MINOR 3
> -#define _NETXEN_NIC_LINUX_SUBVERSION 3
> -#define NETXEN_NIC_LINUX_VERSIONID  "3.3.3"
> +#define _NETXEN_NIC_LINUX_SUBVERSION 4
> +#define NETXEN_NIC_LINUX_VERSIONID  "3.3.4"

This is the firmware identifier.  Shouldn't be changed.

Rest looks fine. Thanks.
-Amit


> -#define NUM_FLASH_SECTORS (64)
> -#define FLASH_SECTOR_SIZE (64 * 1024)
> -#define FLASH_TOTAL_SIZE  (NUM_FLASH_SECTORS * FLASH_SECTOR_SIZE)
> +#define NETXEN_NUM_FLASH_SECTORS (64)
> +#define NETXEN_FLASH_SECTOR_SIZE (64 * 1024)
> +#define NETXEN_FLASH_TOTAL_SIZE  (NETXEN_NUM_FLASH_SECTORS \
> + * NETXEN_FLASH_SECTOR_SIZE)
>
>  #define PHAN_VENDOR_ID 0x4040
>
> @@ -671,28 +672,28 @@ struct netxen_new_user_info {
>
>  /* Flash memory map */
>  typedef enum {
> - CRBINIT_START = 0,  /* Crbinit section */
> - BRDCFG_START = 0x4000,  /* board config */
> - INITCODE_START = 0x6000,/* pegtune code */
> - BOOTLD_START = 0x1, /* bootld */
> - IMAGE_START = 0x43000,  /* compressed image */
> - SECONDARY_START = 0x20, /* backup images */
> - PXE_START = 0x3E,   /* user defined region */
> - USER_START = 0x3E8000,  /* User defined region for new boards */
> - FIXED_START = 0x3F  /* backup of crbinit */
> + NETXEN_CRBINIT_START = 0,   /* Crbinit section */
> + NETXEN_BRDCFG_START = 0x4000,   /* board config */
> + NETXEN_INITCODE_START = 0x6000, /* pegtune code */
> + NETXEN_BOOTLD_START = 0x1,  /* bootld */
> + NETXEN_IMAGE_START = 0x43000,   /* compressed image */
> + NETXEN_SECONDARY_START = 0x20,  /* backup images */
> + NETXEN_PXE_START = 0x3E,/* user defined region */
> + NETXEN_USER_START = 0x3E8000,   /* User defined region for new boards */
> + NETXEN_FIXED_START = 0x3F   /* backup of crbinit */
>  } netxen_flash_map_t;
>
> -#define USER_START_OLD PXE_START /* for backward compatibility */
> -
> -#define FLASH_START  (CRBINIT_START)
> -#define INIT_SECTOR  (0)
> -#define PRIMARY_START(BOOTLD_START)
> -#define FLASH_CRBINIT_SIZE   (0x4000)
> -#define FLASH_BRDCFG_SIZE(sizeof(struct netxen_board_info))
> -#define FLASH_USER_SIZE  (sizeof(struct netxen_user_info)/sizeof(u32))
> -#define FLASH_SECONDARY_SIZE (USER_START-SECONDARY_START)
> -#define NUM_PRIMARY_SECTORS  (0x20)
> -#define NUM_CONFIG_SECTORS   (1)
> +#define NETXEN_USER_START_OLD NETXEN_PXE_START   /* for backward compatibility */
> +
> +#define NETXEN_FLASH_START   (NETXEN_CRBINIT_START)
> +#define NETXEN_INIT_SECTOR   (0)
> +#define NETXEN_PRIMARY_START (NETXEN_BOOTLD_START)
> +#define NETXEN_FLASH_CRBINIT_SIZE(0x4000)
> +#define NETXEN_FLASH_BRDCFG_SIZE (sizeof(struct netxen_board_info))
> +#define NETXEN_FLASH_USER_SIZE   (sizeof(struct netxen_user_info)/sizeof(u32))
> +#define NETXEN_FLASH_SECONDARY_SIZE (NETXEN_USER_START-NETXEN_SECONDARY_START)
> +#define NETXEN_NUM_PRIMARY_SECTORS(0x20)
> +#define NETXEN_NUM_CONFIG_SECTORS(1)
>  #define PFX "NetXen: "
>  extern char netxen_nic_driver_name[];
>
> diff --git a/drivers/net/netxen/netxen_nic_ethtool.c b/drivers/net/netxen/netxen_nic_ethtool.c
> index ee1b5a2..4dfa76b 100644
> --- a/drivers/net/netxen/netxen_nic_ethtool.c
> +++ b/drivers/net/netxen/netxen_nic_ethtool.c
> @@ -94,7 +94,7 @@ static const char netxen_nic_gstrings_test[][ETH_GSTRING_LEN] = {
>
>  static int netxen_nic_get_eeprom_len(struct net_device *dev)
>  {
> - return FLASH_TOTAL_SIZE;
> + return NETXEN_FLASH_TOTAL_SIZE;
>  }
>
>  static void
> @@ -475,7 +475,7 @@ netxen_nic_set_eeprom(struct net_device *dev, struct ethtool_eeprom *eeprom,
>   return 0;
>   }
>
> - if (offset == BOOTLD_START) {
> + if (offset == NETXEN_BOOTLD_START) {
>   ret = netxen_flash_erase_primary(adapter);
>   if (ret != FLASH_SUCCESS) {
>   printk(KERN_ERR "%s: Flash erase failed.\n",
> @@ -483,10 +483,10 @@ netxen_nic_set_eeprom(st

[NET]: Fix fib_rules dump race

2007-03-22 Thread Patrick McHardy

[NET]: Fix fib_rules dump race

fib_rules_dump needs to use list_for_each_entry_rcu to protect against
concurrent changes to the rules list.

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index 215f1bf..3aea4e8 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -374,7 +374,7 @@ int fib_rules_dump(struct sk_buff *skb, struct netlink_callback *cb, int family)
return -EAFNOSUPPORT;
 
rcu_read_lock();
-   list_for_each_entry(rule, ops->rules_list, list) {
+   list_for_each_entry_rcu(rule, ops->rules_list, list) {
if (idx < cb->args[0])
goto skip;

Re: [PATCH] netxen: enum and #define cleanups

2007-03-22 Thread Andy Gospodarek

On Thu, Mar 22, 2007 at 07:59:05PM +0530, Amit Kale wrote:
> On Thursday 22 March 2007 00:46, Andy Gospodarek wrote:
> > This patch cleans up some rather generically named items in the netxen
> > driver.  It seems bad to use names like USER_START and FLASH_TOTAL_SIZE,
> > so I added a NETXEN_ to the front of them.
> >
> > This has been compile tested.
> >
> > Signed-off-by: Andy Gospodarek <[EMAIL PROTECTED]>
> > ---
> >
> >  netxen_nic.h |   51
> > ++- netxen_nic_ethtool.c | 
> >   8 
> >  netxen_nic_hw.c  |   10 +-
> >  netxen_nic_init.c|   23 ---
> >  4 files changed, 47 insertions(+), 45 deletions(-)
> >
> > diff --git a/drivers/net/netxen/netxen_nic.h
> > b/drivers/net/netxen/netxen_nic.h index dd8ce35..8310584 100644
> > --- a/drivers/net/netxen/netxen_nic.h
> > +++ b/drivers/net/netxen/netxen_nic.h
> > @@ -65,12 +65,13 @@
> >
> >  #define _NETXEN_NIC_LINUX_MAJOR 3
> >  #define _NETXEN_NIC_LINUX_MINOR 3
> > -#define _NETXEN_NIC_LINUX_SUBVERSION 3
> > -#define NETXEN_NIC_LINUX_VERSIONID  "3.3.3"
> > +#define _NETXEN_NIC_LINUX_SUBVERSION 4
> > +#define NETXEN_NIC_LINUX_VERSIONID  "3.3.4"
> 
> This is the firmware identifier.  Shouldn't be changed.
> 
> Rest looks fine. Thanks.
> -Amit
> 

Thanks, Amit.  Here's a repost without those bits.


Signed-off-by: Andy Gospodarek <[EMAIL PROTECTED]>
---

 netxen_nic.h |   47 ---
 netxen_nic_ethtool.c |8 
 netxen_nic_hw.c  |   10 +-
 netxen_nic_init.c|   23 ---
 4 files changed, 45 insertions(+), 43 deletions(-)

diff --git a/drivers/net/netxen/netxen_nic.h b/drivers/net/netxen/netxen_nic.h
index dd8ce35..d5f0c06 100644
--- a/drivers/net/netxen/netxen_nic.h
+++ b/drivers/net/netxen/netxen_nic.h
@@ -68,9 +68,10 @@
 #define _NETXEN_NIC_LINUX_SUBVERSION 3
 #define NETXEN_NIC_LINUX_VERSIONID  "3.3.3"
 
-#define NUM_FLASH_SECTORS (64)
-#define FLASH_SECTOR_SIZE (64 * 1024)
-#define FLASH_TOTAL_SIZE  (NUM_FLASH_SECTORS * FLASH_SECTOR_SIZE)
+#define NETXEN_NUM_FLASH_SECTORS (64)
+#define NETXEN_FLASH_SECTOR_SIZE (64 * 1024)
+#define NETXEN_FLASH_TOTAL_SIZE  (NETXEN_NUM_FLASH_SECTORS \
+   * NETXEN_FLASH_SECTOR_SIZE)
 
 #define PHAN_VENDOR_ID 0x4040
 
@@ -671,28 +672,28 @@ struct netxen_new_user_info {
 
 /* Flash memory map */
 typedef enum {
-   CRBINIT_START = 0,  /* Crbinit section */
-   BRDCFG_START = 0x4000,  /* board config */
-   INITCODE_START = 0x6000,/* pegtune code */
-   BOOTLD_START = 0x1, /* bootld */
-   IMAGE_START = 0x43000,  /* compressed image */
-   SECONDARY_START = 0x20, /* backup images */
-   PXE_START = 0x3E,   /* user defined region */
-   USER_START = 0x3E8000,  /* User defined region for new boards */
-   FIXED_START = 0x3F  /* backup of crbinit */
+   NETXEN_CRBINIT_START = 0,   /* Crbinit section */
+   NETXEN_BRDCFG_START = 0x4000,   /* board config */
+   NETXEN_INITCODE_START = 0x6000, /* pegtune code */
+   NETXEN_BOOTLD_START = 0x1,  /* bootld */
+   NETXEN_IMAGE_START = 0x43000,   /* compressed image */
+   NETXEN_SECONDARY_START = 0x20,  /* backup images */
+   NETXEN_PXE_START = 0x3E,/* user defined region */
+   NETXEN_USER_START = 0x3E8000,   /* User defined region for new boards */
+   NETXEN_FIXED_START = 0x3F   /* backup of crbinit */
 } netxen_flash_map_t;
 
-#define USER_START_OLD PXE_START   /* for backward compatibility */
-
-#define FLASH_START(CRBINIT_START)
-#define INIT_SECTOR(0)
-#define PRIMARY_START  (BOOTLD_START)
-#define FLASH_CRBINIT_SIZE (0x4000)
-#define FLASH_BRDCFG_SIZE  (sizeof(struct netxen_board_info))
-#define FLASH_USER_SIZE(sizeof(struct 
netxen_user_info)/sizeof(u32))
-#define FLASH_SECONDARY_SIZE   (USER_START-SECONDARY_START)
-#define NUM_PRIMARY_SECTORS(0x20)
-#define NUM_CONFIG_SECTORS (1)
+#define NETXEN_USER_START_OLD NETXEN_PXE_START /* for backward compatibility */
+
+#define NETXEN_FLASH_START (NETXEN_CRBINIT_START)
+#define NETXEN_INIT_SECTOR (0)
+#define NETXEN_PRIMARY_START   (NETXEN_BOOTLD_START)
+#define NETXEN_FLASH_CRBINIT_SIZE  (0x4000)
+#define NETXEN_FLASH_BRDCFG_SIZE   (sizeof(struct netxen_board_info))
+#define NETXEN_FLASH_USER_SIZE (sizeof(struct 
netxen_user_info)/sizeof(u32))
+#define NETXEN_FLASH_SECONDARY_SIZE
(NETXEN_USER_START-NETXEN_SECONDARY_START)
+#define NETXEN_NUM_PRIMARY_SECTORS (0x20)
+#define NETXEN_NUM_CONFIG_SECTORS  (1)
 #define PFX "NetXen: "
 extern char netxen_nic_driver_name[];
 
diff --git a/drivers/net/netxen/netxen_nic_ethtool.c 
b/drivers/net/netxen/netxen_nic_ethtool.c
index ee1b5a2..4dfa76b 100644
--- a/drivers/net/netxen/netxen_nic_ethtool.c
+++ b/drivers/net/net

RFC: Established connections hash function

2007-03-22 Thread Nikolaos D. Bougalis


   Hello,

   I have noticed that the hash function that the kernel uses for
established TCP/IP connections is rather simplistic, specifically:

   h = (local_address ^ local_port) ^ (remote_address ^ remote_port);
   h ^= h >> 16;
   h ^= h >> 8;

   Now, simple is great, but this has a number of issues, not the least of
which is that an attacker can very easily cause collisions and force
extremely long chain lengths, a situation that becomes worse the more
distinct IP addresses and listening ports a box has.

   Consider, for example, a box that has 20 ports open and 4 consecutive IP
addresses. An attacker that has an entire class C available can create
24,576 connections that hash to the same value, resulting in a ridiculously
overlong chain. With servers that do virtual hosting and have dozens of IPs,
the situation can become much worse very fast.
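
To make the collision argument concrete, here is a minimal user-space sketch
(not kernel code). It assumes addresses and ports are plain host-order
integers, and the names xor_ehash and count_colliding_tuples are made up for
this illustration. For a fixed local endpoint it walks a hypothetical /24 and
picks a source port that cancels the change in the low address byte, so the
XOR of the four inputs never changes:

```c
#include <assert.h>
#include <stdint.h>

/* The XOR-based hash quoted above, written out with explicit types. */
static uint32_t xor_ehash(uint32_t laddr, uint16_t lport,
                          uint32_t faddr, uint16_t fport)
{
    uint32_t h = (laddr ^ lport) ^ (faddr ^ fport);

    h ^= h >> 16;
    h ^= h >> 8;
    return h;
}

/*
 * Walk the 256 hosts of net24 (low byte must be zero). For each host, use
 * the source port base_port ^ host, which undoes the host bits in the XOR.
 * Returns how many of the 256 tuples hash to the same value as the first.
 */
static unsigned int count_colliding_tuples(uint32_t laddr, uint16_t lport,
                                           uint32_t net24, uint16_t base_port)
{
    uint32_t target = xor_ehash(laddr, lport, net24, base_port);
    unsigned int host, hits = 0;

    assert((net24 & 0xffu) == 0);

    for (host = 0; host < 256; host++) {
        uint32_t faddr = net24 | host;
        uint16_t fport = (uint16_t)(base_port ^ host);

        if (xor_ehash(laddr, lport, faddr, fport) == target)
            hits++;
    }
    return hits;
}
```

With a local endpoint of 192.168.0.1:80 and 10.0.0.0/24 as the attacker's
network, count_colliding_tuples(0xC0A80001, 80, 0x0A000000, 40000) counts all
256 tuples as colliding. More local addresses and ports give the attacker
more such tuples per chain.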

   This particular hash seems to be the odd-man out, since most other
network related hashes in the kernel seem to be Jenkins-based, and some use
tagged hashing to defeat algorithmic complexity attacks. For example, the
route hash uses this:

static unsigned int rt_hash_rnd;

static unsigned int rt_hash_code(u32 daddr, u32 saddr)
{
   return (jhash_2words(daddr, saddr, rt_hash_rnd)
   & rt_hash_mask);
}

   With this in mind, I propose the following replacement for inet_ehashfn,
which defeats algorithmic complexity attacks and achieves excellent
distribution:

unsigned int inet_ehashfn(const __be32 laddr, const __u16 lport,
 const __be32 faddr, const __be16 fport)
{
   return jhash_3words((__force __u32)faddr, (__force __u32)laddr,
   (((__force __u32)fport) << 16) + lport,
   inet_ehash_rnd);
}

   where inet_ehash_rnd is initialized once in tcp_init to a random 32-bit
value.

   I will be more than happy to provide a patch for this, but I figured I
would solicit some input first.

   Nik B.


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: RFC: Established connections hash function

2007-03-22 Thread Evgeniy Polyakov

On Thu, Mar 22, 2007 at 08:39:04AM -0700, Nikolaos D. Bougalis ([EMAIL PROTECTED]) wrote:
>This particular hash seems to be the odd-man out, since most other
> network related hashes in the kernel seem to be Jenkins-based, and some use
> tagged hashing to defeat algorithmic complexity attacks. For example, the
> route hash uses this:

It seems you do not know the history...
It is the fastest and actually the best hash for the workloads where it
is used, but unfortunately it is simple enough for an attacker to predict
the end result.

> static unsigned int rt_hash_rnd;
> 
> static unsigned int rt_hash_code(u32 daddr, u32 saddr)
> {
>return (jhash_2words(daddr, saddr, rt_hash_rnd)
>& rt_hash_mask);
> }
> 
>With this in mind, I propose the following replacement for inet_ehashfn,
> which defeats algorithmic complexity attacks and achieves excellent
> distribution:
> 
> unsigned int inet_ehashfn(const __be32 laddr, const __u16 lport,
>  const __be32 faddr, const __be16 fport)
> {
>return jhash_3words((__force __u32)faddr, (__force __u32)laddr,
>(((__force __u32)fport) << 16) + lport,
>inet_ehash_rnd);
> }

And this is utterly broken. For more details please read netdev@
archives and trivial analysis of jhash_3words().

We can use jhash_2words(laddr, faddr, portpair^inet_ehash_rnd) though.

-- 
Evgeniy Polyakov

L2TP support?

2007-03-22 Thread James Chapman


Is there interest in adding L2TP support?

I have a patch which could be submitted for review. The PPPoL2TP driver
presents a PPPoX socket to userspace pppd in the same way as the PPPoE
and PPPoATM drivers. The kernel handles all data traffic, while
userspace daemons do L2TP and PPP control message processing.

I posted an initial version of the patch in Sept 2004 (!). There was
some discussion about the use of PPPoX for L2TP in the following thread:

http://marc.info/?l=linux-netdev&m=109571479604766&w=2

In that thread, I was asked to improve the scalability of my solution by
avoiding the socket-per-session usage imposed by the PPPoX model. I
spent some time on this but was unable to come up with a better solution
than the original. Since then, the original PPPoX-based solution has
matured and is now used successfully by more than one L2TP protocol
implementation. My original work has been kept up to date with the
current kernel so could be submitted again if people want it. I didn't
want to repost it without revisiting the above thread first though... :)

Shall I post the patch?

--
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development





Re: iproute2-2.6.20-070313 bug ?

2007-03-22 Thread Stephen Hemminger

On Thu, 22 Mar 2007 14:09:06 +0100
Patrick McHardy <[EMAIL PROTECTED]> wrote:

> Denys wrote:
> > /sbin/tc2 qdisc del dev ppp0 root 
> > /sbin/tc2 qdisc add dev ppp0 root handle 1: prio 
> > /sbin/tc2 qdisc add dev ppp0 parent 1:1 handle 2: tbf buffer 1024kb latency 500ms rate 128kbit peakrate 256kbit minburst 16384
> > /sbin/tc2 filter add dev ppp0 parent 1:0 protocol ip prio 10 u32 match ip dst 0.0.0.0/0 flowid 2:1
> 
> 
> That is an incredibly huge buffer value.
> 
> > qdisc tbf 2: parent 1:1 rate 128000bit burst 4294932937b peakrate 256000bit minburst 16Kb lat 4.2s
> 
> And it causes an overflow.
> 
> The limit for the TBF burst value with nanosecond resolution is
> ~ 4 * rate (10^9 * burst / rate < 2^32 needs to hold), resulting
> in a worst-case latency of 4 seconds. I think this limit is in the
> reasonable range. Your configuration results in a worst-case
> queuing delay of 64s, and I doubt that you really want that.
> 
> Obviously it's not good to break existing configurations, but I
> would argue that this configuration is broken.
> 


tc should check for overflows and doesn't. Do you want to make a patch
for the obvious cases?
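
As a rough idea of what such a check could look like, here is a user-space
sketch. It is not the tc code; the function name burst_to_ns_checked, the
byte-based units and the 32-bit nanosecond result are assumptions taken from
the "10^9 * burst / rate < 2^32" condition quoted above.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Convert a burst size (bytes) at a given rate (bytes per second) into
 * nanoseconds, refusing values that do not fit in 32 bits.
 * Returns 0 on success, -1 on a zero rate or on overflow.
 */
static int burst_to_ns_checked(uint32_t burst_bytes,
                               uint32_t rate_bytes_per_sec,
                               uint32_t *ns_out)
{
    uint64_t ns;

    if (rate_bytes_per_sec == 0)
        return -1;

    /* 10^9 * (2^32 - 1) still fits in 64 bits, so this cannot wrap. */
    ns = NSEC_PER_SEC * burst_bytes / rate_bytes_per_sec;
    if (ns > UINT32_MAX)
        return -1;

    *ns_out = (uint32_t)ns;
    return 0;
}
```

For the configuration above (1024kb buffer at 128kbit, i.e. 1048576 bytes at
16000 bytes/s) the result would be about 65.5 seconds, so the check rejects
it. A burst of 64000 bytes at the same rate is exactly 4 seconds and is
accepted.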

-- 
Stephen Hemminger <[EMAIL PROTECTED]>

Re: iproute2-2.6.20-070313 bug ?

2007-03-22 Thread Patrick McHardy

Stephen Hemminger wrote:
> tc should check for overflows and doesn't. Do you want to make a patch
> for the obvious cases?

I agree. I'll take care of it once I'm done with my patches for 2.6.22.

Re: RFC: Established connections hash function

2007-03-22 Thread Nikolaos D. Bougalis

On Thu, March 22, 2007 at 8:52 AM, Evgeniy Polyakov <[EMAIL PROTECTED]> wrote:



> It seems you do not know the history...


   I know a lot about history. I may not know the specific history you had 
in mind though.


   I do see now that this has been brought up before. Before posting, I did 
search the archives, but obviously not closely enough. I blame it on the 
early morning hour.




> It is the fastest and actually the best hash for the workloads where it
> is used, but unfortunately it is simple enough for an attacker to predict
> the end result.


   Yes, the distribution of the vanilla function is decent, and yes it's 
very fast. But all that won't help you when chains start having tens of 
thousands of items, and you have to iterate through them constantly. But I 
guess that if it makes you feel better, you can call that "unfortunate."




> > unsigned int inet_ehashfn(const __be32 laddr, const __u16 lport,
> >  const __be32 faddr, const __be16 fport)
> > {
> >    return jhash_3words((__force __u32)faddr, (__force __u32)laddr,
> >    (((__force __u32)fport) << 16) + lport,
> >    inet_ehash_rnd);
> > }


> And this is utterly broken. For more details please read netdev@
> archives and trivial analysis of jhash_3words().


   Utterly broken? Nonsense. I have tested the actual function I proposed 
(sans the __force and __u32 stuff, which weren't necessary in my test 
program), against real data, collected from various servers in real-time. It 
has consistently achieved lower average chain lengths than the vanilla 
function and demonstrated no artifacting, and that's trivial to verify.


   The only analysis I could find was this 
http://tservice.net.ru/~s0mbre/blog/2006/05/14#2006_05_14, which uses 
jhash_2words, and not jhash_3words, and which naively attempts to take the 
output of jhash_2words, and to perform the same mixing trick that the 
vanilla inet_ehashfn does and uses artificially generated data sets.


   But please, feel free to point out any other _unfavorable_ analyses of 
jhash_2words or jhash_3words that I may have missed.




> We can use jhash_2words(laddr, faddr, portpair^inet_ehash_rnd) though.


   Please explain to me how jhash_2words solves the issue that you claim 
jhash_3words has, when they both use the same underlying bit-mixer?


   -n 




Re: L2TP support?

2007-03-22 Thread Ingo Oeser

Hi James,

James Chapman wrote:
> Is there interest in adding L2TP support?

Yes, if there is also a user space part somewhere.
 
> I have a patch which could be submitted for review. The PPPoL2TP driver
> presents a PPPoX socket to userspace pppd in the same way as the PPPoE
> and PPPoATM drivers. The kernel handles all data traffic, while
> userspace daemons do L2TP and PPP control message processing.

Like the pppoe-plugin for pppd? 

> In that thread, I was asked to improve the scalability of my solution by
> avoiding the socket-per-session usage imposed by the PPPoX model. 

Since that is imposed by your generic upper layer (PPPoX), this is not a valid
argument against your code. (Yes, I've read that thread :-))

> Shall I post the patch?

Yes, please check your patch using the nice checklist in 
Documentation/SubmitChecklist and do it in suitable chunks 
according to Documentation/SubmittingPatches

I'm looking forward to reviewing it!

Best Regards

Ingo Oeser

Re: RFC: Established connections hash function

2007-03-22 Thread Evgeniy Polyakov

On Thu, Mar 22, 2007 at 10:32:44AM -0700, Nikolaos D. Bougalis ([EMAIL PROTECTED]) wrote:
>Utterly broken? Nonsense. I have tested the actual function I proposed 
> (sans the __force and __u32 stuff, which weren't necessary in my test 
> program), against real data, collected from various servers in real-time. 
> It has consistently achieved lower average chain lengths than the vanilla 
> function and demonstrated no artifacting, and that's trivial to verify.

So what?
People have tested and worked with the XOR hash for years without hitting
any problems. If we talk about specially crafted data, then the XOR one is
no worse than Jenkins with 3 words (which is even worse under a blind attack
with constant ports).

>The only analysis I could find was this 
> http://tservice.net.ru/~s0mbre/blog/2006/05/14#2006_05_14, which uses 
> jhash_2words, and not jhash_3words, and which naively attempts to take the 
> output of jhash_2words, and to perform the same mixing trick that the 
> vanilla inet_ehashfn does and uses artificially generated data sets.

It is outdated; check the recent netdev@ archives. The folding used in that
test does not change the distribution, and the data was presented as it can
be selected by an attacker, who can create it with any distribution.

>But please, feel free to point out any other _unfavorable_ analyses of 
> jhash_2words or jhash_3words that I may have missed.
> 
> 
> >We can use jhash_2words(laddr, faddr, portpair^inet_ehash_rnd) though.
> 
>Please explain to me how jhash_2words solves the issue that you claim 
> jhash_3words has, when they both use the same underlying bit-mixer?

The $c value is not properly distributed and significantly breaks the overall
distribution. An attacker who controls $c (and he does so by controlling the
ports) can significantly lengthen selected hash chains.

But that applies only to $c; $a and $b are properly distributed, so
jhash_2words() is safer than jhash_3words().
Just create a simple application which does
jhash_3words(a, b, rand(), init) and jhash_2words(a, b, rand()) and see the
results.

I want to emphasize: it is not about the random generator, but about data
controlled by an attacker, who can select it as in the tests. $c in
jhash_3words() is the weakest part; jhash_2words() works much better in
that regard.

-- 
Evgeniy Polyakov

[BRIDGE]: Fix fdb RCU race

2007-03-22 Thread Patrick McHardy

Fix what looks like a RCU race. Untested since this is only
used by ATM, which I don't have.

[BRIDGE]: Fix fdb RCU race

br_fdb_get uses atomic_inc to increase the refcount of an element found
on a RCU protected list, which can lead to the following race:

CPU0                                    CPU1

                                        br_fdb_get:   rcu_read_lock
                                        __br_fdb_get: find element
fdb_delete:   hlist_del_rcu
              br_fdb_put
br_fdb_put:   atomic_dec_and_test
              call_rcu(fdb_rcu_free)    br_fdb_get:   atomic_inc
                                                      rcu_read_unlock
fdb_rcu_free: kmem_cache_free

Use atomic_inc_not_zero instead.

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

---
commit 6965873e9db0cb3f9a8412bd541a5309dcfb6eb6
tree 152e90dc86fe96ca7cb8f0e280827920ddb62247
parent 8559840c4ca3f2fff73a882803bc8916078fac1f
author Patrick McHardy <[EMAIL PROTECTED]> Thu, 22 Mar 2007 19:20:08 +0100
committer Patrick McHardy <[EMAIL PROTECTED]> Thu, 22 Mar 2007 19:20:08 +0100

 net/bridge/br_fdb.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index def2e40..8d566c1 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -197,8 +197,8 @@ struct net_bridge_fdb_entry *br_fdb_get(struct net_bridge 
*br,
 
rcu_read_lock();
fdb = __br_fdb_get(br, addr);
-   if (fdb)
-   atomic_inc(&fdb->use_count);
+   if (fdb && !atomic_inc_not_zero(&fdb->use_count))
+   fdb = NULL;
rcu_read_unlock();
return fdb;
 }
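
The fix relies on an increment that only succeeds while the count is still
non-zero. A minimal user-space sketch of that idea with C11 atomics follows;
it is not the kernel's atomic_inc_not_zero implementation, and the name
demo_inc_not_zero is made up for this illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Take a reference only if the object is still alive (count != 0).
 * Returns true if the count was incremented, false if it had already
 * dropped to zero and the object must be treated as gone.
 */
static bool demo_inc_not_zero(atomic_int *count)
{
    int old = atomic_load(count);

    while (old != 0) {
        /* On failure the current value is reloaded into 'old'. */
        if (atomic_compare_exchange_weak(count, &old, old + 1))
            return true;
    }
    return false;
}
```

A lookup that gets false back behaves as if the entry was never found, which
is what the "fdb = NULL" in the patch does.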

Re: [BRIDGE]: Fix fdb RCU race

2007-03-22 Thread Stephen Hemminger

On Thu, 22 Mar 2007 19:29:09 +0100
Patrick McHardy <[EMAIL PROTECTED]> wrote:

> Fix what looks like a RCU race. Untested since this is only
> used by ATM, which I don't have.
> 

That looks right, I wonder if the ATM hooks ever get used?

-- 
Stephen Hemminger <[EMAIL PROTECTED]>

Re: [ANN] Unified dynamic storage for different socket types instead of separate hash tables.

2007-03-22 Thread Stephen Hemminger


> 
> diff --git a/include/linux/netlink.h b/include/linux/netlink.h
> index 2a20f48..f11b4e7 100644
> --- a/include/linux/netlink.h
> +++ b/include/linux/netlink.h
> @@ -151,7 +151,6 @@ struct netlink_skb_parms
>  #define NETLINK_CB(skb)  (*(struct 
> netlink_skb_parms*)&((skb)->cb))
>  #define NETLINK_CREDS(skb)   (&NETLINK_CB((skb)).creds)
>  
> -
>  extern struct sock *netlink_kernel_create(int unit, unsigned int groups, 
> void (*input)(struct sock *sk, int len), struct module *module);
>  extern void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr *nlh, int 
> err);
>  extern int netlink_has_listeners(struct sock *sk, unsigned int group);

Minor nit: this is a one-line blank-only diff, probably just editing crap.


>  static inline void udp_lib_close(struct sock *sk, long timeout)
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 8d65d64..abe1632 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -901,7 +901,9 @@ struct sock *sk_clone(const struct sock *sk, const gfp_t 
> priority)
>   sock_copy(newsk, sk);
>  
>   /* SANITY */
> +#ifndef CONFIG_MDT_LOOKUP
>   sk_node_init(&newsk->sk_node);
> +#endif
>   sock_lock_init(newsk);
>   bh_lock_sock(newsk);

I would rather not see ifdef's in code. Could you just stub
out the function in the include file?
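
A sketch of what stubbing it out in a header might look like, with made-up
names (struct demo_node, demo_node_init, CONFIG_DEMO_TRIE_LOOKUP) standing in
for the real ones:

```c
#include <stddef.h>

struct demo_node {
    struct demo_node *next;
    struct demo_node **pprev;
};

#ifdef CONFIG_DEMO_TRIE_LOOKUP
/* Trie build: the list node is unused, so this compiles to nothing. */
static inline void demo_node_init(struct demo_node *node)
{
    (void)node;
}
#else
/* Hash table build: reset the list linkage. */
static inline void demo_node_init(struct demo_node *node)
{
    node->next = NULL;
    node->pprev = NULL;
}
#endif

/* The caller stays free of #ifdef blocks in either configuration. */
static inline void demo_clone_init(struct demo_node *node)
{
    demo_node_init(node);
}
```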


>  
> diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
> index 9e8ef50..5bfb0dc 100644
> --- a/net/ipv4/Kconfig
> +++ b/net/ipv4/Kconfig
> @@ -1,6 +1,14 @@
>  #
>  # IP configuration
>  #
> +
> +config MDT_LOOKUP
> + bool "Multidimensional trie socket lookup"
> + depends on !INET_TCP_DIAG
> + help
> +   This option replaces traditional hash table lookup for TCP sockets
> +   with multidimensional trie algorithm (similar to judy trie).

So you had to break the TCP_DIAG output to make this work.
Please fix, the ss command is useful.

> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index cf358c8..8c32545 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -1360,6 +1360,7 @@ fs_initcall(inet_init);
>  /*  
> */
>  
>  #ifdef CONFIG_PROC_FS
> +#ifndef CONFIG_MDT_LOOKUP
>  static int __init ipv4_proc_init(void)
>  {
>   int rc = 0;
> @@ -1388,7 +1389,12 @@ out_raw:
>   rc = -ENOMEM;
>   goto out;
>  }
> -
> +#else
> +static int __init ipv4_proc_init(void)
> +{
> + return 0;
> +}
> +#endif

If you are making it into a stub, why not get rid of it
completely?

> +
> +#define MDT_SET_LEAF_STORAGE(leaf, ptr) do { \
> + rcu_assign_pointer((leaf), (struct mdt_node *)(((unsigned long)(ptr)) | 
> MDT_LEAF_STRUCT_BIT)); \
> +} while (0)

Macro yuckiness.
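
One possible alternative is a set of inline helpers for the pointer tagging.
This is only a sketch with made-up names (DEMO_LEAF_BIT, demo_tag_leaf and
friends); it leaves out the rcu_assign_pointer() step and assumes the tagged
objects are at least 2-byte aligned so the low bit is free.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_LEAF_BIT ((uintptr_t)1)

struct demo_node {
    int value;
};

/* Mark a pointer as referring to leaf storage by setting its low bit. */
static inline struct demo_node *demo_tag_leaf(void *ptr)
{
    return (struct demo_node *)((uintptr_t)ptr | DEMO_LEAF_BIT);
}

/* Check whether a node pointer carries the leaf tag. */
static inline bool demo_is_leaf(const struct demo_node *node)
{
    return ((uintptr_t)node & DEMO_LEAF_BIT) != 0;
}

/* Strip the tag and recover the original pointer. */
static inline void *demo_untag(const struct demo_node *node)
{
    return (void *)((uintptr_t)node & ~DEMO_LEAF_BIT);
}
```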



What is the code size change with this?
-- 
Stephen Hemminger <[EMAIL PROTECTED]>

[PATCH]: Add security check before flushing SAD/SPD

2007-03-22 Thread Joy Latten

Within selinux we check for authorization before deleting entries from
SAD and SPD. 

We are not checking for authorization when flushing the SPD and
the SAD. It was perhaps missed in original patch.

This patch adds security check when flushing entries from SAD and SPD.

Please let me know if this patch is ok.
It was built against linux-2.6.21-rc4-git5. I have also tested it.

Joy

Signed-off-by: Joy Latten<[EMAIL PROTECTED]>


diff -urpN linux-2.6.20.orig/net/xfrm/xfrm_policy.c 
linux-2.6.20/net/xfrm/xfrm_policy.c
--- linux-2.6.20.orig/net/xfrm/xfrm_policy.c2007-03-21 14:25:51.0 
-0500
+++ linux-2.6.20/net/xfrm/xfrm_policy.c 2007-03-21 14:30:59.0 -0500
@@ -829,6 +829,8 @@ void xfrm_policy_flush(u8 type, struct x
 &xfrm_policy_inexact[dir], bydst) {
if (pol->type != type)
continue;
+   if (security_xfrm_policy_delete(pol))
+   continue;
hlist_del(&pol->bydst);
hlist_del(&pol->byidx);
write_unlock_bh(&xfrm_policy_lock);
@@ -850,6 +852,8 @@ void xfrm_policy_flush(u8 type, struct x
 bydst) {
if (pol->type != type)
continue;
+   if (security_xfrm_policy_delete(pol))
+   continue;
hlist_del(&pol->bydst);
hlist_del(&pol->byidx);
write_unlock_bh(&xfrm_policy_lock);
diff -urpN linux-2.6.20.orig/net/xfrm/xfrm_state.c 
linux-2.6.20/net/xfrm/xfrm_state.c
--- linux-2.6.20.orig/net/xfrm/xfrm_state.c 2007-03-21 14:25:51.0 
-0500
+++ linux-2.6.20/net/xfrm/xfrm_state.c  2007-03-21 14:27:48.0 -0500
@@ -400,7 +400,8 @@ void xfrm_state_flush(u8 proto, struct x
 restart:
hlist_for_each_entry(x, entry, xfrm_state_bydst+i, bydst) {
if (!xfrm_state_kern(x) &&
-   xfrm_id_proto_match(x->id.proto, proto)) {
+   xfrm_id_proto_match(x->id.proto, proto) &&
+   !security_xfrm_state_delete(x)) {
xfrm_state_hold(x);
spin_unlock_bh(&xfrm_state_lock);
 

[PATCH] [SCTP] Correctly reset ssthresh when restarting association

2007-03-22 Thread Vlad Yasevich

Reset ssthresh to the correct value (peer's a_rwnd) when restarting
association.

Signed-off-by: Vlad Yasevich <[EMAIL PROTECTED]>
---
 net/sctp/transport.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/sctp/transport.c b/net/sctp/transport.c
index c4699f5..4d8c2ab 100644
--- a/net/sctp/transport.c
+++ b/net/sctp/transport.c
@@ -538,7 +538,7 @@ void sctp_transport_reset(struct sctp_transport *t)
 * (see Section 6.2.1)
 */
t->cwnd = min(4*asoc->pathmtu, max_t(__u32, 2*asoc->pathmtu, 4380));
-   t->ssthresh = SCTP_DEFAULT_MAXWINDOW;
+   t->ssthresh = asoc->peer.i.a_rwnd;
t->rto = asoc->rto_initial;
t->rtt = 0;
t->srtt = 0;
-- 
1.5.0.3.438.gc49b2
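
The context line in the hunk above sets the initial cwnd to
min(4*MTU, max(2*MTU, 4380)). As a worked example of that arithmetic only
(a user-space sketch with a made-up function name, unrelated to the ssthresh
change itself):

```c
#include <stdint.h>

/* Initial congestion window in bytes for a given path MTU. */
static uint32_t demo_initial_cwnd(uint32_t pathmtu)
{
    uint32_t lower = 2 * pathmtu > 4380 ? 2 * pathmtu : 4380;
    uint32_t upper = 4 * pathmtu;

    return upper < lower ? upper : lower;
}
```

For a path MTU of 1500 this gives 4380 bytes, for 576 it gives 2304 bytes,
and for 9000 it gives 18000 bytes.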


Re: [ANN] Unified dynamic storage for different socket types instead of separate hash tables.

2007-03-22 Thread Evgeniy Polyakov

On Thu, Mar 22, 2007 at 11:43:04AM -0700, Stephen Hemminger ([EMAIL PROTECTED]) wrote:
> 
> > 
> > diff --git a/include/linux/netlink.h b/include/linux/netlink.h
> > index 2a20f48..f11b4e7 100644
> > --- a/include/linux/netlink.h
> > +++ b/include/linux/netlink.h
> > @@ -151,7 +151,6 @@ struct netlink_skb_parms
> >  #define NETLINK_CB(skb)(*(struct 
> > netlink_skb_parms*)&((skb)->cb))
> >  #define NETLINK_CREDS(skb) (&NETLINK_CB((skb)).creds)
> >  
> > -
> >  extern struct sock *netlink_kernel_create(int unit, unsigned int groups, 
> > void (*input)(struct sock *sk, int len), struct module *module);
> >  extern void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr *nlh, int 
> > err);
> >  extern int netlink_has_listeners(struct sock *sk, unsigned int group);
> 
> Minor nit: this is a one-line blank-only diff, probably just editing crap.

I tried to put the netlink socket definition there, but it did not compile
without major surgery, so I removed it again and dropped one line.
Committed via 'git commit -a -m', so it was too late to change.
 
> >  static inline void udp_lib_close(struct sock *sk, long timeout)
> > diff --git a/net/core/sock.c b/net/core/sock.c
> > index 8d65d64..abe1632 100644
> > --- a/net/core/sock.c
> > +++ b/net/core/sock.c
> > @@ -901,7 +901,9 @@ struct sock *sk_clone(const struct sock *sk, const 
> > gfp_t priority)
> > sock_copy(newsk, sk);
> >  
> > /* SANITY */
> > +#ifndef CONFIG_MDT_LOOKUP
> > sk_node_init(&newsk->sk_node);
> > +#endif
> > sock_lock_init(newsk);
> > bh_lock_sock(newsk);
> 
> I would rather not see ifdef's in code. Could you just stub
> out the function in the include file?

I would prefer to get rid of the hash tables altogether - the code is full of
such ifdefs. But in the short run they can be replaced - I even started doing
that with TCP's inet_lookup and friends.
 
> >  
> > diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
> > index 9e8ef50..5bfb0dc 100644
> > --- a/net/ipv4/Kconfig
> > +++ b/net/ipv4/Kconfig
> > @@ -1,6 +1,14 @@
> >  #
> >  # IP configuration
> >  #
> > +
> > +config MDT_LOOKUP
> > +   bool "Multidimensional trie socket lookup"
> > +   depends on !INET_TCP_DIAG
> > +   help
> > + This option replaces traditional hash table lookup for TCP sockets
> > + with multidimensional trie algorithm (similar to judy trie).
> 
> So you had to break the TCP_DIAG output to make this work.
> Please fix, the ss command is useful.

Yes, the current code does not support statistics.
The existing stats run over the whole hash table; I do not like that approach,
so I will introduce per-protocol lists of all sockets, which can be
accessed from the statistics code, but that is the next step.

> > diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> > index cf358c8..8c32545 100644
> > --- a/net/ipv4/af_inet.c
> > +++ b/net/ipv4/af_inet.c
> > @@ -1360,6 +1360,7 @@ fs_initcall(inet_init);
> >  /* 
> >  */
> >  
> >  #ifdef CONFIG_PROC_FS
> > +#ifndef CONFIG_MDT_LOOKUP
> >  static int __init ipv4_proc_init(void)
> >  {
> > int rc = 0;
> > @@ -1388,7 +1389,12 @@ out_raw:
> > rc = -ENOMEM;
> > goto out;
> >  }
> > -
> > +#else
> > +static int __init ipv4_proc_init(void)
> > +{
> > +   return 0;
> > +}
> > +#endif
> 
> If you are making it into a stub, why not get rid of it
> completely?

I did that for some parts, but created a stub for others.
The proc interface, like the netlink statistics, is not supported, since I
do not want to walk the whole tree to find sockets, so I just removed
it.

> > +
> > +#define MDT_SET_LEAF_STORAGE(leaf, ptr) do { \
> > +   rcu_assign_pointer((leaf), (struct mdt_node *)(((unsigned long)(ptr)) | 
> > MDT_LEAF_STRUCT_BIT)); \
> > +} while (0)
> 
> Macro yuckiness.

You did not see what Rusty created...
:)

> 
> 
> What is the code size change with this?

 include/linux/netlink.h|1 -
 include/net/af_unix.h  |4 +-
 include/net/inet_connection_sock.h |5 +-
 include/net/inet_hashtables.h  |5 -
 include/net/inet_timewait_sock.h   |   12 +
 include/net/lookup.h   |  120 +++
 include/net/netlink.h  |   29 ++
 include/net/raw.h  |1 +
 include/net/sock.h |  106 ---
 include/net/tcp.h  |7 +-
 include/net/udp.h  |5 +
 net/core/sock.c|2 +
 net/ipv4/Kconfig   |8 +
 net/ipv4/Makefile  |7 +-
 net/ipv4/af_inet.c |8 +-
 net/ipv4/icmp.c|   10 +
 net/ipv4/inet_connection_sock.c|6 +-
 net/ipv4/inet_diag.c   |   11 +-
 net/ipv4/inet_timewait_sock.c  |   35 ++-
 net/ipv4/ip_input.c|   10 +-
 net/ipv4/mdt.c |  598 
 net/ipv4/raw.c |   47 +++-
 net/ipv4/tcp.c |

Re: [PATCH]: Add security check before flushing SAD/SPD

2007-03-22 Thread David Miller

From: Joy Latten <[EMAIL PROTECTED]>
Date: Thu, 22 Mar 2007 12:35:39 -0600

> Within selinux we check for authorization before deleting entries from
> SAD and SPD. 
> 
> We are not checking for authorization when flushing the SPD and
> the SAD. It was perhaps missed in original patch.
> 
> This patch adds security check when flushing entries from SAD and SPD.
> 
> Please let me know if this patch is ok.
> It was built against linux-2.6.21-rc4-git5. I have also tested it.
> 
> Signed-off-by: Joy Latten<[EMAIL PROTECTED]>

I don't understand this and it does not sit well with me.

If we are flushing the policy database, we are flushing it
regardless of what the security layer might or might not say.

I would look at this patch differently if there were some
security level key being checked for a match here, which is
an input key to the flush, but that is not what is happening
here as the object is being looked at by itself.

Re: [ANN] Unified dynamic storage for different socket types instead of separate hash tables.

2007-03-22 Thread David Miller

From: Evgeniy Polyakov <[EMAIL PROTECTED]>
Date: Thu, 22 Mar 2007 21:59:44 +0300

> Yes, the current code does not support statistics.
> The existing stats run over the whole hash table; I do not like that approach,
> so I will introduce per-protocol lists of all sockets, which can be
> accessed from the statistics code, but that is the next step.

We are _NOT_ bloating up the socket structure even more because your
data structure does not support a "iterate over all objects"
operation.

We got rid of the linked list of all sockets per-protocol precisely
for this reason 10 years ago, you cannot add it back, sorry.

Re: [PATCH 00/12] [RESEND] rtnetlink message handler registration interface

2007-03-22 Thread David Miller

From: Thomas Graf <[EMAIL PROTECTED]>
Date: Thu, 22 Mar 2007 13:59:55 +0100

> The existing function names seem to have sentimental value to some
> people. Same patches but without changes to the function names.
> 
> Introduces an interface to register rtnetlink message handlers
> and converts all users of rtnl_links[].

All 12 patches applied, thanks Thomas.


Re: [PATCH 09/12] [DECNet]: Use rtnl registration interface

2007-03-22 Thread David Miller

From: Steven Whitehouse <[EMAIL PROTECTED]>
Date: Thu, 22 Mar 2007 13:07:54 +

> Hi,
> 
> On Thu, Mar 22, 2007 at 02:00:04PM +0100, Thomas Graf wrote:
> > Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>
> >
> Acked-by: Steven Whitehouse <[EMAIL PROTECTED]>
> 
> for all the DECnet bits & also the DECnet changes in the other patch I saw
> from you relating to the routing rules,

Thanks for helping review Steven.


Re: [RESEND] [NET] rules: Unified rules dumping

2007-03-22 Thread David Miller

From: Thomas Graf <[EMAIL PROTECTED]>
Date: Thu, 22 Mar 2007 14:03:02 +0100

> Rediffed based on new rtnl registration patches.
> 
> Implements a unified, protocol independent rules dumping function
> which is capable of both dumping a specific protocol family or
> all of them. This speeds up dumping as fewer lookups are required.
> 
> Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

This also looks good, applied, thanks Thomas.

Re: [ANN] Unified dynamic storage for different socket types instead of separate hash tables.

2007-03-22 Thread Evgeniy Polyakov

On Thu, Mar 22, 2007 at 12:03:46PM -0700, David Miller ([EMAIL PROTECTED]) 
wrote:
> From: Evgeniy Polyakov <[EMAIL PROTECTED]>
> Date: Thu, 22 Mar 2007 21:59:44 +0300
> 
> > Yes, current code does not support statistics.
> > Existing stats run over whole hash table, I do not like such approach,
> > so I will introduce a per-protocol lists of all sockets, which can be
> > accessed from statistics code, but it is next step.
> 
> We are _NOT_ bloating up the socket structure even more because your
> data structure does not support a "iterate over all objects"
> operation.
> 
> We got rid of the linked list of all sockets per-protocol precisely
> for this reason 10 years ago, you cannot add it back, sorry.

Hmm...
My patch _removes_ them from socket structures!

--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -114,10 +114,12 @@ struct sock_common {
        volatile unsigned char  skc_state;
        unsigned char           skc_reuse;
        int                     skc_bound_dev_if;
-       struct hlist_node       skc_node;
        struct hlist_node       skc_bind_node;
        atomic_t                skc_refcnt;
+#ifndef CONFIG_MDT_LOOKUP
+       struct hlist_node       skc_node;
        unsigned int            skc_hash;
+#endif
        struct proto            *skc_prot;
 };

I deliberately keep only one hash structure in the socket - skc_bind_node -
to be used for statistics (and for netlink broadcasting too), and remove
skc_hash and skc_node, so this code reduces the socket structure by 12
bytes on x86 (20 bytes on x86_64).
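
The byte counts follow from the two fields that are dropped: an hlist_node
is two pointers and skc_hash is one unsigned int. A minimal userspace
sketch of that arithmetic, assuming the usual hlist_node layout (next,
pprev) and ignoring any padding the compiler may add inside the real
struct sock_common; removed_bytes() and print_savings() are hypothetical
helpers, not kernel functions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace copy of the kernel's hlist_node layout: two pointers. */
struct hlist_node {
	struct hlist_node *next;
	struct hlist_node **pprev;
};

/* Bytes saved by dropping skc_node and skc_hash for a given pointer size. */
static size_t removed_bytes(size_t ptr_size)
{
	return 2 * ptr_size + sizeof(unsigned int);
}

/* Print the savings for 32-bit and 64-bit pointers. */
static void print_savings(void)
{
	/* sanity check: the node really is two pointers on this machine */
	assert(sizeof(struct hlist_node) == 2 * sizeof(void *));
	printf("x86:    %zu bytes\n", removed_bytes(4));
	printf("x86_64: %zu bytes\n", removed_bytes(8));
}
```

Calling print_savings() from a main() reports 12 and 20 bytes on platforms
where unsigned int is 4 bytes wide.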

-- 
Evgeniy Polyakov

Re: [ANN] Unified dynamic storage for different socket types instead of separate hash tables.

2007-03-22 Thread David Miller

From: Evgeniy Polyakov <[EMAIL PROTECTED]>
Date: Thu, 22 Mar 2007 22:09:40 +0300

> I specially have only one hash structure in the socket - skc_bind_node -
> to be used for statistics and remove hash and skc_node (and for netlink
> broadcasting too), so this code reduces socket structure by 12 bytes on
> x86 (20 bytes on x86_64).

Yes, for your trie you've removed quite a bit, but now you're
going to add 2 pointers right back right?

Re: [ANN] Unified dynamic storage for different socket types instead of separate hash tables.

2007-03-22 Thread Evgeniy Polyakov

On Thu, Mar 22, 2007 at 12:03:46PM -0700, David Miller ([EMAIL PROTECTED]) 
wrote:
> From: Evgeniy Polyakov <[EMAIL PROTECTED]>
> Date: Thu, 22 Mar 2007 21:59:44 +0300
> 
> > Yes, current code does not support statistics.
> > Existing stats run over whole hash table, I do not like such approach,
> > so I will introduce a per-protocol lists of all sockets, which can be
> > accessed from statistics code, but it is next step.
> 
> We are _NOT_ bloating up the socket structure even more because your
> data structure does not support a "iterate over all objects"
> operation.

And to be absolutely clear - the existing interface does not support it
either - we iterate over every single hash entry, and then over every
single item in the chain (if it exists). I can create the same for the
tree - it is not complex at all, but it is not the optimal solution, and
since I remove several fields, I think it is not that bad to remove a
bit less and optimize the 'iterate over all objects' case a bit.
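
The traversal described here (every bucket, then every item in its chain)
can be sketched in userspace C. This is an illustration only, not the
kernel's code; struct item, struct table, table_insert() and
table_for_each() are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKETS 8	/* illustrative size; must be a power of two */

struct item {
	unsigned int key;
	struct item *next;	/* next item in the same hash chain */
};

struct table {
	struct item *bucket[NBUCKETS];
};

/* Push an item on the head of the chain selected by its key. */
static void table_insert(struct table *t, struct item *it)
{
	unsigned int slot;

	assert(t && it);
	slot = it->key & (NBUCKETS - 1);
	it->next = t->bucket[slot];
	t->bucket[slot] = it;
}

/*
 * Visit every item. The outer loop touches every bucket, including the
 * empty ones, so the cost is O(NBUCKETS + number of items).
 * Returns the number of items visited.
 */
static size_t table_for_each(const struct table *t,
			     void (*fn)(const struct item *, void *),
			     void *arg)
{
	size_t visited = 0;

	for (size_t i = 0; i < NBUCKETS; i++) {
		for (const struct item *it = t->bucket[i]; it; it = it->next) {
			if (fn)
				fn(it, arg);
			visited++;
		}
	}
	return visited;
}
```

No per-object "list of everything" pointer is needed for this, at the cost
of walking empty buckets.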

-- 
Evgeniy Polyakov

Re: [NET]: Fix fib_rules dump race

2007-03-22 Thread Thomas Graf

* Patrick McHardy <[EMAIL PROTECTED]> 2007-03-22 15:38

> [NET]: Fix fib_rules dump race
> 
> fib_rules_dump needs to use list_for_each_entry_rcu to protect against
> concurrent changes to the rules list.

Good catch, it's not serialized with add/del by rtnl mutex if the
dump gets interrupted.

Re: [PATCH 0/5] netem performance improvements

2007-03-22 Thread David Miller

From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Wed, 21 Mar 2007 10:42:31 -0700

> The following patches for the 2.6.22 net tree, increase the
> performance of netem by about 2x.  With 2.6.20 getting about
> 100K (out of possible 300K) packets per second, after these
> patches now at over 200K pps.

All patches applied to net-2.6.22, thanks Stephen.

Re: [ANN] Unified dynamic storage for different socket types instead of separate hash tables.

2007-03-22 Thread Evgeniy Polyakov

On Thu, Mar 22, 2007 at 12:14:41PM -0700, David Miller ([EMAIL PROTECTED]) 
wrote:
> From: Evgeniy Polyakov <[EMAIL PROTECTED]>
> Date: Thu, 22 Mar 2007 22:09:40 +0300
> 
> > I specially have only one hash structure in the socket - skc_bind_node -
> > to be used for statistics and remove hash and skc_node (and for netlink
> > broadcasting too), so this code reduces socket structure by 12 bytes on
> > x86 (20 bytes on x86_64).
> 
> Yes, for your trie you've removed quite a bit, but now you're
> going to add 2 pointers right back right?

No, I will use the same hlist_node pointer (skc_bind_node) which was
there, and skc_node and skc_hash are removed.

After all - we can traverse the whole tree one by one; it is even
possible to attach a bitmask of used/free entries to each level node
(since it is an array) and use it.

-- 
Evgeniy Polyakov

Re: [ANN] Unified dynamic storage for different socket types instead of separate hash tables.

2007-03-22 Thread David Miller

From: Evgeniy Polyakov <[EMAIL PROTECTED]>
Date: Thu, 22 Mar 2007 22:14:49 +0300

> And to be absolutely clear - existing interface does not support it too
> - we iterate over every single hash entry, and then over every single
> item in the chain (if it exists). I can create the same for the tree -
> it is not complex at all, but it is not the most optimal solution, and 
> since I remove several entries, I think it is not that bad to remove a 
> bit less and optimize 'iterate over all object' case a bit.

This results in your trie having two new run-time costs:

1) More expensive trie insert/delete compared to hash
   insert/delete

2) An extra list insert/delete to give list of all sockets

So connection setup/teardown will be more expensive and
therefore our connection rates will be lower.

Evgeniy, your ideas are beautiful in theory, but all the details
kill all of your non-trivial work and make it useless in the end.

Re: [PATCH 0/3] [PATCHSET] netlink error management

2007-03-22 Thread David Miller

From: Thomas Graf <[EMAIL PROTECTED]>
Date: Thu, 22 Mar 2007 00:18:01 +0100

> This series of patches simplifies the error management and
> signalization of dump starts of netlink_run_queue() message
> handlers. It touches a fair bit of nfnetlink code as the
> error pointer has been passed on to subsystems.

Thomas can you respin these patches?  They no longer apply
now that I put your other rtnl_link bits into net-2.6.22

Thanks.

Re: [NET]: Fix fib_rules dump race

2007-03-22 Thread David Miller


Applied.

Re: [BRIDGE]: Fix fdb RCU race

2007-03-22 Thread David Miller

From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Thu, 22 Mar 2007 19:29:09 +0100

> Fix what looks like a RCU race. Untested since this is only
> used by ATM, which I don't have.

Also applied, thanks Patrick.


Re: [PATCH] [SCTP] Correctly reset ssthresh when restarting association

2007-03-22 Thread David Miller

From: Vlad Yasevich <[EMAIL PROTECTED]>
Date: Thu, 22 Mar 2007 14:56:19 -0400

> Reset ssthresh to the correct value (peer's a_rwnd) when restarting
> association.
> 
> Signed-off-by: Vlad Yasevich <[EMAIL PROTECTED]>

Applied, thanks Vlad.

Re: fix up misplaced inlines.

2007-03-22 Thread David Miller

From: Dave Jones <[EMAIL PROTECTED]>
Date: Wed, 21 Mar 2007 20:18:28 -0400

> Turning up the warnings on gcc makes it emit warnings
> about the placement of 'inline' in function declarations.
> Here's everything that was under net/
> 
> Signed-off-by: Dave Jones <[EMAIL PROTECTED]>

Applied, thanks a lot Dave.

Re: [ANN] Unified dynamic storage for different socket types instead of separate hash tables.

2007-03-22 Thread Evgeniy Polyakov

On Thu, Mar 22, 2007 at 12:21:20PM -0700, David Miller ([EMAIL PROTECTED]) 
wrote:
> From: Evgeniy Polyakov <[EMAIL PROTECTED]>
> Date: Thu, 22 Mar 2007 22:14:49 +0300
> 
> > And to be absolutely clear - existing interface does not support it too
> > - we iterate over every single hash entry, and then over every single
> > item in the chain (if it exists). I can create the same for the tree -
> > it is not complex at all, but it is not the most optimal solution, and 
> > since I remove several entries, I think it is not that bad to remove a 
> > bit less and optimize 'iterate over all object' case a bit.
> 
> This results in your trie having two new run-time costs:
> 
> 1) More expensive trie insert/delete compared to hash
>insert/delete

That's true, it requires additional allocation, which can be combined
with socket allocation though.

> 2) An extra list insert/delete to give list of all sockets

That is too small price.
And as I stated in another mail - we can iterate over the whole tree just
like we do with hash tables - even faster, if I attach a bitmask of
used entries to each level node.

> So connection setup/teardown will be more expensive and
> therefore our connection rates will be lower.

Hmm, I'm not sure - we already allocate several items during connection
setup/teardown time (request socket, usual socket, timewait socket),
so it is not at all certain that one additional allocation will end up
with slower times.

And do not forget the other benefits a dynamic structure gives us
compared to a static hash table: DoS resistance and traversal
speed/locking.

> Evgeniy, your ideas are beautiful in theory, but all the details
> kill all of your non-trivial work and make it useless in the end.

I especially agree with first part of the sentence :))
So far I can not see serious disadvantages in this design.

-- 
Evgeniy Polyakov

Re: [ANN] Unified dynamic storage for different socket types instead of separate hash tables.

2007-03-22 Thread David Miller

From: Evgeniy Polyakov <[EMAIL PROTECTED]>
Date: Thu, 22 Mar 2007 22:30:52 +0300

> > 2) An extra list insert/delete to give list of all sockets
> 
> That is too small price.

In your imagination.  Our connection rates went up significantly
when I got rid of the linked list we had many years ago.

Every memory access matters.

Re: [ANN] Unified dynamic storage for different socket types instead of separate hash tables.

2007-03-22 Thread Evgeniy Polyakov

On Thu, Mar 22, 2007 at 12:36:04PM -0700, David Miller ([EMAIL PROTECTED]) 
wrote:
> > > 2) An extra list insert/delete to give list of all sockets
> > 
> > That is too small price.
> 
> In your imagination.  Our connection rates went up significantly
> when I got rid of the linked list we had many years ago.
> 
> Every memory access matters.

Ok, I never liked linked lists actually. :)

This one can be completely eliminated (hmm, it does not even exist so
far) by having a per-node bitmask of used/free entries - it will be
even faster than the existing access and will not require locks (due to
RCU protection).

So, this allows removing the additional hlist_node structure from the
socket (and changing the netlink one to not use it in broadcasting, so for
netlink sockets it will be moved into the private netlink structure).
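
A rough sketch of the per-node bitmask idea: each node keeps a bitmap of
occupied slots, so a full traversal skips empty slots without reading
them. The node layout, the fan-out of 64 and all names below are
assumptions made for illustration; the real MDT code is not shown in
this thread, and the RCU side is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NODE_SLOTS 64	/* assumed fan-out so one 64-bit word covers a node */

struct node {
	uint64_t used;			/* bit i set <=> slot[i] is occupied */
	void *slot[NODE_SLOTS];
};

/* Store an object in slot i and mark the slot as used. */
static void node_set(struct node *n, unsigned int i, void *obj)
{
	assert(i < NODE_SLOTS && obj != NULL);
	n->slot[i] = obj;
	n->used |= (uint64_t)1 << i;
}

/* Empty slot i and clear its bit. */
static void node_clear(struct node *n, unsigned int i)
{
	assert(i < NODE_SLOTS);
	n->slot[i] = NULL;
	n->used &= ~((uint64_t)1 << i);
}

/* Visit only the occupied slots; returns how many were visited. */
static size_t node_for_each(const struct node *n,
			    void (*fn)(void *obj, void *arg), void *arg)
{
	uint64_t bits = n->used;
	size_t visited = 0;

	while (bits) {
		unsigned int i = 0;

		/* index of the lowest set bit (portable, no builtins) */
		while (!(bits & ((uint64_t)1 << i)))
			i++;
		if (fn)
			fn(n->slot[i], arg);
		visited++;
		bits &= bits - 1;	/* clear the lowest set bit */
	}
	return visited;
}
```

A real implementation would likely use a find-first-set instruction
instead of the inner loop; the loop is kept here for portability.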

-- 
Evgeniy Polyakov

Re: RFC: Established connections hash function

2007-03-22 Thread Nikolaos D. Bougalis


On Thu, Mar 22, 2007 11:21 AM, Evgeniy Polyakov <[EMAIL PROTECTED]> wrote:


>>   Utterly broken? Nonsense. I have tested the actual function I proposed
>> (sans the __force and __u32 stuff, which weren't necessary in my test
>> program), against real data, collected from various servers in real-time.
>> It has consistently achieved lower average chain lengths than the vanilla
>> function and demonstrated no artifacting, and that's trivial to verify.
>
> So what?


   So what? Are you serious?



> People test and work with XOR hash for years and they do not strike any
> problems. If we talk about specially crafted data, then XOR one is no
> worse than Jenkins with 3 words (which is even worse for blind attack of
> constant ports).


   People _have_ had problems. _I_ have had problems. And when someone with 
a few thousand drones under his control hoses your servers because he can do 
math and he leaves you with 2-item long chains, _you_ will have 
problems. And sticking your head in the sand and saying "people work with 
XOR hash for years and they do not strike any problems" won't help you.
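
To make the concern concrete, here is a hedged userspace sketch of the
XOR-based function, written from memory of the 2.6.20-era inet_ehashfn()
(treat the exact body as an assumption). Because all four inputs are
XORed together before folding, flipping the same bits in the remote
address and the remote port leaves the result unchanged. colliding_peer()
is a hypothetical helper that builds such a pair:

```c
#include <assert.h>
#include <stdint.h>

/* XOR-and-fold hash in the style of the vanilla inet_ehashfn(). */
static uint32_t xor_ehashfn(uint32_t laddr, uint16_t lport,
			    uint32_t faddr, uint16_t fport)
{
	uint32_t h = (laddr ^ lport) ^ (faddr ^ fport);

	h ^= h >> 16;
	h ^= h >> 8;
	return h;
}

/*
 * Given one remote (faddr, fport) pair, derive another that hashes to the
 * same value: XOR the same 16-bit delta into the low address bits and the
 * port, so the two changes cancel inside the hash.
 */
static void colliding_peer(uint32_t faddr, uint16_t fport, uint16_t delta,
			   uint32_t *faddr2, uint16_t *fport2)
{
	assert(faddr2 && fport2);
	*faddr2 = faddr ^ delta;
	*fport2 = (uint16_t)(fport ^ delta);
}
```

No secret is involved, so anyone who can choose source ports for a set of
source addresses can compute such pairs offline.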




>>   The only analysis I could find was this
>> http://tservice.net.ru/~s0mbre/blog/2006/05/14#2006_05_14, which uses
>> jhash_2words, and not jhash_3words, and which naively attempts to take the
>> output of jhash_2words, and to perform the same mixing trick that the
>> vanilla inet_ehashfn does and uses artificially generated data sets.
>
> It is outdated, check recent netdev@ archives. Folding used in that test
> does not change distribution, and data was presented as it can be
> selected by attacker, who can create with any distribution.


   Be careful here. If the folding makes no difference, it says something 
very important about __jhash_mix, and that something goes against the very 
thing that you are saying.




>>   But please, feel free to point out any other _unfavorable_ analyses of
>> jhash_2words or jhash_3words that I may have missed.
>>
>>> We can use jhash_2words(laddr, faddr, portpair^inet_ehash_rnd) though.
>>
>>   Please explain to me how jhash_2words solves the issue that you claim
>> jhash_3words has, when they both use the same underlying bit-mixer?
>
> $c value is not properly distributed and significantly breaks overall
> distribution. Attacker, which controls $c (and it does it by controlling
> ports), can significantly increase selected hash chains.


   I've tested the Jenkins hash extensively. I see no evidence of this 
"improper distribution" that you describe. In fact, about the only person 
that I've seen advocate this in the archives of netdev is you, and a lot of 
other very smart people disagree with you, so I consider myself to be in 
good company.




> But it is only $c, $a and $b are properly distributed, so jhash_2words()
> is safer than jhash_3words().
> Just create a simple application which does
> jhash_3words(a, b, rand(), init) and jhash_2words(a, b, rand()) and see
> results.


   What exactly am I supposed to see in these results? Because whatever it 
is, it's not there. Feel free to provide a link to your data and a histogram 
that shows what you find of interest though, and I'll be happy to look at 
it.
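
For reference, the experiment proposed in the quoted text could be
sketched in userspace roughly as follows. The mixing macro and the two
wrappers are reproduced from memory of the 2.6-era include/linux/jhash.h
and should be treated as an assumption; NBUCKETS, longest_chain() and its
parameters are hypothetical choices made for this sketch:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define JHASH_GOLDEN_RATIO 0x9e3779b9u
#define NBUCKETS 1024u	/* arbitrary power of two for the experiment */

/* Bob Jenkins' lookup2 mixing step, as recalled from 2.6-era jhash.h. */
#define JHASH_MIX(a, b, c) do {			\
	a -= b; a -= c; a ^= (c >> 13);		\
	b -= c; b -= a; b ^= (a << 8);		\
	c -= a; c -= b; c ^= (b >> 13);		\
	a -= b; a -= c; a ^= (c >> 12);		\
	b -= c; b -= a; b ^= (a << 16);		\
	c -= a; c -= b; c ^= (b >> 5);		\
	a -= b; a -= c; a ^= (c >> 3);		\
	b -= c; b -= a; b ^= (a << 10);		\
	c -= a; c -= b; c ^= (b >> 15);		\
} while (0)

static uint32_t jhash_3words(uint32_t a, uint32_t b, uint32_t c,
			     uint32_t initval)
{
	a += JHASH_GOLDEN_RATIO;
	b += JHASH_GOLDEN_RATIO;
	c += initval;
	JHASH_MIX(a, b, c);
	return c;
}

static uint32_t jhash_2words(uint32_t a, uint32_t b, uint32_t initval)
{
	return jhash_3words(a, b, 0, initval);
}

/*
 * Hash n inputs (a, b, rand()) with either variant, count how many land
 * in each bucket, and return the length of the longest chain.
 */
static unsigned int longest_chain(int use_3words, uint32_t a, uint32_t b,
				  uint32_t init, unsigned int n,
				  unsigned int seed)
{
	static unsigned int chain[NBUCKETS];
	unsigned int max = 0;

	assert(n > 0);
	memset(chain, 0, sizeof(chain));
	srand(seed);
	for (unsigned int i = 0; i < n; i++) {
		uint32_t r = (uint32_t)rand();
		uint32_t h = use_3words ? jhash_3words(a, b, r, init)
					: jhash_2words(a, b, r);
		unsigned int len = ++chain[h & (NBUCKETS - 1)];

		if (len > max)
			max = len;
	}
	return max;
}
```

Under these definitions jhash_2words(a, b, x) is jhash_3words(a, b, 0, x),
so both variants feed rand() into the same third word and differ only by
the constant init added to it. A per-bucket histogram, rather than just
the longest chain, would be the natural next thing to print.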


   -n




Re: RFC: Established connections hash function

2007-03-22 Thread Evgeniy Polyakov

On Thu, Mar 22, 2007 at 12:44:09PM -0700, Nikolaos D. Bougalis
([EMAIL PROTECTED]) wrote:
> On Thu, Mar 22, 2007 11:21 AM, Evgeniy Polyakov <[EMAIL PROTECTED]> wrote:
> 
> >>   Utterly broken? Nonsense. I have tested the actual function I proposed
> >>(sans the __force and __u32 stuff, which weren't necessary in my test
> >>program), against real data, collected from various servers in real-time.
> >>It has consistently achieved lower average chain lengths than the vanilla
> >>function and demonstrated no artifacting, and that's trivial to verify.
> >
> >So what?
> 
>So what? Are you serious?

We started our discussion a bit wrong - let's start it again, ok? :)
 
> >People test and work with XOR hash for years and they do not strike any
> >problems. If we talk about specially crafted data, then XOR one is no
> >worse than Jenkins with 3 words (which is even worse for blind attack of
> >constant ports).
> 
>People _have_ had problems. _I_ have had problems. And when someone with 
> a few thousand drones under his control hoses your servers because he can 
> do math and he leaves you with 2-item long chains, _you_ will have 
> problems. And sticking your head in the sand and saying "people work with 
> XOR hash for years and they do not strike any problems" won't help you.

You do not want to read what was written - _if_ we use artificial data,
then an attacker can use it too, so if it is possible to break the system
with artificial data, then it is possible it will be broken in real
life. If we use usual data, then we are ok (although Jenkins with 3
words is not ok).
 
> >>   The only analysis I could find was this
> >>http://tservice.net.ru/~s0mbre/blog/2006/05/14#2006_05_14, which uses
> >>jhash_2words, and not jhash_3words, and which naively attempts to take the
> >>output of jhash_2words, and to perform the same mixing trick that the
> >>vanilla inet_ehashfn does and uses artificially generated data sets.
> >
> >It is outdated, check recent netdev@ archives. Folding used in that test
> >does not change distribution, and data was presented as it can be
> >selected by attacker, who can create with any distribution.
> 
>Be careful here. If the folding makes no difference, it says something 
> very important about __jhash_mix, and that something goes against the very 
> thing that you are saying.

Grrr, I think I have pointed out several times already that properly
distributed values do not change distribution after folding. And it can
be seen in all tests (including the one you pointed to).

> >>   But please, feel free to point out any other _unfavorable_ analyses of
> >>jhash_2words or jhash_3words that I may have missed.
> >>
> >>
> >>>We can use jhash_2words(laddr, faddr, portpair^inet_ehash_rnd) though.
> >>
> >>   Please explain to me how jhash_2words solves the issue that you claim
> >>jhash_3words has, when they both use the same underlying bit-mixer?
> >
> >$c value is not properly distributed and significantly breaks overall
> >distribution. Attacker, which controls $c (and it does it by controlling
> >ports), can significantly increase selected hash chains.
> 
>I've tested the Jenkins hash extensively. I see no evidence of this 
> "improper distribution" that you describe. In fact, about the only person 
> that I've seen advocate this in the archives of netdev is you, and a lot of 
> other very smart people disagree with you, so I consider myself to be in 
> good company.

Hmm, I ran tests to select a proper hash for the netchannel implementation
(actually the same as for sockets) and showed the Jenkins hash problems -
it is enough to have one problem to state that there is a problem, isn't
it?
 
> >But it is only $c, $a and $b are properly distributed, so jhash_2words()
> >is safer than jhash_3words().
> >Just create a simple application which does
> >jhash_3words(a, b, rand(), init) and jhash_2words(a, b, rand()) and see
> >results.
> 
>What exactly am I supposed to see in these results? Because whatever it 
> is, it's not there. Feel free to provide a link to your data and a 

I will try to decipher phrase 'whatever it is, it's not there'...

> histogram that shows what you find of interest though, and I'll be happy to 
> look at it.

This thread for example:
http://marc.info/?t=11705761351&r=1&w=2

Your one test shows there are no problems; try the one I propose, which
can even be created in userspace - you do not even want to take into
account what I am trying to say to you.

-- 
Evgeniy Polyakov

Re: [PATCH 0/3] [PATCHSET] netlink error management

2007-03-22 Thread Thomas Graf

* David Miller <[EMAIL PROTECTED]> 2007-03-22 12:23
> From: Thomas Graf <[EMAIL PROTECTED]>
> Date: Thu, 22 Mar 2007 00:18:01 +0100
> 
> > This series of patches simplifies the error management and
> > signalization of dump starts of netlink_run_queue() message
> > handlers. It touches a fair bit of nfnetlink code as the
> > error pointer has been passed on to subsystems.
> 
> Thomas can you respin these patches?  They no longer apply
> now that I put your other rtnl_link bits into net-2.6.22

I guess I've posted too many patch series; the patches should
still apply, at least they do in my quilt tree, but they depend on
others. The series I've posted, in the correct order, are:

[RESEND] rtnetlink message handler registration interface
  already applied
 
[RESEND] [NET] rules: Unified rules dumping
  already applied

[PATCH 0/5] [PATCHSET] Netlink Patches
  even though there were objections because of nfnetlink
  at first, the patches can go in as-is

[NETFILTER] nfnetlink: netlink_run_queue() already checks
   for NLM_F_REQUEST

[PATCH 0/3] [PATCHSET] netlink error management

Re: L2TP support?

2007-03-22 Thread James Chapman


Hi Ingo,

Ingo Oeser wrote:
> Hi James,
>
> James Chapman wrote:
>> Is there interest in adding L2TP support?
>
> Yes, if there is also a user space part somewhere.


Yes there is. There's a pppd plugin which comes with the openl2tp 
project, http://sf.net/projects/openl2tp. OpenL2TP supports both LAC and 
LNS operation. A patch is also available to allow this driver to be used 
with another L2TP implementation, l2tpd.



>> I have a patch which could be submitted for review. The PPPoL2TP driver
>> presents a PPPoX socket to userspace pppd in the same way as the PPPoE
>> and PPPoATM drivers. The kernel handles all data traffic, while
>> userspace daemons do L2TP and PPP control message processing.
>
> Like the pppoe-plugin for pppd?


Yes. :) The plan is to submit the pppol2tp plugin for pppd for inclusion 
in the pppd distro after the kernel driver is integrated.


> Yes, please check your patch using the nice checklist in
> Documentation/SubmitChecklist and do it in suitable chunks
> according to Documentation/SubmittingPatches


Will do.

--
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development



Re: [PATCH 4/5] netem: avoid excessive requeues

2007-03-22 Thread Patrick McHardy

Stephen Hemminger wrote:
> @@ -315,6 +316,7 @@ void qdisc_watchdog_schedule(struct qdis
>   ktime_t time;
>  
>   wd->qdisc->flags |= TCQ_F_THROTTLED;
> + smp_wmb();
>   time = ktime_set(0, 0);
>   time = ktime_add_ns(time, PSCHED_US2NS(expires));
>   hrtimer_start(&wd->timer, time, HRTIMER_MODE_ABS);
> @@ -325,6 +327,7 @@ void qdisc_watchdog_cancel(struct qdisc_
>  {
>   hrtimer_cancel(&wd->timer);
>   wd->qdisc->flags &= ~TCQ_F_THROTTLED;
> + smp_wmb();
>  }
>  EXPORT_SYMBOL(qdisc_watchdog_cancel);


These two look unnecessary, we're holding the queue lock.

> --- net-2.6.22.orig/net/sched/sch_netem.c
> +++ net-2.6.22/net/sched/sch_netem.c
> @@ -272,6 +272,10 @@ static struct sk_buff *netem_dequeue(str
>   struct netem_sched_data *q = qdisc_priv(sch);
>   struct sk_buff *skb;
>  
> + smp_mb();
> + if (sch->flags & TCQ_F_THROTTLED)
> + return NULL;
> +


Perhaps we should put this in qdisc_restart, other qdiscs have the
same problem.
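
As a hedged illustration of that suggestion: the throttled check could
live in the generic restart path so that every qdisc benefits, instead of
being open-coded in netem_dequeue(). The sketch below is userspace C with
hypothetical names (struct queue, QF_THROTTLED standing in for
TCQ_F_THROTTLED, queue_restart() standing in for qdisc_restart()); it
assumes the caller holds the queue lock, which is why it has no barriers.

```c
#include <assert.h>
#include <stddef.h>

#define QF_THROTTLED 0x1u	/* stands in for TCQ_F_THROTTLED */

struct pkt {
	int id;
};

struct queue {
	unsigned int flags;
	struct pkt *(*dequeue)(struct queue *q);	/* qdisc-specific op */
	struct pkt *head;		/* single-packet "queue" for the demo */
	unsigned int dequeue_calls;	/* counts calls into ->dequeue */
};

/* Trivial qdisc-specific dequeue used by the demo. */
static struct pkt *simple_dequeue(struct queue *q)
{
	struct pkt *p = q->head;

	q->dequeue_calls++;
	q->head = NULL;
	return p;
}

/* Generic restart path: skip the qdisc's dequeue while it is throttled. */
static struct pkt *queue_restart(struct queue *q)
{
	assert(q && q->dequeue);
	if (q->flags & QF_THROTTLED)
		return NULL;
	return q->dequeue(q);
}
```

With the check here, a throttled qdisc's dequeue op is never entered, so
the packet is not pulled out only to be requeued.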

Re: RFC: Established connections hash function

2007-03-22 Thread Nikolaos D. Bougalis




> We started our discussion a bit wrong - let's start it again, ok? :)


   Fair enough.



> You do not want to read what was written - _if_ we use artificial data,
> then an attacker can use it too, so if it is possible to break the system
> with artificial data, then it is possible it will be broken in real
> life. If we use usual data, then we are ok (although Jenkins with 3
> words is not ok).


   I'm not saying that an attacker cannot use artificial data. Indeed, 
algorithmic complexity attacks are all about 'crafting' artificial data with 
certain properties. So, yes, I absolutely agree that attackers can and do 
use "artificial data."



>>   Be careful here. If the folding makes no difference, it says something
>> very important about __jhash_mix, and that something goes against the very
>> thing that you are saying.
>
> Grrr, I think I have pointed out several times already that properly
> distributed values do not change distribution after folding. And it can
> be seen in all tests (including the one you pointed to).


   Yes, I agree that the folding will not be a problem _IF_ the values are 
properly distributed -- although in that case, the folding is unnecessary. 
But that the Jenkins distribution didn't change (according to posts you 
made) after folding says that the output of Jenkins is pretty good to begin 
with ;)



>>>>> We can use jhash_2words(laddr, faddr, portpair^inet_ehash_rnd) though.
>>>>
>>>>   Please explain to me how jhash_2words solves the issue that you claim
>>>> jhash_3words has, when they both use the same underlying bit-mixer?
>>>
>>> $c value is not properly distributed and significantly breaks overall
>>> distribution. Attacker, which controls $c (and it does it by controlling
>>> ports), can significantly increase selected hash chains.


   Even if we assume that $c is not properly distributed, using a secret 
cookie and mixing operations from different algebraic groups changes the 
calculus dramatically. It's no longer straight-forward for the attacker to 
generate collisions (as it is with the current function) because the '$c' 
supplied by the attacker is used in conjunction with the secret cookie 
before __jhash_mix thoroughly mixes the inputs to generate a hash.




>>   I've tested the Jenkins hash extensively. I see no evidence of this
>> "improper distribution" that you describe. In fact, about the only person
>> that I've seen advocate this in the archives of netdev is you, and a lot of
>> other very smart people disagree with you, so I consider myself to be in
>> good company.
>
> Hmm, I ran tests to select a proper hash for the netchannel implementation
> (actually the same as for sockets) and showed the Jenkins hash problems -
> it is enough to have one problem to state that there is a problem, isn't
> it?


   Again, from what I've seen from your other posts, I don't believe you've 
identified any inherent problems with the Jenkins hash.


   But that aside for a moment, surely you will agree that the ability of 
an attacker with a few dozen machines under his control to trivially mount 
an algorithmic complexity attack causing serious performance drops is also a 
problem with the current code and one that must be addressed.




> I will try to decipher phrase 'whatever it is, it's not there'...


   It meant that I saw nothing particularly interesting running the example 
you suggested and looking at the output.




> This thread for example:
> http://marc.info/?t=11705761351&r=1&w=2


   I went through most of this thread. I don't see an analysis of the 
Jenkins hash. Am I missing something?




Once your test shows there are no problems, try the one I propose, which
can even be created in userspace - you do not even want to take into
account what I am trying to say to you.


   I'm not trying to be obnoxious on purpose here, but I don't see the test 
that you are referring to. Could you be more specific?


   -n


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: RFC: Established connections hash function

2007-03-22 Thread David Miller

From: "Nikolaos D. Bougalis" <[EMAIL PROTECTED]>
Date: Thu, 22 Mar 2007 12:44:09 -0700

> People _have_ had problems. _I_ have had problems. And when
> someone with a few thousand drones under his control hoses your
> servers because he can do math and he leaves you with 2-item
> long chains, _you_ will have problems.

No need to further argue this point, the people that matter
(ie. me :-) understand it, don't worry..

Re: [PATCH 0/3] [PATCHSET] netlink error management

2007-03-22 Thread David Miller

From: Thomas Graf <[EMAIL PROTECTED]>
Date: Thu, 22 Mar 2007 20:57:54 +0100

> * David Miller <[EMAIL PROTECTED]> 2007-03-22 12:23
> > From: Thomas Graf <[EMAIL PROTECTED]>
> > Date: Thu, 22 Mar 2007 00:18:01 +0100
> > 
> > > This series of patches simplifies the error management and
> > > signalization of dump starts of netlink_run_queue() message
> > > handlers. It touches a fair bit of nfnetlink code as the
> > > error pointer has been passed on to subsystems.
> > 
> > Thomas can you respin these patches?  They no longer apply
> > now that I put your other rtnl_link bits into net-2.6.22
> 
> I guess I've posted too many patch series. The patches should
> still apply, at least they do in my quilt tree, but they depend on
> others. The series I've posted, in the correct order, are:

It doesn't, that's why I asked you to respin.

Perhaps "patch" would allow them with some fuzz, but git
definitely doesn't want to apply it.

Re: [PATCH 4/5] netem: avoid excessive requeues

2007-03-22 Thread David Miller

From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Thu, 22 Mar 2007 21:40:43 +0100

> > --- net-2.6.22.orig/net/sched/sch_netem.c
> > +++ net-2.6.22/net/sched/sch_netem.c
> > @@ -272,6 +272,10 @@ static struct sk_buff *netem_dequeue(str
> > struct netem_sched_data *q = qdisc_priv(sch);
> > struct sk_buff *skb;
> >  
> > +   smp_mb();
> > +   if (sch->flags & TCQ_F_THROTTLED)
> > +   return NULL;
> > +
> 
> 
> Perhaps we should put this in qdisc_restart, other qdiscs have the
> same problem.

Agreed, patches welcome :)

Re: [PATCH]: Add security check before flushing SAD/SPD

2007-03-22 Thread Joy Latten

On Thu, 2007-03-22 at 12:01 -0700, David Miller wrote:
> From: Joy Latten <[EMAIL PROTECTED]>
> Date: Thu, 22 Mar 2007 12:35:39 -0600
> 
> > Within selinux we check for authorization before deleting entries from
> > SAD and SPD. 
> > 
> > We are not checking for authorization when flushing the SPD and
> > the SAD. It was perhaps missed in original patch.
> > 
> > This patch adds security check when flushing entries from SAD and SPD.
> > 
> > Please let me know if this patch is ok.
> > It was built against linux-2.6.21-rc4-git5. I have also tested it.
> > 
> > Signed-off-by: Joy Latten<[EMAIL PROTECTED]>
> 
> I don't understand this and it does not sit well with me.
> 
> If we are flushing the policy database, we are flushing it
> regardless of what the security layer might or might not say.
> 
> I would look at this patch differently if there were some
> security level key being checked for a match here, which is
> an input key to the flush, but that is not what is happening
> here as the object is being looked at by itself.

Yes, I understand what you are saying.
I was concerned about having to check each entry
to flush database.

I did this patch because we check for authorization
when deleting single specified entries from the SAD/SPD. It
seems like a hole to me that we check for this, but that same
user/process can delete the entire database with no checks.

Unfortunately, each policy entry or SA can have a different security
label. That is why I would have to check each entry's security label
before deleting, to see if the user/process has authorization to
delete an entry with that security label.

Including selinux list for suggestions.

Joy


Re: RFC: Established connections hash function

2007-03-22 Thread Eric Dumazet


David Miller wrote:

From: "Nikolaos D. Bougalis" <[EMAIL PROTECTED]>
Date: Thu, 22 Mar 2007 12:44:09 -0700


People _have_ had problems. _I_ have had problems. And when
someone with a few thousand drones under his control hoses your
servers because he can do math and he leaves you with 2-item
long chains, _you_ will have problems.


No need to further argue this point, the people that matter
(ie. me :-) understand it, don't worry..


Yes, I recall having one big server hit two years ago by an attack on tcp hash 
function. David sent me the patch to use jhash. It's performing well :)


Welcome to the club :)

= net/ipv4/tcp_ipv4.c 1.114 vs edited =
--- 1.114/net/ipv4/tcp_ipv4.c2005-03-26 15:04:35 -08:00
+++ edited/net/ipv4/tcp_ipv4.c2005-04-05 13:39:52 -07:00
@@ -103,14 +103,15 @@
  */
 int sysctl_local_port_range[2] = { 1024, 4999 };
 int tcp_port_rover = 1024 - 1;
+static u32 tcp_v4_hash_rand;

 static __inline__ int tcp_hashfn(__u32 laddr, __u16 lport,
  __u32 faddr, __u16 fport)
 {
-int h = (laddr ^ lport) ^ (faddr ^ fport);
-h ^= h >> 16;
-h ^= h >> 8;
-return h & (tcp_ehash_size - 1);
+return jhash_2words(laddr ^ faddr,
+(lport << 16) | fport,
+tcp_v4_hash_rand) &
+(tcp_ehash_size - 1);
 }

 static __inline__ int tcp_sk_hashfn(struct sock *sk)
@@ -2626,6 +2627,9 @@
 panic("Failed to create the TCP control socket.\n");
 tcp_socket->sk->sk_allocation   = GFP_ATOMIC;
 inet_sk(tcp_socket->sk)->uc_ttl = -1;
+
+get_random_bytes(&tcp_v4_hash_rand, 4);
+tcp_v4_hash_rand ^= jiffies;

 /* Unhash it so that IP input processing does not even
  * see it, we do not wish this socket to see incoming
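The weakness of the XOR-fold function removed by the patch above can be
reproduced in userspace. What follows is only a sketch under stated
assumptions: old_tcp_hashfn() copies the "-" lines of the patch but takes the
table size as a parameter and uses an unsigned accumulator (the original used
"int h", so results can differ when the top bit is set), and
colliding_ports() is a hypothetical helper that is not part of any kernel
source. It shows that a sender at any address can pick a source port that
lands in a bucket of its choosing, with no secret involved.

```c
#include <assert.h>
#include <stdint.h>

/* XOR-fold bucket function from the "-" lines of the patch.
 * Assumption: ehash_size is passed in (the kernel read a global) and
 * must be a power of two. */
static unsigned int old_tcp_hashfn(uint32_t laddr, uint16_t lport,
                                   uint32_t faddr, uint16_t fport,
                                   unsigned int ehash_size)
{
    uint32_t h = (laddr ^ lport) ^ (faddr ^ fport);

    assert(ehash_size != 0 && (ehash_size & (ehash_size - 1)) == 0);
    h ^= h >> 16;
    h ^= h >> 8;
    return h & (ehash_size - 1);
}

/* Hypothetical helper: try all 65536 source ports for one remote address
 * and count those that fall into bucket `target`. The first match is
 * stored in *port. */
static unsigned int colliding_ports(uint32_t laddr, uint16_t lport,
                                    uint32_t faddr, unsigned int target,
                                    unsigned int ehash_size, uint16_t *port)
{
    unsigned int found = 0;
    uint32_t p;

    for (p = 0; p <= 0xffff; p++) {
        if (old_tcp_hashfn(laddr, lport, faddr, (uint16_t)p,
                           ehash_size) == target) {
            if (found == 0)
                *port = (uint16_t)p;
            found++;
        }
    }
    return found;
}
```

Because the fold is linear in the port bits, every remote address has at
least one port per bucket for tables of up to 65536 entries. Mixing in a
random value, as the "+" lines do with tcp_v4_hash_rand, is what removes
this predictability.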



[PATCH 0/8] [RESEND] netlink error management + others

2007-03-22 Thread Thomas Graf

Dave,
I've rediffed all patches and combined the two patchsets into
one.  Patrick has already queued the NLM_F_REQUEST change for
nfnetlink.

[PATCH 0/5] [PATCHSET] Netlink Patches
Converts westwood and vegas netlink code to use the typesafe
interface, removes an unused variable, and moves some netlink
queue management code into the generic layer.

[PATCH 0/3] [PATCHSET] netlink error management
This series of patches simplifies the error management and
signalization of dump starts of netlink_run_queue() message
handlers. It touches a fair bit of nfnetlink code as the
error pointer has been passed on to subsystems.


[PATCH 1/8] [TCP] vegas: Use type safe netlink interface

2007-03-22 Thread Thomas Graf

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/net/ipv4/tcp_vegas.c
===
--- net-2.6.22.orig/net/ipv4/tcp_vegas.c 2007-03-22 23:08:48.0 +0100
+++ net-2.6.22/net/ipv4/tcp_vegas.c 2007-03-22 23:09:21.0 +0100
@@ -341,16 +341,14 @@ static void tcp_vegas_get_info(struct so
 {
const struct vegas *ca = inet_csk_ca(sk);
if (ext & (1 << (INET_DIAG_VEGASINFO - 1))) {
-   struct tcpvegas_info *info;
+   struct tcpvegas_info info = {
+   .tcpv_enabled = ca->doing_vegas_now,
+   .tcpv_rttcnt = ca->cntRTT,
+   .tcpv_rtt = ca->baseRTT,
+   .tcpv_minrtt = ca->minRTT,
+   };
 
-   info = RTA_DATA(__RTA_PUT(skb, INET_DIAG_VEGASINFO,
- sizeof(*info)));
-
-   info->tcpv_enabled = ca->doing_vegas_now;
-   info->tcpv_rttcnt = ca->cntRTT;
-   info->tcpv_rtt = ca->baseRTT;
-   info->tcpv_minrtt = ca->minRTT;
-   rtattr_failure: ;
+   nla_put(skb, INET_DIAG_VEGASINFO, sizeof(info), &info);
}
 }
 

--


[PATCH 2/8] [TCP] westwood: Use type safe netlink interface

2007-03-22 Thread Thomas Graf

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/net/ipv4/tcp_westwood.c
===
--- net-2.6.22.orig/net/ipv4/tcp_westwood.c 2007-03-22 23:08:48.0 +0100
+++ net-2.6.22/net/ipv4/tcp_westwood.c  2007-03-22 23:09:23.0 +0100
@@ -260,16 +260,13 @@ static void tcp_westwood_info(struct soc
 {
const struct westwood *ca = inet_csk_ca(sk);
if (ext & (1 << (INET_DIAG_VEGASINFO - 1))) {
-   struct rtattr *rta;
-   struct tcpvegas_info *info;
+   struct tcpvegas_info info = {
+   .tcpv_enabled = 1,
+   .tcpv_rtt = jiffies_to_usecs(ca->rtt),
+   .tcpv_minrtt = jiffies_to_usecs(ca->rtt_min),
+   };
 
-   rta = __RTA_PUT(skb, INET_DIAG_VEGASINFO, sizeof(*info));
-   info = RTA_DATA(rta);
-   info->tcpv_enabled = 1;
-   info->tcpv_rttcnt = 0;
-   info->tcpv_rtt = jiffies_to_usecs(ca->rtt);
-   info->tcpv_minrtt = jiffies_to_usecs(ca->rtt_min);
-   rtattr_failure: ;
+   nla_put(skb, INET_DIAG_VEGASINFO, sizeof(info), &info);
}
 }
 

--
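Both conversions above drop the field-by-field assignments, and the westwood
one also drops the explicit "tcpv_rttcnt = 0". That is safe because C
initializes every member not named in a designated initializer as if it had
static storage, i.e. to zero. A minimal userspace sketch, assuming a stand-in
struct (vegas_info_sketch and westwood_info_sketch() are hypothetical names;
the real struct tcpvegas_info lives in the kernel headers):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in carrying the four fields the patches touch. */
struct vegas_info_sketch {
    uint32_t tcpv_enabled;
    uint32_t tcpv_rttcnt;
    uint32_t tcpv_rtt;
    uint32_t tcpv_minrtt;
};

/* Mirrors the shape of the new westwood code: tcpv_rttcnt is not
 * mentioned, so the language zero-initializes it. */
static struct vegas_info_sketch westwood_info_sketch(uint32_t rtt_us,
                                                     uint32_t minrtt_us)
{
    struct vegas_info_sketch info = {
        .tcpv_enabled = 1,
        .tcpv_rtt = rtt_us,
        .tcpv_minrtt = minrtt_us,
    };

    return info;
}
```

The fully initialized object can then be copied into the message in one
call, which is what nla_put() does in the patches.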


[PATCH 3/8] [NETLINK]: Remove unused groups variable

2007-03-22 Thread Thomas Graf

Leftover from dynamic multicast groups allocation work.

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/net/netlink/af_netlink.c
===
--- net-2.6.22.orig/net/netlink/af_netlink.c 2007-03-22 23:08:48.0 +0100
+++ net-2.6.22/net/netlink/af_netlink.c 2007-03-22 23:09:24.0 +0100
@@ -396,7 +396,6 @@ static int netlink_create(struct socket 
 {
struct module *module = NULL;
struct netlink_sock *nlk;
-   unsigned int groups;
int err = 0;
 
sock->state = SS_UNCONNECTED;
@@ -418,7 +417,6 @@ static int netlink_create(struct socket 
if (nl_table[protocol].registered &&
try_module_get(nl_table[protocol].module))
module = nl_table[protocol].module;
-   groups = nl_table[protocol].groups;
netlink_unlock_table();
 
if ((err = __netlink_create(sock, protocol)) < 0)

--


[PATCH 4/8] [NETLINK]: Ignore !NLM_F_REQUEST messages directly in netlink_run_queue()

2007-03-22 Thread Thomas Graf

netlink_rcv_skb() is changed to skip messages which don't have the
NLM_F_REQUEST bit to avoid every netlink family having to perform this
check on their own.

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/net/netlink/af_netlink.c
===
--- net-2.6.22.orig/net/netlink/af_netlink.c 2007-03-22 23:09:24.0 +0100
+++ net-2.6.22/net/netlink/af_netlink.c 2007-03-22 23:15:33.0 +0100
@@ -1470,10 +1470,15 @@ static int netlink_rcv_skb(struct sk_buf
 
while (skb->len >= nlmsg_total_size(0)) {
nlh = nlmsg_hdr(skb);
+   err = 0;
 
if (nlh->nlmsg_len < NLMSG_HDRLEN || skb->len < nlh->nlmsg_len)
return 0;
 
+   /* Only requests are handled by the kernel */
+   if (!(nlh->nlmsg_flags & NLM_F_REQUEST))
+   goto skip;
+
if (cb(skb, nlh, &err) < 0) {
/* Not an error, but we have to interrupt processing
 * here. Note: that in this case we do not pull
@@ -1481,9 +1486,10 @@ static int netlink_rcv_skb(struct sk_buf
 */
if (err == 0)
return -1;
+   }
+skip:
+   if (nlh->nlmsg_flags & NLM_F_ACK || err)
netlink_ack(skb, nlh, err);
-   } else if (nlh->nlmsg_flags & NLM_F_ACK)
-   netlink_ack(skb, nlh, 0);
 
netlink_queue_skip(nlh, skb);
}
Index: net-2.6.22/net/xfrm/xfrm_user.c
===
--- net-2.6.22.orig/net/xfrm/xfrm_user.c 2007-03-22 23:08:48.0 +0100
+++ net-2.6.22/net/xfrm/xfrm_user.c 2007-03-22 23:15:33.0 +0100
@@ -1859,9 +1859,6 @@ static int xfrm_user_rcv_msg(struct sk_b
struct xfrm_link *link;
int type, min_len;
 
-   if (!(nlh->nlmsg_flags & NLM_F_REQUEST))
-   return 0;
-
type = nlh->nlmsg_type;
 
/* A control message: ignore them */
Index: net-2.6.22/net/netlink/genetlink.c
===
--- net-2.6.22.orig/net/netlink/genetlink.c 2007-03-22 23:08:48.0 +0100
+++ net-2.6.22/net/netlink/genetlink.c  2007-03-22 23:15:33.0 +0100
@@ -304,9 +304,6 @@ static int genl_rcv_msg(struct sk_buff *
struct genlmsghdr *hdr = nlmsg_data(nlh);
int hdrlen, err = -EINVAL;
 
-   if (!(nlh->nlmsg_flags & NLM_F_REQUEST))
-   goto ignore;
-
if (nlh->nlmsg_type < NLMSG_MIN_TYPE)
goto ignore;
 
Index: net-2.6.22/net/core/rtnetlink.c
===
--- net-2.6.22.orig/net/core/rtnetlink.c 2007-03-22 23:08:48.0 +0100
+++ net-2.6.22/net/core/rtnetlink.c 2007-03-22 23:15:33.0 +0100
@@ -861,10 +861,6 @@ rtnetlink_rcv_msg(struct sk_buff *skb, s
int type;
int err;
 
-   /* Only requests are handled by kernel now */
-   if (!(nlh->nlmsg_flags&NLM_F_REQUEST))
-   return 0;
-
type = nlh->nlmsg_type;
 
/* A control message: ignore them */

--


[PATCH 5/8] [NETLINK]: Ignore control messages directly in netlink_run_queue()

2007-03-22 Thread Thomas Graf

Changes netlink_rcv_skb() to skip netlink control messages instead of
passing them on to the message handler.

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/net/netlink/af_netlink.c
===
--- net-2.6.22.orig/net/netlink/af_netlink.c 2007-03-22 23:15:33.0 +0100
+++ net-2.6.22/net/netlink/af_netlink.c 2007-03-22 23:15:58.0 +0100
@@ -1479,6 +1479,10 @@ static int netlink_rcv_skb(struct sk_buf
if (!(nlh->nlmsg_flags & NLM_F_REQUEST))
goto skip;
 
+   /* Skip control messages */
+   if (nlh->nlmsg_type < NLMSG_MIN_TYPE)
+   goto skip;
+
if (cb(skb, nlh, &err) < 0) {
/* Not an error, but we have to interrupt processing
 * here. Note: that in this case we do not pull
Index: net-2.6.22/net/core/rtnetlink.c
===
--- net-2.6.22.orig/net/core/rtnetlink.c 2007-03-22 23:15:33.0 +0100
+++ net-2.6.22/net/core/rtnetlink.c 2007-03-22 23:15:58.0 +0100
@@ -863,10 +863,6 @@ rtnetlink_rcv_msg(struct sk_buff *skb, s
 
type = nlh->nlmsg_type;
 
-   /* A control message: ignore them */
-   if (type < RTM_BASE)
-   return 0;
-
/* Unknown message: reply with EINVAL */
if (type > RTM_MAX)
goto err_inval;
Index: net-2.6.22/net/netlink/genetlink.c
===
--- net-2.6.22.orig/net/netlink/genetlink.c 2007-03-22 23:15:33.0 +0100
+++ net-2.6.22/net/netlink/genetlink.c  2007-03-22 23:15:58.0 +0100
@@ -304,9 +304,6 @@ static int genl_rcv_msg(struct sk_buff *
struct genlmsghdr *hdr = nlmsg_data(nlh);
int hdrlen, err = -EINVAL;
 
-   if (nlh->nlmsg_type < NLMSG_MIN_TYPE)
-   goto ignore;
-
family = genl_family_find_byid(nlh->nlmsg_type);
if (family == NULL) {
err = -ENOENT;
@@ -364,9 +361,6 @@ static int genl_rcv_msg(struct sk_buff *
*errp = err = ops->doit(skb, &info);
return err;
 
-ignore:
-   return 0;
-
 errout:
*errp = err;
return -1;
Index: net-2.6.22/net/xfrm/xfrm_user.c
===
--- net-2.6.22.orig/net/xfrm/xfrm_user.c 2007-03-22 23:15:33.0 +0100
+++ net-2.6.22/net/xfrm/xfrm_user.c 2007-03-22 23:15:58.0 +0100
@@ -1861,10 +1861,6 @@ static int xfrm_user_rcv_msg(struct sk_b
 
type = nlh->nlmsg_type;
 
-   /* A control message: ignore them */
-   if (type < XFRM_MSG_BASE)
-   return 0;
-
/* Unknown message: reply with EINVAL */
if (type > XFRM_MSG_MAX)
goto err_einval;

--
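Taken together, patches 4/8 and 5/8 move two filters out of the per-family
handlers and into the common receive loop. The decision they implement for
one message can be sketched in userspace as below. Everything prefixed sk_
or SK_ is a hypothetical stand-in: the constants are assumed to mirror
NLM_F_REQUEST, NLM_F_ACK and NLMSG_MIN_TYPE from <linux/netlink.h>, and the
case where the handler interrupts queue processing is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed to mirror NLM_F_REQUEST, NLM_F_ACK and NLMSG_MIN_TYPE. */
enum { SK_F_REQUEST = 0x01, SK_F_ACK = 0x04, SK_MIN_TYPE = 0x10 };

struct sk_msg {
    uint16_t type;
    uint16_t flags;
};

struct sk_result {
    int handler_called;
    int ack_sent;
    int ack_err;
};

typedef int (*sk_handler)(const struct sk_msg *msg);

/* Sample handlers: one succeeds, one rejects the message. */
static int sk_ok(const struct sk_msg *msg)   { (void)msg; return 0; }
static int sk_fail(const struct sk_msg *msg) { (void)msg; return -EINVAL; }

/* Only requests with a non-control type reach the handler. An ack is
 * sent when one was requested or when the handler reported an error. */
static struct sk_result sk_process_one(const struct sk_msg *msg,
                                       sk_handler cb)
{
    struct sk_result res = { 0, 0, 0 };
    int err = 0;

    if ((msg->flags & SK_F_REQUEST) && msg->type >= SK_MIN_TYPE) {
        res.handler_called = 1;
        err = cb(msg);
    }
    if ((msg->flags & SK_F_ACK) || err) {
        res.ack_sent = 1;
        res.ack_err = err;
    }
    return res;
}
```

As in the patched loop, a skipped message that carries the ack flag is
still acknowledged, with error code 0.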


[PATCH 6/8] [NETLINK]: Remove error pointer from netlink message handler

2007-03-22 Thread Thomas Graf

The error pointer argument in netlink message handlers is used
to signal the special case where processing has to be interrupted
because a dump was started but no error happened. Instead it is
simpler and more clear to return -EINTR and have netlink_run_queue()
deal with getting the queue right.

nfnetlink passed on this error pointer to its subsystem handlers
but only uses it to signal the start of a netlink dump. Therefore
it can be removed there as well.

This patch also cleans up the error handling in the affected
message handlers to be consistent since it had to be touched anyway.
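
The return-value convention described above can be sketched in userspace as
follows. This is a rough illustration, not the patch itself: the sk_ names
are hypothetical, and the behaviour of the queue side (stop without sending
an ack when a handler returns -EINTR) is an assumption, since that part of
the diff is not shown in this message.

```c
#include <assert.h>
#include <errno.h>

enum sk_action {
    SK_CONTINUE,            /* go on with the next message */
    SK_CONTINUE_WITH_ACK,   /* ack (possibly with an error), then go on */
    SK_STOP_NO_ACK          /* a dump was started: stop processing */
};

/* Handler side, as in the converted handlers: a successfully started
 * dump is reported as -EINTR, any other error is passed through. */
static int sk_dump_result(int start_err)
{
    return start_err == 0 ? -EINTR : start_err;
}

/* Queue side (assumed): map a handler return value to the next step. */
static enum sk_action sk_after_handler(int err, int ack_requested)
{
    if (err == -EINTR)
        return SK_STOP_NO_ACK;
    if (err || ack_requested)
        return SK_CONTINUE_WITH_ACK;
    return SK_CONTINUE;
}
```

With a single int return value there is no longer a combination such as
"returned -1 but *errp is 0" that callers have to interpret.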

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/net/core/rtnetlink.c
===
--- net-2.6.22.orig/net/core/rtnetlink.c 2007-03-22 23:15:58.0 +0100
+++ net-2.6.22/net/core/rtnetlink.c 2007-03-22 23:15:59.0 +0100
@@ -851,8 +851,7 @@ static int rtattr_max;
 
 /* Process one rtnetlink message. */
 
-static __inline__ int
-rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, int *errp)
+static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 {
rtnl_doit_func doit;
int sz_idx, kind;
@@ -862,10 +861,8 @@ rtnetlink_rcv_msg(struct sk_buff *skb, s
int err;
 
type = nlh->nlmsg_type;
-
-   /* Unknown message: reply with EINVAL */
if (type > RTM_MAX)
-   goto err_inval;
+   return -EINVAL;
 
type -= RTM_BASE;
 
@@ -874,40 +871,33 @@ rtnetlink_rcv_msg(struct sk_buff *skb, s
return 0;
 
family = ((struct rtgenmsg*)NLMSG_DATA(nlh))->rtgen_family;
-   if (family >= NPROTO) {
-   *errp = -EAFNOSUPPORT;
-   return -1;
-   }
+   if (family >= NPROTO)
+   return -EAFNOSUPPORT;
 
sz_idx = type>>2;
kind = type&3;
 
-   if (kind != 2 && security_netlink_recv(skb, CAP_NET_ADMIN)) {
-   *errp = -EPERM;
-   return -1;
-   }
+   if (kind != 2 && security_netlink_recv(skb, CAP_NET_ADMIN))
+   return -EPERM;
 
if (kind == 2 && nlh->nlmsg_flags&NLM_F_DUMP) {
rtnl_dumpit_func dumpit;
 
dumpit = rtnl_get_dumpit(family, type);
if (dumpit == NULL)
-   goto err_inval;
+   return -EINVAL;
 
-   if ((*errp = netlink_dump_start(rtnl, skb, nlh,
-   dumpit, NULL)) != 0) {
-   return -1;
-   }
-
-   netlink_queue_skip(nlh, skb);
-   return -1;
+   err = netlink_dump_start(rtnl, skb, nlh, dumpit, NULL);
+   if (err == 0)
+   err = -EINTR;
+   return err;
}
 
memset(rta_buf, 0, (rtattr_max * sizeof(struct rtattr *)));
 
min_len = rtm_min[sz_idx];
if (nlh->nlmsg_len < min_len)
-   goto err_inval;
+   return -EINVAL;
 
if (nlh->nlmsg_len > min_len) {
int attrlen = nlh->nlmsg_len - NLMSG_ALIGN(min_len);
@@ -917,7 +907,7 @@ rtnetlink_rcv_msg(struct sk_buff *skb, s
unsigned flavor = attr->rta_type;
if (flavor) {
if (flavor > rta_max[sz_idx])
-   goto err_inval;
+   return -EINVAL;
rta_buf[flavor-1] = attr;
}
attr = RTA_NEXT(attr, attrlen);
@@ -926,15 +916,9 @@ rtnetlink_rcv_msg(struct sk_buff *skb, s
 
doit = rtnl_get_doit(family, type);
if (doit == NULL)
-   goto err_inval;
-   err = doit(skb, nlh, (void *)&rta_buf[0]);
-
-   *errp = err;
-   return err;
+   return -EINVAL;
 
-err_inval:
-   *errp = -EINVAL;
-   return -1;
+   return doit(skb, nlh, (void *)&rta_buf[0]);
 }
 
 static void rtnetlink_rcv(struct sock *sk, int len)
Index: net-2.6.22/net/netlink/genetlink.c
===
--- net-2.6.22.orig/net/netlink/genetlink.c 2007-03-22 23:15:58.0 +0100
+++ net-2.6.22/net/netlink/genetlink.c  2007-03-22 23:15:59.0 +0100
@@ -295,60 +295,49 @@ int genl_unregister_family(struct genl_f
return -ENOENT;
 }
 
-static int genl_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh,
-  int *errp)
+static int genl_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 {
struct genl_ops *ops;
struct genl_family *family;
struct genl_info info;
struct genlmsghdr *hdr = nlmsg_data(nlh);
-   int hdrlen, err = -EINVAL;
+   int hdrlen, err;
 
family = genl_family_find_byid(nlh->nlmsg_type);
-   if (family == NULL) {
-   err = -ENOENT;
-   goto errout

[PATCH 7/8] [IPv4] diag: Use netlink_run_queue() to process the receive queue

2007-03-22 Thread Thomas Graf

Makes use of netlink_run_queue() to process the receive queue and
converts inet_diag_rcv_msg() to use the type safe netlink interface.

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/net/ipv4/inet_diag.c
===
--- net-2.6.22.orig/net/ipv4/inet_diag.c 2007-03-22 23:15:31.0 +0100
+++ net-2.6.22/net/ipv4/inet_diag.c 2007-03-22 23:16:05.0 +0100
@@ -806,68 +806,48 @@ done:
return skb->len;
 }
 
-static inline int inet_diag_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
+static int inet_diag_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 {
-   if (!(nlh->nlmsg_flags&NLM_F_REQUEST))
-   return 0;
+   int hdrlen = sizeof(struct inet_diag_req);
 
-   if (nlh->nlmsg_type >= INET_DIAG_GETSOCK_MAX)
-   goto err_inval;
+   if (nlh->nlmsg_type >= INET_DIAG_GETSOCK_MAX ||
+   nlmsg_len(nlh) < hdrlen)
+   return -EINVAL;
 
if (inet_diag_table[nlh->nlmsg_type] == NULL)
return -ENOENT;
 
-   if (NLMSG_LENGTH(sizeof(struct inet_diag_req)) > skb->len)
-   goto err_inval;
-
-   if (nlh->nlmsg_flags&NLM_F_DUMP) {
-   if (nlh->nlmsg_len >
-   (4 + NLMSG_SPACE(sizeof(struct inet_diag_req)))) {
-   struct rtattr *rta = (void *)(NLMSG_DATA(nlh) +
-sizeof(struct inet_diag_req));
-   if (rta->rta_type != INET_DIAG_REQ_BYTECODE ||
-   rta->rta_len < 8 ||
-   rta->rta_len >
-   (nlh->nlmsg_len -
-NLMSG_SPACE(sizeof(struct inet_diag_req))))
-   goto err_inval;
-   if (inet_diag_bc_audit(RTA_DATA(rta), RTA_PAYLOAD(rta)))
-   goto err_inval;
-   }
-   return netlink_dump_start(idiagnl, skb, nlh,
- inet_diag_dump, NULL);
-   } else
-   return inet_diag_get_exact(skb, nlh);
-
-err_inval:
-   return -EINVAL;
-}
+   if (nlh->nlmsg_flags & NLM_F_DUMP) {
+   int err;
 
+   if (nlmsg_attrlen(nlh, hdrlen)) {
+   struct nlattr *attr;
 
-static inline void inet_diag_rcv_skb(struct sk_buff *skb)
-{
-   if (skb->len >= NLMSG_SPACE(0)) {
-   int err;
-   struct nlmsghdr *nlh = nlmsg_hdr(skb);
+   attr = nlmsg_find_attr(nlh, hdrlen,
+  INET_DIAG_REQ_BYTECODE);
+   if (attr == NULL ||
+   nla_len(attr) < sizeof(struct inet_diag_bc_op) ||
+   inet_diag_bc_audit(nla_data(attr), nla_len(attr)))
+   return -EINVAL;
+   }
 
-   if (nlh->nlmsg_len < sizeof(*nlh) ||
-   skb->len < nlh->nlmsg_len)
-   return;
-   err = inet_diag_rcv_msg(skb, nlh);
-   if (err || nlh->nlmsg_flags & NLM_F_ACK)
-   netlink_ack(skb, nlh, err);
+   err = netlink_dump_start(idiagnl, skb, nlh,
+inet_diag_dump, NULL);
+   if (err == 0)
+   err = -EINTR;
+   return err;
}
+
+   return inet_diag_get_exact(skb, nlh);
 }
 
 static void inet_diag_rcv(struct sock *sk, int len)
 {
-   struct sk_buff *skb;
-   unsigned int qlen = skb_queue_len(&sk->sk_receive_queue);
+   unsigned int qlen = 0;
 
-   while (qlen-- && (skb = skb_dequeue(&sk->sk_receive_queue))) {
-   inet_diag_rcv_skb(skb);
-   kfree_skb(skb);
-   }
+   do {
+   netlink_run_queue(sk, &qlen, &inet_diag_rcv_msg);
+   } while (qlen);
 }
 
 static DEFINE_SPINLOCK(inet_diag_register_lock);

--


[PATCH 8/8] [NETLINK]: Directly return -EINTR from netlink_dump_start()

2007-03-22 Thread Thomas Graf

Now that all users of netlink_dump_start() use netlink_run_queue()
to process the receive queue, it is possible to return -EINTR from
netlink_dump_start() directly, therefore simplifying the callers.

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/net/core/rtnetlink.c
===
--- net-2.6.22.orig/net/core/rtnetlink.c 2007-03-22 23:15:59.0 +0100
+++ net-2.6.22/net/core/rtnetlink.c 2007-03-22 23:16:07.0 +0100
@@ -858,7 +858,6 @@ static int rtnetlink_rcv_msg(struct sk_b
int min_len;
int family;
int type;
-   int err;
 
type = nlh->nlmsg_type;
if (type > RTM_MAX)
@@ -887,10 +886,7 @@ static int rtnetlink_rcv_msg(struct sk_b
if (dumpit == NULL)
return -EINVAL;
 
-   err = netlink_dump_start(rtnl, skb, nlh, dumpit, NULL);
-   if (err == 0)
-   err = -EINTR;
-   return err;
+   return netlink_dump_start(rtnl, skb, nlh, dumpit, NULL);
}
 
memset(rta_buf, 0, (rtattr_max * sizeof(struct rtattr *)));
Index: net-2.6.22/net/ipv4/inet_diag.c
===
--- net-2.6.22.orig/net/ipv4/inet_diag.c 2007-03-22 23:16:05.0 +0100
+++ net-2.6.22/net/ipv4/inet_diag.c 2007-03-22 23:16:07.0 +0100
@@ -818,8 +818,6 @@ static int inet_diag_rcv_msg(struct sk_b
return -ENOENT;
 
if (nlh->nlmsg_flags & NLM_F_DUMP) {
-   int err;
-
if (nlmsg_attrlen(nlh, hdrlen)) {
struct nlattr *attr;
 
@@ -831,11 +829,8 @@ static int inet_diag_rcv_msg(struct sk_b
return -EINVAL;
}
 
-   err = netlink_dump_start(idiagnl, skb, nlh,
-inet_diag_dump, NULL);
-   if (err == 0)
-   err = -EINTR;
-   return err;
+   return netlink_dump_start(idiagnl, skb, nlh,
+ inet_diag_dump, NULL);
}
 
return inet_diag_get_exact(skb, nlh);
Index: net-2.6.22/net/netfilter/nf_conntrack_netlink.c
===
--- net-2.6.22.orig/net/netfilter/nf_conntrack_netlink.c 2007-03-22 23:15:59.0 +0100
+++ net-2.6.22/net/netfilter/nf_conntrack_netlink.c 2007-03-22 23:16:07.0 +0100
@@ -724,11 +724,8 @@ ctnetlink_get_conntrack(struct sock *ctn
if (NFNL_MSG_TYPE(nlh->nlmsg_type) == IPCTNL_MSG_CT_GET_CTRZERO)
return -ENOTSUPP;
 #endif
-   err = netlink_dump_start(ctnl, skb, nlh, ctnetlink_dump_table,
-ctnetlink_done);
-   if (err == 0)
-   err = -EINTR;
-   return err;
+   return netlink_dump_start(ctnl, skb, nlh, ctnetlink_dump_table,
+ ctnetlink_done);
}
 
if (nfattr_bad_size(cda, CTA_MAX, cta_min))
@@ -1266,12 +1263,9 @@ ctnetlink_get_expect(struct sock *ctnl, 
return -EINVAL;
 
if (nlh->nlmsg_flags & NLM_F_DUMP) {
-   err = netlink_dump_start(ctnl, skb, nlh,
-ctnetlink_exp_dump_table,
-ctnetlink_done);
-   if (err == 0)
-   err = -EINTR;
-   return err;
+   return netlink_dump_start(ctnl, skb, nlh,
+ ctnetlink_exp_dump_table,
+ ctnetlink_done);
}
 
if (cda[CTA_EXPECT_MASTER-1])
Index: net-2.6.22/net/netlink/af_netlink.c
===
--- net-2.6.22.orig/net/netlink/af_netlink.c 2007-03-22 23:15:59.0 +0100
+++ net-2.6.22/net/netlink/af_netlink.c 2007-03-22 23:16:07.0 +0100
@@ -1426,7 +1426,12 @@ int netlink_dump_start(struct sock *ssk,
 
netlink_dump(sk);
sock_put(sk);
-   return 0;
+
+   /* We successfully started a dump, by returning -EINTR we
+* signal the queue mangement to interrupt processing of
+* any netlink messages so userspace gets a chance to read
+* the results. */
+   return -EINTR;
 }
 
 void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr *nlh, int err)
Index: net-2.6.22/net/netlink/genetlink.c
===
--- net-2.6.22.orig/net/netlink/genetlink.c 2007-03-22 23:15:59.0 +0100
+++ net-2.6.22/net/netlink/genetlink.c  2007-03-22 23:16:07.0 +0100
@@ -323,11 +323,8 @@ static int genl_rcv_msg(struct sk_buff *
if (ops->dumpit == NULL)
return -EOPNOTSUPP;
 
-

Recent net-2.6.22 patches break bootup!

2007-03-22 Thread Stephen Hemminger

On Thu, 22 Mar 2007 12:06:19 -0700 (PDT)
David Miller <[EMAIL PROTECTED]> wrote:

> From: Thomas Graf <[EMAIL PROTECTED]>
> Date: Thu, 22 Mar 2007 13:59:55 +0100
> 
> > The existing function names seem to have sentimental value to some
> > people. Same patches but without changes to the function names.
> > 
> > Introduces an interface to register rtnetlink message handlers
> > and converts all users of rtnl_links[].
> 
> All 12 patches applied, thanks Thomas.

Something is broken now.  If I boot the system (Fedora) it gets to:

Bringing up loopback interface:  RTNETLINK answers: Invalid argument
Dump terminated
RTNETLINK answers: Invalid argument


tg3 device eth0 does not seem to be present, delaying initialization


then it hangs because cups won't come up without loopback


Re: VIA Velocity VLAN vexation

2007-03-22 Thread Francois Romieu

[EMAIL PROTECTED] <[EMAIL PROTECTED]> :
[...]
> Is this likely to be a problem with the via-velocity driver?

Yes.

> Is anyone working on it ?

Not as much as I'd like to.

> Or should I just get a different gigabit card ?

This one probably got answered on 2005/11/29. :o)

I'll go to bed in a few minutes but I'll happily resurrect the
velocity vlan patches.

-- 
Ueimor

Re: Recent net-2.6.22 patches break bootup!

2007-03-22 Thread Thomas Graf

* Stephen Hemminger <[EMAIL PROTECTED]> 2007-03-22 14:27
> Something is broken now.  If I boot the system (Fedora) it gets to:
> 
> Bringing up loopback interface:  RTNETLINK answers: Invalid argument
> Dump terminated
> RTNETLINK answers: Invalid argument
> 
> 
> tg3 device eth0 does not seem to be present, delaying initialization
> 
> 
> then it hangs because cups won't come up without loopback

Thinko. It always returned the first message handler of a rtnl
family.
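
The thinko is easy to reproduce outside the kernel: for a pointer to the
first element of an array, tab->doit is the same expression as tab[0].doit,
so the message index is silently ignored. A minimal sketch, assuming
stand-in types (all sk_ names are hypothetical, and the PF_UNSPEC fallback
of the real lookup is left out):

```c
#include <assert.h>
#include <stddef.h>

typedef int (*sk_doit_func)(void);

struct sk_link {
    sk_doit_func doit;
};

static int sk_newlink(void) { return 1; }
static int sk_dellink(void) { return 2; }

/* One handler per message index, as in a per-family handler table. */
static struct sk_link sk_tab[2] = { { sk_newlink }, { sk_dellink } };

/* Buggy lookup: always yields the handler of index 0. */
static sk_doit_func sk_get_doit_buggy(struct sk_link *tab, int msgindex)
{
    (void)msgindex;
    return tab ? tab->doit : NULL;
}

/* Fixed lookup: indexes the table with msgindex. */
static sk_doit_func sk_get_doit_fixed(struct sk_link *tab, int msgindex)
{
    return tab ? tab[msgindex].doit : NULL;
}
```

Calling the buggy variant with index 1 runs sk_newlink() instead of
sk_dellink(), which is consistent with requests being answered by the wrong
handler and failing with "Invalid argument".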

[RTNL]: Properly return rtnl message handler

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.22/net/core/rtnetlink.c
===
--- net-2.6.22.orig/net/core/rtnetlink.c 2007-03-23 00:31:37.0 +0100
+++ net-2.6.22/net/core/rtnetlink.c 2007-03-23 00:32:52.0 +0100
@@ -122,10 +122,10 @@ static rtnl_doit_func rtnl_get_doit(int 
struct rtnl_link *tab;
 
tab = rtnl_msg_handlers[protocol];
-   if (tab == NULL || tab->doit == NULL)
+   if (tab == NULL || tab[msgindex].doit == NULL)
tab = rtnl_msg_handlers[PF_UNSPEC];
 
-   return tab ? tab->doit : NULL;
+   return tab ? tab[msgindex].doit : NULL;
 }
 
 static rtnl_dumpit_func rtnl_get_dumpit(int protocol, int msgindex)
@@ -133,10 +133,10 @@ static rtnl_dumpit_func rtnl_get_dumpit(
struct rtnl_link *tab;
 
tab = rtnl_msg_handlers[protocol];
-   if (tab == NULL || tab->dumpit == NULL)
+   if (tab == NULL || tab[msgindex].dumpit == NULL)
tab = rtnl_msg_handlers[PF_UNSPEC];
 
-   return tab ? tab->dumpit : NULL;
+   return tab ? tab[msgindex].dumpit : NULL;
 }
 
 /**

Re: [PATCH]: Add security check before flushing SAD/SPD

2007-03-22 Thread James Morris

On Thu, 22 Mar 2007, Joy Latten wrote:

> > I would look at this patch differently if there were some
> > security level key being checked for a match here, which is
> > an input key to the flush, but that is not what is happening
> > here as the object is being looked at by itself.
> 
> Yes, I understand what you are saying.
> I was concerned about having to check each entry
> to flush database.
> 
> I did this patch because we check for authorization
> when deleting single specified entries from the SAD/SPD. It
> seems like a hole to me that we check for this, but that same
> user/process can delete the entire database with no checks.

Indeed.  Removing an entry is modifying MAC policy, which requires 
appropriate authorization.

The security label is encapsulated with the object, which is why it's 
passed to the security layer.

Perhaps a better semantic would be to fail the entire flush operation if 
one of the security checks failed.  e.g. loop through for permissions 
first, then if all ok, loop through for deletion.
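
A minimal sketch of the two-pass idea described above, in plain userspace C. The `struct entry` type, the `may_delete()` hook and the integer labels are hypothetical stand-ins, not the kernel's xfrm or LSM interfaces; the point is only that no entry is removed until every permission check has passed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an SAD/SPD entry; not the kernel's xfrm types. */
struct entry {
	int label;	/* security label carried by the object */
	bool present;	/* false once the entry has been deleted */
};

/* Hypothetical permission hook: 0 if the subject may delete the entry. */
static int may_delete(int subject_level, const struct entry *e)
{
	return subject_level >= e->label ? 0 : -EPERM;
}

/* Pass 1 checks every entry; pass 2 deletes only if all checks passed. */
static int flush_all_or_nothing(struct entry *db, size_t n, int subject_level)
{
	size_t i;

	assert(db != NULL || n == 0);

	for (i = 0; i < n; i++) {
		int err;

		if (!db[i].present)
			continue;
		err = may_delete(subject_level, &db[i]);
		if (err)
			return err;	/* nothing has been deleted yet */
	}

	for (i = 0; i < n; i++)
		db[i].present = false;

	return 0;
}
```

In the kernel the two loops would have to run under the same lock, so that the set of entries cannot change between the check and the deletion; the sketch ignores locking entirely.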

- James
-- 
James Morris
<[EMAIL PROTECTED]>

Re: [PATCH]: Add security check before flushing SAD/SPD

2007-03-22 Thread Joy Latten

On Thu, 2007-03-22 at 19:49 -0400, James Morris wrote:
> On Thu, 22 Mar 2007, Joy Latten wrote:
> 
> > > I would look at this patch differently if there were some
> > > security level key being checked for a match here, which is
> > > an input key to the flush, but that is not what is happening
> > > here as the object is being looked at by itself.
> > 
> > Yes, I understand what you are saying.
> > I was concerned about having to check each entry
> > to flush database.
> > 
> > I did this patch because we check for authorization
> > when deleting single specified entries from the SAD/SPD. It
> > seems like a hole to me that we check for this, but that same
> > user/process can delete the entire database with no checks.
> 
> Indeed.  Removing an entry is modifying MAC policy, which requires 
> appropriate authorization.
> 
> The security label is encapsulated with the object, which is why it's 
> passed to the security layer.
> 
> Perhaps a better semantic would be to fail the entire flush operation if 
> one of the security checks failed.  e.g. loop through for permissions 
> first, then if all ok, loop through for deletion.
> 
Ok, will code this up and test it if there are no objections.

Joy

[PATCH 2.6] Add missing ioctls to 64<->32 conversion

2007-03-22 Thread Jean Tourrilhes

Hi John,

Johannes Berg and Michael Buesch noticed that the WPA ioctls
were missing from the 64<->32 bit conversion. This means that when
using a 32-bit userspace on a 64-bit kernel, those ioctls fail.
This patch was tested on 2.6.21-rc4. Would you mind pushing it
upstream?
Thanks...
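
For context, a rough sketch of why these ioctls need a conversion entry at all. Wireless ioctls that carry a `struct iw_point` pass a userspace pointer, and a 32-bit process lays that structure out differently from a 64-bit kernel. The two structures and the helper below are hypothetical mirrors written for illustration; they are not the kernel's `do_wireless_ioctl` code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout as a 32-bit process sees it: the pointer is 32 bits wide. */
struct compat_point {
	uint32_t pointer;
	uint16_t length;
	uint16_t flags;
};

/* Native layout: on a 64-bit kernel the pointer is 64 bits wide. */
struct native_point {
	void *pointer;
	uint16_t length;
	uint16_t flags;
};

/* Widen the 32-bit pointer and copy the remaining fields unchanged. */
static void compat_to_native(const struct compat_point *in,
			     struct native_point *out)
{
	assert(in != NULL && out != NULL);

	out->pointer = (void *)(uintptr_t)in->pointer;
	out->length = in->length;
	out->flags = in->flags;
}
```

Without such a translation step the kernel would read the 32-bit layout as if it were the native one and pick up the wrong pointer and length, which is consistent with the failures reported above.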

Jean

Signed-off-by: Jean Tourrilhes <[EMAIL PROTECTED]>

---

diff -u -p linux/fs/compat_ioctl.j1.c  linux/fs/compat_ioctl.c
--- linux/fs/compat_ioctl.j1.c  2007-03-06 17:49:33.0 -0800
+++ linux/fs/compat_ioctl.c 2007-03-06 17:56:19.0 -0800
@@ -2553,11 +2553,15 @@ HANDLE_IOCTL(I2C_RDWR, do_i2c_rdwr_ioctl
 HANDLE_IOCTL(I2C_SMBUS, do_i2c_smbus_ioctl)
 /* wireless */
 HANDLE_IOCTL(SIOCGIWRANGE, do_wireless_ioctl)
+HANDLE_IOCTL(SIOCGIWPRIV, do_wireless_ioctl)
+HANDLE_IOCTL(SIOCGIWSTATS, do_wireless_ioctl)
 HANDLE_IOCTL(SIOCSIWSPY, do_wireless_ioctl)
 HANDLE_IOCTL(SIOCGIWSPY, do_wireless_ioctl)
 HANDLE_IOCTL(SIOCSIWTHRSPY, do_wireless_ioctl)
 HANDLE_IOCTL(SIOCGIWTHRSPY, do_wireless_ioctl)
+HANDLE_IOCTL(SIOCSIWMLME, do_wireless_ioctl)
 HANDLE_IOCTL(SIOCGIWAPLIST, do_wireless_ioctl)
+HANDLE_IOCTL(SIOCSIWSCAN, do_wireless_ioctl)
 HANDLE_IOCTL(SIOCGIWSCAN, do_wireless_ioctl)
 HANDLE_IOCTL(SIOCSIWESSID, do_wireless_ioctl)
 HANDLE_IOCTL(SIOCGIWESSID, do_wireless_ioctl)
@@ -2565,6 +2569,11 @@ HANDLE_IOCTL(SIOCSIWNICKN, do_wireless_i
 HANDLE_IOCTL(SIOCGIWNICKN, do_wireless_ioctl)
 HANDLE_IOCTL(SIOCSIWENCODE, do_wireless_ioctl)
 HANDLE_IOCTL(SIOCGIWENCODE, do_wireless_ioctl)
+HANDLE_IOCTL(SIOCSIWGENIE, do_wireless_ioctl)
+HANDLE_IOCTL(SIOCGIWGENIE, do_wireless_ioctl)
+HANDLE_IOCTL(SIOCSIWENCODEEXT, do_wireless_ioctl)
+HANDLE_IOCTL(SIOCGIWENCODEEXT, do_wireless_ioctl)
+HANDLE_IOCTL(SIOCSIWPMKSA, do_wireless_ioctl)
 HANDLE_IOCTL(SIOCSIFBR, old_bridge_ioctl)
 HANDLE_IOCTL(SIOCGIFBR, old_bridge_ioctl)
 HANDLE_IOCTL(RTC_IRQP_READ32, rtc_ioctl)


[PATCH 2.6] WE-22 : prevent information leak on 64 bit

2007-03-22 Thread Jean Tourrilhes

Hi,

Johannes Berg discovered that kernel space was leaking to
userspace on 64-bit platforms. He made a first patch to fix that. This
is an improved version of his patch.
This was tested on 2.6.21-rc4. Would you mind pushing that
upstream?
Thanks...
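
A userspace sketch of the kind of leak being fixed, as I read the patch. `struct event` and the two length macros are hypothetical stand-ins for `struct iw_event`, `IW_EV_LCP_LEN` and `IW_EV_LCP_PK_LEN`. On a 64-bit build the compiler inserts padding between the two 16-bit header fields and the pointer-aligned union, and copying the header with the padded length would copy those uninitialised bytes into the stream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Payload union whose alignment is that of a pointer. */
union payload {
	void *ptr;
	uint32_t value;
};

/* Hypothetical stand-in for struct iw_event. */
struct event {
	uint16_t len;
	uint16_t cmd;
	union payload u;	/* on LP64 the compiler pads before this */
};

#define HDR_LEN		offsetof(struct event, u)	/* header plus padding */
#define HDR_PK_LEN	4				/* header fields only */

/*
 * Copy the header fields and a 32-bit payload into the stream without
 * touching the padding bytes. Returns the number of stream bytes used.
 */
static size_t pack_uint_event(unsigned char *stream, const struct event *ev)
{
	assert(stream != NULL && ev != NULL);

	memcpy(stream, ev, HDR_PK_LEN);
	memcpy(stream + HDR_LEN, (const unsigned char *)ev + HDR_LEN,
	       sizeof(uint32_t));
	return HDR_LEN + sizeof(uint32_t);
}
```

The payload keeps its padded offset in the stream, so the layout seen by userspace does not change; only the bytes between the header and the payload are no longer taken from the kernel-side structure.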

Jean

Signed-off-by: Jean Tourrilhes <[EMAIL PROTECTED]>

---

diff -u -p linux/include/linux/wireless.j1.h linux/include/linux/wireless.h
--- linux/include/linux/wireless.j1.h   2007-03-08 10:34:32.0 -0800
+++ linux/include/linux/wireless.h  2007-03-21 11:01:14.0 -0700
@@ -1,10 +1,10 @@
 /*
  * This file define a set of standard wireless extensions
  *
- * Version :   21  14.3.06
+ * Version :   22  16.3.07
  *
  * Authors :   Jean Tourrilhes - HPL - <[EMAIL PROTECTED]>
- * Copyright (c) 1997-2006 Jean Tourrilhes, All Rights Reserved.
+ * Copyright (c) 1997-2007 Jean Tourrilhes, All Rights Reserved.
  */
 
 #ifndef _LINUX_WIRELESS_H
@@ -85,7 +85,7 @@
  * (there is some stuff that will be added in the future...)
  * I just plan to increment with each new version.
  */
-#define WIRELESS_EXT   21
+#define WIRELESS_EXT   22
 
 /*
  * Changes :
@@ -221,6 +221,10 @@
  * - Add IW_RETRY_SHORT/IW_RETRY_LONG retry modifiers
  * - Power/Retry relative values no longer * 10
  * - Add explicit flag to tell stats are in 802.11k RCPI : IW_QUAL_RCPI
+ *
+ * V21 to V22
+ * --
+ * - Prevent leaking of kernel space in stream on 64 bits.
  */
 
 / CONSTANTS /
@@ -1085,4 +1089,15 @@ struct iw_event
 #define IW_EV_POINT_LEN(IW_EV_LCP_LEN + sizeof(struct iw_point) - \
 IW_EV_POINT_OFF)
 
+/* Size of the Event prefix when packed in stream */
+#define IW_EV_LCP_PK_LEN   (4)
+/* Size of the various events when packed in stream */
+#define IW_EV_CHAR_PK_LEN  (IW_EV_LCP_PK_LEN + IFNAMSIZ)
+#define IW_EV_UINT_PK_LEN  (IW_EV_LCP_PK_LEN + sizeof(__u32))
+#define IW_EV_FREQ_PK_LEN  (IW_EV_LCP_PK_LEN + sizeof(struct iw_freq))
+#define IW_EV_PARAM_PK_LEN (IW_EV_LCP_PK_LEN + sizeof(struct iw_param))
+#define IW_EV_ADDR_PK_LEN  (IW_EV_LCP_PK_LEN + sizeof(struct sockaddr))
+#define IW_EV_QUAL_PK_LEN  (IW_EV_LCP_PK_LEN + sizeof(struct iw_quality))
+#define IW_EV_POINT_PK_LEN (IW_EV_LCP_LEN + 4)
+
 #endif /* _LINUX_WIRELESS_H */
diff -u -p linux/include/net/iw_handler.j1.h linux/include/net/iw_handler.h
--- linux/include/net/iw_handler.j1.h   2007-03-16 17:36:22.0 -0700
+++ linux/include/net/iw_handler.h  2007-03-21 11:01:09.0 -0700
@@ -1,10 +1,10 @@
 /*
  * This file define the new driver API for Wireless Extensions
  *
- * Version :   7   18.3.05
+ * Version :   8   16.3.07
  *
  * Authors :   Jean Tourrilhes - HPL - <[EMAIL PROTECTED]>
- * Copyright (c) 2001-2006 Jean Tourrilhes, All Rights Reserved.
+ * Copyright (c) 2001-2007 Jean Tourrilhes, All Rights Reserved.
  */
 
 #ifndef _IW_HANDLER_H
@@ -207,7 +207,7 @@
  * will be needed...
  * I just plan to increment with each new version.
  */
-#define IW_HANDLER_VERSION 7
+#define IW_HANDLER_VERSION 8
 
 /*
  * Changes :
@@ -239,6 +239,10 @@
  * - Remove (struct iw_point *)->pointer from events and streams
  * - Remove spy_offset from struct iw_handler_def
  * - Add "check" version of event macros for ieee802.11 stack
+ *
+ * V7 to V8
+ * --
+ * - Prevent leaking of kernel space in stream on 64 bits.
  */
 
 / CONSTANTS /
@@ -500,7 +504,11 @@ iwe_stream_add_event(char *stream, /* 
/* Check if it's possible */
if(likely((stream + event_len) < ends)) {
iwe->len = event_len;
-   memcpy(stream, (char *) iwe, event_len);
+   /* Beware of alignement issues on 64 bits */
+   memcpy(stream, (char *) iwe, IW_EV_LCP_PK_LEN);
+   memcpy(stream + IW_EV_LCP_LEN,
+  ((char *) iwe) + IW_EV_LCP_LEN,
+  event_len - IW_EV_LCP_LEN);
stream += event_len;
}
return stream;
@@ -521,10 +529,10 @@ iwe_stream_add_point(char *   stream, /* 
/* Check if it's possible */
if(likely((stream + event_len) < ends)) {
iwe->len = event_len;
-   memcpy(stream, (char *) iwe, IW_EV_LCP_LEN);
+   memcpy(stream, (char *) iwe, IW_EV_LCP_PK_LEN);
memcpy(stream + IW_EV_LCP_LEN,
   ((char *) iwe) + IW_EV_LCP_LEN + IW_EV_POINT_OFF,
-  IW_EV_POINT_LEN - IW_EV_LCP_LEN);
+  IW_EV_POINT_PK_LEN - IW_EV_LCP_PK_LEN);
memcpy(stream + IW_EV_POINT_LEN, extra, iwe->u.data.length);
stream += event_len;
}
@@ -574,7 +582,11 @@ iwe_stream_check_add_event(char *  stream
/*

Re: [PATCH]: Add security check before flushing SAD/SPD

2007-03-22 Thread James Morris

On Thu, 22 Mar 2007, Joy Latten wrote:

> > Perhaps a better semantic would be to fail the entire flush operation if 
> > one of the security checks failed.  e.g. loop through for permissions 
> > first, then if all ok, loop through for deletion.
> > 
> Ok, will code this up and test it if there are no objections.

I'd suggest making the permission loop a noop if CONFIG_SECURITY=n, via a 
static inline function.
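
One way to read this suggestion, as a sketch only: the permission pass lives in a static inline helper that collapses to "always allowed" when security support is compiled out. The names `flush_permission_check()` and `security_entry_delete()` and the `struct entry` type are hypothetical, chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical entry type, for illustration only. */
struct entry {
	int label;
};

#ifdef CONFIG_SECURITY
/* Hypothetical per-entry hook, assumed to be provided elsewhere. */
int security_entry_delete(const struct entry *e);

static inline int flush_permission_check(const struct entry *db, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int err = security_entry_delete(&db[i]);

		if (err)
			return err;
	}
	return 0;
}
#else
/* With security disabled the whole loop compiles away. */
static inline int flush_permission_check(const struct entry *db, size_t n)
{
	(void)db;
	(void)n;
	return 0;
}
#endif
```

Callers then invoke `flush_permission_check()` unconditionally before the deletion loop and need no `#ifdef` of their own.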


-- 
James Morris
<[EMAIL PROTECTED]>

[PATCH 13/13] mac68k: SONIC interrupt handling, version 2

2007-03-22 Thread Finn Thain


The previous version of this patch had a bug where a nubus sonic card 
would register two interrupt handlers. Only a built-in sonic needs both.

Install the built-in macsonic interrupt handler on both IRQs when using 
via_alt_mapping. Otherwise the rare interrupt that still comes from the 
nubus slot will wedge the nubus.

$ cat /proc/interrupts
auto   2:  89176 via2
auto   3: 744367 sonic
auto   4:  0 scc
auto   6: 318363 via1
auto   7:  0 NMI
mac9: 119413 framebuffer vbl
mac   10:   1971 ADB
mac   14: 198517 timer
mac   17:  89104 nubus
mac   19: 72 Mac ESP SCSI
mac   56:629 sonic
mac   62:1142593 ide0

Signed-off-by: Finn Thain <[EMAIL PROTECTED]>

 drivers/net/jazzsonic.c |   21 
 drivers/net/macsonic.c  |   49 
 drivers/net/sonic.c |   25 
 3 files changed, 62 insertions(+), 33 deletions(-)

Index: linux-2.6.20/drivers/net/jazzsonic.c
===
--- linux-2.6.20.orig/drivers/net/jazzsonic.c   2007-03-23 11:01:36.0 +1100
+++ linux-2.6.20/drivers/net/jazzsonic.c 2007-03-23 11:01:36.0 +1100
@@ -88,6 +88,21 @@ static unsigned short known_revisions[] 
0x  /* end of list */
 };
 
+static int jazzsonic_open(struct net_device* dev) {
+   if (request_irq(dev->irq, &sonic_interrupt, IRQF_DISABLED, "sonic", dev)) {
+   printk(KERN_ERR "\n%s: unable to get IRQ %d .\n", dev->name, dev->irq);
+   return -EAGAIN;
+   }
+   return sonic_open(dev);
+}
+
+static int jazzsonic_close(struct net_device* dev) {
+   int err;
+   err = sonic_close(dev);
+   free_irq(dev->irq, dev);
+   return err;
+}
+
 static int __init sonic_probe1(struct net_device *dev)
 {
static unsigned version_printed;
@@ -169,8 +184,8 @@ static int __init sonic_probe1(struct ne
lp->rra_laddr = lp->rda_laddr + (SIZEOF_SONIC_RD * SONIC_NUM_RDS
 * SONIC_BUS_SCALE(lp->dma_bitmode));
 
-   dev->open = sonic_open;
-   dev->stop = sonic_close;
+   dev->open = jazzsonic_open;
+   dev->stop = jazzsonic_close;
dev->hard_start_xmit = sonic_send_packet;
dev->get_stats = sonic_get_stats;
dev->set_multicast_list = &sonic_multicast_list;
@@ -260,8 +275,6 @@ MODULE_DESCRIPTION("Jazz SONIC ethernet 
 module_param(sonic_debug, int, 0);
 MODULE_PARM_DESC(sonic_debug, "jazzsonic debug level (1-4)");
 
-#define SONIC_IRQ_FLAG IRQF_DISABLED
-
 #include "sonic.c"
 
 static int __devexit jazz_sonic_device_remove (struct platform_device *pdev)
Index: linux-2.6.20/drivers/net/macsonic.c
===
--- linux-2.6.20.orig/drivers/net/macsonic.c 2007-03-23 11:01:36.0 +1100
+++ linux-2.6.20/drivers/net/macsonic.c 2007-03-23 14:39:17.0 +1100
@@ -133,6 +133,49 @@ static inline void bit_reverse_addr(unsi
   nibbletab[(addr[i] >> 4) &0xf]);
 }
 
+static irqreturn_t macsonic_interrupt(int irq, void *dev_id) {
+   /* Under the A/UX interrupt scheme, the onboard SONIC interrupt comes
+* in at priority level 3. However, we sometimes get the level 2 inter-
+* rupt as well, which must prevent re-enterance of the sonic handler.
+*/
+   irqreturn_t result;
+   unsigned long flags;
+
+   local_irq_save(flags);
+   result = sonic_interrupt(irq, dev_id);
+   local_irq_restore(flags);
+   return result;
+}
+
+static int macsonic_open(struct net_device* dev) {
+   if (dev->irq == IRQ_AUTO_3) {
+   if (request_irq(dev->irq, &sonic_interrupt, IRQ_FLG_FAST, "sonic", dev)) {
+   printk(KERN_ERR "\n%s: unable to get IRQ %d .\n", dev->name, dev->irq);
+   return -EAGAIN;
+   }
+   if (request_irq(IRQ_NUBUS_9, &macsonic_interrupt, IRQ_FLG_FAST, "sonic", dev)) {
+   printk(KERN_ERR "\n%s: unable to get IRQ %d .\n", dev->name, IRQ_NUBUS_9);
+   free_irq(dev->irq, dev);
+   return -EAGAIN;
+   }
+   } else {
+   if (request_irq(dev->irq, &sonic_interrupt, IRQ_FLG_FAST, "sonic", dev)) {
+   printk(KERN_ERR "\n%s: unable to get IRQ %d .\n", dev->name, dev->irq);
+   return -EAGAIN;
+   }
+   }
+   return sonic_open(dev);
+}
+
+static int macsonic_close(struct net_device* dev) {
+   int err;
+   err = sonic_close(dev);
+   free_irq(dev->irq, dev);
+   if (dev->irq == IRQ_AUTO_3)
+   free_irq(IRQ_NUBUS_9, dev);
+   return err;
+}
+
 int __init macsonic_init(struct net_device* dev)
 {
struct sonic_local* lp = netdev_priv(dev);
@@ -163,8 +206

Re: Recent net-2.6.22 patches break bootup!

2007-03-22 Thread David Miller

From: Thomas Graf <[EMAIL PROTECTED]>
Date: Fri, 23 Mar 2007 00:47:04 +0100

> * Stephen Hemminger <[EMAIL PROTECTED]> 2007-03-22 14:27
> > Something is broken now.  If I boot the system (Fedora) it gets to:
> > 
> > Bringing up loopback interface:  RTNETLINK answers: Invalid argument
> > Dump terminated
> > RTNETLINK answers: Invalid argument
> > 
> > 
> > tg3 device eth0 does not seem to be present, delaying initialization
> > 
> > 
> > then it hangs because cups won't come up without loopback
> 
> Thinko. It always returned the first message handler of a rtnl
> family.
> 
> [RTNL]: Properly return rntl message handler
> 
> Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Applied, thanks Thomas.
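
For readers following the thread, the thinko can be reproduced in a few lines of standalone C. The table and handler names below are made up for illustration; only the `tab->doit` versus `tab[msgindex].doit` difference is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*doit_func)(int arg);

struct link {
	doit_func doit;
};

static int handle_new(int a) { return a + 1; }
static int handle_del(int a) { return a - 1; }

/* One table per family, indexed by message type. */
static struct link family_tab[] = {
	{ handle_new },
	{ handle_del },
};

/* Buggy lookup: tab->doit is tab[0].doit, whatever msgindex is. */
static doit_func get_doit_buggy(struct link *tab, int msgindex)
{
	(void)msgindex;
	return tab ? tab->doit : NULL;
}

/* Fixed lookup: index the table by message type. */
static doit_func get_doit_fixed(struct link *tab, int msgindex)
{
	assert(msgindex >= 0);
	return tab ? tab[msgindex].doit : NULL;
}
```

With the buggy lookup every message type in a family is dispatched to the handler in slot 0, which would fit the "Invalid argument" replies Stephen saw at boot.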

Re: [PATCH]: Add security check before flushing SAD/SPD

2007-03-22 Thread Eric Paris

On Thu, 2007-03-22 at 19:49 -0400, James Morris wrote:
> On Thu, 22 Mar 2007, Joy Latten wrote:
> 
> > > I would look at this patch differently if there were some
> > > security level key being checked for a match here, which is
> > > an input key to the flush, but that is not what is happening
> > > here as the object is being looked at by itself.
> > 
> > Yes, I understand what you are saying.
> > I was concerned about having to check each entry
> > to flush database.
> > 
> > I did this patch because we check for authorization
> > when deleting single specified entries from the SAD/SPD. It
> > seems like a hole to me that we check for this, but that same
> > user/process can delete the entire database with no checks.
> 
> Indeed.  Removing an entry is modifying MAC policy, which requires 
> appropriate authorization.
> 
> The security label is encapsulated with the object, which is why it's 
> passed to the security layer.
> 
> Perhaps a better semantic would be to fail the entire flush operation if 
> one of the security checks failed.  e.g. loop through for permissions 
> first, then if all ok, loop through for deletion.

Maybe I'm way out on a limb here, but if I am a regular user and I say
rm /tmp/* and I only have permission to delete some of the files, I
expect just those few to be deleted, not the whole operation denied.

It seems reasonable to me that the check for every policy (which is
between current->security->sid and xp->security->ctx_sid) makes sense.
There doesn't appear to me right offhand to be anything intrinsic in the
code which says that a flush request must flush everything or nothing.

In either case though proper auditing needs to be addressed.  I see that
the first patch from Joy wouldn't audit deletion failures.  It appears
to me if the check is done per policy then the security hook return code
needs to be recorded and passed to xfrm_audit_log instead of the hard
coded 1 result used now.

Assuming we go with James's double loop what should we be auditing for a
security hook denial?  Just audit the first policy entry which we tried
to remove but couldn't and then leave the rest of the auditing in those
functions the way it is now in case there was no denial, calling
xfrm_audit_log with a hard coded 1 for the result?

-Eric


Re: [PATCH 1/7] skge: deadlock on tx timeout

2007-03-22 Thread Jeff Garzik


Stephen Hemminger wrote:

The skge driver will deadlock if it gets a transmit timeout
because the netif_tx_lock() is already held.

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>


applied 1-3 to #upstream-fixes



Re: [PATCH 3/7] skge: use per-port phy locking

2007-03-22 Thread Jeff Garzik


Stephen Hemminger wrote:

Rather than a workqueue and a per-board mutex to control PHY,
use a tasklet and spinlock. Tasklet is lower overhead and works
just as well for this.

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>


Like we give a crap about overhead of PHY code.

This seems like the wrong direction to me, but let's see where this leads.

Jeff




Re: Fix return code in pci-skeleton.c

2007-03-22 Thread Jeff Garzik


Anton Blanchard wrote:

We assign the return value of register_netdev to i, but return rc later
on. Fix it.

Signed-off-by: Anton Blanchard <[EMAIL PROTECTED]>
---

diff --git a/drivers/net/pci-skeleton.c b/drivers/net/pci-skeleton.c
index 00ca0fd..6ca4e4f 100644
--- a/drivers/net/pci-skeleton.c
+++ b/drivers/net/pci-skeleton.c
@@ -710,8 +710,8 @@ match:
tp->chipset,
rtl_chip_info[tp->chipset].name);
 
-   i = register_netdev (dev);
-   if (i)
+   rc = register_netdev (dev);
+   if (rc)
goto err_out_unmap;


applied
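
The bug class is easy to show outside the driver. In the sketch below `register_thing()` is a hypothetical stand-in for `register_netdev()`; the buggy variant stores the result in one variable and returns another, so a failed registration is reported as success.

```c
#include <errno.h>

/* Hypothetical stand-in for register_netdev(): fails on request. */
static int register_thing(int fail)
{
	return fail ? -ENODEV : 0;
}

/* Buggy: the error lands in i, but the error path returns rc. */
static int probe_buggy(int fail)
{
	int rc = 0;
	int i;

	i = register_thing(fail);
	if (i)
		goto err_out;
	return 0;

err_out:
	return rc;	/* still 0, so the caller sees success */
}

/* Fixed: the same variable is assigned and returned. */
static int probe_fixed(int fail)
{
	int rc;

	rc = register_thing(fail);
	if (rc)
		goto err_out;
	return 0;

err_out:
	return rc;
}
```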



Re: [PATCH] Revert "ucc_geth: returns NETDEV_TX_BUSY when BD ring is full"

2007-03-22 Thread Jeff Garzik


Li Yang wrote:

This reverts commit 18babd38547a042a4bfd4154a014d1ad33373eb0.

Michael Barkowski points out that it's wrong, and I agree.  Now that
the patch "ucc_geth: Fix BD processing" has been applied, this patch
causes a problem rather than fixing one.  Before that patch, the
current packet had to be blocked.  After it, the current packet is
ok and we only need to block the next one.

Reported-by: Michael Barkowski <[EMAIL PROTECTED]>
Signed-off-by: Li Yang <[EMAIL PROTECTED]>
---
Sorry for the mistake I made.

drivers/net/ucc_geth.c |3 +--
1 files changed, 1 insertions(+), 2 deletions(-)


applied



Re: [NET] SAA9730: Fix large pile of warnings

2007-03-22 Thread Jeff Garzik


Ralf Baechle wrote:

The SAA9730 driver doesn't quite grok what the difference between an ioport
and memory mapped I/O is.  It just happened to work on the one Linux
system the SAA9730 happens to spend its miserable existence on.


applied

