Re: [RFC NET 00/04]: Increase number of possible routing tables

2006-07-07 Thread Ben Greear

Patrick McHardy wrote:
> Ben Greear wrote:
>> With this patch applied everything is looking much better.  I currently
>> have 400+ interfaces and one routing table per interface, and traffic
>> is passing as expected.
>>
>> This is probably due to my own application polling interfaces for
>> stat updates...but I am seeing over 50% usage (with more system than
>> user-space) in this setup on an otherwise lightly loaded system.  top
>> shows no process averaging more than about 2% CPU (and only 2-3 are
>> above 0.0 typically), which I find a little strange.  load is around 3.0.
>
> I can't imagine this being related to the increased number of
> routing tables, with a number of entries slightly (not even two
> times) over the hash size it shouldn't make that much of a
> difference. It may of course be a bug, but I don't see it.

I think it was my polling logic that was the problem.  I fixed it up to
be more clever and the load went away.

Ben



-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html




--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com



Re: [RFC NET 00/04]: Increase number of possible routing tables

2006-07-07 Thread Patrick McHardy
Ben Greear wrote:
> With this patch applied everything is looking much better.  I currently
> have 400+ interfaces and one routing table per interface, and traffic
> is passing as expected.
> 
> This is probably due to my own application polling interfaces for
> stat updates...but I am seeing over 50% usage (with more system than
> user-space)
> in this setup on an otherwise lightly loaded system.  top shows no
> process averaging
> more than about 2% CPU (and only 2-3 are above 0.0 typically), which I find
> a little strange.  load is around 3.0.

I can't imagine this being related to the increased number of
routing tables, with a number of entries slightly (not even two
times) over the hash size it shouldn't make that much of a
difference. It may of course be a bug, but I don't see it.



Re: [RFC NET 00/04]: Increase number of possible routing tables

2006-07-07 Thread Patrick McHardy
David Miller wrote:
> Nice work Patrick.
> 
> You guys have a lot of time to flesh out any remaining issues and
> failures, and then submit this for 2.6.19

Will do, I already expected to miss the deadline :)



Re: [RFC NET 00/04]: Increase number of possible routing tables

2006-07-07 Thread Ben Greear

Patrick McHardy wrote:
> Ben Greear wrote:
>> Patrick McHardy wrote:
>>> I took on Ben's challenge to increase the number of possible routing
>>> tables, these are the resulting patches.
>>
>> I am seeing problems..though they could be with the way I'm using the tool
>> or perhaps I patched the kernel incorrectly.
>>
>> I applied the 3 patches to 2.6.17..all patches applied without problem,
>> but with a few lines of fuzz.  I get the same behaviour with and
>> without the new 'ip' patches applied.
>>
>> If I do an 'ip ru show', then I see lots of tables, though not all it
>> seems. (I have not tried beyond 205 yet).  But, if I do an
>> 'ip route show table XX', then I see nothing or incorrect values.
>
> My patches introduced a bug when dumping tables which could lead to
> incorrect routes being dumped. A second bug (that already existed)
> makes the kernel fail when dumping more rules than fit in a skb.
> I think I've already seen the patch to address the second problem
> a short time ago sent by someone else. Anyway, this patch should
> fix both.

With this patch applied everything is looking much better.  I currently
have 400+ interfaces and one routing table per interface, and traffic
is passing as expected.

This is probably due to my own application polling interfaces for
stat updates...but I am seeing over 50% usage (with more system than
user-space) in this setup on an otherwise lightly loaded system.  top
shows no process averaging more than about 2% CPU (and only 2-3 are
above 0.0 typically), which I find a little strange.  load is around 3.0.

I'll dig into my code and see if I can tune the stat-gathering logic a bit...

Thanks,
Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com



Re: [RFC NET 00/04]: Increase number of possible routing tables

2006-07-07 Thread David Miller
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Fri, 07 Jul 2006 21:58:31 +0200

> My patches introduced a bug when dumping tables which could lead to
> incorrect routes being dumped. A second bug (that already existed)
> makes the kernel fail when dumping more rules than fit in a skb.
> I think I've already seen the patch to address the second problem
> a short time ago sent by someone else. Anyway, this patch should
> fix both.

Nice work Patrick.

You guys have a lot of time to flesh out any remaining issues and
failures, and then submit this for 2.6.19

Thanks again.


Re: [RFC NET 00/04]: Increase number of possible routing tables

2006-07-07 Thread Patrick McHardy
Ben Greear wrote:
> Patrick McHardy wrote:
> 
>>> I took on Ben's challenge to increase the number of possible routing
>>> tables, these are the resulting patches.
> 
> 
> I am seeing problems..though they could be with the way I'm using the tool
> or perhaps I patched the kernel incorrectly.
> 
> I applied the 3 patches to 2.6.17..all patches applied without problem,
> but with a few lines of fuzz.  I get the same behaviour with and
> without the new 'ip' patches applied.
> 
> If I do an 'ip ru show', then I see lots of tables, though not all it
> seems. (I have not tried beyond 205 yet).  But, if I do an
> 'ip route show table XX', then I see nothing or incorrect values.

My patches introduced a bug when dumping tables which could lead to
incorrect routes being dumped. A second bug (that already existed)
makes the kernel fail when dumping more rules than fit in a skb.
I think I've already seen the patch to address the second problem
a short time ago sent by someone else. Anyway, this patch should
fix both.

diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 3c49e6b..6e1aaa4 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -357,6 +357,7 @@ int inet_dump_fib(struct sk_buff *skb, s
unsigned int e = 0, s_e;
struct fib_table *tb;
struct hlist_node *node;
+   int dumped = 0;
 
if (NLMSG_PAYLOAD(cb->nlh, 0) >= sizeof(struct rtmsg) &&
((struct rtmsg*)NLMSG_DATA(cb->nlh))->rtm_flags&RTM_F_CLONED)
@@ -365,16 +366,17 @@ int inet_dump_fib(struct sk_buff *skb, s
s_h = cb->args[0];
s_e = cb->args[1];
 
-   for (h = s_h; h < FIB_TABLE_HASHSZ; h++) {
+   for (h = s_h; h < FIB_TABLE_HASHSZ; h++, s_e = 0) {
e = 0;
hlist_for_each_entry(tb, node, &fib_table_hash[h], tb_hlist) {
if (e < s_e)
goto next;
-   if (e > s_e)
-   memset(&cb->args[1], 0, sizeof(cb->args) -
+   if (dumped)
+   memset(&cb->args[2], 0, sizeof(cb->args) -
 2 * sizeof(cb->args[0]));
if (tb->tb_dump(tb, skb, cb) < 0) 
goto out;
+   dumped = 1;
 next:
e++;
}
diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c
index a41ab4b..6f33f12 100644
--- a/net/ipv4/fib_rules.c
+++ b/net/ipv4/fib_rules.c
@@ -459,13 +459,13 @@ int inet_dump_rules(struct sk_buff *skb,
 
rcu_read_lock();
hlist_for_each_entry(r, node, &fib_rules, hlist) {
-
if (idx < s_idx)
-   continue;
+   goto next;
if (inet_fill_rule(skb, r, NETLINK_CB(cb->skb).pid,
   cb->nlh->nlmsg_seq,
   RTM_NEWRULE, NLM_F_MULTI) < 0)
break;
+next:
idx++;
}
rcu_read_unlock();
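
For readers following the fib_rules.c hunk above, here is a minimal userspace sketch of the resume logic it repairs. It is not kernel code: sketch_dump_rules, the integer array standing in for the rule list, and the cap parameter standing in for the space left in the skb are all made up for illustration. What it tries to show is that the position counter has to advance for skipped entries as well, which is what replacing "continue" with "goto next" achieves in the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch of a resumable dump. Copies entries of rules[]
 * into out[], starting at position s_idx, until out[] holds cap entries.
 * Stores the number of entries written in *n_out and returns the
 * position to resume from on the next call.
 */
static size_t sketch_dump_rules(const int *rules, size_t n_rules,
                                size_t s_idx,
                                int *out, size_t cap, size_t *n_out)
{
    size_t idx = 0, used = 0, i;

    assert(rules != NULL && out != NULL && n_out != NULL);

    for (i = 0; i < n_rules; i++) {
        if (idx < s_idx)
            goto next;      /* already sent by an earlier call */
        if (used == cap)
            break;          /* buffer full, resume at idx next time */
        out[used++] = rules[i];
next:
        idx++;              /* must also run for skipped entries */
    }

    *n_out = used;
    return idx;
}
```

Calling it repeatedly with the returned value as the next s_idx walks the whole list in chunks of at most cap entries. With a plain "continue" in the skip branch, idx would stay at 0 and every call after the first would skip everything.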


Re: [RFC NET 00/04]: Increase number of possible routing tables

2006-07-07 Thread Ben Greear

Patrick McHardy wrote:
> Patrick McHardy wrote:
>> I took on Ben's challenge to increase the number of possible routing tables,
>> these are the resulting patches.


I am seeing problems..though they could be with the way I'm using the tool
or perhaps I patched the kernel incorrectly.

I applied the 3 patches to 2.6.17..all patches applied without problem,
but with a few lines of fuzz.  I get the same behaviour with and
without the new 'ip' patches applied.

If I do an 'ip ru show', then I see lots of tables, though not all it seems.
(I have not tried beyond 205 yet).  But, if I do an 'ip route show table XX',
then I see nothing or incorrect values.

For my test, I am creating 200 virtual interfaces (mac-vlans in my case, but
802.1q should work equally well.)  I am giving them all IP addrs on the same
subnet, and a routing table for each source IP addr.

The commands I run to generate the routing tables are found in this file:
http://www.candelatech.com/oss/gc.txt

When I change back to kernel 2.6.16.16 with only my patchset applied, things
seem to be working, so it looks like an issue with the new kernel patches.
I can provide access to this machine as well as my full patch set, etc...

For whatever reason, table 5 does appear in a bizarre fashion:

[EMAIL PROTECTED] lanforge]$ more ~/tmp/ip.txt
[EMAIL PROTECTED] lanforge]# ip route show table 5
10.1.2.0/24 via 10.1.2.2 dev eth1#0
default via 10.1.2.1 dev eth1#0
[EMAIL PROTECTED] lanforge]# ip route show table 4
[EMAIL PROTECTED] lanforge]# ip route show table 3
[EMAIL PROTECTED] lanforge]# ip route show table 2
[EMAIL PROTECTED] lanforge]# ip route show table 1
[EMAIL PROTECTED] lanforge]# ip route show table 0
10.1.2.0/24 via 10.1.2.2 dev eth1#0  table 5
default via 10.1.2.1 dev eth1#0  table 5

#  Here is a listing of 'ip ru show'.
[EMAIL PROTECTED] lanforge]$ more ~/tmp/ru.txt
0:  from all lookup local
31203:  from 10.1.2.144 lookup 147
31204:  from 10.1.2.143 lookup 146
31205:  from 10.1.2.142 lookup 145
31206:  from 10.1.2.141 lookup 144
31207:  from 10.1.2.140 lookup 143
31208:  from 10.1.2.139 lookup 142
31209:  from 10.1.2.138 lookup 141
31210:  from 10.1.2.137 lookup 140
31211:  from 10.1.2.136 lookup 139
31212:  from 10.1.2.135 lookup 138
31213:  from 10.1.2.134 lookup 137
31214:  from 10.1.2.133 lookup 136
31215:  from 10.1.2.132 lookup 135
31216:  from 10.1.2.131 lookup 134
31217:  from 10.1.2.130 lookup 133
31218:  from 10.1.2.129 lookup 132
31219:  from 10.1.2.128 lookup 131
31220:  from 10.1.2.127 lookup 130
31221:  from 10.1.2.126 lookup 129
31222:  from 10.1.2.125 lookup 128
31223:  from 10.1.2.124 lookup 127
31224:  from 10.1.2.123 lookup 126
31225:  from 10.1.2.122 lookup 125
31226:  from 10.1.2.121 lookup 124
31227:  from 10.1.2.120 lookup 123
31228:  from 10.1.2.119 lookup 122
31229:  from 10.1.2.118 lookup 121
31230:  from 10.1.2.117 lookup 120
31231:  from 10.1.2.116 lookup 119
31232:  from 10.1.2.115 lookup 118
31233:  from 10.1.2.114 lookup 117
31234:  from 10.1.2.113 lookup 116
31235:  from 10.1.2.201 lookup 204
31236:  from 10.1.2.200 lookup 203
31237:  from 10.1.2.199 lookup 202
31238:  from 10.1.2.198 lookup 201
31239:  from 10.1.2.197 lookup 200
31240:  from 10.1.2.196 lookup 199
31241:  from 10.1.2.195 lookup 198
31242:  from 10.1.2.112 lookup 115
31243:  from 10.1.2.111 lookup 114
31244:  from 10.1.2.110 lookup 113
31245:  from 10.1.2.109 lookup 112
31246:  from 10.1.2.108 lookup 111
31247:  from 10.1.2.107 lookup 110
31248:  from 10.1.2.106 lookup 109
31249:  from 10.1.2.105 lookup 108
31250:  from 10.1.2.104 lookup 107
31251:  from 10.1.2.103 lookup 106
31252:  from 10.1.2.102 lookup 105
31253:  from 10.1.2.101 lookup 104
31254:  from 10.1.2.100 lookup 103
31255:  from 10.1.2.99 lookup 102
31256:  from 10.1.2.98 lookup 101
31257:  from 10.1.2.97 lookup 100
31258:  from 10.1.2.96 lookup 99
31259:  from 10.1.2.95 lookup 98
31260:  from 10.1.2.94 lookup 97
31261:  from 10.1.2.93 lookup 96
31262:  from 10.1.2.92 lookup 95
31263:  from 10.1.2.91 lookup 94
31264:  from 10.1.2.90 lookup 93
31265:  from 10.1.2.89 lookup 92
31266:  from 10.1.2.88 lookup 91
31267:  from 10.1.2.87 lookup 90
31268:  from 10.1.2.86 lookup 89
31269:  from 10.1.2.85 lookup 88
31270:  from 10.1.2.84 lookup 87
31271:  from 10.1.2.83 lookup 86
31272:  from 10.1.2.82 lookup 85
31273:  from 10.1.2.81 lookup 84
31274:  from 10.1.2.80 lookup 83
31275:  from 10.1.2.79 lookup 82

Thanks,
Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com



Re: [RFC NET 00/04]: Increase number of possible routing tables

2006-07-07 Thread Patrick McHardy
Patrick McHardy wrote:
> I took on Ben's challenge to increase the number of possible routing tables,
> these are the resulting patches.
> 
> The table IDs are changed to 32 bit values and are contained in a new netlink
> routing attribute. For compatibility rtm_table in struct rtmsg can still be
> used to access the first 255 tables and contains the low 8 bit of the table
> ID in case of dumps. Unfortunately there are no invalid values for rtm_table,
> so the best userspace can do in case of a new iproute version that tries to
> access tables > 255 on an old kernel is to use RTM_UNSPEC (0) for rtm_table,
> which will make the kernel allocate an empty table instead of silently adding
> routes to a more or less random table. The iproute patch will follow shortly.
> 
> The hash tables are statically sized since on-the-fly resizing would require
> introducing locking in the packet processing path (currently we need none),
> if this is a problem we could just directly attach table references to rules,
> since tables are never deleted or freed this would be a simple change.
> 
> One spot is still missing (nl_fib_lookup), so these patches are purely a RFC
> for now. Tested only with IPv4, I mainly converted DECNET as well to keep it
> in sync and because iteration over all possible table values, as done in many
> spots, has an unacceptable overhead with 32 bit values.


Since there were no objections, I would like to finalize this patch by
taking care of nl_fib_lookup. Since it was introduced as a debugging
interface for fib_trie and the interface definitions are not even
public (contained in include/net), I wonder if anyone really cares about
backwards compatibility or if I can just change it.

Robert, Thomas, you are the only two users of the interface I'm aware
of, what do you think?



Re: [RFC NET 00/04]: Increase number of possible routing tables

2006-07-03 Thread Thomas Graf
* Patrick McHardy <[EMAIL PROTECTED]> 2006-07-03 13:36
> They will as long as this feature isn't used, the RTA_TABLE
> attribute is only added to the message when the table id
> is > 255. Worked fine during my tests, or are you referring
> to something else?

Perfect, I said nothing :)


Re: [RFC NET 00/04]: Increase number of possible routing tables

2006-07-03 Thread Patrick McHardy
Thomas Graf wrote:
> * Patrick McHardy <[EMAIL PROTECTED]> 2006-07-03 11:38
> 
>>That wasn't entirely true either, it's not inet_check_attr but
>>rtnetlink_rcv_message that aborts, and it does this on all
>>kernels. Somehow I thought unknown attributes were usually
>>ignored ..
> 
> 
> This only applies to the first level of rtnetlink attributes,
> when using rtattr_parse() unknown attributes are ignored.
> 
> Once this ugly rta_buf has disappeared it will become more
> consistent.
> 
> Patches look good to me except that new iproute binaries
> won't work with older kernels anymore?

They will as long as this feature isn't used, the RTA_TABLE
attribute is only added to the message when the table id
is > 255. Worked fine during my tests, or are you referring
to something else?



Re: [RFC NET 00/04]: Increase number of possible routing tables

2006-07-03 Thread Thomas Graf
* Patrick McHardy <[EMAIL PROTECTED]> 2006-07-03 11:38
> That wasn't entirely true either, it's not inet_check_attr but
> rtnetlink_rcv_message that aborts, and it does this on all
> kernels. Somehow I thought unknown attributes were usually
> ignored ..

This only applies to the first level of rtnetlink attributes,
when using rtattr_parse() unknown attributes are ignored.

Once this ugly rta_buf has disappeared it will become more
consistent.

Patches look good to me except that new iproute binaries
won't work with older kernels anymore?


Re: [RFC NET 00/04]: Increase number of possible routing tables

2006-07-03 Thread Patrick McHardy
Patrick McHardy wrote:
> Patrick McHardy wrote:
> 
>>I took on Ben's challenge to increase the number of possible routing tables,
>>these are the resulting patches.
>>
>>The table IDs are changed to 32 bit values and are contained in a new netlink
>>routing attribute. For compatibility rtm_table in struct rtmsg can still be
>>used to access the first 255 tables and contains the low 8 bit of the table
>>ID in case of dumps. Unfortunately there are no invalid values for rtm_table,
>>so the best userspace can do in case of a new iproute version that tries to
>>access tables > 255 on an old kernel is to use RTM_UNSPEC (0) for rtm_table,
>>which will make the kernel allocate an empty table instead of silently adding
>>routes to a more or less random table. The iproute patch will follow shortly.
> 
> 
> Actually that last part wasn't entirely true. The last couple of
> releases of the kernel include the inet_check_attr function,
> which (unwillingly) breaks with the tradition of ignoring
> unknown attributes and signals an error on receiving the RTA_TABLE
> attribute. So the iproute patch only includes the RTA_TABLE
> attribute when the table ID is > 255, in which case rtm_table
> is set to RT_TABLE_UNSPEC. Old kernels will still have the
> behaviour I described above. The patch has been tested to
> behave as expected on both patched and unpatched kernels.

That wasn't entirely true either, it's not inet_check_attr but
rtnetlink_rcv_message that aborts, and it does this on all
kernels. Somehow I thought unknown attributes were usually
ignored .. anyway, this is a good thing in this case as it
will avoid unexpected behaviour and simply return an error
on kernels where this feature is not available.
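
As a rough illustration of the two behaviours discussed here (aborting on an unknown attribute versus ignoring it), the following sketch contrasts a strict pass with a lenient one. It is not the kernel's parser: struct sketch_attr, both function names, and the max_known parameter are assumptions invented for this example, and real netlink attributes carry a length and payload that are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified attribute: only the type matters here. */
struct sketch_attr {
    unsigned short type;
};

/* Strict pass: any type above max_known rejects the whole request. */
static int sketch_check_strict(const struct sketch_attr *attrs, size_t n,
                               unsigned short max_known)
{
    size_t i;

    assert(attrs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (attrs[i].type > max_known)
            return -EINVAL;
    }
    return 0;
}

/* Lenient pass: unknown types are skipped, known ones are counted. */
static size_t sketch_count_known(const struct sketch_attr *attrs, size_t n,
                                 unsigned short max_known)
{
    size_t i, known = 0;

    assert(attrs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (attrs[i].type != 0 && attrs[i].type <= max_known)
            known++;
    }
    return known;
}
```

Under the strict pass, a request carrying a type the receiver does not know fails with an error instead of being applied with the unknown part dropped, which is the outcome described above as the desirable one for this feature.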



Re: [RFC NET 00/04]: Increase number of possible routing tables

2006-07-03 Thread Patrick McHardy
Patrick McHardy wrote:
> I took on Ben's challenge to increase the number of possible routing tables,
> these are the resulting patches.
> 
> The table IDs are changed to 32 bit values and are contained in a new netlink
> routing attribute. For compatibility rtm_table in struct rtmsg can still be
> used to access the first 255 tables and contains the low 8 bit of the table
> ID in case of dumps. Unfortunately there are no invalid values for rtm_table,
> so the best userspace can do in case of a new iproute version that tries to
> access tables > 255 on an old kernel is to use RTM_UNSPEC (0) for rtm_table,
> which will make the kernel allocate an empty table instead of silently adding
> routes to a more or less random table. The iproute patch will follow shortly.

Actually that last part wasn't entirely true. The last couple of
releases of the kernel include the inet_check_attr function,
which (unwillingly) breaks with the tradition of ignoring
unknown attributes and signals an error on receiving the RTA_TABLE
attribute. So the iproute patch only includes the RTA_TABLE
attribute when the table ID is > 255, in which case rtm_table
is set to RT_TABLE_UNSPEC. Old kernels will still have the
behaviour I described above. The patch has been tested to
behave as expected on both patched and unpatched kernels.
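
The rule described in this message can be sketched roughly as follows. This is an illustration only, not the iproute code: struct sketch_route_req is a made-up stand-in for the relevant parts of a route request (it does not match the real struct rtmsg or rtattr layouts), and SKETCH_TABLE_UNSPEC merely mirrors the value 0 mentioned in the thread. The decode side follows the same idea as rtm_get_table() in the ip_common.h hunk of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TABLE_UNSPEC 0u  /* stands in for the "unspecified" value 0 */

/* Hypothetical request: the legacy 8-bit field plus an optional
 * 32-bit attribute. */
struct sketch_route_req {
    uint8_t  rtm_table;      /* legacy 8-bit table field */
    int      has_rta_table;  /* nonzero if the 32-bit attribute is attached */
    uint32_t rta_table;      /* attribute payload, valid if attached */
};

/* Encode: IDs up to 255 use only the legacy field; larger IDs go into
 * the attribute and leave the legacy field unspecified. */
static void sketch_set_table(struct sketch_route_req *req, uint32_t table)
{
    assert(req != NULL);
    if (table > 255) {
        req->rtm_table = SKETCH_TABLE_UNSPEC;
        req->has_rta_table = 1;
        req->rta_table = table;
    } else {
        req->rtm_table = (uint8_t)table;
        req->has_rta_table = 0;
        req->rta_table = 0;
    }
}

/* Decode: prefer the attribute when it is present. */
static uint32_t sketch_get_table(const struct sketch_route_req *req)
{
    assert(req != NULL);
    return req->has_rta_table ? req->rta_table : req->rtm_table;
}
```

Because the attribute is only attached for IDs above 255, a request for one of the classic tables looks the same as it did before the change.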

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 5e33a20..7573c62 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -238,9 +238,8 @@ enum rt_class_t
RT_TABLE_DEFAULT=253,
RT_TABLE_MAIN=254,
RT_TABLE_LOCAL=255,
-   __RT_TABLE_MAX
 };
-#define RT_TABLE_MAX (__RT_TABLE_MAX - 1)
+#define RT_TABLE_MAX 0x
 
 
 
@@ -263,6 +262,7 @@ enum rtattr_type_t
RTA_CACHEINFO,
RTA_SESSION,
RTA_MP_ALGO,
+   RTA_TABLE,
__RTA_MAX
 };
 
diff --git a/include/rt_names.h b/include/rt_names.h
index 2d9ef10..07a10e0 100644
--- a/include/rt_names.h
+++ b/include/rt_names.h
@@ -5,7 +5,7 @@ #include 
 
 char* rtnl_rtprot_n2a(int id, char *buf, int len);
 char* rtnl_rtscope_n2a(int id, char *buf, int len);
-char* rtnl_rttable_n2a(int id, char *buf, int len);
+char* rtnl_rttable_n2a(__u32 id, char *buf, int len);
 char* rtnl_rtrealm_n2a(int id, char *buf, int len);
 char* rtnl_dsfield_n2a(int id, char *buf, int len);
 int rtnl_rtprot_a2n(__u32 *id, char *arg);
diff --git a/ip/ip_common.h b/ip/ip_common.h
index 1fe4a69..8b286b0 100644
--- a/ip/ip_common.h
+++ b/ip/ip_common.h
@@ -32,4 +32,12 @@ extern int do_multiaddr(int argc, char *
 extern int do_multiroute(int argc, char **argv);
 extern int do_xfrm(int argc, char **argv);
 
+static inline int rtm_get_table(struct rtmsg *r, struct rtattr **tb)
+{
+   __u32 table = r->rtm_table;
+   if (tb[RTA_TABLE])
+   table = *(__u32*) RTA_DATA(tb[RTA_TABLE]);
+   return table;
+}
+
 extern struct rtnl_handle rth;
diff --git a/ip/iproute.c b/ip/iproute.c
index a43c09e..4ebe617 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -75,7 +75,8 @@ static void usage(void)
 
 static struct
 {
-   int tb;
+   __u32 tb;
+   int cloned;
int flushed;
char *flushb;
int flushp;
@@ -125,6 +126,7 @@ int print_route(const struct sockaddr_nl
inet_prefix prefsrc;
inet_prefix via;
int host_len = -1;
+   __u32 table;
SPRINT_BUF(b1);

 
@@ -151,27 +153,23 @@ int print_route(const struct sockaddr_nl
host_len = 80;
 
if (r->rtm_family == AF_INET6) {
+   if (filter.cloned) {
+   if (!(r->rtm_flags&RTM_F_CLONED))
+   return 0;
+   }
if (filter.tb) {
-   if (filter.tb < 0) {
-   if (!(r->rtm_flags&RTM_F_CLONED))
-   return 0;
-   } else {
-   if (r->rtm_flags&RTM_F_CLONED)
+   if (r->rtm_flags&RTM_F_CLONED)
+   return 0;
+   if (filter.tb == RT_TABLE_LOCAL) {
+   if (r->rtm_type != RTN_LOCAL)
return 0;
-   if (filter.tb == RT_TABLE_LOCAL) {
-   if (r->rtm_type != RTN_LOCAL)
-   return 0;
-   } else if (filter.tb == RT_TABLE_MAIN) {
-   if (r->rtm_type == RTN_LOCAL)
-   return 0;
-   } else {
+   } else if (filter.tb == RT_TABLE_MAIN) {
+   if (r->rtm_type == RTN_LOCAL)
return 0;
-   }
+   } else {
+   return 0;
}
  

[RFC NET 00/04]: Increase number of possible routing tables

2006-07-03 Thread Patrick McHardy
I took on Ben's challenge to increase the number of possible routing tables,
these are the resulting patches.

The table IDs are changed to 32 bit values and are contained in a new netlink
routing attribute. For compatibility rtm_table in struct rtmsg can still be
used to access the first 255 tables and contains the low 8 bit of the table
ID in case of dumps. Unfortunately there are no invalid values for rtm_table,
so the best userspace can do in case of a new iproute version that tries to
access tables > 255 on an old kernel is to use RTM_UNSPEC (0) for rtm_table,
which will make the kernel allocate an empty table instead of silently adding
routes to a more or less random table. The iproute patch will follow shortly.

The hash tables are statically sized since on-the-fly resizing would require
introducing locking in the packet processing path (currently we need none),
if this is a problem we could just directly attach table references to rules,
since tables are never deleted or freed this would be a simple change.
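
To make the trade-off concrete, here is a small single-threaded sketch of a statically sized hash of tables keyed by a 32 bit ID. It is an assumption-laden illustration rather than the patch itself: the names, the bucket count of 256, and the modulo hash are all invented here, and none of the kernel's concurrency handling is modelled. It shows that a lookup walks one bucket chain, and that entries are only ever added, never removed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_HASHSZ 256u  /* assumed bucket count, fixed at build time */

struct sketch_table {
    uint32_t id;
    struct sketch_table *next;  /* chain within one bucket */
};

static struct sketch_table *sketch_hash[SKETCH_HASHSZ];

/* Look up a table by ID; only one bucket chain is walked. */
static struct sketch_table *sketch_find_table(uint32_t id)
{
    struct sketch_table *tb;

    for (tb = sketch_hash[id % SKETCH_HASHSZ]; tb != NULL; tb = tb->next) {
        if (tb->id == id)
            return tb;
    }
    return NULL;
}

/* Return the table for id, creating it on first use. Nothing is ever
 * unlinked or freed in this sketch. */
static struct sketch_table *sketch_new_table(uint32_t id)
{
    struct sketch_table *tb = sketch_find_table(id);

    if (tb != NULL)
        return tb;

    tb = malloc(sizeof(*tb));
    if (tb == NULL)
        return NULL;

    tb->id = id;
    tb->next = sketch_hash[id % SKETCH_HASHSZ];
    sketch_hash[id % SKETCH_HASHSZ] = tb;

    assert(sketch_find_table(id) == tb);  /* new table must be reachable */
    return tb;
}
```

With a fixed bucket count, chains grow as more tables are created, so lookup cost depends on how many IDs land in the same bucket. Code that needs to visit every table can walk the buckets instead of counting through all possible 32 bit IDs.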

One spot is still missing (nl_fib_lookup), so these patches are purely a RFC
for now. Tested only with IPv4, I mainly converted DECNET as well to keep it
in sync and because iteration over all possible table values, as done in many
spots, has an unacceptable overhead with 32 bit values.


 include/linux/rtnetlink.h |   11 +++
 include/net/dn_fib.h      |    7 +-
 include/net/ip_fib.h      |   39 -
 net/decnet/dn_fib.c       |   62 ++---
 net/decnet/dn_route.c     |    1
 net/decnet/dn_rules.c     |   12 ++--
 net/decnet/dn_table.c     |  133 --
 net/ipv4/fib_frontend.c   |  115 +--
 net/ipv4/fib_hash.c       |   30 +-
 net/ipv4/fib_lookup.h     |    4 -
 net/ipv4/fib_rules.c      |   18 +++---
 net/ipv4/fib_semantics.c  |    5 +
 net/ipv4/fib_trie.c       |   32 +--
 net/ipv4/route.c          |    1
 net/ipv6/route.c          |    1
 15 files changed, 255 insertions(+), 216 deletions(-)

Patrick McHardy:
  [NET]: Use u32 for routing table IDs
  [NET]: Introduce RTA_TABLE routing attribute
  [IPV4]: Increase number of possible routing tables to 2^32
  [DECNET]: Increase number of possible routing tables to 2^32