Re: Patch: reduce skb input dev on 64 bit machines

2005-07-28 Thread Alexey Kuznetsov
Hello!

 You're right.  That makes it safe with preemption.  However, netfilter
 could still queue the packet which means the neigh entry and dev can
 go away under it, right?

Netfilter is very clean about this. Each hook gets two devices as _arguments_
(input and output, skb-dev is one of them depending on context), nf_queue
grabs the references, nf_reinjects puts them.

Alexey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Patch: reduce skb input dev on 64 bit machines

2005-07-27 Thread jamal
On Tue, 2005-26-07 at 09:54 -0700, Ben Greear wrote:
[..]
 You will need to enforce that nothing else gets the index 34 while eth7 is
 removed.  
 How do you do that?  

Thats trivial if you assume there's one management app which most of the
router vendors implementing it have. It will get tricky when you have 10
apps fighting to get index 34 for other devices. 
You could of course enforce a reserved set of indices right on bootup
via a kernel option for example; but that doesnt solve a stupid admin
with 10 scripts all fighting for index 34 each for a different device.

 If you try to put that into the kernel, you
 have a big nasty mess, and if you try to make user-space do it, any bogus
 script can hose your system and potentially screw up your firewall and
 worse.
 

Refer to above. It can actually be solved and not in a big mess like you
say. The question is whether such a feature is needed.

 Also, imagine that you remove your pro/100 pcmcia NIC and put in your
 tulip.  Both will be, say, eth1 currently, and that is probably what you
 want, but they will have different physical characteristics.  

The name is easy. Check out a utility like nameif for example which uses
MAC addresses as unique ids. DaveM mentions this in his other email on
this thread.

 Even if you
 put them in different cardbus slots, the likelyhood is that you want it
 to be called eth1 and treated the same regardless of which NIC you are
 using.  If you are matching firewall rules or whatever against a device
 index instead of a device name, this will fail because the device indexes
 will be different.

Again the name is easy. You can call a NIC whatever you want. 

 And, for purely virtual devices, with no lspci relationship, and no serial
 number, how do you match those?
 

[You are taking things too literally (which is always dangerous): When i
mentioned lspci, serial number etc - I was not defining scripture. I was
giving an example of how you could find uniqueness. DaveM mentioned MAC
addresses for example; think outside the box a little - find something
thats unique on per device type if you are going to write a management
script/program.]

 Maybe we could make a small effort to keep the device indexes the same
 more often, but still not guarantee it.  That may help snmp related tools
 without overly complicating the kernel or user space.

We already almost guarantee a device will get the same ifindex and name
if created at boot time on the same kernel.
As for reserving, as i said above - an admin could be allowed to reserve
ifindices. Now you could go further with boot time reserving of a
name,ifindex pair example ifreserve=eth7,34 and pass a series of
those; when someone creates eth7 they get ifindex 34 etc. But this is
assuming 10 other scripts will try to be mapping ifindex 34 to something
else.

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Patch: reduce skb input dev on 64 bit machines

2005-07-27 Thread jamal
On Tue, 2005-26-07 at 13:00 -0700, David S. Miller wrote:
 
 Calling __dev_get_by_index() at every classification check is quite
 silly and potentially expensive, so let's call using ifindex a last
 resort, yet correct, fix.

Just double checking (I think we are saying the same thing),
that using ifindices and requiring refcounting for input_dev means you
have to use __dev_get_by_index() on a per-packet basis.

The contention is if we do really care about refcounting: I dont think
we do. The only time it would really matter is when a module or device
is hotplugged out and back in. 

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Patch: reduce skb input dev on 64 bit machines

2005-07-26 Thread jamal
On Tue, 2005-26-07 at 21:54 +1000, Herbert Xu wrote:
 David S. Miller [EMAIL PROTECTED] wrote:
  
  But how can this possibly work for skb-dst'less packets (such as IPV4
  ARP generated frames)?
 
 I think the answer is that it works by accident.

But does it matter really? If say we close down the device, 
do we want to be stopped from closing it until all the packets have been
transmitted? We purge the qdiscs iirc already on close, so why not the
transmit ring?

cheers,
jamal


PS:- Dave, I think all this holding refcount thing on the input_dev is
complicating things. Lets just use the ifindex and forget about
refcounting; worst is we will end up failing some classification
somewhere - which if somebody really cares about they will have some
policy daemon listening to things and fixing the ifindices in the
kernel.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Patch: reduce skb input dev on 64 bit machines

2005-07-26 Thread Alexey Kuznetsov
Hello!

 But how can this possibly work for skb-dst'less packets (such as IPV4
 ARP generated frames)?

Consider this as an implicit argument to functions passing skbs
(that's why skb-dev can be killed btw). Caller is obliged to hold
reference to the device, when doing dev_queue_xmit(). The way depends
on caller: IP holds dst, packet socket grabs the reference explicitly,
ARP either holds a neighbour entry, when sending, or does processing
in input context.

Actual rule is simple: skb-dev is always implicit argument. Everyone,
who queues skb, must take care of saving and protecting this argument,
like all another ones.

Alexey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Patch: reduce skb input dev on 64 bit machines

2005-07-26 Thread Herbert Xu
On Tue, Jul 26, 2005 at 10:09:19AM -0400, jamal wrote:

  I think the answer is that it works by accident.
 
 But does it matter really? If say we close down the device, 
 do we want to be stopped from closing it until all the packets have been
 transmitted? We purge the qdiscs iirc already on close, so why not the
 transmit ring?

It doesn't matter as long as the packet makes it to the qdiscs
or the transmit ring.  It's the cases when it doesn't that we
may have problems with dev going away.
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Patch: reduce skb input dev on 64 bit machines

2005-07-25 Thread Jamal Hadi Salim
On Sun, 2005-24-07 at 23:18 -0400, Jamal Hadi Salim wrote:

  Posting a bunch of patches that explicitly set input_dev to dev
  right before netif_rx() sort of further proves my point :-)
  
 
 ;-
 Let me sleep it over and think about it - I am not thinking straight
 right now. I am back home, so i should be  able to dig all my notes.
 

Ok, Dave looked at my notes:
how about we set it in only ing_filter() if it is zero. i.e,


 if (!skb-input_dev) 
  skb-input_dev = skb-dev-ifindex;
-

We get rid of the printk and all other places that set input_dev except
for mirred.
Any new actions can continue to set it if they wish to. I will adapt 
some of the actions i have not yet submitted to this scheme.

Thomas:
bit 0 of tc_verd can be moved into cb;
bits 2,3,4,5 MAY be moved, but it's nastier.
Whats more nasty is given how abused cb is already by dumps etc, things
could be overwritten, unless you reserve something in cb for just the
those bits.
Maybe we can have a basic temp_flags 16 or 32 bits whose scope is only
valid locally.

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Patch: reduce skb input dev on 64 bit machines

2005-07-25 Thread David S. Miller
From: Jamal Hadi Salim [EMAIL PROTECTED]
Date: Mon, 25 Jul 2005 09:50:33 -0400

 how about we set it in only ing_filter() if it is zero. i.e,
 
 
  if (!skb-input_dev) 
   skb-input_dev = skb-dev-ifindex;
 -

Ok... how about we take this one step further?  The idea
being:

1) Put the if (!input_dev) set_it; code at the top of
   netif_receive_skb(), right before skb_bond()

2) Kill all the explicit input_dev settings for devices
   that we make now, ie. every case except where the actions
   do it

With that, I'm fine with keeping skb-input_dev around since
it really does have truly different semantics than real_dev
did.

There is, of course, still the issue of refcounting.  If we go
the ifindex route for skb-input_dev, it might make sense to
translate device specifications in action rules into ifindex.
But things could break if you down then up a device with the
same name after inserting an action rule specifying that name.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Patch: reduce skb input dev on 64 bit machines

2005-07-25 Thread David S. Miller
From: Ben Greear [EMAIL PROTECTED]
Date: Sun, 24 Jul 2005 23:25:53 -0700

 When these other changes go in, is there any chance that
 we can get support for more than 256 routing tables?  That
 can help make all of these virtual interfaces more useful
 for some of us with strange routing fetishes :)

That requires a major netlink API change, we've discussed
this to death before :-)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Patch: reduce skb input dev on 64 bit machines

2005-07-25 Thread David S. Miller
From: Jamal Hadi Salim [EMAIL PROTECTED]
Date: Mon, 25 Jul 2005 09:50:33 -0400

 
  if (!skb-input_dev) 
   skb-input_dev = skb-dev-ifindex;
 -
 
 We get rid of the printk and all other places that set input_dev except
 for mirred.
 Any new actions can continue to set it if they wish to. I will adapt 
 some of the actions i have not yet submitted to this scheme.

Ok, so can we agree on the following patch?  ing_filter() should
never see a NULL -input_dev, and we could add a BUG() check
there for that invariant if you want.

It seems like the skb-input_dev NULL check in act_api.c could
be eliminated, but I'm not so sure of that, can't that code get
called in the output path as well?

diff --git a/drivers/isdn/i4l/isdn_net.c b/drivers/isdn/i4l/isdn_net.c
--- a/drivers/isdn/i4l/isdn_net.c
+++ b/drivers/isdn/i4l/isdn_net.c
@@ -1786,7 +1786,6 @@ isdn_net_receive(struct net_device *ndev
lp-stats.rx_bytes += skb-len;
}
skb-dev = ndev;
-   skb-input_dev = ndev;
skb-pkt_type = PACKET_HOST;
skb-mac.raw = skb-data;
 #ifdef ISDN_DEBUG_NET_DUMP
diff --git a/drivers/isdn/i4l/isdn_ppp.c b/drivers/isdn/i4l/isdn_ppp.c
--- a/drivers/isdn/i4l/isdn_ppp.c
+++ b/drivers/isdn/i4l/isdn_ppp.c
@@ -1177,7 +1177,6 @@ isdn_ppp_push_higher(isdn_net_dev * net_
mlp-huptimer = 0;
 #endif /* CONFIG_IPPP_FILTER */
skb-dev = dev;
-   skb-input_dev = dev;
skb-mac.raw = skb-data;
netif_rx(skb);
/* net_dev-local-stats.rx_packets++; done in isdn_net.c */
diff --git a/drivers/net/ppp_generic.c b/drivers/net/ppp_generic.c
--- a/drivers/net/ppp_generic.c
+++ b/drivers/net/ppp_generic.c
@@ -1657,7 +1657,6 @@ ppp_receive_nonmp_frame(struct ppp *ppp,
skb-dev = ppp-dev;
skb-protocol = htons(npindex_to_ethertype[npi]);
skb-mac.raw = skb-data;
-   skb-input_dev = ppp-dev;
netif_rx(skb);
ppp-dev-last_rx = jiffies;
}
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -352,10 +352,10 @@ tcf_change_indev(struct tcf_proto *tp, c
 static inline int
 tcf_match_indev(struct sk_buff *skb, char *indev)
 {
-   if (0 != indev[0]) {
-   if  (NULL == skb-input_dev)
+   if (indev[0]) {
+   if  (!skb-input_dev)
return 0;
-   else if (0 != strcmp(indev, skb-input_dev-name))
+   if (strcmp(indev, skb-input_dev-name))
return 0;
}
 
diff --git a/include/net/x25device.h b/include/net/x25device.h
--- a/include/net/x25device.h
+++ b/include/net/x25device.h
@@ -8,7 +8,6 @@
 static inline __be16 x25_type_trans(struct sk_buff *skb, struct net_device 
*dev)
 {
skb-mac.raw = skb-data;
-   skb-input_dev = skb-dev = dev;
skb-pkt_type = PACKET_HOST;

return htons(ETH_P_X25);
diff --git a/net/core/dev.c b/net/core/dev.c
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1537,17 +1537,14 @@ static int ing_filter(struct sk_buff *sk
__u32 ttl = (__u32) G_TC_RTTL(skb-tc_verd);
if (MAX_RED_LOOP  ttl++) {
printk(Redir loop detected Dropping packet (%s-%s)\n,
-   
skb-input_dev?skb-input_dev-name:??,skb-dev-name);
+   skb-input_dev-name, skb-dev-name);
return TC_ACT_SHOT;
}
 
skb-tc_verd = SET_TC_RTTL(skb-tc_verd,ttl);
 
skb-tc_verd = SET_TC_AT(skb-tc_verd,AT_INGRESS);
-   if (NULL == skb-input_dev) {
-   skb-input_dev = skb-dev;
-   printk(ing_filter:  fixed  %s out 
%s\n,skb-input_dev-name,skb-dev-name);
-   }
+
spin_lock(dev-ingress_lock);
if ((q = dev-qdisc_ingress) != NULL)
result = q-enqueue(skb, q);
@@ -1573,6 +1570,9 @@ int netif_receive_skb(struct sk_buff *sk
if (!skb-stamp.tv_sec)
net_timestamp(skb-stamp);
 
+   if (!skb-input_dev)
+   skb-input_dev = skb-dev;
+
orig_dev = skb_bond(skb);
 
__get_cpu_var(netdev_rx_stat).total++;
diff --git a/net/ethernet/eth.c b/net/ethernet/eth.c
--- a/net/ethernet/eth.c
+++ b/net/ethernet/eth.c
@@ -163,7 +163,6 @@ __be16 eth_type_trans(struct sk_buff *sk
skb-mac.raw=skb-data;
skb_pull(skb,ETH_HLEN);
eth = eth_hdr(skb);
-   skb-input_dev = dev;

if(*eth-h_dest1)
{
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Patch: reduce skb input dev on 64 bit machines

2005-07-25 Thread David S. Miller
From: Thomas Graf [EMAIL PROTECTED]
Date: Tue, 26 Jul 2005 00:35:05 +0200

 We can also use ifindex and hold a reference. We won't access input_dev
 too often so a __dev_get_by_ifindex() won't be too expensive given the
 number of interfaces is not too big but that's another issue.

Since the reference is very local in each instance we actually use it,
we can elide the refcounting overhead most of the time perhaps.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Patch: reduce skb input dev on 64 bit machines

2005-07-24 Thread David S. Miller
From: Jamal Hadi Salim [EMAIL PROTECTED]
Date: Sat, 23 Jul 2005 09:32:07 -0400

 This is part of mission skb diet. 
 Against  git/davem/2.6.14 that was on vger 30 minutes back: It changes
 input_dev to be an ifindex so we dont bother holding devices. 
 Would only cut a 4 byte fat on 64 bit machines.
 
 Signed-off-by: Jamal Hadi Salim [EMAIL PROTECTED]

I have a better change in the wings that totally eliminates
real_dev _AND_ input_dev completely, and passes them as
parameters into pt-func() and -enqueue() as it should have
been from the beginning.

input_dev is skb-dev at time netif_receive_skb() was called,
and also, this is pretty much what the real_dev code wants too,
it wants the device before skb_bond() was invoked.

So both cases want the same exact device pointer, and we can pass
them around as parameters instead of all of the current bogus
stuff.  And the mere act of passing this orig_dev in as a parameter
makes it easier to verify that references to it will not escape
from the softirq input packet processing context.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Patch: reduce skb input dev on 64 bit machines

2005-07-24 Thread David S. Miller
From: Thomas Graf [EMAIL PROTECTED]
Date: Sat, 23 Jul 2005 16:29:42 +0200

 I think we should head for a 16bit ifindex at some point
 or is there any reason why 65K interfaces wouldn't be
 sufficient?

I think we are locked into using s32 for this.

I can definitely imagine virtual interfaces that could
tally into numbers far larger than 64K.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Patch: reduce skb input dev on 64 bit machines

2005-07-24 Thread Jamal Hadi Salim
On Sun, 2005-24-07 at 19:02 -0700, David S. Miller wrote:

I agree they will mean the same thing. 
The example i gave was to show where one meaning contradicts the other.

 And the current specific settings of input_dev has the following
 problems:
 
 1) you cannot ingress classify on tunneling virtual devices, since
-input_dev only gets set for (some) hardware device types
 
 2) not all hardware device types are supported, because we have
to add specific settings of -input_dev, which is just broken
 

I worry that not all possible packet origins have been caught.
But thats why the printk exists with no ratelimiting ;-
Infact, thats exactly how we caught/fixed most.

 I have never ever seen one good argument for the way input_dev
 is set now, and your new response is no different.
 

The input_dev could be set by different actions depending on some policy
(whether it is mirred action redirecting to ingress or simple
meta action just setting it to something just because the admin is on
crack). I dont wanna prejudge why it is set. I know of a few places
where i can clearly set them (which is where they are being set at the
moment) and I am trying to catch others with that printk. The rest is
left to the clever admin.

 Why, for example, are hardware devices any different from virtual ones
 for ingress classification purposes? 

Weird. I just scanned the git/2614 and you are correct.
I am pretty sure that you ACKed quiet a few changes i sent that were
caught by the printk.
I attached the few i could find.

  A clean generic interface would
 consider them the same and equally specifyable.  Right?
 
 Every time a driver decapsulation happens (regardless of whether
 this is some hardware or virtual device), we have a new input
 device to classify upon.  I don't see why you wish to continue
 breaking this logical representation.
 

I dont wanna break any good stuff that exists. I just wanna provide
ability to add more places where i override things.

cheers,
jamal
--- a/net/ipv4/ip_output.c  2004/12/31 14:26:08 1.1
+++ b/net/ipv4/ip_output.c  2004/12/31 14:27:53
@@ -111,6 +111,7 @@
 #ifdef CONFIG_NETFILTER_DEBUG
nf_debug_ip_loopback_xmit(newskb);
 #endif
+   newskb-input_dev = newskb-dev;
netif_rx(newskb);
return 0;
 }
--- a/net/ipv6/ip6_output.c 2004-12-24 16:33:51.0 -0500
+++ b/net/ipv6/ip6_output.c 2004-12-31 10:29:47.505392096 -0500
@@ -102,7 +102,7 @@
newskb-pkt_type = PACKET_LOOPBACK;
newskb-ip_summed = CHECKSUM_UNNECESSARY;
BUG_TRAP(newskb-dst);
-
+   newskb-input_dev = newskb-dev;
netif_rx(newskb);
return 0;
 }
--- 2610-bk1/net/ipv4/xfrm4_input.c 2004/12/31 17:00:25 1.1
+++ 2610-bk1/net/ipv4/xfrm4_input.c 2004/12/31 17:01:05
@@ -142,6 +142,7 @@
dst_release(skb-dst);
skb-dst = NULL;
}
+   skb-input_dev = skb-dev;
netif_rx(skb);
return 0;
} else {
--- 2610-bk1/net/ipv4/ipmr.c2004/12/31 17:01:24 1.1
+++ 2610-bk1/net/ipv4/ipmr.c2004/12/31 17:01:54
@@ -1461,6 +1461,7 @@
((struct net_device_stats*)reg_dev-priv)-rx_bytes += skb-len;
((struct net_device_stats*)reg_dev-priv)-rx_packets++;
nf_reset(skb);
+   skb-input_dev = skb-dev;
netif_rx(skb);
dev_put(reg_dev);
return 0;
@@ -1516,6 +1517,7 @@
((struct net_device_stats*)reg_dev-priv)-rx_bytes += skb-len;
((struct net_device_stats*)reg_dev-priv)-rx_packets++;
skb-dst = NULL;
+   skb-input_dev = skb-dev;
nf_reset(skb);
netif_rx(skb);
dev_put(reg_dev);
--- 2610-bk1/net/ipv4/ip_gre.c  2004/12/31 17:02:02 1.1
+++ 2610-bk1/net/ipv4/ip_gre.c  2004/12/31 17:03:08
@@ -655,6 +655,7 @@
skb-dst = NULL;
nf_reset(skb);
ipgre_ecn_decapsulate(iph, skb);
+   skb-input_dev = skb-dev;
netif_rx(skb);
read_unlock(ipgre_lock);
return(0);
--- 2610-bk1/net/ipv4/ipip.c2004/12/31 17:03:24 1.1
+++ 2610-bk1/net/ipv4/ipip.c2004/12/31 17:03:52
@@ -497,6 +497,7 @@
skb-dst = NULL;
nf_reset(skb);
ipip_ecn_decapsulate(iph, skb);
+   skb-input_dev = skb-dev;
netif_rx(skb);
read_unlock(ipip_lock);
return 0;
--- 2610-bk1/net/ipv6/ip6_tunnel.c  2004/12/31 17:06:12 1.1
+++ 2610-bk1/net/ipv6/ip6_tunnel.c  2004/12/31 17:07:03
@@ -547,6 +547,7 @@
ip6ip6_ecn_decapsulate(ipv6h, skb);
t-stat.rx_packets++;
t-stat.rx_bytes += skb-len;
+   skb-input_dev = skb-dev;
netif_rx(skb);
read_unlock(ip6ip6_lock);
return 0;
--- 2610-bk1/net/ipv6/xfrm6_input.c 2004/12/31 17:07:18 1.1
+++ 2610-bk1/net/ipv6/xfrm6_input.c   

Re: Patch: reduce skb input dev on 64 bit machines

2005-07-23 Thread Thomas Graf
* Jamal Hadi Salim [EMAIL PROTECTED] 2005-07-23 09:32
 @@ -205,7 +205,6 @@ struct sk_buff {
   struct sock *sk;
   struct timeval  stamp;
   struct net_device   *dev;
 - struct net_device   *input_dev;
   struct net_device   *real_dev;
  
   union {
 @@ -231,7 +230,7 @@ struct sk_buff {
  
   struct  dst_entry   *dst;
   struct  sec_path*sp;
 -
 + int input_dev;
   /*
* This is the control buffer. It is free to use for every
* layer. Please put your private variables there. If you

I think we should head for a 16bit ifindex at some point
or is there any reason why 65K interfaces wouldn't be
sufficient?  We waste one bit at the moment to have simpler
error handling for all the functions returning indices.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Patch: reduce skb input dev on 64 bit machines

2005-07-23 Thread Jamal Hadi Salim
On Sat, 2005-23-07 at 16:29 +0200, Thomas Graf wrote:

 I think we should head for a 16bit ifindex at some point
 or is there any reason why 65K interfaces wouldn't be
 sufficient?  We waste one bit at the moment to have simpler
 error handling for all the functions returning indices.

Strange, i didnt see my original email echo back to me. Must be the
stupid ISP again. I thought things would work well after we moved to
vger.

ifindex has to be 32 bit because most management apps (SNMP etc) expect
it to be 32 bit. I havent scanned the code but making int actually
doesnt seem right - unless some code is expecting it to be a -ve number
somewhere; it should be type u32.

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Patch: reduce skb input dev on 64 bit machines

2005-07-23 Thread Thomas Graf
* Jamal Hadi Salim [EMAIL PROTECTED] 2005-07-23 10:37
 ifindex has to be 32 bit because most management apps (SNMP etc) expect
 it to be 32 bit. I havent scanned the code but making int actually
 doesnt seem right - unless some code is expecting it to be a -ve number
 somewhere; it should be type u32.

No, we use s32 at the moment with the limitation that negative indices
are illegal.  We don't have to break the interfaces just because of this,
but since dev_new_ifindex() will be reusing gone indices if needed it
might be possible to at least use u16 within the kernel. 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Patch: reduce skb input dev on 64 bit machines

2005-07-23 Thread Jamal Hadi Salim
On Sat, 2005-23-07 at 16:50 +0200, Thomas Graf wrote:

 No, we use s32 at the moment with the limitation that negative indices
 are illegal.  We don't have to break the interfaces just because of this,
 but since dev_new_ifindex() will be reusing gone indices if needed it
 might be possible to at least use u16 within the kernel. 

It's not worth it i think - too huge a patch and will deviate from years
of what people have been expecting it to be; i just checked some of the
snmp code and some RFCs and they do expect it to be s32. 

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Patch: reduce skb input dev on 64 bit machines

2005-07-23 Thread Patrick McHardy

Jamal Hadi Salim wrote:
This is part of mission skb diet. 
Against  git/davem/2.6.14 that was on vger 30 minutes back: It changes
input_dev to be an ifindex so we dont bother holding devices. 
Would only cut a 4 byte fat on 64 bit machines.


Signed-off-by: Jamal Hadi Salim [EMAIL PROTECTED]


Let me propose again to just set it here for all cases. So far there
hasn't been a single exception where indev is not the input device
as seen by netif_rx(), and I don't expect any to come up. In any case
you should guard this printk by net_ratelimit() to avoid spamming
peoples logs.



+   if (!skb-input_dev) {
+   skb-input_dev = skb-dev-ifindex;
+   printk(ing_filter: Repaired input to %s 
\n,skb-dev-name);
}

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Patch: reduce skb input dev on 64 bit machines

2005-07-23 Thread Ben Greear

Jamal Hadi Salim wrote:

On Sat, 2005-23-07 at 16:50 +0200, Thomas Graf wrote:



No, we use s32 at the moment with the limitation that negative indices
are illegal.  We don't have to break the interfaces just because of this,
but since dev_new_ifindex() will be reusing gone indices if needed it
might be possible to at least use u16 within the kernel. 



It's not worth it i think - too huge a patch and will deviate from years
of what people have been expecting it to be; i just checked some of the
snmp code and some RFCs and they do expect it to be s32. 


Besides, who's to say 65000 interfaces is adequate? :)

Ben



cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html




--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc  http://www.candelatech.com

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Patch: reduce skb input dev on 64 bit machines

2005-07-23 Thread Ben Greear

Thomas Graf wrote:


I think we should head for a 16bit ifindex at some point
or is there any reason why 65K interfaces wouldn't be
sufficient?  We waste one bit at the moment to have simpler
error handling for all the functions returning indices.


Although I don't know anyone doing so, it would not be too hard to
put more than 65k 802.1Q VLANs on a single machine

Ben

--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc  http://www.candelatech.com

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html