subject:"\[PATCH 2\/3\] \[VLAN\]\: Update iif when receiving via VLAN device"

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-07-09 Thread Jamal Hadi Salim

On Sun, 2006-09-07 at 08:52 -0400, Jamal Hadi Salim wrote:

Ok, didnt complete my thought there ..

 I dont like it, like i said, because it adds one more check
 in the first path. I was contemplating at some point even not doing the 


.. not doing the  check for device up and just shove the packet at it.

cheers,
jamal

PS:-  warning on response still applies.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-07-09 Thread Thomas Graf

* Jamal Hadi Salim [EMAIL PROTECTED] 2006-07-09 08:52
 On Sun, 2006-09-07 at 01:46 +0200, Thomas Graf wrote:
  * Jamal Hadi Salim [EMAIL PROTECTED] 2006-07-08 10:14
 [..]
   There is no easy way to detect such a deadlock. I think it reduces to
   the same case as eth0-eth0, i.e:
  
  It's very simple. Just use the same method and add a flag if a mirred
  is currently in use as Herbert did to qdisc_run().
  
 
 But that would make it more complex in the sense it would require a lot
 more code.
 
 Another simpler approach is to check for recursion right in 

It's very interesting that you change from there is no easy way
to another simpler approach... in just one posting :-)

 dev_queue_xmit() - but i did not dare to do that because i could not
 justify it for any other code other than loops created by the action
 code. i.e something along the lines of:
 
 if (q-enqueue) {
   if (!spin_trylock(dev-queue_lock)) {
   printk(yikes recursed device %s\n,dev-name);
 goto tx_recursion_detected;
 }

That's not gonna work, dev-queue_lock may be held legimitately
by someone else than an underlying dev_queue_xmit() call.

 [BTW, notice how i used code to describe my view above ;-]

I'm very proud of you.

 Again, I was hoping that Herbert's stuff may change this view (and
 therefore give it more legitimacy) - but if the above can be accepted
 then we can forget touching mirred.

You need this:

if (test_and_set_bit(__LINK_STATE_ENQUEUEING, dev-state))
goto tx_recursion_detected;

spin_lock(queue_lock);
enqueue(...)
qdisc_run()
spin_unlock(queue_lock);

clear_bit(__LINK_TATE_ENQUEUEING, dev-state);

Unfortunately we'd still need __LINK_STATE_QDISC_RUNNING due
to net_tx_action().

 No Thomas - this is a valid check. As valid as mirred checking if device
 is up first. I dont like it, like i said, because it adds one more check
 in the first path. I was contemplating at some point even not doing the 

I'm not going to argue about this.

 By mirred deadlock i think you mean the deadlock that happens in
 dev_queue_xmit()? 

I meant the deadlock within mirred but the deadlock on queue_lock
is happening first anyway.

 What is still not clear above?

Frankly, I have no idea which deadlocks are intentional, what ideas
you have about ingress-egress, etc because it is not documented.
The list is very long. I'm not going to waste time trying to work
out a patch that will then not suit the ideas in your mind and on
your notes. You're maintaing this code, no?
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-07-09 Thread Jamal Hadi Salim

On Sun, 2006-09-07 at 15:33 +0200, Thomas Graf wrote:
 * Jamal Hadi Salim [EMAIL PROTECTED] 2006-07-09 08:52
[..]
  Another simpler approach is to check for recursion right in 
 
 It's very interesting that you change from there is no easy way
 to another simpler approach... in just one posting :-)
 

I said there was no easy way of detecting - which is different from
preventing. Both approaches i showed were things i used (as far back as
last year) but did not find palatable to submit.
Hopefully we dont go into a tangent on this.

  if (q-enqueue) {
  if (!spin_trylock(dev-queue_lock)) {
  printk(yikes recursed device %s\n,dev-name);
  goto tx_recursion_detected;
  }
 
 That's not gonna work, dev-queue_lock may be held legimitately
 by someone else than an underlying dev_queue_xmit() call.
 

If there is a legitimate reason then it wont work. I cant think of one
though.

  [BTW, notice how i used code to describe my view above ;-]
 
 I'm very proud of you.
 

Thank you thank you

  Again, I was hoping that Herbert's stuff may change this view (and
  therefore give it more legitimacy) - but if the above can be accepted
  then we can forget touching mirred.
 
 You need this:
 
 if (test_and_set_bit(__LINK_STATE_ENQUEUEING, dev-state))
   goto tx_recursion_detected;
 
 spin_lock(queue_lock);
 enqueue(...)
 qdisc_run()
 spin_unlock(queue_lock);
 
 clear_bit(__LINK_TATE_ENQUEUEING, dev-state);
 
 Unfortunately we'd still need __LINK_STATE_QDISC_RUNNING due
 to net_tx_action().
 

This is also another approach that would work. If you think its simpler
go ahead and shoot a patch.

  By mirred deadlock i think you mean the deadlock that happens in
  dev_queue_xmit()? 
 
 I meant the deadlock within mirred but the deadlock on queue_lock
 is happening first anyway.
 

if you can stop things at the queue you wont have to worry about mirred.

  What is still not clear above?
 
 Frankly, I have no idea which deadlocks are intentional, what ideas
 you have about ingress-egress, etc because it is not documented.
 The list is very long. 

Ask one question at a time and i will answer. Some of the things we
discussed have no place to be commented-on in the code. But i think what
is obvious is:

A-*-A is a no-no.
And in some cases it is fine to let the user just fsck themselves
because then they will understand it is a bad idea [1] when shit
happens. OTOH, if there was a KISS way of doing it (as in the ifb case,
why not).

 I'm not going to waste time trying to work
 out a patch that will then not suit the ideas in your mind and on
 your notes. You're maintaing this code, no?

Yes, of course otherwise i wouldnt bother to comment on any patches.

cheers,
jamal

[1] there was a long discussion a few years back when i was still in
lkml on someone trying to redirect (i think) /dev/null - /dev/mem
(dont hold me accountable if those are not the right devices, just
absorb the moral of this instead). The consensus was it would be very
difficult to do without making a lot of changes and if someone wishes to
do that they deserved what they are getting.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-07-09 Thread Thomas Graf

* Jamal Hadi Salim [EMAIL PROTECTED] 2006-07-09 10:03
 On Sun, 2006-09-07 at 15:33 +0200, Thomas Graf wrote:
  That's not gonna work, dev-queue_lock may be held legimitately
  by someone else than an underlying dev_queue_xmit() call.
  
 
 If there is a legitimate reason then it wont work. I cant think of one
 though.

See sch_generic.c, it's documented. A simple grep on queue_lock
would have told you the same.

 This is also another approach that would work. If you think its simpler
 go ahead and shoot a patch.

It's not simpler, it's correct, while your patch is wrong.

 A-*-A is a no-no.
 And in some cases it is fine to let the user just fsck themselves
 because then they will understand it is a bad idea [1] when shit
 happens. OTOH, if there was a KISS way of doing it (as in the ifb case,
 why not).

I remind you that you started mentioning this A-*-A case while
talking about tx deadlocks that were supposed to be prevented with
the !from check or something along that lines. I can't really tell
because you explain it differently in every posting.

 Yes, of course otherwise i wouldnt bother to comment on any patches.

So maintain the code and fix your bugs.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-07-09 Thread Jamal Hadi Salim

On Sun, 2006-09-07 at 16:19 +0200, Thomas Graf wrote:
 * Jamal Hadi Salim [EMAIL PROTECTED] 2006-07-09 10:03
  On Sun, 2006-09-07 at 15:33 +0200, Thomas Graf wrote:
   That's not gonna work, dev-queue_lock may be held legimitately
   by someone else than an underlying dev_queue_xmit() call.
   
  
  If there is a legitimate reason then it wont work. I cant think of one
  though.
 
 See sch_generic.c, it's documented. A simple grep on queue_lock
 would have told you the same.
 

If you mean that the device will also try to grab the qlock there, then
that is fine still for the serialization. It all starts at
dev_queue_xmit.

  This is also another approach that would work. If you think its simpler
  go ahead and shoot a patch.
 
 It's not simpler, it's correct, while your patch is wrong.
 

Ok, calm down, will you? Man, you make it very hard to follow Daves
Tibetan approach. Let me tell you exactly how i feel about your
approach: It is unnecessarily complex. The approach i posted is not only
fine, it works with only 4-5 lines of code; i have numerous tests
against it over a long period of time. I was trying to be polite and
follow Daves advice not to insist it has to be done my way - it almost
seems impossible with you trying to nitpick on little unimportant
details.

  A-*-A is a no-no.
  And in some cases it is fine to let the user just fsck themselves
  because then they will understand it is a bad idea [1] when shit
  happens. OTOH, if there was a KISS way of doing it (as in the ifb case,
  why not).
 
 I remind you that you started mentioning this A-*-A case while
 talking about tx deadlocks that were supposed to be prevented with
 the !from check or something along that lines. I can't really tell
 because you explain it differently in every posting.
 

Because you are looking for preciseness at the micro/code level and i am
trying to explain at the conceptual/macro level (I have to adjust to
your mode but it is hard for me because i dont think at our level that
matters). When i sense you didnt understand me the last time, I try to
explain it better. The deadlock happens on transmit. If you want to be
precise, it happens when dev_queue_xmit is invoked. If you want more
preciseness, when the queue lock is contended. If you want more
preciseness, the code sample that i showed you should help illustrate
it.

  Yes, of course otherwise i wouldnt bother to comment on any patches.
 
 So maintain the code and fix your bugs.

Ok, I dont think this is gonna work - I may have to give up on getting
any resolution. 

Heres the suggestion again and you can still shoot a patch:
- If we can avoid doing anything at mirred that is the most preferable
approach. i.e get the change for free. In other words if it complicates
things it is not worth it. Someone redirecting eth0 to eth0 on egress
deserves whats happening to them.

- Try it against Herberts patch. It may resolve something or may require
an adjustment to Herberts patch so we can kill two birds with one stone
i.e get it for free. I dont know. I guess i shouldnt say i dont know
it is not precise, my point is you may find things when you attempt the
change. The patch is in Daves 2.6.18 tree.

- Iam fine with your suggestion to use a flag. The problem is not which
scheme you use - it is whether the change at such a crucial path in the
stack would be acceptable to other people.

Anyways, I am off.

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-07-09 Thread Thomas Graf

* Jamal Hadi Salim [EMAIL PROTECTED] 2006-07-09 11:00
 If you mean that the device will also try to grab the qlock there, then
 that is fine still for the serialization. It all starts at
 dev_queue_xmit.

Look at where dev-queue_lock is taken, whenever a qdisc or
filter is added, modified or deleted the lock is taken. Using
your approach packets get dropped while such an operation is
taking place. Your approach is wrong.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-07-08 Thread Thomas Graf

* jamal [EMAIL PROTECTED] 2006-07-01 09:35
 On Sat, 2006-01-07 at 13:28 +0200, Thomas Graf wrote:
  Please enlighten me, I may be making a wrong assumption again.
  
  1) tc_verd is reset to 0 after dq in ri_tasklet()
  2) TC_AT is set to AT_EGRESS in dev_queue_xmit()
  3) TC_FROM being derived from TC_AT in tcf_mirred() when redirecting
  4) TC_FROM has the value AT_EGRESS when entering ifb_xmit()
  
 
 Let me answer the next bit and it may clear this.

It's pretty clear actually, given eth0-ifb0-ifb1 it would look
like:

dev_queue_xmit(eth0)
  tcf_mirred - ifb0
dev_queue_xmit(ifb0)
  tcf_mirred - ifb1
dev_queue_xmit(ifb1)
  ifb_xmit(ifb1)
  QUEUE /* Here we queue the packet for the first time and
 release the stack of tx locks acquired on the
 way. TC_FROM was never reset up here so this can't
 possibly prevent any tx deadlocks. However, the
 !from check is effective later on ... */
  ri_tasklet(ifb1)
dev_queue_xmit(ifb0)
  ifb_xmit(ifb0) /* classification disabled */
/* Drop due to !from with input_dev = ifb1 which is
   good as it prevents to loop the packet back to
   ifb1 again which I refered to earlier */

Is this how it was intented?

 using what i said as looping in the case of A-B-C-D-A for the case
 of egress since that is what you are alluding to;
 note that ifb0 will be entered twice: First for example the tasklet in
 ifb0 will emit the packet, and then it will go all the way back to xmit
 on ifb0. This is where the issue is. Does that clear it?

I tried to stay out of A-B-A for now since that's currently
broken due to mirred unless the deadlock is intentional, f.e.
when setting up eth0-ifb0-eth0 like this:

ip link set ifb0 up
tc qdisc add dev ifb0 parent root handle 1: prio
tc qdisc add dev eth0 parent root handle 1: prio
tc filter add dev eth0 parent 1: protocol ip prio 10 u32
  match ip protocol 1 0xff flowid 1:1
  action mirred egress redirect dev ifb0
tc filter add dev ifb0 parent 1: protocol ip prio 10 u32
  match ip protocol 1 0xff flowid 1:1
  action mirred egress redirect dev eth0

The following deadlock will ocur:

dev_queue_xmit(eth0)
  tcf_mirred - ifb0  /* acquire tcf_mirred-lock on eth0 */
dev_queue_xmit(ifb0)
  tcf_mirred - eth0  /* acquire tcf_mirred-lock on ifb0
dev_queue_xmit(eth0)
  tcf_mirred - ifb0  /* deadlock */

This is assuming that no tx deadlock happens of course. I
did try this out just to make sure, the machine just hung.
I can't see any code trying to prevent this but this is
another discussion as ifb is not involved in this, I think
it's purely a problem of mirred.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-07-08 Thread Jamal Hadi Salim

On Sat, 2006-08-07 at 10:14 -0400, Jamal Hadi Salim wrote:

 I have a dated  patch to mirred (may not apply cleanly) 

Sorry forgot to attach the patch. Attached for real this time;-

 that i believe
 will fix this specific one. Try to see if it also fixes this case you
 have.

I meant i know this works for eth0-eth0 i am not sure if it will fix
your specific case. I need my laptop at the moment ;-
If it does it will be an ok solution but not the best (because it
introduces an unnecessary check for the common case).

cheers,
jamal
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index 4fcccbd..d8946b3 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -208,6 +208,12 @@ bad_mirred:
 
 	skb2-dev = dev;
 	skb2-input_dev = skb-dev;
+	if (skb2-input_dev == skb2-dev) {
+		if (net_ratelimit())
+			printk( Mirred: Loop detected to %s\n,skb2-dev-name);
+		goto bad_mirred;
+	}
+
 	dev_queue_xmit(skb2);
 	spin_unlock(p-lock);
 	return p-action;

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-07-08 Thread Thomas Graf

* Thomas Graf [EMAIL PROTECTED] 2006-07-09 01:46
 __LINK_STATE_QDISC_RUNNING will prevent an eventual tx deadlock,
 it will not prevent the mirred deadlock.

BTW, it will also not protect you from deadlocking on
dev-queue_lock while enqueueing.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-07-01 Thread Thomas Graf

* jamal [EMAIL PROTECTED] 2006-06-30 22:59
 On Sat, 2006-01-07 at 01:45 +0200, Thomas Graf wrote:
  Fact is that the from verdict is set to a meaningful value again at 
  dev_queue_xmit() or ing_filter() so ifb_xmit() only sees valid values. 
 
 Ok, Thomas this is one of those places where we end up colliding for no
 good reason - your statement above is accusatory. And sometimes i say 3
 things and you pick one and use that as context. The ifb does clear the
 field. So if you get redirected again, it will be 0.

Please enlighten me, I may be making a wrong assumption again.

1) tc_verd is reset to 0 after dq in ri_tasklet()
2) TC_AT is set to AT_EGRESS in dev_queue_xmit()
3) TC_FROM being derived from TC_AT in tcf_mirred() when redirecting
4) TC_FROM has the value AT_EGRESS when entering ifb_xmit()

 different instances of the same device do not deadlock. If however, you
 have a series of devices in which A-B-C-D-A then you will have a
 possible deadlock. In the case of the ifb i dissallowed it because it
 was obvious from the testing. 

Correct me if I'm wrong. The skb is queued between each device. When
moving skbs from rq to tq the tx lock is acquired, if not available
the packet is requeued for later. Due to the queueing the tx lock
of the previous device is released. Where exactly is the deadlock?
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-07-01 Thread Thomas Graf

* jamal [EMAIL PROTECTED] 2006-06-30 22:23
 I am  not certain i understood then: Are we in the mode where the
 refcount is not needed because chances are small that a device will
 disappear? It seems to me after all this trouble that it may not be so
 bad to refcount (I guess i meant refcount the device on input to the ifb
 and decrement on the output).

Based on two principles:

 a) All netdevices a packet is processed through is properly
refcounted until the packet is queued for the first time.
Given that iif is strictly set to the previous device, a
refcnt is certainly taken up to the point where the skb is
put into rq. There is one exception to this: The fact that
you don't set skb-dev = ifb when reinjecting at ingress
will not take a refcnt on the ifb. This sounds like a broken
architecture to me but it doesn't matter as you want to
prohibit ifb - ifb anyways.
 
 b) The netdevice used for xmit via dev_queue_xmit() is already
protected by the initial xmit attempt.

I can't see reason why to hurt performance by introducing more
atomic operations in your fast path.

If you have more concerns regarding these patches, feel free to
fix your own architecture or propose to drop the patches, I think
I've spend enough time on this.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-07-01 Thread jamal

On Sat, 2006-01-07 at 13:28 +0200, Thomas Graf wrote:
 * jamal [EMAIL PROTECTED] 2006-06-30 22:59

 Please enlighten me, I may be making a wrong assumption again.
 
 1) tc_verd is reset to 0 after dq in ri_tasklet()
 2) TC_AT is set to AT_EGRESS in dev_queue_xmit()
 3) TC_FROM being derived from TC_AT in tcf_mirred() when redirecting
 4) TC_FROM has the value AT_EGRESS when entering ifb_xmit()
 

Let me answer the next bit and it may clear this.

  different instances of the same device do not deadlock. If however, you
  have a series of devices in which A-B-C-D-A then you will have a
  possible deadlock. In the case of the ifb i dissallowed it because it
  was obvious from the testing. 
 
 Correct me if I'm wrong. The skb is queued between each device. When
 moving skbs from rq to tq the tx lock is acquired, if not available
 the packet is requeued for later. Due to the queueing the tx lock
 of the previous device is released. Where exactly is the deadlock?

using what i said as looping in the case of A-B-C-D-A for the case
of egress since that is what you are alluding to;
note that ifb0 will be entered twice: First for example the tasklet in
ifb0 will emit the packet, and then it will go all the way back to xmit
on ifb0. This is where the issue is. Does that clear it?

cheers,
jamal


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-07-01 Thread jamal

On Sat, 2006-01-07 at 13:51 +0200, Thomas Graf wrote:
 * jamal [EMAIL PROTECTED] 2006-06-30 22:23
  I am  not certain i understood then: Are we in the mode where the
  refcount is not needed because chances are small that a device will
  disappear? It seems to me after all this trouble that it may not be so
  bad to refcount (I guess i meant refcount the device on input to the ifb
  and decrement on the output).
 
 Based on two principles:
 
  a) All netdevices a packet is processed through is properly
 refcounted until the packet is queued for the first time.
 Given that iif is strictly set to the previous device, a
 refcnt is certainly taken up to the point where the skb is
 put into rq. 

yes, but that refcount may be lost by the time it is dequeued and
retransmitted. Am i wrong thinking so?

 There is one exception to this: The fact that
 you don't set skb-dev = ifb when reinjecting at ingress
 will not take a refcnt on the ifb. This sounds like a broken
 architecture to me but it doesn't matter as you want to
 prohibit ifb - ifb anyways.

Ok, Thomas: Please stop critiqueing the architecture non-stop. You will
not convince me of anything this way. Lets just focus on this issue and
then you can go back and critique all you want.

  
  b) The netdevice used for xmit via dev_queue_xmit() is already
 protected by the initial xmit attempt.
 

Is it guaranteed? if yes, the patch is perfect.

 I can't see reason why to hurt performance by introducing more
 atomic operations in your fast path.
 

well, i would be fine with this - with a caveat that nothing really has
changed in ifb then, no? i.e the value of the patch in that case would
be to convert input_dev to iif, correct? If yes, this is fine by me
since as we have discussed the likelihood of this was small.

 If you have more concerns regarding these patches, feel free to
 fix your own architecture or propose to drop the patches, I think
 I've spend enough time on this.

Again, you are not being helpful by throwing in side remarks like these.
I am being very fair to you when you ask questions on how X works after
all assumptions you make - yet i ask questions which i see as valid and
you start throwing your hands off in the air. And then Patrick says i am
creating endless debates; this is why we go into loops.
Just answer the questions: I am trying to understand or you can choose
to ignore the emails all together.

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread Thomas Graf

* jamal [EMAIL PROTECTED] 2006-06-29 21:11
 Heres what it would look at ingress:
 
 step 0: coming from wire via eth0,
 dev=eth0, input_dev=eth0
 
 step 1: redirect to ifb0, leaving redirect
 dev=ifb0, input_dev=eth0
 
 step 2: leaving ifb0, coming back to ingress side of stack
 
 dev= eth0, input_dev=ifb0

That creates a nice loop on ingress. Upon reentering the
stack with skb-dev set to eth0 again we'll go through the
same ingress filters as the first time and we'll hit ifb0
again over and over. Are you suggesting everyone has to
insert a pass action matching input_dev in order to escape
the loop when using ifb?

  When leaving ifb0 you want for...
  ... egress:
 skb-dev=to (eth0) skb-iif=from (ifb0)
  ... ingress:
 skb-dev=at (ifb0) skb-iif=from (eth0)
  
 
 Yes, this is correct. I described the flow of the first one in the
 earlier email and the ingress side.

How can it be correct if it differs from your description
above? What I described is what the patch changes it to.
Looking closer at ifb it contains a race when updating
skb-dev. Preempt has to be disabled when updating skb-dev
before calling netif_rx() otherwise the device might disappear.

[NET]: Use interface index to keep input device information

Using the interface index instead of a direct reference
allows a safe usage beyond the scope where an interface
could disappear.

The old input_dev field was incorrectly made dependant
on CONFIG_NET_CLS_ACT in skb_copy().

The ifb device is fixed to set skb-dev in a manner that
the device can't disappear before calling netif_rx() and
the semantics are fixed so a packet reentering the stack
looks like it would have been received on the ifb device.

Signed-off-by: Thomas Graf [EMAIL PROTECTED]

Index: net-2.6.git/include/linux/skbuff.h
===
--- net-2.6.git.orig/include/linux/skbuff.h
+++ net-2.6.git/include/linux/skbuff.h
@@ -181,7 +181,6 @@ enum {
  * @sk: Socket we are owned by
  * @tstamp: Time we arrived
  * @dev: Device we arrived on/are leaving by
- * @input_dev: Device we arrived on
  * @h: Transport layer header
  * @nh: Network layer header
  * @mac: Link layer header
@@ -192,6 +191,7 @@ enum {
  * @data_len: Data length
  * @mac_len: Length of link layer header
  * @csum: Checksum
+ * @iif: Device we arrived on
  * @local_df: allow local fragmentation
  * @cloned: Head may be cloned (check refcnt to be sure)
  * @nohdr: Payload reference only, must not modify header
@@ -228,7 +228,6 @@ struct sk_buff {
struct sock *sk;
struct skb_timeval  tstamp;
struct net_device   *dev;
-   struct net_device   *input_dev;
 
union {
struct tcphdr   *th;
@@ -266,6 +265,7 @@ struct sk_buff {
data_len,
mac_len,
csum;
+   int iif;
__u32   priority;
__u8local_df:1,
cloned:1,
Index: net-2.6.git/include/net/pkt_cls.h
===
--- net-2.6.git.orig/include/net/pkt_cls.h
+++ net-2.6.git/include/net/pkt_cls.h
@@ -352,14 +352,19 @@ tcf_change_indev(struct tcf_proto *tp, c
 static inline int
 tcf_match_indev(struct sk_buff *skb, char *indev)
 {
+   int ret = 1;
+
if (indev[0]) {
-   if  (!skb-input_dev)
-   return 0;
-   if (strcmp(indev, skb-input_dev-name))
+   struct net_device *dev;
+
+   dev = dev_get_by_index(skb-iif);
+   if  (!dev)
return 0;
+   ret = !strcmp(indev, dev-name);
+   dev_put(dev);
}
 
-   return 1;
+   return ret;
 }
 #endif /* CONFIG_NET_CLS_IND */
 
Index: net-2.6.git/net/core/dev.c
===
--- net-2.6.git.orig/net/core/dev.c
+++ net-2.6.git/net/core/dev.c
@@ -1715,8 +1715,8 @@ static int ing_filter(struct sk_buff *sk
if (dev-qdisc_ingress) {
__u32 ttl = (__u32) G_TC_RTTL(skb-tc_verd);
if (MAX_RED_LOOP  ttl++) {
-   printk(Redir loop detected Dropping packet (%s-%s)\n,
-   skb-input_dev-name, skb-dev-name);
+   printk(Redir loop detected Dropping packet (%d-%s)\n,
+   skb-iif, skb-dev-name);
return TC_ACT_SHOT;
}
 
@@ -1749,8 +1749,8 @@ int netif_receive_skb(struct sk_buff *sk
if (!skb-tstamp.off_sec)
net_timestamp(skb);
 
-   if (!skb-input_dev)
-   skb-input_dev = skb-dev;
+   if (!skb-iif)
+   skb-iif = skb-dev-ifindex;
 
orig_dev = skb_bond(skb);
 
Index: net-2.6.git/net/core/skbuff.c

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread Nicolas Dichtel


Thomas Graf a écrit :

* Nicolas Dichtel [EMAIL PROTECTED] 2006-06-30 15:16

That creates a nice loop on ingress. Upon reentering the
stack with skb-dev set to eth0 again we'll go through the
same ingress filters as the first time and we'll hit ifb0
again over and over. Are you suggesting everyone has to
insert a pass action matching input_dev in order to escape
the loop when using ifb?

Bit 8 of skb-tc_verd is set by IFB, so packet isn't reclassify.
This bit avoid the loop.


Right, my mistake. Just making classification impossible after
going through an ifb device certainly seems like a perfect
idea, nice!

Classification has already be done on the first pass. Mirror
action must be the last classifier.

Nicolas
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread jamal

On Fri, 2006-30-06 at 15:08 +0200, Thomas Graf wrote:
 * jamal [EMAIL PROTECTED] 2006-06-29 21:11
  Heres what it would look at ingress:
  
  step 0: coming from wire via eth0,
  dev=eth0, input_dev=eth0
  
  step 1: redirect to ifb0, leaving redirect
  dev=ifb0, input_dev=eth0
  
  step 2: leaving ifb0, coming back to ingress side of stack
  
  dev= eth0, input_dev=ifb0
 
 That creates a nice loop on ingress. Upon reentering the
 stack with skb-dev set to eth0 again we'll go through the
 same ingress filters as the first time and we'll hit ifb0
 again over and over. 

loops are taken care of by other metadata. 

 Are you suggesting everyone has to
 insert a pass action matching input_dev in order to escape
 the loop when using ifb?
 

This works, there are no loops. If you use a meta setter and changed
input_dev to something that creates a loop it will still be caught when
ttls expire.

   When leaving ifb0 you want for...
   ... egress:
  skb-dev=to (eth0) skb-iif=from (ifb0)
   ... ingress:
  skb-dev=at (ifb0) skb-iif=from (eth0)
   
  
  Yes, this is correct. I described the flow of the first one in the
  earlier email and the ingress side.
 
 How can it be correct if it differs from your description
 above? What I described is what the patch changes it to.

Double check again: it works as described above; your change messes it.

 Looking closer at ifb it contains a race when updating
 skb-dev. Preempt has to be disabled when updating skb-dev
 before calling netif_rx() otherwise the device might disappear.
 

I am going to ignore the patch until we resolve the issue of iif vs
input_dev. Why dont we discuss that?

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread jamal

On Fri, 2006-30-06 at 16:15 +0200, Thomas Graf wrote:
 * jamal [EMAIL PROTECTED] 2006-06-30 09:57
  On Fri, 2006-30-06 at 15:08 +0200, Thomas Graf wrote:
   That creates a nice loop on ingress. Upon reentering the
   stack with skb-dev set to eth0 again we'll go through the
   same ingress filters as the first time and we'll hit ifb0
   again over and over. 
  
  loops are taken care of by other metadata. 
 
 Not really, assuming a simple setup such as:
 
   mirred attached to eth0 redirecting to ifb0
   mirred attached to ifb0 redirecting to ifb1
 
 will result in:
 
 eth0::tcf_mirred() skb-dev = ifb0, input_dev = eth0
 ifb0::tcf_mirred() skb-dev = ifb1, input_dev = ifb0
 ifb1::ifb_xmit()   skb-dev = ifb0, input_dev = ifb1, set ncls bit
 ifb0::ifb_xmit()   skb-dev = ifb1, input_dev = ifb0
 ifb1::ifb_xmit()   skb-dev = ifb0, input_dev = ifb1
 ...
 
 Oh dear... and we don't even have a ttl to catch this.
 

Did you actually try to run this before you reached this conclusion?

  I am going to ignore the patch until we resolve the issue of iif vs
  input_dev. Why dont we discuss that?
 
 It's starting to get useless to discuss with you. 

sigh.
ok, why dont you take a deep breath or a break because i think we are
not going to make any progress this way. I know you may be feeling
frustrated but i am equally if not more frustrated as well.
Our conflict comes in the fact that you are trying to change the
architecture in the name of fixing bugs. 
With all due respect, the architecture works. I have invested many many
hours testing and verifying. There may be coding bugs - and those need
fixing. Kill the bugs.

 You agreed in
 your last posting that the 3rd option, being that not caring about
 whether a device might disappear but having a way to check for it,
 is what we agreed on and what makes most sense, 

yes, i recalled that as the last discussion we had.

 yet you fail to see
 that using ifindex is exactly what reflects this descision.
 

The choice is between using an ifindex and a pointer to the device. 
Why is it solvable via an ifindex but not a device pointer?
I may have misunderstood you, so look at this as a fresh opportunity to
enlighten me. Forget about all the discussion weve had thus far.

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread Thomas Graf

* jamal [EMAIL PROTECTED] 2006-06-30 10:35
 On Fri, 2006-30-06 at 16:15 +0200, Thomas Graf wrote:
  eth0::tcf_mirred() skb-dev = ifb0, input_dev = eth0
  ifb0::tcf_mirred() skb-dev = ifb1, input_dev = ifb0
  ifb1::ifb_xmit()   skb-dev = ifb0, input_dev = ifb1, set ncls bit
  ifb0::ifb_xmit()   skb-dev = ifb1, input_dev = ifb0
  ifb1::ifb_xmit()   skb-dev = ifb0, input_dev = ifb1
  ...
  
  Oh dear... and we don't even have a ttl to catch this.
  
 
 Did you actually try to run this before you reached this conclusion?

I did, fortunately some other bug prevents this from happening,
packets are simply dropped somewhere.

 With all due respect, the architecture works. I have invested many many
 hours testing and verifying. There may be coding bugs - and those need
 fixing. Kill the bugs.

Right, just run this

tc filter add dev eth0 parent 1: protocol ip prio 10
  u32 match ip tos 0 0 flowid 1:1
  action mirred egress redirect dev ifb1
tc filter add dev ifb1 parent 1: protocol ip prio 10
  u32 match ip tos 0 0 flowid 1:1
  action mirred egress redirect dev ifb0


Anyways, I give up. Last time I've been running after you trying
to fix the many bugs you leave behind. Ever noticed that whenever
you add some new code it's someone else following up with tons of
small bugfix patches having a hard time trying to figure out the
actual intent. I'll just duplicate the code for my purpose, so
much easier.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread Patrick McHardy

Nicolas Dichtel wrote:
 Bit 8 of skb-tc_verd is set by IFB, so packet isn't reclassify.
 This bit avoid the loop.

It would, if something would actually set it.

~/src/kernel/linux-2.6$ grep NCLS -r net/
net/core/dev.c: if (skb-tc_verd  TC_NCLS) {
net/core/dev.c: skb-tc_verd = CLR_TC_NCLS(skb-tc_verd);
net/sched/act_api.c:if (skb-tc_verd  TC_NCLS) {
net/sched/act_api.c:skb-tc_verd = CLR_TC_NCLS(skb-tc_verd);
net/sched/act_api.c:D2PRINTK((%p)tcf_action_exec: cleared
TC_NCLS in %s out %s\n,
~/src/kernel/linux-2.6$

Am I missing something? Jamal, where is this bit supposed to be set?

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread Patrick McHardy

Patrick McHardy wrote:
 Nicolas Dichtel wrote:
 
Bit 8 of skb-tc_verd is set by IFB, so packet isn't reclassify.
This bit avoid the loop.
 
 
 It would, if something would actually set it.
 
 ~/src/kernel/linux-2.6$ grep NCLS -r net/
 net/core/dev.c: if (skb-tc_verd  TC_NCLS) {
 net/core/dev.c: skb-tc_verd = CLR_TC_NCLS(skb-tc_verd);
 net/sched/act_api.c:if (skb-tc_verd  TC_NCLS) {
 net/sched/act_api.c:skb-tc_verd = CLR_TC_NCLS(skb-tc_verd);
 net/sched/act_api.c:D2PRINTK((%p)tcf_action_exec: cleared
 TC_NCLS in %s out %s\n,
 ~/src/kernel/linux-2.6$
 
 Am I missing something? Jamal, where is this bit supposed to be set?

OK, I'm apparently missing the ability to read :) Please disregard ..
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread jamal

On Fri, 2006-30-06 at 18:32 +0200, Thomas Graf wrote:
 * jamal [EMAIL PROTECTED] 2006-06-30 10:35
 
  Did you actually try to run this before you reached this conclusion?
 
 I did, fortunately some other bug prevents this from happening,
 packets are simply dropped somewhere.
 

It is not a bug, Thomas! I am getting a little frustrated now.
The packets will be dropped because we set the at field to zero which is
invalid. That is done on purpose. It is only meaningful for ifb. The
challenge is much bigger than it appears. You could end up deadlocking
on the tx lock. So this was the choice i had to make.

  With all due respect, the architecture works. I have invested many many
  hours testing and verifying. There may be coding bugs - and those need
  fixing. Kill the bugs.
 
 Right, just run this
 
 tc filter add dev eth0 parent 1: protocol ip prio 10
   u32 match ip tos 0 0 flowid 1:1
 action mirred egress redirect dev ifb1
 tc filter add dev ifb1 parent 1: protocol ip prio 10
   u32 match ip tos 0 0 flowid 1:1
 action mirred egress redirect dev ifb0
 
 
 Anyways, I give up. 

I understood you when you first posted. This is part of my testing. 
Try also to redirect from the same device eth0 to eth0 etc and see some
more interesting things.

 Last time I've been running after you trying
 to fix the many bugs you leave behind. Ever noticed that whenever
 you add some new code it's someone else following up with tons of
 small bugfix patches having a hard time trying to figure out the
 actual intent. 

You could always ask instead of making assumptions. And dont get me wrong, I 
honestly do appreciate you fixing bugs. You have been great and i have
always thanked you - except in situations like this where you just
stress the hell out of me. Bugs will always be there. 
Even God has bugs - but you are not fixing bugs in this case. 

You are attempting to change architecture (which works just fine) in the way 
you 
think it should work - and then point to something as a bug because it 
doesnt work the way you think it should work. 
This is a problem not just with you BTW, but with Patrick as well (although he 
has 
gotten better lately). There is a huge difference for example when dealing with 
Herbert. My approach in situations like this, which you dont have to follow, is 
to ask first what the intent was then if i dont like the intent try to convince 
the
owner that there maybe better ways. Or why they are wrong. 

To make my case, look at what you just did above in just the last 2 emails: 
You made a claim there is a bug. 
I asked you if you had really tested what you are pointing to (because i
know i test for that). You come back and make claims the bug is
elsewhere. This could go on forever typically - and infact it is throughout this
thread. 
Didnt ask if this is perhaps the way it is supposed to work or what the
intent was. It has to be a bug and by golly you are fixing it. Put yourself in 
my shoes.

 I'll just duplicate the code for my purpose, so
 much easier.
 

Well, I am sorry you feel that way. I dont even remember where it is that 
we started.

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread Patrick McHardy

Thomas Graf wrote:
 * Thomas Graf [EMAIL PROTECTED] 2006-06-30 18:32
 
Anyways, I give up. Last time I've been running after you trying
to fix the many bugs you leave behind. Ever noticed that whenever
you add some new code it's someone else following up with tons of
small bugfix patches having a hard time trying to figure out the
actual intent. I'll just duplicate the code for my purpose, so
much easier.
 
 
 There you go, leaves ifb broken as-is, at least prevents it
 from crashing randomly when the input_dev disappears.

That would be a pity. After this patch has been ACKed, could we start
over with the other bugs one at a time? I wasn't really able to gather
the problems from this thread, but some of the behaviour you mentioned
does seem questionable.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread Ben Greear


Thomas Graf wrote:

* Thomas Graf [EMAIL PROTECTED] 2006-06-30 18:32


Anyways, I give up. Last time I've been running after you trying
to fix the many bugs you leave behind. Ever noticed that whenever
you add some new code it's someone else following up with tons of
small bugfix patches having a hard time trying to figure out the
actual intent. I'll just duplicate the code for my purpose, so
much easier.



There you go, leaves ifb broken as-is, at least prevents it
from crashing randomly when the input_dev disappears.

[NET]: Use interface index to keep input device information

Using the interface index instead of a direct reference
allows a safe usage beyond the scope where an interface
could disappear.

The old input_dev field was incorrectly made dependant
on CONFIG_NET_CLS_ACT in skb_copy().

Signed-off-by: Thomas Graf [EMAIL PROTECTED]



===
--- net-2.6.git.orig/drivers/net/ifb.c
+++ net-2.6.git/drivers/net/ifb.c
@@ -158,19 +158,23 @@ static int ifb_xmit(struct sk_buff *skb,
stats-tx_packets++;
stats-tx_bytes+=skb-len;
 
-	if (!from || !skb-input_dev) {

+   if (!from || !skb-iif) {


Since iif of 0 is valid (afaik), this check is now bogus, eh?

Ben

--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc  http://www.candelatech.com

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread jamal

On Fri, 2006-30-06 at 19:18 +0200, Patrick McHardy wrote:
 Thomas Graf wrote:
  * Thomas Graf [EMAIL PROTECTED] 2006-06-30 18:32
  
 Anyways, I give up. Last time I've been running after you trying
 to fix the many bugs you leave behind. Ever noticed that whenever
 you add some new code it's someone else following up with tons of
 small bugfix patches having a hard time trying to figure out the
 actual intent. I'll just duplicate the code for my purpose, so
 much easier.
  
  
  There you go, leaves ifb broken as-is, at least prevents it
  from crashing randomly when the input_dev disappears.
 
 That would be a pity. After this patch has been ACKed, could we start
 over with the other bugs one at a time? I wasn't really able to gather
 the problems from this thread, but some of the behaviour you mentioned
 does seem questionable.
 

I will summarize what the outstanding issues are, the rest of the bugs
just ignore otherwise the discussion is a waste of time and may
get out of control.

1) ifb references skb-input_dev
2) mirred sets the skb-input_dev which is used in #1

It is possible that when #1 happens infact input_dev is gone because no
ref count is incremented. Ok, Thomas is that sufficient to discuss the
crux of the matter?
At one point many months ago, the logic was for now the likelihood this
will happen is low but we need to cover for by at least figuring the
existence of input_dev when referencing it.

Thomas makes the claim, this can be achieved only by using an ifindex.
And i havent been able to see how. I have a small performance problem if
i just use ifindex. Using ifindex will eventually save 32 bits on the
64 bit machines. I posed the question as to which was more beneficial
as a solution that hasnt been addressed.


cheers,
jamal






-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread Thomas Graf

* jamal [EMAIL PROTECTED] 2006-06-30 13:19
 On Fri, 2006-30-06 at 18:32 +0200, Thomas Graf wrote:
  * jamal [EMAIL PROTECTED] 2006-06-30 10:35
  
   Did you actually try to run this before you reached this conclusion?
  
  I did, fortunately some other bug prevents this from happening,
  packets are simply dropped somewhere.
  
 
 It is not a bug, Thomas! I am getting a little frustrated now.
 The packets will be dropped because we set the at field to zero which is
 invalid. That is done on purpose. It is only meaningful for ifb. The
 challenge is much bigger than it appears. You could end up deadlocking
 on the tx lock. So this was the choice i had to make.

Please explain, tc_verd is reset in the tasklet after dequeueing and set
again in dev_queue_xmit(). ifb_xmit will always see a valid tc_verd
at egress.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread Patrick McHardy

jamal wrote:
 On Fri, 2006-30-06 at 19:18 +0200, Patrick McHardy wrote:
 
Thomas Graf wrote:

* Thomas Graf [EMAIL PROTECTED] 2006-06-30 18:32


Anyways, I give up. Last time I've been running after you trying
to fix the many bugs you leave behind. Ever noticed that whenever
you add some new code it's someone else following up with tons of
small bugfix patches having a hard time trying to figure out the
actual intent. I'll just duplicate the code for my purpose, so
much easier.


There you go, leaves ifb broken as-is, at least prevents it
from crashing randomly when the input_dev disappears.

That would be a pity. After this patch has been ACKed, could we start
over with the other bugs one at a time? I wasn't really able to gather
the problems from this thread, but some of the behaviour you mentioned
does seem questionable.

 
 
 I will summarize what the outstanding issues are, the rest of the bugs
 just ignore otherwise the discussion is a waste of time and may
 get out of control.
 
 1) ifb references skb-input_dev

That should be fixed by this patch.

 2) mirred sets the skb-input_dev which is used in #1

I think Thomas was more complaining about the values it is set to
and the fact that only a single redirection is possible for each
packet, no?

 It is possible that when #1 happens infact input_dev is gone because no
 ref count is incremented. Ok, Thomas is that sufficient to discuss the
 crux of the matter?
 At one point many months ago, the logic was for now the likelihood this
 will happen is low but we need to cover for by at least figuring the
 existence of input_dev when referencing it.
 
 Thomas makes the claim, this can be achieved only by using an ifindex.
 And i havent been able to see how. I have a small performance problem if
 i just use ifindex. Using ifindex will eventually save 32 bits on the
 64 bit machines. I posed the question as to which was more beneficial
 as a solution that hasnt been addressed.

Are we still talking about this? Its easy: a pointer without taking
a reference can become stale, with ifindex you do the lookup right
when using it, so you either get an result or you don't. It also
saves atomic operations for anyone _not_ using it, which is that
vast majority of users. So the patch clearly makes sense to me.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread jamal

On Fri, 2006-30-06 at 19:42 +0200, Thomas Graf wrote:
 * jamal [EMAIL PROTECTED] 2006-06-30 13:32

  Thomas makes the claim, this can be achieved only by using an ifindex.
  And i havent been able to see how. I have a small performance problem if
  i just use ifindex. Using ifindex will eventually save 32 bits on the
  64 bit machines. I posed the question as to which was more beneficial
  as a solution that hasnt been addressed.
 
 I'd appreciate if you'd stop sperading lies. 

 I claimed it to be the
 best solution, not the only one. Everyone agrees that it's possible
 to use a pointer and take a reference in netif_receive_skb() while
 it also seems obvious to most that it's not a good idea to take two
 additional atomic operations in the fast path for this purpose.

I thought we went past that point already - and i made it clear that
the reference is _not_ taken in netif_receive_skb().

So assuming it is taken in mirred (i havent thought of where it is
decremented), why would using the ifindex be better?

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread jamal

On Fri, 2006-30-06 at 19:42 +0200, Patrick McHardy wrote:
 jamal wrote:

  Thomas makes the claim, this can be achieved only by using an ifindex.
  And i havent been able to see how. I have a small performance problem if
  i just use ifindex. Using ifindex will eventually save 32 bits on the
  64 bit machines. I posed the question as to which was more beneficial
  as a solution that hasnt been addressed.
 
 Are we still talking about this? Its easy: a pointer without taking
 a reference can become stale, 

I should have been clear: reference gets taken in mirred.

 with ifindex you do the lookup right
 when using it, so you either get an result or you don't. 

My arguement was:
dev get by index per device will involve a a) lookup + b) incrementing
the refcount. 
if i use the dev pointer in that path then it becomes only #b

in both cases, you need to increment the counter and then somewhere
decrement it.

cheers,
jamal

 It also
 saves atomic operations for anyone _not_ using it, which is that
 vast majority of users. So the patch clearly makes sense to me.



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread jamal

On Fri, 2006-30-06 at 19:33 +0200, Patrick McHardy wrote:
 jamal wrote:
  You are attempting to change architecture (which works just fine) in the 
  way you 
  think it should work - and then point to something as a bug because it 
  doesnt work the way you think it should work. 
  This is a problem not just with you BTW, but with Patrick as well (although 
  he has 
  gotten better lately). There is a huge difference for example when dealing 
  with 
  Herbert. My approach in situations like this, which you dont have to 
  follow, is 
  to ask first what the intent was then if i dont like the intent try to 
  convince the
  owner that there maybe better ways. Or why they are wrong. 
 
 Well, I thought I stay out of this, but since you mention me ..
 

I think you will add value to the discussion ;-
Regardless, we need to settle these kind of issues so we can work better
together. A while back i said was going to bring some of these issues at
the next netconf - but since the discussion is happening already, lets
do it here. Maybe we should take this discussion privately?

 I also had the feeling it has gotten easier working with you lately,

The feeling is mutual. 

 but I can understand Thomas's pain, I had the same thoughts more than
 once. Your code often does have an enormous amount of bugs and
 whitespace and other stylistic problems and working with you can be
 challenging for multiple reasons, for example having to go to endless,

One thing is clear in my mind at least (and i have said it several
times): I am not as good at the semantics as either yourself or Thomas
or Dave or Acme etc but i have tons of other things that compensate for.
Probably not as good is not the best  description  - rather my brain
cells put less respect on the stylistic commas than they do on the
message. Perhaps it's dylexsia or me trying to subconsciouly maximize my
time. If you push me hard, however, i will get close to do just as a
good job as you;-  Still, cant live without you guys! 
 
I would describe the majority of the fixes you have sent against me to
fall into the stylistic category and into fixing TheLinuxWay (i.e same
bug in multiple places and files). This is not to say i am not at fault,
or there havent been issues which made me wonder, but we may have
different metrics of what bugs i would call my mom and say Mom, I just
fixed enormous amount of bugs. 
[As an example if you went hunting around the kernel, you will find a
lot of things that dont totally conform to lindent rules. You could
literally send 100s of patches].
Perhaps it is the language usage that puts us at odds at times.

 often entirely pointless discussions for obvious fixes, 

You may see it that way - I dont and so when that happens it is not by
any means to engage in a meaningless debate. 
The discussions are typically of the same nature as i just had with
Thomas (just a simple one line change in ifb which looked very
obvious).

The way it goes is that there is a certain assumption made, and a
solution is suggested in the form of a patch. Accepting such a patch
(which i once in a while dont even get to see until it is in) means a
lot of the other assumptions get invalidated and is followed by tons of
bug fixes. This has happened a few times to me. And once or twice i
have bitched because i ended having to support some user for days with
the wrong view.

 especially if you're already pissed by noticing 50 bugs at once. 

I dont get agitated by 100 patches which fix stylistic issues or real
bugs - although i would have preferred to see the non-obvious first
before they go in just in case or at least CCed on submission.
Where i get irritated, depending on how my day is going: when i see even
a single patch laded with implicit insults. It comes out to me as
unnecessary arrogance and immaturity. It may not be intentional use of
language, but thats how it comes out. 

 The reason why you
 might be able to work better with Herbert is IMO that he usually
 doesn't touch what you seem to feel is your area. 

I think it's the maturity approach perhaps. With Herbert, it ends up
being about solving the problem; there are always moments of tension and
what may appear as endless discussion (look at the thread on
qdisc_is_running), but in the end it doesnt turn out into a high school
debate to prove one is better than the other. So maybe it is the mutual
respect and maturity in the discussion. 

In regards to areas, you may be right; i cross more into Herberts path
than he does mine.
My reaction in this may be related to the way i treat other people and
my expectations:
When i submit patches to i would make sure i cc people who i think have
stakes or better insight in that area even if they seem irrelevant.
In-fact i may have even private conversations with them first if they
are large patches.
As an example i would make sure i cc you if i had some netfilter issues
or Herbert when i submit ipsec patches and we would have a gazillion
discussions which would

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread Thomas Graf

* jamal [EMAIL PROTECTED] 2006-06-30 15:34
 I thought we went past that point already - and i made it clear that
 the reference is _not_ taken in netif_receive_skb().
 
 So assuming it is taken in mirred (i havent thought of where it is
 decremented), why would using the ifindex be better?

The issue exists regardless of mirred/ifb. As soon as the packet is
queued for the first time we leave netif_receive_skb() and the dev
reference is dropped. Therefore in order to allow functionality
like tcf_match_indev() at egress we have to either take a reference
or ensure that we can catch the unlikely case of the device having
disappeared. I think everyone would agree to use device pointers
if only mirred would acquire it.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread David Miller

From: jamal [EMAIL PROTECTED]
Date: Fri, 30 Jun 2006 15:40:26 -0400

 My arguement was:
 dev get by index per device will involve a a) lookup + b) incrementing
 the refcount. 
 if i use the dev pointer in that path then it becomes only #b

 in both cases, you need to increment the counter and then somewhere
 decrement it.

The position is that -input_dev usage is not fast path.

It may be a fast path in your mirred and ifb devices, but for the rest
of the networking and %99 of users it is totally unused.  We have a
lot of bits of state that sit in the sk_buff but which are used by
a tiny minority, yet the space consumption is eaten by everyone.

Maybe when the IFB and mirred are in more widespread use we can
investigate a different approach.

But at this time any change that makes fringe sk_buff state objects
shrink or disappear will be seriously considered.

Another part of this issue is that if you use a pointer there is no
clean place to clean up the input_dev reference within the scope it is
actually used.  The only reliable place is kfree_skb() which is far
beyond the scope that this -input_dev thing is used.  We have a lot
of problems in some parts of the networking because object references
are held for too long.  One example is all the awful side effects of
the way the bridge netfilter code holds onto object references for
long periods of time in the netfilter call chains.

And we know the refcounting is broken.  And one way we know to fix
the refcounting without incurring the cost upon everyone is to
use an interface index and only make the input_dev users eat the
atomic operation incurred by grabbing the reference.

And even for the input_dev users, it isn't such a big deal, there
are other costs already associated with hitting these actions and
devices that use input_dev, the atomic reference grab there should
be lost in the noise I think.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread Thomas Graf

* jamal [EMAIL PROTECTED] 2006-06-30 15:59
 One thing is clear in my mind at least (and i have said it several
 times): I am not as good at the semantics as either yourself or Thomas
 or Dave or Acme etc but i have tons of other things that compensate for.
 Probably not as good is not the best  description  - rather my brain
 cells put less respect on the stylistic commas than they do on the
 message. Perhaps it's dylexsia or me trying to subconsciouly maximize my
 time. If you push me hard, however, i will get close to do just as a
 good job as you;-  Still, cant live without you guys! 

Lack of stylistic commas is less of a concern, rather the lack of
comments and notes where absolutely essential. You explained to me
that stacking ifb devices is not supported in order to avoid tx lock
deadlocks. There is not a single note or hint anywhere that would
remotely imply this. When questioned about things like that you often
refer to notes you have somewhere on paper, promising to get back once
looked up, eventually that happens, in this thread it didn't.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread David Miller

From: Thomas Graf [EMAIL PROTECTED]
Date: Fri, 30 Jun 2006 22:30:34 +0200

 Lack of stylistic commas is less of a concern, rather the lack of
 comments and notes where absolutely essential.

Yes, intent is very important to document when it is not
obvious.

Everyone is guilty of not doing this from time to time. :)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread jamal

On Fri, 2006-30-06 at 22:08 +0200, Thomas Graf wrote:
 * jamal [EMAIL PROTECTED] 2006-06-30 15:34

  So assuming it is taken in mirred (i havent thought of where it is
  decremented), why would using the ifindex be better?
 
 The issue exists regardless of mirred/ifb. As soon as the packet is
 queued for the first time we leave netif_receive_skb() and the dev
 reference is dropped. Therefore in order to allow functionality
 like tcf_match_indev() at egress we have to either take a reference
 or ensure that we can catch the unlikely case of the device having
 disappeared. I think everyone would agree to use device pointers
 if only mirred would acquire it.

Thomas, this is certainly more reasonable explanation. 

On Fri, 2006-30-06 at 13:17 -0700, David Miller wrote:

 We have a
 lot of bits of state that sit in the sk_buff but which are used by
 a tiny minority, yet the space consumption is eaten by everyone.
 

 Maybe when the IFB and mirred are in more widespread use we can
 investigate a different approach.
 

Not unreasonable.
So saving the bits for the majority trumps the need for slightly more
cycles in mirred/ifb until they get more widespread.


 Another part of this issue is that if you use a pointer there is no
 clean place to clean up the input_dev reference within the scope it is
 actually used. 
 

  The only reliable place is kfree_skb() which is far
 beyond the scope that this -input_dev thing is used. 
 

And i cant think of a good place to do it. 

  We have a lot
 of problems in some parts of the networking because object references
 are held for too long.  One example is all the awful side effects of
 the way the bridge netfilter code holds onto object references for
 long periods of time in the netfilter call chains.
 
 And we know the refcounting is broken.  And one way we know to fix
 the refcounting without incurring the cost upon everyone is to
 use an interface index and only make the input_dev users eat the
 atomic operation incurred by grabbing the reference.
 
 And even for the input_dev users, it isn't such a big deal, there
 are other costs already associated with hitting these actions and
 devices that use input_dev, the atomic reference grab there should
 be lost in the noise I think.

Alright guys - lets get the change in. I wish we had this specific
discussion sooner. 

Thomas, can you post your latest incarnation so i can mow comment more
peacefully? ;-

cheers,
jamal




-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread jamal

On Fri, 2006-30-06 at 13:33 -0700, David Miller wrote:
 From: Thomas Graf [EMAIL PROTECTED]
 Date: Fri, 30 Jun 2006 22:30:34 +0200

  Lack of stylistic commas is less of a concern, rather the lack of
  comments and notes where absolutely essential.

 Yes, intent is very important to document when it is not
 obvious.

I agree - and i try hard to document but at times there's too much
and a line needs to be drawn.
As an example:
the eth-ifb-ifb case though is a very corner case. All the IMQ types
need to only redirect to one ifb; while i test it ive always seen the
test as more of a boundary check than a useful setup.

And Thomas: I do keep notebooks ;- I doodle about everything, they dont
all fit in my knapsack anymore so sometimes when i say i will go and
refer to my notes, it is real.

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread Thomas Graf

* jamal [EMAIL PROTECTED] 2006-06-30 16:54
 I agree - and i try hard to document but at times there's too much
 and a line needs to be drawn.
 As an example:
 the eth-ifb-ifb case though is a very corner case. All the IMQ types
 need to only redirect to one ifb; while i test it ive always seen the
 test as more of a boundary check than a useful setup.

Maybe you should just start by explaining why it doesn't work,
your original explanation doesn't make much sense.

 And Thomas: I do keep notebooks ;- I doodle about everything, they dont
 all fit in my knapsack anymore so sometimes when i say i will go and
 refer to my notes, it is real.

It's plain arrogant to rely code understanding on information that
is not available to others. It makes everyone being dependant on
you, very nice.

To put it in another way, if you would do your job right when
writing the code, contacting or CCing you upon changes would
be only of a formal matter since the code is understandable and
the intent is clear or documented.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread jamal

Ok, I think found the last patch you posted, comments below (I have to
run off soon):

On Fri, 2006-30-06 at 19:13 +0200, Thomas Graf wrote:

 There you go, leaves ifb broken as-is, at least prevents it
 from crashing randomly when the input_dev disappears.
 

I hope the above comment does show up in the logs ;-

 [NET]: Use interface index to keep input device information
 

 ===
 --- net-2.6.git.orig/include/net/pkt_cls.h
 +++ net-2.6.git/include/net/pkt_cls.h
 @@ -352,14 +352,19 @@ tcf_change_indev(struct tcf_proto *tp, c
  static inline int
  tcf_match_indev(struct sk_buff *skb, char *indev)
  {
 + int ret = 1;
 +
   if (indev[0]) {
 - if  (!skb-input_dev)
 - return 0;
 - if (strcmp(indev, skb-input_dev-name))
 + struct net_device *dev;
 +
 + dev = dev_get_by_index(skb-iif);
 + if  (!dev)
   return 0;
 + ret = !strcmp(indev, dev-name);
 + dev_put(dev);
   }
  

[..]

 Index: net-2.6.git/drivers/net/ifb.c
 ===
 --- net-2.6.git.orig/drivers/net/ifb.c
 +++ net-2.6.git/drivers/net/ifb.c
 @@ -158,19 +158,23 @@ static int ifb_xmit(struct sk_buff *skb,
   stats-tx_packets++;
   stats-tx_bytes+=skb-len;
  
 - if (!from || !skb-input_dev) {
 + if (!from || !skb-iif) {

Can you have something similar to what you did in
tcf_match_indev above?

i.e grab dev by skb-iif so as to increment refcount - if it doesnt
exist it becomes equivalent to !skb-input_dev and if it exists you drop
the ref inside the else below after changing input_dev 

   dev_kfree_skb(skb);
   stats-rx_dropped++;
   return ret;
   } else {
 + struct net_device *iif;
   /* 
* note we could be going
* ingress - egress or
* egress - ingress
   */
 - skb-dev = skb-input_dev;
 - skb-input_dev = dev;
 + iif = __dev_get_by_index(skb-iif);
 + if (!iif)
 + goto dropped;
 + skb-dev = iif;
 + skb-iif = dev-ifindex;
   if (from  AT_INGRESS) {
   skb_pull(skb, skb-dev-hard_header_len);
   } else {
 -

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread Thomas Graf

* jamal [EMAIL PROTECTED] 2006-06-30 17:09
 On Fri, 2006-30-06 at 19:13 +0200, Thomas Graf wrote:
  Index: net-2.6.git/drivers/net/ifb.c
  ===
  --- net-2.6.git.orig/drivers/net/ifb.c
  +++ net-2.6.git/drivers/net/ifb.c
  @@ -158,19 +158,23 @@ static int ifb_xmit(struct sk_buff *skb,
  stats-tx_packets++;
  stats-tx_bytes+=skb-len;
   
  -   if (!from || !skb-input_dev) {
  +   if (!from || !skb-iif) {
 
 Can you have something similar to what you did in
 tcf_match_indev above?
 
 i.e grab dev by skb-iif so as to increment refcount - if it doesnt
 exist it becomes equivalent to !skb-input_dev and if it exists you drop
 the ref inside the else below after changing input_dev 

A non-existant iif is already equivalent to !iif since it jumps
into the same branch. Grabing a reference is completely pointless,
the netdevice represented by skb-iif is at this point until the
packet gets queued covered by a reference taken in netif_rx().
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread jamal

On Fri, 2006-30-06 at 23:10 +0200, Thomas Graf wrote:
 * jamal [EMAIL PROTECTED] 2006-06-30 16:54
  I agree - and i try hard to document but at times there's too much
  and a line needs to be drawn.
  As an example:
  the eth-ifb-ifb case though is a very corner case. All the IMQ types
  need to only redirect to one ifb; while i test it ive always seen the
  test as more of a boundary check than a useful setup.
 
 Maybe you should just start by explaining why it doesn't work,
 your original explanation doesn't make much sense.
 

Better to explain the reason for ifb first:
ifb exists initially as a replacement for IMQ. 
1) qdiscs/policies that are per device as opposed to system wide.
This now allows for sharing.

2) Allows for queueing incoming traffic for shaping instead of
dropping.

In other wise, the main use is for multiple devices to redirect to it.
Main desire is not for it to redirect to any other ifb device or eth
devices. I actually tried to get it to do that, but run into issues
of complexity and and came up with decision to drop instead of killing
the machine.
Other than that, it can redirect to any other devices - but may still
not be meaningful.

I have been thinking of Herberts change of qdisc_is_running and this may
help actually.

  And Thomas: I do keep notebooks ;- I doodle about everything, they dont
  all fit in my knapsack anymore so sometimes when i say i will go and
  refer to my notes, it is real.
 
 It's plain arrogant to rely code understanding on information that
 is not available to others. It makes everyone being dependant on
 you, very nice.
 

I try hard to document more than many many people. I give tutorials, I
write papers when time allows, I spend time responding to emails. And i
do it all for the love of the game. 
But lets look at the other side of the coin:
Dont you think it is polite to ask when something is not clear instead
of making assumptions? 

 To put it in another way, if you would do your job right when
 writing the code, contacting or CCing you upon changes would
 be only of a formal matter since the code is understandable and
 the intent is clear or documented.

I wish I could achieve that. I dont think theres any piece of code that
reaches those standards in the kernel actually. And if theres a piece of
code that achieves that it should be emulated. Most of the times, things
that are hard are not about the code but about issues like algorithmics,
embedded policies etc (I think you are saying the same).

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread Thomas Graf

This is boring, I reversed everything to not change any semantics and
you still complain.

* jamal [EMAIL PROTECTED] 2006-06-30 17:35
 On Fri, 2006-30-06 at 23:20 +0200, Thomas Graf wrote:
  * jamal [EMAIL PROTECTED] 2006-06-30 17:09
   the ref inside the else below after changing input_dev 
  
  A non-existant iif is already equivalent to !iif since it jumps
  into the same branch. 
 
 I thought 0 was a valid iif as Ben G was pointing - if it is not, you
 are right. 

No, 1 ist the first valid ifindex, see dev_new_index().

  Grabing a reference is completely pointless,
  the netdevice represented by skb-iif is at this point until the
  packet gets queued covered by a reference taken in netif_rx().
 
 You have to consider that this could happen at both ingress and egress.

Just think for a second, do you expect the device the packet will
be leaving at to disappear?
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread Thomas Graf

* jamal [EMAIL PROTECTED] 2006-06-30 17:31
 Better to explain the reason for ifb first:
 ifb exists initially as a replacement for IMQ. 
 1) qdiscs/policies that are per device as opposed to system wide.
 This now allows for sharing.
 
 2) Allows for queueing incoming traffic for shaping instead of
 dropping.
 
 In other wise, the main use is for multiple devices to redirect to it.
 Main desire is not for it to redirect to any other ifb device or eth
 devices. I actually tried to get it to do that, but run into issues
 of complexity and and came up with decision to drop instead of killing
 the machine.
 Other than that, it can redirect to any other devices - but may still
 not be meaningful.

Last time I'm asking. Why are packets dropped? You mentioned tc_verd
is set to 0 leading to a invalid from verdict. Fact is that the
from verdict is set to a meaningful value again at dev_queue_xmit()
or ing_filter() so ifb_xmit() only sees valid values. tx locks of
individual ifb devices are independant, why would it deadlock? Where
is the packet exactly dropped?

 I have been thinking of Herberts change of qdisc_is_running and this may
 help actually.

Help on what?

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread jamal

On Sat, 2006-01-07 at 01:22 +0200, Thomas Graf wrote:
 This is boring, I reversed everything to not change any semantics and
 you still complain.
 

You reversed a bug that you had introduced. Do you want me to review
this patch or not now? Be a little reasonable please.

 * jamal [EMAIL PROTECTED] 2006-06-30 17:35

   Grabing a reference is completely pointless,
   the netdevice represented by skb-iif is at this point until the
   packet gets queued covered by a reference taken in netif_rx().
  
  You have to consider that this could happen at both ingress and egress.
 
 Just think for a second, do you expect the device the packet will
 be leaving at to disappear?

I am  not certain i understood then: Are we in the mode where the
refcount is not needed because chances are small that a device will
disappear? It seems to me after all this trouble that it may not be so
bad to refcount (I guess i meant refcount the device on input to the ifb
and decrement on the output).

As a note - and i am sure you know this: 
The packet comes to the ifb and gets queued. It gets dequeued at a later
time and is sent off either to the ingress or egress. At any point
during the enq/deq the other device being referenced could 
disappear. This is a lesser sin on ingress than it is on egress.

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread Patrick McHardy

jamal wrote:
 On Fri, 2006-30-06 at 19:33 +0200, Patrick McHardy wrote:
 
Well, I thought I stay out of this, but since you mention me ..

 I think you will add value to the discussion ;-
 Regardless, we need to settle these kind of issues so we can work better
 together. A while back i said was going to bring some of these issues at
 the next netconf - but since the discussion is happening already, lets
 do it here. Maybe we should take this discussion privately?

Whatever helps resolving this, feel free to take this private if you
feel like it. I think I already managed to get along with our
differences quite well, but it was a hard path. Having friends that
use the same discussion tactics may have helped :) Just to clarify what
I'm saying below, I like you a lot personally, this is purely
a work thing.

I also had the feeling it has gotten easier working with you lately,
 
 
 The feeling is mutual. 
 
 
but I can understand Thomas's pain, I had the same thoughts more than
once. Your code often does have an enormous amount of bugs and
whitespace and other stylistic problems and working with you can be
challenging for multiple reasons, for example having to go to endless,
 
 
 One thing is clear in my mind at least (and i have said it several
 times): I am not as good at the semantics as either yourself or Thomas
 or Dave or Acme etc but i have tons of other things that compensate for.
 Probably not as good is not the best  description  - rather my brain
 cells put less respect on the stylistic commas than they do on the
 message. Perhaps it's dylexsia or me trying to subconsciouly maximize my
 time. If you push me hard, however, i will get close to do just as a
 good job as you;-  Still, cant live without you guys! 

I think the things you did and which I struggeled with were right
to do, but what I don't understand is why you can't do it right
yourself. Software development has a lot of unpleasant parts,
but ignoring them can't be the right answer. In a shared project,
you should always remember that if you're not doing the work, some
else will have to.

 I would describe the majority of the fixes you have sent against me to
 fall into the stylistic category and into fixing TheLinuxWay (i.e same
 bug in multiple places and files). This is not to say i am not at fault,
 or there havent been issues which made me wonder, but we may have
 different metrics of what bugs i would call my mom and say Mom, I just
 fixed enormous amount of bugs. 
 [As an example if you went hunting around the kernel, you will find a
 lot of things that dont totally conform to lindent rules. You could
 literally send 100s of patches].
 Perhaps it is the language usage that puts us at odds at times.

Sorry, absolutely not. The bugs I fixed were not just stylistic
things, although I still tried to take care of them where possible.
It was a huge number of bad bugs, even affecting kernels that didn't
even used the feature which introduced the bugs. As for TheLinuxWay
(I think I made my opinion on this clear already), cut-and-paste
of crappy code is not it. I'm always amazed how people can copy
crappy code without even thinking about it, when I copy code I usually
fix bugs in the code I copy _before_ doing so.

Language might of course still contribute to our misunderstandings.

often entirely pointless discussions for obvious fixes, 
 
 
 You may see it that way - I dont and so when that happens it is not by
 any means to engage in a meaningless debate. 
 The discussions are typically of the same nature as i just had with
 Thomas (just a simple one line change in ifb which looked very
 obvious).

Sometimes you may be right, but despite all the lengthy discussions
we went through, I can't recall a single patch of mine that didn't
went in eventually. But OK, people are complaining about missing
clarification of intent, we're all sometimes guilty of that.
Still, starting a huge discussion about patches that clearly
move in the right direction like Thomas' one which started this
thread is frustrating to the submitter. Discussions about technical
matters should use technical, comprehensible arguments, and nothing
else. This basically means on of three things:

- Bad because it does ..
- Bad because this patch is better
- Applied

Vague expressions about not good, might be able to do better or
might create problems without specification are not helpful as long
as they are vague. The point is: _show_ that the submitters argument
is wrong, and if it really is, that shouldn't be hard. This is
especially true if its not some random guy but someone familiar with
what he is changing. This thread is one of many examples.

 The way it goes is that there is a certain assumption made, and a
 solution is suggested in the form of a patch. Accepting such a patch
 (which i once in a while dont even get to see until it is in) means a
 lot of the other assumptions get invalidated and is followed by tons of
 bug fixes. This has happened a few times to me.

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-30 Thread jamal

On Sat, 2006-01-07 at 01:45 +0200, Thomas Graf wrote:
 * jamal [EMAIL PROTECTED] 2006-06-30 17:31
  Better to explain the reason for ifb first:
  ifb exists initially as a replacement for IMQ. 
  1) qdiscs/policies that are per device as opposed to system wide.
  This now allows for sharing.
  
  2) Allows for queueing incoming traffic for shaping instead of
  dropping.
  
  In other wise, the main use is for multiple devices to redirect to it.
  Main desire is not for it to redirect to any other ifb device or eth
  devices. I actually tried to get it to do that, but run into issues
  of complexity and and came up with decision to drop instead of killing
  the machine.
  Other than that, it can redirect to any other devices - but may still
  not be meaningful.
 
 Last time I'm asking.

Then please listen carefully - because if you read my message with a
reasonable openness the answer is there. 

  Why are packets dropped? 

When looping happens the only sane thing to do is to drop packets.

I mentioned this was a very tricky thing to achieve in all cases since
there are many behaviors and netdevices that have to be considered. As
an example i ended not submitting the code for looping from egress to
ingress because i felt it needed more testing. And i am certain i have
missed some other device combinations.
Loops could happen within ingress or egress. They could mostly happen
because of mirred or ifb. And some i have discovered via testing.

 You mentioned tc_verd is set to 0 leading to a invalid from verdict. 

yes, for ifb.

 Fact is that the from verdict is set to a meaningful value again at 
 dev_queue_xmit() or ing_filter() so ifb_xmit() only sees valid values. 

Ok, Thomas this is one of those places where we end up colliding for no
good reason - your statement above is accusatory. And sometimes i say 3
things and you pick one and use that as context. The ifb does clear the
field. So if you get redirected again, it will be 0.

 tx locks of individual ifb devices are independant, why would it deadlock? 

different instances of the same device do not deadlock. If however, you
have a series of devices in which A-B-C-D-A then you will have a
possible deadlock. In the case of the ifb i dissallowed it because it
was obvious from the testing. 

 Where is the packet exactly dropped?
 

For the above in the ifb when they come in with from=0.

  I have been thinking of Herberts change of qdisc_is_running and this may
  help actually.
 
 Help on what?

In the case of deadlocks. Now that it is impossible to contend for the
txlock twice (with that patch), a second redirect will end up just
enqueueing.

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-29 Thread Thomas Graf

* jamal [EMAIL PROTECTED] 2006-06-28 09:46
 Why not use iflink?
 It exists, by definition, specifically as the ifindex of the down muxed
 netdevice (should probably add that definition in the .h).
 
 This is semantically different from a message to the stack which says
 this came to you from input device X (represented by input_dev).

I disagree, iflink information is bogus once we start redirecting.

  What's special about that? It seems a perfectly fine and simple way
  to create namespaces for sockets without enforcing applications to
  bind to the correct vlan devices directly.
  
 
 Can you achieve the same by binding to any arbitrary netdevice? If you
 can then i would agree.

It was your request to not update input_dev in netif_receive_skb()
but rather have each netdevice handle it manually. What's currently
done in ifb:

skb-dev = skb-input_dev;
skb-input_dev = dev;

Confusing, isn't it? I really don't get input_dev/iif is supposed to
be updated in ifb. If a packet is to be redirected, the mirred action
shall take care of providing this information to the next netdevice.
My additional request is to provide a flag to disable this for special
purposes not hurting anyone.

 Please refer to what i said above in that you could end up redirecting
 many times. I have not added yet the code which allows mirred to also do
 ingress redirection which would make it even more interesting.
 The challenge is this, and if you can solve it we would be fine:
 - I need to access both the input_dev and dev and their metadata.
 I could clearly find them by their ifindices and then reference them
 after that. For performance reasons that is not optimal in the fast
 path.  

Nice, you forget that while redirecting the device pointed to by
input_dev can disappear leaving behind an illegal reference. The
purpose of this patch is to fix this.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-29 Thread jamal


I noticed the other email carrying a patch. Let me respond to this email
and hopefully that will address the patch.

On Thu, 2006-29-06 at 10:51 +0200, Thomas Graf wrote: 
 * jamal [EMAIL PROTECTED] 2006-06-28 09:46
[..] 
 I disagree, iflink information is bogus once we start redirecting.

After redirecting, iflink is available i.e it doesnt make any
more sense and therefore it should be (ab)usable to carry other info.
But the more i think about it, using skb-dev is just fine for 
finding the device you are looking for even when it gets redirected.
 
I am assuming you still want to find out - in the case of a topology
which constitutes more than one netdevice - the second last netdevice,
correct?

  
  Can you achieve the same by binding to any arbitrary netdevice? If you
  can then i would agree.
 
 It was your request to not update input_dev in netif_receive_skb()
 but rather have each netdevice handle it manually. 

For good reasons of course.

note: you didnt answer the question i asked, but probably the answer
doesnt matter.

 What's currently
 done in ifb:
 
   skb-dev = skb-input_dev;
   skb-input_dev = dev;
 
 Confusing, isn't it? 

not at all. Let me explain the design intent further below.

 I really don't get input_dev/iif is supposed to
 be updated in ifb. If a packet is to be redirected, the mirred action
 shall take care of providing this information to the next netdevice.

ifb is a special purpose device.
Think of it as a redirector, or injector if you want a better noun. If
you are going to inject packets back to the stack at arbitrary points,
then you need to reflect that in the packet metadata implying where it
came from. Likewise if you are going to redirect packets into arbitrary
points in the stack (as mirred does), you need to do the same. 
I dont know if that makes sense. 

 My additional request is to provide a flag to disable this for special
 purposes not hurting anyone.

I am failing to see the purpose. 
With all due respect, you are changing the design intent, Thomas.

I know your intent is noble in trying to save the 32 bits on 64 bit
machines (at least thats where your patch seems to have started) but the
cost:benefit ratio as i have pointed out is unreasonable. 
If you can meet the requirements i have specified (I am leaving them
below) i will be a happy camper.

  The challenge is this, and if you can solve it we would be fine:
  - I need to access both the input_dev and dev and their metadata.
  I could clearly find them by their ifindices and then reference them
  after that. For performance reasons that is not optimal in the fast
  path.  
 

So just we are clear: I have strong desire to save compute rather than
than saving 32 bits per skb on 64 bit machine. 

 Nice, you forget that while redirecting the device pointed to by
 input_dev can disappear leaving behind an illegal reference. 

We have had this discussion before - that is resolvable via
dev_put/hold.

 The purpose of this patch is to fix this.

Appreciated - but can you do it without replacing the input_dev?

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-29 Thread Thomas Graf

* jamal [EMAIL PROTECTED] 2006-06-29 19:23
  What's currently
  done in ifb:
  
  skb-dev = skb-input_dev;
  skb-input_dev = dev;
  
  Confusing, isn't it? 
 
 not at all. Let me explain the design intent further below.

Then let me show you what your code does:

 [mirred attached to filter on eth0 redirecting to ifb0]

tcf_mirred():skb2-input_dev = skb-dev; (skb2-input_dev=eth0)
ifb_xmit():  skb-dev = skb-input_dev; (skb-dev=eth0)
 skb-input_dev = dev; (skb-input_dev=ifb0)

So when reentering the stack the skb looks like it would be
on eth0 coming from ifb0. Is that what you wanted? Shouldn't
it be skb-dev=ifb0 skb-input_dev=eth0? That's what my patch
would change it to.

 I know your intent is noble in trying to save the 32 bits on 64 bit
 machines (at least thats where your patch seems to have started) but the
 cost:benefit ratio as i have pointed out is unreasonable. 

Did I ever claim this? You made this up right now. As of now
it doesn't save a single bit.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-29 Thread David Miller

From: Thomas Graf [EMAIL PROTECTED]
Date: Fri, 30 Jun 2006 01:39:33 +0200

 * jamal [EMAIL PROTECTED] 2006-06-29 19:23
  I know your intent is noble in trying to save the 32 bits on 64 bit
  machines (at least thats where your patch seems to have started) but the
  cost:benefit ratio as i have pointed out is unreasonable. 

 Did I ever claim this? You made this up right now. As of now
 it doesn't save a single bit.

Right.

From what I can see Thomas's work allows to do correct
reference counting on the input_dev, something which is
not possible right now and is a bug.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-29 Thread jamal

On Fri, 2006-30-06 at 01:39 +0200, Thomas Graf wrote:
 * jamal [EMAIL PROTECTED] 2006-06-29 19:23

  not at all. Let me explain the design intent further below.
 
 Then let me show you what your code does:
 
  [mirred attached to filter on eth0 redirecting to ifb0]
 
 tcf_mirred():skb2-input_dev = skb-dev; (skb2-input_dev=eth0)
 ifb_xmit():  skb-dev = skb-input_dev; (skb-dev=eth0)
  skb-input_dev = dev; (skb-input_dev=ifb0)
 
 So when reentering the stack the skb looks like it would be
 on eth0 coming from ifb0. Is that what you wanted? 

ok, that looks like egress side of the stack, correct?
 
indeed thats what i meant earlier i.e above looks right. Another way to
look at it, is that on the stack, the dev part acts as a to label of
the direction and the input_dev maps to the from. 


step 0: Packet arriving at egress of eth0 
It will have skb-dev pointing to eth0 and skb-input_dev will depend
on whether it is coming from the local path (in which it will be NULL)
or it was being routed (in which case it will reflect the netdevice it
last passed on input.

Step 1: when it leaves redirect
As you note it will have indev(from) = eth0 and dev(to) still ifb0

Step 2: when it leaves ifb going back to eth0
it will have input_dev(from) = ifb0 and dev(to)  eth0

Of course this gets more interesting if you had say redirected to
another device like lo which redirected to ifb1. Or when you try to
create loops etc.

And if you look at ingress, it will be slightly different; however,
in both cases (ingress/egress) things are consistent in the labeling of
from/to to be input_dev/dev

 Shouldn't
 it be skb-dev=ifb0 skb-input_dev=eth0? That's what my patch
 would change it to.
 

yes, that would change the semantics

  I know your intent is noble in trying to save the 32 bits on 64 bit
  machines (at least thats where your patch seems to have started) but the
  cost:benefit ratio as i have pointed out is unreasonable. 
 
 Did I ever claim this? You made this up right now. As of now
 it doesn't save a single bit.

Ok, I apologize - i assumed it has something to do with skb diet. 

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-29 Thread David Miller

From: jamal [EMAIL PROTECTED]
Date: Thu, 29 Jun 2006 20:08:19 -0400

 What am i missing?
 on 64bit machine, does it not save 32 bits to use an ifindex as opposed
 to the pointer?

The objects around it are pointers, which are 64-bit, and thus
the 32-bit object gets padded out to 64-bits in the layout of
the struct so that the next pointer member can be properly
aligned.

It does not change the size of sk_buff at all.

 Yes, it is a bug, but:
 dev_hold/put dont work anymore? why do you need an ifindex instead?

You sure you want to do that atomic operation on every single
input packet, regardless of whether egresss operations are
using it or not?

We should put the cost of features at the actual users, and not
impose it upon everyone.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-29 Thread jamal

On Thu, 2006-29-06 at 17:12 -0700, David Miller wrote:
 From: jamal [EMAIL PROTECTED]
 Date: Thu, 29 Jun 2006 20:08:19 -0400

  What am i missing?
  on 64bit machine, does it not save 32 bits to use an ifindex as opposed
  to the pointer?

 The objects around it are pointers, which are 64-bit, and thus
 the 32-bit object gets padded out to 64-bits in the layout of
 the struct so that the next pointer member can be properly
 aligned.

 It does not change the size of sk_buff at all.

I see; i take it if things were moved around that may change?

  Yes, it is a bug, but:
  dev_hold/put dont work anymore? why do you need an ifindex instead?

 You sure you want to do that atomic operation on every single
 input packet, regardless of whether egress operations are
 using it or not?

Can you avoid doing the refcount?
Note Thomas is doing dev_get_by_index (which will do the atomic ref
count).

For me the choice is between having the iif and:
- __get device from ifindex
- reference dev-something

vs getting the input_dev and
- reference dev-something

 We should put the cost of features at the actual users, and not
 impose it upon everyone.

I didnt quiet follow, the ref count seems only needed in the
redirection, no?

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-29 Thread Ben Greear


David Miller wrote:


I'm saying that, we don't need the refcount, just setting
the skb-input_index thing, unless someone actually cares
about the input device.

As long as the packet hits not paths that care about the
SKB input device, no atomic refcounts are taken.  It's
just an integer sitting there in the SKB.


Also, if this lookup is to be done multiple times per skb, we could
do the lookup once and grab the ref at that time and store the
pointer in the skb.  If you wanted to be truly horible about it..could
use a 64-bit input_index and copy the pointer there (and set a flag somewhere
noting it is no longer an index but a pointer) so save the extra
64-bits.

Ben



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html




--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc  http://www.candelatech.com

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-29 Thread Thomas Graf

* jamal [EMAIL PROTECTED] 2006-06-29 20:03
 On Fri, 2006-30-06 at 01:39 +0200, Thomas Graf wrote:
  * jamal [EMAIL PROTECTED] 2006-06-29 19:23
 
   not at all. Let me explain the design intent further below.
  
  Then let me show you what your code does:
  
   [mirred attached to filter on eth0 redirecting to ifb0]
  
  tcf_mirred():skb2-input_dev = skb-dev; (skb2-input_dev=eth0)
  ifb_xmit():  skb-dev = skb-input_dev; (skb-dev=eth0)
   skb-input_dev = dev; (skb-input_dev=ifb0)
  
  So when reentering the stack the skb looks like it would be
  on eth0 coming from ifb0. Is that what you wanted? 
 
 ok, that looks like egress side of the stack, correct?

No, that's the ingress side leaving ifb again via netif_rx()
skb-dev should represent the from/at= and not to=. 

For egress your code is correct although impossible to guess
right due to total lack of a comment where it would make sense
and naming that doesn't give any implications.

When leaving ifb0 you want for...
... egress:
   skb-dev=to (eth0) skb-iif=from (ifb0)
... ingress:
   skb-dev=at (ifb0) skb-iif=from (eth0)

So we move the update to the tasklet and set skb-dev to
skb-iif before dev_queue_xmit() and skb-dev = dev (ifb)
before calling netif_rx()

Does that fullfil your requirements?
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-29 Thread jamal

On Thu, 2006-29-06 at 17:29 -0700, David Miller wrote:
 From: jamal [EMAIL PROTECTED]
 Date: Thu, 29 Jun 2006 20:26:20 -0400
[..]
  I see; i take it if things were moved around that may change?

 Yes.

Ok, relief - so i was not totally unreasonable then ;-

  Can you avoid doing the refcount?
  Note Thomas is doing dev_get_by_index (which will do the atomic ref
  count).

 He is doing that where skb-input_device is needed, which is
 what we want.

I am saying the same thing as well - i think. mirred touches
the input_dev and therefore setting the refcount in mirred is valid -
but iam unsure where to unset it.

  I didnt quiet follow, the ref count seems only needed in the
  redirection, no?

 I'm saying that, we don't need the refcount, just setting
 the skb-input_index thing, unless someone actually cares
 about the input device.

the ifb references it; only mirred redirects to the ifb at the moment.
You would need to increment in mirred, no?
Why do i feel i am missing something? ;-

 As long as the packet hits not paths that care about the
 SKB input device, no atomic refcounts are taken.  It's
 just an integer sitting there in the SKB.

indeed. 
I think whether it becomes ifindex or pointer you need to increment the
refcounter. and decrement somewhere.
The challenge for me is a choice to use more cycles if you use ifindex
vs less cycles with a pointer. The advantage for going with ifindex
would be to save those bits(if you rearrange). The question is which is
reasonable?;-

cheers,
jamal 

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-29 Thread Thomas Graf

* jamal [EMAIL PROTECTED] 2006-06-29 20:48
 the ifb references it; only mirred redirects to the ifb at the moment.
 You would need to increment in mirred, no?
 Why do i feel i am missing something? ;-

The point is to avoid having an atomic operation for every packet
when setting iif in netif_receive_skb(). If it was only for
mirred nobody would complain I guess.

 I think whether it becomes ifindex or pointer you need to increment the
 refcounter. and decrement somewhere.
 The challenge for me is a choice to use more cycles if you use ifindex
 vs less cycles with a pointer. The advantage for going with ifindex
 would be to save those bits(if you rearrange). The question is which is
 reasonable?;-

The third choice is to just don't care if the interface goes away
but have a chance to figure it out and just assume as if it would
have never been set. The number of devices that can disappear w/o
user control is very very limited and not worth an atomic operation
for every single packet.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-29 Thread jamal

On Fri, 2006-30-06 at 02:46 +0200, Thomas Graf wrote:
 * jamal [EMAIL PROTECTED] 2006-06-29 20:03
  On Fri, 2006-30-06 at 01:39 +0200, Thomas Graf wrote:
   * jamal [EMAIL PROTECTED] 2006-06-29 19:23
  
not at all. Let me explain the design intent further below.
   
   Then let me show you what your code does:
   
[mirred attached to filter on eth0 redirecting to ifb0]
   
   tcf_mirred():skb2-input_dev = skb-dev; (skb2-input_dev=eth0)
   ifb_xmit():  skb-dev = skb-input_dev; (skb-dev=eth0)
skb-input_dev = dev; (skb-input_dev=ifb0)
   
   So when reentering the stack the skb looks like it would be
   on eth0 coming from ifb0. Is that what you wanted? 
  
  ok, that looks like egress side of the stack, correct?
 
 No, that's the ingress side leaving ifb again via netif_rx()

 skb-dev should represent the from/at= and not to=. 
 

Heres what it would look at ingress:

step 0: coming from wire via eth0,
dev=eth0, input_dev=eth0

step 1: redirect to ifb0, leaving redirect
dev=ifb0, input_dev=eth0

step 2: leaving ifb0, coming back to ingress side of stack

dev= eth0, input_dev=ifb0

Again, it gets interesting when you have redirection multiple times
and create loops. It will get more interesting when i submit the code
for redirecting to ingress.
The good news is there is very strong consistency.
I dont know if at is the correct description. I visualize it as
hitting the stack just at the point.

when the packet first comes in both from/to point to eth0.
So the semantics are the same as in egress.

 For egress your code is correct although impossible to guess
 right due to total lack of a comment where it would make sense
 and naming that doesn't give any implications.
 

There are some simple comments and  I have been trying very hard to
explain this for a long time.
Patrick was the last person who tortured me ;- Perhaps i need to
write a document. 

 When leaving ifb0 you want for...
 ... egress:
skb-dev=to (eth0) skb-iif=from (ifb0)
 ... ingress:
skb-dev=at (ifb0) skb-iif=from (eth0)
 

Yes, this is correct. I described the flow of the first one in the
earlier email and the ingress side.
Although lets not talk about iif yet. I posed a question earlier on
which scheme would be better.

 So we move the update to the tasklet and set skb-dev to
 skb-iif before dev_queue_xmit() and skb-dev = dev (ifb)
 before calling netif_rx()
 
 Does that fullfil your requirements?

You lost me on what you are trying to achieve. Just restore the line you
removed in ifb and things would be fine. Lets then settle the issue of
iif vs input_dev.

cheers,
jamal



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-29 Thread jamal

On Fri, 2006-30-06 at 02:55 +0200, Thomas Graf wrote:
 * jamal [EMAIL PROTECTED] 2006-06-29 20:48

 The point is to avoid having an atomic operation for every packet
 when setting iif in netif_receive_skb(). If it was only for
 mirred nobody would complain I guess.
 

I never intended to punish all users. I think i was misunderstood.
The only problem is where do you decrement the refcount when you
increment at mirred?
In your case, I assume it is at some sort of rule destruction?

  I think whether it becomes ifindex or pointer you need to increment the
  refcounter. and decrement somewhere.
  The challenge for me is a choice to use more cycles if you use ifindex
  vs less cycles with a pointer. The advantage for going with ifindex
  would be to save those bits(if you rearrange). The question is which is
  reasonable?;-
 
 The third choice is to just don't care if the interface goes away
 but have a chance to figure it out and just assume as if it would
 have never been set. The number of devices that can disappear w/o
 user control is very very limited and not worth an atomic operation
 for every single packet.
 

Now that you mention this - I think that option 3 is what we said last
time we had this discussion ;-

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-28 Thread Thomas Graf

* jamal [EMAIL PROTECTED] 2006-06-27 09:07
 I am reading the thread backwards, so i may miss some of the obvious.
 I also dont remember the exact discussion we had - but the consensus was
 to leave the field setting as is.
 Note the meta-setter (been sitting on it for too long) also set the
 input_dev.

Could you share that piece of code so I don't have to duplicate
that effort?

 BTW, in regards to the VLANs, what is wrong with using the
 netdev-iflink to figure the real device? 


vlan1
  ifb1
vlan2
vlan3
  ifb2
vlan4

dhcp daemon would bind to ifb1 and use pktinfo cmsg to figure
out whether the packet was received on vlan1 or vlan2, the
actual physical device is of no interest anymore.

Looking at your ifb code I see that you set skb-dev based on
iif and update iif unconditionally. I see a lot of problems
with this, firstly iif is already updated in the mirred action,
secondly iif doesn't necessarily point to the device the packet
was last seen on. Could you maybe explain the policy of iif
you have in mind, it doesn't seem very consistent.

BTW, why does ifb have a hard header length? The resulting
push and pull only slows things down.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-28 Thread jamal

On Wed, 2006-28-06 at 12:18 +0200, Thomas Graf wrote:
 * jamal [EMAIL PROTECTED] 2006-06-27 09:07
[..]
  Note the meta-setter (been sitting on it for too long) also set the
  input_dev.
 
 Could you share that piece of code so I don't have to duplicate
 that effort?
 

I will look it up and send it your way - it is not complete (I have
changed machines twice since i first wrote it). Or maybe the intent of
your question was to ask how was the setting of input_dev was done?

  BTW, in regards to the VLANs, what is wrong with using the
  netdev-iflink to figure the real device? 
 
 
 vlan1
   ifb1
 vlan2
 vlan3
   ifb2
 vlan4
 
 dhcp daemon would bind to ifb1 and use pktinfo cmsg to figure
 out whether the packet was received on vlan1 or vlan2, the
 actual physical device is of no interest anymore.
 

Let me see if i understood correctly:
you have a device from both vlan1 and vlan2 both being redirected to
ifb1? and vlan 3, 4= ifb2?
And i take it to make it interesting vlan1,2 on eth0 and vlan3,4 ontop
of eth1? note the iflink would/should reflect eth1/2. If you had bonding
or bridging in between it gets more interesting. But you are saying
you are no longer interested in the info that it came via eth1/2 at some
point, correct? 
Note, in such a case: iflink rewriting will do it just fine
but then you loose the original info (I think it would be good to not
loose such info to know the origin). I dont know if cmsg uses iflink at
all - they probably look at the ifindex.

Ok, now question: why are you binding on top of ifb? we could have
redirected this to ipip, or tun0 or loopback or eql etc where it may
equally not have made sense to bind to. I can see something like this
making sense for bonding or bridging but not as a generic answer for all
netdevices.

 Looking at your ifb code I see that you set skb-dev based on
 iif and update iif unconditionally. I see a lot of problems
 with this, firstly iif is already updated in the mirred action,
 secondly iif doesn't necessarily point to the device the packet
 was last seen on. Could you maybe explain the policy of iif
 you have in mind, it doesn't seem very consistent.
 

The concept is to use ifb as a transitionary device. Packets come into
it from either the ingress side of the stack or egress and get returned
on the same side/spot they were found - with ifb represented as the
input device. What ifb does is independent of what mirred does (we could
redirect to other devices other than ifb).
 
Now that i pay attention to your patch i think i see the complications
it is introducing (i am not done yet)- and i did warn about this in the
discussion _and_ didnt we agree to leave this stuff alone? Can we drop
this change please?

 BTW, why does ifb have a hard header length? The resulting
 push and pull only slows things down.

Good question. I know there were a lot of oddities i had to deal with
when different link layers cross over (some have smaller headers than
other and yet others had none ;-)
I will have to go and look at my notes test cases and get back to you; 

It does work as it is right now (I only notice a pull - there is no
push).

cheers,
jamal 

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-28 Thread Thomas Graf

* jamal [EMAIL PROTECTED] 2006-06-28 08:22
 Let me see if i understood correctly:
 you have a device from both vlan1 and vlan2 both being redirected to
 ifb1? and vlan 3, 4= ifb2?
 And i take it to make it interesting vlan1,2 on eth0 and vlan3,4 ontop
 of eth1? note the iflink would/should reflect eth1/2. If you had bonding
 or bridging in between it gets more interesting. But you are saying
 you are no longer interested in the info that it came via eth1/2 at some
 point, correct? 
 Note, in such a case: iflink rewriting will do it just fine
 but then you loose the original info (I think it would be good to not
 loose such info to know the origin). I dont know if cmsg uses iflink at
 all - they probably look at the ifindex.

It's using rt_iif which is currently derived from skb-dev even
though it would make sense to use skb-iif once that is well
defined.

 Ok, now question: why are you binding on top of ifb? we could have
 redirected this to ipip, or tun0 or loopback or eql etc where it may
 equally not have made sense to bind to. I can see something like this
 making sense for bonding or bridging but not as a generic answer for all
 netdevices.

What's special about that? It seems a perfectly fine and simple way
to create namespaces for sockets without enforcing applications to
bind to the correct vlan devices directly.

 The concept is to use ifb as a transitionary device. Packets come into
 it from either the ingress side of the stack or egress and get returned
 on the same side/spot they were found - with ifb represented as the
 input device. What ifb does is independent of what mirred does (we could
 redirect to other devices other than ifb).

This is clear, my question was for you to explain when you intend
to update input_dev etc. By overwriting it in ifb you enforce that
valueable information is lost.

There is nothing wrong with updating iif to point to ifb under
certain circumstances but it's not required to strictly enforce
this. Having mirred do so conditionally is a lot more flexible
without losing any value.

 Now that i pay attention to your patch i think i see the complications
 it is introducing (i am not done yet)- and i did warn about this in the
 discussion _and_ didnt we agree to leave this stuff alone? Can we drop
 this change please?

This statement doesn't make sense, it merely transforms a device
reference to a reference by interface index because the current
method of handling it doesn't make sense at all since the region
where input_dev stays valid without the risk of the device
disappearing is very limited.


 It does work as it is right now (I only notice a pull - there is no
 push).

In order to pull in ifb you need to push in mirred, quite
obvious, no?  Why are you enforcing users of ifb to fake an
ethernet header that is of no use at all?
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-28 Thread jamal

On Wed, 2006-28-06 at 15:01 +0200, Thomas Graf wrote:
 * jamal [EMAIL PROTECTED] 2006-06-28 08:22
[..]
  Note, in such a case: iflink rewriting will do it just fine
  but then you loose the original info (I think it would be good to not
  loose such info to know the origin). I dont know if cmsg uses iflink at
  all - they probably look at the ifindex.
 
 It's using rt_iif which is currently derived from skb-dev even
 though it would make sense to use skb-iif once that is well
 defined.
 

Why not use iflink?
It exists, by definition, specifically as the ifindex of the down muxed
netdevice (should probably add that definition in the .h).

This is semantically different from a message to the stack which says
this came to you from input device X (represented by input_dev).

  Ok, now question: why are you binding on top of ifb? we could have
  redirected this to ipip, or tun0 or loopback or eql etc where it may
  equally not have made sense to bind to. I can see something like this
  making sense for bonding or bridging but not as a generic answer for all
  netdevices.
 
 What's special about that? It seems a perfectly fine and simple way
 to create namespaces for sockets without enforcing applications to
 bind to the correct vlan devices directly.
 

Can you achieve the same by binding to any arbitrary netdevice? If you
can then i would agree.

 This is clear, my question was for you to explain when you intend
 to update input_dev etc. By overwriting it in ifb you enforce that
 valueable information is lost.

It is design intent for this field to carry information on where the
packet came from. We may loop many times between devices (between
ingress-egress or ingress-ingress, within the ttl limit) and each time
the only meaningful thing is which was the last netdevice that was
sending it.

 There is nothing wrong with updating iif to point to ifb under
 certain circumstances but it's not required to strictly enforce
 this. Having mirred do so conditionally is a lot more flexible
 without losing any value.
 

Refer to what i said above.

  Now that i pay attention to your patch i think i see the complications
  it is introducing (i am not done yet)- and i did warn about this in the
  discussion _and_ didnt we agree to leave this stuff alone? Can we drop
  this change please?
 
 This statement doesn't make sense, it merely transforms a device
 reference to a reference by interface index because the current
 method of handling it doesn't make sense at all since the region
 where input_dev stays valid without the risk of the device
 disappearing is very limited.
 

Please refer to what i said above in that you could end up redirecting
many times. I have not added yet the code which allows mirred to also do
ingress redirection which would make it even more interesting.
The challenge is this, and if you can solve it we would be fine:
- I need to access both the input_dev and dev and their metadata.
I could clearly find them by their ifindices and then reference them
after that. For performance reasons that is not optimal in the fast
path.  
[Note, we have had this discussion before and even then we came to some
consensus that it would be hard to achieve that hence my view, again, to
drop this].

  It does work as it is right now (I only notice a pull - there is no
  push).
 
 In order to pull in ifb you need to push in mirred, quite
 obvious, no?  

I think that my thinking at the time was that ifb can only receive
packets via mirred and therefore the pull on ingress at ifb may have
been to counter the push that occurs at mirred. Let me look at my notes
and testcases when i get back home; it may make sense not to have it at
ifb - the fact that it works as is tells me there is more to it than the
obvious.

 Why are you enforcing users of ifb to fake an
 ethernet header that is of no use at all?

This may just be an artifact of inheriting things from dummy device.
The answer will be clear when i look at my notes. 

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-27 Thread jamal

On Tue, 2006-27-06 at 12:03 +0200, Thomas Graf wrote:
 * Patrick McHardy [EMAIL PROTECTED] 2006-06-27 00:29
  Doesn't sound too silly (actually quite nifty in my opinion),
  but I'm not sure I understand correctly, binding to ifb doesn't
  work since it changes both skb-dev and skb-input_dev.
 
 I don't understand this concern. So far the mirred action
 updates iif but that can be made configurable.
 

I am reading the thread backwards, so i may miss some of the obvious.
I also dont remember the exact discussion we had - but the consensus was
to leave the field setting as is.
Note the meta-setter (been sitting on it for too long) also set the
input_dev.

BTW, in regards to the VLANs, what is wrong with using the
netdev-iflink to figure the real device? 

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-26 Thread Thomas Graf

Updating iif to the VLAN device helps keeping routing
namespaces defined in case packets from multiple VLANs
collapse on a single device again.

Signed-off-by: Thomas Graf [EMAIL PROTECTED]

Index: net-2.6.git/net/8021q/vlan_dev.c
===
--- net-2.6.git.orig/net/8021q/vlan_dev.c
+++ net-2.6.git/net/8021q/vlan_dev.c
@@ -156,6 +156,8 @@ int vlan_skb_recv(struct sk_buff *skb, s
return -1;
}
 
+   /* Make packets look like they have been received on the VLAN device */
+   skb-iif = skb-dev-ifindex;
skb-dev-last_rx = jiffies;
 
/* Bump the rx counters for the VLAN device. */

--

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-26 Thread Patrick McHardy

Thomas Graf wrote:
 Updating iif to the VLAN device helps keeping routing
 namespaces defined in case packets from multiple VLANs
 collapse on a single device again.
 
 Signed-off-by: Thomas Graf [EMAIL PROTECTED]
 
 Index: net-2.6.git/net/8021q/vlan_dev.c
 ===
 --- net-2.6.git.orig/net/8021q/vlan_dev.c
 +++ net-2.6.git/net/8021q/vlan_dev.c
 @@ -156,6 +156,8 @@ int vlan_skb_recv(struct sk_buff *skb, s
   return -1;
   }
  
 + /* Make packets look like they have been received on the VLAN device */
 + skb-iif = skb-dev-ifindex;
   skb-dev-last_rx = jiffies;

I know this was discussed before, but I can't remember the
exact outcome. Why don't we just unconditionally update iif
in netif_receive_skb()?

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-26 Thread David Miller

From: Patrick McHardy [EMAIL PROTECTED]
Date: Mon, 26 Jun 2006 19:04:15 +0200

 I know this was discussed before, but I can't remember the
 exact outcome. Why don't we just unconditionally update iif
 in netif_receive_skb()?

Software devices might have interesting semantics that would
make not setting iif desirable.

Once you set iif, you can't just undo it because the information
is lost.

I also would really prefer to set it unconditionally in
netif_receive_skb(), but Jamal's concerns in this area are real.
We really need to evaluate this on a case-by-case basis.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-26 Thread Patrick McHardy

David Miller wrote:
 From: Patrick McHardy [EMAIL PROTECTED]
 Date: Mon, 26 Jun 2006 19:04:15 +0200

I know this was discussed before, but I can't remember the
exact outcome. Why don't we just unconditionally update iif
in netif_receive_skb()?

 Software devices might have interesting semantics that would
 make not setting iif desirable.

Right, I remember now.

 Once you set iif, you can't just undo it because the information
 is lost.

 I also would really prefer to set it unconditionally in
 netif_receive_skb(), but Jamal's concerns in this area are real.
 We really need to evaluate this on a case-by-case basis.

I still have a hard time imagining a case where this wouldn't fit,
netfilter is perfectly happy with just using skb-dev as input
device on the input path so far. I also think it will be confusing
to the user to have different notions of what constitutes the input
device. If anyone can think of an example where the user would
really expect the input device to be something different than
skb-dev I'd be really interested to hear it.

BTW, this is not meant to be an objection to Thomas's patch, just
me still wondering ..

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-26 Thread Thomas Graf

* David Miller [EMAIL PROTECTED] 2006-06-26 10:46
 From: Patrick McHardy [EMAIL PROTECTED]
 Date: Mon, 26 Jun 2006 19:04:15 +0200

  I know this was discussed before, but I can't remember the
  exact outcome. Why don't we just unconditionally update iif
  in netif_receive_skb()?

 Software devices might have interesting semantics that would
 make not setting iif desirable.

 Once you set iif, you can't just undo it because the information
 is lost.

 I also would really prefer to set it unconditionally in
 netif_receive_skb(), but Jamal's concerns in this area are real.
 We really need to evaluate this on a case-by-case basis.

I'm playing with a FIFO device based on Jamal's previous
work to replace the broken IMQ device and provide real
queueing at ingress. A set of VLAN devices could be redirected
to such a FIFO device and let applications bind to it in order
to implement trivial bind(INADDR_ANY) namespaces based on anything
that can be selected by classifiers. This example would benefit
from a conditional iif update (given that the mirred action is
extended to take a flag to steer this) so applications like a
dhcp daemon could use a raw socket on the FIFO device and thus
benefit from the bind namespace but still access the original
interface the packet was received from using pktinfo cmsg.

Maybe silly but the best I could come up with :-)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-26 Thread Patrick McHardy

Thomas Graf wrote:
 * David Miller [EMAIL PROTECTED] 2006-06-26 10:46

From: Patrick McHardy [EMAIL PROTECTED]
Date: Mon, 26 Jun 2006 19:04:15 +0200

I know this was discussed before, but I can't remember the
exact outcome. Why don't we just unconditionally update iif
in netif_receive_skb()?

Software devices might have interesting semantics that would
make not setting iif desirable.

Once you set iif, you can't just undo it because the information
is lost.

I also would really prefer to set it unconditionally in
netif_receive_skb(), but Jamal's concerns in this area are real.
We really need to evaluate this on a case-by-case basis.

 I'm playing with a FIFO device based on Jamal's previous
 work to replace the broken IMQ device and provide real
 queueing at ingress.

All my testing (quite a lot) in this area so far suggested that queueing
at ingress doesn't work and is actually counter-productive. It gets
worse with increasing signal propagation delays (signal in this case
means congestion indications). I actually wish I never introduced
the stupid IMQ device, it raises false hope and a lot of people seem
to fall for it.

 A set of VLAN devices could be redirected
 to such a FIFO device and let applications bind to it in order
 to implement trivial bind(INADDR_ANY) namespaces based on anything
 that can be selected by classifiers. This example would benefit
 from a conditional iif update (given that the mirred action is
 extended to take a flag to steer this) so applications like a
 dhcp daemon could use a raw socket on the FIFO device and thus
 benefit from the bind namespace but still access the original
 interface the packet was received from using pktinfo cmsg.

 Maybe silly but the best I could come up with :-)

Doesn't sound too silly (actually quite nifty in my opinion),
but I'm not sure I understand correctly, binding to ifb doesn't
work since it changes both skb-dev and skb-input_dev.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3] [VLAN]: Update iif when receiving via VLAN device

2006-06-26 Thread Patrick McHardy

Patrick McHardy wrote:
 All my testing (quite a lot) in this area so far suggested that queueing
 at ingress doesn't work and is actually counter-productive. It gets
 worse with increasing signal propagation delays (signal in this case
 means congestion indications). I actually wish I never introduced
 the stupid IMQ device, it raises false hope and a lot of people seem
 to fall for it.

Small clarification: with queueing at ingress I mean queueing after
the bottleneck. It might work if you introduce a virtual bottleneck,
but in that case in my experience the bandwidth limit you have to use
is dependant on the number of TCP flows, so usually you can't specify
a limit that will always work, and its wasteful if there is a smaller
number of TCP flows. The reason why you would do it, limiting or
prioritizing incoming traffic on the _real_ bottleneck, is not
achievable of course. Also noteworthy is that policers don't behave
any better, and any other schemes I've tried (I also experienced a
lot of with TCP rate control) have similar or related problems.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

69 matches

Mail list logo