From: jamal [EMAIL PROTECTED]
Date: Fri, 06 Jul 2007 10:39:15 -0400
If the issue is usability of listing 1024 netdevices, i can think of
many ways to resolve it.
I would agree with this if there were a reason for it, it's totally
unnecessary complication as far as I can see.
These virtual
On Fri, 2007-07-06 at 10:39 -0400, jamal wrote:
The first thing that crossed my mind was if you want to select a
destination port based on a destination MAC you are talking about a
switch/bridge. You bring up the issue of a huge number of virtual NICs
if you wanted arbitrary guests which is a
On Tue, 2007-07-03 at 22:20 -0400, jamal wrote:
On Tue, 2007-03-07 at 14:24 -0700, David Miller wrote:
[.. some useful stuff here deleted ..]
That's why you have to copy into a purpose-built set of memory
that is composed of pages that _ONLY_ contain TX packet buffers
and nothing else.
On Fri, 2007-06-07 at 17:32 +1000, Rusty Russell wrote:
[..some good stuff deleted here ..]
Hope that adds something,
It does - thanks.
I think i was letting my experience pollute my thinking earlier when
Dave posted. The copy-avoidance requirement is clear to me[1].
I had another issue
jamal wrote:
If the issue is usability of listing 1024 netdevices, i can think of
many ways to resolve it.
One way we can resolve the listing is with a simple tag to the netdev
struct i could say list netdevices for guest 0-10 etc etc.
This would be a useful feature, not only for
On Sat, 2007-30-06 at 13:33 -0700, David Miller wrote:
It's like twice as fast, since the switch doesn't have to copy
the packet in, switch it, then the destination guest copies it
into its address space.
There is approximately one copy for each hop you go over through these
virtual
On Tue, 2007-03-07 at 14:24 -0700, David Miller wrote:
[.. some useful stuff here deleted ..]
That's why you have to copy into a purpose-built set of memory
that is composed of pages that _ONLY_ contain TX packet buffers
and nothing else.
The cost of going through the switch is too high,
David Miller wrote:
Now I get to pose a problem for everyone, prove to me how useful
this new code is by showing me how it can be used to solve a
recurring problem in virtualized network drivers of which I've
had to code one up recently, see my most recent blog entry at:
It would be great if we could finally get a working e1000
multiqueue patch so work in this area can actually be tested.
I'm actively working on this right now. I'm on vacation next week, but
hopefully I can get something working before I leave OLS and post it.
-PJ
On Fri, 2007-29-06 at 21:35 -0700, David Miller wrote:
Awesome, but let's concentrate on the client since I can actually
implement and test anything we come up with :-)
Ok, you need to clear one premise for me then ;-
You said the model is for the guest/client to hook have a port to the
host
From: jamal [EMAIL PROTECTED]
Date: Sat, 30 Jun 2007 10:52:44 -0400
On Fri, 2007-29-06 at 21:35 -0700, David Miller wrote:
Awesome, but let's concentrate on the client since I can actually
implement and test anything we come up with :-)
Ok, you need to clear one premise for me then ;-
Ok everything is checked into net-2.6.23, thanks everyone.
Dave, thank you for your patience and feedback on this whole process.
Patrick and everyone else, thank you for your feedback and assistance.
I am looking at your posed virtualization question, but I need sleep
since I just remembered
I've changed the topic for you friend - otherwise most people won't follow
(as you've said a few times yourself ;-).
On Thu, 2007-28-06 at 21:20 -0700, David Miller wrote:
Now I get to pose a problem for everyone, prove to me how useful
this new code is by showing me how it can be used to solve
jamal wrote:
On Thu, 2007-28-06 at 21:20 -0700, David Miller wrote:
Each guest gets a unique MAC address. There is a queue per-port
that can fill up.
What all the drivers like this do right now is stop the queue if
any of the per-port queues fill up, and that's why my sunvnet
driver does
On Fri, 2007-29-06 at 13:59 +0200, Patrick McHardy wrote:
I'm guessing that that wouldn't allow to do unicast filtering for
the guests on the real device without hacking the bridge code for
this special case.
For ingress (i guess you could say for egress as well): we can do it as
well today
jamal wrote:
On Fri, 2007-29-06 at 13:59 +0200, Patrick McHardy wrote:
The difference to a real bridge is that the
all addresses are completely known in advance, so it doesn't need
promiscuous mode for learning.
You mean the per-virtual MAC addresses are known in advance, right?
Yes.
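A minimal user-space sketch of the point above: when every guest MAC address is known in advance, the destination port can come from a fixed table, with no learning step and therefore no need for promiscuous mode. The names `static_fdb_entry`, `fdb`, `fdb_lookup` and the example addresses are hypothetical, chosen for illustration only; this is not the bridge code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical static forwarding entry: one guest MAC maps to one port. */
struct static_fdb_entry {
	uint8_t mac[MAC_LEN];
	int port;
};

/* Hypothetical table, configured up front; nothing is learned at runtime. */
static const struct static_fdb_entry fdb[] = {
	{ { 0x02, 0x00, 0x00, 0x00, 0x00, 0x01 }, 0 },
	{ { 0x02, 0x00, 0x00, 0x00, 0x00, 0x02 }, 1 },
};

/* Return the port for a destination MAC, or -1 if the MAC is not listed. */
static int fdb_lookup(const uint8_t *dst)
{
	size_t i;

	for (i = 0; i < sizeof(fdb) / sizeof(fdb[0]); i++)
		if (memcmp(fdb[i].mac, dst, MAC_LEN) == 0)
			return fdb[i].port;
	return -1;
}

/* Usage example: the first configured guest resolves to port 0. */
static void fdb_self_check(void)
{
	const uint8_t guest0[MAC_LEN] = { 0x02, 0x00, 0x00, 0x00, 0x00, 0x01 };

	assert(fdb_lookup(guest0) == 0);
}
```

Because the table is complete, a frame whose destination is not listed can simply be dropped or handed to a default port; which of the two is right is a policy choice the sketch leaves open.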
On Fri, 2007-29-06 at 15:08 +0200, Patrick McHardy wrote:
jamal wrote:
On Fri, 2007-29-06 at 13:59 +0200, Patrick McHardy wrote:
Right, but the current bridging code always uses promiscuous mode
and it's nice to avoid that if possible.
Looking at the code, it
should be easy to avoid though
Patrick McHardy wrote:
Right, but the current bridging code always uses promiscuous mode
and it's nice to avoid that if possible. Looking at the code, it
should be easy to avoid though by disabling learning (and thus
promiscuous mode) and adding unicast filters for all static fdb entries.
I am
Patrick McHardy wrote:
Ben Greear wrote:
Could someone give a quick example of when I am wrong and promisc mode
would allow a NIC to receive a significant number of packets not really
destined for it?
In a switched environment it won't have a big effect, I agree.
It might help avoid
Ben Greear wrote:
Patrick McHardy wrote:
Right, but the current bridging code always uses promiscuous mode
and it's nice to avoid that if possible. Looking at the code, it
should be easy to avoid though by disabling learning (and thus
promiscuous mode) and adding unicast filters for all static
This conversation begins to go into a pointless direction already, as
I feared it would.
Nobody is going to configure bridges, classification, tc, and all of
this other crap just for a simple virtualized guest networking device.
It's a confined and well defined case that doesn't need any of
From: jamal [EMAIL PROTECTED]
Date: Fri, 29 Jun 2007 21:30:53 -0400
On Fri, 2007-29-06 at 14:31 -0700, David Miller wrote:
Maybe for the control node switch, yes, but not for the guest network
devices.
And that is precisely what i was talking about - and i am sure thats how
the
PJ Waskiewicz wrote:
+
static int __init prio_module_init(void)
{
- return register_qdisc(&prio_qdisc_ops);
+ int err;
+ err = register_qdisc(&prio_qdisc_ops);
+ if (!err)
+ err = register_qdisc(&rr_qdisc_ops);
+ return err;
}
That's still broken.
PJ Waskiewicz wrote:
+
static int __init prio_module_init(void) {
- return register_qdisc(&prio_qdisc_ops);
+ int err;
+ err = register_qdisc(&prio_qdisc_ops);
+ if (!err)
+ err = register_qdisc(&rr_qdisc_ops);
+ return err;
}
That's still broken.
Waskiewicz Jr, Peter P wrote:
PJ Waskiewicz wrote:
+
static int __init prio_module_init(void) {
-return register_qdisc(&prio_qdisc_ops);
+int err;
+err = register_qdisc(&prio_qdisc_ops);
+if (!err)
+err = register_qdisc(&rr_qdisc_ops);
+return err;
}
Thats
Its not error handling. You do:
err = register qdisc 1
if (err)
return err;
err = register qdisc 2
if (err)
unregister qdisc 1
return err
anyways, I already fixed that and cleaned up prio_classify
the way I suggested. Will send shortly.
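The registration pattern discussed here (register the first qdisc and bail out on failure, then register the second and undo the first if that fails) can be sketched in user space as follows. A failed registration leaves nothing behind, so only the one that succeeded needs rolling back. `fake_qdisc_ops`, `register_fake`, `unregister_fake` and `register_pair` are hypothetical stand-ins for illustration; they are not the kernel's `register_qdisc()` API.

```c
#include <assert.h>

/* Hypothetical stand-in for a qdisc ops structure. */
struct fake_qdisc_ops {
	const char *id;
	int registered;
	int fail;	/* when set, register_fake() fails */
};

/* Pretend registration: returns 0 on success, negative on failure. */
static int register_fake(struct fake_qdisc_ops *ops)
{
	if (ops->fail)
		return -1;
	ops->registered = 1;
	return 0;
}

static void unregister_fake(struct fake_qdisc_ops *ops)
{
	ops->registered = 0;
}

/* Register both; if the second fails, roll back the first. */
static int register_pair(struct fake_qdisc_ops *first,
			 struct fake_qdisc_ops *second)
{
	int err;

	err = register_fake(first);
	if (err)
		return err;	/* nothing was registered, nothing to undo */
	err = register_fake(second);
	if (err)
		unregister_fake(first);
	return err;
}

/* Usage example: a failing second registration leaves neither registered. */
static void register_pair_self_check(void)
{
	struct fake_qdisc_ops prio = { "prio", 0, 0 };
	struct fake_qdisc_ops rr = { "rr", 0, 1 };

	assert(register_pair(&prio, &rr) != 0);
	assert(!prio.registered && !rr.registered);
}
```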
Thanks for fixing; however, the
Patrick McHardy wrote:
PJ Waskiewicz wrote:
+
static int __init prio_module_init(void)
{
-return register_qdisc(&prio_qdisc_ops);
+int err;
+err = register_qdisc(&prio_qdisc_ops);
+if (!err)
+err = register_qdisc(&rr_qdisc_ops);
+return err;
}
Thats
Waskiewicz Jr, Peter P wrote:
Thanks for fixing; however, the current sch_prio doesn't unregister the
qdisc if register_qdisc() on prio fails, or does that happen implicitly
because the module will probably unload?
It failed, there's nothing to unregister. But when you register two
qdiscs and
PJ Waskiewicz wrote:
+#ifdef CONFIG_NET_SCH_MULTIQUEUE
+ if (q->mq)
+ skb->queue_mapping =
+ q->prio2band[band&TC_PRIO_MAX];
+ else
+ skb->queue_mapping = 0;
Waskiewicz Jr, Peter P wrote:
PJ Waskiewicz wrote:
+#ifdef CONFIG_NET_SCH_MULTIQUEUE
+if (q->mq)
+skb->queue_mapping =
+q->prio2band[band&TC_PRIO_MAX];
+else
+
Absolutely not. First of all, it's perfectly valid to use
non-multiqueue qdiscs on multiqueue devices. Secondly, it's
only the root qdisc that has to know about multiqueue since
that one controls the child qdiscs.
Think about it, it makes absolutely no sense to have the
child qdisc even
Waskiewicz Jr, Peter P wrote:
[...]
The only reasonable thing it can do is not care about
multiqueue and just dequeue as usual. In fact I think it
should be an error to configure multiqueue on a non-root qdisc.
Ack. This is a thought process that trips me up from time to time...I
see
Waskiewicz Jr, Peter P wrote:
[...]
The only reasonable thing it can do is not care about
multiqueue and
just dequeue as usual. In fact I think it should be an error to
configure multiqueue on a non-root qdisc.
Ack. This is a thought process that trips me up from time
to
From: Patrick McHardy [EMAIL PROTECTED]
Date: Thu, 28 Jun 2007 21:24:37 +0200
Waskiewicz Jr, Peter P wrote:
[...]
The only reasonable thing it can do is not care about
multiqueue and just dequeue as usual. In fact I think it
should be an error to configure multiqueue on a non-root qdisc.
enum
{
- TCA_PRIO_UNPSEC,
- TCA_PRIO_TEST,
You misunderstood me. You can work on top of my compat
attribute patches, but the example code should not have to go
in to apply your patch.
Ok. I'll fix my patches.
diff --git a/net/sched/Kconfig b/net/sched/Kconfig index
@@ -70,14 +72,28 @@ prio_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr)
#endif
if (TC_H_MAJ(band))
band = 0;
+ if (q->mq)
+ skb->queue_mapping =
+
Waskiewicz Jr, Peter P wrote:
@@ -70,14 +72,28 @@ prio_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr)
#endif
if (TC_H_MAJ(band))
band = 0;
+ if (q->mq)
+skb->queue_mapping =
+
That's not necessary. I just thought you could add one exit point:
...
out:
skb->queue_mapping = q->mq ? band : 0;
return q->queues[band];
}
But if that doesn't work don't bother ..
Unfortunately it won't, given how band might be used like this to select
the queue:
return
PJ Waskiewicz wrote:
diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 09808b7..ec3a9a5 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -103,8 +103,8 @@ struct tc_prio_qopt
enum
{
- TCA_PRIO_UNPSEC,
- TCA_PRIO_TEST,
You
PJ Waskiewicz wrote:
+ /* If we're multiqueue, make sure the number of incoming bands
+ * matches the number of queues on the device we're associating with.
+ */
+ if (tb[TCA_PRIO_MQ - 1])
+ q->mq = *(unsigned char *)RTA_DATA(tb[TCA_PRIO_MQ - 1]);
+
+ if
Updated: This patch applies on top of Patrick McHardy's RTNETLINK
nested compat attribute patches. These are required to preserve
ABI for iproute2 when working with the multiqueue qdiscs.
Add the new sch_rr qdisc for multiqueue network device support.
Allow sch_prio and sch_rr to be compiled
#include <linux/module.h>
@@ -40,9 +42,13 @@
struct prio_sched_data
{
int bands;
+#ifdef CONFIG_NET_SCH_RR
+ int curband; /* for round-robin */
+#endif
struct tcf_proto *filter_list;
u8 prio2band[TC_PRIO_MAX+1];
struct Qdisc *queues[TCQ_PRIO_BANDS];
+
Waskiewicz Jr, Peter P wrote:
#include <linux/module.h>
@@ -40,9 +42,13 @@
struct prio_sched_data
{
int bands;
+#ifdef CONFIG_NET_SCH_RR
+int curband; /* for round-robin */
+#endif
struct tcf_proto *filter_list;
u8 prio2band[TC_PRIO_MAX+1];
struct Qdisc
Patrick McHardy wrote:
void skb_set_queue_mapping(struct sk_buff *skb, unsigned int queue)
{
#ifdef CONFIG_NET_SCH_MULTIQUEUE
skb->queue_mapping = queue;
#else
skb->queue_mapping = 0;
#endif
}
Maybe even use it everywhere and guard skb-queue_mapping by
an #ifdef, on 32 bit it does
Patrick McHardy wrote:
Waskiewicz Jr, Peter P wrote:
Thought about this more last night and this morning. As far as I can
tell, I still need this. If the qdisc gets loaded with multiqueue
turned on, I can just use the value of band to assign
skb->queue_mapping. But if the qdisc is loaded
Patrick McHardy wrote:
Waskiewicz Jr, Peter P wrote:
Thought about this more last night and this morning. As
far as I can
tell, I still need this. If the qdisc gets loaded with multiqueue
turned on, I can just use the value of band to assign
skb->queue_mapping. But if the qdisc is
Add the new sch_rr qdisc for multiqueue network device support.
Allow sch_prio to be compiled with or without multiqueue hardware
support.
sch_rr is part of sch_prio, and is referenced from MODULE_ALIAS. This
was done since sch_prio and sch_rr only differ in their dequeue routine.
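To illustrate how two schedulers can share everything except the dequeue routine, here is a toy user-space sketch. Each band is reduced to a packet counter, and `toy_sched`, `toy_prio_dequeue`, `toy_rr_dequeue` and `NBANDS` are hypothetical names; the real sch_prio and sch_rr code operates on child qdiscs and is not reproduced here.

```c
#include <assert.h>

#define NBANDS 3	/* hypothetical band count for this sketch */

/* Toy scheduler state shared by both dequeue variants. */
struct toy_sched {
	int bands;
	int curband;		/* next band to try; used by round-robin only */
	int backlog[NBANDS];	/* packets waiting in each band */
};

/* Strict priority: always serve the lowest-numbered non-empty band. */
static int toy_prio_dequeue(struct toy_sched *q)
{
	int band;

	for (band = 0; band < q->bands; band++) {
		if (q->backlog[band] > 0) {
			q->backlog[band]--;
			return band;
		}
	}
	return -1;	/* nothing queued */
}

/* Round-robin: start at curband and advance past each band tried. */
static int toy_rr_dequeue(struct toy_sched *q)
{
	int i;

	for (i = 0; i < q->bands; i++) {
		int band = q->curband;

		q->curband = (q->curband + 1) % q->bands;
		if (q->backlog[band] > 0) {
			q->backlog[band]--;
			return band;
		}
	}
	return -1;	/* nothing queued */
}

/* Usage example: same backlog, different service order. */
static void toy_sched_self_check(void)
{
	struct toy_sched p = { NBANDS, 0, { 2, 1, 0 } };
	struct toy_sched r = { NBANDS, 0, { 2, 1, 0 } };

	/* Priority drains band 0 first; round-robin alternates. */
	assert(toy_prio_dequeue(&p) == 0 && toy_prio_dequeue(&p) == 0);
	assert(toy_rr_dequeue(&r) == 0 && toy_rr_dequeue(&r) == 1);
}
```

With strict priority a busy band 0 can starve the others, while round-robin gives every non-empty band a turn, which is presumably why a separate dequeue routine is enough to tell the two apart.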
PJ Waskiewicz wrote:
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 475df84..ca0b352 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -102,8 +102,16 @@ config NET_SCH_ATM
To compile this code as a module, choose M here: the
module will be called sch_atm.
The dependencies seem to be very confused. SCHED_PRIO does
not depend on anything new, SCH_RR also doesn't depend on
anything. SCH_PRIO_MQ and SCH_RR_MQ (which is missing) depend
on SCH_PRIO/SCH_RR. A single NET_SCH_MULTIQUEUE option seems
better than adding one per scheduler though.
I
Waskiewicz Jr, Peter P wrote:
The dependencies seem to be very confused. SCHED_PRIO does
not depend on anything new, SCH_RR also doesn't depend on
anything. SCH_PRIO_MQ and SCH_RR_MQ (which is missing) depend
on SCH_PRIO/SCH_RR. A single NET_SCH_MULTIQUEUE option seems
better than adding one