date:20060720

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-20 Thread Evgeniy Polyakov

 Hello!

Hello, Alexey.

[ Sorry for the long delay; there are some problems with the mail servers,
so I cannot access them remotely and am creating this mail by hand. Hopefully
the thread will not be broken. ]

 There is no socket spinlock anymore.
 Above lock is skb_queue lock which is held inside
 skb_dequeue/skb_queue_tail calls.

 Lock is named differently, but it is still here.
 BTW for UDP even the name is the same.

There is no bh processing, that lock is needed for 4 operations when skb
is enqueued/dequeued.

And if I changed skbs to different structures, there would be no locks
at all - it is extremely lightweight and cannot be compared with the socket
lock at all.

No bh/irq processing at all, natural speed management - that is main idea
behind netchannels.
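
As an illustration of the point above, here is a minimal userspace sketch of a queue whose lock is held only for the few pointer updates of an enqueue or dequeue. The names `pkt`, `pkt_queue`, `pkt_queue_tail` and `pkt_dequeue` are hypothetical stand-ins and not the kernel's sk_buff API; the lock is a plain C11 `atomic_flag` spinlock chosen only to keep the example self-contained.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb: only the list link matters here. */
struct pkt {
	struct pkt *next;
	int id;
};

/* Hypothetical stand-in for a packet queue that carries its own lock. */
struct pkt_queue {
	atomic_flag lock;
	struct pkt *head;
	struct pkt *tail;
};

static void pkt_queue_init(struct pkt_queue *q)
{
	atomic_flag_clear(&q->lock);
	q->head = q->tail = NULL;
}

static void q_lock(struct pkt_queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		; /* spin: the critical sections below are only a few stores long */
}

static void q_unlock(struct pkt_queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Append at the tail; the lock covers only the pointer updates. */
static void pkt_queue_tail(struct pkt_queue *q, struct pkt *p)
{
	assert(p != NULL);
	p->next = NULL;
	q_lock(q);
	if (q->tail)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
	q_unlock(q);
}

/* Remove from the head; returns NULL when the queue is empty. */
static struct pkt *pkt_dequeue(struct pkt_queue *q)
{
	struct pkt *p;

	q_lock(q);
	p = q->head;
	if (p) {
		q->head = p->next;
		if (!q->head)
			q->tail = NULL;
	}
	q_unlock(q);
	return p;
}
```

Protocol processing itself would happen outside these two functions, so nothing else contends for the lock in this sketch.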

  Equivalent of socket user lock.
 
 No, it is an equivalent for hash lock in socket table.

OK. But you have to introduce socket mutex somewhere in any case.
Even in ATCP.

Actually not - VJ's idea is to have only one consumer and one provider,
so no locks needed, but I agree, in general case it is needed, but _only_
to protect against several netchannel userspace consumers.
There is no BH protocol processing at all, so there is no need to
protect against someone who will add data while you are processing your own
chunk.
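
The "one consumer and one provider, so no locks needed" idea can be sketched as a single-producer/single-consumer ring. This is only an illustration under stated assumptions, not Van Jacobson's or the netchannel code: `spsc_ring`, `ring_put`, `ring_get` and `RING_SIZE` are hypothetical names, and C11 atomics stand in for kernel memory barriers. Each index is written by exactly one side, which is why no lock is required; a second consumer would break that property, which is the case where a mutex becomes necessary.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8u /* hypothetical capacity; must be a power of two */

struct spsc_ring {
	void *slot[RING_SIZE];
	atomic_uint head; /* next slot to fill; written only by the producer */
	atomic_uint tail; /* next slot to drain; written only by the consumer */
};

static void ring_init(struct spsc_ring *r)
{
	atomic_init(&r->head, 0);
	atomic_init(&r->tail, 0);
}

/* Producer side: returns false when the ring is full. */
static bool ring_put(struct spsc_ring *r, void *item)
{
	unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
	unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head - tail == RING_SIZE)
		return false;
	r->slot[head & (RING_SIZE - 1)] = item;
	/* Publish the slot contents before the new head becomes visible. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return true;
}

/* Consumer side: returns NULL when the ring is empty. */
static void *ring_get(struct spsc_ring *r)
{
	unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);
	void *item;

	if (head == tail)
		return NULL;
	item = r->slot[tail & (RING_SIZE - 1)];
	/* Hand the slot back to the producer only after reading it. */
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return item;
}
```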

 Just an example - tcp_established() can be called with bh disabled
 under the socket lock.

 When we have a process context in hands, it is not.

Did you ask yourself why we do not put all the packets into the
backlog/prequeue and just wait until the user reads the data? It would be
100% equivalent to netchannels.

How many hacks just to be a bit closer to userspace processing,
implemented in netchannels!

The answer is simple: because we cannot wait. If the user delays for
200 msec, wait for the connection to collapse due to retransmissions. If a
segment is out of order, immediate attention is required. Any scheme which
tries to wait for the user unconditionally at least has to run a watchdog
timer, which fires before the sender senses the gap.

If userspace is scheduled away for too long, it is bloody wrong to
ack data that cannot be read because the system is busy. It just
moves the work from one end to the other: ack now and stop when the
queue is full, or postpone the ack generation until the segment is
really being read.

And this is what we have done for ages. Grep for VJ in sources. :-)
netchannels have nothing to do with it, it is a much older idea.

And it was Van who decided to move away from BH/irq processing.
It was a slow and somewhat painful way (how many hacks with prequeue, with
direct processing; it is enough just to look at how the TCP socket lock is
taken in different contexts :)

 In that case one copies the whole data into userspace, so access for
 20 bytes of headers completely does not matter.

For short packets it matters.

But I said not this. I said it looks _worse_. A bit, but worse.

At least for 80 bytes it does not matter at all.
And it is very likely that data is misaligned, so half of the
header will be in a cache line. And socket code has the same problem -
skb->cb can be flushed away, and tcp_recvmsg() needs to get it again.
And actually I never understood nanooptimisation behind more serious
problems (i.e. one cache line vs. 50MB/sec speed).

 Hmm, for 80 bytes sized packets win was about 2.5 times. Could you
 please show me lines inside existing code, which should be commented,
 so I got 50Mbyte/sec for that?

If I knew it would be done. :-)

Actually, it is the action, which I would expect. This, but
not dropping all the TCP stack.

I tried to use the existing one, and I had a speed and CPU usage win, but its
magnitude was not what I expected, so I started a userspace network stack
implementation. It succeeded, and there are _very_ major
optimisations over the existing code when processing is fully moved into
userspace, but there are also big problems, like one syscall per ack,
so I decided to use that stack as a base for in-kernel process-context
protocol processing, and I succeeded. Probably I will return to the userspace
network stack idea when I complete zero-copy networking support.

 I showed there, that using existing stack it is impossible

Please, understand, it is such statements that compromise your work.
If it is impossible then it is not interesting.

Do not mix soft and warm - I just post the facts, that the netchannel TCP
implementation works (sometimes much) faster.
It is the socket code that probably has some misoptimisations, and if it is
impossible to fix them (well, at least it is very hard), then it is not
interesting.

I definitely do not say, that it must be removed/replaced/anything - it
works perfectly ok, but it is possible to have better performance by
changing architecture, and it was done.

Alexey

-- 
Evgeniy Polyakov
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-20 Thread Evgeniy Polyakov

Hello.

[ Sorry for the long delay; there are some problems with the mail servers,
so I cannot access them remotely and am creating this mail by hand. Hopefully
the thread will not be broken. ]

  Your description makes it sound as if you would take a huge leap,
  changing all in-kernel code _and_ the userspace interface in a single
  patch.  Am I wrong?  Or am I right and would it make sense to extract
  small incremental steps from your patch similar to those Van did in
  his non-published work?
 
 My first implementation used existing kernel code and showed small
 performance win - there was binding of the socket to netchannel and all
 protocol processing was moved into process context.

Iirc, Van didn't show performance numbers but rather cpu utilization
numbers.  And those went down significantly without changing the
userspace interface.

At least the LCA presentation graphs show exactly the opposite:
performance numbers without CPU utilization (unlike his tables).

Did you look at cpu utilization as well?  If you did and your numbers
are worse than Vans, he either did something smarter than you or
forged his numbers (quite unlikely).

Interesting sentence from a political correctness point of view :)

I did both CPU and speed measurements when used socket code [1], 
and both of them showed small gain, but I only tested 1gbit setup, so
they can not be compared with Van's.
But even with 1gb I was not satisfied with them, so I started different
implementation, which I described in my e-mail to Alexey.

1. speed/cpu measurements of one of the netchannels implementation which
used socket code.
http://thread.gmane.org/gmane.linux.network/36609/focus=36614

-- 
Evgeniy Polyakov

Re: [PATCH] drivers/net/wireless/d80211: Check configuration type in hw->config_interface.

2006-07-20 Thread Jiri Benc

On Wed, 19 Jul 2006 22:26:52 +0200, Jean-Mickael Guerin wrote:
 This patch prevents a NULL pointer dereferencing in AP mode:
 ieee80211_if_config will set conf->bssid only if device is of type STA
 or IBSS.
 I see it using following commands right after module loading (with rt61)
 # iwconfig wlan0 mode Master
 # ifconfig wlan0 up

The patch seems to fix the problem at a wrong place. rt2x00 has broken
add_interface handler - it allows adding of AP interface even though the
driver doesn't support AP mode. It is add_interface callback that should
be fixed in rt2x00.

The check in the patch most likely won't be needed even after AP mode
support is added to rt2x00 - the driver needs to handle AP mode
differently so config_interface callback will be rewritten anyway.

adm8211 driver doesn't have this problem and doesn't need to be
modified.

Thanks,

 Jiri

-- 
Jiri Benc
SUSE Labs

Re: [PATCH] drivers/net/wireless/d80211: Check configuration type in hw->config_interface.

2006-07-20 Thread Jiri Benc

On Wed, 19 Jul 2006 18:07:05 -0700, Michael Wu wrote:
 Why is that? Isn't there a BSSID in AP mode too? Perhaps it is calling 
 config_interface before setting the BSSID?

The bssid field in ieee80211_if_conf structure is not set in AP mode.
There is no need for that - you already have a MAC address of the AP
interface (from add_interface callback). That's your BSSID.

 adm8211 doesn't support AP mode yet, but it's good to know this crash won't 
 occur when it does. :)

The crash won't occur even without the patch - you will need to do
completely different things in adm8211_config_interface for AP mode than
for STA or IBSS mode and you will put some switch there anyway. No
reason for doing it now and bloating the code with a check for condition
that cannot happen.
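
The kind of switch mentioned above might look like the following sketch. It is not d80211 or adm8211 code: `if_type`, `if_conf` and `pick_bssid` are hypothetical names used only to show that STA/IBSS take the BSSID from the configuration, while AP mode uses the interface's own MAC address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interface types; the real stack defines its own constants. */
enum if_type { IF_TYPE_STA, IF_TYPE_IBSS, IF_TYPE_AP };

/* Hypothetical, trimmed-down interface configuration. */
struct if_conf {
	enum if_type type;
	const unsigned char *bssid; /* not set (NULL) in AP mode */
};

/*
 * Return the BSSID a driver would program for this interface.
 * own_mac is the address the driver received when the interface was added.
 */
static const unsigned char *pick_bssid(const struct if_conf *conf,
				       const unsigned char *own_mac)
{
	assert(conf != NULL);
	switch (conf->type) {
	case IF_TYPE_STA:
	case IF_TYPE_IBSS:
		return conf->bssid;
	case IF_TYPE_AP:
		return own_mac; /* in AP mode the interface MAC is the BSSID */
	}
	return NULL;
}
```

With such a switch the AP branch never touches `conf->bssid`, so no separate NULL check is needed in the STA/IBSS path.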

Thanks,

 Jiri

-- 
Jiri Benc
SUSE Labs

Re: [PATCH] Qlogic qla3xxx driver v2.02.00-k36 for upstream inclusion.

2006-07-20 Thread Michael Tokarev

By the way, should it work with ISP4010 controllers?
Those expose network interface card subdevice too,
but aren't listed in pci_device_table of the driver,
and after adding the device ID to the driver, it still
does not quite work (I tried, just out of curiosity) -
the NIC on ISP4010 is - it seems - close but not exactly
the same as the driver expects.

Thanks.

/mjt

[PATCH][NET] ULi526x - driver cleanups

2006-07-20 Thread Henne

From: Henrik Kretzschmar [EMAIL PROTECTED]

Some little cleanups for ULI-TULIP-driver:

pci_module_init() conversion to pci_register_driver()
remove rc, an unneeded variable, from uli526x_init_module()
let the debug macros use correct loglevels
add a loglevel to a printk
make some code look more like CodingStyle

Signed-off-by: Henrik Kretzschmar [EMAIL PROTECTED]

---

--- linux-2.6.18-rc2/drivers/net/tulip/uli526x.c	2006-07-18 13:37:09.0 +0200
+++ linux/drivers/net/tulip/uli526x.c	2006-07-20 11:43:07.0 +0200
@@ -82,9 +82,9 @@
 #define ULI526X_TX_TIMEOUT ((16*HZ)/2)	/* tx packet time-out time 8 s */
 #define ULI526X_TX_KICK (4*HZ/2)	/* tx packet Kick-out time 2 s */
 
-#define ULI526X_DBUG(dbug_now, msg, value) if (uli526x_debug || (dbug_now)) printk(KERN_ERR DRV_NAME ": %s %lx\n", (msg), (long) (value))
+#define ULI526X_DBUG(dbug_now, msg, value) if (uli526x_debug || (dbug_now)) printk(KERN_DEBUG DRV_NAME ": %s %lx\n", (msg), (long) (value))
 
-#define SHOW_MEDIA_TYPE(mode) printk(KERN_ERR DRV_NAME ": Change Speed to %sMhz %s duplex\n",mode & 1 ?"100":"10", mode & 4 ? "full":"half");
+#define SHOW_MEDIA_TYPE(mode) printk(KERN_NOTICE DRV_NAME ": Change Speed to %sMhz %s duplex\n",mode & 1 ?"100":"10", mode & 4 ? "full":"half");
 
 
 /* CR9 definition: SROM/MII */
@@ -373,7 +373,8 @@
 	if (err)
 		goto err_out_res;
 
-	printk(KERN_INFO "%s: ULi M%04lx at pci%s,",dev->name,ent->driver_data >> 16,pci_name(pdev));
+	printk(KERN_INFO "%s: ULi M%04lx at pci%s,",dev->name,
+	       ent->driver_data >> 16,pci_name(pdev));
 
 	for (i = 0; i < 6; i++)
 		printk("%c%02x", i ? ':' : ' ', dev->dev_addr[i]);
@@ -1027,7 +1028,7 @@
 		if ( time_after(jiffies, dev->trans_start + ULI526X_TX_TIMEOUT) ) {
 			db->reset_TXtimeout++;
 			db->wait_reset = 1;
-			printk( "%s: Tx timeout - resetting\n",
+			printk(KERN_ERR "%s: Tx timeout - resetting\n",
 			       dev->name);
 		}
 	}
@@ -1671,18 +1672,17 @@
 
 
 static struct pci_device_id uli526x_pci_tbl[] = {
-	{ 0x10B9, 0x5261, PCI_ANY_ID, PCI_ANY_ID, 0, 0, PCI_ULI5261_ID },
-	{ 0x10B9, 0x5263, PCI_ANY_ID, PCI_ANY_ID, 0, 0, PCI_ULI5263_ID },
-	{ 0, }
+	{0x10B9, 0x5261, PCI_ANY_ID, PCI_ANY_ID, 0, 0, PCI_ULI5261_ID},
+	{0x10B9, 0x5263, PCI_ANY_ID, PCI_ANY_ID, 0, 0, PCI_ULI5263_ID},
+	{}
 };
 MODULE_DEVICE_TABLE(pci, uli526x_pci_tbl);
 
-
 static struct pci_driver uli526x_driver = {
-	.name		= "uli526x",
-	.id_table	= uli526x_pci_tbl,
-	.probe		= uli526x_init_one,
-	.remove		= __devexit_p(uli526x_remove_one),
+	.name = "uli526x",
+	.id_table = uli526x_pci_tbl,
+	.probe = uli526x_init_one,
+	.remove = __devexit_p(uli526x_remove_one),
 };
 
 MODULE_AUTHOR("Peer Chen, [EMAIL PROTECTED]");
@@ -1702,7 +1702,6 @@
 
 static int __init uli526x_init_module(void)
 {
-	int rc;
 
 	printk(version);
 	printed_version = 1;
@@ -1714,25 +1713,21 @@
 	if (cr6set)
 		uli526x_cr6_user_set = cr6set;
 
-	switch(mode) {
+	switch (mode) {
 	case ULI526X_10MHF:
 	case ULI526X_100MHF:
 	case ULI526X_10MFD:
 	case ULI526X_100MFD:
 		uli526x_media_mode = mode;
 		break;
-	default:uli526x_media_mode = ULI526X_AUTO;
+	default:
+		uli526x_media_mode = ULI526X_AUTO;
 		break;
 	}
 
-	rc = pci_module_init(&uli526x_driver);
-	if (rc < 0)
-		return rc;
-
-	return 0;
+	return pci_register_driver(&uli526x_driver);
 }
 
-
 /*
  * Description:
  * when user used rmmod to delete module, system invoked clean_module()



[RFC/PATCH][Bonding]: keep slave state when admin down

2006-07-20 Thread jamal


When a bonding netdevice is admin-ed down, it loses its slave
attributes (set via ifenslave). This is not consistent with the
behavior of other netdevices (for example, a qdisc attached to a netdevice
or an attached IP address doesn't disappear).
The included patch fixes this. I've tested by ifenslaving, downing the
bond, checking /proc and making sure it still has the slaves, up-ing the
bond and making sure things continue to work.

Jay/Bonding folks if you are ok with it, just ACK it or include it in
your tree etc. Otherwise we can discuss. This is against linus tree.

cheers,
jamal
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 8b95123..df319be 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3420,7 +3420,6 @@ static int bond_close(struct net_device 
 
 	write_lock_bh(&bond->lock);
 
-	bond_mc_list_destroy(bond);
 
 	/* signal timers not to re-arm */
 	bond->kill_timers = 1;
@@ -3451,8 +3450,6 @@ static int bond_close(struct net_device 
 		break;
 	}
 
-	/* Release the bonded slaves */
-	bond_release_all(bond_dev);
 
 	if ((bond->params.mode == BOND_MODE_TLB) ||
 	    (bond->params.mode == BOND_MODE_ALB)) {
@@ -4237,6 +4234,9 @@ static void bond_free_all(void)
 	list_for_each_entry_safe(bond, nxt, &bond_dev_list, bond_list) {
 		struct net_device *bond_dev = bond->dev;
 
+		bond_mc_list_destroy(bond);
+		/* Release the bonded slaves */
+		bond_release_all(bond_dev);
 		unregister_netdevice(bond_dev);
 		bond_deinit(bond_dev);
 	}

Oops in IFB

2006-07-20 Thread Nicolas DICHTEL


Hi,

When there is no memory left for creating all the IFB devices (requested
by the user), an oops happens on the system.
Please find enclosed a patch to solve this.


Regards,
Nicolas


[IFB] After ifb_init_one() failed, i is increased. Decrease
it before entering in the loop for freeing the other ifb devices.

Signed-off-by: Nicolas Dichtel [EMAIL PROTECTED]
--- a/drivers/net/ifb.c	2006-07-20 15:16:31.923529050 +0200
+++ b/drivers/net/ifb.c	2006-07-20 15:17:36.370188249 +0200
@@ -271,6 +271,7 @@
 	for (i = 0; i < numifbs && !err; i++)
 		err = ifb_init_one(i);
 	if (err) {
+		i--;
 		while (--i >= 0)
 			ifb_free_one(i);
 			ifb_free_one(i);
 	}
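
The off-by-one that this patch addresses can be reproduced in a small userspace sketch. Everything here is a hypothetical stand-in for the driver (`init_one`, `free_one`, `fail_at`, `bad_frees`): where the kernel would oops on freeing a device that was never created, the sketch only counts such frees.

```c
#include <assert.h>

#define MAX_DEVS 8

static int dev_alive[MAX_DEVS]; /* 1 while the simulated device exists */
static int fail_at = -1;        /* index whose init fails; -1 means none */
static int bad_frees;           /* frees of a device that was never created */

static int init_one(int i)
{
	if (i == fail_at)
		return -1; /* simulated allocation failure */
	dev_alive[i] = 1;
	return 0;
}

static void free_one(int i)
{
	if (!dev_alive[i])
		bad_frees++; /* in the driver this is the invalid access */
	dev_alive[i] = 0;
}

/* Same loop shape as the patched code; apply_fix toggles the added i--. */
static int init_all(int n, int apply_fix)
{
	int i, err = 0;

	assert(n <= MAX_DEVS);
	for (i = 0; i < n && !err; i++)
		err = init_one(i);
	if (err) {
		/* The for loop already incremented i past the failed index. */
		if (apply_fix)
			i--;
		while (--i >= 0)
			free_one(i);
	}
	return err;
}
```

With `fail_at = 2`, the loop exits with `i == 3`; without the extra decrement the cleanup starts at index 2, which was never created, and with it the cleanup starts at index 1.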

Re: Oops in IFB

2006-07-20 Thread jamal

On Thu, 2006-20-07 at 15:33 +0200, Nicolas DICHTEL wrote:


 [IFB] After ifb_init_one() failed, i is increased. Decrease
 it before entering in the loop for freeing the other ifb devices.
 
 Signed-off-by: Nicolas Dichtel [EMAIL PROTECTED]

Thanks Nicolas.

Acked-by: Jamal Hadi Salim [EMAIL PROTECTED]

cheers,
jamal


Re: [IPROUTE2]: update documentation on mirred and IFB

2006-07-20 Thread jamal

On Thu, 2006-20-07 at 01:59 +0100, Andy Furniss wrote:
 jamal wrote:
  About two more or so to complete these..
  
  cheers,
  jamal
  
  
 +tc qdisc add dev lo eth0 ?
 

Thanks for catching that, Andy. It was an attempt at adding ingress to
qdisc. I will wait for Stephen to swallow the other patches and then fix
this - I have at least two more patches to send in that area. Or you can
get a little gitty and send a patch ;->
BTW, if there are areas in the docs, help etc that need clarification
let me know or fix them and send patches. Or if there are better
examples to give send patches.

cheers,
jamal


Re: Oops in IFB

2006-07-20 Thread jamal

On Thu, 2006-20-07 at 09:40 -0400, jamal wrote:
 On Thu, 2006-20-07 at 15:33 +0200, Nicolas DICHTEL wrote:
 
 
  [IFB] After ifb_init_one() failed, i is increased. Decrease
  it before entering in the loop for freeing the other ifb devices.
  
  Signed-off-by: Nicolas Dichtel [EMAIL PROTECTED]
 
 Thanks Nicolas.
 

BTW, in the name of the LinuxWay(tm) - can you also submit a similar
patch for dummy? It suffers from the same bug.

cheers,
jamal


Re: Oops in IFB

2006-07-20 Thread Nicolas DICHTEL


jamal wrote:

BTW, in the name of the LinuxWay(tm) - can you also submit a similar
patch for dummy? It suffers from the same bug.

No problem, patch is enclosed.

Cheers,
Nicolas

[DUMMY] Avoid an oops when dummy_init_one() failed

Signed-off-by: Nicolas Dichtel [EMAIL PROTECTED]

Re: Oops in IFB

2006-07-20 Thread Nicolas DICHTEL


Sorry, I forgot the patch ;-)

Nicolas

Nicolas DICHTEL wrote:

jamal wrote:

BTW, in the name of the LinuxWay(tm) - can you also submit a similar
patch for dummy? It suffers from the same bug.

No problem, patch is enclosed.

Cheers,
Nicolas

[DUMMY] Avoid an oops when dummy_init_one() failed

Signed-off-by: Nicolas Dichtel [EMAIL PROTECTED]
--- a/drivers/net/dummy.c	2006-07-20 16:19:09.395351558 +0200
+++ b/drivers/net/dummy.c	2006-07-20 16:19:58.802327279 +0200
@@ -132,6 +132,7 @@
 	for (i = 0; i < numdummies && !err; i++)
 		err = dummy_init_one(i);
 	if (err) {
+		i--;
 		while (--i >= 0)
 			dummy_free_one(i);
 	}

[IPV4]: Fix nexthop realm dumping for multipath routes

2006-07-20 Thread Patrick McHardy

[IPV4]: Fix nexthop realm dumping for multipath routes

Routing realms exist per nexthop, but are only returned to userspace for
the first nexthop. This is due to the fact that iproute2 only allows to
set the realm for the first nexthop and the kernel refuses multipath routes
where only a single realm is present.

Dump all realms for multipath routes to enable iproute to correctly display
them.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit c76610a1027809f58840fe65b7abc8704f80dcc8
tree 9651193c156548539845ed0a2bd8af8e51182a00
parent 8e0ae6dc963ce12c8d9264d27509ff551dcb57fa
author Patrick McHardy [EMAIL PROTECTED] Wed, 19 Jul 2006 19:22:24 +0200
committer Patrick McHardy [EMAIL PROTECTED] Wed, 19 Jul 2006 19:22:24 +0200

 net/ipv4/fib_semantics.c |   12 
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 3c45256..1f19cdf 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -963,10 +963,6 @@ fib_dump_info(struct sk_buff *skb, u32 p
 	rtm->rtm_protocol = fi->fib_protocol;
 	if (fi->fib_priority)
 		RTA_PUT(skb, RTA_PRIORITY, 4, &fi->fib_priority);
-#ifdef CONFIG_NET_CLS_ROUTE
-	if (fi->fib_nh[0].nh_tclassid)
-		RTA_PUT(skb, RTA_FLOW, 4, &fi->fib_nh[0].nh_tclassid);
-#endif
 	if (rtnetlink_put_metrics(skb, fi->fib_metrics) < 0)
 		goto rtattr_failure;
 	if (fi->fib_prefsrc)
@@ -976,6 +972,10 @@ #endif
 			RTA_PUT(skb, RTA_GATEWAY, 4, &fi->fib_nh->nh_gw);
 		if (fi->fib_nh->nh_oif)
 			RTA_PUT(skb, RTA_OIF, sizeof(int), &fi->fib_nh->nh_oif);
+#ifdef CONFIG_NET_CLS_ROUTE
+		if (fi->fib_nh[0].nh_tclassid)
+			RTA_PUT(skb, RTA_FLOW, 4, &fi->fib_nh[0].nh_tclassid);
+#endif
 	}
 #ifdef CONFIG_IP_ROUTE_MULTIPATH
 	if (fi->fib_nhs > 1) {
@@ -994,6 +994,10 @@ #ifdef CONFIG_IP_ROUTE_MULTIPATH
 			nhp->rtnh_ifindex = nh->nh_oif;
 			if (nh->nh_gw)
 				RTA_PUT(skb, RTA_GATEWAY, 4, &nh->nh_gw);
+#ifdef CONFIG_NET_CLS_ROUTE
+			if (nh->nh_tclassid)
+				RTA_PUT(skb, RTA_FLOW, 4, &nh->nh_tclassid);
+#endif
 			nhp->rtnh_len = skb->tail - (unsigned char*)nhp;
 		} endfor_nexthops(fi);
 		mp_head->rta_type = RTA_MULTIPATH;

RE: [PATCH] Qlogic qla3xxx driver v2.02.00-k36 for upstream inclusion.

2006-07-20 Thread Ron Mercer

qla3xxx driver  does not support ISP4010.

 -Original Message-
 From: Michael Tokarev [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, July 20, 2006 2:13 AM
 To: Ron Mercer
 Cc: netdev@vger.kernel.org
 Subject: Re: [PATCH] Qlogic qla3xxx driver v2.02.00-k36 for 
 upstream inclusion.
 
 By the way, should it work with ISP4010 controllers?
 Those expose network interface card subdevice too, but 
 aren't listed in pci_device_table of the driver, and after 
 adding the device ID to the driver, it still does not quite 
 work (I tried, just out of curiosity) - the NIC on ISP4010 is 
 - it seems - close but not exactly the same as the driver expects.
 
 Thanks.
 
 /mjt
 

Re: [PATCH] Qlogic qla3xxx driver v2.02.00-k36 for upstream inclusion.

2006-07-20 Thread Andrew Vasquez

On Thu, 20 Jul 2006, Ron Mercer wrote:

 qla3xxx driver  does not support ISP4010.

Exactly...  The qla3xxx driver supports the NIC function only.

  -Original Message-
  From: Michael Tokarev [mailto:[EMAIL PROTECTED] 
  Sent: Thursday, July 20, 2006 2:13 AM
  To: Ron Mercer
  Cc: netdev@vger.kernel.org
  Subject: Re: [PATCH] Qlogic qla3xxx driver v2.02.00-k36 for 
  upstream inclusion.
  
  By the way, should it work with ISP4010 controllers?
  Those expose network interface card subdevice too, but 
  aren't listed in pci_device_table of the driver, and after 
  adding the device ID to the driver, it still does not quite 
  work (I tried, just out of curiosity) - the NIC on ISP4010 is 
  - it seems - close but not exactly the same as the driver expects.

You'll need to use the qla4xxx driver to drive the iSCSI function.

Regards,
Andrew Vasquez

Re: [PATCH] Qlogic qla3xxx driver v2.02.00-k36 for upstream inclusion.

2006-07-20 Thread Michael Tokarev

Andrew Vasquez wrote:
 On Thu, 20 Jul 2006, Ron Mercer wrote:
 
 qla3xxx driver  does not support ISP4010.
 
 Exactly...  The qla3xxx driver supports the NIC function only.

...which is provided by ISP4010 card, as appears on PCI bus:

04:04.0 Ethernet controller: QLogic Corp. QLA3010 Network Adapter (rev 05)
04:04.1 Network controller: QLogic Corp. QLA4010 iSCSI TOE Adapter (rev 05)

(the first (sub)device).  So it *looks* like the card has *both*
a NIC and iSCSI TOE adapter, and the NIC part is pretty much similar
to what qla3xxx driver expects...  That's why my curiosity. ;)

(not that it matters much, just.. curious, really.
Well.  Not exactly.  It'd be nice to compare a NIC w/o Jumbo
frames support (which we have on all machines connected to
the iSCSI segment), with something more.. advanced.  So I
wondered if I can utilize the NIC part of the ISP4010 for
the test.  iSCSI part of the card works significantly slower
than open-iscsi stack on non-jumbo-frames-aware Tigon GigE NIC).

[]
 You'll need to use the qla4xxx driver to drive the iSCSI function.

Yeah, I know.  I posted some results to open-iscsi@ list about a week
ago.  It basically works (the new one, with open-iscsi infrastructure),
but is slooow... ;)

Thanks.

/mjt

[PATCH][NET]: fix dummy initialization

2006-07-20 Thread Nicolas DICHTEL


Same problem and same fix as for IFB.

Regards,
Nicolas


[NET][DUMMY] Avoid an oops when dummy_init_one() failed

Signed-off-by: Nicolas Dichtel [EMAIL PROTECTED]
--- a/drivers/net/dummy.c	2006-07-20 16:19:09.395351558 +0200
+++ b/drivers/net/dummy.c	2006-07-20 16:19:58.802327279 +0200
@@ -132,6 +132,7 @@
 	for (i = 0; i < numdummies && !err; i++)
 		err = dummy_init_one(i);
 	if (err) {
+		i--;
 		while (--i >= 0)
 			dummy_free_one(i);
 	}

Re: Weird TCP SACK problem. in Linux...

2006-07-20 Thread Oumer Teyeb

Hi Alexey, is there anything Linux-specific about the DSACK
implementation that might lead to an increase in the number of
retransmissions but an improvement in download time when
timestamps are not used (and the reverse effect when timestamps are
used: fewer retransmissions but longer download times)? I
couldn't figure it out. Also, is the reordering
response of Linux TCP described anywhere? (It seems dupthreshold is
dynamically adjusted based on the reordering history, but I was not able
to find out how.)


Oumer Teyeb wrote:


Oumer Teyeb wrote:


Hi,

Alexey Kuznetsov wrote:


Condition triggering start of fast retransmit is the same.
The behaviour while retransmit is different. FACKless code
behaves more like NewReno.
 

Ok, that is a good point!! Now at least I can convince myself that the
CDFs for the first retransmissions, showing that SACK leads to earlier
retransmissions than no SACK, are not wrong, and I can even convince
myself that this is the real reason behind SACK/FACK's performance
degradation for the case of no timestamps. :-)



Actually, the increase in the number of retransmissions and the
increase in the download time from no SACK -> SACK for the timestamp case
then seem to make sense as well. My reasoning is like this: if there are
timestamps, there is reordering detection, hence the
number of retransmissions is reduced because we avoid the time spent in
fast recovery. When we introduce SACK on top of timestamps, we
enter fast retransmit earlier than in the no-SACK case, as we seem to agree,
and since the timestamps reduce the number of retransmissions once we
are in fast recovery, the retransmissions we see are basically the
first few retransmissions that made us enter the false fast
retransmits. So we have a small increase in the retransmissions and a
small increase in the download times. But when no timestamps are
used, there is no reordering detection, so SACK leads to fewer
retransmissions because it retransmits selectively, but it
doesn't improve the download time because it enters fast retransmit
earlier than the no-SACK case, and here the false fast retransmits are
very costly because they are not detected and lead to window reduction.
Am I making sense? :-) The DSACK case is still puzzling me.


Regards,
Oumer

Re: [PATCH] drivers/net/wireless/d80211: Check configuration type in hw->config_interface.

2006-07-20 Thread Ivo Van Doorn


Hi,


 This patch prevents a NULL pointer dereferencing in AP mode:
 ieee80211_if_config will set conf->bssid only if device is of type STA
 or IBSS.
 I see it using following commands right after module loading (with rt61)
 # iwconfig wlan0 mode Master
 # ifconfig wlan0 up

The patch seems to fix the problem at a wrong place. rt2x00 has broken
add_interface handler - it allows adding of AP interface even though the
driver doesn't support AP mode. It is add_interface callback that should
be fixed in rt2x00.


Well rt2x00 does support AP mode, our latest CVS tree (patches for
wireless-dev are in progress) has even shown a working configuration
for some users.
So add_interface is correct at allowing the AP interface, perhaps some
more steps are required to make it completely work, but it is work in
progress.


The check in the patch most likely won't be needed even after AP mode
support is added to rt2x00 - the driver needs to handle AP mode
differently so config_interface callback will be rewritten anyway.


I'll make a check to see if the bssid is NULL or invalid in the
config_bssid() function, and make sure that in AP mode the MAC is
written as BSSID.

Ivo

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-20 Thread Alexey Kuznetsov

Hello!

Small question first:

 userspace, but also there are big problems, like one syscall per ack,

I do not see redundant syscalls. Is it not expected to send ACKs only
after receiving data, as you said? What is the problem?


Now boring things:

 There is no BH protocol processing at all, so there is no need to
 protect against someone who will add data while you are processing your own
 chunk.

Essential part of socket user lock is the same mutex.

Backlog is actually not a protection, but a thing equivalent to netchannel.
The difference is only that it tries to process something immediately,
when it is safe. You can omit this and push everything to backlog(=netchannel),
which is processed only by syscalls, if you do not care about latency.


 How many hacks just to be a bit closer to userspace processing,
 implemented in netchannels!

Moving processing closer to userspace is not a goal, it is a tool,
which is sometimes useful but generally quite useless.

F.e. in your tests it should not affect performance at all,
end user is just a sink.

As for prequeueing, it is a bright example. Guess why it is useful?
What does it save? Nothing, like netchannel. The answer is: it is just a tool
to generate coarse ACKs in a controlled manner without essential violation
of the protocol. (Well, and to combine checksumming and copy if you do not
like how your card does this.)


 If userspace is scheduled away for too long, it is bloody wrong to
 ack data that cannot be read because the system is busy. It just
 moves the work from one end to the other: ack now and stop when the
 queue is full, or postpone the ack generation until the segment is
 really being read.

... when you get all the segments nicely aligned, blah-blah-blah.

If you do not care about losses-congestion-delays-delacks-whatever,
you have a totally different protocol. Sending window feedback
is only a minor part of tcp. But even these boring tcp intrinsics
are not so important, look at ideal lossless network:

Think what happens, f.e., during a plain file transfer to your notebook.
You get 110MB/sec for a few seconds, then writeback is fired and
the disk I/O subsystem discovers that the disk sustains only 50MB/sec.
If you are unlucky and another application starts, the disk is so congested
that it will take many seconds to make progress with I/O.
During this time the other side will retransmit, because the poor thing
thought the rtt was 100 usecs, and you will never return to 50MB/sec.

You have to _CLOSE_ window in the case of long delay, rather than to forget
to ack. See the difference?
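
To make the difference concrete, here is a toy model, not taken from any real
stack. It assumes a receiver that advertises its free buffer space as the
window, and a sender whose retransmission timer starts at 200 msec and doubles
on every expiry. The function names and the numbers are made up for
illustration.

```python
def advertised_window(buffer_size, bytes_queued):
    """Window a receiver would advertise: the free space left in its buffer."""
    return max(0, buffer_size - bytes_queued)


def retransmits_if_ack_withheld(stall_s, rto_s):
    """Count sender timeouts while the receiver sits on the data for stall_s.

    Toy assumption: the timer starts at rto_s and doubles after each expiry.
    """
    elapsed, count = 0.0, 0
    while elapsed + rto_s <= stall_s:
        elapsed += rto_s   # timer expires, the segment is sent again
        rto_s *= 2         # exponential backoff
        count += 1
    return count


# Receiver keeps acking while the disk is stuck: the window shrinks to zero
# and the sender pauses instead of timing out.
print(advertised_window(65536, 16384))   # 49152
print(advertised_window(65536, 65536))   # 0

# Receiver forgets to ack for 2 seconds instead: timeouts at 0.2, 0.6, 1.4 s.
print(retransmits_if_ack_withheld(2.0, 0.2))   # 3
```

In the first case the sender learns that the receiver is slow; in the second
it can only conclude that the data was lost.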

It is just because actual end user is still far far away.
And this happens all the time, when you relay the results to another
application via pipe, when... Well, the only case where real end user
is user of netchannel is when you receive to a sink.


 But I said not this. I said it looks _worse_. A bit, but worse.
 
 At least for 80 bytes it does not matter at all.

Hello-o, do you hear me? :-)

I am asking: if it looks not much better, but a bit worse,
then what is the real reason for the better performance, unless it is
due to castration of the protocol?

Simplify protocol, move all the processing (even memory copies) to softirq,
leave to user space only feeding pages to copy and you will have unbeatable
performance. Been there, done that, not with TCP of course, but if you do not
care about losses and ACK clocking and send an ACK once per window,
I do not see how it can spoil the situation.


 And actually I never understood nanooptimisation behind more serious
 problems (i.e. one cache line vs. 50MB/sec speed).

You deal with 80-byte packets, as far as I understand.
If you lose one cacheline per packet, it is a big problem.

All that we can change is protocol overhead. Handling the data part
is invariant anyway. You are scared of the complexity of tcp, but
you obviously forget one thing: the cpu is fast.
The code can look very complicated: some crazy hash functions,
damn hairy protocol processing, but if you take care of caches etc.,
all this is dominated by the first look into the packet in eth_type_trans()
or ip_rcv().

BTW, when you deal with a normal data flow, the cache need not be dirtied
by data at all; it can be bypassed.


 works perfectly ok, but it is possible to have better performance by
 changing architecture, and it was done.

That is exactly the point of trouble. From all that I see and you said,
the better performance is obtained not because of the change of architecture,
but in spite of it.

A proof that we can perform better by changing protocol is not required,
it is kinda obvious. The question is how to make existing protocol
to perform better.

I have no idea why your tcp performs better. It can be anything:
absence of slow start, more coarse ACKs, whatever. I believe you were careful
to check those reasons and to do a fair comparison, but then the only remaining
guess is that you saved lots of i-cache by getting rid of a long code path.

And none of those guesses can be attributed to

Re: [PATCH] drivers/net/wireless/d80211: Check configuration type in hw->config_interface.

2006-07-20 Thread Ivo Van Doorn


Hi,


  This patch prevents a NULL pointer dereferencing in AP mode:
  ieee80211_if_config will set conf->bssid only if device is of type STA
  or IBSS.
  I see it using following commands right after module loading (with rt61)
  # iwconfig wlan0 mode Master
  # ifconfig wlan0 up

 The patch seems to fix the problem at a wrong place. rt2x00 has broken
 add_interface handler - it allows adding of AP interface even though the
 driver doesn't support AP mode. It is add_interface callback that should
 be fixed in rt2x00.

Well, rt2x00 does support AP mode; our latest CVS tree (patches for
wireless-dev are in progress) has even shown a working configuration
for some users.
So add_interface is correct in allowing the AP interface. Perhaps some
more steps are required to make it work completely, but it is work in
progress.

 The check in the patch most likely won't be needed even after AP mode
 support is added to rt2x00 - the driver needs to handle AP mode
 differently so config_interface callback will be rewritten anyway.

I'll add a check to see if the bssid is NULL or invalid in the
config_bssid() function, and make sure that in AP mode the MAC is
written as BSSID.
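
The selection logic described here can be sketched roughly as follows. This is
only an illustration of the rule (AP mode uses the device MAC, other modes use
the configured BSSID when it is usable), not rt2x00 code: the function names,
the mode strings and the validity test are assumptions.

```python
def is_valid_bssid(addr):
    """A usable BSSID here: 6 bytes, not all zero, and not a multicast address."""
    return (
        addr is not None
        and len(addr) == 6
        and any(addr)
        and not (addr[0] & 0x01)   # lowest bit of the first octet marks multicast
    )


def bssid_to_write(mode, conf_bssid, mac):
    """Pick the BSSID to program, or None to leave the hardware untouched."""
    if mode == "AP":
        return mac                 # in AP mode the device's own MAC is the BSSID
    if is_valid_bssid(conf_bssid):
        return conf_bssid          # STA/IBSS: use what the stack configured
    return None                    # NULL or invalid: skip instead of dereferencing


mac = bytes.fromhex("001122334455")
print(bssid_to_write("AP", None, mac) == mac)   # True
print(bssid_to_write("STA", None, mac))         # None
print(bssid_to_write("STA", bytes(6), mac))     # None
```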


I won't be able to send this patch separately, so it will become part
of the larger series that I am currently working on. That patch series
throws a lot of code around and has some major changes to the rt2x00
code, so expect very large patches.
The "bssid is NULL" fix was already in my CVS tree,
so only the update to use the MAC as BSSID in master mode has been added.

Ivo
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] smc911x: Re-release spinlock on spurious interrupt

2006-07-20 Thread Peter Korsgaard

 Peter == Peter Korsgaard [EMAIL PROTECTED] writes:

 Peter Hi,
 Peter The smc911x driver forgets to release the spinlock on spurious
 Peter interrupts. This little patch fixes it.

Crap - forgot to sign off :/

Signed-off-by: Peter Korsgaard [EMAIL PROTECTED]

diff -Naur linux-2.6.18-rc2.orig/drivers/net/smc911x.c linux-2.6.18-rc2/drivers/net/smc911x.c
--- linux-2.6.18-rc2.orig/drivers/net/smc911x.c 2006-07-20 10:26:20.0 +0200
+++ linux-2.6.18-rc2/drivers/net/smc911x.c  2006-07-20 17:44:26.0 +0200
@@ -1092,6 +1092,7 @@
 	/* Spurious interrupt check */
 	if ((SMC_GET_IRQ_CFG() & (INT_CFG_IRQ_INT_ | INT_CFG_IRQ_EN_)) !=
 		(INT_CFG_IRQ_INT_ | INT_CFG_IRQ_EN_)) {
+		spin_unlock_irqrestore(&lp->lock, flags);
 		return IRQ_NONE;
 	}

-- 
Bye, Peter Korsgaard

Re: e1000: fix it on thinkpad x60 / eeprom checksum read fails

2006-07-20 Thread Auke Kok


Pavel Machek wrote:

Hi!


e1000 in thinkpad x60 fails without this dirty hack. What to do with
it?

Signed-off-by: Pavel Machek [EMAIL PROTECTED]

NAK, certainly this should never be merged in any tree...

this is a known issue that we're tracking here:

http://sourceforge.net/tracker/index.php?func=detail&aid=1474679&group_id=42302&atid=447449

Summary of the issue:

Lenovo has used certain BIOS versions where ASPD/DSPD was turned on which 
turns the PHY off when no cable is inserted to save power. The e1000 driver 
already turns off this feature but can't do this until the driver is 
loaded. It seems that turning this feature on causes the MAC to give read 
errors.


Lenovo seems to have the feature turned off in their latest BIOS versions, 
we encourage all people to upgrade their BIOS with the latest version from 
Lenovo (available from their website). It seems that for at least 2 people, 
this has fixed the problem.


Inserting a cable obviously might also work :)


Hehe.

We did reproduce the problem initially with the old BIOS (1.01-1.03) on a 
T60 system, but unfortunately the bug disappeared into nothingness.


Bypassing the checksum leaves the NIC in an uncertain state and is not 
recommended.


Okay, perhaps this should be inserted as a comment into the driver,
and printk should be fixed to point at this explanation?

Can't we enable the driver with the bad checksum, then read the _real_
data?


no.

We're working on a solution where we make sure that the PHY is physically 
turned on properly before we read the EEPROM, which would be the proper fix. 
It's completely unacceptable to run when the EEPROM checksum fails - you 
might even be running with the wrong MAC address, or worse. Let's fix this the 
right way instead.
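
For readers unfamiliar with the check being discussed, a minimal sketch of this
kind of EEPROM validation follows. It is not the e1000 source: the constants
(64 sixteen-bit words summing to 0xBABA) and the position of the MAC address
words are assumptions made for the example, and the function names are
invented.

```python
EEPROM_WORDS = 0x40      # assumed: words 0x00..0x3F are covered by the checksum
EEPROM_SUM = 0xBABA      # assumed: expected 16-bit sum of those words


def checksum_ok(words):
    """True if the 16-bit sum of the covered words matches the expected value."""
    if len(words) < EEPROM_WORDS:
        return False
    return (sum(words[:EEPROM_WORDS]) & 0xFFFF) == EEPROM_SUM


def probe(words):
    """Refuse to continue on a bad checksum rather than trust the contents."""
    if not checksum_ok(words):
        raise OSError("EEPROM checksum is not valid")
    return words[0:3]    # MAC address words (illustrative layout)


# A consistent image: the last covered word is chosen so the sum comes out right.
image = [0x1100, 0x3322, 0x5544] + [0] * 60
image.append((EEPROM_SUM - sum(image)) & 0xFFFF)
print(checksum_ok(image))            # True

# A read that returns all ones fails the check, so nothing in it, including
# the MAC address words, should be used.
print(checksum_ok([0xFFFF] * 64))    # False
```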


Auke

PS: adding netdev to the CC...

Re: [RFC/PATCH][Bonding]: keep slave state when admin down

2006-07-20 Thread Jay Vosburgh

jamal [EMAIL PROTECTED] wrote:

When a bonding netdevice is admin-ed down it loses the slave
attributes (set via ifenslave). This is not consistent with other
behavior of netdevices (for example, a qdisc attached to a netdevice doesn't
disappear, nor does an attached IP address, etc).
The included patch fixes this. I've tested by ifenslaving, downing the
bond, checking /proc and making sure it still has the slaves, up-ing the
bond and making sure things continue to work.

Do the initscript and sysconfig packages (/sbin/ifup, ifdown,
that stuff in /etc/sysconfig/network-scripts, etc) do the right thing
with this change?

If memory serves, the initscripts will down the bond during
setup; I'm not sure if there is any dependency on that action releasing
all (possibly preexisting) slaves.

I don't have a big problem with this, but I'm a little concerned
that there may be dependencies on the existing behavior.

-J

---
-Jay Vosburgh, IBM Linux Technology Center, [EMAIL PROTECTED]

Alternate to Ixia's ANVL test harness for tcp compliance.

2006-07-20 Thread Piet Delaney

Hey Gang:

Both at UNM and Bluelane we have used Ixia's ANVL test harness for 
verifying TCP protocol compliance with the RFCs. Recent additions
to Ixia's ANVL GUI provide an Ethereal-like GUI. It looks really slick,
even providing ladder diagrams for quickly viewing the big picture.

Unfortunately Ixia told me they don't have any plans to port the
new GUI to Linux. Instead they are trying to migrate Linux developers,
us, to using Windows. Yeck!

With Ixia migrating away from Linux I was wondering if we should
consider using an alternate test bed for TCP protocol compliance.

Do any of you use tools other than ANVL for RFC compliance while
hacking on the TCP code?

In the unlikely event that there isn't an alternative, is there any
interest in a netdev group effort to motivate Ixia to port their C 
sharp code to Linux? I get the feeling that some of their developers
would like to port the code to Linux.

-piet

-- 
Piet Delaney
BlueLane Teck
W: (408) 200-5256; [EMAIL PROTECTED]
H: (408) 243-8872; [EMAIL PROTECTED]



Re: Alternate to Ixia's ANVL test harness for tcp compliance.

2006-07-20 Thread Jeff Garzik


Piet Delaney wrote:

Do any of you use tools other than ANVL for RFC compliance while
hacking on the TCP code?

In the unlikely event that there isn't an alternative, is there any
interest in a netdev group effort to motivate Ixia to port their C 
sharp code to Linux? I get the feeling that some of their developers
would like to port the code to Linux.



Linux is the most RFC-compliant net stack in the world...  if they don't 
want to support Linux, it's their loss.  :)


Jeff



Re: Alternate to Ixia's ANVL test harness for tcp compliance.

2006-07-20 Thread jamal

On Thu, 2006-20-07 at 12:49 -0700, Piet Delaney wrote:
 Hey Gang:
 
 Both at UNM and Bluelane we have used Ixia's ANVL test harness for 
 verifying TCP protocol compliance with the RFCs. Recent additions
 to Ixia's ANVL GUI provide an Ethereal-like GUI. It looks really slick,
 even providing ladder diagrams for quickly viewing the big picture.
 
 Unfortunately Ixia told me they don't have any plans to port the
 new GUI to Linux. Instead they are trying to migrate Linux developers,
 us, to using Windows. Yeck!
 
 With Ixia migrating away from Linux I was wondering if we should
 consider using an alternate test bed for TCP protocol compliance.
 
 Do any of you use tools other than ANVL for RFC compliance while
 hacking on the TCP code?
 

Talk to the USAGI folks. They have something similar to ANVL called TAHI
that they use to check compliance in IPv4, IPv6 and IPsec. It should be
extendable with some effort to do TCP.

 In the unlikely event that there isn't an alternative, is there any
 interest in a netdev group effort to motivate Ixia to port their C 
 sharp code to Linux? I get the feeling that some of their developers
 would like to port the code to Linux.

Create competition for them  - it is the easiest way to get them
motivated.


cheers,
jamal


Bug in e1000 + semantics of flow control WAS(Re: [e1000]: flow control on by default - good idea really?

2006-07-20 Thread jamal


I went back to this today.  I am typing this from a scribbled sticky
note in a big hurry - but I still believe I took the correct notes.

It does seem there is no distinction between what ethernet advertises
for flow control capability vs what it ends up negotiating with its
partner, i.e. there is some ambiguity. I haven't checked tg3; this is on
e1000 only.

On Fri, 2006-07-07 at 08:28 -0400, jamal wrote:
 On Thu, 2006-06-07 at 23:59 -0700, David Miller wrote:
 
  
  It's autonegotiated, check your kernel message logs when the link
  came up, you'll see this:
  
  tg3: eth0: Flow control is on for TX and on for RX.
  
 
 yikes - yes, this would be it.
 
 I  could be wrong and i will double check:
 I think when the e1000 says via ethtool rx is on - it means that it 
 is _advertising_ flow control as opposed to detecting partner has flow
 control capability.
 Auke, can you also check this as well?

Semantic #1

For example, configure:
ethtool -A eth0 rx off
ethtool -A eth0 tx on

debopolis:~# ethtool -a eth0
Pause parameters for eth0:
Autonegotiate:  on
RX: off
TX: on

The other side was set to do symmetric TX flow control only.

Now enforce autonegotiation:
ethtool -r eth0

ethtool -a eth0
Pause parameters for eth0:
Autonegotiate:  on
RX: off
TX: off

Ok, this is what I expected if this thing (the output of ethtool) was
supposed to store state as opposed to configuration.
But if it is state that is stored, then what about the values shown
before autonegotiation - surely that state is invalid, no?

It would be nice (for debug/usability reasons) to be able to see what I
configured vs what I end up negotiating with the link partner. I think
this may be an ethtool issue, but it could also be a driver issue.

I send 1 Mpps to eth0 and see no flow control packets back. Good. So it
does store state.

#2: The other semantic

debopolis:~# ethtool -A eth0 rx on
debopolis:~# ethtool -A eth0 tx on

debopolis:~# ethtool -a eth0
Pause parameters for eth0:
Autonegotiate:  on
RX: on
TX: on

Other side was set to do symmetric TX flow control only.

Now enforce autonegotiation:
debopolis:~# ethtool -r eth0

Let's see what we came up with:
debopolis:~# ethtool -a eth0
Pause parameters for eth0:
Autonegotiate:  on
RX: on
TX: on

Now that is contradictory to #1 semantic - I would have expected this TX
flow control on the e1000 to be off. 
Unless it is meant to store configuration info and not what you have
negotiated.
 
Try sending traffic to the e1000 at about 1Mpps.
I observe that the e1000 is sending out about 800Kpps of flow control
packets back ;->

So which semantics are correct? I would claim #2 flow control behavior
to be a bug. I just don't have time to chase a fix - hopefully whoever
reads this from the e1000 crowd can fix it.

More importantly, can we have two variables storing the two pieces of
information separately instead of the ambiguity of just one?
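
One way to keep the two pieces of information apart is sketched below. It is
only an illustration: the class and function names are invented, the bit
values are arbitrary, and the resolution rule follows the usual PAUSE/ASM_DIR
priority table for full-duplex autonegotiation rather than any code from e1000
or ethtool.

```python
from dataclasses import dataclass

PAUSE = 0x1     # symmetric pause capability (bit values are arbitrary here)
ASM_DIR = 0x2   # asymmetric direction capability


def advertise(rx, tx):
    """Map the administrator's rx/tx request to advertised capability bits."""
    if rx and tx:
        return PAUSE
    if rx:
        return PAUSE | ASM_DIR
    if tx:
        return ASM_DIR
    return 0


def resolve(local_adv, partner_adv):
    """Return (rx, tx) actually enabled on the local end after negotiation."""
    common = local_adv & partner_adv
    if common & PAUSE:
        return (True, True)
    if common & ASM_DIR:
        if local_adv & PAUSE:
            return (True, False)
        if partner_adv & PAUSE:
            return (False, True)
    return (False, False)


@dataclass
class PauseSettings:
    configured_rx: bool          # what the administrator asked for
    configured_tx: bool
    negotiated_rx: bool = False  # what the link actually ended up with
    negotiated_tx: bool = False

    def renegotiate(self, partner_adv):
        local_adv = advertise(self.configured_rx, self.configured_tx)
        self.negotiated_rx, self.negotiated_tx = resolve(local_adv, partner_adv)


s = PauseSettings(configured_rx=False, configured_tx=True)
s.renegotiate(partner_adv=PAUSE | ASM_DIR)
print(s)
```

With both pairs kept, a query tool could print the configured and the
negotiated values side by side, and renegotiation would never overwrite what
was configured.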


cheers,
jamal


Re: [RFC/PATCH][Bonding]: keep slave state when admin down

2006-07-20 Thread jamal

On Thu, 2006-20-07 at 11:50 -0700, Jay Vosburgh wrote:

 
   Do the initscript and sysconfig packages (/sbin/ifup, ifdown,
 that stuff in /etc/sysconfig/network-scripts, etc) do the right thing
 with this change?
 

I haven't seen issues so far.

   If memory serves, the initscripts will down the bond during
 setup; I'm not sure if there is any dependency on that action releasing
 all (possibly preexisting) slaves.
 

The one I have experimented with has no issues - but you may be right that
some people depend on this behavior at shutdown.

   I don't have a big problem with this, but I'm a little concerned
 that there may be dependencies on the existing behavior.

I could add a module parameter that restores old behavior when asked to
and we keep that for a while and have it print a warning message.
The other alternative is just release it and see if someone complains.

cheers,
jamal


Re: Alternate to Ixia's ANVL test harness for tcp compliance.

2006-07-20 Thread Piet Delaney

On Thu, 2006-07-20 at 16:04 -0400, Jeff Garzik wrote:
 Piet Delaney wrote:
  Do any of you use tools other than ANVL for RFC compliance while
  hacking on the TCP code?
  
  In the unlikely event that there isn't an alternative, is there any
  interest in a netdev group effort to motivate Ixia to port their C 
  sharp code to Linux? I get the feeling that some of their developers
  would like to port the code to Linux.
 
 
 Linux is the most RFC-compliant net stack in the world...  if they don't 
 want to support Linux, it's their loss.  :)

They aren't exactly dropping support for Linux, they 'just' are not 
planning to port the new Ethereal-like GUI to Linux:
--
Hi Piet,

Unfortunately there is no plan to redesign the GUI for Linux. We added
support for Windows a couple of releases back. The latest release 7.10
has been benefit by a new Windows GUI framework we have designed for all
windows based Ixia test application. The new GUI is based on C#, which
includes ethereal like packet decode, ladder diagram, Outlook like GUI
design. Currently there is big challenge to implement the same GUI for
Linux. The needed resource is also an issue. Ixia will continue to
maintain and support Linux platform. Please rest assure. Both windows
and linux platforms share the same under layer test engine.. So there is
no difference in test cases. Ixia also offers an upgrade path from Linux
to Windows. Please contact your local Ixia sales if you are interested.
Dean
---

I wonder if Microsoft is providing the big challenge to porting the
same GUI to Linux. The world really doesn't need yet another Java
language. Gosling is a genius; I studied his X11 News Server enough
to know first hand. Microsoft lost in court over their violation of the
Java standards, and C sharp seems to be just another strategy in their
bizarre attempt at world domination (like the SCO mess).

I suggest that Linux networking companies like UNM and us be beta
customers for a port. So far it hasn't been entertained TMBK.
 
-piet

 
   Jeff
 
 
-- 
Piet Delaney
BlueLane Teck
W: (408) 200-5256; [EMAIL PROTECTED]
H: (408) 243-8872; [EMAIL PROTECTED]



Re: Alternate to Ixia's ANVL test harness for tcp compliance.

2006-07-20 Thread Andi Kleen

On Thursday 20 July 2006 21:49, Piet Delaney wrote:

 Unfortunately Ixia told me they don't have any plans to port the
 new GUI to Linux. Instead they are trying to migrate Linux developers,
 us, to using Windows. Yeck!


With some luck it will just work in wine.

-Andi

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-20 Thread Evgeniy Polyakov

On Thu, Jul 20, 2006 at 08:41:00PM +0400, Alexey Kuznetsov ([EMAIL PROTECTED]) 
wrote:
 Hello!

Hello, Alexey.

 Small question first:
 
  userspace, but also there are big problems, like one syscall per ack,
 
 I do not see redundant syscalls. Is not it expected to send ACKs only
 after receiving data as you said? What is the problem?

I mean that each ack is a pure syscall without any data, so the overhead is
quite huge compared to the situation where acks are created in
kernelspace.
At least slow start will eat a lot of CPU with them.

 Now boring things:
 
  There is no BH protocol processing at all, so there is no need to
  protect against someone who will add data while you are processing your own
  chunk.
 
 Essential part of socket user lock is the same mutex.
 
 Backlog is actually not a protection, but a thing equivalent to netchannel.
 The difference is only that it tries to process something immediately,
 when it is safe. You can omit this and push everything to 
 backlog(=netchannel),
 which is processed only by syscalls, if you do not care about latency.

If we consider netchannels the way Van Jacobson described them, then a
mutex is not needed, since it is impossible to have several readers or
writers. But in the socket case, even if there is only one userspace
consumer, that lock must be held to protect against bh (or one must introduce
several queues and complicate their management a lot (ucopy for
example)).
 
  How many hacks just to be a bit closer to userspace processing,
  implemented in netchannels!
 
 Moving processing closer to userspace is not a goal, it is a tool.
 Which sometimes useful, but generally quite useless.
 
 F.e. in your tests it should not affect performance at all,
 end user is just a sink.
 
 What's about prequeueing, it is a bright example. Guess why is it useful?
 What does it save? Nothing, like netchannel. Answer is: it is just a tool
 to generate coarsed ACKs in a controlled manner without essential violation
 of protocol. (Well, and to combine checksumming and copy if you do not like 
 how
 your card does this)

I cannot agree here. 
The main goal of the protocol is data delivery to the user, not
blind acceptance of data, and data transmission from the user, not from
some other ring.
As you see, sending is already implemented in process context,
but receiving is not directly connected to the user.
The more elements we have between the user and its data, the higher the
probability of problems there. And we already have two queues just
to eliminate one of them.
Moving the protocol (no matter if it is TCP or not) closer to the user allows
natural control of the dataflow - when the user can read the data (and _this_
is the main goal), the user acks; when it cannot, it does not generate
an ack. In theory that can lead to the complete absence of congestion,
especially if the receiving window can be controlled in both directions.
At least with the current state of routers it does not lead to broken
connections.

  If userspace is scheduled away for too much time, it is bloody wrong to
  ack the data, that is impossible to read due to the fact that system is
  being busy. It is just postponing the work from one end to another - ack
  now and stop when queue is full, or postpone the ack generation when
  segment is really being read.
 
 ... when you get all the segments nicely aligned, blah-blah-blah.
 
 If you do not care about losses-congestion-delays-delacks-whatever,
 you have a totally different protocol. Sending window feedback
 is only a minor part of tcp. But even these boring tcp intrinsics
 are not so important, look at ideal lossless network:
 
 Think what happens f.e. while plain file transfer to your notebook.
 You get 110MB/sec for a few seconds, then writeback is fired and
 disk io subsystems discovers that the disk holds only 50MB/sec.
 If you are unlucky and some another application starts, disk is so congested
 that it will take lots of seconds to make a progress with io.
 For this time another side will retransmit, because poor thing thought
 rtt is 100 usecs and you will never return to 50MB/sec.
 
 You have to _CLOSE_ window in the case of long delay, rather than to forget
 to ack. See the difference?
 
 It is just because actual end user is still far far away.
 And this happens all the time, when you relay the results to another
 application via pipe, when... Well, the only case where real end user
 is user of netchannel is when you receive to a sink.

There is one problem in your logic.
RTT will not be so small, since acks are not sent when user does not
read data.

  But I said not this. I said it looks _worse_. A bit, but worse.
  
  At least for 80 bytes it does not matter at all.
 
 Hello-o, do you hear me? :-)
 
 I am asking: it looks not much better, but a bit worse,
 then what is real reason for better performance, unless it is
 due to castration of protocol?

Well, if speed were measured in lines of code, then atcp has far fewer than the
existing tcp, but the performance win is only 2.5 times.

Re: Alternate to Ixia's ANVL test harness for tcp compliance.

2006-07-20 Thread David Miller

From: Piet Delaney [EMAIL PROTECTED]
Date: Thu, 20 Jul 2006 13:24:34 -0700

 I wonder if Microsoft is providing the big challenge to porting the
 same GUI to linux. The world really doesn't need yet another Java
 language. Gosling is a Genius, I studied his X11 News Server enough
 to know first hand. Microsoft lost in court with their violating the
 Java standards and C sharp seems to be just another strategy to their
 bizarre attempt to world domination (Like the SCO mess).

Under Linux we have Mono as a C-sharp implementation.  For the kind of
GUI they most likely have, porting shouldn't be much of an issue at
all.

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-20 Thread Ben Greear


Evgeniy Polyakov wrote:


Backlog is actually not a protection, but a thing equivalent to netchannel.
The difference is only that it tries to process something immediately,
when it is safe. You can omit this and push everything to backlog(=netchannel),
which is processed only by syscalls, if you do not care about latency.



If we consider netchannels the way Van Jacobson described them, then a
mutex is not needed, since it is impossible to have several readers or
writers. But in the socket case, even if there is only one userspace
consumer, that lock must be held to protect against bh (or one must introduce
several queues and complicate their management a lot (ucopy for
example)).


Out of curiosity, is it possible to have the single producer logic
if you have two+ ethernet interfaces handling frames for a single
TCP connection?  (I am assuming some sort of multi-path routing
logic...)

Thanks,
Ben

--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc  http://www.candelatech.com


Re: Alternate to Ixia's ANVL test harness for tcp compliance.

2006-07-20 Thread Jeff Garzik


Piet Delaney wrote:

I wonder if Microsoft is providing the big challenge to porting the
same GUI to linux. The world really doesn't need yet another Java
language. Gosling is a Genius, I studied his X11 News Server enough
to know first hand. Microsoft lost in court with their violating the
Java standards and C sharp seems to be just another strategy to their
bizarre attempt to world domination (Like the SCO mess).


Runtime dynamic bytecode languages -- Java, Perl, Python, Ruby, ... -- 
do seem to be all the rage.


As DaveM noted, though, C# is fully supported under Linux.

Or maybe they could go for Gtk+, which has successfully been used to 
maintain complex GUIs apps on both Windows and Linux.  GIMP is the most 
notable example, but use of Gtk+, GLib, and mingw has meant that you can 
build Linux-ish apps on Windows without nasty porting layers like Cygwin.


Jeff



Re: Netchannles: first stage has been completed. Further ideas.

2006-07-20 Thread Ian McDonald



If we consider netchannels the way Van Jacobson described them, then a
mutex is not needed, since it is impossible to have several readers or
writers. But in the socket case, even if there is only one userspace
consumer, that lock must be held to protect against bh (or one must introduce
several queues and complicate their management a lot (ucopy for
example)).


As I recall Van's talk you don't need a lock with a ring buffer if you
have a start and end variable pointing to location within ring buffer.

He didn't explain this in great depth as it is computer science 101
but here is how I would explain it:

Once the socket is initialised, the consumer is the only one that sets the start
variable, and the network driver only reads it. It is the other way
around for the end variable. As long as the writes are atomic, you
are fine. You only need one ring buffer in this scenario and two
atomic variables.

Having atomic writes does have overhead, but far less than locking semantics.
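
A minimal sketch of such a single-producer/single-consumer ring, assuming one
slot is kept empty to tell "full" from "empty". The class and method names are
made up for the example. Python is used only to show which side owns which
index; a real kernel or driver implementation would additionally need atomic
index updates and memory barriers so that a slot is written before the index
that publishes it.

```python
class SpscRing:
    """Ring buffer for exactly one producer and exactly one consumer."""

    def __init__(self, capacity):
        self.slots = [None] * (capacity + 1)  # one slot always stays empty
        self.start = 0   # written only by the consumer, read by the producer
        self.end = 0     # written only by the producer, read by the consumer

    def produce(self, item):
        """Producer side: returns False if the ring is full."""
        nxt = (self.end + 1) % len(self.slots)
        if nxt == self.start:
            return False
        self.slots[self.end] = item   # fill the slot first ...
        self.end = nxt                # ... then publish it with a single write
        return True

    def consume(self):
        """Consumer side: returns None if the ring is empty."""
        if self.start == self.end:
            return None
        item = self.slots[self.start]
        self.start = (self.start + 1) % len(self.slots)  # release the slot
        return item


ring = SpscRing(capacity=2)
print(ring.produce("a"), ring.produce("b"), ring.produce("c"))  # True True False
print(ring.consume())                                           # a
print(ring.produce("c"))                                        # True
print(ring.consume(), ring.consume(), ring.consume())           # b c None
```

Neither side ever writes the other side's index, which is why no lock is
needed as long as there really is only one producer and one consumer.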
--
Ian McDonald
Web: http://wand.net.nz/~iam4
Blog: http://imcdnzl.blogspot.com
WAND Network Research Group
Department of Computer Science
University of Waikato
New Zealand

Re: Alternate to Ixia's ANVL test harness for tcp compliance.

2006-07-20 Thread Piet Delaney

On Thu, 2006-07-20 at 17:31 -0400, Jeff Garzik wrote:
 Piet Delaney wrote:
  I wonder if Microsoft is providing the big challenge to porting the
  same GUI to linux. The world really doesn't need yet another Java
  language. Gosling is a Genius, I studied his X11 News Server enough
  to know first hand. Microsoft lost in court with their violating the
  Java standards and C sharp seems to be just another strategy to their
  bizarre attempt to world domination (Like the SCO mess).
 
 Runtime dynamic bytecode languages -- Java, Perl, Python, Ruby, ... -- 
 do seem to be all the rage.
 
 As DaveM noted, though, C# is fully supported under Linux.
 
 Or maybe they could go for Gtk+, which has successfully been used to 
 maintain complex GUIs apps on both Windows and Linux.  GIMP is the most 
 notable example, but use of Gtk+, GLib, and mingw has meant that you can 
 build Linux-ish apps on Windows without nasty porting layers like Cygwin.

Perhaps, but my experience with GTK has been that it's difficult
to get installed right if you put it on /usr/local. I tried compiling
ethereal for our platform and it needed GTK and a series of other
libraries. I suspect it's likely a major effort to migrate from a
Microsoft C sharp environment to GTK.

-piet

 
   Jeff
 
 
-- 
Piet Delaney
BlueLane Teck
W: (408) 200-5256; [EMAIL PROTECTED]
H: (408) 243-8872; [EMAIL PROTECTED]



Re: Alternate to Ixia's ANVL test harness for tcp compliance.

2006-07-20 Thread Brent Cook

On Thursday 20 July 2006 16:31, Jeff Garzik wrote:
 Piet Delaney wrote:
  I wonder if Microsoft is providing the big challenge to porting the
  same GUI to linux. The world really doesn't need yet another Java
  language. Gosling is a Genius, I studied his X11 News Server enough
  to know first hand. Microsoft lost in court with their violating the
  Java standards and C sharp seems to be just another strategy to their
  bizarre attempt to world domination (Like the SCO mess).

 Runtime dynamic bytecode languages -- Java, Perl, Python, Ruby, ... --
 do seem to be all the rage.

 As DaveM noted, though, C# is fully supported under Linux.

 Or maybe they could go for Gtk+, which has successfully been used to
 maintain complex GUI apps on both Windows and Linux.  GIMP is the most
 notable example, but use of Gtk+, GLib, and mingw has meant that you can
 build Linux-ish apps on Windows without nasty porting layers like Cygwin.

   Jeff


Base C# support is pretty good in Mono, but you still have to be quite careful 
when creating a cross-platform application with it. Microsoft's version 
implements a number of libraries that still are not quite as well implemented 
in Mono (if at all). The toolkit libraries (Windows Forms, to the latest 
stuff with Vista) are a bit of a moving target. Plus, the .Net platform still 
lets developers interact with COM objects and other Windows-only code.

Just because the GUI is C# does not mean that it does not have a number of 
Windows-only dependencies, unless it was implemented with portability 
in mind.

Re: Alternate to Ixia's ANVL test harness for tcp compliance.

2006-07-20 Thread Jeff Garzik


Brent Cook wrote:
Just because the GUI is C# does not mean that it does not have a number of 
Windows-only dependencies, unless it was implemented with portability 
in mind.


Well, sure...  The same can be said of any source code base, for any set 
of platforms, for any given language.


Jeff



Re: [PATCH] via-velocity: fix reported speed and link detected status

2006-07-20 Thread Francois Romieu

Jay Cliburn [EMAIL PROTECTED] :
 The via-velocity driver reports incorrect speed and link detected status
 as viewed by ethtool (and probably other tools). This patch fixes those
 incorrect reports and prettifies a long line.

Looks fine.

Fixed the whitespace/tabs damage and the 190-column comment, and tagged it as
'upstream-20060720-00' in branch 'upstream' at
git://electric-eye.fr.zoreil.com/home/romieu/linux-2.6.git

-- 
Ueimor

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-20 Thread Alexey Kuznetsov

Hello!

 Moving protocol (no matter if it is TCP or not) closer to user allows
 naturally control the dataflow - when user can read that data(and _this_
 is the main goal), user acks, when it can not - it does not generate
 ack. In theory

As far as I remember, in theory the absence of feedback leads
to loss of control. The same is true in practice, unfortunately.
You must say that the window is closed, otherwise the sender is totally
confused.


 There is one problem in your logic.
 RTT will not be so small, since acks are not sent when user does not
 read data.

It is arithmetic: rtt = window/rate.

And rto stays rounded up to 200 msec, unless you have messed up the
connection so badly that it is no longer alive. Check.
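
To put numbers on the arithmetic above, here is a minimal sketch. The
function names and the example figures (64 KB window, 1 Gbit/s) are
illustrative assumptions, and the RTO rule is deliberately reduced to the
200 msec floor mentioned above; the real estimator also uses smoothed RTT
and variance.

```c
#include <assert.h>
#include <stdint.h>

/* Floor quoted in the text above: RTO never drops below 200 msec. */
#define RTO_MIN_US 200000u

/* rtt = window / rate, in microseconds (hypothetical helper). */
uint64_t rtt_us(uint64_t window_bytes, uint64_t rate_bytes_per_sec)
{
    assert(rate_bytes_per_sec != 0);
    return (window_bytes * 1000000u) / rate_bytes_per_sec;
}

/* Simplified RTO: only the clamping to the floor is modelled here. */
uint64_t rto_us(uint64_t rtt)
{
    return rtt < RTO_MIN_US ? RTO_MIN_US : rtt;
}
```

With a 64 KB window at 1 Gbit/s (125000000 bytes/sec), rtt_us() gives about
half a millisecond, so the RTO sits at the 200 msec floor and a reader that
stalls for longer than that invites retransmissions.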


  Simplify protocol, move all the processing (even memory copies) to softirq,
  leave to user space only feeding pages to copy and you will have unbeatable
  performance. Been there, done that, not with TCP of course, but if you do
  not care about losses and ACK clocking and send an ACK once per window,
  I do not see how it can spoil the situation.
 
 Do you live in a perfect world, where user does not want what was
 requested?

I keep trying to bring to your attention that you are reading into a sink. :-)
At least read to disk, to move it a little closer to reality.
Or at least do it from a terminal and press ^Z sometimes.


  You deal with 80 byte packets, to all that I understand.
  If you lose one cacheline per packet, it is a big problem.
 
 So actual netchannels speed is even better? :)

atcp. If you get rid of netchannels and leave only atcp, the speed will
be at least as good. No doubt.


 tell me, why we should keep (enabled) that redundant functionality?
 Because it can work better in some other places, and that is correct,
 but why it should be enabled then in majority of the cases?

Did I not tell you something like that? :-) Optimize the real thing,
even by trying to detect the situations when retransmissions are redundant
and eliminating the code.


 Let's draw the line.
...
 That was my opinion on the topic. It looks like neither you, nor me will
 not change our point of view about that right now :)

I agree. :)

Alexey

Re: Oops in IFB

2006-07-20 Thread David Miller

From: Nicolas DICHTEL [EMAIL PROTECTED]
Date: Thu, 20 Jul 2006 16:31:16 +0200

 Sorry, I forgot the patch ;-)

Also applied, thanks Nicolas.

Re: [IPV4]: Fix nexthop realm dumping for multipath routes

2006-07-20 Thread David Miller


Good catch, applied, thanks Patrick.

Re: Weird TCP SACK problem. in Linux...

2006-07-20 Thread Alexey Kuznetsov

Hello!

 Hmmm... I don't understand this... so if reordering can be detected (i.e. 
 we use timestamps, DSACK), the dupthreshold is increased

Yes.

 implementation that might lead to increase in the number of 
 retransmissions, but leads to improvement in download time

Hmm... I have thought about it and still do not know.


 couldn't figure it out. Also, is the reordering response of Linux TCP 
 described anywhere? (It seems dupthreshold is dynamically adjusted 
 based on the reordering history... but I was not able to find 
 out how...)

That's the comment from tcp_input:

 * Reordering detection.
 * 
 * Reordering metric is maximal distance, which a packet can be displaced
 * in packet stream. With SACKs we can estimate it:
 *
 * 1. SACK fills old hole and the corresponding segment was not
 *    ever retransmitted -> reordering. Alas, we cannot use it
 *    when segment was retransmitted.
 * 2. The last flaw is solved with D-SACK. D-SACK arrives
 *    for retransmitted and already SACKed segment -> reordering..


Alexey


Re: [PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL (RTAB BUG)

2006-07-20 Thread Alexey Kuznetsov

Hello!

 It shouldn't be.  Any decimal number can be expressed
 as a fraction, eg:

I remember this. :-) I stalled on selecting correct divisors
to fight over/underflows. Not because it was difficult,
just because I did not see a reason to do this.



 But doing so would get rid of the table implementation 
 and the flexibility it has given us to date.  For that 
 reason I feel uncomfortable with it.
 
 The engineering decision becomes this - are there any
 other protocols like ATM out there that could justify 
 such a change? 

Is it faster? You say, yes. Is it required? You say, yes.
Are there any protocols which need more flexibility? No.

 know a good deal more about them than I do.  What say
 you?

Frankly, I seriously believed that rtabs are a good way to handle ATM. :-)
I seriously believed that you have to do something like:
((packet_size+cell_payload-1)/cell_payload)*cell_size

So, if in reality even this protocol does not justify keeping rtabs, kill
them.
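
For reference, the formula above written out as a small function. The
48/53 byte figures are the standard ATM cell payload and cell size; the
function name is made up here, and AAL5 trailer or encapsulation overhead
is assumed to be already included in packet_size by the caller.

```c
#include <assert.h>
#include <stdint.h>

#define ATM_CELL_SIZE    53u /* bytes on the wire per cell */
#define ATM_CELL_PAYLOAD 48u /* payload bytes carried per cell */

static_assert(ATM_CELL_PAYLOAD < ATM_CELL_SIZE, "payload fits in a cell");

/* Wire size of a packet: round up to whole cells, then count full cells.
 * Hypothetical helper, direct transcription of the formula above. */
uint32_t atm_wire_size(uint32_t packet_size)
{
    uint32_t cells = (packet_size + ATM_CELL_PAYLOAD - 1) / ATM_CELL_PAYLOAD;
    return cells * ATM_CELL_SIZE;
}
```

A 48 byte packet costs one cell (53 bytes), while a 49 byte packet already
costs two (106 bytes), which is the step behaviour a linear rate table
cannot express.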

A question about linux/net/ipv4/ipcomp.c

2006-07-20 Thread Igor V. Liferenko

Hello, everyone.

I'm having fun reading RFC's and looking through linux source code for
implementation examples.

What I'm not able to understand is this piece of code :

union {
        struct iphdr    iph;
        char            buf[60];
} tmp_iph;

and corresponding RFC 791 statement : The maximal internet header is
60 octets.

Would you please say why it's 60, and not 52?

Well, what I came up with is this :

RFC791: The option-length octet counts the option-type octet and the
option-length octet as well as the option-data octets. [i.e.
options' total length may be up to 2^8/8 octets (32)]. Then, header
length without options is 20 octets. So, a maximum header length is
32+20=52 octets. RFC791: Internet Header Length is the length of the
internet header in 32 bit words... 52 octets is 52*8 bits and it's a
multiple of 32.

Thanks in advance
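
For what it is worth, the 60 can be reproduced from the IHL field alone:
RFC 791 defines IHL as a 4-bit count of 32-bit words, so the header size
is bounded by that field rather than by the option-length octet. A minimal
sketch of the arithmetic (the macro and function names are made up for
illustration):

```c
#include <assert.h>
#include <stdint.h>

#define IP_IHL_BITS     4u  /* width of the IHL field in RFC 791 */
#define IP_MIN_HDR_LEN 20u  /* header length without options */

static_assert(IP_MIN_HDR_LEN % 4u == 0, "header is whole 32-bit words");

/* Largest header IHL can describe: (2^4 - 1) words of 4 octets each. */
uint32_t ip_max_hdr_len(void)
{
    uint32_t max_ihl = (1u << IP_IHL_BITS) - 1u; /* 15 words */
    return max_ihl * 4u;                          /* 60 octets */
}

/* Room left for all options together once the fixed part is counted. */
uint32_t ip_max_options_len(void)
{
    return ip_max_hdr_len() - IP_MIN_HDR_LEN;     /* 40 octets */
}
```

So a 60-octet buffer is enough to hold any header that IHL can describe,
with at most 40 octets of options.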

[RTLWS8-CFP] Eighth Real-Time Linux Workshop 2nd CFP

2006-07-20 Thread mcguire


We apologize for multiple receipts.







  Eighth Real-Time Linux Workshop

October 12-15, 2006
 Lanzhou University - SISE
  Tianshui South Road 222
   Lanzhou, Gansu 73
 P.R.China


  General

   Following  the  meetings  of  developers  and  users at the previous 7
   successful  real-time Linux workshops held in Vienna, Orlando, Milano,
   Boston,  Valencia,  Singapore and Lille, the Real-Time Linux Workshop
   for  2006  will  come back to Asia again, to be held at the School for
   Information  Science  and  Engineering, Lanzhou University, in Lanzhou
   China.

   Embedded  and  real-time Linux is rapidly gaining traction in the Asia
   Pacific  region.  Embedded  systems  in  both  automation/control  and
   entertainment  are moving to 32/64-bit systems, opening the door for the
   use of a full-featured OS like GNU/Linux on COTS-based systems. With
   real-time  capabilities being a common demand for embedded systems the
   soft  and  hard  real-time  variants are an important extension to the
   versatile GNU/Linux GPOS.

   Authors  are  invited  to  submit  original  work dealing with general
   topics  related  to  real-time  Linux  research,  experiments and case
   studies,  as  well  as issues of integration of real-time and embedded
   Linux.  A  special focus will be on industrial case studies. Topics of
   interest include, but are not limited to:

 * Modifications and variants of the GNU/Linux operating system
   extending its real-time capabilities,
 * Contributions to real-time Linux variants, drivers and extensions,
 * User-mode real-time concepts, implementation and experience,
 * Real-time Linux applications, in academia, research and industry,
 * Work in progress reports, covering recent developments,
 * Educational material on real-time Linux,
 * Tools for embedding Linux or real-time Linux and embedded
   real-time Linux applications,
 * RTOS core concepts, RT-safe synchronization mechanisms,
 * RT-safe interaction of RT and non RT components,
 * IPC mechanisms in RTOS,
 * Analysis and Benchmarking methods and results of 
   real-time GNU/Linux variants,
 * Debugging techniques and tools, both for code and temporal
   debugging of core RTOS components, drivers and real-time
   applications,
 * Real-time related extensions to development environments.
  
  Further information:
 
  EN: http://www.realtimelinuxfoundation.org/events/rtlws-2006/ws.html 
  CN: http://dslab.lzu.edu.cn/rtlws8/index.html

  Awarded papers

  The  Programme Committee  will award a best paper in the category Real-
  Time Systems Theory.  This best paper will be invited  for  publication 
  to the Real-Time Systems Journal, RTSJ. 
  
  The  Programme Committee will award a best paper in the category Real-
  Time Systems Application. This best paper will be invited for publication 
  to the Dr Dobbs Journal. Moreover, the publication of the other papers in
  a special issue of Dr Dobbs Journal is in discussion. 

  Abstract submission

  In order to register an abstract, please go to:
  http://www.realtimelinuxfoundation.org/rtlf/register-abstract.html

  Venue

  Lanzhou University Information Building, School of Information Science
  and Engineering, Lanzhou University, http://www.lzu.edu.cn/.

  Registration

  In  order  to  participate  in  the  workshop,  please register on the
  registration page at:
  http://www.realtimelinuxfoundation.org/rtlf/register-participant.html

  Accommodation

  Please refer to the Lanzhou hotel page for accommodation at
  http://dslab.lzu.edu.cn/rtlws8/hotels/hotels.htm

  Travel information

  For travel information and directions on how to get to Lanzhou from an 
  international airport in China please refer to:
  http://www.realtimelinuxfoundation.org/events/rtlws-2006/

  Important dates

  August 28:     Abstract submission
  September 15:  Notification of acceptance
  September 29:  Final paper

  Panel Participants:

 o Roberto Bucher - Scuola Universitaria Professionale della Svizzera
   Italiana, Switzerland, RTAI/ADEOS/RTAI-Lab.

 o Alfons Crespo Lorente - University of Valencia, Spain, Departament
   d'Informàtica de Sistemes i Computadors, XtratuM.

 o Herman Haertig - Technical University Dresden, Germany, Institute for
   System Architecture, L4/Fiasco/L4Linux.

 o Nicholas Mc Guire - Lanzhou University, P.R. China, Distributed and
   Embedded Systems Lab, RTLinux/GPL.

 o Douglas Niehaus - University of Kansas, USA, Information and
   Telecommunication Technology Center, RT-preempt.

  Organization committee:

 * Prof. Li LIAN (Co-Chair), (SISE, Lanzhou University, CHINA)
 * Xiaoping ZHANG, LZU, CHINA
 *

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-20 Thread David Miller

From: Alexey Kuznetsov [EMAIL PROTECTED]
Date: Fri, 21 Jul 2006 02:59:08 +0400

  Moving protocol (no matter if it is TCP or not) closer to user allows
  naturally control the dataflow - when user can read that data(and _this_
  is the main goal), user acks, when it can not - it does not generate
  ack. In theory

 As far as I remember, in theory the absence of feedback leads
 to loss of control. The same is true in practice, unfortunately.
 You must say that the window is closed, otherwise the sender is totally
 confused.

Correct, and too large delay even results in retransmits.  You can say
that RTT will be adjusted by delay of ACK, but if user context
switches cleanly at the beginning, resulting in near immediate ACKs,
and then blocks later you will get spurious retransmits.  Alexey's
example of blocking on a disk write is a good example.  I really don't
like when pure NULL data sinks are used for benchmarking these kinds
of things because real applications 1) touch the data, 2) do something
with that data, and 3) have some life outside of TCP!

If you optimize an application that does nothing with the data it
receives, you have likewise optimized nothing :-)

All this talk reminds me of one thing, how expensive tcp_ack() is.
And this expense has nothing to do with TCP really.  The main cost is
purging and freeing up the skbs which have been ACK'd in the
retransmit queue.

So tcp_ack() sort of inherits the cost of freeing a bunch of SKBs
which haven't been touched by the cpu in some time and are thus nearly
guaranteed to be cold in the cache.

This is the kind of work we could think about batching to user
sleeping on some socket call.

Also notice that the retransmit queue is potentially a good use of an
array similar to the VJ netchannel lockless queue data structure. :)
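
As a rough illustration of what such an array-based lockless queue looks
like, here is a minimal single-producer/single-consumer ring using C11
atomics. This is a generic sketch and not VJ's code; the names, the ring
size and the void * slots are all assumptions, and it is only safe with
exactly one producer and one consumer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define RING_SIZE 8u /* hypothetical size; must be a power of two */

static_assert((RING_SIZE & (RING_SIZE - 1u)) == 0, "power of two");

struct spsc_ring {
    void *slot[RING_SIZE];
    atomic_uint head; /* next slot to fill, written only by the producer */
    atomic_uint tail; /* next slot to drain, written only by the consumer */
};

/* Producer side: returns 0 on success, -1 if the ring is full. */
int ring_put(struct spsc_ring *r, void *item)
{
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SIZE)
        return -1;
    r->slot[head & (RING_SIZE - 1u)] = item;
    /* Publish the slot contents before the new head becomes visible. */
    atomic_store_explicit(&r->head, head + 1u, memory_order_release);
    return 0;
}

/* Consumer side: returns NULL if the ring is empty. */
void *ring_get(struct spsc_ring *r)
{
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);
    void *item;

    if (tail == head)
        return NULL;
    item = r->slot[tail & (RING_SIZE - 1u)];
    /* Release the slot back to the producer only after reading it. */
    atomic_store_explicit(&r->tail, tail + 1u, memory_order_release);
    return item;
}
```

For a retransmit queue the producer would be the send path appending
segments and the consumer the ACK path retiring them, in order, from the
other end.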

BTW, notice that TSO makes this work touch less skb state.  TSO also
decreases cpu utilization.  I think these two things are no
coincidence. :-)

I have even toyed with the idea of eventually abstracting the
retransmit queue into a pure data representation.  The skb_shinfo()
page vector is very nearly this already.  Or, a less extreme idea
where we have fully retained huge TSO skbs, but we do not chop them up
to create smaller TSO frames.  Instead, we add an offset GSO attribute
which is used in the clones.

Calls to tso_fragment() would be replaced with pure clones and
adjustment of skb->len and the new skb->gso_offset in the clone.
Rest of the logic would remain identical except that non-linear data
would start skb->gso_offset bytes into the skb_shinfo() described
area.
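
A toy userspace model of that idea, just to make the bookkeeping concrete.
Nothing here is kernel code: struct fake_skb, clone_window() and
clone_data() are hypothetical names, and a flat byte buffer stands in for
the skb_shinfo() page vector.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb; only the fields discussed above. */
struct fake_skb {
    const unsigned char *pages; /* shared payload area, never chopped up */
    size_t len;                 /* bytes this skb (or clone) covers */
    size_t gso_offset;          /* where its data starts inside pages */
};

/* Instead of fragmenting the parent, describe a window into it. */
struct fake_skb clone_window(const struct fake_skb *parent,
                             size_t offset, size_t len)
{
    struct fake_skb clone = *parent;

    assert(offset + len <= parent->len); /* window must fit in the parent */
    clone.gso_offset = parent->gso_offset + offset;
    clone.len = len;
    return clone;
}

/* Data for a clone starts gso_offset bytes into the shared area. */
const unsigned char *clone_data(const struct fake_skb *skb)
{
    return skb->pages + skb->gso_offset;
}
```

The parent keeps its full length, so the retained huge frame stays intact
while each clone only carries an offset and a length.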

In this way we could also set tp->xmit_size_goal to its maximum
possible value, always.  Actually, I was looking at this the other day
and this clamping of xmit_size_goal to 1/2 max_window is extremely
dubious.  In fact it's downright wrong, only MSS needs this limiting
for sender side SWS avoidance.
