Re: [PATCH] fix limited slow start bug

2007-02-25 Thread Roger While


Dave M wrote :

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 415193e..18a468d 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -302,7 +302,7 @@ struct tcp_sock {
u32 snd_ssthresh;   /* Slow start size threshold*/
u32 snd_cwnd;   /* Sending congestion window*/
u16 snd_cwnd_cnt;   /* Linear increase counter  */
-   u16 snd_cwnd_clamp; /* Do not allow snd_cwnd to grow above this */
+   u32 snd_cwnd_clamp; /* Do not allow snd_cwnd to grow above this */
u32 snd_cwnd_used;
u32 snd_cwnd_stamp;
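
To illustrate what the type change in the hunk above affects, here is a minimal userspace sketch. The struct and function names are made up for the example; only the field name snd_cwnd_clamp comes from the diff. It just shows that a 16-bit field cannot hold a clamp above 65535, while a 32-bit field can.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the field before and after the patch. */
struct clamp_before { uint16_t snd_cwnd_clamp; };
struct clamp_after  { uint32_t snd_cwnd_clamp; };

/* Store a requested clamp in the old 16-bit field and read it back. */
static uint32_t store_clamp_u16(uint32_t requested)
{
	struct clamp_before s;

	assert(sizeof(s.snd_cwnd_clamp) == 2);
	s.snd_cwnd_clamp = (uint16_t)requested;   /* keeps the low 16 bits only */
	return s.snd_cwnd_clamp;
}

/* Store the same request in the widened 32-bit field. */
static uint32_t store_clamp_u32(uint32_t requested)
{
	struct clamp_after s;

	s.snd_cwnd_clamp = requested;             /* full value preserved */
	return s.snd_cwnd_clamp;
}
```

With these helpers, a request of ~0U reads back as 65535 from the 16-bit field and unchanged from the 32-bit one.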


Was anything done about size/member alignment of struct tcp_sock per
mail from last year  -
http://marc.theaimsgroup.com/?l=linux-netdev&m=114318857102290&w=2

(I have no idea what current size is)


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: all syscalls initially taking 4usec on a P4? Re: nonblocking UDPv4 recvfrom() taking 4usec @ 3GHz?

2007-02-25 Thread Pavel Machek
Hi!

> > I've done so, with some interesting results. Source on
> > http://ds9a.nl/tmp/recvtimings.c - be careful to adjust the '3000' divider
> > to your CPU frequency if you care about absolute numbers!
> > 
> > These are two groups, each consisting of 10 consecutive nonblocking UDP
> > recvfroms, with 10 packets preloaded. Reported is the number of microseconds
> > per recvfrom call which yielded a packet:
> > 
> > $ ./recvtimings
> > 4.142333
> 
> It can be recvfrom only problem - syscall overhead on my p4 (core duo,
> debian testing) is bout 300 usec - to test I ran 

core duo is _not_ p4 class cpu; results there will be very different.

Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


[PATCH] IPv6 anycast refcnt fix

2007-02-25 Thread Michal Wrobel

This patch fixes a bug in the Linux IPv6 stack that caused an anycast address
to be added to a device before DAD had completed. This led to an
incorrect reference count, which resulted in an infinite wait for
unregister_netdevice completion on interface removal.

Signed-off-by: Michal Wrobel <[EMAIL PROTECTED]>

--- linux/net/ipv6/addrconf.c   2007-02-22 19:46:27.0 +0100
+++ linux/net/ipv6/addrconf.c   2007-02-25 00:22:37.0 +0100
@@ -456,6 +456,8 @@
ipv6_dev_mc_dec(dev, &addr);
}
for (ifa=idev->addr_list; ifa; ifa=ifa->if_next) {
+   if (ifa->flags&IFA_F_TENTATIVE)
+   continue;
if (idev->cnf.forwarding)
addrconf_join_anycast(ifa);
else



Adding data to SKB - odd checksum errors

2007-02-25 Thread Kristian Evensen

Hello,

I am working on an algorithm to add data from the previous skb (on the 
queue) to the front of the current skb. This should be beneficial for a 
certain kind of TCP traffic, and I am curious as to whether it will work 
or not.


Currently I have implemented a small algorithm to copy the data that 
works most of the time. It is called from write_xmit (right before the 
while loop) and performs a number of checks (does the skb fit in the 
window, does it have enough space; mostly the same checks as the 
retrans_try_collapse function) before it copies the data. I first 
"allocate" new data at the back of the skb using skb_put, and if that is 
successful I copy the new data (first using memmove to move the old 
data and then memcpy), update the seq number of this skb and calculate a 
new checksum. The algorithm will (currently) not work with non-linear 
skbs. I have pasted the code at the bottom of this mail.
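
The copy step described above can be sketched in plain userspace C. This is only an illustration of the "grow at the tail, shift the old payload, copy the previous payload in front" sequence; struct buf and prepend_payload are made-up names standing in for a linear skb and the collapse helper, and no checksum or sequence-number handling is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a linear skb: len bytes used out of 64. */
struct buf {
	unsigned char data[64];
	size_t len;
};

/*
 * Prepend prev's payload to cur, mirroring the skb_put + memmove + memcpy
 * order from the description. Returns 0 on success, -1 if it does not fit.
 */
static int prepend_payload(struct buf *cur, const struct buf *prev)
{
	size_t old_len = cur->len;

	assert(cur->len <= sizeof(cur->data));
	if (prev->len > sizeof(cur->data) - cur->len)
		return -1;                      /* not enough tailroom */

	cur->len += prev->len;                  /* the length update skb_put does */
	memmove(cur->data + prev->len, cur->data, old_len); /* shift old payload */
	memcpy(cur->data, prev->data, prev->len);           /* fill the front */
	return 0;
}
```

Note that the memmove length is the payload length from before the buffer was grown; the real code gets the same value as skb->len - prev_skb->len after skb_put.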


The weird thing about this algorithm is that at some point it suddenly 
goes wrong, or makes something else go wrong, and the packets that 
I send have the wrong TCP checksum. I have looked around the code for 
any counters I might have missed or similar, but I can't find any. If I 
have understood skbs correctly, I shouldn't have to update any counters 
except skb->len (which skb_put does), since I don't expand the skb; I only 
use the space already reserved for it.


Does anyone have any ideas as to what might be wrong, or can anyone spot any 
errors/misunderstandings in my code?


Thanks,
Kristian

The code:

This goes into write_xmit before the loop:
if(sysctl_tcp_thin_aggressive_bundling && tcp_stream_is_thin(tp)){
   if(skb->prev != (struct sk_buff*) &(sk)->sk_write_queue
   && !(TCP_SKB_CB(skb)->flags & TCPCB_FLAG_SYN)
   && (skb_shinfo(skb)->nr_frags == 0 &&
   skb_shinfo(skb->prev)->nr_frags == 0)){
   tcp_trans_try_collapse2(sk, skb, tcp_current_mss(sk, 0));
   }
}

This is the code that copies the data:
static int tcp_trans_try_collapse2(struct sock *sk, struct sk_buff *skb,
                                   int mss_now)
{
   struct tcp_sock *tp = tcp_sk(sk);

   /* Make sure that this isn't referenced by somebody else */
   if(!skb_cloned(skb)){
   struct sk_buff *prev_skb = skb_copy(skb->prev, GFP_ATOMIC);
   int skb_size = skb->len, prev_skb_size = prev_skb->len;
   u16 flags = TCP_SKB_CB(prev_skb)->flags;

   /* Since this technique currently does not support SACK, I
   * return -1 if the previous has been SACK'd. */
   if(TCP_SKB_CB(prev_skb)->sacked & TCPCB_SACKED_ACKED){
   return -1;
   }

   /* Current skb is out of window. */
   if (after(TCP_SKB_CB(skb)->end_seq, tp->snd_una+tp->snd_wnd)){
   return -1;
   }

   /* Punt if not enough space exists in the first SKB for
* the data in the second, or the total combined payload
* would exceed the MSS.
*/
   if ((prev_skb_size > skb_tailroom(skb)) ||
   ((skb_size + prev_skb_size) > mss_now)){
   return -1;
   }

   /*To avoid duplicate copies.*/
   if(TCP_SKB_CB(skb)->seq <= TCP_SKB_CB(prev_skb)->seq)
   return -1;

   /*First, check I have enough room*/
   if(skb_tailroom(skb) < prev_skb->len)
   return -1;

   copy = skb_put(skb, prev_skb->len);

   if(copy){
   memmove(skb->data + prev_skb->len, skb->data, skb->len - prev_skb->len);

   memcpy(skb->data, prev_skb->data, prev_skb->len);
   TCP_SKB_CB(skb)->seq = TCP_SKB_CB(prev_skb)->seq;
   skb->csum = csum_partial(skb->data, skb->len, 0);
   }

   __kfree_skb(prev_skb);
   }

   return 1;
}


natsemi: Fix detection of vanilla natsemi cards

2007-02-25 Thread Mark Brown
Bob Tracy <[EMAIL PROTECTED]> reported that the addition of support
for Aculab E1/T1 cPCI carrier cards broke detection of vanilla natsemi
cards.  This patch fixes that: the problem is that the driver-specific
data in the PCI device table is an index into a second table and this
had not been updated for the vanilla cards.

This patch fixes the problem minimally.

Signed-Off-By: Mark Brown <[EMAIL PROTECTED]>

--- linux.orig/drivers/net/natsemi.c 2007-02-23 11:13:03.0 +
+++ linux/drivers/net/natsemi.c 2007-02-23 11:12:00.0 +
@@ -260,7 +260,7 @@
 
 static const struct pci_device_id natsemi_pci_tbl[] __devinitdata = {
{ PCI_VENDOR_ID_NS, 0x0020, 0x12d9, 0x000c, 0, 0, 0 },
-   { PCI_VENDOR_ID_NS, 0x0020, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
+   { PCI_VENDOR_ID_NS, 0x0020, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 1 },
{ } /* terminate list */
 };
 MODULE_DEVICE_TABLE(pci, natsemi_pci_tbl);
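
For readers not familiar with the layout, here is a rough userspace sketch of the "driver data is an index into a second table" arrangement that the fix relies on. Everything here (struct names, the chip names, the ANY_ID wildcard, the lookup helper) is invented for illustration and is not the driver's actual code; only the ordering (index 0 for the Aculab entry, index 1 for the wildcard entry) follows the patch.

```c
#include <assert.h>
#include <stddef.h>

#define ANY_ID 0xffffffffu                /* stand-in for PCI_ANY_ID */

struct chip_info { const char *name; };   /* hypothetical second table entry */

struct dev_id {
	unsigned subvendor, subdevice;
	unsigned long driver_data;        /* index into chip_tbl[] */
};

static const struct chip_info chip_tbl[] = {
	{ "Aculab E1/T1 carrier" },       /* index 0 */
	{ "vanilla natsemi" },            /* index 1 */
};

static const struct dev_id id_tbl[] = {
	{ 0x12d9, 0x000c, 0 },            /* specific match comes first */
	{ ANY_ID, ANY_ID, 1 },            /* wildcard; was 0 before the fix */
};

/* Return the chip entry for the first matching ID, or NULL if none match. */
static const struct chip_info *lookup(unsigned subvendor, unsigned subdevice)
{
	size_t i;

	for (i = 0; i < sizeof(id_tbl) / sizeof(id_tbl[0]); i++) {
		const struct dev_id *id = &id_tbl[i];

		if ((id->subvendor == ANY_ID || id->subvendor == subvendor) &&
		    (id->subdevice == ANY_ID || id->subdevice == subdevice)) {
			assert(id->driver_data <
			       sizeof(chip_tbl) / sizeof(chip_tbl[0]));
			return &chip_tbl[id->driver_data];
		}
	}
	return NULL;
}
```

With the wildcard entry's index left at 0, every vanilla card would resolve to the entry meant for the carrier card, which matches the misdetection described above.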

-- 
"You grabbed my hand and we fell into it, like a daydream - or a fever."


[IPROUTE2][ALL] update rest to use nl_mgrp

2007-02-25 Thread jamal
cheers,
jamal
[ALL] update rest to use nl_mgrp

Signed-off-by: J Hadi Salim <[EMAIL PROTECTED]>

---
commit 539bc1cc1b002700504ad8cbe82ea451026c5fe4
tree 208bd273db5bf023c33e5256da615a20173c1921
parent 40076f622e0aacb2b792d3ac1b5d12aa97c4da9c
author Jamal Hadi Salim <[EMAIL PROTECTED]> Sun, 25 Feb 2007 11:43:54 -0500
committer Jamal Hadi Salim <[EMAIL PROTECTED]> Sun, 25 Feb 2007 11:43:54 -0500

 ip/ipmonitor.c  |   12 ++--
 ip/rtmon.c  |   10 +-
 tc/tc_monitor.c |2 +-
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/ip/ipmonitor.c b/ip/ipmonitor.c
index 704ada7..f1a1f27 100644
--- a/ip/ipmonitor.c
+++ b/ip/ipmonitor.c
@@ -132,22 +132,22 @@ int do_ipmonitor(int argc, char **argv)
}
 
if (llink)
-   groups |= RTMGRP_LINK;
+   groups |= nl_mgrp(RTNLGRP_LINK);
if (laddr) {
if (!preferred_family || preferred_family == AF_INET)
-   groups |= RTMGRP_IPV4_IFADDR;
+   groups |= nl_mgrp(RTNLGRP_IPV4_IFADDR);
if (!preferred_family || preferred_family == AF_INET6)
-   groups |= RTMGRP_IPV6_IFADDR;
+   groups |= nl_mgrp(RTNLGRP_IPV6_IFADDR);
}
if (lroute) {
if (!preferred_family || preferred_family == AF_INET)
-   groups |= RTMGRP_IPV4_ROUTE;
+   groups |= nl_mgrp(RTNLGRP_IPV4_ROUTE);
if (!preferred_family || preferred_family == AF_INET6)
-   groups |= RTMGRP_IPV6_ROUTE;
+   groups |= nl_mgrp(RTNLGRP_IPV6_ROUTE);
}
if (lprefix) {
if (!preferred_family || preferred_family == AF_INET6)
-   groups |= RTMGRP_IPV6_PREFIX;
+   groups |= nl_mgrp(RTNLGRP_IPV6_PREFIX);
}
 
if (file) {
diff --git a/ip/rtmon.c b/ip/rtmon.c
index 8c464cb..b538a52 100644
--- a/ip/rtmon.c
+++ b/ip/rtmon.c
@@ -134,18 +134,18 @@ main(int argc, char **argv)
exit(-1);
}
if (llink)
-   groups |= RTMGRP_LINK;
+   groups |= nl_mgrp(RTNLGRP_LINK);
if (laddr) {
if (!family || family == AF_INET)
-   groups |= RTMGRP_IPV4_IFADDR;
+   groups |= nl_mgrp(RTNLGRP_IPV4_IFADDR);
if (!family || family == AF_INET6)
-   groups |= RTMGRP_IPV6_IFADDR;
+   groups |= nl_mgrp(RTNLGRP_IPV6_IFADDR);
}
if (lroute) {
if (!family || family == AF_INET)
-   groups |= RTMGRP_IPV4_ROUTE;
+   groups |= nl_mgrp(RTNLGRP_IPV4_ROUTE);
if (!family || family == AF_INET6)
-   groups |= RTMGRP_IPV6_ROUTE;
+   groups |= nl_mgrp(RTNLGRP_IPV6_ROUTE);
}
 
fp = fopen(file, "w");
diff --git a/tc/tc_monitor.c b/tc/tc_monitor.c
index 1af6cf0..bf58744 100644
--- a/tc/tc_monitor.c
+++ b/tc/tc_monitor.c
@@ -68,7 +68,7 @@ int do_tcmonitor(int argc, char **argv)
 {
struct rtnl_handle rth;
char *file = NULL;
-   unsigned groups = RTMGRP_TC;
+   unsigned groups = nl_mgrp(RTNLGRP_TC);
 
while (argc > 0) {
if (matches(*argv, "file") == 0) {


[IPROUTE2][GENERAL] nl_mgrp to crap if base multicast groups exceeded

2007-02-25 Thread jamal

cheers,
jamal

[GENERAL] nl_mgrp to crap if base multicast groups exceeded

The old scheme of bitmasks works only for the first 32 groups.
Above that the setsockopt scheme must be used.

Signed-off-by: J Hadi Salim <[EMAIL PROTECTED]>

---
commit f3d272cea2870805677809bf121737fb6c36dc8e
tree b1e42d5c8d9122a600f2f81e04b0d197642b1878
parent 539bc1cc1b002700504ad8cbe82ea451026c5fe4
author Jamal Hadi Salim <[EMAIL PROTECTED]> Sun, 25 Feb 2007 11:50:53 -0500
committer Jamal Hadi Salim <[EMAIL PROTECTED]> Sun, 25 Feb 2007 11:50:53 -0500

 include/utils.h |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/include/utils.h b/include/utils.h
index 1769ca1..a3fd335 100644
--- a/include/utils.h
+++ b/include/utils.h
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 
 #include "libnetlink.h"
 #include "ll_map.h"
@@ -129,7 +130,11 @@ static __inline__ int get_user_hz(void)
 
 static inline __u32 nl_mgrp(__u32 group)
 {
-   return group ? (1 << (group -1)) : 0;
+   if (group > 31 ) {
+   fprintf(stderr, "Use setsockopt for this group %d\n", group);
+   exit(-1);
+   }
+   return group ? (1 << (group - 1)) : 0;
 }
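
As a rough illustration of the two schemes mentioned in the commit message, the sketch below has a bitmask helper that reports failure instead of exiting, plus a join helper using the NETLINK_ADD_MEMBERSHIP socket option. The function names group_to_mask and join_group are made up for this example, the cut-off of 31 simply mirrors the patch above, and the SOL_NETLINK fallback define is an assumption for older libc headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <linux/netlink.h>

#ifndef SOL_NETLINK
#define SOL_NETLINK 270   /* assumed fallback for older libc headers */
#endif

/*
 * Legacy scheme: translate a group number into a bit in a 32-bit mask.
 * Returns 0 and fills *mask on success, -1 if the group is out of range
 * (same cut-off as the patch).
 */
static int group_to_mask(uint32_t group, uint32_t *mask)
{
	assert(mask != NULL);
	if (group > 31)
		return -1;
	*mask = group ? ((uint32_t)1 << (group - 1)) : 0;
	return 0;
}

/* setsockopt scheme: join a single group by number on a netlink socket. */
static int join_group(int fd, int group)
{
	return setsockopt(fd, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP,
			  &group, sizeof(group));
}
```

A caller would try group_to_mask() first and fall back to join_group() on the already-bound socket when it returns -1.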
 
 


Re: [tipc-discussion] [RFC: 2.6 patch] net/tipc/: possible cleanups

2007-02-25 Thread Adrian Bunk
On Sat, Feb 24, 2007 at 04:19:19PM -0800, Stephens, Allan wrote:
>...
> 2) There are portions of TIPC's native API which are intended for use by
> driver programmers, but which are not being used by any code that is
> currently in the kernel.  While removing these API's from TIPC will only
> impact these "freeloaders", it has the potential to discourage future
> programmers who *do* want to contribute their work to the kernel by
> removing API's that are apparently necessary/useful when doing coding of
> this sort.

It can be re-added at any time when an in-kernel user comes.

But the most interesting question is:
Why is no one interested in getting his TIPC using drivers merged?

> Regards,
> Al
>...

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: all syscalls initially taking 4usec on a P4? Re: nonblocking UDPv4 recvfrom() taking 4usec @ 3GHz?

2007-02-25 Thread Evgeniy Polyakov
On Sun, Feb 25, 2007 at 11:41:54AM +0100, Pavel Machek ([EMAIL PROTECTED]) 
wrote:
> > > I've done so, with some interesting results. Source on
> > > http://ds9a.nl/tmp/recvtimings.c - be careful to adjust the '3000' divider
> > > to your CPU frequency if you care about absolute numbers!
> > > 
> > > These are two groups, each consisting of 10 consecutive nonblocking UDP
> > > recvfroms, with 10 packets preloaded. Reported is the number of 
> > > microseconds
> > > per recvfrom call which yielded a packet:
> > > 
> > > $ ./recvtimings
> > > 4.142333
> > 
> > It can be recvfrom only problem - syscall overhead on my p4 (core duo,
> > debian testing) is bout 300 usec - to test I ran 
> 
> core duo is _not_ p4 class cpu; results there will be very different.

The results are nevertheless the same.
Each syscall takes noticeably more time on its first invocation than on
subsequent calls, and that was the main problem for Bert.
Under high load, recvfrom() can even take tens of microseconds
(I cannot provide profile output yet, but I have shown the data).

So the syscall overhead itself is very small no matter which type of
CPU is used (about 300 nsec on an Athlon, about 1.4 usec on a VIA EPIA),
but the whole function can take quite a lot of time.
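
A minimal sketch of how the first-call versus subsequent-call difference can be observed, assuming Linux with clock_gettime() and the raw syscall() wrapper available. This is not Bert's recvtimings.c; the helper names are invented, getpid is used only because it is a cheap syscall, and each sample also includes the cost of the two clock reads.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Monotonic clock reading in nanoseconds. */
static int64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Time n back-to-back getpid syscalls; out[i] is the i-th call in ns. */
static void time_getpid(int64_t *out, int n)
{
	int i;

	assert(out != NULL && n > 0);
	for (i = 0; i < n; i++) {
		int64_t t0 = now_ns();

		syscall(SYS_getpid);      /* force a real kernel entry */
		out[i] = now_ns() - t0;
	}
}
```

Printing out[0] next to the later samples is enough to see whether the first call stands out on a given machine.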

>   Pavel
> 
> -- 
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) 
> http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

-- 
Evgeniy Polyakov


2.6.21-rc1: known regressions (part 3)

2007-02-25 Thread Adrian Bunk
This email lists some known regressions in 2.6.21-rc1 compared to 2.6.20
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either the submitter of one
of the bugs, the maintainer of an affected subsystem or driver, the author of
a patch that caused a breakage, or someone I consider possibly involved
with one or more of these issues in some other way.

Due to the large number of recipients, please trim the Cc when answering.


Subject: forcedeth: skb_over_panic
References : http://bugzilla.kernel.org/show_bug.cgi?id=8058
Submitter  : Albert Hopkins <[EMAIL PROTECTED]>
Status : unknown


Subject: natsemi ethernet card not detected correctly
References : http://lkml.org/lkml/2007/2/23/4
 http://lkml.org/lkml/2007/2/23/7
Submitter  : Bob Tracy <[EMAIL PROTECTED]>
Caused-By  : Mark Brown <[EMAIL PROTECTED]>
Handled-By : Mark Brown <[EMAIL PROTECTED]>
Patch  : http://lkml.org/lkml/2007/2/23/142
Status : patch available


Subject: request_module: runaway loop modprobe net-pf-1
References : http://lkml.org/lkml/2007/2/21/206
Submitter  : YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]>
Caused-By  : Kay Sievers <[EMAIL PROTECTED]>
 commit c353c3fb0700a3c17ea2b0237710a184232ccd7f
Handled-By : Greg KH <[EMAIL PROTECTED]>
Status : problem is being discussed


Subject: IPV6=m, SUNRPC=y compile error
References : http://bugzilla.kernel.org/show_bug.cgi?id=8050
 http://lkml.org/lkml/2007/2/12/442
 http://lkml.org/lkml/2007/2/20/384
Submitter  : Michael-Luke Jones <[EMAIL PROTECTED]>
 Pete Clements <[EMAIL PROTECTED]>
 Sid Boyce <[EMAIL PROTECTED]>
Caused-By  : Chuck Lever <[EMAIL PROTECTED]>
Handled-By : YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]>
Status : patch available


Subject: WARNING: "compat_agp_ioctl" undefined!
References : http://lkml.org/lkml/2007/2/21/272
Submitter  : Andreas Schwab <[EMAIL PROTECTED]>
Handled-By : Dave Jones <[EMAIL PROTECTED]>
Status : patch available




SV: Re: [tipc-discussion] [RFC: 2.6 patch] net/tipc/: possible cleanups

2007-02-25 Thread Jon Paul Maloy

--- Adrian Bunk <[EMAIL PROTECTED]> wrote:

> It can be re-added at any time when an in-kernel
> user comes.
> 
> But the most interesting question is:
> Why is noone interested in getting his TIPC using
> drivers merged?
> 
I don't think lack of interest is the issue here. The
users I know anything about would be both happy and
proud to contribute code to the main tree. One I know
about, who has developed a very interesting "reliable
bond interface" based on this API, doesn't regard his
code to be up to the kernel coding standards yet,
although I am trying to encourage him. Another one
thinks his function is just too specialized to be of
any common interest.
In the future, I would also be very interested in
seeing a cross-node netlink implementation, carried
over TIPC, using this API. (Unfortunately, I don't
dare to commit to this myself right now; there is too
much left to be done in TIPC.)
So, as you see, keeping the exported symbols would be
a definite advantage, as current and future developers
would not have to patch the kernel to do their work.

Regards
///jon 


Re: [PATCH 3/7] cxgb3 - FW version update

2007-02-25 Thread Steve Wise
Hey Divy,

You missed a printk change.  Here is an updated patch.



Update FW version to 3.2

Signed-off-by: Steve Wise <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/t3_hw.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index 365a7f5..08a6295 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -884,11 +884,11 @@ int t3_check_fw_version(struct adapter *
major = G_FW_VERSION_MAJOR(vers);
minor = G_FW_VERSION_MINOR(vers);
 
-   if (type == FW_VERSION_T3 && major == 3 && minor == 1)
+   if (type == FW_VERSION_T3 && major == 3 && minor == 2)
return 0;
 
CH_ERR(adapter, "found wrong FW version(%u.%u), "
-  "driver needs version 3.1\n", major, minor);
+  "driver needs version 3.2\n", major, minor);
return -EINVAL;
 }
 




Re: [PATCH 3/7] cxgb3 - FW version update

2007-02-25 Thread Jeff Garzik

Steve Wise wrote:

Hey Divy,

You missed a printk change.  Here is an updated patch.



Update FW version to 3.2

Signed-off-by: Steve Wise <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/t3_hw.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index 365a7f5..08a6295 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -884,11 +884,11 @@ int t3_check_fw_version(struct adapter *
major = G_FW_VERSION_MAJOR(vers);
minor = G_FW_VERSION_MINOR(vers);
 
-   if (type == FW_VERSION_T3 && major == 3 && minor == 1)
+   if (type == FW_VERSION_T3 && major == 3 && minor == 2)
return 0;
 
CH_ERR(adapter, "found wrong FW version(%u.%u), "
-  "driver needs version 3.1\n", major, minor);
+  "driver needs version 3.2\n", major, minor);


I would rather fix the code to use constants, and thus avoid this 
problem ever happening again.
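
A minimal sketch of what "use constants" could look like, written as standalone userspace C. The macro names REQ_FW_MAJOR and REQ_FW_MINOR and the helper check_fw_version are invented for this example and are not taken from the cxgb3 driver; the point is only that the comparison and the error message are both derived from the same two definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define REQ_FW_MAJOR 3   /* hypothetical names; bump these in one place */
#define REQ_FW_MINOR 2

/*
 * Returns 0 when the firmware version matches the required one; otherwise
 * writes a diagnostic into msg and returns -1 (the driver returns -EINVAL).
 */
static int check_fw_version(unsigned major, unsigned minor,
			    char *msg, size_t msglen)
{
	assert(msg != NULL && msglen > 0);
	if (major == REQ_FW_MAJOR && minor == REQ_FW_MINOR)
		return 0;

	snprintf(msg, msglen,
		 "found wrong FW version(%u.%u), driver needs version %d.%d",
		 major, minor, REQ_FW_MAJOR, REQ_FW_MINOR);
	return -1;
}
```

With this shape, a firmware bump touches only the two defines, so the check and the message cannot drift apart.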


Jeff





Weird problem with PPPoE on tap interface

2007-02-25 Thread Florian Zumbiehl
Hi,

I'm experiencing a pretty strange problem with kernel PPPoE on tap
interfaces with a vanilla 2.6.20 kernel that prevents the PPP connection
from being established:

Every PPPoE session packet (that is, LCP, since it never gets to a stage
where any other session data is being exchanged) is delivered to pppd
twice. This can be seen from pppd's logging messages when debugging
is enabled, and strace confirmed that it indeed does read() the exact
same data twice in a row from the same file descriptor - even though
a tcpdump on the corresponding tap interface does show each packet only
once.

For confirming this, I used a program with a select() loop that simply
moves packets unchanged and without reordering back and forth between
a "real" ethernet interface in promiscuous mode and the tap interface
used by pppd.

What makes this even stranger is that the setup works perfectly (and
only a single copy of packets is delivered to pppd) if I simply replace
the tap interface in pppd's config with the "real" ethernet interface
that the tap interface was previously bridged to (it's an ISA NE2K
clone, BTW).

Finally, it also works perfectly when I use userspace rp-pppoe through
the tap interface.

So far, I also confirmed that in kernel space, the call to ppp_input()
in drivers/net/pppoe.c is executed as many times as pppd receives a
packet, so the problem must be somewhere before that.

Well, I'm gonna try to find out more - but if someone with some
more knowledge of the involved kernel code would be willing to help
with this in some way or another, that would certainly be
appreciated ;-) - if you do need any additional information, let me
know ...

Florian


[PATCH] Fix oops in xfrm4_dst_destroy()

2007-02-25 Thread Bernhard Walle
With 2.6.21-rc1, I get an oops when running 'ifdown eth0' and an IPsec
connection is active. If I shut down the connection before running 'ifdown
eth0', then there's no problem.  The critical operation of this script is to
kill dhcpd.

The problem is probably caused by commit with git identifier
4337226228e1cfc1d70ee975789c6bd070fb597c (Linus tree) "[IPSEC]: IPv4 over IPv6
IPsec tunnel".

This patch fixes that oops. I don't know the network code of the Linux kernel
in depth, so if the fix is wrong, please change it. But please fix the oops. :)


Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>
---
 net/ipv4/xfrm4_policy.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.21-rc1/net/ipv4/xfrm4_policy.c
===
--- linux-2.6.21-rc1.orig/net/ipv4/xfrm4_policy.c
+++ linux-2.6.21-rc1/net/ipv4/xfrm4_policy.c
@@ -291,7 +291,7 @@ static void xfrm4_dst_destroy(struct dst
 
if (likely(xdst->u.rt.idev))
in_dev_put(xdst->u.rt.idev);
-   if (dst->xfrm->props.family == AF_INET && likely(xdst->u.rt.peer))
+   if (dst->xfrm && dst->xfrm->props.family == AF_INET && likely(xdst->u.rt.peer))
inet_putpeer(xdst->u.rt.peer);
xfrm_dst_destroy(xdst);
 }
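
As a standalone illustration of the guard the patch adds (not the kernel code itself; all type and function names below are made up for the example), the sketch shows a destroy routine that tolerates an object whose state pointer was never set. It relies on && evaluating left to right and stopping at the first false operand, so the NULL pointer is never dereferenced.

```c
#include <assert.h>
#include <stddef.h>

#define FAMILY_INET 2             /* stand-in value for AF_INET */

struct state { int family; };     /* hypothetical stand-in for the xfrm state */

struct bundle {
	struct state *xfrm;       /* may be NULL if setup failed part-way */
	int *peer_refs;           /* reference count of an attached peer */
};

/* Drop the peer reference if one is held; returns 1 if it was dropped. */
static int bundle_destroy(struct bundle *b)
{
	assert(b != NULL);
	if (b->xfrm && b->xfrm->family == FAMILY_INET && b->peer_refs) {
		(*b->peer_refs)--;
		return 1;
	}
	return 0;
}
```

Without the leading b->xfrm test, the partially initialised case would dereference a NULL pointer, which is the kind of oops reported above.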


reliable bond interface

2007-02-25 Thread Randy Macleod

  In a different thread,

Jon Paul Maloy wrote:
> One (TIPC user) I know about,who has developed a very interesting
> "reliable bond interface" based on this API, doesn't regard his code
> to be up to the kernel coding standards yet, although I am trying to
> encourage him.

  Sounds interesting! Can you ask this person to post the code or
at least to present the basic design on the tipc and/or netdev mailing 
list/s?


// Randy



Re: [PATCH] fix limited slow start bug

2007-02-25 Thread David Miller
From: Roger While <[EMAIL PROTECTED]>
Date: Sun, 25 Feb 2007 09:55:34 +0100

> Was anything done about size/member alignment of struct tcp_sock per
> mail from last year  -
> http://marc.theaimsgroup.com/?l=linux-netdev&m=114318857102290&w=2
> 
> (I have no idea what current size is)

Nothing has been done yet but I've been thinking about it a lot
over the past year and I've had some discussions with other
developers such as Arnaldo.

It's just a matter of me being backlogged, so I never get to
it as often as I would like :)


Re: [PATCH] Fix oops in xfrm4_dst_destroy()

2007-02-25 Thread Patrick McHardy
Bernhard Walle wrote:
> With 2.6.21-rc1, I get an oops when running 'ifdown eth0' and an IPsec
> connection is active. If I shut down the connection before running 'ifdown
> eth0', then there's no problem.  The critical operation of this script is to
> kill dhcpd.
> 
> The problem is probably caused by commit with git identifier
> 4337226228e1cfc1d70ee975789c6bd070fb597c (Linus tree) "[IPSEC]: IPv4 over IPv6
> IPsec tunnel".
> 
> This patch fixes that oops. I don't know the network code of the Linux kernel
> in deep, so if that fix is wrong, please change it. But please fix the oops. 
> :)

Looks good. When the xfrm_dst is freed in __xfrm4_bundle_create
after a failed call to xfrm_dst_lookup, the xfrm pointer is not
set; xfrm_dst_destroy already expects this.

Acked-by: Patrick McHardy <[EMAIL PROTECTED]>

> ---
>  net/ipv4/xfrm4_policy.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Index: linux-2.6.21-rc1/net/ipv4/xfrm4_policy.c
> ===
> --- linux-2.6.21-rc1.orig/net/ipv4/xfrm4_policy.c
> +++ linux-2.6.21-rc1/net/ipv4/xfrm4_policy.c
> @@ -291,7 +291,7 @@ static void xfrm4_dst_destroy(struct dst
>  
>   if (likely(xdst->u.rt.idev))
>   in_dev_put(xdst->u.rt.idev);
> - if (dst->xfrm->props.family == AF_INET && likely(xdst->u.rt.peer))
> + if (dst->xfrm && dst->xfrm->props.family == AF_INET && 
> likely(xdst->u.rt.peer))
>   inet_putpeer(xdst->u.rt.peer);
>   xfrm_dst_destroy(xdst);
>  }



[NET] sgiseeq: Don't include unnecessary headerfiles.

2007-02-25 Thread Ralf Baechle
Signed-off-by: Ralf Baechle <[EMAIL PROTECTED]>

diff --git a/drivers/net/sgiseeq.c b/drivers/net/sgiseeq.c
index a833e7f..52ed522 100644
--- a/drivers/net/sgiseeq.c
+++ b/drivers/net/sgiseeq.c
@@ -12,26 +12,15 @@
 #include 
 #include 
 #include 
-#include 
-#include 
-#include 
-#include 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
-#include 
 
-#include 
-#include 
-#include 
-#include 
-#include 
 #include 
 #include 
-#include 
 
 #include "sgiseeq.h"
 


Re: [PATCH] fix limited slow start bug

2007-02-25 Thread Arnaldo Carvalho de Melo

On 2/25/07, David Miller <[EMAIL PROTECTED]> wrote:

From: Roger While <[EMAIL PROTECTED]>
Date: Sun, 25 Feb 2007 09:55:34 +0100

> Was anything done about size/member alignment of struct tcp_sock per
> mail from last year  -
> http://marc.theaimsgroup.com/?l=linux-netdev&m=114318857102290&w=2
>
> (I have no idea what current size is)

Nothing has been done yet but I've been thinking about it a lot
over the past year and I've had some discussions with other
developers such as Arnaldo.

It's just a matter of me being backlogged, so I never get to
it as often as I would like :)


Attached goes a current (DaveM's net-2.6 git tree build) pahole
picture of tcp_sock on UP, 32bits, summary:

}; /* size: 1288, cachelines: 21 */
  /* last cacheline: 8 bytes */

and for the really curious, take a look at:

http://oops.ghostprotocols.net:81/acme/dwarves/tcp_sock.pahole.expand_types.txt

All the types are expanded, makes a pretty big picture :-)

- Arnaldo
[EMAIL PROTECTED] net-2.6]$ pahole ../OUTPUT/qemu/net-2.6/vmlinux tcp_sock
/*  /home/acme/git/net-2.6/include/linux/tcp.h:227 */
struct tcp_sock {
        struct inet_connection_sock inet_conn;            /*     0   844 */
        /* --- cacheline 13 boundary (832 bytes) was 12 bytes ago --- */
        u16                         tcp_header_len;       /*   844     2 */
        u16                         xmit_size_goal;       /*   846     2 */
        __be32                      pred_flags;           /*   848     4 */
        u32                         rcv_nxt;              /*   852     4 */
        u32                         snd_nxt;              /*   856     4 */
        u32                         snd_una;              /*   860     4 */
        u32                         snd_sml;              /*   864     4 */
        u32                         rcv_tstamp;           /*   868     4 */
        u32                         lsndtime;             /*   872     4 */
        struct {
                struct sk_buff_head prequeue;             /*     0    28 */
                struct task_struct * task;                /*    28     4 */
                struct iovec *      iov;                  /*    32     4 */
                int                 memory;               /*    36     4 */
                int                 len;                  /*    40     4 */
        } ucopy;                                          /*   876    44 */
        /* --- cacheline 14 boundary (896 bytes) was 24 bytes ago --- */
        u32                         snd_wl1;              /*   920     4 */
        u32                         snd_wnd;              /*   924     4 */
        u32                         max_window;           /*   928     4 */
        u32                         mss_cache;            /*   932     4 */
        u32                         window_clamp;         /*   936     4 */
        u32                         rcv_ssthresh;         /*   940     4 */
        u32                         frto_highmark;        /*   944     4 */
        u8                          reordering;           /*   948     1 */
        u8                          frto_counter;         /*   949     1 */
        u8                          nonagle;              /*   950     1 */
        u8                          keepalive_probes;     /*   951     1 */
        u32                         srtt;                 /*   952     4 */
        u32                         mdev;                 /*   956     4 */
        /* --- cacheline 15 boundary (960 bytes) --- */
        u32                         mdev_max;             /*   960     4 */
        u32                         rttvar;               /*   964     4 */
        u32                         rtt_seq;              /*   968     4 */
        u32                         packets_out;          /*   972     4 */
        u32                         left_out;             /*   976     4 */
        u32                         retrans_out;          /*   980     4 */
        struct tcp_options_received rx_opt;               /*   984    24 */
        u32                         snd_ssthresh;         /*  1008     4 */
        u32                         snd_cwnd;             /*  1012     4 */
        u16                         snd_cwnd_cnt;         /*  1016     2 */
        u16                         snd_cwnd_clamp;       /*  1018     2 */
        u32                         snd_cwnd_used;        /*  1020     4 */
        /* --- cacheline 16 boundary (1024 bytes) --- */
        u32                         snd_cwnd_stamp;       /*  1024     4 */
        struct sk_buff_head         out_of_order_queue;   /*  1028    28 */
        u32                         rcv_wnd;              /*  1056     4 */
        u32                         rcv_wup;              /*  1060     4 */
        u32                         write_seq;            /*  1064     4 */
        u32                         pushed_seq;           /*  1068     4 */
        u32                         copied_seq;           /*  1072     4 */
struct t

Re: [PATCH] fix limited slow start bug

2007-02-25 Thread Arnaldo Carvalho de Melo

On 2/25/07, Arnaldo Carvalho de Melo <[EMAIL PROTECTED]> wrote:

On 2/25/07, David Miller <[EMAIL PROTECTED]> wrote:
> From: Roger While <[EMAIL PROTECTED]>
> Date: Sun, 25 Feb 2007 09:55:34 +0100
>
> > Was anything done about size/member alignment of struct tcp_sock per
> > mail from last year  -
> > http://marc.theaimsgroup.com/?l=linux-netdev&m=114318857102290&w=2
> >
> > (I have no idea what current size is)
>
> Nothing has been done yet but I've been thinking about it a lot
> over the past year and I've had some discussions with other
> developers such as Arnaldo.
>
> It's just a matter of me being backlogged, so I never get to
> it as often as I would like :)

Attached goes a current (DaveM's net-2.6 git tree build) pahole
picture of tcp_sock on UP, 32bits, summary:

}; /* size: 1288, cachelines: 21 */
   /* last cacheline: 8 bytes */

and for the really curious, take a look at:

http://oops.ghostprotocols.net:81/acme/dwarves/tcp_sock.pahole.expand_types.txt

All the types are expanded, makes a pretty big picture :-)



And looking at it I saw I have Ingo's timer debugging option enabled,
which makes struct timer_list a bit bigger:

struct timer_list {
   struct list_head {
      struct list_head * next;                        /*     0     4 */
      struct list_head * prev;                        /*     4     4 */
   } entry;                                           /*     0     8 */
   long unsigned int  expires;                        /*     8     4 */
   void               (*function)(long unsigned int); /*    12     4 */
   long unsigned int  data;                           /*    16     4 */
   struct tvec_t_base_s * base;                       /*    20     4 */

   void *             start_site;                     /*    24     4 */
   char               start_comm[16];                 /*    28    16 */
   int                start_pid;                      /*    44     4 */
} icsk_delack_timer;                                  /*   668    48 */

The last three members are related to debugging, so discount 24 bytes
times 3, as there are three struct timer_list inside struct tcp_sock.
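
As a rough check of that discount, here is a minimal C sketch. It assumes
64-byte cachelines (which is what the pahole boundaries suggest, e.g.
cacheline 15 at 960 bytes) and that removing the debug members does not
change padding elsewhere. The helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum {
	CACHELINE_BYTES    = 64,   /* assumption: inferred from the pahole boundaries */
	TCP_SOCK_BYTES     = 1288, /* from the pahole summary */
	TIMER_DEBUG_BYTES  = 24,   /* start_site (4) + start_comm[16] + start_pid (4) */
	TIMERS_IN_TCP_SOCK = 3     /* struct timer_list instances inside tcp_sock */
};

/* Hypothetical helper: number of cachelines needed to hold `bytes` bytes. */
static size_t cachelines_for(size_t bytes)
{
	assert(bytes > 0);
	return (bytes + CACHELINE_BYTES - 1) / CACHELINE_BYTES;
}

/* Hypothetical helper: tcp_sock size with the timer debug members discounted. */
static size_t tcp_sock_bytes_without_timer_debug(void)
{
	return TCP_SOCK_BYTES - TIMER_DEBUG_BYTES * TIMERS_IN_TCP_SOCK;
}
```

Under these assumptions the discounted size is 1288 - 72 = 1216 bytes,
which fits in 19 cachelines instead of 21.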

- Arnaldo


Re: 2.6.21-rc1: known regressions (part 3)

2007-02-25 Thread Greg KH
On Sun, Feb 25, 2007 at 07:02:51PM +0100, Adrian Bunk wrote:
> 
> 
> Subject: request_module: runaway loop modprobe net-pf-1
> References : http://lkml.org/lkml/2007/2/21/206
> Submitter  : YOSHIFUJI Hideaki /  <[EMAIL PROTECTED]>
> Caused-By  : Kay Sievers <[EMAIL PROTECTED]>
>  commit c353c3fb0700a3c17ea2b0237710a184232ccd7f
> Handled-By : Greg KH <[EMAIL PROTECTED]>
> Status : problem is being discussed

Patch has been reverted and submitted to Linus to pull, but he's out of
town right now...

thanks,

greg k-h


Re: [PATCH 3/7] cxgb3 - FW version update

2007-02-25 Thread Steve Wise

> I would rather fix the code to use constants, and thus avoid this 
> problem ever happening again.
> 
>   Jeff
> 

How's this (not tested)?


---

 drivers/net/cxgb3/t3_hw.c   |    6 ++++--
 drivers/net/cxgb3/version.h |    2 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index 365a7f5..4b4cffb 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -884,11 +884,13 @@ int t3_check_fw_version(struct adapter *
major = G_FW_VERSION_MAJOR(vers);
minor = G_FW_VERSION_MINOR(vers);
 
-   if (type == FW_VERSION_T3 && major == 3 && minor == 1)
+   if (type == FW_VERSION_T3 && major == FW_VERSION_MAJOR && 
+   minor == FW_VERSION_MINOR)
return 0;
 
CH_ERR(adapter, "found wrong FW version(%u.%u), "
-  "driver needs version 3.1\n", major, minor);
+  "driver needs version %u.%u\n", major, minor, 
+  FW_VERSION_MAJOR, FW_VERSION_MINOR);
return -EINVAL;
 }
 
diff --git a/drivers/net/cxgb3/version.h b/drivers/net/cxgb3/version.h
index 2b67dd5..782a6cf 100644
--- a/drivers/net/cxgb3/version.h
+++ b/drivers/net/cxgb3/version.h
@@ -36,4 +36,6 @@ #define DRV_DESC "Chelsio T3 Network Dri
 #define DRV_NAME "cxgb3"
 /* Driver version */
 #define DRV_VERSION "1.0"
+#define FW_VERSION_MAJOR 3
+#define FW_VERSION_MINOR 2
 #endif /* __CHELSIO_VERSION_H */
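
For readers without the driver tree at hand, the pattern in this patch
(compare against named constants and print the same constants in the
error message) can be sketched in user-space C. The bit layout and the
demo_* helpers below are assumptions made for illustration; the driver
uses its own G_FW_VERSION_* macros, whose layout is not shown here.

```c
#include <stdint.h>
#include <stdio.h>

#define FW_VERSION_MAJOR 3
#define FW_VERSION_MINOR 2

/*
 * Hypothetical layout: major in bits 16-23, minor in bits 8-15.
 * This is NOT taken from the cxgb3 driver.
 */
static unsigned int demo_fw_major(uint32_t vers)
{
	return (vers >> 16) & 0xff;
}

static unsigned int demo_fw_minor(uint32_t vers)
{
	return (vers >> 8) & 0xff;
}

/* Returns 0 when the version matches the constants, -1 otherwise. */
static int demo_check_fw_version(uint32_t vers)
{
	unsigned int major = demo_fw_major(vers);
	unsigned int minor = demo_fw_minor(vers);

	if (major == FW_VERSION_MAJOR && minor == FW_VERSION_MINOR)
		return 0;

	/* The expected version comes from the same constants as the check. */
	fprintf(stderr, "found wrong FW version(%u.%u), "
		"driver needs version %u.%u\n", major, minor,
		(unsigned int)FW_VERSION_MAJOR, (unsigned int)FW_VERSION_MINOR);
	return -1;
}
```

With this shape, a later firmware bump only touches the two #define
lines, and the check and the message cannot drift apart.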




Re: [PATCH 3/7] cxgb3 - FW version update

2007-02-25 Thread Jeff Garzik

Steve Wise wrote:
>> I would rather fix the code to use constants, and thus avoid this
>> problem ever happening again.
>>
>>   Jeff
>
> How's this (not tested)?

seems OK to me




[PATCH 3/7] cxgb3 - FW version update

2007-02-25 Thread divy
From: Divy Le Ray <[EMAIL PROTECTED]>

Update FW version to 3.2

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/cxgb3/t3_hw.c   |    6 ++++--
 drivers/net/cxgb3/version.h |    2 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index 365a7f5..eaa7a2e 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -884,11 +884,13 @@ int t3_check_fw_version(struct adapter *
major = G_FW_VERSION_MAJOR(vers);
minor = G_FW_VERSION_MINOR(vers);
 
-   if (type == FW_VERSION_T3 && major == 3 && minor == 1)
+   if (type == FW_VERSION_T3 && major == FW_VERSION_MAJOR &&
+   minor == FW_VERSION_MINOR)
return 0;
 
CH_ERR(adapter, "found wrong FW version(%u.%u), "
-  "driver needs version 3.1\n", major, minor);
+  "driver needs version %u.%u\n", major, minor,
+  FW_VERSION_MAJOR, FW_VERSION_MINOR);
return -EINVAL;
 }
 
diff --git a/drivers/net/cxgb3/version.h b/drivers/net/cxgb3/version.h
index 2b67dd5..782a6cf 100644
--- a/drivers/net/cxgb3/version.h
+++ b/drivers/net/cxgb3/version.h
@@ -36,4 +36,6 @@
 #define DRV_NAME "cxgb3"
 /* Driver version */
 #define DRV_VERSION "1.0"
+#define FW_VERSION_MAJOR 3
+#define FW_VERSION_MINOR 2
 #endif /* __CHELSIO_VERSION_H */


Re: [PATCH 3/7] cxgb3 - FW version update

2007-02-25 Thread Divy Le Ray

Jeff Garzik wrote:
> Steve Wise wrote:
>>> I would rather fix the code to use constants, and thus avoid this
>>> problem ever happening again.
>>>
>>>   Jeff
>>
>> How's this (not tested)?
>
> seems OK to me

I tested it and resubmitted. Thanks for the fix suggestion and the patch!

Cheers,
Divy


Re: [PATCH 7/7] cxgb3 - Add SW LRO support

2007-02-25 Thread Christoph Hellwig
On Sat, Feb 24, 2007 at 04:44:23PM -0800, [EMAIL PROTECTED] wrote:
> From: Divy Le Ray <[EMAIL PROTECTED]>
> 
> Add all-in-sw lro support.

Doing this in a LLDD doesn't sound like a good idea.  Have you
tried doing this in the core networking code instead?


Re: [PATCH 4/5] r8169: more alignment for the 0x8168

2007-02-25 Thread Philip Craig
Francois Romieu wrote:
> The experimental r8169 patch of the day against 2.6.21-rc1 is available at:
> http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.21-rc1/

Is 0006-r8169-confusion-between-hardware-and-IP-header-alignment.txt
the only relevant patch?

This only partially helps.  Many of the packets are greater than 200
bytes so copybreak doesn't apply to them.

Can we assume anything about the alignment of skb->data?  I think it
should be 4 byte aligned, otherwise the whole NET_IP_ALIGN thing
won't work.  All the drivers I looked at just reserve NET_IP_ALIGN
without checking the alignment first.

So can you do something like set align to 0 for RTL_CFG_0 and change
rtl8169_alloc_rx_skb() to:
skb_reserve(skb, align ? (align - 1) & (u32)skb->data : NET_IP_ALIGN);

BTW, should the alignment expression be:
(((u32)skb->data + (align - 1)) & ~(align - 1)) - (u32)skb->data
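
The difference between the two expressions shows up in a small
user-space sketch. It uses uintptr_t instead of the (u32) cast so that
it also builds on 64-bit hosts, and the function names are invented for
illustration; nothing here is taken from the r8169 driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * The "(align - 1) & addr" form: how far addr sits past the previous
 * align boundary. align must be a power of two.
 */
static uintptr_t misalignment(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return addr & (align - 1);
}

/*
 * The round-up form: how many bytes to skip so that addr lands on the
 * next align boundary (0 if it is already aligned).
 */
static uintptr_t pad_to_align(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((addr + (align - 1)) & ~(align - 1)) - addr;
}
```

For an address of 0x1002 and align = 8, the first form gives 2 and the
second gives 6. Skipping 2 bytes lands on 0x1004, which is still not
8-byte aligned, while skipping 6 lands on 0x1008. The two forms agree
only when the address is already aligned or sits exactly align/2 past a
boundary.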


D-Link DGE-528T (r8169) autonegotiation of 1000 Mbps link does not work

2007-02-25 Thread Petri T. Koistinen
Hi!

I just bought two D-Link DGE-528T (uses r8169 driver) network adapters
to have a nice 1 Gbps home network between two computers.

I have gigabit crossover cable that is connected like this

Pin  Connector #1  Connector #2
1    white/orange  white/green
2    orange        green
3    white/green   white/orange
4    blue          white/brown
5    white/blue    brown
6    green         orange
7    white/brown   blue
8    brown         white/blue
(from: http://logout.sh/computers/net/gigabit/)

and when I use this cable I only get a 10 Mbps connection when connected
to another D-Link DGE-528T.

However, if I connect the other end of the cable to an Intel EtherExpress
100 Mbps card, a 100 Mbps connection is auto-negotiated OK.

I can however force gigabit link on both D-Link DGE-528T using

insmod ./r8169.ko media=0x10

command. And I get about 15 MiB/s transfer speed when testing
with netcat and pmr[1] (without NAPI). Which I count as some kind of
sucky gigabit connection.

A few questions for you:

Does anyone know if there is a known bug in auto-negotiation with the
Realtek 8169 chip? (Or is that just the wrong cable?)

Can I force a 1000 Mbps link in some other way than giving an option to
insmod? Using /proc or something?

What kind of transfer speed have you reached with these cards? That
15 MiB/s is not what I expected.

Thanks!

Best regards,
Petri Koistinen

[1]
Test setup:
Server: nc -l -p 2000 /dev/null

pmr (ex. pipemeter) is a command line filter that measures
bandwidth going through the pipe:
http://zakalwe.fi/~shd/foss/pmr/pmr-0.12.tar.bz2
