Re: [1/1] connector: export cn_already_initialized.

2006-05-06 Thread Evgeniy Polyakov
On Fri, May 05, 2006 at 05:16:46PM -0700, David S. Miller ([EMAIL PROTECTED]) 
wrote:
> From: Evgeniy Polyakov <[EMAIL PROTECTED]>
> Date: Thu, 4 May 2006 16:24:22 +0400
> 
> > No in-kernel users require it to be exported, so if you do think it
> > should not be exported I will force external module changes.
> 
> What are the alternatives?

This flag shows that the connector has finished its initialization and
that its callbacks can now be safely added. In-kernel users (such as the
various accounting subsystems) that may be initialized before the
connector (like fs) use the flag to check whether callbacks can be added yet.

Some external patches, which can be built either statically or as a
module, just check that value, and thus fail with an unresolved symbol
when both cn and the patch are built as modules.

The right sequence of operations should be the following:
if an external module is loaded while cn is neither loaded nor compiled
into the kernel, insmod will simply fail with unresolved symbols
(cn_add_callback and others);
if cn is already loaded or was built into the kernel, then it has already
been initialized, there is no need to check that value, and the external
module should simply be loaded.

I think the right solution is to call external init functions after the
cn init function, but their ordering is not always known.

While writing this I thought of another solution: cn_add_callback()
can just return -EINPROGRESS or some other special error, which means
that it is too early to call it and it must be run again later.
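
The caller side of that idea can be sketched in plain user-space C. This is
only an illustration under stated assumptions: demo_add_callback(),
demo_core_init() and demo_try_register() are hypothetical stand-ins, not
connector functions, and core_ready merely plays the role of
cn_already_initialized.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the connector core; not the kernel API. */
static int core_ready;   /* plays the role of cn_already_initialized */
static int registered;   /* number of callbacks accepted so far */

/* Refuses registration until the core has finished initializing. */
static int demo_add_callback(void (*cb)(void))
{
	if (!core_ready)
		return -EINPROGRESS;
	if (cb == NULL)
		return -EINVAL;
	registered++;
	assert(registered > 0);
	return 0;
}

/* Marks the core as initialized, like the end of an init function. */
static void demo_core_init(void)
{
	core_ready = 1;
}

static void demo_cb(void)
{
}

/*
 * Caller side: returns 1 if registration must be retried later,
 * 0 on success, or a negative errno for a real failure.
 */
static int demo_try_register(void)
{
	int err = demo_add_callback(demo_cb);

	if (err == -EINPROGRESS)
		return 1;
	return err;
}
```

With this shape an early caller does not need to see the flag at all; it
only has to treat -EINPROGRESS as "try again later".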

Signed-off-by: Evgeniy Polyakov <[EMAIL PROTECTED]>

diff --git a/drivers/connector/connector.c b/drivers/connector/connector.c
index 3589707..f852e68 100644
--- a/drivers/connector/connector.c
+++ b/drivers/connector/connector.c
@@ -308,6 +308,9 @@ int cn_add_callback(struct cb_id *id, ch
 	int err;
 	struct cn_dev *dev = &cdev;
 
+	if (!cn_already_initialized)
+		return -EINPROGRESS;
+
 	err = cn_queue_add_callback(dev->cbdev, name, id, callback);
 	if (err)
 		return err;
@@ -456,6 +459,8 @@ static int __init cn_init(void)
 			sock_release(dev->nls->sk_socket);
 		return -EINVAL;
 	}
+
+	cn_already_initialized = 1;
 
 	err = cn_add_callback(&dev->id, "connector", &cn_callback);
 	if (err) {
@@ -465,8 +470,6 @@ static int __init cn_init(void)
 		return -EINVAL;
 	}
 
-	cn_already_initialized = 1;
-
 	return 0;
 }
 

-- 
Evgeniy Polyakov
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: VJ Channel API - driver level (PATCH)

2006-05-06 Thread Evgeniy Polyakov
On Fri, May 05, 2006 at 05:35:33PM -0700, David S. Miller ([EMAIL PROTECTED]) 
wrote:
> From: Evgeniy Polyakov <[EMAIL PROTECTED]>
> Date: Fri, 5 May 2006 13:36:56 +0400
> 
> > Hardware folks could also create their own implementation and show
> > the community whether their approach is good or not.
> 
> Designing hardware for non-existing software infrastructure is
> risky business :-)

There are companies that do TOE and crypto accelerators without
support even from Windows :)

-- 
Evgeniy Polyakov


Re: VJ Channel API - driver level (PATCH)

2006-05-06 Thread Evgeniy Polyakov
On Sat, May 06, 2006 at 12:42:38PM +0400, Evgeniy Polyakov ([EMAIL PROTECTED]) 
wrote:
> On Fri, May 05, 2006 at 05:35:33PM -0700, David S. Miller ([EMAIL PROTECTED]) 
> wrote:
> > From: Evgeniy Polyakov <[EMAIL PROTECTED]>
> > Date: Fri, 5 May 2006 13:36:56 +0400
> > 
> > > Hardware folks could also create their own implementation and show
> > > the community whether their approach is good or not.
> > 
> > Designing hardware for non-existing software infrastructure is
> > risky business :-)
> 
> There are companies that do TOE and crypto accelerators without
> support even from Windows :)

And actually most of them have research departments which work on new
and interesting developments.

It is Open Source after all - just do whatever you want and show the
community that it is useful. Neterion did that right with UFO and LRO.

IBM started to do it with netchannels.

-- 
Evgeniy Polyakov


[PATCH,RFT] bcm43xx: use softmac-suggested TX rate

2006-05-06 Thread Daniel Drake
Can a bcm43xx user please test this? It uses the new txrate stuff found in
the wireless-dev tree.

Signed-off-by: Daniel Drake <[EMAIL PROTECTED]>

Index: linux/drivers/net/wireless/bcm43xx/bcm43xx_xmit.c
===
--- linux.orig/drivers/net/wireless/bcm43xx/bcm43xx_xmit.c
+++ linux/drivers/net/wireless/bcm43xx/bcm43xx_xmit.c
@@ -296,11 +296,14 @@ void bcm43xx_generate_txhdr(struct bcm43
 	u16 control = 0;
 	u16 wsec_rate = 0;
 	u16 encrypt_frame;
+	u16 ftype = WLAN_FC_GET_TYPE(le16_to_cpu(wireless_header->frame_ctl));
+	int is_mgt = (ftype == IEEE80211_FTYPE_MGMT) != 0;
 
 	/* Now construct the TX header. */
 	memset(txhdr, 0, sizeof(*txhdr));
 
-	bitrate = bcm->softmac->txrates.default_rate;
+	bitrate = ieee80211softmac_suggest_txrate(bcm->softmac,
+		is_multicast_ether_addr(wireless_header->addr1), is_mgt);
 	ofdm_modulation = !(ieee80211_is_cck_rate(bitrate));
 	fallback_bitrate = bcm43xx_calc_fallback_rate(bitrate);
 	fallback_ofdm_modulation = !(ieee80211_is_cck_rate(fallback_bitrate));


[RFC 0/3] [IPSEC]: Add xfrm_mode support

2006-05-06 Thread Herbert Xu
Hi:

These patches abstract out the protocol-specific encapsulation parts of
IPsec into what I've termed xfrm_mode objects.  This allows us to share
a little bit more code.  But more importantly, it allows us to add new
encapsulation modes such as BEET or v4/v6 and v6/v4 without polluting
the generic xfrm_input/xfrm_output paths.

These patches are not yet ready to be applied as I need to test them
a bit more.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


[RFC 1/3] [IPSEC] xfrm: Undo afinfo lock proliferation

2006-05-06 Thread Herbert Xu
Hi:

The number of locks used to manage afinfo structures can easily be reduced
down to one each for policy and state respectively.  This is based on the
observation that the write locks are only held by module insertion/removal
which are very rare events so there is no need to further differentiate
between the insertion of modules like ipv6 versus esp6.

The removal of the read locks in xfrm4_policy.c/xfrm6_policy.c might look
suspicious at first.  However, after you realise that nobody ever takes
the corresponding write lock you'll feel better :)

As far as I can gather it's an attempt to guard against the removal of
the corresponding modules.  Since neither module can be unloaded at all
we can leave it to whoever fixes up IPv6 unloading :)

Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -204,8 +204,7 @@ struct xfrm_type;
 struct xfrm_dst;
 struct xfrm_policy_afinfo {
 	unsigned short		family;
-	rwlock_t		lock;
-	struct xfrm_type_map	*type_map;
+	struct xfrm_type	*type_map[256];
 	struct dst_ops		*dst_ops;
 	void			(*garbage_collect)(void);
 	int			(*dst_lookup)(struct xfrm_dst **dst, struct flowi *fl);
@@ -232,7 +231,6 @@ extern int __xfrm_state_delete(struct xf
 
 struct xfrm_state_afinfo {
 	unsigned short		family;
-	rwlock_t		lock;
 	struct list_head	*state_bydst;
 	struct list_head	*state_byspi;
 	int			(*init_flags)(struct xfrm_state *x);
@@ -264,11 +262,6 @@ struct xfrm_type
 	u32			(*get_max_size)(struct xfrm_state *, int size);
 };
 
-struct xfrm_type_map {
-	rwlock_t		lock;
-	struct xfrm_type	*map[256];
-};
-
 extern int xfrm_register_type(struct xfrm_type *type, unsigned short family);
 extern int xfrm_unregister_type(struct xfrm_type *type, unsigned short family);
 extern struct xfrm_type *xfrm_get_type(u8 proto, unsigned short family);
diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
--- a/net/ipv4/xfrm4_policy.c
+++ b/net/ipv4/xfrm4_policy.c
@@ -17,8 +17,6 @@
 static struct dst_ops xfrm4_dst_ops;
 static struct xfrm_policy_afinfo xfrm4_policy_afinfo;
 
-static struct xfrm_type_map xfrm4_type_map = { .lock = RW_LOCK_UNLOCKED };
-
 static int xfrm4_dst_lookup(struct xfrm_dst **dst, struct flowi *fl)
 {
return __ip_route_output_key((struct rtable**)dst, fl);
@@ -237,9 +235,7 @@ _decode_session4(struct sk_buff *skb, st
 
 static inline int xfrm4_garbage_collect(void)
 {
-   read_lock(&xfrm4_policy_afinfo.lock);
xfrm4_policy_afinfo.garbage_collect();
-   read_unlock(&xfrm4_policy_afinfo.lock);
return (atomic_read(&xfrm4_dst_ops.entries) > 
xfrm4_dst_ops.gc_thresh*2);
 }
 
@@ -299,8 +295,6 @@ static struct dst_ops xfrm4_dst_ops = {
 
 static struct xfrm_policy_afinfo xfrm4_policy_afinfo = {
.family =   AF_INET,
-   .lock = RW_LOCK_UNLOCKED,
-   .type_map = &xfrm4_type_map,
.dst_ops =  &xfrm4_dst_ops,
.dst_lookup =   xfrm4_dst_lookup,
.find_bundle =  __xfrm4_find_bundle,
diff --git a/net/ipv4/xfrm4_state.c b/net/ipv4/xfrm4_state.c
--- a/net/ipv4/xfrm4_state.c
+++ b/net/ipv4/xfrm4_state.c
@@ -131,7 +131,6 @@ __xfrm4_find_acq(u8 mode, u32 reqid, u8 
 
 static struct xfrm_state_afinfo xfrm4_state_afinfo = {
.family = AF_INET,
-   .lock   = RW_LOCK_UNLOCKED,
.init_flags = xfrm4_init_flags,
.init_tempsel   = __xfrm4_init_tempsel,
.state_lookup   = __xfrm4_state_lookup,
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -23,8 +23,6 @@
 static struct dst_ops xfrm6_dst_ops;
 static struct xfrm_policy_afinfo xfrm6_policy_afinfo;
 
-static struct xfrm_type_map xfrm6_type_map = { .lock = RW_LOCK_UNLOCKED };
-
 static int xfrm6_dst_lookup(struct xfrm_dst **dst, struct flowi *fl)
 {
int err = 0;
@@ -249,9 +247,7 @@ _decode_session6(struct sk_buff *skb, st
 
 static inline int xfrm6_garbage_collect(void)
 {
-   read_lock(&xfrm6_policy_afinfo.lock);
xfrm6_policy_afinfo.garbage_collect();
-   read_unlock(&xfrm6_policy_afinfo.lock);
return (atomic_read(&xfrm6_dst_ops.entries) > 
xfrm6_dst_ops.gc_thresh*2);
 }
 
@@ -311,8 +307,6 @@ static struct dst_ops xfrm6_dst_ops = {
 
 static struct xfrm_policy_afinfo xfrm6_policy_afinfo = {
.family =   AF_INET6,
-   .lock = 

[RFC 2/3] [IPSEC] xfrm: Abstract out encapsulation modes

2006-05-06 Thread Herbert Xu
Hi:

This patch adds the structure xfrm_mode.  It is meant to represent
the operations carried out by transport/tunnel modes.

By doing this we allow additional encapsulation modes to be added
without clogging up the xfrm_input/xfrm_output paths.

Candidate modes include 4-to-6 tunnel mode, 6-to-4 tunnel mode, and
BEET modes.
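
As a rough user-space illustration of the idea (not the code in this
patch), the sketch below keeps a small table of mode objects indexed by
encapsulation type and lets the generic input path make a single indirect
call. All demo_* names, struct demo_pkt and the error codes are
assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_MODE_TRANSPORT 0
#define DEMO_MODE_TUNNEL    1
#define DEMO_MODE_MAX       2

struct demo_pkt {
	int decapsulated;   /* set by the tunnel handler in this sketch */
};

struct demo_mode {
	int (*input)(struct demo_pkt *pkt);
	int encap;
};

static struct demo_mode *demo_mode_map[DEMO_MODE_MAX];

/* Adds a mode to the table; each encap slot can be taken only once. */
static int demo_register_mode(struct demo_mode *mode)
{
	if (mode == NULL || mode->encap < 0 || mode->encap >= DEMO_MODE_MAX)
		return -EINVAL;
	if (demo_mode_map[mode->encap] != NULL)
		return -EEXIST;
	demo_mode_map[mode->encap] = mode;
	return 0;
}

static int demo_transport_input(struct demo_pkt *pkt)
{
	pkt->decapsulated = 0;
	return 0;
}

static int demo_tunnel_input(struct demo_pkt *pkt)
{
	pkt->decapsulated = 1;
	return 0;
}

static struct demo_mode demo_transport = { demo_transport_input, DEMO_MODE_TRANSPORT };
static struct demo_mode demo_tunnel = { demo_tunnel_input, DEMO_MODE_TUNNEL };

/* Generic input path: no per-mode branches, just one indirect call. */
static int demo_input(int encap, struct demo_pkt *pkt)
{
	if (encap < 0 || encap >= DEMO_MODE_MAX || demo_mode_map[encap] == NULL)
		return -ENOENT;
	assert(demo_mode_map[encap]->input != NULL);
	return demo_mode_map[encap]->input(pkt);
}
```

Adding a new mode then means registering one more object rather than adding
another branch to the shared path.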

Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h
--- a/include/linux/xfrm.h
+++ b/include/linux/xfrm.h
@@ -118,6 +118,10 @@ enum
XFRM_SHARE_UNIQUE   /* Use once */
 };
 
+#define XFRM_MODE_TRANSPORT 0
+#define XFRM_MODE_TUNNEL 1
+#define XFRM_MODE_MAX 2
+
 /* Netlink configuration messages.  */
 enum {
XFRM_MSG_BASE = 0x10,
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -20,6 +20,8 @@
 #include 
 
 #define XFRM_ALIGN8(len)   (((len) + 7) & ~7)
+#define MODULE_ALIAS_XFRM_MODE(family, encap) \
+   MODULE_ALIAS("xfrm-mode-" __stringify(family) "-" __stringify(encap))
 
 extern struct sock *xfrm_nl;
 extern u32 sysctl_xfrm_aevent_etime;
@@ -164,6 +166,7 @@ struct xfrm_state
 	/* Reference to data common to all the instances of this
 	 * transformer. */
 	struct xfrm_type	*type;
+	struct xfrm_mode	*mode;
 
 	/* Security context */
 	struct xfrm_sec_ctx	*security;
@@ -205,6 +208,7 @@ struct xfrm_dst;
 struct xfrm_policy_afinfo {
 	unsigned short		family;
 	struct xfrm_type	*type_map[256];
+	struct xfrm_mode	*mode_map[XFRM_MODE_MAX];
 	struct dst_ops		*dst_ops;
 	void			(*garbage_collect)(void);
 	int			(*dst_lookup)(struct xfrm_dst **dst, struct flowi *fl);
@@ -267,6 +271,19 @@ extern int xfrm_unregister_type(struct x
 extern struct xfrm_type *xfrm_get_type(u8 proto, unsigned short family);
 extern void xfrm_put_type(struct xfrm_type *type);
 
+struct xfrm_mode {
+   int (*input)(struct xfrm_state *x, struct sk_buff *skb);
+   int (*output)(struct sk_buff *skb);
+
+   struct module *owner;
+   int encap;
+};
+
+extern int xfrm_register_mode(struct xfrm_mode *mode, int family);
+extern int xfrm_unregister_mode(struct xfrm_mode *mode, int family);
+extern struct xfrm_mode *xfrm_get_mode(int encap, int family);
+extern void xfrm_put_mode(struct xfrm_mode *mode);
+
 struct xfrm_tmpl
 {
 /* id in template is interpreted as:
diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
--- a/net/ipv4/Kconfig
+++ b/net/ipv4/Kconfig
@@ -414,6 +414,24 @@ config INET_TUNNEL
tristate
default n
 
+config INET_XFRM_MODE_TRANSPORT
+   tristate "IP: IPsec transport mode"
+   default y
+   select XFRM
+   ---help---
+ Support for IPsec transport mode.
+
+ If unsure, say Y.
+
+config INET_XFRM_MODE_TUNNEL
+   tristate "IP: IPsec tunnel mode"
+   default y
+   select XFRM
+   ---help---
+ Support for IPsec tunnel mode.
+
+ If unsure, say Y.
+
 config INET_DIAG
tristate "INET: socket monitoring interface"
default y
diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile
--- a/net/ipv4/Makefile
+++ b/net/ipv4/Makefile
@@ -24,6 +24,8 @@ obj-$(CONFIG_INET_ESP) += esp4.o
 obj-$(CONFIG_INET_IPCOMP) += ipcomp.o
 obj-$(CONFIG_INET_XFRM_TUNNEL) += xfrm4_tunnel.o
 obj-$(CONFIG_INET_TUNNEL) += tunnel4.o
+obj-$(CONFIG_INET_XFRM_MODE_TRANSPORT) += xfrm4_mode_transport.o
+obj-$(CONFIG_INET_XFRM_MODE_TUNNEL) += xfrm4_mode_tunnel.o
 obj-$(CONFIG_IP_PNP) += ipconfig.o
 obj-$(CONFIG_IP_ROUTE_MULTIPATH_RR) += multipath_rr.o
 obj-$(CONFIG_IP_ROUTE_MULTIPATH_RANDOM) += multipath_random.o
diff --git a/net/ipv4/xfrm4_input.c b/net/ipv4/xfrm4_input.c
--- a/net/ipv4/xfrm4_input.c
+++ b/net/ipv4/xfrm4_input.c
@@ -13,7 +13,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
@@ -24,15 +23,6 @@ int xfrm4_rcv(struct sk_buff *skb)
 
 EXPORT_SYMBOL(xfrm4_rcv);
 
-static inline void ipip_ecn_decapsulate(struct sk_buff *skb)
-{
-   struct iphdr *outer_iph = skb->nh.iph;
-   struct iphdr *inner_iph = skb->h.ipiph;
-
-   if (INET_ECN_is_ce(outer_iph->tos))
-   IP_ECN_set_ce(inner_iph);
-}
-
 static int xfrm4_parse_spi(struct sk_buff *skb, u8 nexthdr, u32 *spi, u32 *seq)
 {
switch (nexthdr) {
@@ -113,24 +103,10 @@ int xfrm4_rcv_encap(struct sk_buff *skb,
 
xfrm_vec[xfrm_nr++] = x;
 
-   iph = skb->nh.iph;
+   if (x->mode->input(x, skb))
+   goto drop;
 
if (x->props.mode) {
-   if (iph->protocol != IPPROTO_IPIP)
-   goto drop;
-   if (!pskb_may_pull(skb, sizeof(

[RFC 3/3] [IPSEC]: Abstract out transport mode input code

2006-05-06 Thread Herbert Xu
Hi:

This patch is totally untested but shows how we can use xfrm_mode to
remove unnecessary code sharing between modes that in fact end up
slowing things down.  In particular, notice how we've eliminated a
double IP header copying for tunnel mode which is in fact the common
case.

I need to finish it for IPv6 and actually test this one :)
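
For readers unfamiliar with the header shuffling involved, here is a small
user-space sketch of the one copy that transport mode still needs: once the
security header has been consumed, the IP header is moved so that it sits
directly in front of the payload. In tunnel mode the outer header is simply
dropped, so no such copy is required. demo_transport_input() and the flat
byte buffer are assumptions for illustration, not the skb API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Layout before: [IP header: ihl bytes][security header: sec bytes][payload]
 * Moves the IP header so that it directly precedes the payload and returns
 * a pointer to its new start. Source and destination may overlap, so
 * memmove() is used rather than memcpy().
 */
static unsigned char *demo_transport_input(unsigned char *buf, size_t ihl,
					   size_t sec)
{
	unsigned char *new_hdr;

	assert(buf != NULL);
	new_hdr = buf + sec;
	if (sec != 0)
		memmove(new_hdr, buf, ihl);
	return new_hdr;
}
```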

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
diff --git a/net/ipv4/ah4.c b/net/ipv4/ah4.c
index e2e4771..30524b9 100644
--- a/net/ipv4/ah4.c
+++ b/net/ipv4/ah4.c
@@ -172,11 +172,8 @@
}
}
((struct iphdr*)work_buf)->protocol = ah->nexthdr;
-   skb->nh.raw = skb_pull(skb, ah_hlen);
-   memcpy(skb->nh.raw, work_buf, iph->ihl*4);
-   skb->nh.iph->tot_len = htons(skb->len);
-   skb_pull(skb, skb->nh.iph->ihl*4);
-   skb->h.raw = skb->data;
+   skb->h.raw = memcpy(skb->nh.raw, work_buf, iph->ihl * 4) + ah_hlen;
+   __skb_pull(skb, ah_hlen + iph->ihl * 4);
 
return 0;
 
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 9d1881c..451803e 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -143,10 +143,8 @@
int alen = esp->auth.icv_trunc_len;
int elen = skb->len - sizeof(struct ip_esp_hdr) - esp->conf.ivlen - 
alen;
int nfrags;
-   int encap_len = 0;
u8 nexthdr[2];
struct scatterlist *sg;
-   u8 workbuf[60];
int padlen;
 
if (!pskb_may_pull(skb, sizeof(struct ip_esp_hdr)))
@@ -209,7 +207,6 @@
struct udphdr *uh;
 
uh = (struct udphdr *)(iph + 1);
-   encap_len = (void*)esph - (void*)uh;
 
/*
 * 1) if the NAT-T peer's IP or port changed then
@@ -246,11 +243,9 @@
 
iph->protocol = nexthdr[1];
pskb_trim(skb, skb->len - alen - padlen - 2);
-   memcpy(workbuf, skb->nh.raw, iph->ihl*4);
-   skb->h.raw = skb_pull(skb, sizeof(struct ip_esp_hdr) + esp->conf.ivlen);
-   skb->nh.raw += encap_len + sizeof(struct ip_esp_hdr) + esp->conf.ivlen;
-   memcpy(skb->nh.raw, workbuf, iph->ihl*4);
-   skb->nh.iph->tot_len = htons(skb->len);
+   skb->h.raw = __skb_pull(skb,
+   sizeof(struct ip_esp_hdr) + esp->conf.ivlen) -
+iph->ihl * 4;
 
return 0;
 
diff --git a/net/ipv4/ipcomp.c b/net/ipv4/ipcomp.c
index cd810f4..704937b 100644
--- a/net/ipv4/ipcomp.c
+++ b/net/ipv4/ipcomp.c
@@ -45,7 +45,6 @@
 static int ipcomp_decompress(struct xfrm_state *x, struct sk_buff *skb)
 {
int err, plen, dlen;
-   struct iphdr *iph;
struct ipcomp_data *ipcd = x->data;
u8 *start, *scratch;
struct crypto_tfm *tfm;
@@ -74,8 +73,6 @@

skb_put(skb, dlen - plen);
memcpy(skb->data, scratch, dlen);
-   iph = skb->nh.iph;
-   iph->tot_len = htons(dlen + iph->ihl * 4);
 out:   
put_cpu();
return err;
@@ -108,9 +105,8 @@
skb->nh.raw += sizeof(struct ip_comp_hdr);
memcpy(skb->nh.raw, &tmp_iph, tmp_iph.iph.ihl * 4);
iph = skb->nh.iph;
-   iph->tot_len = htons(ntohs(iph->tot_len) - sizeof(struct ip_comp_hdr));
iph->protocol = nexthdr;
-   skb->h.raw = skb->data;
+   skb->h.raw = skb->nh.raw;
err = ipcomp_decompress(x, skb);
 
 out:   
diff --git a/net/ipv4/xfrm4_mode_transport.c b/net/ipv4/xfrm4_mode_transport.c
index e46d9a4..c2e507e 100644
--- a/net/ipv4/xfrm4_mode_transport.c
+++ b/net/ipv4/xfrm4_mode_transport.c
@@ -40,6 +40,11 @@
 
 static int xfrm4_transport_input(struct xfrm_state *x, struct sk_buff *skb)
 {
+   if (skb->h.raw != skb->nh.raw)
+   skb->nh.raw = memmove(skb->h.raw, skb->nh.raw,
+ skb->nh.iph->ihl * 4);
+   skb->nh.iph->tot_len = htons(skb->len);
+   skb->h.raw = skb->data;
return 0;
 }
 
diff --git a/net/ipv4/xfrm4_mode_tunnel.c b/net/ipv4/xfrm4_mode_tunnel.c
index f8d880b..a304d1d 100644
--- a/net/ipv4/xfrm4_mode_tunnel.c
+++ b/net/ipv4/xfrm4_mode_tunnel.c
@@ -73,9 +73,11 @@
 
 static int xfrm4_tunnel_input(struct xfrm_state *x, struct sk_buff *skb)
 {
-   struct iphdr *iph = skb->nh.iph;
+   struct iphdr *iph;
int err = -EINVAL;
 
+   skb->nh.raw = skb->data;
+   iph = skb->nh.iph;
if (iph->protocol != IPPROTO_IPIP)
goto out;
if (!pskb_may_pull(skb, sizeof(struct iphdr)))
diff --git a/net/ipv4/xfrm4_tunnel.c b/net/ipv4/xfrm4_tunnel.c


Dscape ieee80211: enabling/disabling the radio

2006-05-06 Thread Ivo van Doorn
Hi,

While working on the rt2x00 driver, I keep running into problems with
scanning. Basically the dscape stack handles scanning in two ways: through
the passive_scan() handler in the ieee80211_hw structure, and by calling
the config() handler in the ieee80211_hw structure.

The first handler does not cause any problems at this time.
The main source of problems during scanning in rt2x00 seems to be
the use of the config() handler.

In rt2x00 the config() handler schedules all configuration changes on a
workqueue. This is required because several configuration changes in rt2x00
need to sleep, and for USB devices all register access requires sleeping.
The config() handler is often called from interrupt context, so the kernel
complains a lot when the workqueue is not used.

This seemed fine, until the radio_enabled field was introduced to the
configuration structure. When the radio_enabled field is set, the radio
must be enabled, but enabling the radio is something that can fail (at
least in rt2x00). So deferring the enabling of the radio to the workqueue
is not desirable, since the stack cannot be notified that the device was
unable to enable the radio.

Moving the enabling of the radio out of the workqueue function and into the
config() handler results in scheduling-while-atomic issues, since enabling
the radio requires sleeping for both PCI and USB devices.

Instead of using a radio_enabled config field, wouldn't it be better to add
two handlers to the ieee80211_hw structure, something like enable_radio()
and disable_radio()? If these functions are called from normal context, the
dscape stack can still enable and disable the radio whenever desired, and
it can check the return value to see whether the request actually succeeded.
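
A minimal user-space sketch of what such a pair of handlers could look like
follows. It only illustrates the shape of the proposal: struct demo_hw_ops,
struct demo_hw, demo_stack_set_radio() and the example driver are all
hypothetical, and nothing here is taken from the dscape code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical hardware ops; names follow the proposal, not an existing API. */
struct demo_hw_ops {
	int (*enable_radio)(void *priv);
	int (*disable_radio)(void *priv);
};

struct demo_hw {
	const struct demo_hw_ops *ops;
	void *priv;
	int radio_on;
};

/*
 * Stack side: meant to be called from process context so that the driver
 * handler may sleep. The return value tells the stack whether the radio
 * state really changed.
 */
static int demo_stack_set_radio(struct demo_hw *hw, int on)
{
	int err;

	assert(hw != NULL && hw->ops != NULL);
	if (on)
		err = hw->ops->enable_radio ?
			hw->ops->enable_radio(hw->priv) : -EOPNOTSUPP;
	else
		err = hw->ops->disable_radio ?
			hw->ops->disable_radio(hw->priv) : -EOPNOTSUPP;
	if (err == 0)
		hw->radio_on = on ? 1 : 0;
	return err;
}

/* Example driver whose radio fails to come up while *priv is zero. */
static int demo_drv_enable(void *priv)
{
	return *(int *)priv ? 0 : -EIO;
}

static int demo_drv_disable(void *priv)
{
	(void)priv;
	return 0;
}

static const struct demo_hw_ops demo_drv_ops = { demo_drv_enable, demo_drv_disable };
```

The point of the sketch is the error path: a failed enable is visible to the
caller immediately instead of being lost inside a deferred work item.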

What I am wondering about next is what exactly should happen when the
open() and stop() handlers are called, because those are basically intended
to enable and stop the radio as well. I checked bcm43xx to see what it does,
and it doesn't seem to check the radio_enabled field, so I don't know what
it does besides enabling the radio.

Well, this was just some stuff I have been trying to figure out while trying
to solve several rt2x00 bugs... ;)

Regards,

Ivo




Re: Please pull upstream-fixes branch of wireless-2.6

2006-05-06 Thread John W. Linville
On Fri, May 05, 2006 at 09:12:09PM -0700, Stephen Hemminger wrote:
> Linus Torvalds wrote:
> >On Fri, 5 May 2006, Andrew Morton wrote:
> >  
> >>On Fri, 5 May 2006 21:06:18 -0400
> >>"John W. Linville" <[EMAIL PROTECTED]> wrote:
> >>
> >>>These are fixes intended for 2.6.17...thanks!
> >>>  
> >>Jeff is offline for a couple of weeks.   Please prepare a pull for Linus.
> >>
> >
> >Actually, while Jeff is off, Steve Hemminger is supposed to be the network 
> >driver overlord ("All bow down before the mighty Shemminger"), so please 
> >do synchronize with him.
> >
> >Of course, that might be just Steve taking a look and telling me "yeah, 
> >please pull directly from John".
> >
> > Linus
> >  
> I had a bunch ready for monday...

So, Stephen, you will pull from me and have Linus pull all from you?
Just trying to clarify the plan... :-)

It makes no difference to me.  I'll include the git url "just in case":

git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git 
upstream-fixes

Thanks!

John
-- 
John W. Linville
[EMAIL PROTECTED]


2.4 kern: want to print TCP cwnd with every packet

2006-05-06 Thread George P Nychis
Hi,

I'd like to print the sender's TCP cwnd for every packet before it is sent
out.  This way I could plot the sender window over time to show TCP's
behavior under certain conditions.

I see several places in tcp_input.c where I could print the current window,
but I'd have to add code in multiple places.  I was wondering if there is
any one place, right before a packet is sent out, where I could printk()
tp->snd_cwnd.

Thanks!
George



[PATCH] fix IP-over-ATM and ARP interaction.

2006-05-06 Thread Simon Kelley
The classical IP over ATM code maintains its own IPv4 <-> 
ARP table, using the standard neighbour-table code. The
neigh_table_init() function adds this neighbour table to a linked 
list of all neighbour tables, which is used by the functions 
neigh_delete(), neigh_add() and neightbl_set(), all called by 
the netlink code.

Once the ATM neighbour table is added to the list, there are two tables 
with family == AF_INET there, and ARP entries sent via netlink go into 
the first table with a matching family. Which table that is is
indeterminate, and it is often the wrong one.

To see the bug, on a kernel with CLIP enabled, create a standard IPv4
ARP entry by pinging an unused address on a local subnet. Then attempt
to complete that entry by doing

ip neigh replace <IP address> lladdr <MAC address> nud reachable

Looking at the ARP tables by using 

ip neigh show

will reveal two ARP entries for the same address. One of these can be
found in /proc/net/arp, and the other in /proc/net/atm/arp.

This patch adds a new function, neigh_table_init_no_netlink(), which
does everything that neigh_table_init() does, except add the table to
the netlink all-arp-tables chain. In addition, neigh_table_init() now
checks that all tables on the chain have distinct address families.
The init call in clip.c is changed to call neigh_table_init_no_netlink().

Since ATM ARP tables are rather more complicated than can currently be
handled by the available rtattrs in the netlink protocol, no
functionality is lost by this patch, and non-ATM ARP manipulation via
netlink is rescued. A more complete solution would involve a rtattr for 
ATM ARP entries and some way for the netlink code to give neigh_add 
and friends more information than just address family with which to find 
the correct ARP table.
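
The split described above can be sketched in user-space C as follows. This
is an illustration only: the demo_* names are hypothetical, locking is
omitted, and where the patch uses BUG_ON() for a duplicate family the
sketch returns -EEXIST so that the behaviour can be observed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_table {
	int family;
	struct demo_table *next;
};

/* Chain that the (hypothetical) netlink lookup walks. */
static struct demo_table *demo_tables;

/* Per-table setup only; the table stays invisible to netlink lookups. */
static void demo_table_init_no_netlink(struct demo_table *tbl)
{
	assert(tbl != NULL);
	tbl->next = NULL;
}

/* Setup plus publication; refuses a second table for the same family. */
static int demo_table_init(struct demo_table *tbl)
{
	struct demo_table *tmp;

	demo_table_init_no_netlink(tbl);
	for (tmp = demo_tables; tmp; tmp = tmp->next)
		if (tmp->family == tbl->family)
			return -EEXIST;
	tbl->next = demo_tables;
	demo_tables = tbl;
	return 0;
}

/* Lookup by family alone, which is why duplicates would be ambiguous. */
static struct demo_table *demo_find_table(int family)
{
	struct demo_table *tmp;

	for (tmp = demo_tables; tmp; tmp = tmp->next)
		if (tmp->family == family)
			return tmp;
	return NULL;
}
```

A table set up through demo_table_init_no_netlink() alone is fully usable
by its owner but can never shadow the table found by family lookup.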

Signed-off-by: Simon Kelley <[EMAIL PROTECTED]>
 

-- 

diff -Naur linux-2.6.16.11.orig/include/net/neighbour.h 
linux-2.6.16.11/include/net/neighbour.h
--- linux-2.6.16.11.orig/include/net/neighbour.h2006-04-24 
21:20:24.0 +0100
+++ linux-2.6.16.11/include/net/neighbour.h 2006-05-04 20:09:17.0 
+0100
@@ -211,6 +211,7 @@
 #define NEIGH_UPDATE_F_ADMIN   0x8000
 
 extern void			neigh_table_init(struct neigh_table *tbl);
+extern void			neigh_table_init_no_netlink(struct neigh_table *tbl);
 extern int neigh_table_clear(struct neigh_table *tbl);
 extern struct neighbour *  neigh_lookup(struct neigh_table *tbl,
 const void *pkey,
diff -Naur linux-2.6.16.11.orig/net/atm/clip.c linux-2.6.16.11/net/atm/clip.c
--- linux-2.6.16.11.orig/net/atm/clip.c 2006-04-24 21:20:24.0 +0100
+++ linux-2.6.16.11/net/atm/clip.c  2006-05-04 20:10:00.0 +0100
@@ -995,7 +995,7 @@
 
 static int __init atm_clip_init(void)
 {
-   neigh_table_init(&clip_tbl);
+   neigh_table_init_no_netlink(&clip_tbl);
 
clip_tbl_hook = &clip_tbl;
register_atm_ioctl(&clip_ioctl_ops);
diff -Naur linux-2.6.16.11.orig/net/core/neighbour.c 
linux-2.6.16.11/net/core/neighbour.c
--- linux-2.6.16.11.orig/net/core/neighbour.c   2006-04-24 21:20:24.0 
+0100
+++ linux-2.6.16.11/net/core/neighbour.c2006-05-04 20:07:55.0 
+0100
@@ -1322,8 +1322,7 @@
kfree(parms);
 }
 
-
-void neigh_table_init(struct neigh_table *tbl)
+void neigh_table_init_no_netlink(struct neigh_table *tbl)
 {
unsigned long now = jiffies;
unsigned long phsize;
@@ -1381,10 +1380,19 @@
 
tbl->last_flush = now;
tbl->last_rand  = now + tbl->parms.reachable_time * 20;
-   write_lock(&neigh_tbl_lock);
-   tbl->next   = neigh_tables;
-   neigh_tables= tbl;
-   write_unlock(&neigh_tbl_lock);
+}
+
+void neigh_table_init(struct neigh_table *tbl)
+{
+  struct neigh_table *tmp;
+
+  neigh_table_init_no_netlink(tbl);
+  write_lock(&neigh_tbl_lock);
+  for (tmp = neigh_tables; tmp; tmp = tmp->next)
+BUG_ON(tmp->family == tbl->family);
+  tbl->next= neigh_tables;
+  neigh_tables = tbl;
+  write_unlock(&neigh_tbl_lock);
 }
 
 int neigh_table_clear(struct neigh_table *tbl)
@@ -2655,6 +2663,7 @@
 EXPORT_SYMBOL(neigh_resolve_output);
 EXPORT_SYMBOL(neigh_table_clear);
 EXPORT_SYMBOL(neigh_table_init);
+EXPORT_SYMBOL(neigh_table_init_no_netlink);
 EXPORT_SYMBOL(neigh_update);
 EXPORT_SYMBOL(neigh_update_hhs);
 EXPORT_SYMBOL(pneigh_enqueue);



Fw: [Bugme-new] [Bug 6502] New: SIOCSIFHWBROADCAST needs compat layer

2006-05-06 Thread Andrew Morton


Begin forwarded message:

Date: Sat, 6 May 2006 08:24:25 -0700
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: [Bugme-new] [Bug 6502] New: SIOCSIFHWBROADCAST needs compat layer


http://bugzilla.kernel.org/show_bug.cgi?id=6502

   Summary: SIOCSIFHWBROADCAST needs compat layer
Kernel Version: 2.6.15
Status: NEW
  Severity: normal
 Owner: [EMAIL PROTECTED]
 Submitter: [EMAIL PROTECTED]


Most recent kernel where this bug did not occur: all kernels have this problem;
I've tested 2.6.8, 2.6.9, 2.6.15

Distribution:
Hardware Environment: ethernet interface
Software Environment: 32-bit libc with 64-bit kernel
Problem Description:
32-bit program running with 64-bit kernel fails to set hardware broadcast
address via ioctl(SIOCSIFHWBROADCAST).

Steps to reproduce:
Compile the following sample program:
/* Header names were stripped by the list archive; these are the ones the code needs. */
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>

int main(void)
{
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    struct ifreq req;
    int rc;

    if (s < 0)
    {
        perror("failed to open socket");
        return -1;
    }
    strcpy(req.ifr_name, "eth0");
    req.ifr_hwaddr.sa_family = AF_LOCAL;
    /* Set the 6-byte hardware broadcast address to ff:ff:ff:ff:ff:ff. */
    memset(req.ifr_hwaddr.sa_data, 0xff, 6);
    rc = ioctl(s, SIOCSIFHWBROADCAST, &req);
    if (rc != 0)
    {
        perror("ioctl failed");
        return -1;
    }
    printf("ioctl(SIOCSIFHWBROADCAST) passed\n");
    return 0;
}

When compiled as a 64-bit binary, it works OK:
ioctl(SIOCSIFHWBROADCAST) passed
When compiled as a 32-bit binary, it does not work:
ioctl failed: Invalid argument

--- You are receiving this mail because: ---
You are on the CC list for the bug, or are watching someone who is.


Re: [PATCH] fix IP-over-ATM and ARP interaction.

2006-05-06 Thread YOSHIFUJI Hideaki / 吉藤英明
In article <[EMAIL PROTECTED]> (at Sat, 06 May 2006 17:13:29 +0100), Simon 
Kelley <[EMAIL PROTECTED]> says:

> +void neigh_table_init(struct neigh_table *tbl)
> +{
> +  struct neigh_table *tmp;
> +
> +  neigh_table_init_no_netlink(tbl);
> +  write_lock(&neigh_tbl_lock);
> +  for (tmp = neigh_tables; tmp; tmp = tmp->next)
> +BUG_ON(tmp->family == tbl->family);
> +  tbl->next  = neigh_tables;
> +  neigh_tables   = tbl;
> +  write_unlock(&neigh_tbl_lock);
>  }
>  
>  int neigh_table_clear(struct neigh_table *tbl)

Please fix the coding style; use tab for indent, please.

--yoshfuji


[PATCH] fix IP-over-ATM and ARP interaction - formatting fixed.

2006-05-06 Thread Simon Kelley
The classical IP over ATM code maintains its own IPv4 <-> 
ARP table, using the standard neighbour-table code. The
neigh_table_init() function adds this neighbour table to a linked 
list of all neighbour tables, which is used by the functions 
neigh_delete(), neigh_add() and neightbl_set(), all called by 
the netlink code.

Once the ATM neighbour table is added to the list, there are two tables 
with family == AF_INET there, and ARP entries sent via netlink go into 
the first table with a matching family. Which table that is is
indeterminate, and it is often the wrong one.

To see the bug, on a kernel with CLIP enabled, create a standard IPv4
ARP entry by pinging an unused address on a local subnet. Then attempt
to complete that entry by doing

ip neigh replace <ip-address> lladdr <mac-address> nud reachable

Looking at the ARP tables by using 

ip neigh show

will reveal two ARP entries for the same address. One of these can be
found in /proc/net/arp, and the other in /proc/net/atm/arp.

This patch adds a new function, neigh_table_init_no_netlink(), which
does everything neigh_table_init() does, except add the table to
the netlink all-arp-tables chain. In addition, neigh_table_init() now
checks that all tables on the chain have distinct address families.
The init call in clip.c is changed to call neigh_table_init_no_netlink().

Since ATM ARP tables are rather more complicated than can currently be
handled by the available rtattrs in the netlink protocol, no
functionality is lost by this patch, and non-ATM ARP manipulation via
netlink is rescued. A more complete solution would involve a rtattr for 
ATM ARP entries and some way for the netlink code to give neigh_add 
and friends more information than just address family with which to find 
the correct ARP table.
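
As a rough user-space sketch of the lookup problem and of the new
duplicate-family check: the struct and function names below are
illustrative assumptions, not the kernel's, and the real patch uses
BUG_ON() where this sketch returns an error.

```c
#include <stddef.h>

/* Simplified stand-in for struct neigh_table; only the fields needed
 * to illustrate the lookup are modelled (hypothetical names). */
struct table {
	int family;          /* address family, e.g. AF_INET */
	const char *name;    /* label for illustration only */
	struct table *next;
};

/* Head of the netlink-visible chain (models neigh_tables). */
struct table *tables;

/* Netlink-style lookup: the first table with a matching family wins,
 * which is why two AF_INET tables on the chain are ambiguous. */
struct table *find_by_family(int family)
{
	struct table *t;

	for (t = tables; t; t = t->next)
		if (t->family == family)
			return t;
	return NULL;
}

/* Models the patched neigh_table_init(): refuse a second table for a
 * family that is already on the chain. Returns 0 on success, -1 on a
 * duplicate (the patch itself calls BUG_ON() instead). */
int register_table(struct table *tbl)
{
	if (find_by_family(tbl->family))
		return -1;
	tbl->next = tables;
	tables = tbl;
	return 0;
}
```

With this model, a table that must stay invisible to netlink (as
clip_tbl does after the patch) is simply never passed to
register_table().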

Signed-off-by: Simon Kelley <[EMAIL PROTECTED]>
 

-- 

diff -Naur linux-2.6.16.11.orig/include/net/neighbour.h 
linux-2.6.16.11/include/net/neighbour.h
--- linux-2.6.16.11.orig/include/net/neighbour.h2006-04-24 
21:20:24.0 +0100
+++ linux-2.6.16.11/include/net/neighbour.h 2006-05-04 20:09:17.0 
+0100
@@ -211,6 +211,7 @@
 #define NEIGH_UPDATE_F_ADMIN   0x8000
 
 extern voidneigh_table_init(struct neigh_table *tbl);
+extern voidneigh_table_init_no_netlink(struct neigh_table 
*tbl);
 extern int neigh_table_clear(struct neigh_table *tbl);
 extern struct neighbour *  neigh_lookup(struct neigh_table *tbl,
 const void *pkey,
diff -Naur linux-2.6.16.11.orig/net/atm/clip.c linux-2.6.16.11/net/atm/clip.c
--- linux-2.6.16.11.orig/net/atm/clip.c 2006-04-24 21:20:24.0 +0100
+++ linux-2.6.16.11/net/atm/clip.c  2006-05-04 20:10:00.0 +0100
@@ -995,7 +995,7 @@
 
 static int __init atm_clip_init(void)
 {
-   neigh_table_init(&clip_tbl);
+   neigh_table_init_no_netlink(&clip_tbl);
 
clip_tbl_hook = &clip_tbl;
register_atm_ioctl(&clip_ioctl_ops);
diff -Naur linux-2.6.16.11.orig/net/core/neighbour.c 
linux-2.6.16.11/net/core/neighbour.c
--- linux-2.6.16.11.orig/net/core/neighbour.c   2006-04-24 21:20:24.0 
+0100
+++ linux-2.6.16.11/net/core/neighbour.c2006-05-06 17:44:30.0 
+0100
@@ -1322,8 +1322,7 @@
kfree(parms);
 }
 
-
-void neigh_table_init(struct neigh_table *tbl)
+void neigh_table_init_no_netlink(struct neigh_table *tbl)
 {
unsigned long now = jiffies;
unsigned long phsize;
@@ -1381,7 +1380,16 @@
 
tbl->last_flush = now;
tbl->last_rand  = now + tbl->parms.reachable_time * 20;
+}
+
+void neigh_table_init(struct neigh_table *tbl)
+{
+   struct neigh_table *tmp;
+   
+   neigh_table_init_no_netlink(tbl);
write_lock(&neigh_tbl_lock);
+   for (tmp = neigh_tables; tmp; tmp = tmp->next)
+   BUG_ON(tmp->family == tbl->family);
tbl->next   = neigh_tables;
neigh_tables= tbl;
write_unlock(&neigh_tbl_lock);
@@ -2655,6 +2663,7 @@
 EXPORT_SYMBOL(neigh_resolve_output);
 EXPORT_SYMBOL(neigh_table_clear);
 EXPORT_SYMBOL(neigh_table_init);
+EXPORT_SYMBOL(neigh_table_init_no_netlink);
 EXPORT_SYMBOL(neigh_update);
 EXPORT_SYMBOL(neigh_update_hhs);
 EXPORT_SYMBOL(pneigh_enqueue);




Re: Fw: [Bugme-new] [Bug 6495] New: Vlan MTU Fragmentation

2006-05-06 Thread Ben Greear

Andrew Morton wrote:
> I guess I can type simple commands and add printks.  Do you have time to
> take a look at the driver and suggest what I should be looking for?

I can only offer vague hints:

TX usually works, but RX often has issues.  There is usually a bit
or two that needs setting to enable RX of larger-than-MTU packets.

It seems some older chipsets had issues with doing checksum offload
for VLAN packets, so could try disabling the UDP/TCP checksum logic
and see if that helps.

Thanks,
Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com



Re: [PATCH 2/3] bcm43xx-d80211: fix whitespace

2006-05-06 Thread Michael Buesch
On Friday 05 May 2006 21:59, Stefano Brivio wrote:
> Fix whitespace.
> 
> Signed-off-by: Stefano Brivio <[EMAIL PROTECTED]>
> 
> Index: wireless-dev/drivers/net/wireless/bcm43xx/bcm43xx_main.c
> ===
> --- wireless-dev.orig/drivers/net/wireless/bcm43xx/bcm43xx_main.c 
> 2006-05-05 00:50:00.370034536 +0200
> +++ wireless-dev/drivers/net/wireless/bcm43xx/bcm43xx_main.c  2006-05-05 
> 02:43:44.981535888 +0200

All your d80211 patches are _not_ against the dscape port
of the bcm43xx driver. The dscape port is located at:
wireless-dev/drivers/net/wireless/d80211/bcm43xx/

> @@ -128,13 +128,13 @@
>   static struct pci_device_id bcm43xx_pci_tbl[] = {
>   /* Broadcom 4303 802.11b */
>   { PCI_VENDOR_ID_BROADCOM, 0x4301, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
> - /* Broadcom 4307 802.11b */
> + /* Broadcom 4307 802.11b */
>   { PCI_VENDOR_ID_BROADCOM, 0x4307, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
> - /* Broadcom 4318 802.11b/g */
> + /* Broadcom 4318 802.11b/g */
>   { PCI_VENDOR_ID_BROADCOM, 0x4318, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
>   /* Broadcom 4306 802.11b/g */
>   { PCI_VENDOR_ID_BROADCOM, 0x4320, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
> - /* Broadcom 4306 802.11a */
> + /* Broadcom 4306 802.11a */
>  //   { PCI_VENDOR_ID_BROADCOM, 0x4321, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
>   /* Broadcom 4309 802.11a/b/g */
>   { PCI_VENDOR_ID_BROADCOM, 0x4324, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
> 
> 
> --
> Ciao
> Stefano
> ___
> Bcm43xx-dev mailing list
> Bcm43xx-dev@lists.berlios.de
> http://lists.berlios.de/mailman/listinfo/bcm43xx-dev
> 

-- 
Greetings Michael.


[PATCH] TCP congestion module: add TCP-LP supporting for 2.6.16.14

2006-05-06 Thread Wong Edison

TCP Low Priority is a distributed algorithm whose goal is to utilize only
 the excess network bandwidth as compared to the ``fair share`` of
 bandwidth as targeted by TCP. Available from:
   http://www.ece.rice.edu/~akuzma/Doc/akuzma/TCP-LP.pdf

Original Author:
 Aleksandar Kuzmanovic <[EMAIL PROTECTED]>

See http://www-ece.rice.edu/networks/TCP-LP/ for their implementation.
As of 2.6.13, Linux supports pluggable congestion control algorithms.
Due to the limitation of the API, we make the following changes from
the original TCP-LP implementation:
 o We use newReno in most core CA handling. Only add some checking
   within cong_avoid.
 o Error correcting in remote HZ, therefore remote HZ will be kept
   on checking and updating.
 o Handling calculation of One-Way-Delay (OWD) within rtt_sample, since
   OWD has a similar meaning as RTT. Also correct the buggy formula.
 o Handle reaction for Early Congestion Indication (ECI) within
   pkts_acked, as mentioned within pseudo code.
 o OWD is handled in relative format, where local time stamp will be in
   tcp_time_stamp format.

Port from 2.4.19 to 2.6.16 as module by:
 Wong Hoi Sing Edison <[EMAIL PROTECTED]>
 Hung Hing Lun <[EMAIL PROTECTED]>

Signed-off-by: Wong Hoi Sing Edison <[EMAIL PROTECTED]>

---

diff -urpN linux-2.6.16.14/net/ipv4/Kconfig linux/net/ipv4/Kconfig
--- linux-2.6.16.14/net/ipv4/Kconfig2006-05-05 08:03:45.0 +0800
+++ linux/net/ipv4/Kconfig  2006-05-07 01:41:33.0 +0800
@@ -531,6 +531,27 @@ config TCP_CONG_SCALABLE
properties, though is known to have fairness issues.
See http://www-lce.eng.cam.ac.uk/~ctk21/scalable/

+config TCP_CONG_LP
+   tristate "TCP Low Priority"
+   depends on EXPERIMENTAL
+   default n
+   ---help---
+   TCP Low Priority (TCP-LP), a distributed algorithm whose goal is
+   to utilize only the excess network bandwidth as compared to the
+   ``fair share`` of bandwidth as targeted by TCP.
+   See http://www-ece.rice.edu/networks/TCP-LP/
+
+config TCP_CONG_LP_DEBUG
+   bool "TCP-LP Debug"
+   depends on TCP_CONG_LP
+   default n
+   ---help---
+   Turn on/off the debug message for TCP-LP. The debug message will
+   print to default kernel debug log file, e.g. /var/log/debug as
+   default. You can use dmesg to obtain the log too.
+   
+   If unsure, say N.
+
endmenu

config TCP_CONG_BIC
diff -urpN linux-2.6.16.14/net/ipv4/Makefile linux/net/ipv4/Makefile
--- linux-2.6.16.14/net/ipv4/Makefile   2006-05-05 08:03:45.0 +0800
+++ linux/net/ipv4/Makefile 2006-05-07 01:41:33.0 +0800
@@ -41,6 +41,7 @@ obj-$(CONFIG_TCP_CONG_HYBLA) += tcp_hybl
obj-$(CONFIG_TCP_CONG_HTCP) += tcp_htcp.o
obj-$(CONFIG_TCP_CONG_VEGAS) += tcp_vegas.o
obj-$(CONFIG_TCP_CONG_SCALABLE) += tcp_scalable.o
+obj-$(CONFIG_TCP_CONG_LP) += tcp_lp.o

obj-$(CONFIG_XFRM) += xfrm4_policy.o xfrm4_state.o xfrm4_input.o \
  xfrm4_output.o
diff -urpN linux-2.6.16.14/net/ipv4/tcp_lp.c linux/net/ipv4/tcp_lp.c
--- linux-2.6.16.14/net/ipv4/tcp_lp.c   1970-01-01 08:00:00.0 +0800
+++ linux/net/ipv4/tcp_lp.c 2006-05-07 01:41:33.0 +0800
@@ -0,0 +1,343 @@
+/*
+ * TCP Low Priority (TCP-LP)
+ *
+ * TCP Low Priority is a distributed algorithm whose goal is to utilize only
+ *   the excess network bandwidth as compared to the ``fair share`` of
+ *   bandwidth as targeted by TCP. Available from:
+ * http://www.ece.rice.edu/~akuzma/Doc/akuzma/TCP-LP.pdf
+ *
+ * Original Author:
+ *   Aleksandar Kuzmanovic <[EMAIL PROTECTED]>
+ *
+ * See http://www-ece.rice.edu/networks/TCP-LP/ for their implementation.
+ * As of 2.6.13, Linux supports pluggable congestion control algorithms.
+ * Due to the limitation of the API, we take the following changes from
+ * the original TCP-LP implementation:
+ *   o We use newReno in most core CA handling. Only add some checking
+ * within cong_avoid.
+ *   o Error correcting in remote HZ, therefore remote HZ will be keeped
+ * on checking and updating.
+ *   o Handling calculation of One-Way-Delay (OWD) within rtt_sample, sicne
+ * OWD have a similar meaning as RTT. Also correct the buggy formular.
+ *   o Handle reaction for Early Congestion Indication (ECI) within
+ * pkts_acked, as mentioned within pseudo code.
+ *   o OWD is handled in relative format, where local time stamp will in
+ * tcp_time_stamp format.
+ *
+ * Port from 2.4.19 to 2.6.16 as module by:
+ *   Wong Hoi Sing Edison <[EMAIL PROTECTED]>
+ *   Hung Hing Lun <[EMAIL PROTECTED]>
+ *
+ * Version: $Id: tcp_lp.c,v 1.22 2006-05-02 18:18:19 hswong3i Exp $
+ */
+
+#include 
+#include 
+#include 
+
+#ifndef CONFIG_TCP_CONG_LP_DEBUG
+#define CONFIG_TCP_CONG_LP_DEBUG 0
+#endif
+
+/* resolution of owd */
+#define LP_RESOL   1000
+
+/**
+ * enum tcp_lp_state
+ * @LP_VALID_RHZ: is remote HZ valid?
+ * @LP_VALID_OWD: is OWD valid?
+ * @LP_WITHIN_THR: are we within threshold?
+ * @LP_WITHIN_INF: are we within in

Re: Associate on 'ifconfig up'

2006-05-06 Thread David Woodhouse
On Fri, 2006-05-05 at 17:38 +0100, David Woodhouse wrote:
> I still need this hack to work around the fact that softmac doesn't
> attempt to associate when we bring the device up...

It'd be quite good to get this fixed in 2.6.17 too. Otherwise, the
device doesn't manage to associate if you use the fairly common sequence
of iwconfig then dhclient.

It's a bit of an evil hack and it should really be fixed in softmac --
but it's only moving an _existing_ hack from one place in the driver to
another.

Signed-off-by: David Woodhouse <[EMAIL PROTECTED]>

--- linux-2.6.16.ppc/drivers/net/wireless/bcm43xx/bcm43xx_main.c.orig   
2006-05-05 17:14:26.0 +0100
+++ linux-2.6.16.ppc/drivers/net/wireless/bcm43xx/bcm43xx_main.c
2006-05-05 17:15:19.0 +0100
@@ -3263,6 +3263,9 @@ static int bcm43xx_init_board(struct bcm
bcm43xx_sysfs_register(bcm);
//FIXME: check for bcm43xx_sysfs_register failure. This function is a 
bit messy regarding unwinding, though...
 
+   /*FIXME: This should be handled by softmac instead. */
+   schedule_work(&bcm->softmac->associnfo.work);
+
assert(err == 0);
 out:
return err;
@@ -3937,9 +3940,6 @@ static int bcm43xx_resume(struct pci_dev
 
netif_device_attach(net_dev);

-   /*FIXME: This should be handled by softmac instead. */
-   schedule_work(&bcm->softmac->associnfo.work);
-
dprintk(KERN_INFO PFX "Device resumed.\n");
 
return 0;

-- 
dwmw2



Re: [RFC] Proposed structure for Regulatory/Geographical Wireless database

2006-05-06 Thread Michael Buesch
On Friday 05 May 2006 22:14, you wrote:
> # Groups follow countries
> #
> Group 0 - Unspecified Country
> #
> # Band  Ch. Range   Ch. Spacing Power   Flags
  ^
Aren't there countries around, where there are gaps in the
allowed channel numbers? (Especially for 802.11a) So it would
not be an allowed "range", but an allowed list of channels.

-- 
Greetings Michael.


Re: [RFC] Proposed structure for Regulatory/Geographical Wireless database

2006-05-06 Thread Larry Finger

Michael Buesch wrote:
> On Friday 05 May 2006 22:14, you wrote:
> > # Groups follow countries
> > #
> > Group 0 - Unspecified Country
> > #
> > # Band  Ch. Range   Ch. Spacing Power   Flags
>   ^
> Aren't there countries around, where there are gaps in the
> allowed channel numbers? (Especially for 802.11a) So it would
> not be an allowed "range", but an allowed list of channels.

Yes, but the gaps are only in 802.11a that I know about. If there is a gap, then I use a second line 
as was shown in the example for the standard EU specs. In most cases, other factors change as well. 
I initially was going to specify the allowed channels, but the tables got very long. This way is 
more compact.


Larry



Re: [PATCH 2/2] netdev: create attribute_groups with class_device_add

2006-05-06 Thread Greg KH
On Fri, May 05, 2006 at 11:00:50PM -0700, David S. Miller wrote:
> From: Greg KH <[EMAIL PROTECTED]>
> Date: Fri, 5 May 2006 21:08:39 -0700
> 
> > On Fri, May 05, 2006 at 06:41:58PM -0700, David S. Miller wrote:
> > > From: Stephen Hemminger <[EMAIL PROTECTED]>
> > > Date: Fri, 21 Apr 2006 12:54:38 -0700
> > > 
> > > > Atomically create attributes when class device is added. This avoids the
> > > > race between registering class_device (which generates hotplug event),
> > > > and the creation of attribute groups.
> > > > 
> > > > Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
> > > 
> > > Did the first patch that adds the attribute_group creation
> > > infrastructure go in so that we can get this networking fix in?
> > 
> > It and the netdev patch are sitting in my tree which is showing up in
> > -mm.  I'm going to wait until 2.6.17 is out to send the first patch.  I
> > can send the second one then too if you want me to (probably make it
> > easier that way.)
> 
> The networking bit by Stephen is a bug fix.

Good point.  Ok, feel free to send both patches to Linus now if you
want.  You can add my:
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>
to the driver core change as I have no problems with it.
Or I can send them on Monday if you wish.  Whatever is easier for you.

thanks,

greg k-h


Re: [PATCH 2/2] netdev: create attribute_groups with class_device_add

2006-05-06 Thread David S. Miller
From: Greg KH <[EMAIL PROTECTED]>
Date: Sat, 6 May 2006 15:59:04 -0700

> On Fri, May 05, 2006 at 11:00:50PM -0700, David S. Miller wrote:
> > The networking bit by Stephen is a bug fix.
> 
> Good point.  Ok, feel free to send both patches to Linus now if you
> want.  You can add my:
>   Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>
> to the driver core change as I have no problems with it.
> Or I can send them on Monday if you wish.  Whatever is easier for you.

I'll take care of pushing the changes, thanks Greg.


Re: IPv6 connect() from site-local to global IPv6 address.

2006-05-06 Thread David Woodhouse
On Sat, 2006-05-06 at 11:39 +0900, YOSHIFUJI Hideaki wrote:
> In article <[EMAIL PROTECTED]> (at Sat, 06 May 2006 01:53:21 +0100), David 
> Woodhouse <[EMAIL PROTECTED]> says:
> 
> > There is a default route, because I believe that's the only thing that
> > radvd can do. I cannot advertise a route to _only_ fec0::/16, can I?
> 
> Yes, you can, via Route Information option.
> (>= 2.6.17-rc1 support this.)

Hm. Has radvd also been updated to send it, or did I miss it when I
looked for such an option?

Either way, this isn't particularly useful to me yet, since no deployed
systems support it.

> Anyway, it is valid to use (obsolete) site-local source address
> for global destination address.
> The problem seems that router does NOT send ICMPv6 destination
> unreachable to the sender. I don't know why, but it SHOULD. 

I'll pursue that question later. It wouldn't be _sufficient_ since there
are (buggy) programs, including Evolution, which will not fall back to
the second and subsequent addresses from getaddrinfo() -- they'll just
give up when the first attempt to connect fails. So we really do need
the IPv4 address to be listed _first_ in the results, as it used to be.

Glibc _used_ to do what we want -- I always attributed it to Rule 2 of
the destination address selection, without looking hard at the
implementation details.

Let's forget about any details of my current implementation and I'll ask
a _simple_ question...

I have machines on an internal company network, which is all RFC1918
IPv4 addresses and has connectivity by NAT to the outside world. These
machines are mostly Linux, of various versions. 

I wish to deploy IPv6 internally so that we can develop and test IPv6
support. There is _no_ chance of getting proper IPv6 connectivity to the
outside world through the corporate firewall. I'd like IPv6 to be usable
_internally_ though, without breaking connectivity to the outside world
over IPv4.

How should I do this?


In the past, I've done it with site-local addresses. A machine on each
Ethernet subnet runs radvd, and machines pick up a site-local address
(and default route) automatically. The machines running radvd also have
IPv6-over-IPv4 tunnels for routing between subnets. 

Glibc's getaddrinfo() has in the past given me optimal behaviour. If
connecting to a remote machine which has a site-local IPv6 address, it's
favoured that site-local address. If connecting to a remote machine
which has a _Global_ IPv6 address, it's given the IPv4 address first
instead.

On new kernels, however, glibc has started to return IPv6 addresses even
for external machines which can't be reached. Hence my mail about
$SUBJECT. If I'm doing this the wrong way, what _should_ I be doing?

Uli's response has been to switch glibc so that it _always_ favours IPv4
over IPv6, which AIUI basically means that IPv6 will never get _any_
usage on Linux machines unless in an IPv6-only environment rather than
dual-stack. I.e. Unless I publish _only_ an AAAA record for my server
instead of both A and AAAA records, I won't get any IPv6 traffic. I
think it would be best to avoid that situation.

-- 
dwmw2


Re: IPv6 connect() from site-local to global IPv6 address.

2006-05-06 Thread David Woodhouse
On Sat, 2006-05-06 at 09:19 +0900, YOSHIFUJI Hideaki wrote:
> You have compatible address.
> Do you really use the tunnel? How did you configure it? 

Sorry, I should have shown a strace from a different machine. Try this
one from an autoconfigured machine...

socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP) = 3
connect(3, {sa_family=AF_INET6, sin6_port=htons(80), inet_pton(AF_INET6, 
"2001:8b0:10b:1:20d:93ff:fe7a:3f2c", &sin6_addr), sin6_flowinfo=0, 
sin6_scope_id=0}, 28) = 0
getsockname(3, {sa_family=AF_INET6, sin6_port=htons(32837), inet_pton(AF_INET6, 
"fec0::1:202:b3ff:fe03:45c1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 
[28]) = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(80), 
sin_addr=inet_addr("81.187.2.168")}, 16) = 0
getsockname(3, {sa_family=AF_INET, sin_port=htons(32837), 
sin_addr=inet_addr("172.16.18.64")}, [16]) = 0
Trying 2001:8b0:10b:1:20d:93ff:fe7a:3f2c...

There is a default route, because I believe that's the only thing that
radvd can do. I cannot advertise a route to _only_ fec0::/16, can I?


(The machine I showed you before was a router between networks --
the ::172.16.18.67 address was on its tunnel interface, and was added
automatically when I created the tunnel and brought it up.) 

-- 
dwmw2
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] netdev sysfs failure handling

2006-05-06 Thread David S. Miller
From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Fri, 21 Apr 2006 13:42:05 -0700

> In case of sysfs failure, don't let device be brought up.
> It can be cleared by unregister_netdevice so module can be unloaded
> normally.
> 
> Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>

I'm not so sure about this, a hot plug event could clear that bit too.

The problem I think is that here we've structured things such that we
can't handle the error properly and pass it back to the
register_netdevice() caller because we do the sysfs registry call in
the rtnl_unlock() todo list execution.

Next, even if you prevent the device from being brought up, people
can still assign IP addresses and do other stuff to the device so
it still sort of behaves as if it is there.

It would therefore be best if we could do this stuff inside of
register_netdevice(), then handle and propagate any errors correctly.


Re: [PATCH] netdev: hotplug napi race cleanup

2006-05-06 Thread David S. Miller
From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Mon, 24 Apr 2006 15:23:41 -0700

> This follows after the earlier two patches.
> 
> Change the initialization of the class device portion of the net device
> to be done earlier, so that any races before registration completes are
> harmless.  Add a mutex to avoid changes to netdevice during the
> class device registration. 
> 
> Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>

I'm not going to apply this patch and instead request that we think
about why this problem exists in the first place.

This patch is even stronger evidence that doing the sysfs registry in
the todo list processing is wrong.  If you can legally do this while
holding the rtnl semaphore, you can just as equally do it inside of
register_netdevice() which is where it truly belongs.

Then you can handle errors properly, unwind the state, and return the
error to the caller instead of just losing the error and leaving the
device in a half-registered state.
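
The shape being asked for is roughly the following. This is a
user-space sketch only: struct fake_dev and all function names are
made up for illustration and are not the netdev API.

```c
#include <errno.h>

/* Hypothetical device state used only for this sketch. */
struct fake_dev {
	int core_registered;
	int sysfs_registered;
	int fail_sysfs;      /* test knob: force the sysfs step to fail */
};

int core_setup(struct fake_dev *d)
{
	d->core_registered = 1;
	return 0;
}

void core_teardown(struct fake_dev *d)
{
	d->core_registered = 0;
}

int sysfs_setup(struct fake_dev *d)
{
	if (d->fail_sysfs)
		return -ENOMEM;
	d->sysfs_registered = 1;
	return 0;
}

/* Do every step synchronously; on failure undo the earlier steps and
 * hand the error back to the caller, so the device is never left
 * half-registered. */
int register_dev(struct fake_dev *d)
{
	int err;

	err = core_setup(d);
	if (err)
		goto out;

	err = sysfs_setup(d);
	if (err)
		goto undo_core;

	return 0;

undo_core:
	core_teardown(d);
out:
	return err;
}
```

When the sysfs step runs later from a deferred list instead, there is
no caller left to return the error to, which is the problem described
above.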


Re: [PATCH] irda-usb: use NULL instead of 0

2006-05-06 Thread David S. Miller
From: "Randy.Dunlap" <[EMAIL PROTECTED]>
Date: Mon, 1 May 2006 15:31:34 -0700

> Use NULL instead of 0 for a null pointer value (sparse warning):
> drivers/net/irda/irda-usb.c:1781:30: warning: Using plain integer as NULL 
> pointer
> 
> Correct timeout argument to use milliseconds instead of jiffies.
> 
> Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>

Applied, thanks Randy.


Re: [RFC] Proposed structure for Regulatory/Geographical Wireless database

2006-05-06 Thread Larry Finger

Jouni Malinen wrote:
> On Fri, May 05, 2006 at 03:14:35PM -0500, Larry Finger wrote:
>
> The driver may not know the country code, so there should be a mechanism
> for user space to override this.

Do you think an environment variable would suffice, or do you propose another
scheme?

> > * Checksum routines will be used to validate the data base. Such a simple
> > scheme will not inhibit anyone with moderate skills from hacking the
> > channel/power settings, but such hacking will require some effort.
>
> I did not see this included in the example file. Did you have more
> detailed plans on how this would be done?

I was anticipating storing the output of an md5sum command in a separate file and comparing the
contents of that file with one computed for the database when the daemon initializes. Is there a
better scheme?


> > * Each channel in the resulting kernel data structure will have appropriate
> > flags set indicating if it is to be used indoors, outdoors, or both. In
> > addition, if the channel should be used only for passive scanning, a
> > suitable flag will be set. In the 2.4 GHz band, a flag will indicate if it
> > should be used for 802.11b, otherwise both b and g mode will be assumed. In
> > the 5.0 GHz bands, a flag will be set if the channel is to conform with
> > 802.11h or 802.11a standards.
>
> 802.11h, radar detection, and DFS may need to be more complex than just
> a one-bit value of it being enabled. Countries may have different
> requirements for different areas related to 802.11h..

I'm afraid that I'm not quite ready for the complexity of 802.11h. Obviously, I
need to do more reading.

> > The database consists of two sections. The first relates the Country Codes
> > to a wireless group. The second section describes the channel parameters
> > for the groups. Shown below is a fragment showing the Country Code - Group
> > info for a few countries and the definitions for a few of the groups.
>
> One way to compress this and possibly make maintenance quite a bit
> easier would be to use two different sets of groups: one for 2.4 GHz band
> and another one for 5 GHz band. Many countries share the same
> requirements for 2.4 GHz, but have different 5 GHz requirements.. This
> is not really a requirement, but could end up making this easier to use.

I don't think it makes too much difference, but I will consider your suggestion as the database
starts to be more complete.



> > Number of Countries: 100
> > Number of Groups: 15
>
> These are not really needed and unless a tool is used to update this
> file, they will most likely end up being out of sync at some point ;-).
> The parser can just read through the file twice if it needs to know these
> numbers before parsing (though, that should not really be needed with
> dynamic data structures)..

Your point is well taken. I will remove that data.

> > # group  Country Code  Description
> > #
> > 1        AT            Austria (Standard EU)
> > 1        DE            Germany (Standard EU)
> > 2        FRI           France Indoor (Not Guyana or La Reunion)
> > 3        FRO           France Outdoor (Not Guyana or La Reunion)
> > 4        FR1           French Departments of Guyana and La Reunion Indoor
> > 5        FR2           French Departments of Guyana and La Reunion Outdoor
>
> Country code has to be two characters to fit into country IE..

This problem can be resolved for most of France as long as the driver supplies the country code and
the indoor/outdoor flag. The table would then be:

Group 2 - France (Not Guyana or La Reunion)
#
bg      1 -   8     1            100    B
bg      9 -  13     1            100    I
bg      9 -  13     1             10    O
h      36 -  48     4            200    I
h      52 -  64     4            200    IP
h     100 - 140     4           1000    IP
h     100 - 140     4           1000    OP

The details for the two unique French departments may have to come from the still undetermined
802.11h information.
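
To make the range-per-line idea concrete, here is a minimal C sketch of
how a parser might hold the France group and test a channel against it.
The struct layout, the names and the reading of the columns (band,
channel range, spacing, power, flags) are assumptions for illustration
only; the power values are copied as given, without assuming a unit.

```c
#include <stddef.h>

/* One line of a group definition (hypothetical layout). */
struct chan_range {
	const char *band;    /* "bg" or "h", as written in the table */
	int first;           /* lowest channel in the range */
	int last;            /* highest channel in the range */
	int spacing;         /* step between adjacent channels */
	int power;           /* power value as given in the table */
	const char *flags;   /* flag letters, kept as opaque text here */
};

/* The France table from the message above. A gap in the allowed
 * channels (for example between 64 and 100) is expressed simply by
 * starting a new line. */
const struct chan_range france_group2[] = {
	{ "bg",   1,   8, 1,  100, "B"  },
	{ "bg",   9,  13, 1,  100, "I"  },
	{ "bg",   9,  13, 1,   10, "O"  },
	{ "h",   36,  48, 4,  200, "I"  },
	{ "h",   52,  64, 4,  200, "IP" },
	{ "h",  100, 140, 4, 1000, "IP" },
	{ "h",  100, 140, 4, 1000, "OP" },
};

#define FRANCE_GROUP2_LEN (sizeof(france_group2) / sizeof(france_group2[0]))

/* Returns 1 if the channel falls on some range line, honouring the
 * spacing, and 0 otherwise. */
int channel_listed(const struct chan_range *tbl, size_t n, int channel)
{
	size_t i;

	for (i = 0; i < n; i++) {
		const struct chan_range *r = &tbl[i];

		if (channel < r->first || channel > r->last)
			continue;
		if ((channel - r->first) % r->spacing == 0)
			return 1;
	}
	return 0;
}
```

A lookup that also honours the indoor/outdoor flag would need the flag
letters decoded first, which this sketch leaves out.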



> AT and DE are a good example of possible use for different 2.4 GHz and 5
> GHz groups.. If I remember correctly, they have the same rules for 2.4
> GHz, but different for 5 GHz.. (unless--of course--they already changed
> them since I looked last time.. ;-)

Yes they have. Following the decision contained in
http://europa.eu.int/eur-lex/lex/LexUriServ/LexUriServ.do?uri=CELEX:32005D0513:EN:NOT, all EU
members and candidates are to adopt the same standards. Most already have. The differences are
outlined in Appendix 3 of the document downloaded through
http://www.ero.dk/documentation/docs/docfiles.asp?docid=1622&wd=N. Things are changing so fast that
the information for Greece has already changed. The bottom line is that for most EU countries, the
requirements are identical.

> > # Ch. Range - Minimum and Maximum Channels for this range
> > # Ch. Spacing - Number of channels between adjacent entries
>
> Other option would be to use start channel and number of channels.
> Channel spacing is also fixed in practice (1 for 2.4 GHz, 4 for 5 GHz),
> so i

Re: dBm cutoff at -1dBm is too low

2006-05-06 Thread Pavel Roskin
On Fri, 2006-05-05 at 10:28 -0700, Jean Tourrilhes wrote:
>   Note that the main limitation is that before I introduced the
> explicit IW_QUAL_DBM in WE-19, the way to know if the value was
> relative or dBm was to use the 'sign' of it, i.e. values above 0 were
> non-dBm. The test is a few lines before this snippet :
> 
> -
>   /* Check if the statistics are in dBm or relative */
>   if((qual->updated & IW_QUAL_DBM)
>|| (qual->level > range->max_qual.level))
>   {
> -
> 
>   There are still quite a few drivers which have not been
> converted to use IW_QUAL_DBM, so I don't want to drop the backward
> compatibility yet.

But shouldn't you trust the drivers using IW_QUAL_DBM, whether the value
is positive or negative?

>   Second, the measurement is useful mostly in marginal
> conditions. When signal is great, you don't really care, when signal
> is low, you want to tweak your system to improve reception.

The problem is the driver has to take care of it.  It cannot just take
the dBm value from the card and pass it to userspace.  It has to limit
the value at -1.  Otherwise iwconfig would show -256dBm or something
like that.  I can imagine that some GUI can decide that the connection
has become very bad, and that would confuse the user.

> > Wouldn't it be better to put the cutoff at a higher value?  The simplest
> > approach would be to treat qual->level and qual->noise as signed char,
> > which would put the cutoff at 127dBm.  500 gigawatts should be enough
> > for everyone :-)
> 
>   FCC says Tx 1W @ 2.4 GHz, ETSI says Tx 100mW @ 2.4 GHz. Yeah,
> you could use directional antennas. So, realistically, we only need to
> extend to +30dBm.
>   On the other hand, I expect that with MIMO and UWB we would
> start to receive signal weaker than what we currently do, and you
> don't want to cutoff the bottom of the range (is -128dBm enough ?).
>   I tried to use 'signed' in the struct a long while ago, and
> for some reason it broke left and right, I don't remember the
> details. So, whatever we do, it would not be straightforward.

I suggest -192dBm to 63dBm.  That's enough padding on both sides, so
that the drivers can just pass the firmware value without checking.
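
Roughly, in C, and only as a sketch: this assumes the level is carried
in an unsigned 8-bit field and that the reader currently subtracts 256
from it, which matches the -256dBm remark above but has not been
checked against the tools. All function names are made up.

```c
#include <stdint.h>

/* Store a dBm value in the 8-bit field; conversion to an unsigned
 * type wraps modulo 256, so -1 becomes 255 and -192 becomes 64. */
uint8_t dbm_encode(int dbm)
{
	return (uint8_t)dbm;
}

/* Assumed current reading: every raw value maps to raw - 256, giving
 * a range of -256..-1 dBm. A driver that passes 0 shows up as -256. */
int dbm_decode_current(uint8_t raw)
{
	return (int)raw - 256;
}

/* Suggested reading: raw values 0..63 are positive dBm, 64..255 map
 * to -192..-1 dBm, so the usable range becomes -192..63 dBm. */
int dbm_decode_proposed(uint8_t raw)
{
	return raw < 64 ? (int)raw : (int)raw - 256;
}
```

With the proposed reading a driver could store the firmware value
directly, as long as it lies between -192 and 63.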

-- 
Regards,
Pavel Roskin



Re: [rfc][patch] ipvs: use proper timeout instead of fixed value

2006-05-06 Thread Horms
On Fri, May 05, 2006 at 02:57:26PM -0400, Andy Gospodarek wrote:
> On Fri, May 05, 2006 at 12:20:54PM +0900, Horms wrote:
> > 
> > Sorry, I misunderstood your patch completely the first time around.
> > Yes, I think this is an excellent idea. Assuming it's tested and works,
> > I'm happy to sign off on it and prod DaveM.
> 
> Horms,
> 
> I'll get a setup together and post results when I have them.

I was thinking that it would be nice if the timeout could be sent over
the wire, though that might bring in some compatibility issues,
and thus your approach might be the best idea.

-- 
Horms   http://www.vergenet.net/~horms/



[PATCH] TCP congestion module: add TCP-LP supporting for 2.6.16.14

2006-05-06 Thread Wong Edison

TCP Low Priority is a distributed algorithm whose goal is to utilize only
the excess network bandwidth as compared to the ``fair share`` of
bandwidth as targeted by TCP. Available from:
  http://www.ece.rice.edu/~akuzma/Doc/akuzma/TCP-LP.pdf

Original Author:
Aleksandar Kuzmanovic <[EMAIL PROTECTED]>

See http://www-ece.rice.edu/networks/TCP-LP/ for their implementation.
As of 2.6.13, Linux supports pluggable congestion control algorithms.
Due to the limitations of the API, we make the following changes to
the original TCP-LP implementation:
o We use newReno in most core CA handling. Only add some checking
  within cong_avoid.
o Error correction of the remote HZ, so the remote HZ is continuously
  checked and updated.
o Handle calculation of One-Way-Delay (OWD) within rtt_sample, since
  OWD has a similar meaning to RTT. Also correct the buggy formula.
o Handle reaction to Early Congestion Indication (ECI) within
  pkts_acked, as mentioned in the pseudo code.
o OWD is handled in relative format, where the local time stamp will be
  in tcp_time_stamp format.

Port from 2.4.19 to 2.6.16 as module by:
Wong Hoi Sing Edison <[EMAIL PROTECTED]>
Hung Hing Lun <[EMAIL PROTECTED]>

Signed-off-by: Wong Hoi Sing Edison <[EMAIL PROTECTED]>

---

diff -urpN linux-2.6.16.14/net/ipv4/Kconfig linux/net/ipv4/Kconfig
--- linux-2.6.16.14/net/ipv4/Kconfig  2006-05-05 08:03:45.0 +0800
+++ linux/net/ipv4/Kconfig  2006-05-07 01:41:33.0 +0800
@@ -531,6 +531,27 @@ config TCP_CONG_SCALABLE
  properties, though is known to have fairness issues.
  See http://www-lce.eng.cam.ac.uk/~ctk21/scalable/

+config TCP_CONG_LP
+   tristate "TCP Low Priority"
+   depends on EXPERIMENTAL
+   default n
+   ---help---
+   TCP Low Priority (TCP-LP), a distributed algorithm whose goal is
+   to utilize only the excess network bandwidth as compared to the
+   ``fair share`` of bandwidth as targeted by TCP.
+   See http://www-ece.rice.edu/networks/TCP-LP/
+
+config TCP_CONG_LP_DEBUG
+   bool "TCP-LP Debug"
+   depends on TCP_CONG_LP
+   default n
+   ---help---
+   Turn the debug messages for TCP-LP on or off. The messages are
+   written to the default kernel debug log file, e.g. /var/log/debug.
+   You can also use dmesg to read them.
+
+   If unsure, say N.
+
endmenu

config TCP_CONG_BIC
diff -urpN linux-2.6.16.14/net/ipv4/Makefile linux/net/ipv4/Makefile
--- linux-2.6.16.14/net/ipv4/Makefile   2006-05-05 08:03:45.0 +0800
+++ linux/net/ipv4/Makefile 2006-05-07 01:41:33.0 +0800
@@ -41,6 +41,7 @@ obj-$(CONFIG_TCP_CONG_HYBLA) += tcp_hybl
obj-$(CONFIG_TCP_CONG_HTCP) += tcp_htcp.o
obj-$(CONFIG_TCP_CONG_VEGAS) += tcp_vegas.o
obj-$(CONFIG_TCP_CONG_SCALABLE) += tcp_scalable.o
+obj-$(CONFIG_TCP_CONG_LP) += tcp_lp.o

obj-$(CONFIG_XFRM) += xfrm4_policy.o xfrm4_state.o xfrm4_input.o \
xfrm4_output.o
diff -urpN linux-2.6.16.14/net/ipv4/tcp_lp.c linux/net/ipv4/tcp_lp.c
--- linux-2.6.16.14/net/ipv4/tcp_lp.c   1970-01-01 08:00:00.0 +0800
+++ linux/net/ipv4/tcp_lp.c 2006-05-07 01:41:33.0 +0800
@@ -0,0 +1,343 @@
+/*
+ * TCP Low Priority (TCP-LP)
+ *
+ * TCP Low Priority is a distributed algorithm whose goal is to utilize only
+ *   the excess network bandwidth as compared to the ``fair share`` of
+ *   bandwidth as targeted by TCP. Available from:
+ * http://www.ece.rice.edu/~akuzma/Doc/akuzma/TCP-LP.pdf
+ *
+ * Original Author:
+ *   Aleksandar Kuzmanovic <[EMAIL PROTECTED]>
+ *
+ * See http://www-ece.rice.edu/networks/TCP-LP/ for their implementation.
+ * As of 2.6.13, Linux supports pluggable congestion control algorithms.
+ * Due to the limitations of the API, we make the following changes to
+ * the original TCP-LP implementation:
+ *   o We use newReno in most core CA handling. Only add some checking
+ *     within cong_avoid.
+ *   o Error correction of the remote HZ, so the remote HZ is continuously
+ *     checked and updated.
+ *   o Handle calculation of One-Way-Delay (OWD) within rtt_sample, since
+ *     OWD has a similar meaning to RTT. Also correct the buggy formula.
+ *   o Handle reaction to Early Congestion Indication (ECI) within
+ *     pkts_acked, as mentioned in the pseudo code.
+ *   o OWD is handled in relative format, where the local time stamp will be
+ *     in tcp_time_stamp format.
+ *
+ * Port from 2.4.19 to 2.6.16 as module by:
+ *   Wong Hoi Sing Edison <[EMAIL PROTECTED]>
+ *   Hung Hing Lun <[EMAIL PROTECTED]>
+ *
+ * Version: $Id: tcp_lp.c,v 1.22 2006-05-02 18:18:19 hswong3i Exp $
+ */
+
+#include 
+#include 
+#include 
+
+#ifndef CONFIG_TCP_CONG_LP_DEBUG
+#define CONFIG_TCP_CONG_LP_DEBUG 0
+#endif
+
+/* resolution of owd */
+#define LP_RESOL   1000
+
+/**
+ * enum tcp_lp_state
+ * @LP_VALID_RHZ: is remote HZ valid?
+ * @LP_VALID_OWD: is OWD valid?
+ * @LP_WITHIN_THR: are we within threshold?
+ * @LP_WITHIN_INF: are we within inference?
+ *
+ * TCP-LP's sta

Re: [PATCH] TCP congestion module: add TCP-LP supporting for 2.6.16.14

2006-05-06 Thread David S. Miller

How many times are you going to post this same patch over and over
again?  Please don't do that, thank you.

We all saw it the first time.
