Re: Linux v2.6.16-rc6

2006-03-12 Thread Willy Tarreau
On Sat, Mar 11, 2006 at 06:39:04PM -0800, David S. Miller wrote:
 From: Michal Piotrowski [EMAIL PROTECTED]
 Date: Sun, 12 Mar 2006 02:51:40 +0100
 
  I have noticed these warnings
  TCP: Treason uncloaked! Peer 82.113.55.2:11759/50967 shrinks window
  148470938:148470943. Repaired.
  TCP: Treason uncloaked! Peer 82.113.55.2:11759/50967 shrinks window
  148470938:148470943. Repaired.
  TCP: Treason uncloaked! Peer 82.113.55.2:11759/59768 shrinks window
  1124211698:1124211703. Repaired.
  TCP: Treason uncloaked! Peer 82.113.55.2:11759/59768 shrinks window
  1124211698:1124211703. Repaired.
  
  It may be a problem with ktorrent.
 
 It is a problem with the remote TCP implementation: it is
 illegally advertising a smaller window than it previously
 did.

On 2005/10/27, Herbert Xu provided a patch, merged in 2.6.14, to fix some
erroneous occurrences of this message (some of them appeared with Linux
on the other side). It would be interesting to know whether the peer
above is Linux or not, because Herbert's fix might need to be applied
in other places.

Here comes his patch with his interesting analysis for reference, in
case it might give ideas to anybody.

Cheers,
Willy

---
From: Herbert Xu [EMAIL PROTECTED]
Date: Thu, 27 Oct 2005 08:47:46 + (+1000)
Subject: [TCP]: Clear stale pred_flags when snd_wnd changes
X-Git-Tag: v2.6.14
X-Git-Url: 
http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2ad41065d9fe518759b695fc2640cf9c07261dd2

[TCP]: Clear stale pred_flags when snd_wnd changes

This bug is responsible for causing the infamous Treason uncloaked
messages that's been popping up everywhere since the printk was added.
It has usually been blamed on foreign operating systems.  However,
some of those reports implicate Linux as both systems are running
Linux or the TCP connection is going across the loopback interface.

In fact, there really is a bug in the Linux TCP header prediction code
that's been there since at least 2.1.8.  This bug was tracked down with
help from Dale Blount.

The effect of this bug ranges from harmless Treason uncloaked
messages to hung/aborted TCP connections.  The details of the bug
and fix is as follows.

When snd_wnd is updated, we only update pred_flags if
tcp_fast_path_check succeeds.  When it fails (for example,
when our rcvbuf is used up), we will leave pred_flags with
an out-of-date snd_wnd value.

When the out-of-date pred_flags happens to match the next incoming
packet we will again hit the fast path and use the current snd_wnd
which will be wrong.

In the case of the treason messages, it just happens that the snd_wnd
cached in pred_flags is zero while tp->snd_wnd is non-zero.  Therefore
when a zero-window packet comes in we incorrectly conclude that the
window is non-zero.

In fact if the peer continues to send us zero-window pure ACKs we
will continue making the same mistake.  It's only when the peer
transmits a zero-window packet with data attached that we get a
chance to snap out of it.  This is what triggers the treason
message at the next retransmit timeout.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---

--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2239,6 +2239,7 @@ static int tcp_ack_update_window(struct 
 			/* Note, it is the only place, where
 			 * fast path is recovered for sending TCP.
 			 */
+			tp->pred_flags = 0;
 			tcp_fast_path_check(sk, tp);
 
 			if (nwin > tp->max_window) {
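
The stale-prediction problem described in the commit message can be modelled in a few lines of user-space C. This is only a sketch under simplifying assumptions: struct demo_conn, demo_update_window() and demo_fast_path_hit() are hypothetical names, and the prediction value is reduced to the raw window plus a marker bit rather than the real TCP header word the kernel caches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the sender-side TCP state. */
struct demo_conn {
	uint32_t snd_wnd;     /* window most recently advertised by the peer */
	uint32_t pred_flags;  /* cached prediction; 0 disables the fast path */
	bool     rcvbuf_full; /* when true, the fast path check fails */
};

/* Encodes a window into a non-zero prediction value (simplified). */
static uint32_t demo_make_pred(uint32_t wnd)
{
	return 0x80000000u | wnd;
}

/* Plays the role of tcp_fast_path_check(): re-arms only when allowed. */
static void demo_fast_path_check(struct demo_conn *c)
{
	if (!c->rcvbuf_full)
		c->pred_flags = demo_make_pred(c->snd_wnd);
}

/* Window update; clear_first selects the patched behaviour. */
static void demo_update_window(struct demo_conn *c, uint32_t nwin,
			       bool clear_first)
{
	assert(c != NULL);
	c->snd_wnd = nwin;
	if (clear_first)
		c->pred_flags = 0;	/* corresponds to the one-line fix */
	demo_fast_path_check(c);
}

/* True when a segment advertising wnd would take the fast path. */
static bool demo_fast_path_hit(const struct demo_conn *c, uint32_t wnd)
{
	return c->pred_flags != 0 && c->pred_flags == demo_make_pred(wnd);
}
```

In this model, if the prediction was armed while the window was zero and a later update to 1000 happens while rcvbuf_full is set, an unpatched demo_update_window() leaves the zero-window prediction in place, so a zero-window segment still hits the fast path while snd_wnd reads 1000. With clear_first set, the prediction is dropped and the segment goes through the slow path.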




-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[patch 1/1] git-net-arm-build-fix

2006-03-12 Thread akpm

From: Andrew Morton [EMAIL PROTECTED]

net/core/sock.c: In function `sock_setsockopt':
net/core/sock.c:460: error: duplicate case value
net/core/sock.c:278: error: previously used here

Guess:



Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 include/asm-arm/socket.h |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff -puN include/asm-arm/socket.h~git-net-arm-build-fix include/asm-arm/socket.h
--- 25-arm/include/asm-arm/socket.h~git-net-arm-build-fix	2006-03-12 00:34:01.0 -0800
+++ 25-arm-akpm/include/asm-arm/socket.h	2006-03-12 00:34:15.0 -0800
@@ -48,6 +48,6 @@
 #define SO_ACCEPTCONN  30
 
 #define SO_PEERSEC 31
-#define SO_PASSSEC 32
+#define SO_PASSSEC 34
 
 #endif /* _ASM_SOCKET_H */
_
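
The "duplicate case value" error above appears when two option macros expand to the same integer and both are used as case labels in one switch. A minimal sketch of that situation follows. The DEMO_ names are hypothetical, DEMO_OTHER stands in for whichever option already used 32 (the thread does not say which one), and the static assertion is just one possible way to catch such a collision early.

```c
#include <assert.h>

/* The first two values and the final 34 mirror the hunk above;
 * DEMO_OTHER is a hypothetical stand-in for the conflicting option. */
#define DEMO_ACCEPTCONN	30
#define DEMO_PEERSEC	31
#define DEMO_OTHER	32
#define DEMO_PASSSEC	34	/* 32 here would collide with DEMO_OTHER */

/* Fails the build with a readable message if the numbers collide. */
static_assert(DEMO_PASSSEC != DEMO_OTHER, "option numbers must be unique");

/* A switch like the one in sock_setsockopt(): every label must differ. */
static const char *demo_opt_name(int opt)
{
	switch (opt) {
	case DEMO_ACCEPTCONN:	return "acceptconn";
	case DEMO_PEERSEC:	return "peersec";
	case DEMO_OTHER:	return "other";
	case DEMO_PASSSEC:	return "passsec";
	default:		return "unknown";
	}
}
```

Changing DEMO_PASSSEC back to 32 in this sketch trips the static assertion, and without the assertion the compiler would reject the switch.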


Re: [patch 1/2] git-net: ebtables fix

2006-03-12 Thread Andrew Morton
[EMAIL PROTECTED] wrote:

  +.set_optmax = EBT_SO_SET_MAX + 1
  +.set= do_ebt_set_ctl,

It's unclear why that compiled.   Let me try again.


[patch 1/1] git-net: ebtables fix

2006-03-12 Thread akpm

From: Andrew Morton [EMAIL PROTECTED]

net/bridge/netfilter/ebtables.c:1481: warning: initialization makes pointer 
from integer without a cast

Note that the compat functions aren't implemented?


Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 net/bridge/netfilter/ebtables.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff -puN net/bridge/netfilter/ebtables.c~git-net-ebtables-fix net/bridge/netfilter/ebtables.c
--- 25/net/bridge/netfilter/ebtables.c~git-net-ebtables-fix	2006-03-12 00:24:54.0 -0800
+++ 25-akpm/net/bridge/netfilter/ebtables.c	2006-03-12 00:53:32.0 -0800
@@ -1477,8 +1477,14 @@ static int do_ebt_get_ctl(struct sock *s
 }
 
 static struct nf_sockopt_ops ebt_sockopts =
-{ { NULL, NULL }, PF_INET, EBT_BASE_CTL, EBT_SO_SET_MAX + 1, do_ebt_set_ctl,
-EBT_BASE_CTL, EBT_SO_GET_MAX + 1, do_ebt_get_ctl, 0, NULL
+{
+	.pf		= PF_INET,
+	.set_optmin	= EBT_BASE_CTL,
+	.set_optmax	= EBT_SO_SET_MAX + 1,
+	.set		= do_ebt_set_ctl,
+	.get_optmin	= EBT_BASE_CTL,
+	.get_optmax	= EBT_SO_GET_MAX + 1,
+	.get		= do_ebt_get_ctl,
 };
 
 static int __init init(void)
_
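
The warning fixed above is typical of positional initializers: once a structure gains members, every later value shifts by one slot and an integer can land where a function pointer is expected. The sketch below shows how designated initializers avoid this. struct demo_sockopt_ops and its compat members are a hypothetical layout for illustration, not the real struct nf_sockopt_ops.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*demo_ctl_fn)(int optval);

/* Hypothetical layout: the compat hooks were added after the original
 * positional initializer had been written. */
struct demo_sockopt_ops {
	int		pf;
	int		set_optmin;
	int		set_optmax;
	demo_ctl_fn	set;
	demo_ctl_fn	compat_set;	/* newly added member */
	int		get_optmin;
	int		get_optmax;
	demo_ctl_fn	get;
	demo_ctl_fn	compat_get;	/* newly added member */
};

static int demo_set(int optval) { return optval + 1; }
static int demo_get(int optval) { return optval - 1; }

/* Named members stay correct whatever the field order; members that are
 * not mentioned (the compat hooks here) are zero-initialised. */
static const struct demo_sockopt_ops demo_ops = {
	.pf		= 2,
	.set_optmin	= 128,
	.set_optmax	= 130,
	.set		= demo_set,
	.get_optmin	= 128,
	.get_optmax	= 132,
	.get		= demo_get,
};

/* Calls the set hook after checking that it was provided. */
static int demo_call_set(const struct demo_sockopt_ops *ops, int optval)
{
	assert(ops != NULL && ops->set != NULL);
	return ops->set(optval);
}
```

The numeric values are placeholders chosen for the sketch; only the initializer style is the point.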


Re: [patch 2/2] git-net: br_netfilter warning fixes

2006-03-12 Thread David S. Miller
From: [EMAIL PROTECTED]
Date: Sat, 11 Mar 2006 19:20:22 -0800

 From: Andrew Morton [EMAIL PROTECTED]
 
 net/bridge/br_netfilter.c: In function `br_nf_pre_routing':
 net/bridge/br_netfilter.c:427: warning: unused variable `vhdr'
 net/bridge/br_netfilter.c:445: warning: unused variable `vhdr'
 
 Signed-off-by: Andrew Morton [EMAIL PROTECTED]

Applied, thanks.


Re: [patch 1/2] git-net: ebtables fix

2006-03-12 Thread David S. Miller
From: [EMAIL PROTECTED]
Date: Sat, 11 Mar 2006 19:20:22 -0800

 From: Andrew Morton [EMAIL PROTECTED]
 
 net/bridge/netfilter/ebtables.c:1481: warning: initialization makes pointer 
 from integer without a cast
 
 Note that the compat functions aren't implemented?

Right, but now at least they can be implemented now that
we have the proper infrastructure for it :-)

 Signed-off-by: Andrew Morton [EMAIL PROTECTED]

Applied, thanks.


Re: [patch 1/1] git-net: ebtables fix

2006-03-12 Thread David S. Miller
From: [EMAIL PROTECTED]
Date: Sun, 12 Mar 2006 00:41:41 -0800

 From: Andrew Morton [EMAIL PROTECTED]
 
 net/bridge/netfilter/ebtables.c:1481: warning: initialization makes pointer 
 from integer without a cast
 
 Note that the compat functions aren't implemented?
 
 Signed-off-by: Andrew Morton [EMAIL PROTECTED]

I just added the one-liner to put the comma there.

Thanks.


[patch 1/1] git-net: export security_sid_to_context()

2006-03-12 Thread akpm

From: Andrew Morton [EMAIL PROTECTED]

WARNING: security_sid_to_context [net/unix/unix.ko] undefined!

(Those functions in scm.h should be uninlined)

Cc: Stephen Smalley [EMAIL PROTECTED]
Cc: James Morris [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 security/selinux/ss/services.c |3 +++
 1 files changed, 3 insertions(+)

diff -puN security/selinux/ss/services.c~git-net-export-security_sid_to_context security/selinux/ss/services.c
--- 25-um/security/selinux/ss/services.c~git-net-export-security_sid_to_context	2006-03-12 01:14:07.0 -0800
+++ 25-um-akpm/security/selinux/ss/services.c	2006-03-12 01:14:25.0 -0800
@@ -27,6 +27,8 @@
 #include <linux/in.h>
 #include <linux/sched.h>
 #include <linux/audit.h>
+#include <linux/module.h>
+
 #include <asm/semaphore.h>
 #include "flask.h"
 #include "avc.h"
@@ -616,6 +618,7 @@ out:
return rc;
 
 }
+EXPORT_SYMBOL_GPL(security_sid_to_context);
 
 static int security_context_to_sid_core(char *scontext, u32 scontext_len, u32 *sid, u32 def_sid)
 {
_


Re: [patch 1/1] git-net: export security_sid_to_context()

2006-03-12 Thread David S. Miller
From: [EMAIL PROTECTED]
Date: Sun, 12 Mar 2006 01:19:18 -0800

 WARNING: security_sid_to_context [net/unix/unix.ko] undefined!

Applied, thanks Andrew.


Re: [PATCH 2/8] [I/OAT] Driver for the Intel(R) I/OAT DMA engine

2006-03-12 Thread Evgeniy Polyakov
On Fri, Mar 10, 2006 at 06:29:46PM -0800, Leech, Christopher ([EMAIL 
PROTECTED]) wrote:
 From: Chris Leech [mailto:[EMAIL PROTECTED] 
 Sent: Friday, March 10, 2006 6:29 PM
 To: 
 Subject: [PATCH 2/8] [I/OAT] Driver for the Intel(R) I/OAT DMA engine
 
 
 Adds a new ioatdma driver

enumerate_dma_channels() is still broken, if it can not fail add NOFAIL
gfp flag.
You also play tricky games with the common_node/device_node lists of struct
dma_chan: one of those lists is never protected, while the other is accessed
under RCU and other locks (btw, why does insertion use RCU while deletion
in dma_async_device_unregister() does not?).
Is struct ioat_dma_chan freed anywhere?

-- 
Evgeniy Polyakov


Re: [PATCH 2/8] [I/OAT] Driver for the Intel(R) I/OAT DMA engine

2006-03-12 Thread Andrew Morton
Evgeniy Polyakov [EMAIL PROTECTED] wrote:

  On Fri, Mar 10, 2006 at 06:29:46PM -0800, Leech, Christopher ([EMAIL 
 PROTECTED]) wrote:
   From: Chris Leech [mailto:[EMAIL PROTECTED] 
   Sent: Friday, March 10, 2006 6:29 PM
   To: 
   Subject: [PATCH 2/8] [I/OAT] Driver for the Intel(R) I/OAT DMA engine
   
   
   Adds a new ioatdma driver
 
  enumerate_dma_channels() is still broken, if it can not fail add NOFAIL
  gfp flag.

The __GFP_NOFAIL flag is there to mark lame-and-buggy-code which doesn't
know how to handle ENOMEM.  I went through the kernel, found all the
retry-until-it-works loops and consolidated their behaviour in the page
allocator instead.

Really we should fix them all up.  Adding new users of __GFP_NOFAIL
would not be good.
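
As an illustration of handling ENOMEM instead of relying on a no-fail allocation, here is a user-space sketch. It is not the ioatdma code: demo_enumerate_channels(), struct demo_chan and the injectable allocator are hypothetical, and malloc()/free() stand in for the kernel allocators.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct demo_chan { int id; };

/* Allocator hook so that a failure can be simulated. */
typedef void *(*demo_alloc_fn)(size_t size);

/* Allocates count channels into out. On failure everything allocated so
 * far is released and -ENOMEM is returned to the caller, rather than
 * looping until memory becomes available. */
static int demo_enumerate_channels(struct demo_chan **out, int count,
				   demo_alloc_fn alloc)
{
	int i;

	assert(out != NULL && count >= 0);
	for (i = 0; i < count; i++) {
		out[i] = alloc(sizeof(*out[i]));
		if (!out[i])
			goto err_free;
		out[i]->id = i;
	}
	return 0;

err_free:
	while (--i >= 0) {
		free(out[i]);
		out[i] = NULL;
	}
	return -ENOMEM;
}

/* Demonstration allocator that fails on its third call. */
static int demo_calls;
static void *demo_flaky_alloc(size_t size)
{
	return (++demo_calls == 3) ? NULL : malloc(size);
}
```

The caller then decides what a failed enumeration means (abort the probe, retry later), which is the part a retry-until-it-works loop hides.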


Re: Linux v2.6.16-rc6

2006-03-12 Thread Michal Piotrowski
Hi,

On 12/03/06, David S. Miller [EMAIL PROTECTED] wrote:
 From: Michal Piotrowski [EMAIL PROTECTED]
 Date: Sun, 12 Mar 2006 02:51:40 +0100

  I have noticed these warnings
  TCP: Treason uncloaked! Peer 82.113.55.2:11759/50967 shrinks window
  148470938:148470943. Repaired.
  TCP: Treason uncloaked! Peer 82.113.55.2:11759/50967 shrinks window
  148470938:148470943. Repaired.
  TCP: Treason uncloaked! Peer 82.113.55.2:11759/59768 shrinks window
  1124211698:1124211703. Repaired.
  TCP: Treason uncloaked! Peer 82.113.55.2:11759/59768 shrinks window
  1124211698:1124211703. Repaired.
 
  It may be a problem with ktorrent.

 It is a problem with the remote TCP implementation: it is
 illegally advertising a smaller window than it previously
 did.


Thanks for explanation.

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)


Re: [PATCH] Nearly complete kzalloc cleanup for net/ipv6

2006-03-12 Thread Patrick McHardy
Ingo Oeser wrote:
 From: Ingo Oeser [EMAIL PROTECTED]
 
 Stupidly use kzalloc() instead of kmalloc()/memset() 
 everywhere where this is possible in net/ipv6/*.c . 
 
 The netfilter part is NOT included, because Harald should see these, too.

Feel free to send netfilter patches to me.


Re: 2.6.16-rc6-mm1

2006-03-12 Thread Benoit Boissinot
On Sun, Mar 12, 2006 at 03:10:36AM -0800, Andrew Morton wrote:
 
 ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.16-rc6/2.6.16-rc6-mm1/
 
 

drivers/net/tg3.c:8065: warning: type qualifiers ignored on function return type

Signed-off-by: Benoit Boissinot [EMAIL PROTECTED]

Index: linux/drivers/net/tg3.c
===
--- linux.orig/drivers/net/tg3.c
+++ linux/drivers/net/tg3.c
@@ -8061,7 +8061,7 @@ static int tg3_test_link(struct tg3 *tp)
 }
 
 /* Only test the commonly used registers */
-static const int tg3_test_registers(struct tg3 *tp)
+static int tg3_test_registers(struct tg3 *tp)
 {
int i, is_5705;
u32 offset, read_mask, write_mask, val, save_val, read_val;

-- 
powered by bash/screen/(urxvt/fvwm|linux-console)/gentoo/gnu/linux OS


David W. Hankins: Linux BSD sockets.

2006-03-12 Thread Michael Richardson
---BeginMessage---
I recently got a bug in my ear to try and get multiple-dhclients to
work on separate interfaces under Linux (a one-daemon-per-interface
model).  Could use some advice from the linux masters (Jason, Andrew?)
if you can spare some time for me.

Another motivation is trying to kill our LPF support (switch to BSD
sockets or similar - so all we have to do is DHCP).


Our linux support with BSD sockets looks to see if the system has
SO_BINDTODEVICE, which purports to bind a socket to the named
interface (so we know packets received on it were received on that
interface, and packets transmitted to for example 255.255.255.255 will
exit that interface).  In addition to this, we set SO_BROADCAST to
truth for obvious reasons.

In theory then, our current bsd sockets support should always have
worked without the need for LPF, which is more than a little
astonishing.

Except now that I test this and try to look at where it's breaking
down, I find that dhclient can't seem to receive the server's responses
(Linux kernel v. 2.6.13), either when it unicasts its response to the
client's assigned address (and unicast mac) or when I set the broadcast
flag and make it reply in broadcast (broadcast mac and IP address to
255.255.255.255).

None of our recvfrom()'s are getting tickled, and the socket buffer
on these sockets isn't increasing (so it's not because we're not
emptying the socket).

One kinda funny side-effect of this shows up in my dmesg:

martian source 255.255.255.255 from 192.168.0.2, on dev eth0

192.168.0.2 is my home NAT dhcp server, and this is its perfectly normal
broadcast response.  I guess the kernel doesn't recognize that address
as a valid local destination.


In case you weren't aware, I'll explain real briefly, that up until
this point what we do is construct a LPF (packet filter) socket for
receiving and transmitting local subnet broadcast DHCP packets, and
a normal BSD socket (which we call the 'fallback interface') for any
unicast traffic that needs to be routed.


So, if I understand the current state of affairs for running DHCP on
Linux, to be able to run multiple daemons we will need to keep the
current LPF support and rework this 'fallback interface' support.

Chiefly, to only construct the fallback interface once at least one
interface assigned to the daemon's purview has achieved BOUND state,
and to bind() the socket to the address assigned to that interface
(which may mean having multiple 'fallback interfaces' if a daemon is
being asked to process multiple physical interfaces).

So, once the interface is addressed, construct the fallback interface
so that we can send our unicast renewals...if we fall out of BOUND
state, tear down this socket before clearing the interface config.

This avoids the problem of having multiple dhclient daemons all trying
to bind to 0.0.0.0:68, but is substantially more complex (and opens up
a corner case: duplicate addresses, but later in the plan we will solve
this while solving other problems).


Right, so the question is, is this the only way to move forward?  Is
there a better way to get RFC2131 broadcast address compliance on Linux
without writing our own UDP packets?

-- 
David W. Hankins                   If you don't do it right the first time,
Software Engineer                  you'll just have to do it again.
Internet Systems Consortium, Inc.  -- Jack T. Hankins
---End Message---


PKTGEN: Adds thread limit parameter.

2006-03-12 Thread Luiz Fernando Capitulino

Currently, pktgen will create one thread for each online CPU in the
system. That behavior may be annoying if you're using pktgen on a
large SMP system.

This patch adds a new module parameter called 'pg_max_threads', which
can be set to the maximum number of threads pktgen should create.

For example, if you're playing with pktgen on an SMP system with 8
CPUs, but want only two pktgen threads, you can do:

   modprobe pktgen pg_max_threads=2

Signed-off-by: Luiz Capitulino [EMAIL PROTECTED]

---

 net/core/pktgen.c |   23 +++++++++++++++++------
 1 files changed, 17 insertions(+), 6 deletions(-)

e210ad47d0db52496fdaabdd50bfe6ee0bcc53cd
diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 37b25a6..994aef1 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -154,7 +154,7 @@
 #include <asm/div64.h>		/* do_div */
 #include <asm/timex.h>
 
-#define VERSION  "pktgen v2.66: Packet Generator for packet performance testing.\n"
+#define VERSION  "pktgen v2.67: Packet Generator for packet performance testing.\n"
 
 /* #define PG_DEBUG(a) a */
 #define PG_DEBUG(a)
@@ -488,6 +488,7 @@ static unsigned int fmt_ip6(char *s, con
 static int pg_count_d = 1000;	/* 1000 pkts by default */
 static int pg_delay_d;
 static int pg_clone_skb_d;
+static int pg_max_threads;
 static int debug;
 
 static DEFINE_MUTEX(pktgen_thread_lock);
@@ -3184,7 +3185,7 @@ static int pktgen_remove_device(struct p
 
 static int __init pg_init(void)
 {
-	int cpu;
+	int i, threads;
 	struct proc_dir_entry *pe;
 
 	printk(version);
@@ -3208,15 +3209,24 @@ static int __init pg_init(void)
 	/* Register us to receive netdevice events */
 	register_netdevice_notifier(&pktgen_notifier_block);
 
-	for_each_online_cpu(cpu) {
+	threads = num_online_cpus();
+
+	/*
+	 * Check if we should have less threads than the number
+	 * of online CPUs
+	 */
+	if ((pg_max_threads > 0) && (pg_max_threads < threads))
+		threads = pg_max_threads;
+
+	for (i = 0; i < threads; i++) {
 		int err;
 		char buf[30];
 
-		sprintf(buf, "kpktgend_%i", cpu);
-		err = pktgen_create_thread(buf, cpu);
+		sprintf(buf, "kpktgend_%i", i);
+		err = pktgen_create_thread(buf, i);
 		if (err)
 			printk("pktgen: WARNING: Cannot create thread for cpu %d (%d)\n",
-			       cpu, err);
+			       i, err);
 	}
 
 	if (list_empty(&pktgen_threads)) {
@@ -3263,4 +3273,5 @@ MODULE_LICENSE("GPL");
 module_param(pg_count_d, int, 0);
 module_param(pg_delay_d, int, 0);
 module_param(pg_clone_skb_d, int, 0);
+module_param(pg_max_threads, int, 0);
 module_param(debug, int, 0);
-- 
1.2.4.gbe76
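
The thread-count selection in pg_init() above boils down to a small clamp, sketched here as a stand-alone function. demo_thread_count() is a hypothetical helper written for illustration; the patch itself performs the check inline.

```c
#include <assert.h>

/* Returns how many worker threads to create: one per online CPU, unless
 * max_threads is positive and smaller than the CPU count. A value of 0
 * (the module parameter default) or a negative value means no limit. */
static int demo_thread_count(int online_cpus, int max_threads)
{
	assert(online_cpus > 0);
	if (max_threads > 0 && max_threads < online_cpus)
		return max_threads;
	return online_cpus;
}
```

With 8 online CPUs, pg_max_threads=2 gives 2 threads, while 0 or 16 both leave the count at 8.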



-- 
Luiz Fernando N. Capitulino


Simon Kelley: Re: Linux BSD sockets.

2006-03-12 Thread Michael Richardson

(I won't forward more stuff unless asked to)

---BeginMessage---
Michael Richardson wrote:
 
 
Simon == Simon Kelley [EMAIL PROTECTED] writes:
 
 Simon The alternative to LPF, which is adopted by most clients
 Simon (udhcpc, dhcpcd, diamond), is AF_PACKET sockets. A packet
 Simon socket gives you complete IP packets, with IP header. The
 Simon best solution seems to be to use a packet socket until bound,
 Simon and then switch to an AF_INET socket.
 
   Why use two mechanisms if the first will do?
 
   I'll answer my question a bit --- because something else might come
 along and cause the LPF or AF_PACKET socket to become unusable. For
 instance, anyone who decides to IPsec encrypt their wireless winds up
 having to make dhcp exceptions for the renewals.

Another reason: you either have to use a packet-filter on the AF_PACKET
socket (which gets you back to needing the packet filter in the kernel,
and is arguably ugly) or your DHCP client gets every IP packet the host
ever receives. Better, once established, to close the packet socket and
have the client sleep except when a packet arrives on port 67.

 
 Simon No-one else has ever achieved this, as far as I know.
 
 Simon If you find a way, I'd be very interested.
 
   I think it's a kernel bug. I think that a way ought to be created to
 make this work.
 

OK, let's try and enumerate what we're asking for here:

Client first.

To make the client work, we need to be able to receive broadcasts sent
to 255.255.255.255 even on an interface which doesn't have an IP
address. We need to be able to send UDP packets from an interface which
doesn't have an IP address, and the source address of those packets
needs to be set to 0.0.0.0 (that's conventional; I'm not sure it's a MUST).
These packets have to be able to be broadcast and have destination
address 255.255.255.255.

Actually, I'm not sure the send part doesn't work. What happens if one
uses sendmsg and IP_PKTINFO to force the source address to 0.0.0.0?
Allowing that to work shouldn't be a big deal.

For both send and receive, we'd like SO_BINDTODEVICE to work. This is
not essential, since the interface can be specified for send and receive
using the ipi_ifindex field of the pktinfo struct. Having it would make
one-daemon-to-one-interface easier to do.

Now the server.

This is easier: all we need to be able to do is send to a host which can't
reply to ARP requests, but for which we know the MAC address and the
interface by which it is reachable. The ARP-injection trick needs to be
investigated for this; it looks promising. We also need to be able to
send broadcasts with the destination address set as 255.255.255.255, not
the net broadcast. I have a feeling that that used to be problematic, but
may not be now. If that works, and ARP-injection works, there are no
kernel fixes needed for the server.

Anybody have anything to add to the shopping list?


Cheers,

Simon.
---End Message---
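
The sendmsg/IP_PKTINFO idea raised in the message above can be sketched as follows. This only builds the ancillary data that selects the outgoing interface (ipi_ifindex) and the source address (ipi_spec_dst); whether the kernel accepts a 0.0.0.0 source on an unaddressed interface is exactly the open question in the thread. demo_set_pktinfo() is a hypothetical helper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Prepares msg so that a later sendmsg() requests interface ifindex and
 * source address src (network byte order). cbuf must be suitably aligned
 * and hold at least CMSG_SPACE(sizeof(struct in_pktinfo)) bytes. */
static void demo_set_pktinfo(struct msghdr *msg, void *cbuf, size_t cbuflen,
			     int ifindex, in_addr_t src)
{
	struct in_pktinfo pi;
	struct cmsghdr *cm;

	assert(msg != NULL && cbuf != NULL);
	assert(cbuflen >= CMSG_SPACE(sizeof(pi)));

	memset(cbuf, 0, cbuflen);
	msg->msg_control = cbuf;
	msg->msg_controllen = CMSG_SPACE(sizeof(pi));

	cm = CMSG_FIRSTHDR(msg);
	cm->cmsg_level = IPPROTO_IP;
	cm->cmsg_type = IP_PKTINFO;
	cm->cmsg_len = CMSG_LEN(sizeof(pi));

	memset(&pi, 0, sizeof(pi));
	pi.ipi_ifindex = ifindex;	/* outgoing interface */
	pi.ipi_spec_dst.s_addr = src;	/* requested source address */
	memcpy(CMSG_DATA(cm), &pi, sizeof(pi));
}
```

The caller still has to fill msg_name with the destination (for DHCP, 255.255.255.255 port 67) and msg_iov with the payload before calling sendmsg().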


Re: David W. Hankins: Linux BSD sockets.

2006-03-12 Thread Stefan Rompf
On Sunday 12 March 2006 17:50, Michael Richardson wrote:

 I recently got a bug in my ear to try and get multiple-dhclients to
 work on separate interfaces under Linux (a one-daemon-per-interface
 model).  Could use some advice from the linux masters (Jason, Andrew?)
 if you can spare some time for me.

 Another motivation is trying to kill our LPF support (switch to BSD
 sockets or similar - so all we have to do is DHCP).

Forget it. With my dhcpclient, I tried avoiding the raw socket first too, but 
both sending and receiving packets through the IP stack was not entirely 
successful.

You've already stumbled over issues receiving. At least, the bootp broadcast 
flag needs to be set because a packet destined to the card's MAC address, but 
not IP address is a target for forwarding. Even with broadcast, you can hit 
reverse path validation, you've seen the martian source logs. This can be 
configured with /proc/sys/net/ipv4/conf/*/rp_filter, but that's something a 
DHCP client should not mess with.

Sending is just as problematic. You cannot send IP packets from an interface 
without a legal address - the kernel will either use the address of another 
interface or just fail (funny thing learned: ifconfig ... 0.0.0.0 is 
different from ip addr dev ... flush).

I did not report these as bugs as I don't expect an IP stack to be able to 
handle an interface without a valid address ;-) Now I use an AF_PACKET/SOCK_DGRAM 
socket that hides layer 2 for receiving and sending broadcasts, and a UDP 
socket for unicasts, with SO_BINDTODEVICE. The latter seems to fail with 
multiple clients; it seems binding to a device is not sufficient.

 Chiefly, to only construct the fallback interface once at least one
 interface assigned to the daemon's purview has achieved BOUND state,
 and to bind() the socket to the address assigned to that interface
 (which may mean having multiple 'fallback interfaces' if a daemon is
 being asked to process multiple physical interfaces).

Yes, that should work. dhcpclient takes the easy way out and sends its 
unicasts as broadcasts over the raw socket if it cannot open the UDP socket 
and bind it to a device. Something to fix for one of the next snapshots ;-)

Stefan
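
To make the AF_PACKET/SOCK_DGRAM approach mentioned above more concrete, here is a sketch of the link-layer destination address such a socket would use for an Ethernet broadcast. demo_fill_broadcast_ll() is a hypothetical helper and not taken from dhcpclient; opening the packet socket itself needs privileges and is left out.

```c
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <linux/if_packet.h>
#include <linux/if_ether.h>

/* Fills ll as the destination for an IPv4 link-layer broadcast on the
 * interface with index ifindex. With a SOCK_DGRAM packet socket the
 * kernel builds the Ethernet header from these fields, so the caller
 * supplies only the IP packet. */
static void demo_fill_broadcast_ll(struct sockaddr_ll *ll, int ifindex)
{
	assert(ll != NULL && ifindex > 0);

	memset(ll, 0, sizeof(*ll));
	ll->sll_family = AF_PACKET;
	ll->sll_protocol = htons(ETH_P_IP);
	ll->sll_ifindex = ifindex;
	ll->sll_halen = ETH_ALEN;
	memset(ll->sll_addr, 0xff, ETH_ALEN);	/* ff:ff:ff:ff:ff:ff */
}
```

The filled structure would be passed as the address argument of sendto() on a socket created with socket(AF_PACKET, SOCK_DGRAM, htons(ETH_P_IP)); the interface index can be obtained with if_nametoindex().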


[patch 2/4] natsemi: Support oversized EEPROMs

2006-03-12 Thread Mark Brown
The natsemi chip can have a larger EEPROM attached than it itself uses
for configuration.  This patch adds support for user space access
to such an EEPROM.

Signed-off-by: Mark Brown [EMAIL PROTECTED]

Index: natsemi-queue/drivers/net/natsemi.c
===
--- natsemi-queue.orig/drivers/net/natsemi.c	2006-02-25 17:40:15.0 +
+++ natsemi-queue/drivers/net/natsemi.c	2006-02-25 17:40:39.0 +
@@ -226,7 +226,7 @@ static int full_duplex[MAX_UNITS];
 				 NATSEMI_PG1_NREGS)
 #define NATSEMI_REGS_VER	1 /* v1 added RFDR registers */
 #define NATSEMI_REGS_SIZE	(NATSEMI_NREGS * sizeof(u32))
-#define NATSEMI_EEPROM_SIZE	24 /* 12 16-bit values */
+#define NATSEMI_DEF_EEPROM_SIZE	24 /* 12 16-bit values */
 
 /* Buffer sizes:
  * The nic writes 32-bit values, even if the upper bytes of
@@ -716,6 +716,8 @@ struct netdev_private {
 	unsigned int iosize;
 	spinlock_t lock;
 	u32 msg_enable;
+	/* EEPROM data */
+	int eeprom_size;
 };
 
 static void move_int_phy(struct net_device *dev, int addr);
@@ -892,6 +894,7 @@ static int __devinit natsemi_probe1 (str
 	np->msg_enable = (debug >= 0) ? (1<<debug)-1 : NATSEMI_DEF_MSG;
 	np->hands_off = 0;
 	np->intr_status = 0;
+	np->eeprom_size = NATSEMI_DEF_EEPROM_SIZE;
 
 	option = find_cnt < MAX_UNITS ? options[find_cnt] : 0;
 	if (dev->mem_start)
@@ -2601,7 +2604,8 @@ static int get_regs_len(struct net_devic
 
 static int get_eeprom_len(struct net_device *dev)
 {
-	return NATSEMI_EEPROM_SIZE;
+	struct netdev_private *np = netdev_priv(dev);
+	return np->eeprom_size;
 }
 
 static int get_settings(struct net_device *dev, struct ethtool_cmd *ecmd)
@@ -2688,15 +2692,20 @@ static u32 get_link(struct net_device *d
 static int get_eeprom(struct net_device *dev, struct ethtool_eeprom *eeprom, u8 *data)
 {
 	struct netdev_private *np = netdev_priv(dev);
-	u8 eebuf[NATSEMI_EEPROM_SIZE];
+	u8 *eebuf;
 	int res;
 
+	eebuf = kmalloc(np->eeprom_size, GFP_KERNEL);
+	if (!eebuf)
+		return -ENOMEM;
+
 	eeprom->magic = PCI_VENDOR_ID_NS | (PCI_DEVICE_ID_NS_83815<<16);
 	spin_lock_irq(&np->lock);
 	res = netdev_get_eeprom(dev, eebuf);
 	spin_unlock_irq(&np->lock);
 	if (!res)
 		memcpy(data, eebuf+eeprom->offset, eeprom->len);
+	kfree(eebuf);
 	return res;
 }
 
@@ -3062,9 +3071,10 @@ static int netdev_get_eeprom(struct net_
 	int i;
 	u16 *ebuf = (u16 *)buf;
 	void __iomem * ioaddr = ns_ioaddr(dev);
+	struct netdev_private *np = netdev_priv(dev);
 
 	/* eeprom_read reads 16 bits, and indexes by 16 bits */
-	for (i = 0; i < NATSEMI_EEPROM_SIZE/2; i++) {
+	for (i = 0; i < np->eeprom_size/2; i++) {
 		ebuf[i] = eeprom_read(ioaddr, i);
 		/* The EEPROM itself stores data bit-swapped, but eeprom_read
 		 * reads it back sanely. So we swap it back here in order to
--
You grabbed my hand and we fell into it, like a daydream - or a fever.
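
The change from a fixed-size stack buffer to a buffer sized from per-device state can be sketched in user space as below. This is an illustration only: struct demo_dev and demo_get_eeprom() are hypothetical, malloc() stands in for kmalloc(), and the explicit range check is an addition of this sketch that does not appear in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical device: the EEPROM size is per-device, not a constant. */
struct demo_dev {
	size_t eeprom_size;		/* in bytes, assumed even */
	const uint16_t *eeprom_words;	/* stands in for eeprom_read() */
};

/* Copies len bytes starting at offset into data. Returns 0 on success,
 * -EINVAL for an out-of-range request, or -ENOMEM. */
static int demo_get_eeprom(const struct demo_dev *dev, size_t offset,
			   size_t len, uint8_t *data)
{
	uint16_t *buf;
	size_t i;

	assert(dev != NULL && data != NULL);
	if (offset > dev->eeprom_size || len > dev->eeprom_size - offset)
		return -EINVAL;

	/* Heap buffer sized at run time instead of a fixed stack array. */
	buf = malloc(dev->eeprom_size);
	if (!buf)
		return -ENOMEM;

	/* Read the whole EEPROM as 16-bit words, then copy the window. */
	for (i = 0; i < dev->eeprom_size / 2; i++)
		buf[i] = dev->eeprom_words[i];
	memcpy(data, (uint8_t *)buf + offset, len);

	free(buf);
	return 0;
}
```

Reading the full EEPROM for every request mirrors the structure of the driver code above; a driver for a large EEPROM might prefer to read only the requested words.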


[patch 4/4] natsemi: Add quirks for Aculab E1/T1 PMXc cPCI carrier cards

2006-03-12 Thread Mark Brown
Aculab E1/T1 PMXc cPCI carrier cards present a natsemi on the cPCI
bus wired up in a non-standard fashion.  This patch provides support in
the natsemi driver for these cards by implementing a quirk mechanism and
using that to configure appropriate settings for the card: forcing 100M
full duplex, having a large EEPROM and using the MII port while ignoring
PHYs.

Signed-off-by: Mark Brown [EMAIL PROTECTED]

Index: natsemi-queue/drivers/net/natsemi.c
===
--- natsemi-queue.orig/drivers/net/natsemi.c	2006-02-25 17:41:59.0 +
+++ natsemi-queue/drivers/net/natsemi.c	2006-03-08 21:44:12.0 +
@@ -226,7 +226,6 @@ static int full_duplex[MAX_UNITS];
 				 NATSEMI_PG1_NREGS)
 #define NATSEMI_REGS_VER	1 /* v1 added RFDR registers */
 #define NATSEMI_REGS_SIZE	(NATSEMI_NREGS * sizeof(u32))
-#define NATSEMI_DEF_EEPROM_SIZE	24 /* 12 16-bit values */
 
 /* Buffer sizes:
  * The nic writes 32-bit values, even if the upper bytes of
@@ -344,12 +343,14 @@ None characterised.
 
 
 
-enum pcistuff {
+enum natsemi_quirks {
 	PCI_USES_IO = 0x01,
 	PCI_USES_MEM = 0x02,
 	PCI_USES_MASTER = 0x04,
 	PCI_ADDR0 = 0x08,
 	PCI_ADDR1 = 0x10,
+	MEDIA_FORCE_100FD = 0x20,
+	MEDIA_IGNORE_PHY = 0x40,
 };
 
 /* MMIO operations required */
@@ -367,17 +368,21 @@ enum pcistuff {
 #define MII_FX_SEL	0x0001	/* 100BASE-FX (fiber) */
 #define MII_EN_SCRM	0x0004	/* enable scrambler (tp) */
 
- 
 /* array of board data directly indexed by pci_tbl[x].driver_data */
 static const struct {
 	const char *name;
 	unsigned long flags;
+	int quirks;
+	int eeprom_size;
 } natsemi_pci_info[] __devinitdata = {
-	{ "NatSemi DP8381[56]", PCI_IOTYPE },
+	{ "NatSemi DP8381[56]", PCI_IOTYPE, 0, 24 },
+	{ "Aculab E1/T1 PMXc cPCI carrier card", PCI_IOTYPE,
+	  MEDIA_FORCE_100FD | MEDIA_IGNORE_PHY, 128 },
 };
 
 static struct pci_device_id natsemi_pci_tbl[] = {
-	{ PCI_VENDOR_ID_NS, PCI_DEVICE_ID_NS_83815, PCI_ANY_ID, PCI_ANY_ID, },
+	{ PCI_VENDOR_ID_NS, PCI_DEVICE_ID_NS_83815, PCI_VENDOR_ID_ACULAB, PCI_SUBDEVICE_ID_ACULAB_174, 0, 0, 1 },
+	{ PCI_VENDOR_ID_NS, PCI_DEVICE_ID_NS_83815, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
 	{ 0, },
 };
 MODULE_DEVICE_TABLE(pci, natsemi_pci_tbl);
@@ -815,6 +820,39 @@ static void move_int_phy(struct net_devi
 	udelay(1);
 }
 
+static void __devinit natsemi_init_media (struct net_device *dev)
+{
+	struct netdev_private *np = netdev_priv(dev);
+	u32 tmp;
+
+	tmp            = mdio_read(dev, MII_BMCR);
+	np->speed      = (tmp & BMCR_SPEED100)? SPEED_100     : SPEED_10;
+	np->duplex     = (tmp & BMCR_FULLDPLX)? DUPLEX_FULL   : DUPLEX_HALF;
+	np->autoneg    = (tmp & BMCR_ANENABLE)? AUTONEG_ENABLE: AUTONEG_DISABLE;
+	np->advertising= mdio_read(dev, MII_ADVERTISE);
+
+	if ((np->advertising & ADVERTISE_ALL) != ADVERTISE_ALL
+	 && netif_msg_probe(np)) {
+		printk(KERN_INFO "natsemi %s: Transceiver default autonegotiation %s "
+			"10%s %s duplex.\n",
+			pci_name(np->pci_dev),
+			(mdio_read(dev, MII_BMCR) & BMCR_ANENABLE)?
+			  "enabled, advertise" : "disabled, force",
+			(np->advertising &
+			  (ADVERTISE_100FULL|ADVERTISE_100HALF))?
+			    "0" : "",
+			(np->advertising &
+			  (ADVERTISE_100FULL|ADVERTISE_10FULL))?
+			    "full" : "half");
+	}
+	if (netif_msg_probe(np))
+		printk(KERN_INFO
+			"natsemi %s: Transceiver status %#04x advertising %#04x.\n",
+			pci_name(np->pci_dev), mdio_read(dev, MII_BMSR),
+			np->advertising);
+
+}
+
 static int __devinit natsemi_probe1 (struct pci_dev *pdev,
 	const struct pci_device_id *ent)
 {
@@ -894,17 +932,21 @@ static int __devinit natsemi_probe1 (str
 	np->msg_enable = (debug >= 0) ? (1<<debug)-1 : NATSEMI_DEF_MSG;
 	np->hands_off = 0;
 	np->intr_status = 0;
-	np->eeprom_size = NATSEMI_DEF_EEPROM_SIZE;
+	np->eeprom_size = natsemi_pci_info[chip_idx].eeprom_size;
 
 	option = find_cnt < MAX_UNITS ? options[find_cnt] : 0;
 	if (dev->mem_start)
 		option = dev->mem_start;
 
 	/* Ignore the PHY status? */
-	if (option & 0x400) {
+	if (natsemi_pci_info[chip_idx].quirks & MEDIA_IGNORE_PHY) {
 		np->ignore_phy = 1;
 	} else {
-		np->ignore_phy = 0;
+		if (option & 0x400) {
+			np->ignore_phy = 1;
+		} else {
+			np->ignore_phy = 0;
+		}
 	}
 
 	/* Initial port:
@@ -936,6 +978,12 @@ static int __devinit natsemi_probe1 (str

[patch 0/4] natsemi: Aculab E1/T1 PMXc Carrier Card support

2006-03-12 Thread Mark Brown
This patch series against the upstream branch of netdev-2.6 adds support
for these boards to the natsemi driver.  It implements some new
functionality required by the boards and enables the appropriate
settings when such a board is detected.

--
You grabbed my hand and we fell into it, like a daydream - or a fever.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[patch 1/4] natsemi: Add support for using MII port with no PHY

2006-03-12 Thread Mark Brown
This patch provides a module option which configures the natsemi driver
to use the external MII port on the chip but ignore any PHYs that may be
attached to it.  The link state will be left as it was when the driver
started and can be configured via ethtool.  Any PHYs that are present
can be accessed via the MII ioctl()s.

This is useful for systems where the device is connected without a PHY
or where either information or actions outside the scope of the driver
are required in order to use the PHYs.
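
For readers who want to see how the option word is interpreted, here is a minimal userspace sketch. The 0x400 test and the "lower four bits are the media type" rule come from the patch itself; treating 0x200 as the full-duplex bit is an assumption based on the existing driver, and `struct decoded_option` and `decode_option()` are hypothetical names used only for illustration.

```c
#include <stdbool.h>

/* Bit values as tested in natsemi_probe1() in this patch. */
#define OPT_MEDIA_MASK   0x00f  /* lower four bits: media type */
#define OPT_FULL_DUPLEX  0x200  /* assumed meaning of (option & 0x200) */
#define OPT_IGNORE_PHY   0x400  /* tested as (option & 0x400) */

/* Hypothetical holder for the decoded settings; not a driver structure. */
struct decoded_option {
	unsigned int media_type;
	bool full_duplex;
	bool ignore_phy;
};

/* Split one per-device option word into its three fields. */
static struct decoded_option decode_option(unsigned int option)
{
	struct decoded_option d;

	d.media_type  = option & OPT_MEDIA_MASK;
	d.full_duplex = (option & OPT_FULL_DUPLEX) != 0;
	d.ignore_phy  = (option & OPT_IGNORE_PHY) != 0;
	return d;
}
```

With this decoding, an option value of 0x400 would select "ignore PHY" with no forced media type, which is presumably what a user of the new module option would pass.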

Signed-Off-By: Mark Brown [EMAIL PROTECTED]

Index: natsemi-queue/drivers/net/natsemi.c
===
--- natsemi-queue.orig/drivers/net/natsemi.c2006-02-25 13:38:34.0 
+
+++ natsemi-queue/drivers/net/natsemi.c 2006-02-25 13:50:51.0 +
@@ -259,7 +259,7 @@ MODULE_PARM_DESC(debug, "DP8381x default
 MODULE_PARM_DESC(rx_copybreak,
 	"DP8381x copy breakpoint for copy-only-tiny-frames");
 MODULE_PARM_DESC(options,
-	"DP8381x: Bits 0-3: media type, bit 17: full duplex");
+	"DP8381x: Bits 0-3: media type, bit 17: full duplex, bit 18: ignore PHY");
 MODULE_PARM_DESC(full_duplex, "DP8381x full duplex setting(s) (1)");
 
 /*
@@ -690,6 +690,8 @@ struct netdev_private {
 	u32 intr_status;
 	/* Do not touch the nic registers */
 	int hands_off;
+	/* Don't pay attention to the reported link state. */
+	int ignore_phy;
 	/* external phy that is used: only valid if dev->if_port != PORT_TP */
 	int mii;
 	int phy_addr_external;
@@ -891,7 +893,19 @@ static int __devinit natsemi_probe1 (str
 	np->hands_off = 0;
 	np->intr_status = 0;
 
+	option = find_cnt < MAX_UNITS ? options[find_cnt] : 0;
+	if (dev->mem_start)
+		option = dev->mem_start;
+
+	/* Ignore the PHY status? */
+	if (option & 0x400) {
+		np->ignore_phy = 1;
+	} else {
+		np->ignore_phy = 0;
+	}
+
 	/* Initial port:
+	 * - If configured to ignore the PHY set up for external.
 	 * - If the nic was configured to use an external phy and if find_mii
 	 *   finds a phy: use external port, first phy that replies.
 	 * - Otherwise: internal port.
@@ -899,7 +913,7 @@ static int __devinit natsemi_probe1 (str
 	 * The address would be used to access a phy over the mii bus, but
 	 * the internal phy is accessed through mapped registers.
 	 */
-	if (readl(ioaddr + ChipConfig) & CfgExtPhy)
+	if (np->ignore_phy || readl(ioaddr + ChipConfig) & CfgExtPhy)
 		dev->if_port = PORT_MII;
 	else
 		dev->if_port = PORT_TP;
@@ -909,7 +923,9 @@ static int __devinit natsemi_probe1 (str
 
 	if (dev->if_port != PORT_TP) {
 		np->phy_addr_external = find_mii(dev);
-		if (np->phy_addr_external == PHY_ADDR_NONE) {
+		/* If we're ignoring the PHY it doesn't matter if we can't
+		 * find one. */
+		if (!np->ignore_phy && np->phy_addr_external == PHY_ADDR_NONE) {
 			dev->if_port = PORT_TP;
 			np->phy_addr_external = PHY_ADDR_INTERNAL;
 		}
@@ -917,10 +933,6 @@ static int __devinit natsemi_probe1 (str
 		np->phy_addr_external = PHY_ADDR_INTERNAL;
 	}
 
-	option = find_cnt < MAX_UNITS ? options[find_cnt] : 0;
-	if (dev->mem_start)
-		option = dev->mem_start;
-
 	/* The lower four bits are the media type. */
 	if (option) {
 		if (option & 0x200)
@@ -954,7 +966,10 @@ static int __devinit natsemi_probe1 (str
 	if (mtu)
 		dev->mtu = mtu;
 
-	netif_carrier_off(dev);
+	if (np->ignore_phy)
+		netif_carrier_on(dev);
+	else
+		netif_carrier_off(dev);
 
 	/* get the initial settings from hardware */
 	tmp            = mdio_read(dev, MII_BMCR);
@@ -1002,6 +1017,8 @@ static int __devinit natsemi_probe1 (str
 		printk("%02x, IRQ %d", dev->dev_addr[i], irq);
 		if (dev->if_port == PORT_TP)
 			printk(", port TP.\n");
+		else if (np->ignore_phy)
+			printk(", port MII, ignoring PHY\n");
 		else
 			printk(", port MII, phy ad %d.\n", np->phy_addr_external);
 	}
@@ -1682,42 +1699,44 @@ static void check_link(struct net_device
 {
 	struct netdev_private *np = netdev_priv(dev);
 	void __iomem * ioaddr = ns_ioaddr(dev);
-	int duplex;
+	int duplex = np->full_duplex;
 	u16 bmsr;
-	
-	/* The link status field is latched: it remains low after a temporary
-	 * link failure until it's read. We need the current link status,
-	 * thus read twice.
-	 */
-	mdio_read(dev, MII_BMSR);
-	bmsr = mdio_read(dev, MII_BMSR);
 
-	if (!(bmsr & BMSR_LSTATUS)) {
-		if (netif_carrier_ok(dev)) {
+   

Possible bug in PFKEY implementation...

2006-03-12 Thread Stjepan Gros
Hi!

The setkey command behaves strangely when the SPD is large, either because
I'm doing something wrong or because there is a bug. I believe it's a bug,
but who knows... Anyway, after 529 items it simply stops displaying
items from the SPD with the message

recv: Resource temporarily unavailable

This occurs on FC4, Rawhide, Debian and Ubuntu, in other words, it seems
it's not distribution specific.

It doesn't seem to be setkey's bug, because the same behavior shows up
with an IKEv2 daemon during SPD scans. Also, the ip command successfully
displays the complete database. I first thought the bug was in PFKEY, but
pfkey_spddump is quite simple and some primitive debugging with printk's
showed that this part seems OK, so the bug might be in xfrm_policy_walk.

Reproducing the behavior is quite simple. I attached a simple Perl
script that populates the SPD with a large number of items. After the SPD
is populated, just running 'setkey -DP' yields the error message above.
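
For anyone without the attachment, a rough stand-in can be sketched in C. This is not the attached Perl script: the addresses, the policy string and the function name `write_spd_entries()` are made up for illustration, and the output is only assumed to be fed to `setkey -c` on standard input.

```c
#include <stdio.h>

/*
 * Write `count` distinct spdadd statements in setkey(8) syntax to `out`.
 * Addresses are invented (10.0.x.y -> 10.1.x.y); the only goal is to
 * produce a large SPD. Returns the number of spdadd statements written.
 */
static int write_spd_entries(FILE *out, int count)
{
	int i, written = 0;

	fprintf(out, "spdflush;\n");
	for (i = 0; i < count && i < 65536; i++) {
		int hi = (i >> 8) & 0xff;   /* third octet */
		int lo = i & 0xff;          /* fourth octet */

		fprintf(out,
			"spdadd 10.0.%d.%d/32 10.1.%d.%d/32 any "
			"-P out ipsec esp/transport//require;\n",
			hi, lo, hi, lo);
		written++;
	}
	return written;
}
```

Calling `write_spd_entries(stdout, 1000)` from a small main() and piping the result into `setkey -c` should create well over 529 policies, after which 'setkey -DP' can be tried.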

Stjepan Gros

P.S. One other question, maybe off topic: is there any reason why the
ip command doesn't display a policy's ID?


set_policy_responder.pl
Description: Perl program


signature.asc
Description: This is a digitally signed message part


FW: Router does not route when changing MAC Addresses

2006-03-12 Thread Greg Scott
It was suggested I post this here.  I sent the original netdev posting
to an incorrect email address.  I have also tried turning off rp_filter
on both interfaces.  arp_filter is already turned off.   Putting any of
the router interfaces into promiscuous mode makes no difference.  

- Greg Scott

__ 
From:   Greg Scott  
Sent:   Sunday, March 12, 2006 1:57 PM
To: '[EMAIL PROTECTED]'
Subject:Router does not route when changing MAC Addresses

Hello - I spent hours and hours and hours looking for documentation and
archives around this but did not find anything.  

I have a Linux router and I need the ability to swap hardware without
causing downtime.  The problem, of course, is ARPs.  The NICs in the
replacement system need the same MAC Addresses as the NICs in the
original system.  I'd like this all to be in the kernel and not depend
on a daemon process that can die.

How to change MAC addresses is documented well enough - and it works -
but when I change MAC addresses, my router stops routing.  Systems on
both sides can see the router but the router refuses to forward packets.


Here are my little test scripts to change MAC Addresses.

First - ip-fudge-mac.sh
[EMAIL PROTECTED] gregs]# more ip-fudge-mac.sh 
ip link set eth0 down 
ip link set eth0 address 01:02:03:04:05:06 
ip link set eth0 up

ip link set eth1 down
ip link set eth1 address 17:20:16:01:60:03 
ip link set eth1 up

echo 1 > /proc/sys/net/ipv4/ip_forward



Now original-mac.sh

[EMAIL PROTECTED] gregs]# more original-mac.sh 
ifdown eth0 
ifconfig eth0 hw ether 00:c1:28:01:d8:07 
ifup eth0

ifdown eth1
ifconfig eth1 hw ether 00:50:da:90:e4:aa 
ifup eth1

echo 1 > /proc/sys/net/ipv4/ip_forward

I have systems both on the left and right side of my test router - eth1
on the left, eth0 on the right.  Below is some output from the router
with tcpdump showing what happens.  I replaced the first 3 real public
IP Address octets with 1.2.3.  

The first set of tcpdump records shows it forwarding with the original
hardware MAC Addresses in place.  We see round trips from the left side
to the right side and back with echo request and reply packets.

The second set shows what happens after changing to fudged MAC
Addresses.  We only see echo request packets come in on the left side -
but no echo reply packets.  At the bottom, you can see that eth0 - the
right side NIC - is completely quiet.   So the echo request packets are
dying somewhere inside my test router.

And the second I go back to the hardware MAC Addresses, the router
forwards packets again.  

Packet forwarding must somehow depend on MAC Addresses but I cannot find
anything anywhere that tells me how this works.  
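
One property of a MAC address that may be worth checking here, offered as a sketch and not as a confirmed diagnosis: the least-significant bit of the first octet marks a group (multicast) address. The fudged addresses in ip-fudge-mac.sh start with 0x01 and 0x17, both of which have that bit set, while the hardware addresses start with 0x00. The helper names `parse_mac()` and `mac_is_multicast()` are hypothetical.

```c
#include <stdio.h>
#include <stdbool.h>

/* Parse "aa:bb:cc:dd:ee:ff" into six bytes; returns true on success. */
static bool parse_mac(const char *text, unsigned char mac[6])
{
	unsigned int b[6];
	char trailing;
	int i;

	/* The final %c catches trailing garbage: exactly 6 fields must match. */
	if (sscanf(text, "%2x:%2x:%2x:%2x:%2x:%2x%c",
		   &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &trailing) != 6)
		return false;
	for (i = 0; i < 6; i++)
		mac[i] = (unsigned char)b[i];
	return true;
}

/* The least-significant bit of the first octet marks a group address. */
static bool mac_is_multicast(const unsigned char mac[6])
{
	return (mac[0] & 0x01) != 0;
}
```

Whether the group bit is what stops forwarding in this setup is an assumption that would need to be confirmed, for example by repeating the test with a fudged address whose first octet is even.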

I reproduced this problem on at least two different Linux routers - one
running 2.4.27, the other running 2.6.11-1.  Am I asking the kernel to
do something bad?  What would it take to put together a patch to change
this behavior?


[EMAIL PROTECTED] gregs]# ./original-mac.sh
[EMAIL PROTECTED] gregs]# /usr/sbin/tcpdump -i eth1 -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
17:14:51.010439 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 479
17:14:51.010537 IP 1.2.3.49 > 172.16.16.1: icmp 64: echo reply seq 479
17:14:52.010448 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 480
17:14:52.010621 IP 1.2.3.49 > 172.16.16.1: icmp 64: echo reply seq 480
17:14:53.010531 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 481
17:14:53.010696 IP 1.2.3.49 > 172.16.16.1: icmp 64: echo reply seq 481
17:14:54.010716 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 482
17:14:54.010882 IP 1.2.3.49 > 172.16.16.1: icmp 64: echo reply seq 482

8 packets captured
8 packets received by filter
0 packets dropped by kernel
[EMAIL PROTECTED] gregs]# ./ip-fudge-mac.sh
[EMAIL PROTECTED] gregs]# /usr/sbin/tcpdump -i eth1 -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
17:15:10.031945 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 498
17:15:11.031980 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 499
17:15:11.806487 fe80::1520:16ff:fe01:6003 > ff02::2: icmp6: router solicitation
17:15:12.032062 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 500
17:15:13.032154 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 501
17:15:14.03 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 502
17:15:15.032305 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 503
17:15:15.805873 fe80::1520:16ff:fe01:6003 > ff02::2: icmp6: router solicitation
17:15:16.032394 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 504
17:15:17.032465 IP 172.16.16.1 > 1.2.3.49: icmp 64: echo request seq 505

10 packets captured
10 packets received by filter
0 packets dropped by kernel
[EMAIL PROTECTED] gregs]#
[EMAIL PROTECTED] gregs]# /usr/sbin/tcpdump -i eth0 -n
tcpdump: 

[NETLINK 04/07]: Fix use-after-free in netlink_recvmsg

2006-03-12 Thread Patrick McHardy
[NETLINK]: Fix use-after-free in netlink_recvmsg

The skb given to netlink_cmsg_recv_pktinfo is already freed, move it up
a few lines.
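
The ordering rule behind this fix can be shown with a small userspace analogue: read everything you need from a buffer before releasing it. `struct packet` and `receive_ifindex()` are hypothetical stand-ins, not the kernel's sk_buff or netlink_recvmsg().

```c
#include <stdlib.h>

/* Hypothetical stand-in for a received buffer; not the kernel's sk_buff. */
struct packet {
	size_t len;
	int    ifindex;
};

/* Copy out everything that is needed, then release the buffer. */
static int receive_ifindex(struct packet *pkt, int want_pktinfo,
			   int *ifindex_out)
{
	int copied = (int)pkt->len;

	/* Read from pkt while it is still valid ... */
	if (want_pktinfo)
		*ifindex_out = pkt->ifindex;

	/* ... and only then free it. pkt must not be touched after this. */
	free(pkt);
	return copied;
}
```

Swapping the `free()` and the `ifindex` read reproduces the shape of the bug: the code usually appears to work, because freed memory often still holds the old contents.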

Coverity #948

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 6e745f19b2d9704b08cc5a3d9476b252bf86b46f
tree 20b4276a4a7ffdaf8b4ec4d52e1fa35aa34d850f
parent 901a2a6eb676baea9392e47f16f7e0a0219b7ba5
author Patrick McHardy [EMAIL PROTECTED] Mon, 13 Mar 2006 00:06:57 +0100
committer Patrick McHardy [EMAIL PROTECTED] Mon, 13 Mar 2006 00:06:57 +0100

 net/netlink/af_netlink.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 6b9772d..59dc7d1 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1194,6 +1194,9 @@ static int netlink_recvmsg(struct kiocb 
 		msg->msg_namelen = sizeof(*addr);
 	}
 
+	if (nlk->flags & NETLINK_RECV_PKTINFO)
+		netlink_cmsg_recv_pktinfo(msg, skb);
+
 	if (NULL == siocb->scm) {
 		memset(&scm, 0, sizeof(scm));
 		siocb->scm = &scm;
@@ -1205,8 +1208,6 @@ static int netlink_recvmsg(struct kiocb 
 		netlink_dump(sk);
 
 	scm_recv(sock, msg, siocb->scm, flags);
-	if (nlk->flags & NETLINK_RECV_PKTINFO)
-		netlink_cmsg_recv_pktinfo(msg, skb);
 
 out:
 	netlink_rcv_wake(sk);


[XFRM 03/07]: Fix leak in ah6_input

2006-03-12 Thread Patrick McHardy
[XFRM]: Fix leak in ah6_input

tmp_hdr is not freed when ipv6_clear_mutable_options fails.
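
The fix is an instance of the usual layered goto-cleanup idiom: each failure jumps to the label that releases exactly what has been allocated so far. A minimal userspace sketch, where `clear_options()` and `process_header()` are hypothetical stand-ins for the kernel functions:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical validation step standing in for a call that can fail. */
static int clear_options(unsigned char *hdr, size_t len)
{
	return (len == 0 || hdr[0] == 0xff) ? -1 : 0;
}

static int process_header(unsigned char *hdr, size_t len)
{
	int err = -1;
	unsigned char *saved = malloc(len ? len : 1);

	if (!saved)
		goto out;              /* nothing allocated yet */
	memcpy(saved, hdr, len);

	if (clear_options(hdr, len))
		goto free_out;         /* saved is live: it must be released */

	/* ... work on hdr, then restore the original bytes ... */
	memcpy(hdr, saved, len);
	err = 0;

free_out:
	free(saved);
out:
	return err;
}
```

Jumping to `out` from the second failure point would skip the `free()` and leak `saved`, which is the same mistake the one-line patch corrects.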

Coverity #650

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 901a2a6eb676baea9392e47f16f7e0a0219b7ba5
tree 0424294ae2d79986e07e7751025e763c6b9ea5d7
parent 9275d31561bc3268ae997dbff36eed84afe80516
author Patrick McHardy [EMAIL PROTECTED] Mon, 13 Mar 2006 00:04:54 +0100
committer Patrick McHardy [EMAIL PROTECTED] Mon, 13 Mar 2006 00:04:54 +0100

 net/ipv6/ah6.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv6/ah6.c b/net/ipv6/ah6.c
index c7932cb..8496374 100644
--- a/net/ipv6/ah6.c
+++ b/net/ipv6/ah6.c
@@ -279,7 +279,7 @@ static int ah6_input(struct xfrm_state *
 		goto out;
 	memcpy(tmp_hdr, skb->nh.raw, hdr_len);
 	if (ipv6_clear_mutable_options(skb->nh.ipv6h, hdr_len))
-		goto out;
+		goto free_out;
 	skb->nh.ipv6h->priority    = 0;
 	skb->nh.ipv6h->flow_lbl[0] = 0;
 	skb->nh.ipv6h->flow_lbl[1] = 0;


[NETFILTER 01/07]: nfnetlink_queue: fix possible NULL-ptr dereference

2006-03-12 Thread Patrick McHardy
[NETFILTER]: nfnetlink_queue: fix possible NULL-ptr dereference

Fix NULL-ptr dereference when a config message for a non-existent
queue containing only an NFQA_CFG_PARAMS attribute is received.

Coverity #433

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit fa218d330c7f755e03fad775c1f186462caeba63
tree 5cfe10cf3b49ef8935f4716fef38cc7807a75148
parent 535744878e34d01a53f946f26dfbca37186f2cf8
author Patrick McHardy [EMAIL PROTECTED] Mon, 13 Mar 2006 00:03:44 +0100
committer Patrick McHardy [EMAIL PROTECTED] Mon, 13 Mar 2006 00:03:44 +0100

 net/netfilter/nfnetlink_queue.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
index cac38b2..2cf5fb8 100644
--- a/net/netfilter/nfnetlink_queue.c
+++ b/net/netfilter/nfnetlink_queue.c
@@ -928,8 +928,12 @@ nfqnl_recv_config(struct sock *ctnl, str
 
 	if (nfqa[NFQA_CFG_PARAMS-1]) {
 		struct nfqnl_msg_config_params *params;
-		params = NFA_DATA(nfqa[NFQA_CFG_PARAMS-1]);
 
+		if (!queue) {
+			ret = -ENOENT;
+			goto out_put;
+		}
+		params = NFA_DATA(nfqa[NFQA_CFG_PARAMS-1]);
 		nfqnl_set_mode(queue, params->copy_mode,
 				ntohl(params->copy_range));
 	}


[TCP 05/07]: tcp_highspeed: fix AIMD table out-of-bounds access

2006-03-12 Thread Patrick McHardy
[TCP]: tcp_highspeed: fix AIMD table out-of-bounds access

Coverity #547
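
Since the commit message is terse, here is a small sketch of why the loop bound needs the "- 1". The table below is a hypothetical, shortened stand-in for the AIMD table, and `advance_index()` is an illustrative name, not a kernel function.

```c
#include <stddef.h>

/* Hypothetical, shortened threshold table standing in for the AIMD table. */
static const unsigned int cwnd_table[] = { 38, 118, 221, 347, 495 };
#define TABLE_MAX (sizeof(cwnd_table) / sizeof(cwnd_table[0]))

/*
 * Advance idx while cwnd exceeds the current threshold, never letting
 * idx leave the range [0, TABLE_MAX - 1].
 */
static size_t advance_index(unsigned int cwnd, size_t idx)
{
	while (cwnd > cwnd_table[idx] && idx < TABLE_MAX - 1)
		idx++;
	return idx;
}
```

With the bound written as `idx < TABLE_MAX`, a very large cwnd lets idx reach TABLE_MAX, and the next evaluation of `cwnd_table[idx]` reads one element past the end of the array.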

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 4f58fff161d4c5f79955fe6ad2f98752282c9d6b
tree ae0cb1579a20ef7a0d380e8bd0dc4c436c0b30ec
parent 6e745f19b2d9704b08cc5a3d9476b252bf86b46f
author Patrick McHardy [EMAIL PROTECTED] Mon, 13 Mar 2006 00:09:06 +0100
committer Patrick McHardy [EMAIL PROTECTED] Mon, 13 Mar 2006 00:09:06 +0100

 net/ipv4/tcp_highspeed.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/tcp_highspeed.c b/net/ipv4/tcp_highspeed.c
index 63cf7e5..e0e9d13 100644
--- a/net/ipv4/tcp_highspeed.c
+++ b/net/ipv4/tcp_highspeed.c
@@ -125,7 +125,7 @@ static void hstcp_cong_avoid(struct sock
 		/* Update AIMD parameters */
 		if (tp->snd_cwnd > hstcp_aimd_vals[ca->ai].cwnd) {
 			while (tp->snd_cwnd > hstcp_aimd_vals[ca->ai].cwnd &&
-			       ca->ai < HSTCP_AIMD_MAX)
+			       ca->ai < HSTCP_AIMD_MAX - 1)
 				ca->ai++;
 		} else if (tp->snd_cwnd < hstcp_aimd_vals[ca->ai].cwnd) {
while (tp-snd_cwnd  hstcp_aimd_vals[ca-ai].cwnd 


[NET 00/07]: Coverity fixes

2006-03-12 Thread Patrick McHardy
Hi Dave,

following are a couple of coverity fixes for the more serious problems.
Double checked and should be OK for 2.6.16.


 net/ipv4/ip_output.c|5 +++--
 net/ipv4/netfilter/arp_tables.c |2 +-
 net/ipv4/tcp_highspeed.c|2 +-
 net/ipv6/ah6.c  |2 +-
 net/ipv6/ip6_output.c   |5 +++--
 net/netfilter/nfnetlink_queue.c |6 +-
 net/netlink/af_netlink.c|5 +++--
 net/sched/act_api.c |2 +-
 8 files changed, 18 insertions(+), 11 deletions(-)

Patrick McHardy:
  [NETFILTER]: nfnetlink_queue: fix possible NULL-ptr dereference
  [NET_SCHED]: act_api: fix skb leak in error path
  [XFRM]: Fix leak in ah6_input
  [NETLINK]: Fix use-after-free in netlink_recvmsg
  [TCP]: tcp_highspeed: fix AIMD table out-of-bounds access
  [NETFILTER]: arp_tables: fix NULL pointer dereference
  [IPV4/6]: Fix UFO error propagation


[IPV4/6 07/07]: Fix UFO error propagation

2006-03-12 Thread Patrick McHardy
[IPV4/6]: Fix UFO error propagation

When ufo_append_data fails err is uninitialized, but returned back.
Strangely gcc doesn't notice it.
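
The shape of the fix, reduced to a userspace sketch: the helper's return value has to be stored in err before the error path returns err. `append_data()` and `send_fixed()` are hypothetical names, and the 1500-byte limit is arbitrary.

```c
#include <errno.h>

/* Hypothetical helper that fails with a negative errno value. */
static int append_data(int len)
{
	return len > 1500 ? -EMSGSIZE : 0;
}

/*
 * Fixed shape: the helper's result is assigned to err before branching.
 * The buggy shape tested the call directly, "if (append_data(len))",
 * and then returned an err that had never been assigned.
 */
static int send_fixed(int len)
{
	int err;

	err = append_data(len);
	if (err)
		goto error;
	return 0;

error:
	return err;
}
```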

Coverity #901 and #902

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 3a83b01bbd9e0a2becfc427f7866dd2f20f75494
tree 421450cdd094d6490cfe4b561610c25ff5ca0e22
parent cbb18ea472603ef2e7fcc0dd21490c22a9c01335
author Patrick McHardy [EMAIL PROTECTED] Mon, 13 Mar 2006 00:09:33 +0100
committer Patrick McHardy [EMAIL PROTECTED] Mon, 13 Mar 2006 00:09:33 +0100

 net/ipv4/ip_output.c  |5 +++--
 net/ipv6/ip6_output.c |5 +++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 57d290d..46b0771 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -847,8 +847,9 @@ int ip_append_data(struct sock *sk,
 	if (((length > mtu) && (sk->sk_protocol == IPPROTO_UDP)) &&
 			(rt->u.dst.dev->features & NETIF_F_UFO)) {
 
-		if(ip_ufo_append_data(sk, getfrag, from, length, hh_len,
-			       fragheaderlen, transhdrlen, mtu, flags))
+		if ((err = ip_ufo_append_data(sk, getfrag, from, length, hh_len,
+					      fragheaderlen, transhdrlen, mtu,
+					      flags)))
 			goto error;
 
 		return 0;
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index f999edd..5fe62d0 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -944,8 +944,9 @@ int ip6_append_data(struct sock *sk, int
 	if (((length > mtu) && (sk->sk_protocol == IPPROTO_UDP)) &&
 	    (rt->u.dst.dev->features & NETIF_F_UFO)) {
 
-		if(ip6_ufo_append_data(sk, getfrag, from, length, hh_len,
-				fragheaderlen, transhdrlen, mtu, flags))
+		if ((err = ip6_ufo_append_data(sk, getfrag, from, length, hh_len,
+					       fragheaderlen, transhdrlen, mtu,
+					       flags)))
 			goto error;
 
 		return 0;


Re: [Announce] Intel PRO/Wireless 3945ABG Network Connection

2006-03-12 Thread David Rosky



> I also suspect that they only have to make it
> difficult for an end user and not a technologist.


The issues here can be complex and there is often
a lot of misinformation.  The above statement is basically
true, but the issue of how compliance is determined is
not always simple.

I am a co-founder of a small company that manufactures remote
monitoring systems operating in the 433MHz and 915MHz bands,
and we have products that are being certified under FCC parts
15.231 and 15.247.  The FCC does require that equipment be
reasonably tamper-proof, but they understand that you can't
make a product absolutely bullet-proof against all efforts
to modify it and be able to build it in a cost effective
manner.  An example of this (from the hardware side) is
this:  Under part 15.231, the FCC places limits on radiated
field strength at a given distance from the transmitter, as
opposed to transmitter output power.  Well, the radiated field
strength will be dependent on antenna gain (among other things).
The FCC therefore requires that the transmit antenna either be
built into the product in a difficult-to-modify way (for example
as a trace on the circuit board), or, if it attaches via a
connector, the connector must be non-standard such that
an average user cannot simply go to Radio Shack and bolt on
a higher performance antenna.  As an example of what the FCC
considers sufficiently non-standard, a reverse-threaded
antenna connector will do.

So, does this keep a technologist from defeating the limits?
Of course not, reverse threaded connectors are easily
available from wholesale electronics distribution sources.
HOWEVER, the FCC would certainly frown on a technologist who
made a business going around and modifying part 15 certified
equipment to violate the limits.

It is neither the companies nor their lawyers who determine
whether a product sufficiently meets the requirements to become
certified.  It is normally determined by an FCC-certified TCB
(Telecommunications Certification Body).  The FCC also maintains
a mechanism whereby certification-related questions can be asked
directly of them and whereby answers to previous questions can be
looked up.  The TCBs themselves meet regularly with each other
and with FCC representatives to ensure that standards are being
applied uniformly.

The point of all this is that although you will not
find anywhere in the FCC regulations where it says that the
regulatory daemon controlling the radio *must* be binary-only,
it is not possible to say (except for someone at Intel) whether
it was Intel, or the certifying TCB who decided that making it
binary-only would be necessary to meet the FCC requirements
of being reasonably tamper-proof.

Dave




netconsole: no IP address for eth0, aborting

2006-03-12 Thread Randy.Dunlap

hardware: e1000 NIC in ThinkPad T42
kernel:  2.6.16-rc5-mm3

I always get $subject message.  I changed the delay in
net/core/netpoll.c from 4 seconds to 9 seconds, but it
doesn't matter, the e1000 always finds Link is Up immediately
after $subject message.

I suppose this is a consequence of e1000 using a workqueue for
some of its init code.  Maybe it would work better on an MP
machine instead of UP (giving e1000 a CPU to init on?).
I guess I'll have to switch from netconsole built-in to use a
loadable module instead.  That could work for some non-kernel-init
debugging, but this really needs to work for kernel init-time
debugging as well IMO.

---
~Randy


Re: [patch 1/1] git-net: export security_sid_to_context()

2006-03-12 Thread Arnaldo Carvalho de Melo
On 3/12/06, James Morris [EMAIL PROTECTED] wrote:
> On Sun, 12 Mar 2006, [EMAIL PROTECTED] wrote:
>
> > Cc: Stephen Smalley [EMAIL PROTECTED]
> > Cc: James Morris [EMAIL PROTECTED]
> > Signed-off-by: Andrew Morton [EMAIL PROTECTED]
>
> These SELinux symbols should not be directly exported (if you need it
> temporarily to fix the build, fine, but please don't send it to Linus'
> tree).

That was my understanding as well :-)

- Arnaldo


Re: [patch 1/1] git-net: export security_sid_to_context()

2006-03-12 Thread Andrew Morton
James Morris [EMAIL PROTECTED] wrote:
>
> On Sun, 12 Mar 2006, [EMAIL PROTECTED] wrote:
>
> > Cc: Stephen Smalley [EMAIL PROTECTED]
> > Cc: James Morris [EMAIL PROTECTED]
> > Signed-off-by: Andrew Morton [EMAIL PROTECTED]
>
> These SELinux symbols should not be directly exported (if you need it
> temporarily to fix the build, fine, but please don't send it to Linus'
> tree).

We can remove the export again by uninlining scm_recv().

Both scm_recv() and scm_send() are very large anyway.

If they're uninlined then they'll need to be exported to modules.


Re: [IPV4/6 07/07]: Fix UFO error propagation

2006-03-12 Thread Patrick McHardy
Arnaldo Carvalho de Melo wrote:
>> -		if(ip_ufo_append_data(sk, getfrag, from, length, hh_len,
>> -			       fragheaderlen, transhdrlen, mtu, flags))
>> +		if ((err = ip_ufo_append_data(sk, getfrag, from, length, hh_len,
>> +					      fragheaderlen, transhdrlen, mtu,
>> +					      flags)))
>
>
> Ugh, can this be changed to look like:
>
> 	err = foo();
> 	if (err)
>
> ?

Sure, new patch attached.


[IPV4/6]: Fix UFO error propagation

When ufo_append_data fails err is uninitialized, but returned back.
Strangely gcc doesn't notice it.

Coverity #901 and #902

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 5c6e3731e884da550e7701ba8b68b4c3441e21f6
tree 76cdff77456646194b7f20023ad7744f161bf33f
parent 535744878e34d01a53f946f26dfbca37186f2cf8
author Patrick McHardy [EMAIL PROTECTED] Mon, 13 Mar 2006 02:20:47 +0100
committer Patrick McHardy [EMAIL PROTECTED] Mon, 13 Mar 2006 02:20:47 +0100

 net/ipv4/ip_output.c  |7 ---
 net/ipv6/ip6_output.c |7 ---
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 57d290d..8ee4d01 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -847,10 +847,11 @@ int ip_append_data(struct sock *sk,
 	if (((length > mtu) && (sk->sk_protocol == IPPROTO_UDP)) &&
 			(rt->u.dst.dev->features & NETIF_F_UFO)) {
 
-		if(ip_ufo_append_data(sk, getfrag, from, length, hh_len,
-			   fragheaderlen, transhdrlen, mtu, flags))
+		err = ip_ufo_append_data(sk, getfrag, from, length, hh_len,
+	 fragheaderlen, transhdrlen, mtu,
+	 flags);
+		if (err)
 			goto error;
-
 		return 0;
 	}
 
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index f999edd..5bf70b1 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -944,10 +944,11 @@ int ip6_append_data(struct sock *sk, int
 	if (((length > mtu) && (sk->sk_protocol == IPPROTO_UDP)) &&
 	    (rt->u.dst.dev->features & NETIF_F_UFO)) {
 
-		if(ip6_ufo_append_data(sk, getfrag, from, length, hh_len,
-fragheaderlen, transhdrlen, mtu, flags))
+		err = ip6_ufo_append_data(sk, getfrag, from, length, hh_len,
+	  fragheaderlen, transhdrlen, mtu,
+	  flags);
+		if (err)
 			goto error;
-
 		return 0;
 	}
 



Re: [patch 1/1] git-net: export security_sid_to_context()

2006-03-12 Thread David S. Miller
From: James Morris [EMAIL PROTECTED]
Date: Sun, 12 Mar 2006 20:07:33 -0500 (EST)

> On Sun, 12 Mar 2006, [EMAIL PROTECTED] wrote:
>
> > Cc: Stephen Smalley [EMAIL PROTECTED]
> > Cc: James Morris [EMAIL PROTECTED]
> > Signed-off-by: Andrew Morton [EMAIL PROTECTED]
>
> These SELinux symbols should not be directly exported (if you need it
> temporarily to fix the build, fine, but please don't send it to Linus'
> tree).

We have to do this in order to accommodate AF_UNIX as a module since
it uses the SCM inline functions, which in turn call these selinux
functions.


Re: [patch 1/1] git-net: export security_sid_to_context()

2006-03-12 Thread David S. Miller
From: Andrew Morton [EMAIL PROTECTED]
Date: Sun, 12 Mar 2006 17:15:31 -0800

> We can remove the export again by uninlining scm_recv().
>
> Both scm_recv() and scm_send() are very large anyway.
>
> If they're uninlined then they'll need to be exported to modules.

This seems like a good idea.


Re: [IPV4/6 07/07]: Fix UFO error propagation

2006-03-12 Thread David S. Miller
From: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
Date: Sun, 12 Mar 2006 22:25:37 -0300

> > Sure, new patch attached.
>
> Thanks a lot.

All 7 patches applied, thanks a lot for doing these Coverity
fixes Patrick!


Re: [PATCH] TC: bug fixes to the sample clause

2006-03-12 Thread Russell Stuart
On Sat, 2006-03-11 at 08:11 -0500, jamal wrote:
> > On my machine tc does not parse filter sample for the u32
> > filter.  Eg:
> >
> >   tc filter add dev eth2 parent 1:0 protocol ip prio 1 u32 ht 801: \
> >   classid 1:3 \
> >   sample ip protocol 1 0xff match ip protocol 1 0xff
> >   Illegal "sample"
> >
>
> The syntax is correct but ht 801: must exist - that is why it is being
> rejected.. So there is absolutely categorically _no way in hell_ your
> memset would have fixed it ;-> Apologies for being overdramatic ;->

You are wrong on both counts.

Firstly, the error message came from tc when it parsed
the line.  If tc gets an error talking to the kernel it 
says, among other things:
  We have an error talking to the kernel
Ergo, it hadn't given the command to the kernel yet.
This is significant, because the only place that
knows whether ht 801: has been created or not is
the kernel.  So obviously the error message can't
depend on whether the table had been created.

As it happens I did create the ht before issuing the
prior command when I first struck the bug - but I
didn't show it in my example because it was not
relevant.

As for _no way in hell_ - there are apparently more ways
in hell than you are aware of.  If you look at the 
function pack_key() in tc/f_u32.c, you will see it
assumes the sel parameter it is passed is initialised.
Without the added memset() it isn't - it just contains
random crap.  Of course on some machines (perhaps yours?)
that random crap might be 0, and then it would work.  
That is why I said at the start of my patch "On my
machine tc does not ...".  On other machines the bug
may not appear.
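
A minimal sketch of why the memset() matters. It is not the code from
tc/f_u32.c: demo_sel, demo_key, demo_pack_key() and demo_build_sample()
are hypothetical stand-ins, and the only thing taken from the discussion
above is that the packing routine trusts the selector it is handed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_KEYS 4            /* hypothetical limit, for illustration only */

struct demo_key {
    uint32_t val;
    uint32_t mask;
    int off;
};

struct demo_sel {                  /* hypothetical stand-in for the real selector */
    unsigned nkeys;
    struct demo_key keys[DEMO_MAX_KEYS];
};

/* Append one key.  This routine trusts sel->nkeys, so the caller must
 * have initialised *sel beforehand.  Returns 0 on success, -1 on error. */
int demo_pack_key(struct demo_sel *sel, uint32_t key, uint32_t mask, int off)
{
    unsigned n = sel->nkeys;

    if (n >= DEMO_MAX_KEYS || off % 4 != 0)
        return -1;                 /* a garbage nkeys usually lands here */
    sel->keys[n].val = key & mask;
    sel->keys[n].mask = mask;
    sel->keys[n].off = off;
    sel->nkeys = n + 1;
    return 0;
}

/* Build a one-key selector safely: zero it first, then pack the key. */
int demo_build_sample(struct demo_sel *sel, uint32_t key, uint32_t mask)
{
    assert(sel != NULL);
    memset(sel, 0, sizeof(*sel));  /* the kind of step the patch adds */
    return demo_pack_key(sel, key, mask, 0);
}
```

Calling demo_pack_key() on a selector that lives on the stack and was
never cleared reads whatever happens to be in nkeys.  If that happens to
be 0 the call succeeds, otherwise it is likely to fail, which would
explain a bug that shows up on one machine and not on another.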

 sample never worked 100% of the time with that hash. It works _most_ of
 the time with 256 buckets. In fact it will work some of the time as it is
 right now with 2.6.x. Can you post the output of tc -s filter ls on 2.6
 with and without your user space change?

Re: "sample never worked 100% of the time with that
hash."  Can you give an example?  As far as I know it
always worked.

Re: "it will work some of the time as it is right now
with 2.6.x."  Yes - it will work when you are sampling
one byte.   I am sampling port numbers, which are two
bytes.  It will not work in any case where there are
two non-zero bytes.

Re: "Can you post the output of tc -s filter ls on 2.6
with and without your user space change?"  Here it is:

  With my change:
tc qdisc add dev eth0 root handle 1: htb
tc filter add dev eth0 parent 1:0 prio 1 protocol ip u32 ht 801: divisor 256
tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 
1:0 sample tcp src 0x369 0x match tcp src 0x0369 0x
tc -s filter show dev eth0
filter parent 1: protocol ip pref 1 u32
filter parent 1: protocol ip pref 1 u32 fh 801: ht divisor 256
filter parent 1: protocol ip pref 1 u32 fh 801:3:800 order 2048 key ht 801 
bkt 3 flowid 1:
  match 0369/ at nexthdr+0
filter parent 1: protocol ip pref 1 u32 fh 800: ht divisor 1

  On the original tc shipped with Debian sarge:
tc qdisc add dev eth0 root handle 1: htb
tc filter add dev eth0 parent 1:0 prio 1 protocol ip u32 ht 801: divisor 256
tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 
1:0 sample tcp src 0x369 0x match tcp src 0x0369 0x
Illegal sample
  Ooops.  Looks like I can't get out of this without patching
  and compiling:
tc qdisc add dev eth0 root handle 1: htb
tc filter add dev eth0 parent 1:0 prio 1 protocol ip u32 ht 801: divisor 256
tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 
1:0 sample tcp src 0x369 0x match tcp src 0x0369 0x
tc -s filter show dev eth0
filter parent 1: protocol ip pref 1 u32
filter parent 1: protocol ip pref 1 u32 fh 801: ht divisor 256
filter parent 1: protocol ip pref 1 u32 fh 801:6a:800 order 2048 key ht 801 
bkt 6a flowid 1:
  match 0369/ at nexthdr+0
filter parent 1: protocol ip pref 1 u32 fh 800: ht divisor 1

Note: this also answers your request in your other email re:
"can you give me an example that doesn't work".
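
The two bucket numbers above (3 and 6a) can be reproduced with a small
sketch.  The assumptions are loud ones: demo_fold_xor() and
demo_fold_shift() are hypothetical helpers that only approximate two
ways of folding a masked key into a bucket index, they are not the
kernel's u32_hash_fold(), and the port 0x0369 is assumed to reach the
hash as the 32-bit value 0x6903 with mask 0xffff (as it would on a
little-endian host reading network-order bytes).

```c
#include <assert.h>
#include <stdint.h>

/* XOR-fold style: mix the upper bytes of the masked key into the low
 * byte, then keep 8 bits (256 buckets). */
unsigned demo_fold_xor(uint32_t key, uint32_t hmask)
{
    uint32_t h = key & hmask;

    h ^= h >> 16;
    h ^= h >> 8;
    return h & 0xff;
}

/* Shift style: discard the zero bits below the mask, then keep the low
 * 8 bits of what is left (256 buckets). */
unsigned demo_fold_shift(uint32_t key, uint32_t hmask)
{
    unsigned shift = 0;

    assert(hmask != 0);            /* a zero mask has no lowest set bit */
    while (((hmask >> shift) & 1u) == 0)
        shift++;
    return ((key & hmask) >> shift) & 0xff;
}
```

With key 0x6903 and mask 0xffff, demo_fold_xor() gives 0x6a and
demo_fold_shift() gives 0x03, matching the "bkt 6a" and "bkt 3" lines in
the two listings.  With a single non-zero byte (say key 0x01, mask 0xff)
both give the same bucket, which is consistent with the one-byte case
working either way.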

 Here's how you should fix this:
 Patch 1) fix kernel 2.4.x to be like 2.6.x
 Patch 2) fix iproute2 to have the same syntax as for 2.6 and put a big
 bold note in the code that if anyone changes that part of the code to
 look at the kernel u32_hash_fold() routine.
 no need for the utsname check.

Why do it that way?  If you want to put the 2.6 hashing
algorithm in 2.4 then go ahead - but that is a separate 
decision which is not related to the issue of making tc 
backwards compatible.  Making tc work with all versions of
the kernel is what I am doing there. As an example of why
this is a good idea, Debian ships 2.4 and 2.6 kernels, 
and one version of tc.  That tc should work with both 
kernels.

Finally, regarding which hashing algorithm is better,
your results differ from mine.  First a bit of 
background: I am trying to make VoIP work over some 256/64 
and 512/128 links