Re: [PATCH] tcp: cubic scaling error

2006-10-26 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Wed, 25 Oct 2006 10:52:29 -0700

 Doug Leith observed a discrepancy between the version of CUBIC described
 in the papers and the version in 2.6.18. A math error related to scaling
 causes Cubic to grow too slowly.
 
 Patch is from Sangtae Ha [EMAIL PROTECTED]. I validated that
 it does fix the problems.
 
 See the following to show behavior over 500ms 100 Mbit link.
 
 Sender (2.6.19-rc3) ---  Bridge (2.6.18-rt7) --- Receiver (2.6.19-rc3)
 1G  [netem]   100M
 
   http://developer.osdl.org/shemminger/tcp/2.6.19-rc3/cubic-orig.png
   http://developer.osdl.org/shemminger/tcp/2.6.19-rc3/cubic-fix.png
 
 Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

Applied, thanks a lot Stephen.
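
For readers who want the shape of the curve under discussion, here is a minimal floating-point sketch of the window growth function as given in the CUBIC paper, W(t) = C*(t - K)^3 + Wmax with K = cbrt(Wmax*beta/C). It is not the kernel's fixed-point code and does not reproduce the scaling error fixed by this patch. The constants C = 0.4 and beta = 0.2 and all function names are assumptions chosen for illustration.

```c
/* Illustrative CUBIC window curve; floating point, not kernel code. */

#define CUBIC_C    0.4   /* assumed scaling constant */
#define CUBIC_BETA 0.2   /* assumed multiplicative decrease factor */

/* Cube root of a non-negative value by bisection (avoids needing libm). */
double cube_root(double x)
{
	double lo = 0.0, hi = x < 1.0 ? 1.0 : x;
	int i;

	for (i = 0; i < 200; i++) {
		double mid = (lo + hi) / 2.0;

		if (mid * mid * mid < x)
			lo = mid;
		else
			hi = mid;
	}
	return (lo + hi) / 2.0;
}

/* Time (seconds) the curve needs to climb back to w_max after a loss. */
double cubic_k(double w_max)
{
	return cube_root(w_max * CUBIC_BETA / CUBIC_C);
}

/* Congestion window (packets) t seconds after the last loss event. */
double cubic_window(double t, double w_max)
{
	double d = t - cubic_k(w_max);

	return CUBIC_C * d * d * d + w_max;
}
```

With these assumed constants the window starts at (1 - beta) * Wmax right after a loss and returns to Wmax at t = K, so any scaling slip in K or C directly stretches the recovery time.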
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] fix integer overflow in H-TCP congestion control

2006-10-26 Thread David Miller
From: Gavin McCullagh [EMAIL PROTECTED]
Date: Wed, 25 Oct 2006 09:47:26 +0100

 When using H-TCP with a single flow on a 500Mbit connection (or less
 actually), alpha can exceed 65000, so alpha needs to be a u32.
 
 Signed-off-by: Gavin McCullagh [EMAIL PROTECTED]
 Signed-off-by: Doug Leith [EMAIL PROTECTED]

Applied, thank you.
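
A small sketch of why the wider type matters. The patch text only says alpha "needs to be a u32"; that the old field was 16 bits wide is an assumption here, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: what a 16-bit field keeps of a larger alpha value. */
uint16_t alpha_as_u16(uint32_t alpha)
{
	return (uint16_t)alpha;   /* silently keeps only the low 16 bits */
}

/* The same value stored in a 32-bit field is preserved. */
uint32_t alpha_as_u32(uint32_t alpha)
{
	return alpha;
}
```

A value of 65000 still fits in 16 bits, but anything above 65535 wraps around, so the increase factor would suddenly collapse on fast links.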


Re: [PATCH] bridge: correct print message typo

2006-10-26 Thread David Miller
From: Randy Dunlap [EMAIL PROTECTED]
Date: Tue, 24 Oct 2006 21:24:58 -0700

 From: Randy Dunlap [EMAIL PROTECTED]
 
 Correct message typo/spello.
 
 Signed-off-by: Randy Dunlap [EMAIL PROTECTED]

Applied, thanks a lot Randy.


Re: Network virtualization/isolation

2006-10-26 Thread Daniel Lezcano

Stephen Hemminger wrote:

On Wed, 25 Oct 2006 17:51:28 +0200
Daniel Lezcano [EMAIL PROTECTED] wrote:



Hi Stephen,

currently the work to get container support into the kernel is
making good progress. The ipc, pid, utsname and filesystem system
resources are isolated/virtualized relying on the namespaces concept.


But network virtualization/isolation is still missing. Two
approaches are proposed: doing the isolation at layer 2 and at
layer 3.


The first one instantiates a network device per namespace and adds a peer
network device into the root namespace; all the routing resources are
relative to the namespace. This work is done by Andrey Savochkin from
the OpenVZ project.


The second relies on the routes and associates the network namespace
pointer with each route. When traffic is incoming, the packet
follows an input route and retrieves the associated network namespace.
When traffic is outgoing, the packet, identified by the network
namespace it is coming from, follows only the routes matching the same
network namespace. This work is done by me.
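
A rough userspace sketch of the layer-3 idea described above, assuming a flat route table in which every entry carries an integer stand-in for the namespace pointer. The structure and function names are hypothetical and are not taken from any posted patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified route entry tagged with its owning namespace. */
struct ns_route {
	uint32_t dst;      /* destination network, host byte order */
	uint32_t mask;     /* network mask */
	int      ns_id;    /* stand-in for the network namespace pointer */
};

/* Input path: find a matching route and return its namespace, or -1. */
int ns_for_incoming(const struct ns_route *tbl, size_t n, uint32_t daddr)
{
	size_t i;

	for (i = 0; i < n; i++)
		if ((daddr & tbl[i].mask) == tbl[i].dst)
			return tbl[i].ns_id;
	return -1;
}

/* Output path: only routes owned by the sender's namespace are eligible. */
const struct ns_route *route_for_outgoing(const struct ns_route *tbl, size_t n,
					   int ns_id, uint32_t daddr)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tbl[i].ns_id == ns_id &&
		    (daddr & tbl[i].mask) == tbl[i].dst)
			return &tbl[i];
	return NULL;
}
```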


IMHO, we need both approaches: layer 2 to be able to bring *very*
strong isolation for system containers at a performance cost, and
layer 3 to have good isolation for lightweight or application
containers when performance is more important.


Do you have some suggestions ? What is your point of view on that ?

Thanks in advance.

  -- Daniel



Any solution should allow both and it should build on the existing netfilter 
infrastructure.




The problem is that netfilter cannot give good isolation, e.g. how can the
netstat command be handled? Or how do we avoid seeing IP addresses assigned to
another container when running ifconfig? Furthermore, one of the biggest
benefits of network isolation is to bring mobility to a container,
and that can only be done if the network resources inside the kernel
can be identified by container in order to checkpoint/restart them.


The all-in-namespace solution, i.e. at layer 2, is very good in terms of
isolation but it adds a non-negligible overhead. The layer 3 isolation
has insignificant overhead and good isolation, well suited to
application containers.


Unfortunately, from the implementation point of view, layer 3 cannot
be a subset of layer 2 isolation when using all-in-namespace, and layer
2 isolation cannot be an extension of the layer 3 isolation.


I think the layer 2 and layer 3 implementations can coexist. You
can for example create a system container with layer 2 isolation and
inside it add layer 3 isolation.


Does that make sense ?

-- Daniel






Re: watchdog timeout panic in e1000 driver

2006-10-26 Thread Kenzo Iwami
Hi,

Thank you for your comment.

 Anyway as I said in the same e-mail, we're working on reducing the lock
 timeout to a reasonable time. This will unfortunately take some time, as
 we need to change some major components in the driver to make sure this
 doesn't happen.

 How about the following approach?
 If acquiring semaphore fails inside the interrupt handler, acquiring
 semaphore is abandoned immediately without waiting for timeout.
 However, I don't know whether this method affects other processes.
 
 with the current hardware being accessed simultaneously from several
 users in the kernel, that would lead to large problems - the watchdog
 task accesses it every 2 seconds as it reads the PHY link status, so when
 one of those fails the driver would have no choice but to reset the
 entire device.

This problem occurs because interrupt handler is executed while the
interrupted code is still holding the semaphore. Acquiring the semaphore
fails regardless of the timeout period.

I think the watchdog task will fail trying to read the PHY link status,
even if the lock timeout period has been reduced.
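
To make the argument concrete, here is a toy model, assuming a simple polled flag in place of the real hardware semaphore. The names are hypothetical and this is not e1000 code. It only shows that when the holder cannot run while the handler spins, the outcome does not depend on how long the handler waits.

```c
/* Hypothetical stand-in for the hardware semaphore. */
struct hw_sem {
	int held;   /* 1 while some code path owns the semaphore */
};

/*
 * Poll for the semaphore up to max_tries times.
 * Returns 0 on success, -1 on timeout.
 */
int hw_sem_acquire(struct hw_sem *sem, unsigned long max_tries)
{
	unsigned long i;

	for (i = 0; i < max_tries; i++) {
		if (!sem->held) {
			sem->held = 1;
			return 0;
		}
		/* In interrupt context the interrupted holder cannot run,
		 * so sem->held never changes while we spin here. */
	}
	return -1;
}
```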

-- 
  Kenzo Iwami ([EMAIL PROTECTED])


[Announce] Netchannels ported to the latest git tree. Gigabit benchmark. Complete rout.

2006-10-26 Thread Evgeniy Polyakov
On Fri, Oct 20, 2006 at 01:53:05PM +0400, Evgeniy Polyakov ([EMAIL PROTECTED]) 
wrote:
 Netchannel [1] is a pure bridge between low-level hardware and the user, without any
 special protocol processing involved between them.
 Users are not limited to userspace only - I will use this netchannel
 infrastructure for a fast NAT implementation, which is a purely kernelspace user
 (although it is possible to create NAT in userspace, the price of crossing the
 kernelspace boundary is too high for something which only needs to change some
 fields in the header and recalculate the checksum).
 Userspace network stack [2] is another user of the new netchannel subsystem.
 
 Current netchannel version supports data transfer using copy*user().

Performance graph (speed and CPU usage) attached.
Benchmark uses 128 bytes sending/receiving per syscall (no latency
checks, only throughput).

MB and KB mean not 1000, but 1024.

Receiving is about 8 MB/sec faster.
Receiving CPU usage is 3 times less (90% socket code vs. 30%
netchannels+unetstack).

Sending is 10 MB/sec faster.
Sending CPU usage is 5 times less (upto 50% vs. upto 10%).

Number of syscalls is about 10 times less for netchannels.

Hardware.
System 1.
 Netchannel kernel (2.6.19-rc3-git) or 
   vanilla 2.6.19-rc3/2.6.18-1.2200.fc5.
 amd64 athlon 3500+ cpu
 1gb ram
 r8169 nic

System 2.
 2.6.17-2-686 debian etch
 intel core duo 3.40GHz
 2 gb ram
 Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller
 (sky2 driven)

All software used in tests (tcp_client.c/tcp_test.c and userspace
network stack) can be found on the projects' homepages (the userspace network
stack requires a larger window scaling factor than the default).

Please consider the netchannel subsystem for inclusion.

1. Netchannels homepage.
http://tservice.net.ru/~s0mbre/old/?section=projects&item=netchannel

2. Userspace network stack homepage.
http://tservice.net.ru/~s0mbre/old/?section=projects&item=unetstack

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/arch/i386/kernel/syscall_table.S b/arch/i386/kernel/syscall_table.S
index 2697e92..3231b22 100644
--- a/arch/i386/kernel/syscall_table.S
+++ b/arch/i386/kernel/syscall_table.S
@@ -319,3 +319,4 @@ ENTRY(sys_call_table)
.long sys_move_pages
.long sys_getcpu
.long sys_epoll_pwait
+   .long sys_netchannel_control
diff --git a/arch/x86_64/ia32/ia32entry.S b/arch/x86_64/ia32/ia32entry.S
index b4aa875..d35d4d8 100644
--- a/arch/x86_64/ia32/ia32entry.S
+++ b/arch/x86_64/ia32/ia32entry.S
@@ -718,4 +718,5 @@ #endif
.quad compat_sys_vmsplice
.quad compat_sys_move_pages
.quad sys_getcpu
+   .quad sys_netchannel_control
 ia32_syscall_end:  
diff --git a/include/asm-i386/unistd.h b/include/asm-i386/unistd.h
index beeeaf6..33242f8 100644
--- a/include/asm-i386/unistd.h
+++ b/include/asm-i386/unistd.h
@@ -325,10 +325,11 @@ #define __NR_vmsplice 316
 #define __NR_move_pages        317
 #define __NR_getcpu            318
 #define __NR_epoll_pwait       319
+#define __NR_netchannel_control        320
 
 #ifdef __KERNEL__
 
-#define NR_syscalls 320
+#define NR_syscalls 321
 #include <linux/err.h>
 
 /*
diff --git a/include/asm-x86_64/unistd.h b/include/asm-x86_64/unistd.h
index 777288e..16f1aac 100644
--- a/include/asm-x86_64/unistd.h
+++ b/include/asm-x86_64/unistd.h
@@ -619,8 +619,10 @@ #define __NR_vmsplice  278
 __SYSCALL(__NR_vmsplice, sys_vmsplice)
 #define __NR_move_pages        279
 __SYSCALL(__NR_move_pages, sys_move_pages)
+#define __NR_netchannel_control        280
+__SYSCALL(__NR_netchannel_control, sys_netchannel_control)
 
-#define __NR_syscall_max __NR_move_pages
+#define __NR_syscall_max __NR_netchannel_control
 
 #ifdef __KERNEL__
 #include <linux/err.h>
diff --git a/include/linux/netchannel.h b/include/linux/netchannel.h
new file mode 100644
index 000..23e9f1e
--- /dev/null
+++ b/include/linux/netchannel.h
@@ -0,0 +1,88 @@
+/*
+ * netchannel.h
+ * 
+ * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED]
+ * All rights reserved.
+ * 
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#ifndef __NETCHANNEL_H
+#define __NETCHANNEL_H
+
+#include <linux/types.h>
+
+enum netchannel_commands {
+   NETCHANNEL_CREATE = 0,
+   NETCHANNEL_RECV,
+   NETCHANNEL_SEND,
+};
+
+enum 

RE: [PATCH] s2io: add PCI error recovery support

2006-10-26 Thread Ananda Raju
Hi, 
Can you try the attached patch? It is simple: we set the card state to
down in error_detected() so that all entry points return an error
and don't proceed further.

In slot_reset() we call s2io_card_down(), which resets the adapter.
In io_resume() we bring up the driver.
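
The three-step flow described above can be modelled roughly as follows. This is a userspace sketch with hypothetical names, assuming a single state flag that every entry point checks. It is not the attached patch.

```c
/* Hypothetical model of the three recovery callbacks described above. */
enum card_state { CARD_UP, CARD_DOWN };

struct fake_nic {
	enum card_state state;
	int resets;          /* how many times the adapter was reset */
};

/* error_detected(): mark the card down so entry points bail out early. */
void nic_error_detected(struct fake_nic *nic)
{
	nic->state = CARD_DOWN;
}

/* Any driver entry point: refuse to touch the hardware while down. */
int nic_entry_point(const struct fake_nic *nic)
{
	return nic->state == CARD_DOWN ? -1 : 0;
}

/* slot_reset(): reset the adapter; the card stays down. */
void nic_slot_reset(struct fake_nic *nic)
{
	nic->resets++;
}

/* resume(): bring the driver back up. */
void nic_resume(struct fake_nic *nic)
{
	nic->state = CARD_UP;
}
```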

Ananda 

-Original Message-
From: Linas Vepstas [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, October 25, 2006 1:55 PM
To: Ananda Raju
Cc: Wen Xiong; linux-kernel@vger.kernel.org;
[EMAIL PROTECTED]; netdev@vger.kernel.org; Jeff Garzik;
Andrew Morton
Subject: Re: [PATCH] s2io: add PCI error recovery support

On Wed, Oct 25, 2006 at 10:11:24AM -0500, Linas Vepstas wrote:
 
  Also we have to add following if statement in beginning of
s2io_isr().

Done, below,

  If it is ok to do BAR0 read/write in error_detected() then patch is
OK. 

I re-wrote that section to avoid doing I/O. It seems to work well,
and generates a few less messages in the process.  New, improved patch
below, please ack and send upstream if you like it.

--linas

This patch adds PCI error recovery support to the 
s2io 10-Gigabit ethernet device driver.

Tested, seems to work well.

Signed-off-by: Linas Vepstas [EMAIL PROTECTED]
Cc: Raghavendra Koushik [EMAIL PROTECTED]
Cc: Ananda Raju [EMAIL PROTECTED]
Cc: Wen Xiong [EMAIL PROTECTED]


 drivers/net/s2io.c |  103 +
 drivers/net/s2io.h |    5 ++
 2 files changed, 108 insertions(+)

Index: linux-2.6.19-rc1-git11/drivers/net/s2io.c
===
--- linux-2.6.19-rc1-git11.orig/drivers/net/s2io.c  2006-10-25 14:09:47.0 -0500
+++ linux-2.6.19-rc1-git11/drivers/net/s2io.c   2006-10-25 15:18:25.0 -0500
@@ -434,11 +434,18 @@ static struct pci_device_id s2io_tbl[] _
 
 MODULE_DEVICE_TABLE(pci, s2io_tbl);
 
+static struct pci_error_handlers s2io_err_handler = {
+   .error_detected = s2io_io_error_detected,
+   .slot_reset = s2io_io_slot_reset,
+   .resume = s2io_io_resume,
+};
+
 static struct pci_driver s2io_driver = {
   .name = "S2IO",
   .id_table = s2io_tbl,
   .probe = s2io_init_nic,
   .remove = __devexit_p(s2io_rem_nic),
+  .err_handler = &s2io_err_handler,
 };
 
 /* A simplifier macro used both by init and free shared_mem Fns(). */
@@ -4171,6 +4178,11 @@ static irqreturn_t s2io_isr(int irq, voi
mac_info_t *mac_control;
struct config_param *config;
 
+   if ((sp->pdev->error_state != pci_channel_io_normal) &&
+       (sp->pdev->error_state != 0)) {
+   return IRQ_HANDLED;
+   }
+
atomic_inc(&sp->isr_cnt);
mac_control = &sp->mac_control;
config = &sp->config;
@@ -7564,3 +7576,94 @@ static void lro_append_pkt(nic_t *sp, lr
sp->mac_control.stats_info->sw_stat.clubbed_frms_cnt++;
return;
 }
+
+/**
+ * s2io_io_error_detected - called when PCI error is detected
+ * @pdev: Pointer to PCI device
+ * @state: The current PCI connection state
+ *
+ * This function is called after a PCI bus error affecting
+ * this device has been detected.
+ */
+static pci_ers_result_t s2io_io_error_detected(struct pci_dev *pdev,
+                                               pci_channel_state_t state)
+{
+   struct net_device *netdev = pci_get_drvdata(pdev);
+   nic_t *sp = netdev->priv;
+
+   netif_device_detach(netdev);
+
+   if (netif_running(netdev)) {
+   unsigned long flags;
+
+   /* The following is an abbreviated subset of the
+* steps taken by s2io_card_down(), avoiding
+* steps that touch the card itself.
+*/
+   del_timer_sync(&sp->alarm_timer);
+   atomic_set(&sp->card_state, CARD_DOWN);
+
+   /* Kill tasklet. */
+   tasklet_kill(&sp->task);
+
+   /* Free all Tx buffers */
+   spin_lock_irqsave(&sp->tx_lock, flags);
+   free_tx_buffers(sp);
+   spin_unlock_irqrestore(&sp->tx_lock, flags);
+
+   /* Free all Rx buffers */
+   spin_lock_irqsave(&sp->rx_lock, flags);
+   free_rx_buffers(sp);
+   spin_unlock_irqrestore(&sp->rx_lock, flags);
+   
+   clear_bit(0, &(sp->link_state));
+   sp->device_close_flag = TRUE;   /* Device is shut down. */
+   }
+   pci_disable_device(pdev);
+
+   return PCI_ERS_RESULT_NEED_RESET;
+}
+
+/**
+ * s2io_io_slot_reset - called after the pci bus has been reset.
+ * @pdev: Pointer to PCI device
+ *
+ * Restart the card from scratch, as if from a cold-boot.
+ */
+static pci_ers_result_t s2io_io_slot_reset(struct pci_dev *pdev)
+{
+   struct net_device *netdev = pci_get_drvdata(pdev);
+   nic_t *sp = netdev->priv;
+
+   if (pci_enable_device(pdev)) {
+   printk(KERN_ERR "s2io: Cannot re-enable PCI device after reset.\n");
+   return PCI_ERS_RESULT_DISCONNECT;
+   }
+
+   pci_set_master(pdev);
+   s2io_reset(sp);
+
+   return 

[patch 3/5] net: fix uaccess handling

2006-10-26 Thread Heiko Carstens
Signed-off-by: Heiko Carstens [EMAIL PROTECTED]
---
 net/ipv4/raw.c   |   17 +++--
 net/ipv6/raw.c   |   17 +++--
 net/netlink/af_netlink.c |5 +++--
 3 files changed, 25 insertions(+), 14 deletions(-)

Index: linux-2.6/net/ipv4/raw.c
===
--- linux-2.6.orig/net/ipv4/raw.c   2006-10-26 14:40:56.0 +0200
+++ linux-2.6/net/ipv4/raw.c2006-10-26 14:42:12.0 +0200
@@ -329,7 +329,7 @@
return err; 
 }
 
-static void raw_probe_proto_opt(struct flowi *fl, struct msghdr *msg)
+static int raw_probe_proto_opt(struct flowi *fl, struct msghdr *msg)
 {
struct iovec *iov;
u8 __user *type = NULL;
@@ -338,7 +338,7 @@
unsigned int i;
 
if (!msg->msg_iov)
-   return;
+   return 0;
 
for (i = 0; i < msg->msg_iovlen; i++) {
iov = &msg->msg_iov[i];
@@ -360,8 +360,9 @@
code = iov->iov_base;
 
if (type && code) {
-   get_user(fl->fl_icmp_type, type);
-   get_user(fl->fl_icmp_code, code);
+   if (get_user(fl->fl_icmp_type, type) ||
+   get_user(fl->fl_icmp_code, code))
+   return -EFAULT;
probed = 1;
}
break;
@@ -372,6 +373,7 @@
if (probed)
break;
}
+   return 0;
 }
 
 static int raw_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
@@ -480,8 +482,11 @@
.proto = inet->hdrincl ? IPPROTO_RAW :
 sk->sk_protocol,
  };
-   if (!inet->hdrincl)
-   raw_probe_proto_opt(&fl, msg);
+   if (!inet->hdrincl) {
+   err = raw_probe_proto_opt(&fl, msg);
+   if (err)
+   goto done;
+   }
 
security_sk_classify_flow(sk, &fl);
err = ip_route_output_flow(&rt, &fl, sk, 
!(msg->msg_flags&MSG_DONTWAIT));
Index: linux-2.6/net/ipv6/raw.c
===
--- linux-2.6.orig/net/ipv6/raw.c   2006-10-26 14:40:56.0 +0200
+++ linux-2.6/net/ipv6/raw.c2006-10-26 14:42:12.0 +0200
@@ -604,7 +604,7 @@
return err; 
 }
 
-static void rawv6_probe_proto_opt(struct flowi *fl, struct msghdr *msg)
+static int rawv6_probe_proto_opt(struct flowi *fl, struct msghdr *msg)
 {
struct iovec *iov;
u8 __user *type = NULL;
@@ -616,7 +616,7 @@
int i;
 
if (!msg->msg_iov)
-   return;
+   return 0;
 
for (i = 0; i < msg->msg_iovlen; i++) {
iov = &msg->msg_iov[i];
@@ -638,8 +638,9 @@
code = iov->iov_base;
 
if (type && code) {
-   get_user(fl->fl_icmp_type, type);
-   get_user(fl->fl_icmp_code, code);
+   if (get_user(fl->fl_icmp_type, type) ||
+   get_user(fl->fl_icmp_code, code))
+   return -EFAULT;
probed = 1;
}
break;
@@ -650,7 +651,8 @@
/* check if type field is readable or not. */
if (iov->iov_len > 2 - len) {
u8 __user *p = iov->iov_base;
-   get_user(fl->fl_mh_type, &p[2 - len]);
+   if (get_user(fl->fl_mh_type, &p[2 - len]))
+   return -EFAULT;
probed = 1;
} else
len += iov-iov_len;
@@ -664,6 +666,7 @@
if (probed)
break;
}
+   return 0;
 }
 
 static int rawv6_sendmsg(struct kiocb *iocb, struct sock *sk,
@@ -787,7 +790,9 @@
opt = ipv6_fixup_options(&opt_space, opt);
 
fl.proto = proto;
-   rawv6_probe_proto_opt(&fl, msg);
+   err = rawv6_probe_proto_opt(&fl, msg);
+   if (err)
+   goto out;
  
ipv6_addr_copy(&fl.fl6_dst, daddr);
if (ipv6_addr_any(&fl.fl6_src) && !ipv6_addr_any(&np->saddr))
Index: linux-2.6/net/netlink/af_netlink.c
===
--- linux-2.6.orig/net/netlink/af_netlink.c 2006-10-26 14:40:56.0 +0200
+++ linux-2.6/net/netlink/af_netlink.c  2006-10-26 14:42:12.0 +0200
@@ -1075,8 +1075,9 @@
return -EINVAL;
len = sizeof(int);
val = nlk-flags  
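
The pattern this patch applies, checking the return value of a user-memory copy and propagating -EFAULT to the caller, can be tried out in userspace with a mock. Everything below is a sketch under assumptions: mock_get_user() is a hypothetical stand-in that treats NULL as the faulting address, and the structure is a simplified placeholder for the flow key.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for get_user(): fails on a bad (here: NULL) pointer. */
int mock_get_user(uint8_t *dst, const uint8_t *src)
{
	if (src == NULL)
		return -EFAULT;
	*dst = *src;
	return 0;
}

/* Simplified placeholder for the ICMP part of the flow key. */
struct mock_flow {
	uint8_t icmp_type;
	uint8_t icmp_code;
};

/* Returns 0 on success, or -EFAULT if either copy from "user" memory fails. */
int probe_proto_opt(struct mock_flow *fl, const uint8_t *type,
		    const uint8_t *code)
{
	if (mock_get_user(&fl->icmp_type, type) ||
	    mock_get_user(&fl->icmp_code, code))
		return -EFAULT;
	return 0;
}
```

A caller then checks the result and bails out instead of continuing with an uninitialised type or code, which is the same shape as the `err = ...; if (err) goto ...;` lines in the sendmsg hunks above.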

Re: [Announce] Netchannels ported to the latest git tree. Gigabit benchmark. Complete rout.

2006-10-26 Thread bert hubert
On Thu, Oct 26, 2006 at 02:51:51PM +0400, Evgeniy Polyakov wrote:

 Benchmark uses 128 bytes sending/receiving per syscall (no latency
 checks, only throughput.
 Receiving CPU usage is 3 times less (90% socket code vs. 30%
 Sending CPU usage is 5 times less (upto 50% vs. upto 10%).

Wow. I currently lack the hardware to reproduce your measurements, do you
have any idea of how these numbers would be with 1024 byte system calls?

Thanks.

-- 
http://www.PowerDNS.com  Open source, database driven DNS Software 
http://netherlabs.nl  Open and Closed source services


Re: watchdog timeout panic in e1000 driver

2006-10-26 Thread Auke Kok

Kenzo Iwami wrote:

Hi,

Thank you for your comment.

Anyway as I said in the same e-mail, we're working on reducing the lock timeout to a 
reasonable time. This will unfortunately take some time, as we need to change some major 
components in the driver to make sure this doesn't happen.

How about the following approach?
If acquiring semaphore fails inside the interrupt handler, acquiring semaphore
is abandoned immediately without waiting for timeout.
However, I don't know whether this method affects other processes.
with the current hardware being accessed simultaneously from several users in the 
kernel, that would lead to large problems - the watchdog task accesses it every 2 
seconds as it reads the PHY link status, so when one of those fails the driver would 
have no choice but to reset the entire device.


This problem occurs because interrupt handler is executed while the
interrupted code is still holding the semaphore. Acquiring the semaphore
fails regardless of the timeout period.

I think the watchdog task will fail trying to read the PHY link status,
even if the lock timeout period has been reduced.


Correct, we're not looking into reducing the lock timeout itself but into reducing the total
lock time. Once we have reduced that to something acceptable, we can reduce the timeout
accordingly.


Cheers,

Auke


Re: [RFC] [PATCH 0/3] Add Regulatory Domain support to d80211

2006-10-26 Thread Dan Williams
On Thu, 2006-10-26 at 00:00 +0200, Johannes Berg wrote:
 On Wed, 2006-10-25 at 13:43 -0400, Luis R. Rodriguez wrote:
 
  I guess my hope was that d80211 would just be more than a softmac
  implementation. When I hear wireless stack I don't think softmac
  implementation, I think a robust set of headers and device
  definitions which all wireless devices can share.
 
 Not just that, a bunch of library functions for example for crypto would
 be nice too. That's part of why I've been proposing that the tkip stuff
 be library functions that the drivers can call if required, instead of
 the bitfields.
 
 Currently, there's lot of top-down stuff in d80211, it does things which
 depend on flags and then instructs the driver to do something. This is
 good for a bunch of things, but in some cases where devices vary wildly
 it may be better to go for library functions instead. IMHO the TKIP key
 computation is such a case, it's trivial for a driver to call phase1 and
 phase2 when required.
 
  Also I thought we'd ditch WE as it seems we keep fixing it with gum
  (as seen by Linville's latest ABI compatibility fix). 
 
 Well, that was sort of necessary.
 
  If that wasn't
  the case then I'm suggesting it -- can we consider ditching WE?
 
 Well, no. We can make it a second-class citizen like I did with the
 cfg80211 work where I made it just one userspace interface for cfg80211
 which admittedly sometimes strange behaviour, but it's still there and
 current operations should still work with it (and I'd consider not
 working a bug except if userspace never calls 'commit' and expects
 things to work)
 
  I'd say lets just go for a
  userspace MLME as its already written but I seriously think we need to
  ditch replace WE first.
 
 It seems no one has a plan on what to do though.
 
  - Jiri's trying to fix the SMP issues. That's great.
  - Jiri also would like to expand ieee80211_conf.c, the stuff I
started for cfg80211
  - I'd like to see a header cleanup, it's necessary. Part of the problem
here is all the sub-ioctl WE foo. Clean that up by moving them into
cfg80211 as required, there's basically one user, wpa_supplicant (and
maybe hostapd), screw the others if there are any

While wpa_supplicant is certainly the main client for stuff directly
related to setting up a connection, there are quite a few other users of
general WE calls to pull information out of the card, or to receive scan
events.  So if you want maximum compatibility for a limited amount of
work, you can probably consider wpa_supplicant the only client of

(s = set, g = get)

1) [s|g] ENCODEEXT
2) [s|g] AUTH
3) [s|g] MLME
4) [s] RATE
5) [s] FREQ
6) [s] SENS
7) [s] AP
8) [s|g] RTS
9) [s|g] FRAG
10)[s|g] GENIE
11)[s|g] PMKSA

Notable exceptions:
1) [s|g] ENCODE
2) [s|g] MODE (other stuff turns on promiscuous mode)
3) [s|g] SCAN (other stuff needs to do this too)
4) [s|g] POWER (power management does this, not wpa_supplicant)

Of course lots of stuff needs to get RATE, ESSID, AP, FREQ, etc.

Dan

  - fix people's minds to not expect a perfect solution immediately but
accept something that can be expanded on later. I think we need to
accept some breakage in our development trees to get anywhere at all.
 
 Actually, the last point should be first.
 
 Enough rant from me for today,
 johannes



Re: [RFC] tcp: setsockopt congestion control autoload

2006-10-26 Thread Stephen Hemminger

Evgeniy Polyakov wrote:

On Wed, Oct 25, 2006 at 11:08:43AM -0700, Stephen Hemminger ([EMAIL PROTECTED]) 
wrote:
  

If user asks for a congestion control type with setsockopt() then it
may be available as a module not included in the kernel already. 
It should be autoloaded if needed.  This is done already when
the default selection is change with sysctl, but not when application
requests via sysctl.

Only reservation is are there any bad security implications from this?



What if system is badly configured, so it is possible to load malicious
module by kernel?

  
The kernel module loader has a fixed path. So one would have to be able
to create a module in /lib/modules/<kernel release> in order to get the
malicious code loaded.  If the intruder could put a module there, it would
be just as easy to patch an existing module and have the hack available
on reboot.


Re: [RFC] [PATCH 0/3] Add Regulatory Domain support to d80211

2006-10-26 Thread Johannes Berg
On Thu, 2006-10-26 at 10:35 -0400, Dan Williams wrote:

   - I'd like to see a header cleanup, it's necessary. Part of the problem
 here is all the sub-ioctl WE foo. Clean that up by moving them into
 cfg80211 as required, there's basically one user, wpa_supplicant (and
 maybe hostapd), screw the others if there are any

Oh, right, by sub-ioctl I was referring to the mess of the private
ioctls d80211 has for WE, including sub-items again.

 While wpa_supplicant is certainly the main client for stuff directly
 related to setting up a connection, there are quite a few other users of
 general WE calls to pull information out of the card, or to receive scan
 events. 

Of course.

 So if you want maximum compatibility for a limited amount of
 work, you can probably consider wpa_supplicant the only client of
 
 (s = set, g = get)
 
 1) [s|g] ENCODEEXT
 2) [s|g] AUTH
 3) [s|g] MLME
 4) [s] RATE
 5) [s] FREQ
 6) [s] SENS
 7) [s] AP
 8) [s|g] RTS
 9) [s|g] FRAG
 10)[s|g] GENIE
 11)[s|g] PMKSA

Sounds about right to me. I did actually intend to keep these intact but
drop the private ones with the 10xx sub-numbers.

 4) [s|g] POWER (power management does this, not wpa_supplicant)

Does it work for any card?

johannes


Re: [RFC] tcp: setsockopt congestion control autoload

2006-10-26 Thread Evgeniy Polyakov
On Thu, Oct 26, 2006 at 07:34:57AM -0700, Stephen Hemminger ([EMAIL PROTECTED]) 
wrote:
 Evgeniy Polyakov wrote:
 On Wed, Oct 25, 2006 at 11:08:43AM -0700, Stephen Hemminger 
 ([EMAIL PROTECTED]) wrote:
   
 If user asks for a congestion control type with setsockopt() then it
 may be available as a module not included in the kernel already. 
 It should be autoloaded if needed.  This is done already when
 the default selection is change with sysctl, but not when application
 requests via sysctl.
 
 Only reservation is are there any bad security implications from this?
 
 
 What if system is badly configured, so it is possible to load malicious
 module by kernel?
 
 The kernel module loader has a fixed path. So one would have to be able
 to create a module in /lib/modules/<kernel release> in order to get the
 malicious code loaded.  If the intruder could put a module there, it would
 be just as easy to patch an existing module and have the hack available
 on reboot.

It just calls /sbin/modprobe, which in turn runs tons of scripts in
/etc/hotplug, modprobe and other places...
In the paranoid case we should not allow any user to load kernel
modules, even known ones. Should this option be guarded by some
capability check?

-- 
Evgeniy Polyakov


Re: [RFC] [PATCH 0/3] Add Regulatory Domain support to d80211

2006-10-26 Thread Luis R. Rodriguez

On 10/26/06, Dan Williams [EMAIL PROTECTED] wrote:

While wpa_supplicant is certainly the main client for stuff directly
related to setting up a connection, there are quite a few other users of
general WE calls to pull information out of the card, or to receive scan
events.


How about we just ditch iwconfig completely and move on to
wpa_supplicant/wpa_cli as our next userspace application with
nl80211/cfg80211 as our new API for userspace-kernel communication?
As you point out, wpa_supplicant already does a lot for us -- and
several distributions already rely on it. Some work is required but I
think it's worth it. If we do a complete move from WE to nl80211 it
would be transparent to the users too.

 Luis


Re: [Announce] Netchannels ported to the latest git tree. Gigabit benchmark. Complete rout.

2006-10-26 Thread Evgeniy Polyakov
On Thu, Oct 26, 2006 at 03:44:37PM +0200, bert hubert ([EMAIL PROTECTED]) wrote:
 On Thu, Oct 26, 2006 at 02:51:51PM +0400, Evgeniy Polyakov wrote:
 
  Benchmark uses 128 bytes sending/receiving per syscall (no latency
  checks, only throughput.
  Receiving CPU usage is 3 times less (90% socket code vs. 30%
  Sending CPU usage is 5 times less (upto 50% vs. upto 10%).
 
 Wow. I currently lack the hardware to reproduce your measurements, do you
 have any idea of how these numbers would be with 1024 byte system calls?

Results are not that exciting in this case.
Receiving CPU usage is about the same: it steadily grows up to about 30-35%
(netchannel stops growing after some time at about 28%, socket slowly
continues), but netchannel's speed is lower.
This can be explained by the fact that unetstack uses C-coded
checksumming (the dumbest algorithm, I think) and an additional memory copy,
which becomes visible with big buffers (it can be eliminated though,
I will think about a better interface).
The same applies to sending - CPU usage is lower, but speed is lower
too. (10% vs. 8% compared to 30 MB/sec vs. 24 MB/sec).

So netchannels with userspace stack behave exactly the same in both 128
and 1024 byte write/read cases.

But all of this is just a drawback of the userspace stack, not of netchannels,
which do not have any protocol processing at all - it is just a queue
between the low-level driver and users - some kind of high performance scalable
packet socket with only selected addresses, or a tun/tap device.
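
For reference, this is roughly what a plain C checksum looks like. It is a minimal sketch of the standard Internet checksum as described in RFC 1071, and only an assumption about the kind of unoptimized routine meant above; it is not unetstack's actual code. The function name is made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Straightforward Internet checksum (RFC 1071) over a byte buffer.
 * The 32-bit accumulator is fine for packet-sized inputs. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	/* Add up 16-bit big-endian words. */
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];

	/* A trailing odd byte is padded with zero on the right. */
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

A byte-pair loop like this touches every byte of the payload once more, which is why its cost grows with the buffer size and only becomes visible with larger reads and writes.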

 Thanks.
 
 -- 
 http://www.PowerDNS.com  Open source, database driven DNS Software 
 http://netherlabs.nl  Open and Closed source services

-- 
Evgeniy Polyakov


Re: [RFC] tcp: setsockopt congestion control autoload

2006-10-26 Thread Stephen Hemminger
On Thu, 26 Oct 2006 18:57:13 +0400
Evgeniy Polyakov [EMAIL PROTECTED] wrote:

 On Thu, Oct 26, 2006 at 07:34:57AM -0700, Stephen Hemminger ([EMAIL PROTECTED]) wrote:
  Evgeniy Polyakov wrote:
  On Wed, Oct 25, 2006 at 11:08:43AM -0700, Stephen Hemminger ([EMAIL PROTECTED]) wrote:

  If a user asks for a congestion control type with setsockopt(), it
  may be available as a module not yet loaded into the kernel.
  It should be autoloaded if needed.  This is already done when
  the default selection is changed with sysctl, but not when an application
  requests it via setsockopt().
  
  The only reservation: are there any bad security implications from this?
  
  
  What if system is badly configured, so it is possible to load malicious
  module by kernel?
  
  The kernel module loader has a fixed path. So one would have to be able
  to create a module in /lib/modules/<kernel release> in order to get the
  malicious code loaded.  If the intruder could put a module there, it would
  be just as easy to patch an existing module and have the hack available
  on reboot.
 
 It just calls /sbin/modprobe, which in turn runs tons of scripts in
 /etc/hotplug, modprobe and other places...
 In the paranoid case we should not allow any user to load kernel
 modules, even known ones. Should this option be guarded by some
 capability check?
 

No capability check needed. Any additional paranoia belongs in /sbin/modprobe.

There seems to be lots of existing usage where a user can cause a module
to be loaded (see bin_fmt, xtables, etc).

-- 
Stephen Hemminger [EMAIL PROTECTED]


Re: [RFC] [PATCH 0/3] Add Regulatory Domain support to d80211

2006-10-26 Thread Dan Williams
On Thu, 2006-10-26 at 11:04 -0400, Luis R. Rodriguez wrote:
 On 10/26/06, Dan Williams [EMAIL PROTECTED] wrote:
  While wpa_supplicant is certainly the main client for stuff directly
  related to setting up a connection, there are quite a few other users of
  general WE calls to pull information out of the card, or to receive scan
  events.
 
 How about we just ditch iwconfig completely and move on to
 wpa_supplicant/wpa_cli as our next userspace application with
 nl80211/cfg80211 as our new API for userspace-kernel communication?
 As you point out, wpa_supplicant already does a lot for us -- and
 several distributions already rely on it. Some work is required but I
 think it's worth it. If we do a complete move from WE to nl80211 it
 would be transparent to the users too.

The one blocker I can think of here is startup scripts on various
distributions.  Most of those are shell, and they usually rely on
iwconfig quite heavily.  Getting those converted to wpa_supplicant
wouldn't be a trivial amount of work, but it wouldn't be a ton either.

Dan



Re: Network virtualization/isolation

2006-10-26 Thread Stephen Hemminger
On Thu, 26 Oct 2006 11:44:55 +0200
Daniel Lezcano [EMAIL PROTECTED] wrote:

 Stephen Hemminger wrote:
  On Wed, 25 Oct 2006 17:51:28 +0200
  Daniel Lezcano [EMAIL PROTECTED] wrote:
  
  
 Hi Stephen,
 
 Currently the work to bring container enablement into the kernel is 
 making good progress. The ipc, pid, utsname and filesystem system 
 resources are isolated/virtualized relying on the namespaces concept.
 
 But network virtualization/isolation is still missing. Two 
 approaches are proposed: doing the isolation at layer 2 or at 
 layer 3.
 
 The first one instantiates a network device per namespace and adds a peer 
 network device in the root namespace; all the routing resources are 
 relative to the namespace. This work is done by Andrey Savochkin from 
 the openvz project.
 
 The second relies on the routes and associates a network namespace 
 pointer with each route. When traffic is incoming, the packet 
 follows an input route and retrieves the associated network namespace. 
 When traffic is outgoing, the packet, identified by the network 
 namespace it is coming from, follows only the routes matching the same 
 network namespace. This work is done by me.
 
 IMHO, we need both approaches: layer 2 to be able to bring *very* 
 strong isolation for system containers, at a performance cost, and 
 layer 3 to have good isolation for lightweight or application 
 containers, when performance is more important.
 
 Do you have any suggestions? What is your point of view on that?
 
 Thanks in advance.
 
-- Daniel
  
  
  Any solution should allow both and it should build on the existing 
  netfilter infrastructure.
  
  
 
 The problem is that netfilter cannot give good isolation, e.g. how can 
 the netstat command be handled? Or how can we avoid seeing IP addresses 
 assigned to another container when running ifconfig? Furthermore, one of 
 the biggest interests of network isolation is to bring mobility to a 
 container, and that can only be done if the network resources inside the 
 kernel can be identified per container in order to checkpoint/restart them.
 
 The all-in-namespace solution, i.e. at layer 2, is very good in terms of 
 isolation but it adds a non-negligible overhead. Layer 3 isolation 
 has an insignificant overhead and good isolation, well suited to 
 application containers.
 
 Unfortunately, from the point of view of implementation, layer 3 cannot 
 be a subset of layer 2 isolation when using all-in-namespace, and layer 
 2 isolation cannot be an extension of layer 3 isolation.
 
 I think the layer 2 and layer 3 implementations can coexist. You 
 can for example create a system container with layer 2 isolation and 
 add layer 3 isolation inside it.
 
 Does that make sense?
 
   -- Daniel
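
To make the layer 3 idea in the quoted proposal concrete, here is a toy sketch of a route table where every route carries a namespace pointer. None of these types or names come from the actual patches; struct net_ns, struct route and route_lookup are hypothetical, and the lookup is a simple first-match scan rather than a real FIB.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a network namespace. */
struct net_ns {
	int id;
};

/* Hypothetical route entry tagged with the namespace that owns it. */
struct route {
	uint32_t dest;			/* network address */
	uint32_t mask;			/* netmask */
	const struct net_ns *ns;	/* owning namespace */
};

/*
 * First-match lookup. For outgoing traffic pass the sender's namespace,
 * so only routes of that namespace are considered. For incoming traffic
 * pass NULL; the namespace is then read from the route that matched.
 */
const struct route *route_lookup(const struct route *table, size_t n,
				 uint32_t daddr, const struct net_ns *ns)
{
	for (size_t i = 0; i < n; i++) {
		if ((daddr & table[i].mask) != table[i].dest)
			continue;
		if (ns && table[i].ns != ns)
			continue;
		return &table[i];
	}
	return NULL;
}
```

The point of the sketch is only that the namespace check is one extra comparison per candidate route, which fits the claim that the layer 3 approach adds little overhead.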

Assuming you are talking about pseudo-virtualized environments,
there are several different discussions.

1. How should the namespace be isolated for the virtualized, containerized
   applications?

2. How should traffic be restricted into/out of those containers? This
   is where existing netfilter, classification, etc., should be used.
   The network code is overly rich as it is; we don't need another
   abstraction.

3. Can the virtualized containers be secure? No. We really can't keep
   a hostile root in a container from killing the system without going to
   a hypervisor.





-- 
Stephen Hemminger [EMAIL PROTECTED]


Re: [PATCH 2.6.19-rc3 2/2] ehea: 64K page support fix

2006-10-26 Thread Jan-Bernd Themann
Hi,

that is right, I'll send a new patch

Thanks,
Jan-Bernd

On Wednesday 25 October 2006 18:21, Anton Blanchard wrote:
 
 Hi,
 
  +#ifdef CONFIG_PPC_64K_PAGES
  +   /* To support 64k pages we must round to 64k page boundary */
  +   epas->kernel.addr =
  +   ioremap((paddr_kernel & 0x), PAGE_SIZE) +
  +   (paddr_kernel & 0x);
  +#else
  epas->kernel.addr = ioremap(paddr_kernel, PAGE_SIZE);
  +#endif
 
 Can't you just use PAGE_MASK, ~PAGE_MASK and remove the ifdefs
 completely?
 
 Anton
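
A userspace sketch of the arithmetic behind that suggestion: split the physical address into a page-aligned base for the mapping call and an offset that is added back afterwards. The helper names and the page_shift parameter are made up so that both 4K and 64K pages can be tried; in the kernel, PAGE_MASK already encodes the configured page size, which is why the #ifdef would become unnecessary.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's PAGE_MASK, parameterised by
 * page shift (12 for 4K pages, 16 for 64K pages).
 */
static inline uint64_t page_mask(unsigned int page_shift)
{
	return ~(((uint64_t)1 << page_shift) - 1);
}

/* Page-aligned base, i.e. paddr & PAGE_MASK: what gets mapped. */
uint64_t page_base(uint64_t paddr, unsigned int page_shift)
{
	return paddr & page_mask(page_shift);
}

/* Offset inside the page, i.e. paddr & ~PAGE_MASK: added back later. */
uint64_t page_offset(uint64_t paddr, unsigned int page_shift)
{
	return paddr & ~page_mask(page_shift);
}
```

For any address, page_base() + page_offset() gives back the original value, whatever the page size.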
 


Re: [PATCH 2/2] usbnet: use MII hooks only if CONFIG_MII is enabled

2006-10-26 Thread Adrian Bunk
On Wed, Oct 25, 2006 at 04:58:58PM -0700, Randy Dunlap wrote:
...
 Build tested with CONFIG_MII=y, m, n.
...
 --- linux-2619-rc3-pv.orig/drivers/usb/net/usbnet.c
 +++ linux-2619-rc3-pv/drivers/usb/net/usbnet.c
 @@ -47,6 +47,12 @@
  
  #define DRIVER_VERSION   "22-Aug-2005"
  
 +#if defined(CONFIG_MII) || defined(CONFIG_MII_MODULE)
 +#define HAVE_MII 1
 +#else
 +#define HAVE_MII 0
 +#endif
...

I'm too lame to test it, but I bet this will break with
CONFIG_USB_USBNET=y, CONFIG_MII=m, and you'll actually need

  #if defined(CONFIG_MII) || (defined(CONFIG_MII_MODULE) && defined(MODULE))

And then there's the question whether this amount of #ifdef's is 
actually worth avoiding the select MII...

cu
Adrian

-- 

   Is there not promise of rain? Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   Only a promise, Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: [PATCH 2/2] usbnet: use MII hooks only if CONFIG_MII is enabled

2006-10-26 Thread Randy.Dunlap

Adrian Bunk wrote:

On Wed, Oct 25, 2006 at 04:58:58PM -0700, Randy Dunlap wrote:

...
Build tested with CONFIG_MII=y, m, n.
...
--- linux-2619-rc3-pv.orig/drivers/usb/net/usbnet.c
+++ linux-2619-rc3-pv/drivers/usb/net/usbnet.c
@@ -47,6 +47,12 @@
 
 #define DRIVER_VERSION		"22-Aug-2005"
 
+#if defined(CONFIG_MII) || defined(CONFIG_MII_MODULE)
+#define HAVE_MII   1
+#else
+#define HAVE_MII   0
+#endif
...


I'm too lame to test it, but I bet this will break with
CONFIG_USB_USBNET=y, CONFIG_MII=m, and you'll actually need

  #if defined(CONFIG_MII) || (defined(CONFIG_MII_MODULE) && defined(MODULE))

And then there's the question whether this amount of #ifdef's is 
actually worth avoiding the select MII...


Thanks, but that's OK, David posted a different patch for it.

--
~Randy


Re: [Bugme-new] [Bug 7421] New: Oops, EIP is at atalk_sendmsg

2006-10-26 Thread Andrew Morton
On Thu, 26 Oct 2006 04:08:36 -0700
[EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=7421
 
Summary: Oops, EIP is at atalk_sendmsg
 Kernel Version: 2.6.18.1
 Status: NEW
   Severity: normal
  Owner: [EMAIL PROTECTED]
  Submitter: [EMAIL PROTECTED]
 
 
 Distribution: Debian sarge
 Hardware Environment: i386
 
 Problem Description:
 
 Oct 26 10:01:03 localhost papd[3120]: restart (2.0.3)
 Oct 26 10:01:07 localhost kernel: BUG: unable to handle kernel NULL pointer \
 dereference at virtual address 
 Oct 26 10:01:07 localhost kernel:  printing eip:
 Oct 26 10:01:07 localhost kernel: d0c16a8a
 Oct 26 10:01:07 localhost kernel: *pde = 
 Oct 26 10:01:07 localhost kernel: Oops:  [#1]
 Oct 26 10:01:07 localhost kernel: Modules linked in: appletalk psnap llc ipv6 \
 pcmcia_core af_packet parport_pc parport floppy pcspkr snd_maestro3 snd_ac97_codec \
 snd_ac97_bus snd_pcm snd_timer snd_page_alloc snd soundcore intel_agp uhci_hcd \
 usbcore 3c59x mii agpgart mousedev tsdev joydev psmouse ide_cd cdrom rtc reiserfs \
 ext3 jbd ide_disk ide_generic siimage aec62xx trm290 alim15x3 hpt34x hpt366 cmd64x \
 piix rz1000 slc90e66 generic cs5530 cs5520 sc1200 triflex atiixp pdc202xx_old \
 pdc202xx_new opti621 ns87415 cy82c693 amd74xx sis5513 via82cxxx serverworks ide_core \
 unix
 Oct 26 10:01:07 localhost kernel: CPU:0
 Oct 26 10:01:07 localhost kernel: EIP:0060:[pg0+277633674/1070257152]
 Not \
 tainted VLI
 Oct 26 10:01:07 localhost kernel: EFLAGS: 00010286   (2.6.17.14.2006-10-25 
 #1) 
 Oct 26 10:01:07 localhost kernel: EIP is at atalk_sendmsg+0x15b/0x4e4 
 [appletalk]
 Oct 26 10:01:07 localhost kernel: eax:    ebx: 002f   ecx: 
    \
 edx: 
 Oct 26 10:01:07 localhost kernel: esi: cadcb600   edi:    ebp: 
 cc9d7eec   \
 esp: cc9d7d6c
 Oct 26 10:01:07 localhost kernel: ds: 007b   es: 007b   ss: 0068
 Oct 26 10:01:07 localhost kernel: Process afpd (pid: 3118, 
 threadinfo=cc9d6000 \
 task=cfe205d0)
 Oct 26 10:01:07 localhost kernel: Stack:  c02b32c0  cc9d7ee8
 cffbc500 \
  d0c16f05 cffbc500 
 Oct 26 10:01:07 localhost kernel:cffbc500 cc9d7ec8 cadcb600 
  \
 0400 cc9d7f48 001b 
 Oct 26 10:01:07 localhost kernel:cc9d7ec8 cc9d7e1c cc9d7ee8 c01fe97a
 cc9d7e1c \
 ca252600 cc9d7ec8 001b 
 Oct 26 10:01:07 localhost kernel: Call Trace:
 Oct 26 10:01:07 localhost kernel:  d0c16f05 atalk_recvmsg+0xf2/0x105
 [appletalk]  \
 c01fe97a sock_sendmsg+0xd0/0xeb
 Oct 26 10:01:07 localhost kernel:  c0157bfd touch_atime+0xb4/0xbb  
 c0198b22 \
 copy_from_user+0x34/0x5a
 Oct 26 10:01:07 localhost kernel:  c012383e 
 autoremove_wake_function+0x0/0x3a  \
 c0198b22 copy_from_user+0x34/0x5a
 Oct 26 10:01:07 localhost kernel:  c01fe490 move_addr_to_kernel+0x24/0x39  \
 c01ffaaa sys_sendto+0xe9/0x10d
 Oct 26 10:01:07 localhost kernel:  c01fe67e sock_attach_fd+0x72/0xd2  
 c0143d52 \
 get_empty_filp+0x3b/0xe4
 Oct 26 10:01:07 localhost kernel:  c0143d7b get_empty_filp+0x64/0xe4  
 c0198ae4 \
 copy_to_user+0x32/0x3c
 Oct 26 10:01:07 localhost kernel:  c02001de sys_socketcall+0xf2/0x180 
 c0102a03 \
 syscall_call+0x7/0xb
 Oct 26 10:01:07 localhost kernel: Code: 0c 83 c0 04 eb 15 c6 44 24 1a 00 0f b7
 86 26 \
 01 00 00 66 89 44 24 18 8d 44 24 18 50 e8 e0 eb ff  ff 89 44 24 04 85 f6 5d 8b
 14 24 \
 8b 12 89 54 24 04 74 1b 8b 86 84 00 00 00 f6 c4 04 74 10 52 
 53 
 Oct 26 10:01:07 localhost kernel: EIP: [pg0+277633674/1070257152] \
 atalk_sendmsg+0x15b/0x4e4 [appletalk] SS:ESP 0068:cc9d7d6c
 Oct 26 10:01:21 localhost atalkd[3106]: as_timer gateway 8000.100 down
 
 
 
 Steps to reproduce:
 restart the machine, start papd after network initializing has finished
 a second start of papd works fine
 
 appletalk is loaded as a module
 
 same behaviour with 2.6.17.14
 
 --- You are receiving this mail because: ---
 You are on the CC list for the bug, or are watching someone who is.


Re: [RFC] tcp: setsockopt congestion control autoload

2006-10-26 Thread Patrick McHardy
Stephen Hemminger wrote:
 No capability check needed. Any additional paranoia belongs in /sbin/modprobe.
 
 There seems to be lots of existing usage where a user can cause a module
 to be loaded (see bin_fmt, xtables, etc).


x_tables is restricted to CAP_NET_ADMIN, but in net/ alone we have
__sock_create (loads protocol families), sock_ioctl (loads bridge,
vlan or dlci), the already mentioned netlink case, inet_create
(loads IP protocols), inet6_create (similar to inet_create), and
a few more.



[SOFTMAC] - level of verbosity

2006-10-26 Thread neologix
Hi.
I'd just like to know whether it is possible to reduce the verbosity level of
softmac.
My wireless is working fine, but my logs are polluted by:

/* LOG */

printk: 20 messages suppressed.
SoftMAC: Received deauthentication packet from 00:0a:79:52:84:6c, but that
network is unknown.
SoftMAC: Received deauthentication packet from 00:0a:79:52:84:6c, but that
network is unknown.
SoftMAC: Received deauthentication packet from 00:0a:79:52:84:6c, but that
network is unknown.
SoftMAC: Received deauthentication packet from 00:0a:79:52:84:6c, but that
network is unknown.
SoftMAC: Received deauthentication packet from 00:0a:79:52:84:6c, but that
network is unknown.
SoftMAC: Received deauthentication packet from 00:0a:79:52:84:6c, but that
network is unknown.
SoftMAC: Received deauthentication packet from 00:14:bf:03:40:68, but that
network is unknown.
SoftMAC: Received deauthentication packet from 00:14:bf:03:40:68, but that
network is unknown.
SoftMAC: Received deauthentication packet from 00:14:bf:03:40:68, but that
network is unknown.
SoftMAC: Received deauthentication packet from 00:14:bf:03:40:68, but that
network is unknown.
SoftMAC: Received deauthentication packet from 00:14:bf:03:40:68, but that
network is unknown.
SoftMAC: Received deauthentication packet from 00:14:bf:03:40:68, but that
network is unknown.
SoftMAC: Authentication response received from 00:14:bf:03:40:68 but no queue
item exists.
SoftMAC: Authentication response received from 00:14:bf:03:40:68 but no queue
item exists.
SoftMAC: Authentication response received from 00:14:bf:03:40:68 but no queue
item exists.
SoftMAC: Authentication response received from 00:14:bf:03:40:68 but no queue
item exists.
SoftMAC: Authentication response received from 00:0f:66:b9:3d:4c but no queue
item exists.
SoftMAC: Authentication response received from 00:0f:66:b9:3d:4c but no queue
item exists.
SoftMAC: Authentication response received from 00:0f:66:b9:3d:4c but no queue
item exists.
SoftMAC: Authentication response received from 00:0f:66:b9:3d:4c but no queue
item exists.
SoftMAC: Authentication response received from 00:0f:66:b9:3d:4c but no queue
item exists.
SoftMAC: Received deauthentication packet from 00:0a:79:52:84:6c, but that
network is unknown.

/* LOG */

It's kinda polluting the interesting parts of logs, and furthermore, when it's
actually written to the file, it provokes regular disk activity, which is
really annoying.

I built my kernel without the softmac debugging option:

$ cat .config | grep SOFTMAC
CONFIG_IEEE80211_SOFTMAC=y
# CONFIG_IEEE80211_SOFTMAC_DEBUG is not set

but it's still really talkative :-)

So, is there any way to get rid of this?

Anyway, thanks for your work,

cheers,

cf





Re: [RFC] tcp: setsockopt congestion control autoload

2006-10-26 Thread John Heffner
My reservation in doing this would be that as an administrator, I may 
want to choose exactly what congestion control is available at any 
given time.  The different congestion control algorithms are not 
necessarily fair to each other.


If the modules are autoloaded, I could still enforce this by moving the 
modules out of /lib/modules, but I think it's cleaner to do it by 
loading/unloading modules as appropriate.


  -John


Stephen Hemminger wrote:

If a user asks for a congestion control type with setsockopt(), it
may be available as a module not yet loaded into the kernel.
It should be autoloaded if needed.  This is already done when
the default selection is changed with sysctl, but not when an application
requests it via setsockopt().

The only reservation: are there any bad security implications from this?

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- orig/net/ipv4/tcp_cong.c	2006-10-25 13:55:34.0 -0700
+++ new/net/ipv4/tcp_cong.c	2006-10-25 13:58:39.0 -0700
@@ -153,9 +153,19 @@
 
 	rcu_read_lock();
 	ca = tcp_ca_find(name);
+	/* no change asking for existing value */
 	if (ca == icsk->icsk_ca_ops)
 		goto out;
 
+#ifdef CONFIG_KMOD
+	/* not found attempt to autoload module */
+	if (!ca) {
+		rcu_read_unlock();
+		request_module("tcp_%s", name);
+		rcu_read_lock();
+		ca = tcp_ca_find(name);
+	}
+#endif
 	if (!ca)
 		err = -ENOENT;
 


Re: TCP congestion graphs

2006-10-26 Thread Hagen Paul Pfeifer
Hi Stephen,

is your rt-patch to netem publicly available?

Best regards

HGN


-- 
Signed and/or encrypted mails preferd. Key-Id = 0x98350C22
Fingerprint = 490F 557B 6C48 6D7E 5706  2EA2 4A22 8D45 9835 0C22 
Key available under: www.jauu.net/download/gnupg_key 


[Patch] kmemdup() cleanup in net/

2006-10-26 Thread Eric Sesterhenn
hi,

replace open coded kmemdup() to save some screen space,
and allow inlining/not inlining to be triggered by gcc.
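
For readers not familiar with the helper, a userspace sketch of the allocate-then-copy pattern that kmemdup() folds into one call. This is an analogue only: the kernel version also takes a gfp_t allocation flag, and the name memdup_sketch is made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Userspace analogue of kmemdup(): allocate len bytes and copy src into
 * them. Returns NULL if the allocation fails; the caller owns the
 * result and must free() it.
 */
void *memdup_sketch(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}
```

Each conversion in the patch below replaces a kmalloc() and a later memcpy() with one such call, keeping the existing NULL check.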

Signed-off-by: Eric Sesterhenn [EMAIL PROTECTED]

--- linux-2.6.19-rc3-git1/net/atm/lec.c.orig	2006-10-26 20:21:48.0 +0200
+++ linux-2.6.19-rc3-git1/net/atm/lec.c	2006-10-26 20:23:28.0 +0200
@@ -1321,11 +1321,10 @@ static int lane2_resolve(struct net_devi
 		if (table == NULL)
 			return -1;
 
-		*tlvs = kmalloc(table->sizeoftlvs, GFP_ATOMIC);
+		*tlvs = kmemdup(table->tlvs, table->sizeoftlvs, GFP_ATOMIC);
 		if (*tlvs == NULL)
 			return -1;
 
-		memcpy(*tlvs, table->tlvs, table->sizeoftlvs);
 		*sizeoftlvs = table->sizeoftlvs;
 
 		return 0;
@@ -1364,11 +1363,10 @@ static int lane2_associate_req(struct ne
 
 	kfree(priv->tlvs);	/* NULL if there was no previous association */
 
-	priv->tlvs = kmalloc(sizeoftlvs, GFP_KERNEL);
+	priv->tlvs = kmemdup(tlvs, sizeoftlvs, GFP_KERNEL);
 	if (priv->tlvs == NULL)
 		return (0);
 	priv->sizeoftlvs = sizeoftlvs;
-	memcpy(priv->tlvs, tlvs, sizeoftlvs);
 
 	skb = alloc_skb(sizeoftlvs, GFP_ATOMIC);
 	if (skb == NULL)
--- linux-2.6.19-rc3-git1/net/ax25/ax25_out.c.orig	2006-10-26 20:23:59.0 +0200
+++ linux-2.6.19-rc3-git1/net/ax25/ax25_out.c	2006-10-26 20:24:15.0 +0200
@@ -70,11 +70,10 @@ ax25_cb *ax25_send_frame(struct sk_buff 
 	ax25->dest_addr   = *dest;
 
 	if (digi != NULL) {
-		if ((ax25->digipeat = kmalloc(sizeof(ax25_digi), GFP_ATOMIC)) == NULL) {
+		if ((ax25->digipeat = kmemdup(digi, sizeof(ax25_digi), GFP_ATOMIC)) == NULL) {
 			ax25_cb_put(ax25);
 			return NULL;
 		}
-		memcpy(ax25->digipeat, digi, sizeof(ax25_digi));
 	}
 
 	switch (ax25->ax25_dev->values[AX25_VALUES_PROTOCOL]) {
--- linux-2.6.19-rc3-git1/net/ax25/ax25_route.c.orig	2006-10-26 20:24:23.0 +0200
+++ linux-2.6.19-rc3-git1/net/ax25/ax25_route.c	2006-10-26 20:24:50.0 +0200
@@ -432,11 +432,11 @@ int ax25_rt_autobind(ax25_cb *ax25, ax25
 	}
 
 	if (ax25_rt->digipeat != NULL) {
-		if ((ax25->digipeat = kmalloc(sizeof(ax25_digi), GFP_ATOMIC)) == NULL) {
+		if ((ax25->digipeat = kmemdup(ax25_rt->digipeat,
+						sizeof(ax25_digi), GFP_ATOMIC)) == NULL) {
 			err = -ENOMEM;
 			goto put;
 		}
-		memcpy(ax25->digipeat, ax25_rt->digipeat, sizeof(ax25_digi));
 		ax25_adjust_path(addr, ax25->digipeat);
 	}
 
--- linux-2.6.19-rc3-git1/net/core/neighbour.c.orig	2006-10-26 20:25:20.0 +0200
+++ linux-2.6.19-rc3-git1/net/core/neighbour.c	2006-10-26 20:25:52.0 +0200
@@ -1266,10 +1266,9 @@ void pneigh_enqueue(struct neigh_table *
 struct neigh_parms *neigh_parms_alloc(struct net_device *dev,
 				      struct neigh_table *tbl)
 {
-	struct neigh_parms *p = kmalloc(sizeof(*p), GFP_KERNEL);
+	struct neigh_parms *p = kmemdup(&tbl->parms, sizeof(*p), GFP_KERNEL);
 
 	if (p) {
-		memcpy(p, &tbl->parms, sizeof(*p));
 		p->tbl = tbl;
 		atomic_set(&p->refcnt, 1);
 		INIT_RCU_HEAD(&p->rcu_head);
--- linux-2.6.19-rc3-git1/net/dccp/feat.c.orig	2006-10-26 20:26:12.0 +0200
+++ linux-2.6.19-rc3-git1/net/dccp/feat.c	2006-10-26 20:27:26.0 +0200
@@ -279,12 +279,11 @@ static int dccp_feat_nn(struct sock *sk,
 	if (opt == NULL)
 		return -ENOMEM;
 
-	copy = kmalloc(len, GFP_ATOMIC);
+	copy = kmemdup(val, len, GFP_ATOMIC);
 	if (copy == NULL) {
 		kfree(opt);
 		return -ENOMEM;
 	}
-	memcpy(copy, val, len);
 
 	opt->dccpop_type = DCCPO_CONFIRM_R; /* NN can only confirm R */
 	opt->dccpop_feat = feature;
@@ -501,20 +500,18 @@ int dccp_feat_clone(struct sock *oldsk, 
 	list_for_each_entry(opt, &olddmsk->dccpms_pending, dccpop_node) {
 		struct dccp_opt_pend *newopt;
 		/* copy the value of the option */
-		u8 *val = kmalloc(opt->dccpop_len, GFP_ATOMIC);
+		u8 *val = kmemdup(opt->dccpop_val, opt->dccpop_len, GFP_ATOMIC);
 
 		if (val == NULL)
 			goto out_clean;
-		memcpy(val, opt->dccpop_val, opt->dccpop_len);
 
-		newopt = kmalloc(sizeof(*newopt), GFP_ATOMIC);
+		newopt = kmemdup(opt, sizeof(*newopt), GFP_ATOMIC);
 		if (newopt == NULL) {
 			kfree(val);
 			goto out_clean;
 		}
 
 		/* insert the option */
-		memcpy(newopt, opt, sizeof(*newopt));
 		newopt->dccpop_val = val;

[PATCH] Rewrite e100_phys_id

2006-10-26 Thread Matthew Wilcox

The motivator for this was to fix the sparse warning:

drivers/net/e100.c:2418:48: warning: cast truncates bits from constant
value (83126e978d4fdf becomes 978d4fdf)
drivers/net/e100.c:2419:37: warning: cast truncates bits from constant
value (83126e978d4fdf becomes 978d4fdf)

Initially, I tried a quick fix, but when it ran into difficulties, I
looked at tg3.c to see how it does it.  I liked their way better, so I
rewrote e100.c to be similar.  It shaves ~700 bytes off the size of the
driver, and a few bytes off the size of struct nic, so I think it's a
win all round.  Tested on the internal interface of an HP Integrity rx2600.
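
To illustrate what sparse is complaining about, here is a small userspace sketch. It assumes a 64-bit build where MAX_SCHEDULE_TIMEOUT equals LONG_MAX and HZ is 250; the HZ value is inferred from the constant in the warning, not stated anywhere in this mail, and the macro and function names are made up.

```c
#include <stdint.h>

/* Stand-in for the kernel's MAX_SCHEDULE_TIMEOUT on a 64-bit build. */
#define SKETCH_MAX_SCHEDULE_TIMEOUT INT64_MAX

/* The clamp value the old code computed, before any cast. */
int64_t max_seconds(int64_t hz)
{
	return SKETCH_MAX_SCHEDULE_TIMEOUT / hz;
}

/* What is left of it once it is cast to a 32-bit value. */
uint32_t max_seconds_as_u32(int64_t hz)
{
	return (uint32_t)max_seconds(hz);
}
```

With hz = 250 the first function yields 0x83126e978d4fdf and the second 0x978d4fdf, the two values named in the warning, so the old upper bound on the blink time was effectively arbitrary on 64-bit machines.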

Signed-off-by: Matthew Wilcox [EMAIL PROTECTED]

diff --git a/drivers/net/e100.c b/drivers/net/e100.c
index a3a08a5..aade1e9 100644
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -556,7 +556,6 @@ struct nic {
 	struct params params;
 	struct net_device_stats net_stats;
 	struct timer_list watchdog;
-	struct timer_list blink_timer;
 	struct mii_if_info mii;
 	struct work_struct tx_timeout_task;
 	enum loopback loopback;
@@ -581,7 +580,6 @@ struct nic {
 	u32 rx_over_length_errors;
 
 	u8 rev_id;
-	u16 leds;
 	u16 eeprom_wc;
 	u16 eeprom[256];
 	spinlock_t mdio_lock;
@@ -2168,23 +2166,6 @@ err_clean_rx:
 	return err;
 }
 
-#define MII_LED_CONTROL	0x1B
-static void e100_blink_led(unsigned long data)
-{
-	struct nic *nic = (struct nic *)data;
-	enum led_state {
-		led_on     = 0x01,
-		led_off    = 0x04,
-		led_on_559 = 0x05,
-		led_on_557 = 0x07,
-	};
-
-	nic->leds = (nic->leds & led_on) ? led_off :
-		(nic->mac < mac_82559_D101M) ? led_on_557 : led_on_559;
-	mdio_write(nic->netdev, nic->mii.phy_id, MII_LED_CONTROL, nic->leds);
-	mod_timer(&nic->blink_timer, jiffies + HZ / 4);
-}
-
 static int e100_get_settings(struct net_device *netdev, struct ethtool_cmd *cmd)
 {
 	struct nic *nic = netdev_priv(netdev);
@@ -2411,16 +2392,32 @@ static void e100_diag_test(struct net_de
 	msleep_interruptible(4 * 1000);
 }
 
+#define MII_LED_CONTROL	0x1B
 static int e100_phys_id(struct net_device *netdev, u32 data)
 {
 	struct nic *nic = netdev_priv(netdev);
+	int i;
+
+	enum led_state {
+		led_off    = 0x04,
+		led_on_559 = 0x05,
+		led_on_557 = 0x07,
+	};
+	u16 leds = led_off;
+
+	if (data == 0)
+		data = 2;
+
+	for (i = 0; i < (data * 2); i++) {
+		leds = (leds == led_off) ?
+			(nic->mac < mac_82559_D101M) ? led_on_557 : led_on_559 :
+			led_off;
+		mdio_write(nic->netdev, nic->mii.phy_id, MII_LED_CONTROL, leds);
+		if (msleep_interruptible(500))
+			break;
+	}
 
-	if(!data || data > (u32)(MAX_SCHEDULE_TIMEOUT / HZ))
-		data = (u32)(MAX_SCHEDULE_TIMEOUT / HZ);
-	mod_timer(&nic->blink_timer, jiffies);
-	msleep_interruptible(data * 1000);
-	del_timer_sync(&nic->blink_timer);
-	mdio_write(netdev, nic->mii.phy_id, MII_LED_CONTROL, 0);
+	mdio_write(netdev, nic->mii.phy_id, MII_LED_CONTROL, led_off);
 
 	return 0;
 }
@@ -2633,9 +2630,6 @@ #endif
 	init_timer(&nic->watchdog);
 	nic->watchdog.function = e100_watchdog;
 	nic->watchdog.data = (unsigned long)nic;
-	init_timer(&nic->blink_timer);
-	nic->blink_timer.function = e100_blink_led;
-	nic->blink_timer.data = (unsigned long)nic;
 
 	INIT_WORK(&nic->tx_timeout_task,
 		(void (*)(void *))e100_tx_timeout_task, netdev);


Re: [PATCH] Rewrite e100_phys_id

2006-10-26 Thread Jeff Garzik
On Thu, Oct 26, 2006 at 01:11:55PM -0600, Matthew Wilcox wrote:
 
 The motivator for this was to fix the sparse warning:
 
 drivers/net/e100.c:2418:48: warning: cast truncates bits from constant
 value (83126e978d4fdf becomes 978d4fdf)
 drivers/net/e100.c:2419:37: warning: cast truncates bits from constant
 value (83126e978d4fdf becomes 978d4fdf)
 
 Initially, I tried a quick fix, but when it ran into difficulties, I
 looked at tg3.c to see how it does it.  I liked their way better, so I
 rewrote e100.c to be similar.  It shaves ~700 bytes off the size of the
 driver, and a few bytes off the size of struct nic, so I think it's a
 win all round.  Tested on the internal interface of an HP Integrity rx2600.
 
 Signed-off-by: Matthew Wilcox [EMAIL PROTECTED]

Seems sane to me...  I'll pick it up, if Auke doesn't...

Jeff





Re: TCP congestion graphs

2006-10-26 Thread Stephen Hemminger
On Thu, 26 Oct 2006 20:50:19 +0200
Hagen Paul Pfeifer [EMAIL PROTECTED] wrote:

 Hi Stephen,
 
 is your rt-patch to netem publicly available?
 
 Best regards
 
 HGN
 

The tools are in the tcp directory

http://developer.osdl.org/shemminger/tcp/netem-2.6.18-rt.patch


-- 
Stephen Hemminger [EMAIL PROTECTED]


Please pull bcm43xx-d80211 features and bugfixes

2006-10-26 Thread Michael Buesch
Hi John,

Please pull latest bcm43xx-d80211 features and bugfixes.

git pull http://bu3sch.de/git/wireless-dev.git for-linville

This will introduce hardware encryption for everything
but TKIP on v4 firmware. For v3 firmware or TKIP it will
transparently fall back to software encryption.
It also fixes a bug that caused high CPU usage due to an IRQ
that never stopped triggering.


  bcm43xx-d80211: Fix runaway IRQ which caused high CPU usage.
  bcm43xx-d80211: Rename IRQs
  bcm43xx-d80211: Fix hardware based encryption for v4 firmware.
  bcm43xx-d80211: Use software encryption for TKIP for now.
  bcm43xx-d80211: No support for hw encryption with v3 firmware. Various hwenc fixes.
  bcm43xx-d80211: Only set USEDEFKEYS hostflag for WEP.

 drivers/net/wireless/d80211/bcm43xx/bcm43xx.h  |  103 +++-
 drivers/net/wireless/d80211/bcm43xx/bcm43xx_main.c |  488 +---
 drivers/net/wireless/d80211/bcm43xx/bcm43xx_main.h |3 
 drivers/net/wireless/d80211/bcm43xx/bcm43xx_xmit.c |   93 +++-
 drivers/net/wireless/d80211/bcm43xx/bcm43xx_xmit.h |2 
 5 files changed, 461 insertions(+), 228 deletions(-)

-- 
Greetings Michael.


Re: [PATCH] Rewrite e100_phys_id

2006-10-26 Thread Auke Kok

Jeff Garzik wrote:

On Thu, Oct 26, 2006 at 01:11:55PM -0600, Matthew Wilcox wrote:

The motivator for this was to fix the sparse warning:

drivers/net/e100.c:2418:48: warning: cast truncates bits from constant
value (83126e978d4fdf becomes 978d4fdf)
drivers/net/e100.c:2419:37: warning: cast truncates bits from constant
value (83126e978d4fdf becomes 978d4fdf)

Initially, I tried a quick fix, but when it ran into difficulties, I
looked at tg3.c to see how it does it.  I liked their way better, so I
rewrote e100.c to be similar.  It shaves ~700 bytes off the size of the
driver, and a few bytes off the size of struct nic, so I think it's a
win all round.  Tested on the internal interface of an HP Integrity rx2600.

Signed-off-by: Matthew Wilcox [EMAIL PROTECTED]


Seems sane to me...  I'll pick it up, if Auke doesn't...


No objections, so I'll ACK it, with the note that I'm going to have our labs do some 
more testing on it with all the latest changes.


Jeff, I will stack it on the patches I have for 2.6.20 and push those out before the 
weekend.


Cheers,

Auke


Re: [Bugme-new] [Bug 7421] New: Oops, EIP is at atalk_sendmsg

2006-10-26 Thread Krzysztof Oledzki



On Thu, 26 Oct 2006, Andrew Morton wrote:


On Thu, 26 Oct 2006 04:08:36 -0700
[EMAIL PROTECTED] wrote:


http://bugzilla.kernel.org/show_bug.cgi?id=7421

   Summary: Oops, EIP is at atalk_sendmsg
Kernel Version: 2.6.18.1
Status: NEW
  Severity: normal
 Owner: [EMAIL PROTECTED]
 Submitter: [EMAIL PROTECTED]


Distribution: Debian sarge
Hardware Environment: i386

Problem Description:

Oct 26 10:01:03 localhost papd[3120]: restart (2.0.3)
Oct 26 10:01:07 localhost kernel: BUG: unable to handle kernel NULL pointer \
dereference at virtual address 
Oct 26 10:01:07 localhost kernel:  printing eip:
Oct 26 10:01:07 localhost kernel: d0c16a8a
Oct 26 10:01:07 localhost kernel: *pde = 
Oct 26 10:01:07 localhost kernel: Oops:  [#1]
Oct 26 10:01:07 localhost kernel: Modules linked in: appletalk psnap llc ipv6 \
pcmcia_core af_packet parport_pc parport floppy pcspkr snd_maestro3 snd_ac97_codec \
snd_ac97_bus snd_pcm snd_timer snd_page_alloc snd soundcore intel_agp uhci_hcd \
usbcore 3c59x mii agpgart mousedev tsdev joydev psmouse ide_cd cdrom rtc reiserfs \
ext3 jbd ide_disk ide_generic siimage aec62xx trm290 alim15x3 hpt34x hpt366 cmd64x \
piix rz1000 slc90e66 generic cs5530 cs5520 sc1200 triflex atiixp pdc202xx_old \
pdc202xx_new opti621 ns87415 cy82c693 amd74xx sis5513 via82cxxx serverworks ide_core \
unix
Oct 26 10:01:07 localhost kernel: CPU:0
Oct 26 10:01:07 localhost kernel: EIP:0060:[pg0+277633674/1070257152]
Not \
tainted VLI
Oct 26 10:01:07 localhost kernel: EFLAGS: 00010286   (2.6.17.14.2006-10-25 #1)
Oct 26 10:01:07 localhost kernel: EIP is at atalk_sendmsg+0x15b/0x4e4 
[appletalk]
Oct 26 10:01:07 localhost kernel: eax:    ebx: 002f   ecx:  
  \
edx: 
Oct 26 10:01:07 localhost kernel: esi: cadcb600   edi:    ebp: cc9d7eec 
  \
esp: cc9d7d6c
Oct 26 10:01:07 localhost kernel: ds: 007b   es: 007b   ss: 0068
Oct 26 10:01:07 localhost kernel: Process afpd (pid: 3118, threadinfo=cc9d6000 \
task=cfe205d0)
Oct 26 10:01:07 localhost kernel: Stack:  c02b32c0  cc9d7ee8
cffbc500 \
 d0c16f05 cffbc500
Oct 26 10:01:07 localhost kernel:cffbc500 cc9d7ec8 cadcb600 
 \
0400 cc9d7f48 001b
Oct 26 10:01:07 localhost kernel:cc9d7ec8 cc9d7e1c cc9d7ee8 c01fe97a
cc9d7e1c \
ca252600 cc9d7ec8 001b
Oct 26 10:01:07 localhost kernel: Call Trace:
Oct 26 10:01:07 localhost kernel:  d0c16f05 atalk_recvmsg+0xf2/0x105
[appletalk]  \
c01fe97a sock_sendmsg+0xd0/0xeb
Oct 26 10:01:07 localhost kernel:  c0157bfd touch_atime+0xb4/0xbb  c0198b22 
\
copy_from_user+0x34/0x5a
Oct 26 10:01:07 localhost kernel:  c012383e autoremove_wake_function+0x0/0x3a 
 \
c0198b22 copy_from_user+0x34/0x5a
Oct 26 10:01:07 localhost kernel:  c01fe490 move_addr_to_kernel+0x24/0x39  \
c01ffaaa sys_sendto+0xe9/0x10d
Oct 26 10:01:07 localhost kernel:  c01fe67e sock_attach_fd+0x72/0xd2  
c0143d52 \
get_empty_filp+0x3b/0xe4
Oct 26 10:01:07 localhost kernel:  c0143d7b get_empty_filp+0x64/0xe4  
c0198ae4 \
copy_to_user+0x32/0x3c
Oct 26 10:01:07 localhost kernel:  c02001de sys_socketcall+0xf2/0x180
c0102a03 \
syscall_call+0x7/0xb
Oct 26 10:01:07 localhost kernel: Code: 0c 83 c0 04 eb 15 c6 44 24 1a 00 0f b7
86 26 \
01 00 00 66 89 44 24 18 8d 44 24 18 50 e8 e0 eb ff  ff 89 44 24 04 85 f6 5d 8b
14 24 \
8b 12 89 54 24 04 74 1b 8b 86 84 00 00 00 f6 c4 04 74 10 52 53
Oct 26 10:01:07 localhost kernel: EIP: [pg0+277633674/1070257152] \
atalk_sendmsg+0x15b/0x4e4 [appletalk] SS:ESP 0068:cc9d7d6c
Oct 26 10:01:21 localhost atalkd[3106]: as_timer gateway 8000.100 down



Steps to reproduce:
restart the machine, start papd after network initializing has finished
a second start of papd works fine

appletalk is loaded as a module

same behaviour with 2.6.17.14


Something like me too:

Unable to handle kernel NULL pointer dereference at virtual address 
 printing eip:
c036b1ef
*pde = 
Oops:  [#1]
PREEMPT
Modules linked in: bonding
CPU:0
EIP:0060:[c036b1ef]Not tainted VLI
EFLAGS: 00010286   (2.6.15.1)
EIP is at atalk_sendmsg+0x158/0x557
eax: d468fee4   ebx: 0017   ecx: d468fd20   edx: 
esi:    edi: d7e88200   ebp: bfa7c480   esp: d468fd68
ds: 007b   es: 007b   ss: 0068
Process atalkd (pid: 551, threadinfo=d468e000 task=d6f55090)
Stack:  d468ff40  d468fee0 d70d20a0 0003 c036b6e0 d70d20a0
   d70d20a0 d468fec0 d7e88200   0400 d468ff40 0003
   d468fec0 d468fe18 bfa7c480 c02e2d5e d468fe18 d7194540 d468fec0 0003
Call Trace:
 [c036b6e0] atalk_recvmsg+0xf2/0x105
 [c02e2d5e] sock_sendmsg+0xce/0xe9
 [c01212c2] 

[IPROUTE] manpage for lnstat

2006-10-26 Thread Michael Prokop
Hello,

I wrote a manpage for lnstat, would be great if it could be applied
to the next release.

@Harald, I'm following your 'If somebody wants to do a manpage, feel
free to send me a patch :)' of lnstat's README. :)

regards,
-mika-
.TH LNSTAT 1
.SH NAME
lnstat \- unified linux network statistics
.SH SYNOPSIS
.B lnstat
.RI [ options ]
.SH DESCRIPTION
This manual page documents briefly the
.B lnstat
command.
.PP
\fBlnstat\fP is a generalized and more feature-complete replacement for the old 
rtstat program.
In addition to routing cache statistics, it supports any kind of statistics the 
linux kernel
exports via a file in /proc/net/stat/.
.SH OPTIONS
This program follows the usual GNU command line syntax, with long
options starting with two dashes (`\-\-').
lnstat supports the following options.
.TP
.B \-h, \-\-help
Show summary of options.
.TP
.B \-V, \-\-version
Show version of program.
.TP
.B \-c, \-\-count count
Print count number of intervals.
.TP
.B \-d, \-\-dump
Dump list of available files/keys.
.TP
.B \-f, \-\-file file
Statistics file to use.
.TP
.B \-i, \-\-interval intv
Set interval to 'intv' seconds.
.TP
.B \-k, \-\-keys k,k,k,...
Display only keys specified.
.TP
.B \-s, \-\-subject [0-2]
Specify display of subject/header. '0' means no header at all, '1' prints a 
header only at start of the program and '2' prints a header every 20 lines.
.TP
.B \-w, \-\-width n,n,n,...
Width for each field.
.SH USAGE EXAMPLES
.TP
.B # lnstat -d
Get a list of supported statistics files.
.TP
.B # lnstat -k arp_cache:entries,rt_cache:in_hit,arp_cache:destroys
Select the specified files and keys.
.TP
.B # lnstat -i 10
Use an interval of 10 seconds.
.TP
.B # lnstat -f ip_conntrack
Use only the specified file for statistics.
.TP
.B # lnstat -s 0
Do not print a header at all.
.TP
.B # lnstat -s 20
Print a header at start and every 20 lines.
.TP
.B # lnstat -c -1 -i 1 -f rt_cache -k entries,in_hit,in_slow_tot
Display statistics for keys entries, in_hit and in_slow_tot of file rt_cache 
every second.
.SH SEE ALSO
.BR ip (8),
and /usr/share/doc/iproute-doc/README.lnstat (package iproute-doc on Debian)
.br
.SH AUTHOR
lnstat was written by Harald Welte [EMAIL PROTECTED].
.PP
This manual page was written by Michael Prokop [EMAIL PROTECTED] for the 
Debian project (but may be used by others).




[PATCH] myri10ge: ServerWorks HT2000 PCI id is already defined in pci_ids.h

2006-10-26 Thread Brice Goglin
[PATCH] myri10ge: ServerWorks HT2000 PCI id is already defined in pci_ids.h

No need to keep defining PCI_DEVICE_ID_SERVERWORKS_HT2000_PCIE
in the driver code since it is now defined in pci_ids.h.

Signed-off-by: Brice Goglin [EMAIL PROTECTED]
---
 drivers/net/myri10ge/myri10ge.c |1 -
 1 file changed, 1 deletion(-)



Please apply for 2.6.19 since PCI_DEVICE_ID_SERVERWORKS_HT2000_PCIE has been
added to pci_ids.h in -rc1 (commit 6397c75cbc4d7dbc3d07278b57c82a47dafb21b5).

Thanks,
Brice



Index: linux-rc/drivers/net/myri10ge/myri10ge.c
===
--- linux-rc.orig/drivers/net/myri10ge/myri10ge.c   2006-10-26 22:18:53.0 +0200
+++ linux-rc/drivers/net/myri10ge/myri10ge.c   2006-10-26 22:19:05.0 +0200
@@ -2410,7 +2410,6 @@
  * firmware image, and set tx.boundary to 4KB.
  */
 
-#define PCI_DEVICE_ID_SERVERWORKS_HT2000_PCIE  0x0132
 #define PCI_DEVICE_ID_INTEL_E5000_PCIE23 0x25f7
 #define PCI_DEVICE_ID_INTEL_E5000_PCIE47 0x25fa
 




Re: [RFC] tcp: setsockopt congestion control autoload

2006-10-26 Thread David Miller
From: Evgeniy Polyakov [EMAIL PROTECTED]
Date: Thu, 26 Oct 2006 18:57:13 +0400

 It just calls /sbin/modprobe, which in turn runs tons of scripts in
 /etc/hotplug, modprobe and other places...
 In the paranoid case we should not allow any user to load kernel
 modules, even known ones. Should this option be guarded by some
 capability check?

Do you realize that sys_socket() already makes this kind of
thing happen already?
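
For context, the request under discussion reaches the kernel from user
space through the TCP_CONGESTION socket option. The following is a
minimal sketch, assuming a Linux system whose <netinet/tcp.h> defines
TCP_CONGESTION; the two helper names are illustrative and do not come
from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/*
 * Ask the kernel to use the named congestion control algorithm on a
 * TCP socket. Returns 0 on success, -1 with errno set on failure.
 */
static int set_congestion_control(int fd, const char *name)
{
	return setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION,
			  name, (socklen_t)strlen(name));
}

/* Read back the algorithm currently attached to the socket. */
static int get_congestion_control(int fd, char *buf, socklen_t len)
{
	return getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, buf, &len);
}
```

Whether naming an algorithm that is not yet loaded should trigger a
module load from this call is the question the thread is debating.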


Re: [IPROUTE] manpage for lnstat

2006-10-26 Thread Harald Welte
On Thu, Oct 26, 2006 at 10:41:26PM +0200, Michael Prokop wrote:
 Hello,
 
 I wrote a manpage for lnstat, would be great if it could be applied
 to the next release.

Stephen: Please include it into your next release, looks fine to me.

 @Harald, I'm following your 'If somebody wants to do a manpage, feel
 free to send me a patch :)' of lnstat's README. :)

thanks a lot!
-- 
- Harald Welte [EMAIL PROTECTED]  http://gnumonks.org/

We all know Linux is great...it does infinite loops in 5 seconds. -- Linus




Re: [RFC] tcp: setsockopt congestion control autoload

2006-10-26 Thread David Miller
From: John Heffner [EMAIL PROTECTED]
Date: Thu, 26 Oct 2006 13:29:26 -0400

 My reservation in doing this would be that as an administrator, I may 
 want to choose exactly what congestion control is available at any 
 given time.  The different congestion control algorithms are not 
 necessarily fair to each other.
 
 If the modules are autoloaded, I could still enforce this by moving the 
 modules out of /lib/modules, but I think it's cleaner to do it by 
 loading/unloading modules as appropriate.

Fair enough, and for the folks doing tests of congestion control
algorithms they can run as root or whatever.


[PATCH] sealevel: uses arp_broken_ops

2006-10-26 Thread Randy Dunlap
On Wed, 25 Oct 2006 18:03:13 +0200 Toralf Förster wrote:

 WARNING: arp_broken_ops [drivers/net/wan/sealevel.ko] undefined!
 make[1]: *** [__modpost] Error 1
 make: *** [modules] Error 2
 
 Here's the config:
...
 # CONFIG_INET is not set
 CONFIG_SEALEVEL_4021=m

---
From: Randy Dunlap [EMAIL PROTECTED]

Sealevel uses arp_broken_ops so it needs to depend on INET.

Signed-off-by: Randy Dunlap [EMAIL PROTECTED]
---
 drivers/net/wan/Kconfig |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2619-rc3-pv.orig/drivers/net/wan/Kconfig
+++ linux-2619-rc3-pv/drivers/net/wan/Kconfig
@@ -127,7 +127,7 @@ config LANMEDIA
 # There is no way to detect a Sealevel board. Force it modular
 config SEALEVEL_4021
tristate "Sealevel Systems 4021 support"
-   depends on WAN && ISA && m && ISA_DMA_API
+   depends on WAN && ISA && m && ISA_DMA_API && INET
help
  This is a driver for the Sealevel Systems ACB 56 serial I/O adapter.
 


RE: [RFC] [PATCH 0/3] Add Regulatory Domain support to d80211

2006-10-26 Thread Simon Barber
Getting people to use wpa_supplicant almost exclusively as the interface
for wireless will improve a lot of things. One thing that might help
here is to do some work on the wpa_cli - to make it easier for the
startup scripts to do what they need, and also to make the command
syntax easier for command line users to do what they need.

Simon 

-Original Message-
From: Dan Williams [mailto:[EMAIL PROTECTED] 
Sent: Thursday, October 26, 2006 8:33 AM
To: Luis R. Rodriguez
Cc: Johannes Berg; Michael Wu; Simon Barber; David Kimdon;
netdev@vger.kernel.org; Jiri Benc; John W. Linville; Jean Tourrilhes;
Hong Liu; Jouni Malinen
Subject: Re: [RFC] [PATCH 0/3] Add Regulatory Domain support to d80211

On Thu, 2006-10-26 at 11:04 -0400, Luis R. Rodriguez wrote:
 On 10/26/06, Dan Williams [EMAIL PROTECTED] wrote:
  While wpa_supplicant is certainly the main client for stuff directly
  related to setting up a connection, there are quite a few other 
  users of general WE calls to pull information out of the card, or to
  receive scan events.
 
 How about we just ditch iwconfig completely and move on to 
 wpa_supplicant/wpa_cli as our next userspace application with
 nl80211/cfg80211 as our new API for userspace <-> kernel communication?
 As you point out, wpa_supplicant already does a lot for us -- and 
 several distributions already rely on it. Some work is required but I 
 think it's worth it. If we do a complete move from WE to nl80211 it 
 would be transparent to the users too.

The one blocker I can think of here is startup scripts on various
distributions.  Most of those are shell, and they usually rely on
iwconfig quite heavily.  Getting those converted to wpa_supplicant
wouldn't be a trivial amount of work, but it wouldn't be a ton either.

Dan



Re: [IPROUTE] manpage for lnstat

2006-10-26 Thread Stephen Hemminger
On Thu, 26 Oct 2006 22:41:26 +0200
Michael Prokop [EMAIL PROTECTED] wrote:

 -mika-
 
 [lnstat.1  text/plain (2087 bytes)] 

Added, but I took the liberty of moving it to section 8
(lnstat.8) because that is where the other iproute2
commands are.

-- 
Stephen Hemminger [EMAIL PROTECTED]


Re: [RFC] [PATCH 0/3] Add Regulatory Domain support to d80211

2006-10-26 Thread Johannes Berg
On Thu, 2006-10-26 at 11:33 -0400, Dan Williams wrote:

 The one blocker I can think of here is startup scripts on various
 distributions.  Most of those are shell, and they usually rely on
 iwconfig quite heavily.  Getting those converted to wpa_supplicant
 wouldn't be a trivial amount of work, but it wouldn't be a ton either.

But as I've said forever... I intend to leave WE working when we move to
cfg80211. We can't rip out the old userspace API just like that.

johannes




Re: Network virtualization/isolation

2006-10-26 Thread Daniel Lezcano

Stephen Hemminger wrote:

On Thu, 26 Oct 2006 11:44:55 +0200
Daniel Lezcano [EMAIL PROTECTED] wrote:


[ ... ]


Assuming you are talking about pseudo-virtualized environments,
there are several different discussions.


Yes, exactly; I forgot to mention that.



1. How should the namespace be isolated for the virtualized containered
   applications?


The network resources should be tied to the namespaces, especially 
the struct sock. Then, when a checkpoint is initiated for the 
container, you can identify the established connections, the timewait 
sockets, the request queues, ... related to the container in order to freeze 
the traffic and checkpoint them.
IP addresses are not a valid discriminator for identifying them, for 
example if you have several containers interconnected on the same host.




2. How should traffic be restricted into/out of those containers. This
   is where existing netfilter, classification, etc, should be used.
   The network code is overly rich as it is, we don't need another
   abstraction.


Using only netfilter, you will not be able to bind to the same 
INADDR_ANY,port in different containers. You will need to handle several 
IP addresses coming from IP aliasing, and check the source address to be sure 
it belongs to the right container and not to a 
primary interface probably assigned to a different container.


3. Can the virtualized containers be secure? No. we really can't keep
   hostile root in a container from killing system without going to
   a hypervisor.


That is totally true; containers don't aim to replace a 
fully virtualized environment.




Re: [RFC] tcp: setsockopt congestion control autoload

2006-10-26 Thread Hagen Paul Pfeifer
* John Heffner | 2006-10-26 13:29:26 [-0400]:

My reservation in doing this would be that as an administrator, I may 
want to choose exactly what congestion control is available at any 
given time.  The different congestion control algorithms are not 
necessarily fair to each other.

ACK, completely right. A user without CAP_NET_ADMIN MUST NOT change the
algorithm.  We know that there is some unfairness out there. And maybe some
day someone will introduce a satellite algorithm which is by definition
completely unfair to vanilla TCP.
We should guard this with a CAP_NET_ADMIN capability check, so that even
built-in algorithms cannot be enabled without it.

HGN

--
Signed and/or encrypted mails preferd. Key-Id = 0x98350C22
Fingerprint = 490F 557B 6C48 6D7E 5706  2EA2 4A22 8D45 9835 0C22 


Re: [RFC] tcp: setsockopt congestion control autoload

2006-10-26 Thread John Heffner

Hagen Paul Pfeifer wrote:

* John Heffner | 2006-10-26 13:29:26 [-0400]:

My reservation in doing this would be that as an administrator, I may 
want to choose exactly what congestion control is available at any 
given time.  The different congestion control algorithms are not 
necessarily fair to each other.


ACK, completely right. A user without CAP_NET_ADMIN MUST NOT change the
algorithm.  We know that there is some unfairness out there. And maybe some
day someone will introduce a satellite algorithm which is by definition
completely unfair to vanilla TCP.
We should guard this with a CAP_NET_ADMIN capability check, so that even
built-in algorithms cannot be enabled without it.


I don't know if I'd want to go that far.  For example, there's a nice 
protocol TCP-LP which is by design unfair in the other direction -- it 
yields to other traffic so that you can basically run a scavenger service.


If you really care about this, you could try to rank protocols based on 
aggressiveness (note this is not trivial) and do something like 'nice' 
where mortals can only nice up not down.  Practically speaking, I'm not 
sure this is necessary (worth the effort).


  -John
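
The 'nice'-like idea above can be sketched as a small policy check. This
is purely hypothetical: the rank values, the enum and the function name
are invented for illustration, and no such ranking exists in the kernel.
It assumes each algorithm has been given an aggressiveness rank, and
lets an unprivileged caller move only to an equally or less aggressive
one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical aggressiveness ranks; a higher value is more aggressive. */
enum cc_rank {
	CC_RANK_SCAVENGER  = 0,	/* yields to other traffic, like TCP-LP */
	CC_RANK_STANDARD   = 1,	/* ordinary TCP behaviour */
	CC_RANK_AGGRESSIVE = 2,	/* takes more than a standard flow */
};

/*
 * Like nice(2): an unprivileged caller may only become less
 * aggressive; a privileged caller may pick anything.
 */
static bool may_switch(enum cc_rank current, enum cc_rank wanted,
		       bool privileged)
{
	if (privileged)
		return true;
	return wanted <= current;
}
```

The hard part, as noted above, is assigning the ranks in the first
place; the check itself is trivial.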


[PATCH 3/8] netpoll per device txq

2006-10-26 Thread Stephen Hemminger
When the netpoll beast got really busy, it tended to clog
things, so it stored them for later. But the beast was putting
all its skbs in one basket. This was bad because maybe some
pipes were clogged and others were not.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
---
 include/linux/netpoll.h |2 +
 net/core/netpoll.c  |   50 ++--
 2 files changed, 17 insertions(+), 35 deletions(-)

--- linux-2.6.orig/include/linux/netpoll.h
+++ linux-2.6/include/linux/netpoll.h
@@ -33,6 +33,8 @@ struct netpoll_info {
spinlock_t rx_lock;
struct netpoll *rx_np; /* netpoll that registered an rx_hook */
struct sk_buff_head arp_tx; /* list of arp requests to reply to */
+   struct sk_buff_head txq;
+   struct work_struct tx_work;
 };
 
 void netpoll_poll(struct netpoll *np);
--- linux-2.6.orig/net/core/netpoll.c
+++ linux-2.6/net/core/netpoll.c
@@ -38,10 +38,6 @@
 
 static struct sk_buff_head skb_pool;
 
-static DEFINE_SPINLOCK(queue_lock);
-static int queue_depth;
-static struct sk_buff *queue_head, *queue_tail;
-
 static atomic_t trapped;
 
 #define NETPOLL_RX_ENABLED  1
@@ -56,46 +52,25 @@ static void arp_reply(struct sk_buff *sk
 
 static void queue_process(void *p)
 {
-   unsigned long flags;
+   struct netpoll_info *npinfo = p;
struct sk_buff *skb;
 
-   while (queue_head) {
-   spin_lock_irqsave(&queue_lock, flags);
-
-   skb = queue_head;
-   queue_head = skb->next;
-   if (skb == queue_tail)
-   queue_head = NULL;
-
-   queue_depth--;
-
-   spin_unlock_irqrestore(&queue_lock, flags);
-
+   while ((skb = skb_dequeue(&npinfo->txq)))
dev_queue_xmit(skb);
-   }
-}
 
-static DECLARE_WORK(send_queue, queue_process, NULL);
+}
 
 void netpoll_queue(struct sk_buff *skb)
 {
-   unsigned long flags;
+   struct net_device *dev = skb->dev;
+   struct netpoll_info *npinfo = dev->npinfo;
 
-   if (queue_depth == MAX_QUEUE_DEPTH) {
-   __kfree_skb(skb);
-   return;
+   if (!npinfo)
+   kfree_skb(skb);
+   else {
+   skb_queue_tail(&npinfo->txq, skb);
+   schedule_work(&npinfo->tx_work);
}
-
-   spin_lock_irqsave(&queue_lock, flags);
-   if (!queue_head)
-   queue_head = skb;
-   else
-   queue_tail->next = skb;
-   queue_tail = skb;
-   queue_depth++;
-   spin_unlock_irqrestore(&queue_lock, flags);
-
-   schedule_work(&send_queue);
 }
 
 static int checksum_udp(struct sk_buff *skb, struct udphdr *uh,
@@ -649,6 +624,9 @@ int netpoll_setup(struct netpoll *np)
npinfo->tries = MAX_RETRIES;
spin_lock_init(&npinfo->rx_lock);
skb_queue_head_init(&npinfo->arp_tx);
+   skb_queue_head_init(&npinfo->txq);
+   INIT_WORK(&npinfo->tx_work, queue_process, npinfo);
+
atomic_set(&npinfo->refcnt, 1);
} else {
npinfo = ndev->npinfo;
@@ -771,6 +749,8 @@ void netpoll_cleanup(struct netpoll *np)
np->dev->npinfo = NULL;
if (atomic_dec_and_test(&npinfo->refcnt)) {
skb_queue_purge(&npinfo->arp_tx);
+   skb_queue_purge(&npinfo->txq);
+   flush_scheduled_work();
 
kfree(npinfo);
}

--
Stephen Hemminger [EMAIL PROTECTED]
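
The effect of moving from one global queue to a queue per device can be
modelled in a few lines of userspace C. This is only a sketch under
simplifying assumptions (no locking, no work queue); the struct and
function names below are invented for the illustration and are not the
kernel's sk_buff_head API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet and queue types, standing in for skbs. */
struct pkt {
	struct pkt *next;
};

struct txq {
	struct pkt *head, *tail;
};

/* Each device owns its backlog, so a stopped device blocks only itself. */
struct dev {
	int stopped;
	struct txq txq;
};

/* Append a packet to the tail of a queue. */
static void txq_tail(struct txq *q, struct pkt *p)
{
	p->next = NULL;
	if (q->tail)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
}

/* Remove and return the packet at the head, or NULL if empty. */
static struct pkt *txq_dequeue(struct txq *q)
{
	struct pkt *p = q->head;

	if (p) {
		q->head = p->next;
		if (!q->head)
			q->tail = NULL;
	}
	return p;
}

/* Send everything queued for one device; returns the number sent. */
static int dev_drain(struct dev *d)
{
	int sent = 0;

	if (d->stopped)
		return 0;
	while (txq_dequeue(&d->txq))
		sent++;
	return sent;
}
```

With a single shared queue, a packet for a stopped device at the head
would hold up packets for every other device behind it; with one queue
per device that cannot happen.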



[PATCH 6/8] netpoll retry cleanup

2006-10-26 Thread Stephen Hemminger
The netpoll beast was still not happy. If the beast got
clogged pipes, it tended to stare blankly off in space
for a long time.

The problem couldn't be completely fixed because the
beast talked with irq's disabled. But it could be made
less painful and shorter.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
---
 include/linux/netpoll.h |1 
 net/core/netpoll.c  |   71 ++--
 2 files changed, 33 insertions(+), 39 deletions(-)

--- linux-2.6.orig/net/core/netpoll.c
+++ linux-2.6/net/core/netpoll.c
@@ -34,12 +34,12 @@
 #define MAX_UDP_CHUNK 1460
 #define MAX_SKBS 32
 #define MAX_QUEUE_DEPTH (MAX_SKBS / 2)
-#define MAX_RETRIES 2
 
 static struct sk_buff_head skb_pool;
 
 static atomic_t trapped;
 
+#define USEC_PER_POLL  50
 #define NETPOLL_RX_ENABLED  1
 #define NETPOLL_RX_DROP 2
 
@@ -72,6 +72,7 @@ static void queue_process(void *p)
schedule_delayed_work(&npinfo->tx_work, HZ/10);
return;
}
+
netif_tx_unlock_bh(dev);
}
 }
@@ -241,50 +242,44 @@ repeat:
 
 static void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
 {
-   int status;
-   struct netpoll_info *npinfo;
+   int status = NETDEV_TX_BUSY;
+   unsigned long tries;
+   struct net_device *dev = np->dev;
+   struct netpoll_info *npinfo = np->dev->npinfo;
+
+   if (!npinfo || !netif_running(dev) || !netif_device_present(dev)) {
+   __kfree_skb(skb);
+   return;
+   }
+
+   /* don't get messages out of order, and no recursion */
+   if ( !(np->drop == netpoll_queue && skb_queue_len(&npinfo->txq))
+ && npinfo->poll_owner != smp_processor_id()
+ && netif_tx_trylock(dev)) {
+
+   /* try until next clock tick */
+   for(tries = jiffies_to_usecs(1)/USEC_PER_POLL; tries > 0; --tries) {
+   if (!netif_queue_stopped(dev))
+   status = dev->hard_start_xmit(skb, dev);
 
-   if (!np || !np->dev || !netif_running(np->dev)) {
-   __kfree_skb(skb);
-   return;
-   }
+   if (status == NETDEV_TX_OK)
+   break;
+
+   /* tickle device maybe there is some cleanup */
+   netpoll_poll(np);
 
-   npinfo = np->dev->npinfo;
+   udelay(USEC_PER_POLL);
+   }
+   netif_tx_unlock(dev);
+   }
 
-   /* avoid recursion */
-   if (npinfo->poll_owner == smp_processor_id() ||
-   np->dev->xmit_lock_owner == smp_processor_id()) {
+   if (status != NETDEV_TX_OK) {
+   /* requeue for later */
if (np->drop)
np->drop(skb);
else
__kfree_skb(skb);
-   return;
}
-
-   do {
-   npinfo->tries--;
-   netif_tx_lock(np->dev);
-
-   /*
-* network drivers do not expect to be called if the queue is
-* stopped.
-*/
-   status = NETDEV_TX_BUSY;
-   if (!netif_queue_stopped(np->dev))
-   status = np->dev->hard_start_xmit(skb, np->dev);
-
-   netif_tx_unlock(np->dev);
-
-   /* success */
-   if(!status) {
-   npinfo->tries = MAX_RETRIES; /* reset */
-   return;
-   }
-
-   /* transmit busy */
-   netpoll_poll(np);
-   udelay(50);
-   } while (npinfo->tries > 0);
 }
 
 void netpoll_send_udp(struct netpoll *np, const char *msg, int len)
@@ -640,7 +635,7 @@ int netpoll_setup(struct netpoll *np)
npinfo->rx_np = NULL;
spin_lock_init(&npinfo->poll_lock);
npinfo->poll_owner = -1;
-   npinfo->tries = MAX_RETRIES;
+
spin_lock_init(&npinfo->rx_lock);
skb_queue_head_init(&npinfo->arp_tx);
skb_queue_head_init(&npinfo->txq);
--- linux-2.6.orig/include/linux/netpoll.h
+++ linux-2.6/include/linux/netpoll.h
@@ -28,7 +28,6 @@ struct netpoll_info {
atomic_t refcnt;
spinlock_t poll_lock;
int poll_owner;
-   int tries;
int rx_flags;
spinlock_t rx_lock;
struct netpoll *rx_np; /* netpoll that registered an rx_hook */

--
Stephen Hemminger [EMAIL PROTECTED]
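
The retry bound "try until next clock tick" works out to a simple
division. The sketch below is userspace arithmetic only; it assumes that
jiffies_to_usecs(1) equals 1000000/HZ for the tick rates shown, and the
helper names are invented for the example. USEC_PER_POLL is the value
from the patch.

```c
#include <assert.h>

#define USEC_PER_POLL 50	/* delay between attempts, from the patch */

/* Assumed stand-in for jiffies_to_usecs(1) at a given tick rate. */
static unsigned long usecs_per_jiffy(unsigned long hz)
{
	return 1000000UL / hz;
}

/* Number of transmit attempts that fit into one clock tick. */
static unsigned long tries_per_tick(unsigned long hz)
{
	return usecs_per_jiffy(hz) / USEC_PER_POLL;
}
```

Under that assumption the loop makes 20 attempts at HZ=1000, 80 at
HZ=250 and 200 at HZ=100, so the busy-wait with interrupts disabled is
bounded by roughly one tick plus whatever the transmit attempts and the
poll call themselves cost.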



[PATCH 8/8] netpoll header cleanup

2006-10-26 Thread Stephen Hemminger
As Steve left the netpoll beast, hopefully not to return soon,
he noticed that the header was messy. He straightened it
up and polished it a little, then waved goodbye.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
---
 include/linux/netpoll.h |7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

--- linux-2.6.orig/include/linux/netpoll.h
+++ linux-2.6/include/linux/netpoll.h
@@ -12,16 +12,15 @@
 #include <linux/rcupdate.h>
 #include <linux/list.h>
 
-struct netpoll;
-
 struct netpoll {
struct net_device *dev;
-   char dev_name[16], *name;
+   char dev_name[IFNAMSIZ];
+   const char *name;
void (*rx_hook)(struct netpoll *, int, char *, int);
 
u32 local_ip, remote_ip;
u16 local_port, remote_port;
-   unsigned char local_mac[6], remote_mac[6];
+   u8 local_mac[ETH_ALEN], remote_mac[ETH_ALEN];
 };
 
 struct netpoll_info {

--
Stephen Hemminger [EMAIL PROTECTED]



[PATCH 4/8] netpoll setup error handling

2006-10-26 Thread Stephen Hemminger
The beast was not always healthy. When it was sick,
it tended to be laconic and not tell anyone the real problem.
A few small changes had it telling the world about its
problems, if they really wanted to hear.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
---
 drivers/net/netconsole.c |7 +--
 net/core/netpoll.c   |   20 +---
 2 files changed, 18 insertions(+), 9 deletions(-)

--- linux-2.6.orig/drivers/net/netconsole.c
+++ linux-2.6/drivers/net/netconsole.c
@@ -102,6 +102,8 @@ __setup("netconsole=", option_setup);
 
 static int init_netconsole(void)
 {
+   int err;
+
if(strlen(config))
option_setup(config);
 
@@ -110,8 +112,9 @@ static int init_netconsole(void)
return 0;
}
 
-   if(netpoll_setup(&np))
-   return -EINVAL;
+   err = netpoll_setup(&np);
+   if (err)
+   return err;
 
register_console(&netconsole);
printk(KERN_INFO "netconsole: network logging started\n");
--- linux-2.6.orig/net/core/netpoll.c
+++ linux-2.6/net/core/netpoll.c
@@ -602,20 +602,23 @@ int netpoll_setup(struct netpoll *np)
struct in_device *in_dev;
struct netpoll_info *npinfo;
unsigned long flags;
+   int err;
 
if (np->dev_name)
ndev = dev_get_by_name(np->dev_name);
if (!ndev) {
printk(KERN_ERR "%s: %s doesn't exist, aborting.\n",
   np->name, np->dev_name);
-   return -1;
+   return -ENODEV;
}
 
np->dev = ndev;
if (!ndev->npinfo) {
npinfo = kmalloc(sizeof(*npinfo), GFP_KERNEL);
-   if (!npinfo)
+   if (!npinfo) {
+   err = -ENOMEM;
goto release;
+   }
 
npinfo->rx_flags = 0;
npinfo->rx_np = NULL;
@@ -636,6 +639,7 @@ int netpoll_setup(struct netpoll *np)
if (!ndev->poll_controller) {
printk(KERN_ERR "%s: %s doesn't support polling, aborting.\n",
   np->name, np->dev_name);
+   err = -ENOTSUPP;
goto release;
}
 
@@ -646,13 +650,14 @@ int netpoll_setup(struct netpoll *np)
   np->name, np->dev_name);
 
rtnl_lock();
-   if (dev_change_flags(ndev, ndev->flags | IFF_UP) < 0) {
+   err = dev_open(ndev);
+   rtnl_unlock();
+
+   if (err) {
printk(KERN_ERR "%s: failed to open %s\n",
-  np->name, np->dev_name);
-   rtnl_unlock();
+  np->name, ndev->name);
goto release;
}
-   rtnl_unlock();
 
atleast = jiffies + HZ/10;
atmost = jiffies + 4*HZ;
@@ -690,6 +695,7 @@ int netpoll_setup(struct netpoll *np)
rcu_read_unlock();
printk(KERN_ERR "%s: no IP address for %s, aborting\n",
   np->name, np->dev_name);
+   err = -EDESTADDRREQ;
goto release;
}
 
@@ -722,7 +728,7 @@ int netpoll_setup(struct netpoll *np)
kfree(npinfo);
np->dev = NULL;
dev_put(ndev);
-   return -1;
+   return err;
 }
 
 static int __init netpoll_init(void) {

--
Stephen Hemminger [EMAIL PROTECTED]
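
The pattern the patch applies, a single release label with a distinct
negative errno per failure, can be sketched in userspace C as follows.
The struct and function are hypothetical stand-ins, not kernel code.
The patch uses the kernel-internal ENOTSUPP, which user space does not
have, so the sketch substitutes EOPNOTSUPP. Unlike netpoll_setup(), it
also frees its scratch buffer on success.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a network device; not a kernel structure. */
struct fake_dev {
	int exists;	/* the device name resolved */
	int can_poll;	/* the driver offers a poll hook */
	int has_ip;	/* an address is configured */
};

/* Return 0 on success or a negative errno naming the failure. */
static int fake_setup(const struct fake_dev *dev)
{
	void *info;
	int err = 0;

	if (!dev->exists)
		return -ENODEV;

	info = malloc(64);	/* scratch state, playing the role of npinfo */
	if (!info) {
		err = -ENOMEM;
		goto release;
	}
	if (!dev->can_poll) {
		err = -EOPNOTSUPP;
		goto release;
	}
	if (!dev->has_ip) {
		err = -EDESTADDRREQ;
		goto release;
	}

release:
	free(info);	/* free(NULL) is a no-op */
	return err;
}
```

A caller such as init_netconsole() can then pass the code straight up
instead of collapsing every failure into -EINVAL.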



[PATCH 7/8] netpoll queue cleanup

2006-10-26 Thread Stephen Hemminger
The beast had a long and not very happy history. At one
point, a friend (netdump) had asked that he open up a little.
Well, the friend was long gone now, and the beast had
this dangling piece hanging (netpoll_queue).

It wasn't hard to stitch the netpoll_queue back in
where it belonged and make everything tidy.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
---
 drivers/net/netconsole.c |1 -
 include/linux/netpoll.h  |4 ++--
 net/core/netpoll.c   |   23 +++
 3 files changed, 5 insertions(+), 23 deletions(-)

--- linux-2.6.orig/drivers/net/netconsole.c
+++ linux-2.6/drivers/net/netconsole.c
@@ -60,7 +60,6 @@ static struct netpoll np = {
.local_port = 6665,
.remote_port = ,
.remote_mac = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
-   .drop = netpoll_queue,
 };
 static int configured = 0;
 
--- linux-2.6.orig/include/linux/netpoll.h
+++ linux-2.6/include/linux/netpoll.h
@@ -18,7 +18,7 @@ struct netpoll {
struct net_device *dev;
char dev_name[16], *name;
void (*rx_hook)(struct netpoll *, int, char *, int);
-   void (*drop)(struct sk_buff *skb);
+
u32 local_ip, remote_ip;
u16 local_port, remote_port;
unsigned char local_mac[6], remote_mac[6];
@@ -44,7 +44,7 @@ int netpoll_trap(void);
 void netpoll_set_trap(int trap);
 void netpoll_cleanup(struct netpoll *np);
 int __netpoll_rx(struct sk_buff *skb);
-void netpoll_queue(struct sk_buff *skb);
+
 
 #ifdef CONFIG_NETPOLL
 static inline int netpoll_rx(struct sk_buff *skb)
--- linux-2.6.orig/net/core/netpoll.c
+++ linux-2.6/net/core/netpoll.c
@@ -77,19 +77,6 @@ static void queue_process(void *p)
}
 }
 
-void netpoll_queue(struct sk_buff *skb)
-{
-   struct net_device *dev = skb->dev;
-   struct netpoll_info *npinfo = dev->npinfo;
-
-   if (!npinfo)
-   kfree_skb(skb);
-   else {
-   skb_queue_tail(&npinfo->txq, skb);
-   schedule_work(&npinfo->tx_work);
-   }
-}
-
 static int checksum_udp(struct sk_buff *skb, struct udphdr *uh,
 unsigned short ulen, u32 saddr, u32 daddr)
 {
@@ -253,7 +240,7 @@ static void netpoll_send_skb(struct netp
}
 
/* don't get messages out of order, and no recursion */
-   if ( !(np->drop == netpoll_queue && skb_queue_len(&npinfo->txq))
+   if ( skb_queue_len(&npinfo->txq) == 0
  && npinfo->poll_owner != smp_processor_id()
  && netif_tx_trylock(dev)) {
 
@@ -274,11 +261,8 @@ static void netpoll_send_skb(struct netp
}
 
if (status != NETDEV_TX_OK) {
-   /* requeue for later */
-   if (np->drop)
-   np->drop(skb);
-   else
-   __kfree_skb(skb);
+   skb_queue_tail(&npinfo->txq, skb);
+   schedule_work(&npinfo->tx_work);
}
 }
 
@@ -800,4 +784,3 @@ EXPORT_SYMBOL(netpoll_setup);
 EXPORT_SYMBOL(netpoll_cleanup);
 EXPORT_SYMBOL(netpoll_send_udp);
 EXPORT_SYMBOL(netpoll_poll);
-EXPORT_SYMBOL(netpoll_queue);

--
Stephen Hemminger [EMAIL PROTECTED]

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/8] netpoll deferred transmit path

2006-10-26 Thread Stephen Hemminger
When the netpoll beast got busy, he tended to babble.
Instead of talking out of his large mouth as normal,
he tended to try to snort out other orifices. This led
to words (skbs) ending up in odd places (like NIT) that
he did not intend.

The normal way of talking wouldn't work, but he could
at least change to using the same tone all the time.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
---
 net/core/netpoll.c |   21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

--- linux-2.6.orig/net/core/netpoll.c
+++ linux-2.6/net/core/netpoll.c
@@ -55,9 +55,25 @@ static void queue_process(void *p)
struct netpoll_info *npinfo = p;
struct sk_buff *skb;
 
-   while ((skb = skb_dequeue(&npinfo->txq)))
-   dev_queue_xmit(skb);
+   while ((skb = skb_dequeue(&npinfo->txq))) {
+   struct net_device *dev = skb->dev;
 
+   if (!netif_device_present(dev) || !netif_running(dev)) {
+   __kfree_skb(skb);
+   continue;
+   }
+
+   netif_tx_lock_bh(dev);
+   if (netif_queue_stopped(dev) ||
+   dev->hard_start_xmit(skb, dev) != NETDEV_TX_OK) {
+   skb_queue_head(&npinfo->txq, skb);
+   netif_tx_unlock_bh(dev);
+
+   schedule_delayed_work(&npinfo->tx_work, HZ/10);
+   return;
+   }
+   netif_tx_unlock_bh(dev);
+   }
 }
 
 void netpoll_queue(struct sk_buff *skb)
@@ -756,6 +772,7 @@ void netpoll_cleanup(struct netpoll *np)
if (atomic_dec_and_test(&npinfo->refcnt)) {
skb_queue_purge(&npinfo->arp_tx);
skb_queue_purge(&npinfo->txq);
+   cancel_rearming_delayed_work(&npinfo->tx_work);
flush_scheduled_work();
 
kfree(npinfo);

--
Stephen Hemminger [EMAIL PROTECTED]
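
The retry behaviour of the reworked queue_process() can be sketched in userspace C. This is only a model under stated assumptions: `struct txq`, `struct mock_dev`, `mock_xmit()`, `txq_push()` and `txq_process()` are hypothetical stand-ins, not kernel interfaces. The point it illustrates is that a packet the device refuses stays at the head of the queue, so ordering is preserved when the work is retried later.

```c
#include <assert.h>
#include <stddef.h>

#define TXQ_CAP 16

/* Tiny FIFO of packet ids standing in for npinfo->txq (illustrative only). */
struct txq {
    int pkt[TXQ_CAP];
    size_t head, len;
};

/* Hypothetical device: accepts packets while budget > 0, then reports busy. */
struct mock_dev {
    int budget;
    int sent[TXQ_CAP];
    size_t nsent;
};

static int mock_xmit(struct mock_dev *dev, int pkt)
{
    if (dev->budget <= 0 || dev->nsent == TXQ_CAP)
        return -1;                      /* device busy or log full */
    dev->budget--;
    dev->sent[dev->nsent++] = pkt;
    return 0;
}

/* Append a packet at the tail; returns -1 if the queue is full. */
int txq_push(struct txq *q, int pkt)
{
    if (q->len == TXQ_CAP)
        return -1;
    q->pkt[(q->head + q->len) % TXQ_CAP] = pkt;
    q->len++;
    return 0;
}

/*
 * Drain the queue in order. Returns 1 if the caller should schedule a
 * retry (the patch uses a delayed work item for that), 0 if the queue
 * was emptied. Peeking at the head and leaving it in place on failure
 * has the same effect as dequeuing and then requeuing at the head.
 */
int txq_process(struct txq *q, struct mock_dev *dev)
{
    assert(q != NULL && dev != NULL);
    while (q->len > 0) {
        int pkt = q->pkt[q->head];
        if (mock_xmit(dev, pkt) != 0)
            return 1;
        q->head = (q->head + 1) % TXQ_CAP;
        q->len--;
    }
    return 0;
}
```

With three queued packets and a device that accepts only one, the first call sends the oldest packet and asks for a retry; a later call with more budget sends the remaining two in their original order.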



[PATCH 2/8] netpoll info leak

2006-10-26 Thread Stephen Hemminger
After looking harder, Steve noticed that the netpoll
beast leaked a little every time it shut down for a nap.
Not a big leak, but a nuisance kind of thing.

He took out his refcount duct tape and patched the
leak. It was overkill since there was already other
locking in that area, but it looked clean and wouldn't
attract fleas.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
---
 include/linux/netpoll.h |1 +
 net/core/netpoll.c  |   25 +++--
 2 files changed, 20 insertions(+), 6 deletions(-)

--- linux-2.6.orig/include/linux/netpoll.h
+++ linux-2.6/include/linux/netpoll.h
@@ -25,6 +25,7 @@ struct netpoll {
 };
 
 struct netpoll_info {
+   atomic_t refcnt;
spinlock_t poll_lock;
int poll_owner;
int tries;
--- linux-2.6.orig/net/core/netpoll.c
+++ linux-2.6/net/core/netpoll.c
@@ -649,8 +649,11 @@ int netpoll_setup(struct netpoll *np)
npinfo->tries = MAX_RETRIES;
spin_lock_init(&npinfo->rx_lock);
skb_queue_head_init(&npinfo->arp_tx);
-   } else
+   atomic_set(&npinfo->refcnt, 1);
+   } else {
npinfo = ndev->npinfo;
+   atomic_inc(&npinfo->refcnt);
+   }
 
if (!ndev->poll_controller) {
printk(KERN_ERR "%s: %s doesn't support polling, aborting.\n",
@@ -757,12 +760,22 @@ void netpoll_cleanup(struct netpoll *np)
 
if (np->dev) {
npinfo = np->dev->npinfo;
-   if (npinfo && npinfo->rx_np == np) {
-   spin_lock_irqsave(&npinfo->rx_lock, flags);
-   npinfo->rx_np = NULL;
-   npinfo->rx_flags &= ~NETPOLL_RX_ENABLED;
-   spin_unlock_irqrestore(&npinfo->rx_lock, flags);
+   if (npinfo) {
+   if (npinfo->rx_np == np) {
+   spin_lock_irqsave(&npinfo->rx_lock, flags);
+   npinfo->rx_np = NULL;
+   npinfo->rx_flags &= ~NETPOLL_RX_ENABLED;
+   spin_unlock_irqrestore(&npinfo->rx_lock, flags);
+   }
+
+   np->dev->npinfo = NULL;
+   if (atomic_dec_and_test(&npinfo->refcnt)) {
+   skb_queue_purge(&npinfo->arp_tx);
+
+   kfree(npinfo);
+   }
}
+
dev_put(np->dev);
}
 

--
Stephen Hemminger [EMAIL PROTECTED]
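
The reference-counting pattern this patch relies on (count starts at 1 on first setup, is incremented on reuse, and the last release frees the object) can be sketched in userspace C with C11 atomics. All names here (`struct poll_info`, `shared_info`, `poll_info_get()`, `poll_info_put()`) are hypothetical, and `atomic_fetch_sub(...) == 1` is used as a stand-in for the kernel's atomic_dec_and_test().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Stand-in for struct netpoll_info; only the counter matters here. */
struct poll_info {
    atomic_int refcnt;
};

/* One shared instance, playing the role of ndev->npinfo. */
static struct poll_info *shared_info;

/* First caller allocates and sets the count to 1; later callers add one. */
struct poll_info *poll_info_get(void)
{
    if (!shared_info) {
        shared_info = calloc(1, sizeof(*shared_info));
        if (!shared_info)
            return NULL;
        atomic_init(&shared_info->refcnt, 1);
    } else {
        atomic_fetch_add(&shared_info->refcnt, 1);
    }
    return shared_info;
}

/* Drops one reference; returns 1 if this call freed the object. */
int poll_info_put(void)
{
    assert(shared_info != NULL);
    /* fetch_sub returns the old value, so 1 means we held the last ref. */
    if (atomic_fetch_sub(&shared_info->refcnt, 1) == 1) {
        free(shared_info);
        shared_info = NULL;
        return 1;
    }
    return 0;
}
```

Two users that each call poll_info_get() share one object; only the second poll_info_put() frees it.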



[PATCH 0/8] netpoll: A Halloween horror mystery

2006-10-26 Thread Stephen Hemminger
It was a dull and cloudy day in Portland when Steve first
went hunting for suspend bugs in the sky2 driver. The hunt
was motivated by a certain mini owned by penguin, but he
could never get the license number. Anyway, he stumbled
down some blind alleys and met the:

NETPOLL BEAST

It wasn't that the beast was ugly, like some of the other
things he had seen. It was just an untidy mess, the kind
of thing you didn't want to bring home to mother, instead
he would have rather booted over the fence to the Viro
shredder.  But since he was in the neighborhood, he
got out his keyboard and went to work.

--



Re: [PATCH] s2io: add PCI error recovery support

2006-10-26 Thread Linas Vepstas
Hi.

On Thu, Oct 26, 2006 at 05:56:34AM -0400, Ananda Raju wrote:
 Hi, 
 Can you try the attached patch? The attached patch is simple. We set card
 state as down in error_detected() so that all entry points return error
 and don't proceed further.
 
 In slot_reset() we call s2io_card_down(), which will reset the adapter.
 In io_resume() we bring up the driver.

Simplicity is always better. However, some questions/comments:

 @@ -4175,6 +4186,10 @@ static irqreturn_t s2io_isr(int irq, voi
   mac_info_t *mac_control;
   struct config_param *config;
  
 + if (atomic_read(&sp->card_state) == CARD_DOWN) {
 + return IRQ_NONE;
 + }

I used 

if (sp->pdev->error_state != pci_channel_io_normal)

here for a reason: the pdev->error_state is set even in an interrupt
context, that is, it gets set even if interrupts are disabled, and
so it represents the actual state immediately. By contrast, the
error callbacks do not get called until possibly much later,
and so sp->card_state = CARD_DOWN might not get set for a while.

If, for any reason, e.g. some obscure corner case, the s2io
generates zillions of interrupts, this could result in a soft-lockup.
I actually saw this in the symbios device driver, which will
regenerate an interrupt until it's acknowledged -- and so it
sat there, spinning. :-(

I was returning IRQ_HANDLED instead of IRQ_NONE, so as to avoid
falling into handle_bad_irq() or report_bad_irq(). I haven't 
seen this happen on s2io, but thought it would still be wise.

If this can't happen, then there's no problem here.
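
The early-out described above can be sketched in userspace C. This is a model only: the enums and structs below are hypothetical stand-ins for the kernel's pci_channel_state, irqreturn_t and the s2io private data, and `mock_isr()` is not the real s2io_isr(). It shows the handler checking the device's channel state first and claiming the interrupt without touching the hardware when the channel is not normal.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for kernel types (redefined here, not the real definitions). */
enum channel_state {
    CHANNEL_IO_NORMAL,
    CHANNEL_IO_FROZEN,
    CHANNEL_IO_PERM_FAILURE
};

enum irq_result {
    IRQ_RESULT_NONE,
    IRQ_RESULT_HANDLED
};

struct mock_pdev {
    enum channel_state error_state;  /* updated as soon as the error is seen */
};

struct mock_nic {
    struct mock_pdev *pdev;
    int packets_processed;
};

enum irq_result mock_isr(struct mock_nic *sp)
{
    assert(sp != NULL && sp->pdev != NULL);

    /*
     * Channel is frozen or dead: do no register access, but report the
     * interrupt as handled so the core does not treat it as spurious.
     */
    if (sp->pdev->error_state != CHANNEL_IO_NORMAL)
        return IRQ_RESULT_HANDLED;

    sp->packets_processed++;         /* normal interrupt work goes here */
    return IRQ_RESULT_HANDLED;
}
```

While the channel is frozen the handler returns immediately and does no work; once the state is back to normal it processes the interrupt as usual.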

 +/**
 + * s2io_io_slot_reset - called after the pci bus has been reset.
 + * @pdev: Pointer to PCI device
 + *
 + * Restart the card from scratch, as if from a cold-boot.
 + */
 +static pci_ers_result_t s2io_io_slot_reset(struct pci_dev *pdev)
 +{

At this point, the card has just experienced a hardware reset
(the #RST wire was held low for 250 millisecs, followed by
a settle time of 2 seconds, followed by whatever BIOS thinks
it needed to do, followed by a restore of the pci config space
to what it was after a cold boot). So the card is in a fresh
state; in theory it's identical to a cold boot. So ...
are you sure you want to down at this point?

 + s2io_card_down(sp);
 + sp->device_close_flag = TRUE;   /* Device is shut down. */


One problem I'm having is that the watchdog timer sometimes
pops and tries to reset the card before s2io_card_down()
has a chance to run. I fixed this ... 

==
So -- just for grins, I thought to myself, "Maybe I can make
s2io be the first adapter ever to fully recover without
a hard reset of the card."

The idea is simple: 

1) enable MMIO,
2) call s2io_card_down()
3) enable DMA
4) call s2io_card_up()
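
The four steps above can be sketched as an ordered sequence that stops at the first failure. This is a userspace model with hypothetical names (`struct recovery_ops`, `try_soft_recovery()`, the demo step functions); falling back to a hard reset when a step fails is an assumption of the sketch, not something stated in this message.

```c
#include <assert.h>
#include <stddef.h>

enum recovery_result {
    RECOVERY_RECOVERED,
    RECOVERY_NEED_RESET
};

/* One hook per step; each returns 0 on success (all hypothetical). */
struct recovery_ops {
    int (*enable_mmio)(void *card);
    int (*card_down)(void *card);
    int (*enable_dma)(void *card);
    int (*card_up)(void *card);
};

/* Runs the steps strictly in order and gives up at the first failure. */
enum recovery_result try_soft_recovery(const struct recovery_ops *ops,
                                       void *card)
{
    assert(ops != NULL);

    int (*steps[4])(void *) = {
        ops->enable_mmio, ops->card_down, ops->enable_dma, ops->card_up
    };

    for (int i = 0; i < 4; i++)
        if (steps[i](card) != 0)
            return RECOVERY_NEED_RESET;  /* assumed fallback: hard reset */
    return RECOVERY_RECOVERED;
}

/* Demo steps: the "card" is just a counter of completed steps. */
int demo_step_ok(void *card)
{
    (*(int *)card)++;
    return 0;
}

int demo_step_fail(void *card)
{
    (void)card;
    return -1;
}
```

If every step succeeds the counter reaches four; if the second step fails, only the first has run and the caller is told a reset is still needed.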

I have a patch that does this, but then hit a few more snags.
I haven't yet nailed down all the trouble spots, maybe tomorrow.

--linas



2.6.18 forcedeth GSO panic on send

2006-10-26 Thread Denis Vlasenko
Hello,

I am using an AMD64 box with 32bit userspace / 64bit kernel.

Kernels 2.6.18 and 2.6.18.1 semi-randomly hang when I upload stuff
over the net - for example, svn commit, scp are affected.
2.6.17.11 does not seem to be affected.

Unfortunately even a 60-line screen is not big enough
to catch the whole trace. There are at least two traces,
and the first scrolls off. I have a photo at
http://busybox.net/~vda/gso_panic/forcedeth_gso_panic.jpg

Something bad is happening here, when the kernel tries
to send some data:

...
error_exit
skb_over_panic
skb_over_panic
skb_segment
tcp_tso_segment
inet_gso_segment
skb_gso_segment
dev_hard_start_xmit
dev_queue_xmit
...

Looks like it is related to hardware accel in forcedeth.
I will try disabling all hw accel.

Please find in the attached tarball:

.config
dmesg
ethtool-k
lspci
lspci-v
--
vda


gso_panic.tar.bz2
Description: application/tbz


[PATCH] Check if user has CAP_NET_ADMIN to change congestion control algorithm

2006-10-26 Thread Hagen Paul Pfeifer

Check if user has CAP_NET_ADMIN capability to change congestion control
algorithm.

Under normal circumstances an application programmer doesn't have enough
information to choose the right algorithm (unless he is the pchar/pathchar
maintainer). In 99.9% of cases only the local host administrator has the
knowledge to select a proper standard, system-wide algorithm (the remaining
0.1% are for testing purposes). If we let the user select an alternative
algorithm we introduce one potential weak spot - so we ban this eventuality.

HGN


Signed-off-by: Hagen Paul Pfeifer [EMAIL PROTECTED]

diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
index af0aca1..c1ae2e9 100644
--- a/net/ipv4/tcp_cong.c
+++ b/net/ipv4/tcp_cong.c
@@ -10,6 +10,7 @@ #include <linux/module.h>
 #include <linux/mm.h>
 #include <linux/types.h>
 #include <linux/list.h>
+#include <linux/capability.h>
 #include <net/tcp.h>

 static DEFINE_SPINLOCK(tcp_cong_list_lock);
@@ -151,6 +152,9 @@ int tcp_set_congestion_control(struct so
struct tcp_congestion_ops *ca;
int err = 0;

+   if (!capable(CAP_NET_ADMIN))
+   return -EPERM;
+
rcu_read_lock();
ca = tcp_ca_find(name);
if (ca == icsk->icsk_ca_ops)


Re: [PATCH] Check if user has CAP_NET_ADMIN to change congestion control algorithm

2006-10-26 Thread Ian McDonald

On 10/27/06, Hagen Paul Pfeifer [EMAIL PROTECTED] wrote:


Check if user has CAP_NET_ADMIN capability to change congestion control
algorithm.

Under normal circumstances an application programmer doesn't have enough
information to choose the right algorithm (unless he is the pchar/pathchar
maintainer). In 99.9% of cases only the local host administrator has the
knowledge to select a proper standard, system-wide algorithm (the remaining
0.1% are for testing purposes). If we let the user select an alternative
algorithm we introduce one potential weak spot - so we ban this eventuality.


I don't agree with this at all. I would love Firefox, BitTorrent etc
to implement usage of TCP-LP for example so they use unused
bandwidth only.

With this change applications can't do this.

If we are going to restrict by capabilities then I think we should
only restrict module loading - this way the admin of the box can
decide what algorithms can be used.
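
For reference, this is roughly how an application selects a congestion control algorithm per socket today, via the TCP_CONGESTION socket option. The wrapper name `set_congestion_control()` is hypothetical; the fallback definition of TCP_CONGESTION is the Linux value and is only there in case the libc headers do not expose it.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

#ifndef TCP_CONGESTION
#define TCP_CONGESTION 13   /* Linux value, used only if headers lack it */
#endif

/*
 * Ask the kernel to use the named congestion control algorithm on this
 * socket. Returns 0 on success, -1 with errno set on failure.
 */
int set_congestion_control(int fd, const char *name)
{
    assert(name != NULL);
    return setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION,
                      name, (socklen_t)strlen(name));
}
```

An application that wants background-priority transfers would pass the name under which TCP-LP registers (assumed here to be "lp") on a connected or unconnected TCP socket. With the proposed patch applied, the same call from a process without CAP_NET_ADMIN would fail with EPERM.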

Ian
--
Ian McDonald
Web: http://wand.net.nz/~iam4
Blog: http://imcdnzl.blogspot.com
WAND Network Research Group
Department of Computer Science
University of Waikato
New Zealand


Re: [PATCH] Check if user has CAP_NET_ADMIN to change congestion control algorithm

2006-10-26 Thread David Miller

This is driving me crazy...

Your email client turned the tabs into spaces in the patch making it
useless.

I want to ask why it is so hard for people to submit patches that are
not corrupted?  :-/

I type in this kind of email response at least 2 or 3 times every
single day that I review patches.  Spending the time to review a patch
only to find out that it is corrupted and doesn't apply consumes a
significant chunk of my time.

Again, send the patch in an email to yourself and try to apply the
patch from that email if you are in doubt.


Re: [PATCH] Check if user has CAP_NET_ADMIN to change congestion control algorithm

2006-10-26 Thread David Miller
From: Ian McDonald [EMAIL PROTECTED]
Date: Fri, 27 Oct 2006 12:59:30 +1300

 I don't agree with this at all. I would love Firefox, BitTorrent etc
 to implement usage of TCP-LP for example so they use unused
 bandwidth only.
 
 With this change applications can't do this.
 
 If we are going to restrict by capabilities then I think we should
 only restrict module loading - this way the admin of the box can
 decide what algorithms can be used.

You are using an example of a (supposedly) safe case of this
as a justification for allowing all cases.

It is bad, very bad, to allow arbitrary users to select arbitrary
congestion control algorithms.  It is just as bad as allowing them to
disable congestion control completely if that were an option.

If someone, for example, builds all the algorithms statically into
their kernel, for testing as root, this lets all users on the machine
do the same which is not right.



Re: [PATCH 1/8] netpoll: skb private pool management

2006-10-26 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Thu, 26 Oct 2006 15:46:49 -0700

 @@ -188,19 +186,14 @@ void netpoll_poll(struct netpoll *np)
  static void refill_skbs(void)
  {
   struct sk_buff *skb;
 - unsigned long flags;
  
 - spin_lock_irqsave(&skb_list_lock, flags);
 - while (nr_skbs < MAX_SKBS) {
 + while (skb_queue_len(&skb_pool) < MAX_SKBS) {

Previously, the lock actually protected nr_skbs from going over
MAX_SKBS properly, but the new code does not.  skb_queue_len()
is lockless.

Stephen, I really appreciate your efforts to clean up netpoll,
but on every iteration I am finding simple errors on the first
patch every time.


Re: [PATCH] sealevel: uses arp_broken_ops

2006-10-26 Thread David Miller
From: Randy Dunlap [EMAIL PROTECTED]
Date: Thu, 26 Oct 2006 14:08:08 -0700

 Sealevel uses arp_broken_ops so it needs to depend on INET.
 
 Signed-off-by: Randy Dunlap [EMAIL PROTECTED]

Applied, thanks Randy.


Re: [KJ] [Patch] kmemdup() cleanup in net/

2006-10-26 Thread David Miller
From: Eric Sesterhenn [EMAIL PROTECTED]
Date: Thu, 26 Oct 2006 22:49:31 +0200

 arg, i thought i compile tested everything, please use this
 version.
 
 Signed-off-by: Eric Sesterhenn [EMAIL PROTECTED]

Definitely post-2.6.19 material, please resubmit when
2.6.20 merging opens up, thanks.


Re: [PATCH] Check if user has CAP_NET_ADMIN to change congestion control algorithm

2006-10-26 Thread Ian McDonald

On 10/27/06, David Miller [EMAIL PROTECTED] wrote:

From: Ian McDonald [EMAIL PROTECTED]
Date: Fri, 27 Oct 2006 12:59:30 +1300

 I don't agree with this at all. I would love Firefox, BitTorrent etc
 to implement usage of TCP-LP for example so they use unused
 bandwidth only.

 With this change applications can't do this.

 If we are going to restrict by capabilities then I think we should
 only restrict module loading - this way the admin of the box can
 decide what algorithms can be used.

You are using an example of a (supposedly) safe case of this
as a justification for allowing all cases.

It is bad, very bad, to allow arbitrary users to select arbitrary
congestion control algorithms.  It is just as bad as allowing them to
disable congestion control completely if that were an option.


OK understand your point here but I think low priority TCP has its
use. Don't agree it is just as bad, but it is bad under the wrong
circumstances - it's still better than UDP which has no congestion
control...

Don't want to make it over complicated though.

I think the most sense would be to restrict it as shown as tcp-lp is
the exception and allow tcp-lp via another mechanism. That is a
situation where the user could specify how low priority they want the
traffic to be... If I ever get enough time I'll have a go at it but
can't see it this year :-(

It actually makes more sense to tie the congestion control algorithm
to the route/destination IP if we are going to change it but that is a
whole other exercise in itself.


If someone, for example, builds all the algorithms statically into
their kernel, for testing as root, this lets all users on the machine
do the same which is not right.


This is the state at present as I understand it. However that doesn't
make it right.
--
Ian McDonald
Web: http://wand.net.nz/~iam4
Blog: http://imcdnzl.blogspot.com
WAND Network Research Group
Department of Computer Science
University of Waikato
New Zealand


Re: [Bugme-new] [Bug 7421] New: Oops, EIP is at atalk_sendmsg

2006-10-26 Thread David Miller
From: Andrew Morton [EMAIL PROTECTED]
Date: Thu, 26 Oct 2006 09:44:38 -0700

  Oct 26 10:01:07 localhost kernel: EIP is at atalk_sendmsg+0x15b/0x4e4 
  [appletalk]
  Oct 26 10:01:07 localhost kernel: eax:    ebx: 002f   ecx: 
     \
  edx: 
  Oct 26 10:01:07 localhost kernel: esi: cadcb600   edi:    ebp: 
  cc9d7eec   \
  esp: cc9d7d6c

Does this make the bug go away?

This code has been like this for a long time, I'm surprised
it never triggered before.  We properly set dev = rt->dev
right after the if (!rt) check, so the two settings removed
by this patch were not only OOPS-prone, they were also
superfluous.
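
The ordering problem described above can be sketched in userspace C: a lookup that may return NULL must be checked before its result is dereferenced. The types and functions below (`struct mock_route`, `mock_route_find()`, `mock_sendmsg()`) are hypothetical stand-ins, not the appletalk code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct mock_dev {
    int mtu;
};

struct mock_route {
    struct mock_dev *dev;
};

/* Hypothetical lookup: returns NULL when no route matches. */
struct mock_route *mock_route_find(struct mock_route *table, size_t n,
                                   size_t wanted)
{
    return wanted < n ? &table[wanted] : NULL;
}

/* Safe ordering: test the route first, read rt->dev only afterwards. */
int mock_sendmsg(struct mock_route *table, size_t n, size_t wanted,
                 struct mock_dev **out_dev)
{
    struct mock_route *rt = mock_route_find(table, n, wanted);

    assert(out_dev != NULL);
    if (!rt)
        return -ENETUNREACH;
    *out_dev = rt->dev;   /* rt is known to be non-NULL here */
    return 0;
}
```

A successful lookup yields the route's device; a failed lookup returns -ENETUNREACH without ever touching rt->dev.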

diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index 708e2e0..485e35c 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -1584,7 +1584,6 @@ #endif
 
if (usat->sat_addr.s_net || usat->sat_addr.s_node == ATADDR_ANYNODE) {
rt = atrtr_find(&usat->sat_addr);
-   dev = rt->dev;
} else {
struct atalk_addr at_hint;
 
@@ -1592,7 +1591,6 @@ #endif
at_hint.s_net  = at->src_net;
 
rt = atrtr_find(&at_hint);
-   dev = rt->dev;
}
if (!rt)
return -ENETUNREACH;


Re: [PATCH] Check if user has CAP_NET_ADMIN to change congestion control algorithm

2006-10-26 Thread Stephen Hemminger
On Fri, 27 Oct 2006 01:52:56 +0200
Hagen Paul Pfeifer [EMAIL PROTECTED] wrote:

 
 Check if user has CAP_NET_ADMIN capability to change congestion control
 algorithm.
 
 Under normal circumstances an application programmer doesn't have enough
 information to choose the right algorithm (unless he is the pchar/pathchar
 maintainer). In 99.9% of cases only the local host administrator has the
 knowledge to select a proper standard, system-wide algorithm (the remaining
 0.1% are for testing purposes). If we let the user select an alternative
 algorithm we introduce one potential weak spot - so we ban this eventuality.
 
 HGN

If you aren't doing experiments, don't compile it in your kernel.
If distros are including unfair congestion control, file a bug report.


Re: [PATCH 1/8] netpoll: skb private pool management

2006-10-26 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Thu, 26 Oct 2006 18:04:02 -0700

 On Thu, 26 Oct 2006 17:12:47 -0700 (PDT)
 David Miller [EMAIL PROTECTED] wrote:
 
  From: Stephen Hemminger [EMAIL PROTECTED]
  Date: Thu, 26 Oct 2006 15:46:49 -0700
  
   @@ -188,19 +186,14 @@ void netpoll_poll(struct netpoll *np)
static void refill_skbs(void)
{
 struct sk_buff *skb;
   - unsigned long flags;

   - spin_lock_irqsave(&skb_list_lock, flags);
   - while (nr_skbs < MAX_SKBS) {
   + while (skb_queue_len(&skb_pool) < MAX_SKBS) {
  
  Previously, the lock actually protected nr_skbs from going over
  MAX_SKBS properly, but the new code does not.  skb_queue_len()
  is lockless.
  
  Stephen, I really appreciate your efforts to clean up netpoll,
  but on every iteration I am finding simple errors on the first
  patch every time.
 
 racing over by one is not a big issue.

It's potentially racing by more than that, depending upon whether any
cpus take interrupts and are stalled for significant time after
making the decision to add.  The upper bound is something like
(2 * NCPUS) - 1.

It's a bug Stephen.
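
To see why an unlocked length check can overshoot by more than one, here is a small deterministic sketch in userspace C. It does not use real threads; it simply models the worst interleaving, in which every CPU samples the queue length before any of them appends. The function and parameter names are illustrative and are not taken from netpoll.c, and the sketch does not try to reproduce the exact upper bound quoted above.

```c
#include <assert.h>

/*
 * Model of a check-then-act race on a bounded pool (hypothetical helper,
 * not kernel code). Every simulated CPU first samples the length, and
 * only afterwards do the CPUs that saw room append one buffer each.
 * Returns the final pool length.
 */
int overshoot_without_lock(int ncpus, int max_skbs, int start_len)
{
    int len = start_len;
    int will_add = 0;

    assert(ncpus >= 0 && start_len >= 0);

    /* Phase 1: all CPUs evaluate "len < max_skbs" against the same value. */
    for (int cpu = 0; cpu < ncpus; cpu++)
        if (len < max_skbs)
            will_add++;

    /* Phase 2: each CPU that passed the check appends a buffer. */
    len += will_add;
    return len;
}

/* Same workload, but each CPU checks and appends as one atomic step. */
int fill_with_lock(int ncpus, int max_skbs, int start_len)
{
    int len = start_len;

    assert(ncpus >= 0 && start_len >= 0);

    for (int cpu = 0; cpu < ncpus; cpu++)
        if (len < max_skbs)   /* check and add under one lock hold */
            len++;
    return len;
}
```

With four CPUs, a limit of 32 and a pool holding 31 buffers, the unlocked model ends at 35 while the locked one stops at 32.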


Re: [PATCH 1/8] netpoll: skb private pool management

2006-10-26 Thread Stephen Hemminger
It was a dark and stormy night when Steve first saw the
netpoll beast. The beast was odd, and misshapen but not
extremely ugly.

"Let me take off one of your warts," he said. This wart
is where you tried to make an skb list yourself. If the
beast had ever run out of memory, he would have stupefied
himself unnecessarily.

The first try was painful, so he tried again till the bleeding
stopped.


Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
---
 net/core/netpoll.c |   53 +
 1 file changed, 21 insertions(+), 32 deletions(-)

--- netpoll.orig/net/core/netpoll.c 2006-10-26 19:12:36.0 -0700
+++ netpoll/net/core/netpoll.c  2006-10-26 19:16:05.0 -0700
@@ -36,9 +36,7 @@
 #define MAX_QUEUE_DEPTH (MAX_SKBS / 2)
 #define MAX_RETRIES 2
 
-static DEFINE_SPINLOCK(skb_list_lock);
-static int nr_skbs;
-static struct sk_buff *skbs;
+static struct sk_buff_head skb_pool;
 
 static DEFINE_SPINLOCK(queue_lock);
 static int queue_depth;
@@ -190,17 +188,15 @@
struct sk_buff *skb;
unsigned long flags;
 
-   spin_lock_irqsave(&skb_list_lock, flags);
-   while (nr_skbs < MAX_SKBS) {
+   spin_lock_irqsave(&skb_pool.lock, flags);
+   while (skb_pool.qlen < MAX_SKBS) {
skb = alloc_skb(MAX_SKB_SIZE, GFP_ATOMIC);
if (!skb)
break;
 
-   skb->next = skbs;
-   skbs = skb;
-   nr_skbs++;
+   __skb_queue_tail(&skb_pool, skb);
}
-   spin_unlock_irqrestore(&skb_list_lock, flags);
+   spin_unlock_irqrestore(&skb_pool.lock, flags);
 }
 
 static void zap_completion_queue(void)
@@ -229,38 +225,25 @@
put_cpu_var(softnet_data);
 }
 
-static struct sk_buff * find_skb(struct netpoll *np, int len, int reserve)
+static struct sk_buff *find_skb(struct netpoll *np, int len, int reserve)
 {
-   int once = 1, count = 0;
-   unsigned long flags;
-   struct sk_buff *skb = NULL;
+   int count = 0;
+   struct sk_buff *skb;
 
zap_completion_queue();
+   refill_skbs();
 repeat:
-   if (nr_skbs < MAX_SKBS)
-   refill_skbs();
 
skb = alloc_skb(len, GFP_ATOMIC);
-
-   if (!skb) {
-   spin_lock_irqsave(&skb_list_lock, flags);
-   skb = skbs;
-   if (skb) {
-   skbs = skb->next;
-   skb->next = NULL;
-   nr_skbs--;
-   }
-   spin_unlock_irqrestore(&skb_list_lock, flags);
-   }
+   if (!skb)
+   skb = skb_dequeue(&skb_pool);
 
if(!skb) {
-   count++;
-   if (once && (count == 100)) {
-   printk("out of netpoll skbs!\n");
-   once = 0;
+   if (++count < 10) {
+   netpoll_poll(np);
+   goto repeat;
}
-   netpoll_poll(np);
-   goto repeat;
+   return NULL;
}
 
atomic_set(&skb->users, 1);
@@ -764,6 +747,12 @@
return -1;
 }
 
+static int __init netpoll_init(void) {
+   skb_queue_head_init(&skb_pool);
+   return 0;
+}
+core_initcall(netpoll_init);
+
 void netpoll_cleanup(struct netpoll *np)
 {
struct netpoll_info *npinfo;
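
The allocation strategy in the reworked find_skb() (try the normal allocator, fall back to the private pool, and give up after a bounded number of retries instead of looping forever) can be sketched in userspace C. `struct buf_source` and `find_buf()` are hypothetical; the counters merely simulate whether each source can still hand out a buffer.

```c
#include <assert.h>
#include <stddef.h>

struct buf_source {
    int primary_left;   /* primary allocations that will still succeed */
    int reserve_left;   /* buffers remaining in the reserve pool */
    int polls;          /* how many times we polled between attempts */
};

/* Returns 1 = got a primary buffer, 2 = got a reserve buffer, 0 = gave up. */
int find_buf(struct buf_source *src)
{
    int count = 0;

    assert(src != NULL);
    for (;;) {
        if (src->primary_left > 0) {   /* stands in for alloc_skb() */
            src->primary_left--;
            return 1;
        }
        if (src->reserve_left > 0) {   /* stands in for skb_dequeue() */
            src->reserve_left--;
            return 2;
        }
        if (++count >= 10)             /* bounded: ten attempts in total */
            return 0;
        src->polls++;                  /* stands in for netpoll_poll(np) */
    }
}
```

With an empty primary allocator and one reserve buffer, the first call is served from the reserve; the next call makes ten attempts, polling nine times in between, and then returns failure to the caller.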


[PATCH 1/9] NetEffect 10Gb RNIC Driver: kernel Kconfig and makefiles

2006-10-26 Thread Glenn Grundstrom
The following set of patches contain the source code for the NetEffect
NE010 iWarp adapter running under the OpenFabrics Alliance software
stack.  This is a repost.

Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED]

==

diff -ruNp old/drivers/infiniband/Kconfig new/drivers/infiniband/Kconfig
--- old/drivers/infiniband/Kconfig  2006-10-25 09:57:43.0
-0500
+++ new/drivers/infiniband/Kconfig  2006-10-25 10:48:40.0
-0500
@@ -41,6 +41,8 @@ source drivers/infiniband/hw/ehca/Kconf
 
 source "drivers/infiniband/hw/amso1100/Kconfig"
 
+source "drivers/infiniband/hw/nes/Kconfig"
+
 source "drivers/infiniband/hw/cxgb3/Kconfig"
 
 source "drivers/infiniband/ulp/ipoib/Kconfig"
diff -ruNp old/drivers/infiniband/hw/nes/Kconfig
new/drivers/infiniband/hw/nes/Kconfig
--- old/drivers/infiniband/hw/nes/Kconfig   1969-12-31
18:00:00.0 -0600
+++ new/drivers/infiniband/hw/nes/Kconfig   2006-10-25
10:50:18.0 -0500
@@ -0,0 +1,15 @@
+config INFINIBAND_NES
+   tristate "NetEffect RNIC support"
+   depends on PCI && INET && INFINIBAND
+   ---help---
+ This is a low-level driver for NetEffect RDMA enabled
+ Network Interface Cards (RNIC).
+
+config INFINIBAND_NES_DEBUG
+   bool "Verbose debugging output"
+   depends on INFINIBAND_NES
+   default n
+   ---help---
+ This option causes the NetEffect RNIC driver to produce debug
+ messages.  Select this if you are developing the driver
+ or trying to diagnose a problem.
diff -ruNp old/drivers/infiniband/hw/nes/Makefile
new/drivers/infiniband/hw/nes/Makefile
--- old/drivers/infiniband/hw/nes/Makefile  1969-12-31
18:00:00.0 -0600
+++ new/drivers/infiniband/hw/nes/Makefile  2006-10-25
11:10:26.0 -0500
@@ -0,0 +1,27 @@
+EXTRA_CFLAGS += -Idrivers/infiniband/include
-Idrivers/infiniband/hw/nes/nes_tcpip/include
+
+ifdef CONFIG_INFINIBAND_NES_DEBUG
+EXTRA_CFLAGS += -DNES_DEBUG
+endif
+
+ifneq ($(KERNELRELEASE),)
+   obj-$(CONFIG_INFINIBAND_NES) += iw_nes.o
+
+   iw_nes-objs := \
+   nes.o \
+   nes_hw.o \
+   nes_nic.o \
+   nes_cm.o \
+   nes_utils.o \
+   nes_verbs.o 
+else
+   KERNELDIR ?= /usr/src/linux
+   PWD := $(shell pwd)
+
+default:
+   $(MAKE) -C $(KERNELDIR) M=$(PWD) modules
+
+clean:
+   $(MAKE) -C $(KERNELDIR) M=$(PWD) clean
+
+endif



[PATCH 2/9] NetEffect 10Gb RNIC Driver: main kernel driver c file

2006-10-26 Thread Glenn Grundstrom
Kernel driver patch 2 of 9.

Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED]

==

diff -ruNp old/drivers/infiniband/hw/nes/nes.c
new/drivers/infiniband/hw/nes/nes.c
--- old/drivers/infiniband/hw/nes/nes.c 1969-12-31 18:00:00.0
-0600
+++ new/drivers/infiniband/hw/nes/nes.c 2006-10-25 10:15:49.0
-0500
@@ -0,0 +1,653 @@
+/*
+ * Copyright (c) 2006 NetEffect, Inc. All rights reserved.
+ * Copyright (c) 2005 Open Grid Computing, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/etherdevice.h>
+#include <linux/ethtool.h>
+#include <linux/mii.h>
+#include <linux/if_vlan.h>
+#include <linux/crc32.h>
+#include <linux/in.h>
+#include <linux/init.h>
+#include <linux/if_arp.h>
+#include <asm/io.h>
+#include <asm/irq.h>
+#include <asm/byteorder.h>
+
+#include <rdma/ib_smi.h>
+#include <rdma/ib_verbs.h>
+#include <rdma/ib_pack.h>
+#include <rdma/iw_cm.h>
+
+#include "nes.h"
+
+MODULE_AUTHOR("NetEffect");
+MODULE_DESCRIPTION("NetEffect RNIC Low-level iWARP Driver");
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_VERSION(DRV_VERSION);
+
+int max_mtu = ETH_DATA_LEN;
+
+
+/* Interoperability */
+int mpa_version = 1;
+module_param(mpa_version, int, 0);
+MODULE_PARM_DESC(mpa_version, "MPA version to be used in MPA Req/Resp (0 or 1)");
+
+/* Interoperability */
+int disable_mpa_crc = 0;
+module_param(disable_mpa_crc, int, 0);
+MODULE_PARM_DESC(disable_mpa_crc, "Disable checking of MPA CRC");
+
+
+unsigned int send_first = 0;
+module_param(send_first, int, 0);
+MODULE_PARM_DESC(send_first, "Send RDMA Message First on Active Connection");
+
+
+LIST_HEAD(nes_adapter_list);
+LIST_HEAD(nes_dev_list);
+
+static int nes_device_event(struct notifier_block *notifier, unsigned
long event, void *ptr);
+static int nes_inetaddr_event(struct notifier_block *notifier, unsigned
long event, void *ptr);
+static void nes_print_macaddr(struct net_device *netdev);
+static irqreturn_t nes_interrupt(int, void *, struct pt_regs *);
+static int __devinit nes_probe(struct pci_dev *, const struct
pci_device_id *);
+static int nes_suspend(struct pci_dev *, pm_message_t);
+static int nes_resume(struct pci_dev *);
+static void __devexit nes_remove(struct pci_dev *);
+static int __init nes_init_module(void);
+static void __exit nes_exit_module(void);
+
+extern  struct nes_dev  *nes_ifs[];
+
+// _the_ function interface handle to nes_tcpip module
+struct nes_stack_ops *stack_ops_p;
+
+static struct pci_device_id nes_pci_table[] = {
+   {PCI_VENDOR_ID_NETEFFECT, PCI_DEVICE_ID_NETEFFECT_NE010,
PCI_ANY_ID, PCI_ANY_ID},
+   {0}
+};
+
+MODULE_DEVICE_TABLE(pci, nes_pci_table);
+
+
+static struct notifier_block nes_dev_notifier = {
+   notifier_call:  nes_device_event
+};
+
+static struct notifier_block nes_inetaddr_notifier = {
+   notifier_call:  nes_inetaddr_event
+};
+
+
+/**
+ * nes_device_event
+ * 
+ * @param notifier
+ * @param event
+ * @param ptr
+ * 
+ * @return int
+ */
+static int nes_device_event(struct notifier_block *notifier,
+                            unsigned long event, void *ptr)
+{
+   struct net_device  *netdev = (struct  net_device *)ptr;
+   struct nes_dev *nesdev;
+
+   dprintk("nes_device_event: notifier %p event=%ld netdev=%p, interface name = %s.\n",
+         notifier, event, netdev, netdev->name);
+
+   list_for_each_entry(nesdev, &nes_dev_list, list) {
+      dprintk("Nesdev list entry = 0x%p.\n", nesdev);
+      if (nesdev->netdev == netdev) {
+         switch (event) {
+ 

Re: [PATCH 1/9] NetEffect 10Gb RNIC Driver: kernel Kconfig and makefiles

2006-10-26 Thread Roland Dreier
  +source "drivers/infiniband/hw/nes/Kconfig"
  +
   source "drivers/infiniband/hw/cxgb3/Kconfig"

This patch seems to be against some non-standard tree, since cxgb3
isn't upstream yet.  And if cxgb3 were already upstream, it might be
polite to add yourself after it rather than before ;)

  +config INFINIBAND_NES_DEBUG
  +bool "Verbose debugging output"
  +depends on INFINIBAND_NES
  +default n
  +---help---
  +  This option causes the NetEffect RNIC driver to produce debug
  +  messages.  Select this if you are developing the driver
  +  or trying to diagnose a problem.

I recommend making this option invisible unless EMBEDDED is set, and
having the default be 'y', and making your debugging level changeable
at run-time.  That way everyone (in particular distros) will have this
turned on and you'll be able to figure out problems without making
end-users rebuild a kernel.
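
To make the run-time part concrete, here is a rough userspace sketch of a
debug mask that can be changed while the code is running.  It is only an
illustration, not code from the patch: nes_debug_level, nes_debug() and the
NES_DBG_* categories are made-up names, and in a driver the mask would
presumably be a writable module parameter rather than a plain variable.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical debug categories; these names are not taken from the patch. */
enum {
	NES_DBG_INIT = 1u << 0,
	NES_DBG_CM   = 1u << 1,
	NES_DBG_HW   = 1u << 2,
};

/* Assumed to be adjustable at run time (e.g. via a module parameter). */
static unsigned int nes_debug_level;

/* Print only when the category is enabled; returns 1 if something was printed. */
static int nes_debug(unsigned int category, const char *fmt, ...)
{
	va_list args;

	if (!(nes_debug_level & category))
		return 0;

	va_start(args, fmt);
	vfprintf(stderr, fmt, args);
	va_end(args);
	return 1;
}
```

With the mask at 0 nothing is printed, and setting it to
NES_DBG_CM | NES_DBG_HW turns on just those two categories without a rebuild.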

  +EXTRA_CFLAGS += -Idrivers/infiniband/include

Not needed in the kernel tree.

  -Idrivers/infiniband/hw/nes/nes_tcpip/include

I guess this is the mysterious TCP stack module.  Anyway if you need
this in the end, I would suggest removing the C flag and using
#include nes_tcpip/blah.h in your source.

  +ifdef CONFIG_INFINIBAND_NES_DEBUG
  +EXTRA_CFLAGS += -DNES_DEBUG
  +endif

There's no point to this -- just test CONFIG_INFINIBAND_NES_DEBUG directly.
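
As a small sketch of what testing the config symbol directly might look like
in the source: the dprintk below is a printf-based userspace stand-in, not
the driver's real macro, and nes_debug_compiled_in() is a helper invented
for the example.

```c
#include <stdio.h>

/*
 * The kernel build makes CONFIG_INFINIBAND_NES_DEBUG visible to C code when
 * the option is enabled, so no extra -DNES_DEBUG is required.
 */
#ifdef CONFIG_INFINIBAND_NES_DEBUG
#define dprintk(...) printf(__VA_ARGS__)
#else
#define dprintk(...) do { } while (0)
#endif

/* Reports whether debug output was compiled in (illustration only). */
static int nes_debug_compiled_in(void)
{
#ifdef CONFIG_INFINIBAND_NES_DEBUG
	return 1;
#else
	return 0;
#endif
}
```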

  +ifneq ($(KERNELRELEASE),)
  +obj-$(CONFIG_INFINIBAND_NES) += iw_nes.o
  +
  +iw_nes-objs := \
  +nes.o \
  +nes_hw.o \
  +nes_nic.o \
  +nes_cm.o \
  +nes_utils.o \
  +nes_verbs.o 
  +else

This should be your whole Makefile -- we're not going to merge stuff
into the kernel tree to build your module out of the kernel tree.
Also it's more idiomatic to put all your component objects onto one
(or a few) lines.

 - R.


[PATCH 3/9] NetEffect 10Gb RNIC Driver: openfabrics connection manager c file

2006-10-26 Thread Glenn Grundstrom
Kernel driver patch 3 of 9.

Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED]

==

diff -ruNp old/drivers/infiniband/hw/nes/nes_cm.c new/drivers/infiniband/hw/nes/nes_cm.c
--- old/drivers/infiniband/hw/nes/nes_cm.c  1969-12-31 18:00:00.0 -0600
+++ new/drivers/infiniband/hw/nes/nes_cm.c  2006-10-25 10:36:29.0 -0500
@@ -0,0 +1,1204 @@
+/*
+ * Copyright (c) 2006 NetEffect, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#define TCPOPT_TIMESTAMP 8
+
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/etherdevice.h>
+#include <linux/ethtool.h>
+#include <linux/mii.h>
+#include <linux/if_vlan.h>
+#include <linux/crc32.h>
+#include <linux/in.h>
+#include <linux/ip.h>
+#include <linux/tcp.h>
+#include <linux/init.h>
+#include <linux/if_arp.h>
+#include <linux/notifier.h>
+#include <linux/net.h>
+#include <linux/types.h>
+#include <asm/irq.h>
+#include <asm/byteorder.h>
+
+#include <net/neighbour.h>
+#include <net/route.h>
+#include <net/ip_fib.h>
+
+#include <rdma/ib_smi.h>
+#include <rdma/ib_verbs.h>
+#include <rdma/ib_pack.h>
+#include <rdma/iw_cm.h>
+
+#include "nes.h"
+
+#define OS_LINUX
+#define OS_LINUX_26
+#include "nes.h"
+#include "nes_sockets.h"
+
+extern unsigned int send_first;
+
+struct nes_v4_quad
+{
+   UINT32   rsvd0;
+   UINT32   DstIpAdrIndex;  /* Only most significant 5 bits are valid */
+   UINT32   SrcIpadr;
+   UINT32   TcpPorts; /* src is low, dest is high */
+};
+
+enum ietf_mpa_flags {
+   IETF_MPA_FLAGS_MARKERS = 0x80,   /* receive Markers */
+IETF_MPA_FLAGS_CRC = 0x40,  /* receive Markers */
+IETF_MPA_FLAGS_REJECT = 0x20,   /* Reject */
+};
+
+#define IEFT_MPA_KEY_REQ "MPA ID Req Frame"
+#define IEFT_MPA_KEY_REP "MPA ID Rep Frame"
+
+struct ietf_mpa_req_resp_frame {
+   u8 key[16];
+   u8 flags;
+   u8 rev;
+   u16 private_data_size;
+   u8 private_data[0];
+};
+
+static void connect_worker(void *);
+static void listen_worker(void *);
+
+extern int NesAdapterAdd(struct net_device *netdev);
+extern int NesInitSockets(void);
+extern void set_interface(
+  UINT32ip_addr,
+  UINT32mask,
+  UINT32bcastaddr,
+  UINT32type
+ );
+#define ADD_ADDR 1
+#define SET_ADDR 2
+#define DELETE_ADDR  3
+
+extern void bdc_cleanup(void);
+extern int mpa_version;
+
+unsigned char   DriverNamePrefix[] = "iw_nes";
+
+int nes_if_count = 0;
+
+#define MAX_NES_IFS 4
+struct nes_dev *nes_ifs[MAX_NES_IFS]= { 0 };
+
+
+/**
+ * nes_start_cm
+ * 
+ * @param nesdev
+ * @param new_ifa
+ * 
+ * @return int
+ */
+int nes_start_cm(struct nes_dev *nesdev, struct in_ifaddr *new_ifa)
+{
+   int result = 0;
+   dprintk("%s:%s:%u\n", __FILE__, __FUNCTION__, __LINE__);
+
+   nes_ifs[0] = nesdev;
+
+   stack_ops_p->dhcp_control(0x00);
+
+   // set ip and subnet mask
+   stack_ops_p->set_ip_info(ntohl(new_ifa->ifa_address),
+                            ntohl(new_ifa->ifa_mask));
+   stack_ops_p->set_dev_name(nesdev->netdev->name);
+
+   if (nesdev->nes_stack_start == 0) {
+      stack_ops_p->stack_init(nesdev->netdev);
+      /* TODO: Deal with multiple IP addresses */
+      nesdev->local_ipaddr = new_ifa->ifa_address;
+
+      nesdev->nes_stack_start = 1;
+   }
+
+   return result;
+}
+
+
+/**
+ * nes_stop_cm
+ * 
+ * @param nesdev
+ * 
+ * @return int
+ */
+int nes_stop_cm(struct nes_dev *nesdev)
+{
+   if (nesdev->nes_stack_start)
+   {
+   

Re: [PATCH 1/9] NetEffect 10Gb RNIC Driver: kernel Kconfig and makefiles

2006-10-26 Thread David Miller
From: Roland Dreier [EMAIL PROTECTED]
Date: Thu, 26 Oct 2006 16:58:41 -0700

>  > -Idrivers/infiniband/hw/nes/nes_tcpip/include
>
> I guess this is the mysterious TCP stack module.

What is this thing?


[PATCH 5/9] NetEffect 10Gb RNIC Driver: hardware interface c file

2006-10-26 Thread Glenn Grundstrom
Kernel driver patch 5 of 9.

Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED]

==

diff -ruNp old/drivers/infiniband/hw/nes/nes_hw.c new/drivers/infiniband/hw/nes/nes_hw.c
--- old/drivers/infiniband/hw/nes/nes_hw.c  1969-12-31 18:00:00.0 -0600
+++ new/drivers/infiniband/hw/nes/nes_hw.c  2006-10-25 10:15:50.0 -0500
@@ -0,0 +1,1470 @@
+/*
+ * Copyright (c) 2006 NetEffect, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/etherdevice.h>
+#include <linux/ip.h>
+#include <linux/tcp.h>
+
+#include "nes.h"
+
+
+#if defined(SA1)
+struct nes_init_values init_values[] = 
+{
+   {0x0600,0x}, 
+   {0x0604,0x}, 
+   {0x2000,0x0001}, 
+   {0x2004,0x0001}, 
+   {0x2008,0x}, 
+   {0x200C,0x0001}, 
+   {0x2010,0x0241}, 
+   {0x201C,0x75345678}, 
+   {0x5100,0x0008}, 
+   {0x6000,0x00e0}, 
+   {0x6008,0x00e0}, 
+// {0x6018,0x0001}, 
+// {0x6028,0x0001}, 
+   {0x6038,0x0003}, 
+   {0x60B8,0x0002}, 
+   {0x6090,0x}, 
+   {0x0900,0x2001}, 
+// {0x01E8,0x000208c2}, 
+   {0x01E8,0x000208c4}, 
+   {0x01EC,0x5f1e8480}, 
+   {0x01FC,0x00050005}, 
+   {0x0B00,0x1000}, 
+   {0x10C8,0x0003}, 
+   {0x5008,0x1F1F1F1F}, 
+   {0x5010,0x1F1F1F1F}, 
+   {0x5018,0x1F1F1F1F}, 
+   {0x5020,0x1F1F1F1F}, 
+// {0x60B8,0x0001}, 
+   {0x60C0,0x0194}, 
+   {0x60C8,0x0020}, 
+   {0x,0x}  
+};
+#endif
+
+
+/**
+ * nes_adapter_init - initialize adapter
+ *
+ * @param nesdev
+ * @param num_pds
+ * 
+ * @return struct nes_adapter*
+ */
+struct nes_adapter *nes_adapter_init(struct nes_dev *nesdev, unsigned long num_pds) {
+   struct nes_adapter *nesadapter = NULL;
+   int i=0;
+   int found = 0;
+   u32 u32temp;
+   u16 max_rq_wrs;
+   u16 max_sq_wrs;
+   u32 max_mr;
+   u32 max_256pbl;
+   u32 max_4kpbl;
+   u32 max_qp;
+   u32 max_irrq;
+   u32 max_cq;
+   u32 hte_index_mask;
+   u32 adapter_size;
+   u32 arp_table_size;
+
+   /* search the list of existing adapters */
+   list_for_each_entry(nesadapter, &nes_adapter_list, list) {
+      dprintk("Searching Adapter list for PCI devfn = 0x%X.\n", nesdev->pcidev->devfn);
+      if ((PCI_SLOT(nesadapter->devfn) == PCI_SLOT(nesdev->pcidev->devfn)) &&
+          (nesadapter->bus_number == nesdev->pcidev->bus->number)) {
+         found = 1;
+         break;
+      }
+   }
+
+   if (!found) {
+      if (nes_read_indexed(nesdev->index_reg,
+            NES_IDX_QP_CONTROL+PCI_FUNC(nesdev->pcidev->devfn)*8)) {
+         nes_write32(nesdev->regs+NES_SOFTWARE_RESET, 0xd);
+      }
+      /* enable the ports */
+      nes_write32(nesdev->regs+NES_SOFTWARE_RESET, 0);
+
+      u32temp = 0;
+      while ( nes_read_indexed(nesdev->index_reg,
+            NES_IDX_INT_CPU_STATUS) != 0x80 ) {
+         if (u32temp++  1) break;
+         mdelay(1);
+      }
+
+      if (nes_read_indexed(nesdev->index_reg, NES_IDX_INT_CPU_STATUS) != 0x80) {
+         printk(KERN_ERR PFX "Internal CPU not ready, status = %02X\n",
+

Re: [PATCH 4 of 9] NetEffect 10Gb RNIC Driver: kernel driver header files

2006-10-26 Thread David Miller
From: Glenn Grundstrom [EMAIL PROTECTED]
Date: Thu, 26 Oct 2006 19:06:23 -0500

 +#include nes_tcpip/include/nes_sockets.h

I want to know what in the world this nes_tcpip thing is?


RE: [PATCH 4 of 9] NetEffect 10Gb RNIC Driver: kernel driver header files

2006-10-26 Thread Glenn Grundstrom
It is part of our connection manager and assists with connection setup
and teardown only.

Glenn.

-Original Message-
From: David Miller [mailto:[EMAIL PROTECTED] 
Sent: Thursday, October 26, 2006 7:10 PM
To: Glenn Grundstrom; Glenn Grundstrom
Cc: openib-general@openib.org; netdev@vger.kernel.org
Subject: Re: [PATCH 4 of 9] NetEffect 10Gb RNIC Driver: kernel driver
header files

From: Glenn Grundstrom [EMAIL PROTECTED]
Date: Thu, 26 Oct 2006 19:06:23 -0500

 +#include nes_tcpip/include/nes_sockets.h

I want to know what in the world this nes_tcpip thing is?


[PATCH 6/9] NetEffect 10Gb RNIC Driver: kernel network interface c file

2006-10-26 Thread Glenn Grundstrom
Kernel driver patch 6 of 9.

Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED]

==

diff -ruNp old/drivers/infiniband/hw/nes/nes_nic.c new/drivers/infiniband/hw/nes/nes_nic.c
--- old/drivers/infiniband/hw/nes/nes_nic.c 1969-12-31 18:00:00.0 -0600
+++ new/drivers/infiniband/hw/nes/nes_nic.c 2006-10-25 10:15:50.0 -0500
@@ -0,0 +1,567 @@
+/*
+ * Copyright (c) 2006 NetEffect, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/etherdevice.h>
+#include <linux/ip.h>
+#include <linux/tcp.h>
+#include <linux/if_arp.h>
+
+#include "nes.h"
+
+static const u32 default_msg = NETIF_MSG_DRV | NETIF_MSG_PROBE | NETIF_MSG_LINK
+      | NETIF_MSG_IFUP | NETIF_MSG_IFDOWN;
+static int debug = -1;
+
+static int nes_netdev_open(struct net_device *);
+static int nes_netdev_stop(struct net_device *);
+static int nes_netdev_start_xmit(struct sk_buff *, struct net_device *);
+static struct net_device_stats *nes_netdev_get_stats(struct net_device *);
+static void nes_netdev_tx_timeout(struct net_device *);
+static int nes_netdev_set_mac_address(struct net_device *, void *);
+static int nes_netdev_change_mtu(struct net_device *, int);
+
+
+/**
+ * nes_netdev_open
+ * 
+ * @param netdev
+ * 
+ * @return int
+ */
+static int nes_netdev_open(struct net_device *netdev)
+{
+   struct nes_port *nes_port = netdev_priv(netdev);
+   struct nes_dev *nesdev = nes_port->nesdev;
+   u32 u32temp;
+   u32 nic_active_bit;
+   u32 nic_active;
+   u16 link_up = 0;
+
+   dprintk("%s:%s:%u\n", __FILE__, __FUNCTION__, __LINE__);
+
+   assert(nesdev != NULL);
+
+   if (netif_msg_ifup(nes_port))
+      dprintk(KERN_INFO PFX "%s: enabling interface\n", netdev->name);
+
+   /* clear the MAC interrupt status */
+   u32temp = nes_read_indexed(nesdev->index_reg, NES_IDX_MAC_INT_STATUS );
+   dprintk("Phy interrupt status = 0x%X.\n", u32temp);
+   nes_write_indexed(nesdev->index_reg, NES_IDX_MAC_INT_STATUS, u32temp);
+
+   nes_phy_init(nesdev);
+
+   nes_nic_qp_init(nesdev, netdev);
+
+   // Set packet filters
+   nic_active_bit = 1 << PCI_FUNC(nesdev->pcidev->devfn);
+   nic_active = nes_read_indexed(nesdev->index_reg, NES_IDX_NIC_ACTIVE);
+   nic_active |= nic_active_bit;
+   nic_active |= 2;
+   nes_write_indexed(nesdev->index_reg, NES_IDX_NIC_ACTIVE, nic_active);
+   nic_active = nes_read_indexed(nesdev->index_reg, NES_IDX_NIC_MULTICAST_ALL);
+   nic_active |= nic_active_bit;
+   nes_write_indexed(nesdev->index_reg, NES_IDX_NIC_MULTICAST_ALL, nic_active);
+   nic_active = nes_read_indexed(nesdev->index_reg, NES_IDX_NIC_BROADCAST_ON);
+   nic_active |= nic_active_bit;
+   nes_write_indexed(nesdev->index_reg, NES_IDX_NIC_BROADCAST_ON, nic_active);
+
+
+   nes_write32(nesdev->regs+NES_CQE_ALLOC, NES_CQE_ALLOC_NOTIFY_NEXT |
+         nesdev->hnic_cq.cq_number );
+
+   // TODO: add proper way to setup packet filters
+   // TODO: move some of the code from init_netdev?
+
+   if ( link_up ) {
+      /* Enable network packets */
+      nes_port->linkup = 1;
+      netif_start_queue(netdev);
+   } else {
+      nes_port->linkup = 0;
+      netif_carrier_off(netdev);
+   }
+
+   nes_write_indexed(nesdev->index_reg, NES_IDX_MAC_INT_MASK,
+         ~(NES_MAC_INT_LINK_STAT_CHG | NES_MAC_INT_XGMII_EXT |
+

Re: [PATCH 4 of 9] NetEffect 10Gb RNIC Driver: kernel driver header files

2006-10-26 Thread David Miller
From: Glenn Grundstrom [EMAIL PROTECTED]
Date: Thu, 26 Oct 2006 19:14:19 -0500

> It is part of our connection manager and assists with connection setup
> and teardown only.

I fear this is exactly the kind of stuff that we didn't want
to see start going into the kernel, and we've resisted the
TCP/IP stack offload stuff in the infiniband layer exactly
for this reason.


Re: [PATCH 2/9] NetEffect 10Gb RNIC Driver: main kernel driver c file

2006-10-26 Thread Roland Dreier
  +static int nes_device_event(struct notifier_block *notifier, unsigned
  long event, void *ptr);
  +static int nes_inetaddr_event(struct notifier_block *notifier, unsigned
  long event, void *ptr);
  +static void nes_print_macaddr(struct net_device *netdev);
  +static irqreturn_t nes_interrupt(int, void *, struct pt_regs *);
  +static int __devinit nes_probe(struct pci_dev *, const struct
  pci_device_id *);
  +static int nes_suspend(struct pci_dev *, pm_message_t);
  +static int nes_resume(struct pci_dev *);
  +static void __devexit nes_remove(struct pci_dev *);
  +static int __init nes_init_module(void);
  +static void __exit nes_exit_module(void);

Some of these declarations are already unneeded (eg at least
nes_init_module and nes_exit_module), and it would be good to
rearrange your code so that the rest can be removed too.

  +// _the_ function interface handle to nes_tcpip module

We prefer /* */ style comments

  +static struct notifier_block nes_dev_notifier = {
  +notifier_call:  nes_device_event
  +};

Standard C syntax (rather than the gcc extension) is preferred, like:

static struct notifier_block nes_dev_notifier = {
	.notifier_call = nes_device_event
};

  +/**
  + * nes_device_event
  + * 
  + * @param notifier
  + * @param event
  + * @param ptr
  + * 
  + * @return int
  + */

There's no point to comments like this.  I can read the function
declaration just fine, so save the screen real estate unless you have
something more to say.

  +unsigned long reg0_start, reg0_flags, reg0_len;
  +unsigned long reg1_start, reg1_flags, reg1_len;

PCI bars are type resource_size_t, which can be bigger than long...
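
A quick way to see why this matters is the userspace sketch below.  The
typedefs are stand-ins chosen for the example (a 64-bit resource_size_t and
a 32-bit type playing the role of unsigned long on a 32-bit build); the
helper name bar_fits_in_long32() is made up.

```c
#include <stdint.h>

/* Stand-ins for illustration: a 64-bit resource and a 32-bit "unsigned long". */
typedef uint64_t resource_size_t;
typedef uint32_t ulong32_t;

/* Returns 1 if the BAR start address survives being stored in a 32-bit long. */
static int bar_fits_in_long32(resource_size_t bar_start)
{
	ulong32_t truncated = (ulong32_t)bar_start;

	return (resource_size_t)truncated == bar_start;
}
```

A BAR below 4GB round-trips unchanged, while one placed above 4GB silently
loses its upper bits, which is the failure the declarations above invite.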

  +assert(pcidev != NULL);
  +assert(ent != NULL);

BUG_ON() is more idiomatic.  But this looks kind of useless anyway --
you'll get a nice enough oops if they are NULL.

  +/* Enable PCI device */
  +ret = pci_enable_device(pcidev);

This isn't major, but comments like this just waste screen space.  I
mean, someone who can't guess what pci_enable_device() does is
probably not going to be helped by the comment either.

  +/* pci tweaks */
  +pci_write_config_word(pcidev, 0x000c, 0xfc10);
  +pci_write_config_dword(pcidev, 0x0048, 0x00480007);

Looks rather magic and fragile.  Register 0xc is the cacheline size
and latency, right?  Why are you tweaking that?

And I assume 0x48 is somewhere in a capability structure.  It's much
better to use pci_find_capability() in that case.  That way when the
hardware guys tell you they have to rearrange the PCI header in the
next rev of the chip, you don't have to touch the driver.  However this
tweaking probably needs to be justified too.
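
For reference, the lookup that pci_find_capability() performs amounts to
walking a linked list in config space.  The sketch below does that walk over
a plain byte array standing in for a device's 256-byte config space; it is a
userspace illustration with a made-up function name (find_capability), not
the kernel implementation.  The offsets are the standard PCI ones: status
register at 0x06, capabilities pointer at 0x34.

```c
#include <stdint.h>

#define PCI_STATUS           0x06  /* low byte of the 16-bit status register */
#define PCI_STATUS_CAP_LIST  0x10  /* set when a capability list is present */
#define PCI_CAPABILITY_LIST  0x34  /* offset of the first capability */
#define PCI_CAP_LIST_ID      0     /* capability ID byte within an entry */
#define PCI_CAP_LIST_NEXT    1     /* pointer to the next entry */

/* Returns the config-space offset of the capability, or 0 if it is absent. */
static int find_capability(const uint8_t cfg[256], uint8_t cap_id)
{
	int ttl = 48;  /* bound the walk in case the list loops */
	uint8_t pos;

	if (!(cfg[PCI_STATUS] & PCI_STATUS_CAP_LIST))
		return 0;

	pos = cfg[PCI_CAPABILITY_LIST] & ~3;
	while (pos >= 0x40 && ttl--) {
		if (cfg[pos + PCI_CAP_LIST_ID] == cap_id)
			return pos;
		pos = cfg[pos + PCI_CAP_LIST_NEXT] & ~3;
	}
	return 0;
}
```

The driver would then write relative to the returned offset instead of a
hard-coded 0x48, so a rearranged header in a later chip rev keeps working.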

  +/**
  + * nes_suspend - power management
  + */
  +static int nes_suspend(struct pci_dev *pcidev, pm_message_t state)
  +{
  +dprintk("pcidev=%p\n", pcidev);
  +
  +return (0);
  +}
  

Umm, just don't have suspend/resume methods if you don't support it.

  +nes_adapter_free(nesdev->nesadapter);
  +
  +dprintk("nes_remove: calling iounmap.\n");
  +/* Unmap adapter PA space */
  +iounmap(nesdev->regs);
  +
  +/* Unregister with OpenFabrics */
  +if (nesdev->of_device_registered) {
  +dprintk("nes_remove: calling nes_unregister_device.\n");
  +nes_unregister_device(nesdev);
  +}

You can still have upper layers calling into you until
ib_unregister_device() returns, so it looks bogus to do things like
iounmap before then.  I think your cleanup needs to be reordered.
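
The ordering problem can be modelled in a few lines of userspace C.  This is
a toy, not driver code: struct fake_dev and every function here are invented
for the example, and the "upper layer call" simply stands for any verbs
entry point that ends up touching device registers.

```c
/* Hypothetical stand-in for per-device state; not the driver's struct nes_dev. */
struct fake_dev {
	int registered;  /* upper layers may call in while nonzero */
	int mapped;      /* register window is valid while nonzero */
	int bad_access;  /* calls that reached an unmapped window */
};

/* Models an upper-layer call that ends up touching device registers. */
static void upper_layer_call(struct fake_dev *dev)
{
	if (!dev->registered)
		return;  /* nothing can reach the device any more */
	if (!dev->mapped)
		dev->bad_access++;
}

/* Order used in the patch: unmap first, unregister afterwards. */
static int remove_unmap_first(void)
{
	struct fake_dev dev = { 1, 1, 0 };

	dev.mapped = 0;
	upper_layer_call(&dev);  /* still registered: hits unmapped space */
	dev.registered = 0;
	return dev.bad_access;
}

/* Suggested order: unregister first, then release resources. */
static int remove_unregister_first(void)
{
	struct fake_dev dev = { 1, 1, 0 };

	dev.registered = 0;
	upper_layer_call(&dev);  /* rejected before it reaches the hardware */
	dev.mapped = 0;
	return dev.bad_access;
}
```

In this model only the unmap-first order records a bad access, which is the
window the reordering is meant to close.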

And I don't think you're unregistering with OpenFabrics -- you're just
unregistering with the RDMA midlayer.

  +return (pci_module_init(&nes_pci_driver));

Just use pci_register_driver().
 - R.


Re: [PATCH 1/9] NetEffect 10Gb RNIC Driver: kernel Kconfig and makefiles

2006-10-26 Thread Roland Dreier
    David> What is this thing?

Good question.  I haven't gotten a straight answer yet, which is why I
called it mysterious.

 - R.


[PATCH 7/9] NetEffect 10Gb RNIC Driver: utility routines c file

2006-10-26 Thread Glenn Grundstrom
Kernel driver patch 7 of 9.

Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED]

==

diff -ruNp old/drivers/infiniband/hw/nes/nes_utils.c new/drivers/infiniband/hw/nes/nes_utils.c
--- old/drivers/infiniband/hw/nes/nes_utils.c   1969-12-31 18:00:00.0 -0600
+++ new/drivers/infiniband/hw/nes/nes_utils.c   2006-10-25 10:15:51.0 -0500
@@ -0,0 +1,488 @@
+/*
+ * Copyright (c) 2006 NetEffect, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/etherdevice.h>
+#include <linux/ethtool.h>
+#include <linux/mii.h>
+#include <linux/if_vlan.h>
+#include <linux/crc32.h>
+#include <linux/in.h>
+#include <linux/ip.h>
+#include <linux/tcp.h>
+#include <linux/init.h>
+
+#include <asm/io.h>
+#include <asm/irq.h>
+#include <asm/byteorder.h>
+
+#include <rdma/ib_smi.h>
+#include <rdma/ib_verbs.h>
+#include <rdma/ib_pack.h>
+#include <rdma/iw_cm.h>
+#include "nes.h"
+#include "nes_verbs.h"
+
+#define BITMASK(X) (1L << (X))
+#define NES_CRC_WID 32
+
+static u32 nesCRCTable[256];
+static u32 nesCRCInitialized = 0;
+
+static u32 nesCRCWidMask(u32);
+static u32 nes_crc_table_gen(u32 *, u32, u32, u32);
+static u32 reflect(u32, u32);
+static u32 byte_swap(u32, u32);
+
+
+/**
+ * nes_read_eeprom_values -
+ * 
+ * @param nesdev
+ * 
+ * @return int
+ */
+int nes_read_eeprom_values(struct nes_dev *nesdev)
+{
+   struct nes_adapter *nesadapter = nesdev->nesadapter;
+   u32 mac_addr_low;
+   u16 mac_addr_high;
+   u16 eeprom_data;
+   u16 eeprom_offset;
+
+   if (0 == nesadapter->firmware_eeprom_offset) {
+      /* Read the EEPROM Parameters */
+      eeprom_data = nes_read16_eeprom(nesdev->regs, 0);
+      dprintk("EEPROM Offset 0  = 0x%04X\n", eeprom_data);
+      eeprom_offset = 2 + (((eeprom_data & 0x007f)<<3)<<((eeprom_data & 0x0080)>>7));
+      dprintk("Firmware Offset = 0x%04X\n", eeprom_offset);
+      nesadapter->firmware_eeprom_offset = eeprom_offset;
+      eeprom_data = nes_read16_eeprom(nesdev->regs, eeprom_offset+4);
+      if (eeprom_data != 0x5746) {
+         dprintk("Not a valid Firmware Image = 0x%04X\n", eeprom_data);
+         return -1;
+      }
+
+      eeprom_data = nes_read16_eeprom(nesdev->regs, eeprom_offset+2);
+      dprintk("EEPROM Offset %u  = 0x%04X\n", eeprom_offset+2, eeprom_data);
+      eeprom_offset += ((eeprom_data & 0x00ff)<<3)<<((eeprom_data & 0x0100)>>8);
+      dprintk("Software Offset = 0x%04X\n", eeprom_offset);
+      nesadapter->software_eeprom_offset = eeprom_offset;
+      eeprom_data = nes_read16_eeprom(nesdev->regs, eeprom_offset);
+      dprintk("EEPROM Offset %u  = 0x%04X\n", eeprom_offset, eeprom_data);
+      eeprom_data = nes_read16_eeprom(nesdev->regs, eeprom_offset+4);
+      if (eeprom_data != 0x5753) {
+         dprintk("Not a valid Software Image = 0x%04X\n", eeprom_data);
+         return -1;
+      }
+
+      eeprom_offset = nesadapter->software_eeprom_offset;
+      eeprom_offset += 10;
+      mac_addr_high = nes_read16_eeprom(nesdev->regs, eeprom_offset);
+      eeprom_offset += 2;
+      mac_addr_low = (u32)nes_read16_eeprom(nesdev->regs, eeprom_offset);
+      eeprom_offset += 2;
+      mac_addr_low <<= 16;
+      mac_addr_low += (u32)nes_read16_eeprom(nesdev->regs, eeprom_offset);
+      dprintk("MAC Address = 0x%04X%08X\n", mac_addr_high,

[PATCH 9/9] NetEffect 10Gb RNIC Driver: openfabrics verbs header file

2006-10-26 Thread Glenn Grundstrom
Kernel driver patch 9 of 9.

Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED]

==

diff -ruNp old/drivers/infiniband/hw/nes/nes_verbs.h new/drivers/infiniband/hw/nes/nes_verbs.h
--- old/drivers/infiniband/hw/nes/nes_verbs.h   1969-12-31 18:00:00.0 -0600
+++ new/drivers/infiniband/hw/nes/nes_verbs.h   2006-10-25 10:15:52.0 -0500
@@ -0,0 +1,144 @@
+/*
+ * Copyright (c) 2006 NetEffect, Inc. All rights reserved.
+ * Copyright (c) 2005 Open Grid Computing, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#ifndef NES_VERBS_H
+#define NES_VERBS_H
+
+struct nes_dev;
+
+#define NES_MAX_USER_DB_REGIONS  4096
+#define NES_MAX_USER_WQ_REGIONS  4096
+
+struct nes_ucontext {
+   struct ib_ucontext ibucontext;
+   struct nes_dev *nesdev;
+   /* need to track mmapped areas, start with bit vector? */
+   unsigned long mmap_wq_offset;
+   unsigned long mmap_cq_offset; /* to be removed */
+   int index;  /* rnic index (minor) */
+   unsigned long allocated_doorbells[BITS_TO_LONGS(NES_MAX_USER_DB_REGIONS)];
+   u16 mmap_db_index[NES_MAX_USER_DB_REGIONS];
+   u16 first_free_db;
+   unsigned long allocated_wqs[BITS_TO_LONGS(NES_MAX_USER_WQ_REGIONS)];
+   struct nes_qp * mmap_nesqp[NES_MAX_USER_WQ_REGIONS];
+   u16 first_free_wq;
+   struct list_head cq_reg_mem_list;
+};
+
+struct nes_pd {
+   struct ib_pd ibpd;
+   u16 pd_id;
+   atomic_t sqp_count;
+   u16 mmap_db_index;
+};
+
+struct nes_mr {
+   struct ib_mr ibmr;
+   u16 pbls_used;
+   u8 mode;
+   u8 pbl_4k; 
+};
+
+struct nes_hw_pb {
+   u32 pa_low;
+   u32 pa_high;
+};
+
+struct nes_vpbl {
+   dma_addr_t pbl_pbase;
+   struct nes_hw_pb *pbl_vbase;
+};
+
+struct nes_root_vpbl {
+   dma_addr_t pbl_pbase;
+   struct nes_hw_pb *pbl_vbase;
+   struct nes_vpbl *leaf_vpbl;
+};
+
+struct nes_av;
+
+struct nes_cq {
+   struct ib_cq ibcq;
+   struct nes_hw_cq hw_cq;
+   u32 polled_completions;
+   u32 cq_mem_size;
+   spinlock_t lock;
+   u8 virtual_cq;
+   u8 pad[3];
+};
+
+struct nes_wq {
+   spinlock_t lock;
+};
+
+struct iw_cm_id;
+
+struct nes_qp {
+   struct ib_qp ibqp;
+   enum ib_qp_state ibqp_state;
+   u32 iwarp_state;
+   void * allocated_buffer;
+   struct iw_cm_id *cm_id;
+   struct workqueue_struct *wq;
+   struct workqueue_struct *aewq;
+   struct socket *ksock;
+   struct nes_cq *nesscq;
+   struct nes_cq *nesrcq;
+   struct nes_pd *nespd;
+struct ietf_mpa_req_resp_frame *ietf_frame;
+dma_addr_t ietf_frame_pbase;
+   wait_queue_head_t state_waitq;
+   unsigned long socket;
+   struct nes_hw_qp hwqp;
+   struct work_struct work;
+   struct work_struct ae_work;
+   u32 hte_index;
+   u32 last_aeq;
+   u32 qp_mem_size;
+   atomic_t refcount;
+   u32 mmap_sq_db_index;
+   u32 mmap_rq_db_index;
+spinlock_t lock;
+   /* TODO: should move these two to the hw qp? */
+   struct nes_qp_context *nesqp_context;
+   dma_addr_t nesqp_context_pbase;
+   u32 bytes_sent;
+u16 private_data_len;
+u8 active_conn;
+u8 skip_lsmm;
+u8 user_mode;
+   u8 hte_added;
+};
+
+#endif /* NES_VERBS_H */





Re: [PATCH 4 of 9] NetEffect 10Gb RNIC Driver: kernel driver header files

2006-10-26 Thread Roland Dreier
  I fear this is exactly the kind of stuff that we didn't want
  to see start going into the kernel, and we've resisted the
  TCP/IP stack offload stuff in the infiniband layer exactly
  for this reason.

We're definitely not going to merge a second TCP stack in any form.

 - R.


[PATCH 2/5] NetEffect 10Gb RNIC Userspace Library: makefile generation

2006-10-26 Thread Glenn Grundstrom
Userspace driver patch 2 of 5.

Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED]

==

diff -ruNp old/src/userspace/libnes/libnes.spec.in new/src/userspace/libnes/libnes.spec.in
--- old/src/userspace/libnes/libnes.spec.in 1969-12-31 18:00:00.0 -0600
+++ new/src/userspace/libnes/libnes.spec.in 2006-10-25 11:11:23.0 -0500
@@ -0,0 +1,57 @@
+
+%define ver  @VERSION@
+
+Name: libnes
+Version: 0.1
+Release: 0.%{?dist}
+Summary: NetEffect RNIC Userspace Driver
+
+Group: System Environment/Libraries
+License: GPL/BSD
+Url: http://openib.org/
+Source: http://openib.org/downloads/%{name}-%{ver}.tar.gz
+BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n)
+
+BuildRequires: libibverbs-devel
+
+%description
+libnes provides a device-specific userspace driver for NetEffect RNICs
+for use with the libibverbs library.
+
+%package devel
+Summary: Development files for the libnes driver
+Group: System Environment/Libraries
+Requires: %{name} = %{version}-%{release}
+
+%description devel
+Static version of libnes that may be linked directly to an
+application, which may be useful for debugging.
+
+%prep
+%setup -q -n %{name}-%{ver}
+
+%build
+%configure
+make %{?_smp_mflags}
+
+%install
+rm -rf $RPM_BUILD_ROOT
+%makeinstall
+# remove unpackaged files from the buildroot
+rm -f $RPM_BUILD_ROOT%{_libdir}/infiniband/*.la
+
+%clean
+rm -rf $RPM_BUILD_ROOT
+
+%files
+%defattr(-,root,root,-)
+%{_libdir}/infiniband/nes.so
+%doc AUTHORS COPYING ChangeLog README
+
+%files devel
+%defattr(-,root,root,-)
+%{_libdir}/infiniband/nes.a
+
+%changelog
+* Wed May 10 2006 nesdev [EMAIL PROTECTED] - 1.0
+- First development Effort
diff -ruNp old/src/userspace/libnes/Makefile.am new/src/userspace/libnes/Makefile.am
--- old/src/userspace/libnes/Makefile.am 1969-12-31 18:00:00.0 -0600
+++ new/src/userspace/libnes/Makefile.am 2006-10-25 11:11:30.0 -0500
@@ -0,0 +1,25 @@
+
+neslibdir = $(libdir)/infiniband
+
+neslib_LTLIBRARIES = src/nes.la
+
+src_nes_la_CFLAGS = -g -Wall -D_GNU_SOURCE
+
+if HAVE_LD_VERSION_SCRIPT
+nes_version_script = -Wl,--version-script=$(srcdir)/src/nes.map
+else
+nes_version_script =
+endif
+
+src_nes_la_SOURCES = src/nes_umain.c src/nes_uverbs.c
+src_nes_la_LDFLAGS = -avoid-version -module \
+$(nes_version_script)
+
+DEBIAN = debian/changelog debian/compat debian/control debian/copyright \
+debian/libnes1.install debian/libnes-dev.install debian/rules
+
+EXTRA_DIST = src/nes.h src/nes-abi.h \
+src/nes.map libnes.spec.in $(DEBIAN)
+
+dist-hook: libnes.spec
+   cp libnes.spec $(distdir)





[PATCH 3/5] NetEffect 10Gb RNIC Userspace Library: userspace header files

2006-10-26 Thread Glenn Grundstrom
Userspace driver patch 3 of 5.

Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED]

==

diff -ruNp old/src/userspace/libnes/src/nes-abi.h new/src/userspace/libnes/src/nes-abi.h
--- old/src/userspace/libnes/src/nes-abi.h 1969-12-31 18:00:00.0 -0600
+++ new/src/userspace/libnes/src/nes-abi.h 2006-10-25 10:27:58.0 -0500
@@ -0,0 +1,99 @@
+/*
+ * Copyright (c) 2006 NetEffect, Inc. All rights reserved.
+ * Copyright (c) 2006 Open Grid Computing, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+#ifndef nes_ABI_H
+#define nes_ABI_H
+
+#include <infiniband/kern-abi.h>
+
+struct nes_ualloc_ucontext_resp {
+   struct ibv_get_context_resp ibv_resp;
+   __u32 max_pds; /* maximum pds allowed for this user process */
+   __u32 max_qps; /* maximum qps allowed for this user process */
+   __u32 wq_size; /* defines the size of the WQs (sq+rq) allocated to the mmaped area */
+   __u32 reserved;
+};
+
+struct nes_ualloc_pd_resp {
+   struct ibv_alloc_pd_resp ibv_resp;
+   __u32 pd_id;
+   __u32 db_index;
+};
+
+struct nes_ucreate_cq {
+   struct ibv_create_cq ibv_cmd;
+   __u64 user_cq_buffer;
+};
+
+enum nes_umemreg_type {
+   NES_UMEMREG_TYPE_MEM = 0x0000,
+   NES_UMEMREG_TYPE_QP = 0x0001,
+   NES_UMEMREG_TYPE_CQ = 0x0002,
+};
+
+struct nes_ureg_mr {
+   struct ibv_reg_mr ibv_cmd;
+   __u32 reg_type; /* indicates if id is memory, QP or CQ */
+   __u32 reserved; /* QP or CQ ID */
+};
+
+struct nes_ucreate_cq_resp {
+   struct ibv_create_cq_resp ibv_resp;
+   __u32 cq_id;
+   __u32 cq_size;
+   __u32 mmap_db_index;
+   __u32 reserved;
+};
+
+struct nes_ucreate_qp {
+   struct ibv_create_qp ibv_cmd;
+};
+
+struct nes_ucreate_qp_resp {
+   struct ibv_create_qp_resp ibv_resp;
+   __u32 qp_id;
+   __u32 actual_sq_size;
+   __u32 actual_rq_size;
+   __u32 mmap_sq_db_index;
+   __u32 mmap_rq_db_index;
+   __u32 reserved;
+};
+
+
+struct nes_cqe {
+   __u32 header;
+   __u32 len;
+   __u32 wrid_hi_stag;
+   __u32 wrid_low_msn;
+};
+
+#endif /* nes_ABI_H */
diff -ruNp old/src/userspace/libnes/src/nes.map new/src/userspace/libnes/src/nes.map
--- old/src/userspace/libnes/src/nes.map 1969-12-31 18:00:00.0 -0600
+++ new/src/userspace/libnes/src/nes.map 2006-10-25 11:11:45.0 -0500
@@ -0,0 +1,6 @@
+{
+   global:
+   ibv_driver_init;
+   openib_driver_init;
+   local: *;
+};
diff -ruNp old/src/userspace/libnes/src/nes_umain.h new/src/userspace/libnes/src/nes_umain.h
--- old/src/userspace/libnes/src/nes_umain.h 1969-12-31 18:00:00.0 -0600
+++ new/src/userspace/libnes/src/nes_umain.h 2006-10-25 10:27:59.0 -0500
@@ -0,0 +1,271 @@
+/*
+ * Copyright (c) 2006 NetEffect, Inc. All rights reserved.
+ * Copyright (c) 2006 Open Grid Computing, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ 

[PATCH 4/5] NetEffect 10Gb RNIC Userspace Library: userspace main c file

2006-10-26 Thread Glenn Grundstrom
Userspace driver patch 4 of 5.

Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED]

==

diff -ruNp old/src/userspace/libnes/src/nes_umain.c new/src/userspace/libnes/src/nes_umain.c
--- old/src/userspace/libnes/src/nes_umain.c 1969-12-31 18:00:00.0 -0600
+++ new/src/userspace/libnes/src/nes_umain.c 2006-10-25 10:27:58.0 -0500
@@ -0,0 +1,251 @@
+/*
+ * Copyright (c) 2006 NetEffect, Inc. All rights reserved.
+ * Copyright (c) 2006 Open Grid Computing, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+#if HAVE_CONFIG_H
+#  include <config.h>
+#endif /* HAVE_CONFIG_H */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <errno.h>
+#include <sys/mman.h>
+#include <pthread.h>
+
+#include "nes_umain.h"
+#include "nes-abi.h"
+
+long int page_size;
+
+#ifdef HAVE_SYSFS_LIBSYSFS_H
+#include <sysfs/libsysfs.h>
+#endif
+
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+
+#ifndef PCI_VENDOR_ID_NETEFFECT
+#define PCI_VENDOR_ID_NETEFFECT 0x1678
+#endif
+
+#ifndef PCI_DEVICE_ID_NETEFFECT_nes
+#define PCI_DEVICE_ID_NETEFFECT_nes 0x0100
+#endif
+
+
+#define HCA(v, d, t) \
+   { .vendor = PCI_VENDOR_ID_##v,  \
+ .device = PCI_DEVICE_ID_NETEFFECT_##d,\
+ .type = NETEFFECT_##t }
+
+struct {
+   unsigned vendor;
+   unsigned device;
+   enum nes_uhca_type type;
+} hca_table[] = {
+   HCA(NETEFFECT, nes, nes),};
+
+static struct ibv_context *nes_ualloc_context(struct ibv_device *, int);
+static void nes_ufree_context(struct ibv_context *);
+
+static struct ibv_context_ops nes_uctx_ops = {
+   .query_device = nes_uquery_device,
+   .query_port = nes_uquery_port,
+   .alloc_pd = nes_ualloc_pd,
+   .dealloc_pd = nes_ufree_pd,
+   .reg_mr = nes_ureg_mr,
+   .dereg_mr = nes_udereg_mr,
+   .create_cq = nes_ucreate_cq,
+   .poll_cq = nes_upoll_cq,
+   .req_notify_cq = nes_uarm_cq,
+   .cq_event = NULL,
+   .resize_cq = nes_uresize_cq,
+   .destroy_cq = nes_udestroy_cq,
+   .create_srq = NULL,
+   .modify_srq = NULL,
+   .query_srq = NULL,
+   .destroy_srq = NULL,
+   .post_srq_recv = NULL,
+   .create_qp = nes_ucreate_qp,
+   .modify_qp = nes_umodify_qp,
+   .destroy_qp = nes_udestroy_qp,
+   .post_send = nes_upost_send,
+   .post_recv = nes_upost_recv,
+   .create_ah = nes_ucreate_ah,
+   .destroy_ah = nes_udestroy_ah,
+   .attach_mcast = nes_uattach_mcast,
+   .detach_mcast = nes_udetach_mcast
+};
+
+
+/**
+ * nes_ualloc_context
+ * 
+ * @param ibdev
+ * @param cmd_fd
+ * 
+ * @return struct ibv_context*
+ */
+static struct ibv_context *nes_ualloc_context(struct ibv_device *ibdev, int cmd_fd)
+{
+   // void *mymmapp = NULL;
+   struct ibv_pd *ibv_pd;
+   struct nes_uvcontext *nesvctx;
+   struct ibv_get_context cmd;
+   struct nes_ualloc_ucontext_resp resp;
+
+   page_size = sysconf(_SC_PAGESIZE);
+
+   nesvctx = malloc(sizeof *nesvctx);
+   if (!nesvctx)
+   return NULL;
+
+   nesvctx->ibv_ctx.cmd_fd = cmd_fd;
+
+   if (ibv_cmd_get_context(&nesvctx->ibv_ctx, &cmd, sizeof cmd,
+   &resp.ibv_resp, sizeof resp))
+   goto err_free;
+
+   nesvctx->ibv_ctx.device = ibdev;
+   nesvctx->ibv_ctx.ops = nes_uctx_ops;
+   nesvctx->max_pds = resp.max_pds;
+   nesvctx->max_qps = resp.max_qps;
+   nesvctx->wq_size = resp.wq_size;
+
+   /* Get a doorbell region for the CQs */
+   ibv_pd = 

[PATCH 5/5] NetEffect 10Gb RNIC Userspace Library: openfabrics verbs interface c file

2006-10-26 Thread Glenn Grundstrom
Userspace driver patch 5 of 5.

Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED]

==

diff -ruNp old/src/userspace/libnes/src/nes_uverbs.c new/src/userspace/libnes/src/nes_uverbs.c
--- old/src/userspace/libnes/src/nes_uverbs.c 1969-12-31 18:00:00.0 -0600
+++ new/src/userspace/libnes/src/nes_uverbs.c 2006-10-25 10:27:59.0 -0500
@@ -0,0 +1,933 @@
+/*
+ * Copyright (c) 2006 NetEffect, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+#if HAVE_CONFIG_H
+#  include <config.h>
+#endif /* HAVE_CONFIG_H */
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <unistd.h>
+#include <signal.h>
+#include <errno.h>
+#include <pthread.h>
+#include <malloc.h>
+#include <sys/mman.h>
+#include <netinet/in.h>
+#include <linux/compiler.h>
+
+#include "nes_umain.h"
+#include "nes-abi.h"
+
+extern long int page_size;
+
+
+/**
+ * nes_uquery_device
+ * 
+ * @param context
+ * @param attr
+ * 
+ * @return int
+ */
+int nes_uquery_device(struct ibv_context *context, struct ibv_device_attr *attr)
+{
+   struct ibv_query_device cmd;
+   uint64_t reserved;
+   int ret;
+
+   ret = ibv_cmd_query_device(context, attr, &reserved, &cmd, sizeof cmd);
+   if (ret)
+   return ret;
+
+   return 0;
+}
+
+
+/**
+ * nes_uquery_port
+ * 
+ * @param context
+ * @param port
+ * @param attr
+ * 
+ * @return int
+ */
+int nes_uquery_port(struct ibv_context *context, uint8_t port,
+   struct ibv_port_attr *attr)
+{
+   struct ibv_query_port cmd;
+
+   return ibv_cmd_query_port(context, port, attr, &cmd, sizeof cmd);
+}
+
+
+/**
+ * nes_ualloc_pd
+ * 
+ * @param context
+ * 
+ * @return struct ibv_pd*
+ */
+struct ibv_pd *nes_ualloc_pd(struct ibv_context *context)
+{
+   struct ibv_alloc_pd cmd;
+   struct nes_ualloc_pd_resp resp;
+   struct nes_upd *nesupd;
+
+   nesupd = malloc(sizeof *nesupd);
+   if (!nesupd)
+   return NULL;
+
+   if (ibv_cmd_alloc_pd(context, &nesupd->ibv_pd, &cmd, sizeof cmd,
+&resp.ibv_resp, sizeof resp)) {
+   free(nesupd);
+   return NULL;
+   }
+   nesupd->pd_id = resp.pd_id;
+   nesupd->db_index = resp.db_index;
+
+   nesupd->udoorbell = mmap(NULL, 4096, PROT_WRITE | PROT_READ, MAP_SHARED,
+ context->cmd_fd, nesupd->db_index * 4096);
+
+   if (((void *)-1) == nesupd->udoorbell) {
+   free(nesupd);
+   return NULL;
+   }
+
+   return (&nesupd->ibv_pd);
+}
+
+
+/**
+ * nes_ufree_pd
+ * 
+ * @param pd
+ * 
+ * @return int
+ */
+int nes_ufree_pd(struct ibv_pd *pd)
+{
+   int ret;
+   struct nes_upd *nesupd;
+// fprintf(stderr, PFX "%s\n", __FUNCTION__);
+
+   nesupd = to_nes_upd(pd);
+
+   ret = ibv_cmd_dealloc_pd(pd);
+   if (ret)
+   return ret;
+
+   munmap((void *)nesupd->udoorbell, 4096);
+   free(nesupd);
+   return 0;
+}
+
+
+/**
+ * nes_ureg_mr
+ * 
+ * @param pd
+ * @param addr
+ * @param length
+ * @param access
+ * 
+ * @return struct ibv_mr*
+ */
+struct ibv_mr *nes_ureg_mr(struct ibv_pd *pd, void *addr,
+  size_t length, enum ibv_access_flags access)
+{
+   struct ibv_mr *mr;
+   struct nes_ureg_mr cmd;
+   
+// fprintf(stderr, PFX "%s: address = %p, length = %u.\n", __FUNCTION__, addr, length);
+
+   mr = malloc(sizeof *mr);
+   if (!mr)
+   return NULL;
+
+   cmd.reg_type = NES_UMEMREG_TYPE_MEM;
+   if (ibv_cmd_reg_mr(pd, addr, length, (uintptr_t) addr,
+  

Re: [PATCH] Rewrite e100_phys_id

2006-10-26 Thread Matthew Wilcox
On Thu, Oct 26, 2006 at 01:04:32PM -0700, Auke Kok wrote:
 no objections, so I'll ACK it with the notion that I'm going to let our 
 labs do some more testing on it with all the latest changes to it.

Thanks, Auke.  Here's the equivalent patch for e1000.  I don't have a
convenient machine to test it on, but it reduces the size of the driver
by 1.5k.

diff --git a/drivers/net/e1000/e1000.h b/drivers/net/e1000/e1000.h
index 7ecce43..1e22da6 100644
--- a/drivers/net/e1000/e1000.h
+++ b/drivers/net/e1000/e1000.h
@@ -257,9 +257,6 @@ #endif
struct work_struct reset_task;
uint8_t fc_autoneg;
 
-   struct timer_list blink_timer;
-   unsigned long led_status;
-
/* TX */
struct e1000_tx_ring *tx_ring;  /* One per active queue */
unsigned long tx_queue_len;
diff --git a/drivers/net/e1000/e1000_ethtool.c b/drivers/net/e1000/e1000_ethtool.c
index 773821e..620afa5 100644
--- a/drivers/net/e1000/e1000_ethtool.c
+++ b/drivers/net/e1000/e1000_ethtool.c
@@ -1819,61 +1819,15 @@ e1000_set_wol(struct net_device *netdev,
return 0;
 }
 
-/* toggle LED 4 times per second = 2 blinks per second */
-#define E1000_ID_INTERVAL  (HZ/4)
-
-/* bit defines for adapter->led_status */
-#define E1000_LED_ON   0
-
-static void
-e1000_led_blink_callback(unsigned long data)
-{
-   struct e1000_adapter *adapter = (struct e1000_adapter *) data;
-
-   if (test_and_change_bit(E1000_LED_ON, &adapter->led_status))
-   e1000_led_off(&adapter->hw);
-   else
-   e1000_led_on(&adapter->hw);
-
-   mod_timer(&adapter->blink_timer, jiffies + E1000_ID_INTERVAL);
-}
-
 static int
 e1000_phys_id(struct net_device *netdev, uint32_t data)
 {
struct e1000_adapter *adapter = netdev_priv(netdev);
 
-   if (!data || data > (uint32_t)(MAX_SCHEDULE_TIMEOUT / HZ))
-   data = (uint32_t)(MAX_SCHEDULE_TIMEOUT / HZ);
-
-   if (adapter->hw.mac_type < e1000_82571) {
-   if (!adapter->blink_timer.function) {
-   init_timer(&adapter->blink_timer);
-   adapter->blink_timer.function = e1000_led_blink_callback;
-   adapter->blink_timer.data = (unsigned long) adapter;
-   }
-   e1000_setup_led(&adapter->hw);
-   mod_timer(&adapter->blink_timer, jiffies);
-   msleep_interruptible(data * 1000);
-   del_timer_sync(&adapter->blink_timer);
-   } else if (adapter->hw.phy_type == e1000_phy_ife) {
-   if (!adapter->blink_timer.function) {
-   init_timer(&adapter->blink_timer);
-   adapter->blink_timer.function = e1000_led_blink_callback;
-   adapter->blink_timer.data = (unsigned long) adapter;
-   }
-   mod_timer(&adapter->blink_timer, jiffies);
-   msleep_interruptible(data * 1000);
-   del_timer_sync(&adapter->blink_timer);
-   e1000_write_phy_reg(&(adapter->hw), IFE_PHY_SPECIAL_CONTROL_LED, 0);
-   } else {
-   e1000_blink_led_start(&adapter->hw);
-   msleep_interruptible(data * 1000);
-   }
+   if (data == 0)
+   data = 2;
 
-   e1000_led_off(&adapter->hw);
-   clear_bit(E1000_LED_ON, &adapter->led_status);
-   e1000_cleanup_led(&adapter->hw);
+   e1000_blink_led(&adapter->hw, data);
 
return 0;
 }
diff --git a/drivers/net/e1000/e1000_hw.c b/drivers/net/e1000/e1000_hw.c
index 65077f3..db5e999 100644
--- a/drivers/net/e1000/e1000_hw.c
+++ b/drivers/net/e1000/e1000_hw.c
@@ -6071,7 +6071,7 @@ e1000_id_led_init(struct e1000_hw * hw)
  *
  * hw - Struct containing variables accessed by shared code
  */
-int32_t
+static int32_t
 e1000_setup_led(struct e1000_hw *hw)
 {
 uint32_t ledctl;
@@ -6123,50 +6123,11 @@ e1000_setup_led(struct e1000_hw *hw)
 
 
 /**
- * Used on 82571 and later Si that has LED blink bits.
- * Callers must use their own timer and should have already called
- * e1000_id_led_init()
- * Call e1000_cleanup led() to stop blinking
- *
- * hw - Struct containing variables accessed by shared code
- */
-int32_t
-e1000_blink_led_start(struct e1000_hw *hw)
-{
-int16_t  i;
-uint32_t ledctl_blink = 0;
-
-DEBUGFUNC("e1000_id_led_blink_on");
-
-if (hw->mac_type < e1000_82571) {
-/* Nothing to do */
-return E1000_SUCCESS;
-}
-if (hw->media_type == e1000_media_type_fiber) {
-/* always blink LED0 for PCI-E fiber */
-ledctl_blink = E1000_LEDCTL_LED0_BLINK |
- (E1000_LEDCTL_MODE_LED_ON << E1000_LEDCTL_LED0_MODE_SHIFT);
-} else {
-/* set the blink bit for each LED that's on (0x0E) in ledctl_mode2 */
-ledctl_blink = hw->ledctl_mode2;
-for 

Re: [PATCH 2.6.19-rc3 1/2] ehea: kzalloc GFP_ATOMIC fix

2006-10-26 Thread Andrew Morton
On Wed, 25 Oct 2006 13:11:42 +0200
Jan-Bernd Themann [EMAIL PROTECTED] wrote:

 This patch fixes kzalloc parameters (GFP_ATOMIC instead of GFP_KERNEL)

why?


Re: 2.6.18 forcedeth GSO panic on send

2006-10-26 Thread Herbert Xu
On Thu, Oct 26, 2006 at 11:17:57PM +0000, Denis Vlasenko wrote:
 
 I am using an AMD64 box with 32bit userspace / 64bit kernel.
 
 Kernels 2.6.18 and 2.6.18.1 semi-randomly hang when I upload stuff
 over the net - for example, svn commit, scp are affected.
 2.6.17.11 does not seem to be affected.
 
 Unfortunately even 60-line screen is not big enough
 to catch whole trace. There are at least two traces,
 and first scrolls off. I have a photo at
 http://busybox.net/~vda/gso_panic/forcedeth_gso_panic.jpg

Looks like a network stack bug rather than a driver problem.
However, I'd really like to see the first oops including the
print out from skb_over_panic.

Could you try booting with pause_on_oops=1 or perhaps use a
serial console?
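
For anyone following along who has not met skb_over_panic before: it is
what the kernel calls when an skb_put()-style append would move the tail
pointer past the end of the allocated buffer, which is why its printout
(lengths and pointers) is the useful part of the first oops. The snippet
below is only a userspace sketch of that bounds check, not kernel code;
struct toy_skb, toy_skb_init() and toy_skb_put() are hypothetical names
invented for illustration, and the real function panics rather than
returning NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the sk_buff data-area pointers. */
struct toy_skb {
	unsigned char *data;	/* start of valid data */
	unsigned char *tail;	/* end of valid data */
	unsigned char *end;	/* end of the allocated buffer */
};

/* Start with an empty buffer of `size` bytes backed by `buf`. */
static void toy_skb_init(struct toy_skb *skb, unsigned char *buf, size_t size)
{
	skb->data = buf;
	skb->tail = buf;
	skb->end = buf + size;
}

/*
 * Model of an skb_put()-style append: reserve `len` bytes at the tail.
 * Returns the old tail on success, or NULL at the point where the
 * kernel would call skb_over_panic() instead.
 */
static unsigned char *toy_skb_put(struct toy_skb *skb, size_t len)
{
	unsigned char *old_tail = skb->tail;
	size_t tailroom;

	assert(skb->tail <= skb->end);	/* invariant: tail never passes end */
	tailroom = (size_t)(skb->end - skb->tail);

	if (len > tailroom) {
		fprintf(stderr, "over-panic: put %zu, tailroom %zu\n",
			len, tailroom);
		return NULL;
	}
	skb->tail += len;
	return old_tail;
}
```

With an 8-byte buffer, a put of 6 succeeds, a further put of 3 is
rejected and leaves the tail untouched, and a put of 2 then fills the
buffer exactly.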

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt