date:20060614

Re: [RFC/PATCH 1/2] in-kernel sockets API

2006-06-14 Thread Brian F. G. Bidulock

Chase,

On Tue, 13 Jun 2006, Chase Venters wrote:
 
> > I don't think that it is fair to say that an unstable API/ABI, in and of
> > itself, provides an incentive to open an existing proprietary driver.
>
> Sure it does, depending on your perspective and what you're willing to
> consider. The lack of a stable API/ABI means that if you don't want to have
> to do work tracking the kernel, you should push to have your drivers merged.
 

More work must be done to track the kernel before drivers are merged; thus
purposeless API changes, or unnecessary use of EXPORT_SYMBOL_GPL, impede
merging.

Not all useful kernel modules will be, or should be, merged, GPL or not.

I think that a policy that intentionally makes it hard for proprietary
modules to be developed defeats the purpose of ultimate opening and merging.
It might end up causing something like iBCS, LinuxABI, SVR D3DK, or ODI to
flourish obviating the principal goal.

The interface currently under discussion is ultimately derived from the BSD
socket-protocol interface, and IMHO should be EXPORT_SYMBOL instead of
EXPORT_SYMBOL_GPL, if only because using _GPL serves no purpose here and can
be defeated with 3 or 4 obvious (and probably existing) lines of code.  I
wrote similar wrappers for STREAMS TPI to Linux NET4 interface instead of
using pointers directly quite a few years ago.  I doubt I was the first.
There is nothing really so novel here that it deserves _GPL.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 2/2] 8139cp: add ethtool eeprom support

2006-06-14 Thread Philip Craig

Implement the ethtool eeprom operations for the 8139cp driver.
Tested on x86 and big-endian ARM.

Signed-off-by: Philip Craig [EMAIL PROTECTED]

Index: linux-2.6.17-rc6/drivers/net/8139cp.c
===================================================================
--- linux-2.6.17-rc6.orig/drivers/net/8139cp.c  2006-06-14 15:59:26.0 +1000
+++ linux-2.6.17-rc6/drivers/net/8139cp.c   2006-06-14 15:59:53.0 +1000
@@ -401,6 +401,11 @@ static void cp_clean_rings (struct cp_pr
 #ifdef CONFIG_NET_POLL_CONTROLLER
 static void cp_poll_controller(struct net_device *dev);
 #endif
+static int cp_get_eeprom_len(struct net_device *dev);
+static int cp_get_eeprom(struct net_device *dev,
+struct ethtool_eeprom *eeprom, u8 *data);
+static int cp_set_eeprom(struct net_device *dev,
+struct ethtool_eeprom *eeprom, u8 *data);

 static struct pci_device_id cp_pci_tbl[] = {
{ PCI_VENDOR_ID_REALTEK, PCI_DEVICE_ID_REALTEK_8139,
@@ -1577,6 +1582,9 @@ static struct ethtool_ops cp_ethtool_ops
.get_strings= cp_get_strings,
.get_ethtool_stats  = cp_get_ethtool_stats,
.get_perm_addr  = ethtool_op_get_perm_addr,
+   .get_eeprom_len = cp_get_eeprom_len,
+   .get_eeprom = cp_get_eeprom,
+   .set_eeprom = cp_set_eeprom,
 };

 static int cp_ioctl (struct net_device *dev, struct ifreq *rq, int cmd)
@@ -1612,24 +1620,32 @@ static int cp_ioctl (struct net_device *
 #define eeprom_delay() readl(ee_addr)

 /* The EEPROM commands include the alway-set leading bit. */
+#define EE_EXTEND_CMD  (4)
 #define EE_WRITE_CMD   (5)
 #define EE_READ_CMD    (6)
 #define EE_ERASE_CMD   (7)

-static int read_eeprom (void __iomem *ioaddr, int location, int addr_len)
-{
-   int i;
-   unsigned retval = 0;
-   void __iomem *ee_addr = ioaddr + Cfg9346;
-   int read_cmd = location | (EE_READ_CMD << addr_len);
+#define EE_EWDS_ADDR   (0)
+#define EE_WRAL_ADDR   (1)
+#define EE_ERAL_ADDR   (2)
+#define EE_EWEN_ADDR   (3)
+
+#define CP_EEPROM_MAGIC PCI_DEVICE_ID_REALTEK_8139

+static void eeprom_cmd_start(void __iomem *ee_addr)
+{
writeb (EE_ENB & ~EE_CS, ee_addr);
writeb (EE_ENB, ee_addr);
eeprom_delay ();
+}
+
+static void eeprom_cmd(void __iomem *ee_addr, int cmd, int cmd_len)
+{
+   int i;

-   /* Shift the read command bits out. */
-   for (i = 3 + addr_len - 1; i >= 0; i--) {
-   int dataval = (read_cmd & (1 << i)) ? EE_DATA_WRITE : 0;
+   /* Shift the command bits out. */
+   for (i = cmd_len - 1; i >= 0; i--) {
+   int dataval = (cmd & (1 << i)) ? EE_DATA_WRITE : 0;
writeb (EE_ENB | dataval, ee_addr);
eeprom_delay ();
writeb (EE_ENB | dataval | EE_SHIFT_CLK, ee_addr);
@@ -1637,6 +1653,33 @@ static int read_eeprom (void __iomem *io
}
writeb (EE_ENB, ee_addr);
eeprom_delay ();
+}
+
+static void eeprom_cmd_end(void __iomem *ee_addr)
+{
+   writeb (~EE_CS, ee_addr);
+   eeprom_delay ();
+}
+
+static void eeprom_extend_cmd(void __iomem *ee_addr, int extend_cmd,
+ int addr_len)
+{
+   int cmd = (EE_EXTEND_CMD << addr_len) | (extend_cmd << (addr_len - 2));
+
+   eeprom_cmd_start(ee_addr);
+   eeprom_cmd(ee_addr, cmd, 3 + addr_len);
+   eeprom_cmd_end(ee_addr);
+}
+
+static u16 read_eeprom (void __iomem *ioaddr, int location, int addr_len)
+{
+   int i;
+   u16 retval = 0;
+   void __iomem *ee_addr = ioaddr + Cfg9346;
+   int read_cmd = location | (EE_READ_CMD << addr_len);
+
+   eeprom_cmd_start(ee_addr);
+   eeprom_cmd(ee_addr, read_cmd, 3 + addr_len);

for (i = 16; i > 0; i--) {
writeb (EE_ENB | EE_SHIFT_CLK, ee_addr);
@@ -1648,13 +1691,125 @@ static int read_eeprom (void __iomem *io
eeprom_delay ();
}

-   /* Terminate the EEPROM access. */
-   writeb (~EE_CS, ee_addr);
-   eeprom_delay ();
+   eeprom_cmd_end(ee_addr);

return retval;
 }

+static void write_eeprom(void __iomem *ioaddr, int location, u16 val,
+int addr_len)
+{
+   int i;
+   void __iomem *ee_addr = ioaddr + Cfg9346;
+   int write_cmd = location | (EE_WRITE_CMD << addr_len);
+
+   eeprom_extend_cmd(ee_addr, EE_EWEN_ADDR, addr_len);
+
+   eeprom_cmd_start(ee_addr);
+   eeprom_cmd(ee_addr, write_cmd, 3 + addr_len);
+   eeprom_cmd(ee_addr, val, 16);
+   eeprom_cmd_end(ee_addr);
+
+   eeprom_cmd_start(ee_addr);
+   for (i = 0; i < 2; i++)
+   if (readb(ee_addr) & EE_DATA_READ)
+   break;
+   eeprom_cmd_end(ee_addr);
+
+   eeprom_extend_cmd(ee_addr, EE_EWDS_ADDR, addr_len);
+}
+
+static int cp_get_eeprom_len(struct net_device *dev)
+{
+   struct cp_private *cp = netdev_priv(dev);
+   int size;
+
+   spin_lock_irq(&cp->lock);
+   size =

[PATCH 1/2] 8139cp: fix eeprom read command length

2006-06-14 Thread Philip Craig

The read command for the 93C46/93C56 EEPROMs should be 3 bits plus
the address.  This doesn't appear to affect the operation of the
read command, but similar errors for write commands do cause failures.

Signed-off-by: Philip Craig [EMAIL PROTECTED]

Index: linux-2.6.17-rc6/drivers/net/8139cp.c
===================================================================
--- linux-2.6.17-rc6.orig/drivers/net/8139cp.c  2006-06-14 16:02:00.0 +1000
+++ linux-2.6.17-rc6/drivers/net/8139cp.c   2006-06-14 16:03:29.0 +1000
@@ -1628,7 +1628,7 @@ static int read_eeprom (void __iomem *io
eeprom_delay ();

/* Shift the read command bits out. */
-   for (i = 4 + addr_len; i >= 0; i--) {
+   for (i = 3 + addr_len - 1; i >= 0; i--) {
int dataval = (read_cmd & (1 << i)) ? EE_DATA_WRITE : 0;
writeb (EE_ENB | dataval, ee_addr);
eeprom_delay ();
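
A minimal userspace sketch of the bit count discussed in this patch, assuming (as the description states) that a 93C46/93C56 read command is 3 opcode bits followed by addr_len address bits. The helper names build_read_cmd and shift_out_bits are hypothetical; the recording array stands in for the driver's writeb()/eeprom_delay() sequence and is not part of 8139cp.c.

```c
#include <assert.h>
#include <stddef.h>

/* Opcode value from the driver; it includes the always-set start bit. */
#define EE_READ_CMD 6

/* Build a read command: 3 opcode bits followed by addr_len address bits. */
static unsigned build_read_cmd(unsigned location, int addr_len)
{
    return location | ((unsigned)EE_READ_CMD << addr_len);
}

/*
 * Record the bits that would be clocked out, most significant bit first.
 * Returns the number of bits stored in out[].
 */
static int shift_out_bits(unsigned cmd, int cmd_len,
                          unsigned char *out, size_t out_size)
{
    int n = 0;

    assert(cmd_len > 0 && (size_t)cmd_len <= out_size);
    for (int i = cmd_len - 1; i >= 0; i--)
        out[n++] = (cmd & (1u << i)) ? 1 : 0;
    return n;
}
```

For a 6-bit address and location 5, build_read_cmd() yields 0x185, and shift_out_bits(cmd, 3 + 6, ...) records nine bits: 1 1 0 0 0 0 1 0 1. The pre-patch loop started at bit 4 + addr_len and so clocked out 5 + addr_len bits, that is, two extra leading zeros.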

Re: [RFC/PATCH 1/2] in-kernel sockets API

2006-06-14 Thread Chase Venters

On Wednesday 14 June 2006 01:06, Brian F. G. Bidulock wrote:

> The interface currently under discussion is ultimately derived from the BSD
> socket-protocol interface, and IMHO should be EXPORT_SYMBOL instead of
> EXPORT_SYMBOL_GPL, if only because using _GPL serves no purpose here and
> can be defeated with 3 or 4 obvious (and probably existing) lines of code.
> I wrote similar wrappers for STREAMS TPI to Linux NET4 interface instead of
> using pointers directly quite a few years ago.  I doubt I was the first.
> There is nothing really so novel here that it deserves _GPL.

I mentioned that I don't have any particular opinion on the BSD socket API in 
this discussion. All that I'm speaking of here is a property of licensing. 

I've watched a lot of what has happened with binary drivers. You'll find in 
the LKML archives plenty of lengthy discussions about whether or not binary 
drivers are allowed under the GPL. If I were to guess, there is still 
disagreement. Although some hardware support could improve, we thankfully 
seem to have some kind of an equilibrium capable of supporting lots of users.

One point I remember coming up in the discussion was that the 
EXPORT_SYMBOL()/EXPORT_SYMBOL_GPL() split was a compromise of sorts. 
Interfaces that were needed to support users would reasonably be placed under 
EXPORT_SYMBOL(). By contrast, EXPORT_SYMBOL_GPL() would indicate 
functionality that would only seem to be used by derived works. It implies 
that any code using it should probably be GPL as well.

I don't raise this in an attempt to belittle anything people are working on. 
It's an observation about the ecosystem - Linux in the 2.6 series has seen a 
great amount of corporate contribution in terms of enhancing what the kernel 
is capable of doing. GPL, I believe, encourages this.

Thanks,
Chase

Re: 2.6.17: networking bug??

2006-06-14 Thread Daniel Drake


Mark Lord wrote:

> Further to this, the current behaviour is badly unpredictable.
>
> A machine could be working perfectly, not (noticeably) affected
> by this bug.  And then the user adds another stick of RAM to it.


This bug already existed in 2.6.16 to a certain extent: you were 
losing out on a lot of TCP performance. Go back to 2.6.7, measure TCP 
performance, and you'll probably find it was significantly better.


Also, there aren't that many broken end-points out there. 
www.everymac.com loads fine for me and does not ignore the window scale 
factor.


The problem in your case is a broken router in the middle. I had the 
same problem: certain sites would not load, but there is absolutely 
nothing wrong with the servers that run these sites:


http://marc.theaimsgroup.com/?l=linux-netdev&m=114478312100641&w=2

I contacted my ISP and informed them of the issue. They fixed it 
nationwide within a few weeks. You might try confirming that your 
problem only applies to HTTP like mine did (ISP runs some lame 
transparent webcaches), and it was a bug in the software there (NetApp).


We already had the "some routers are broken, should we do anything"
discussion back at the time of 2.6.8:


http://lwn.net/Articles/92727/

Daniel


Re: [RFC/PATCH 1/2] in-kernel sockets API

2006-06-14 Thread Brian F. G. Bidulock

Chase,

On Wed, 14 Jun 2006, Chase Venters wrote:
 
> One point I remember coming up in the discussion was that the
> EXPORT_SYMBOL()/EXPORT_SYMBOL_GPL() split was a compromise of sorts.
> Interfaces that were needed to support users would reasonably be placed under
> EXPORT_SYMBOL(). By contrast, EXPORT_SYMBOL_GPL() would indicate
> functionality that would only seem to be used by derived works. It implies
> that any code using it should probably be GPL as well.

The difficulty with EXPORT_SYMBOL_GPL(), as I see it, is that it reaches farther
than the GPL.  GPL does not impact non-derived works, which can be licensed
under any terms their authors see fit.  Whereas, EXPORT_SYMBOL_GPL() requires
a non-derived work to declare a GPL license to even use it.  If you subscribe
to the FSF view of derived work (just linking is a derivation) then I suppose
you would support the EXPORT_SYMBOL_GPL().  IANAL, but I don't believe that
TRIPS nor Berne Convention case law supports the FSF view.  Linus' statements
in the COPYING file take a different view: that simple use of a technical
interface is not necessarily (in itself) derivation.
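
As a rough illustration of the mechanism under debate, here is a toy userspace model; it is not kernel code. It assumes, per the description in this message, that a symbol marked GPL-only resolves only for modules that declare a GPL-compatible license. All names (toy_symbol, toy_resolve, the table entries) are hypothetical, and the accepted license strings are a simplified subset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One exported symbol in the toy table. */
struct toy_symbol {
    const char *name;
    int gpl_only;   /* 1 models EXPORT_SYMBOL_GPL, 0 models EXPORT_SYMBOL */
};

static const struct toy_symbol toy_symtab[] = {
    { "toy_kmalloc",      0 },
    { "toy_sock_wrapper", 1 },
};

/* Simplified license check: only a few well-known strings are accepted. */
static int toy_license_is_gpl_compatible(const char *license)
{
    return strcmp(license, "GPL") == 0 ||
           strcmp(license, "GPL v2") == 0 ||
           strcmp(license, "Dual BSD/GPL") == 0;
}

/* Returns 1 if the symbol resolves, 0 if refused, -1 if it does not exist. */
static int toy_resolve(const char *name, const char *module_license)
{
    assert(name != NULL && module_license != NULL);
    for (size_t i = 0; i < sizeof toy_symtab / sizeof toy_symtab[0]; i++) {
        if (strcmp(toy_symtab[i].name, name) != 0)
            continue;
        if (toy_symtab[i].gpl_only &&
            !toy_license_is_gpl_compatible(module_license))
            return 0;
        return 1;
    }
    return -1;
}
```

In this model the check keys purely on the license string the module declares, which is the point being argued: it gates access by declaration rather than by whether the module is in fact a derived work.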

Now, I understand the use of EXPORT_SYMBOL() vs. EXPORT_SYMBOL_GPL() to allow
authors to differ on this idea.  But, in the case in point, the function
pointers can be accessed by merely including the appropriate header files.
Changing the wrapper access to them to EXPORT_SYMBOL_GPL() strikes me as
similar to changing kmalloc() from EXPORT_SYMBOL() to EXPORT_SYMBOL_GPL().

Understand that all exported symbols, regardless of licensing or modversions
or whatever, are available in the kernel boot image and can be linked to by
any module at any time.  That is, those that would abuse the concept of
derivation will not be impeded by EXPORT_SYMBOL_GPL().  (Rip the symbol from
the kernel image, write a thin GPL'ed module that aliases the symbol and then
exports it again as EXPORT_SYMBOL() without module versioning, copy the lines
of code into the proprietary module, reversing the order of arbitrary lines,
etc.)

In any case, all it serves to do is to punish honest non-derivative works not
published under terms compatible with the GPL.

What I resist is the apparent attempt to change these symbols to _GPL as some
matter of general policy in this case contrary to the author's original
intentions as expressed in the original patch submission, and without the
author of the interface being wrappered jumping up and screaming that his code
was under strict FSF linking-is-derivation GPL (in which case we could have
had a good discussion on whether Linux NET4 is actually a derivative work of
BSD 4.4 Lite which was licensed under the old BSD license, incompatible with
the GPL ;)

As a general policy I would say make it EXPORT_SYMBOL() unless the author of
the patch (derivation) or author of the original (derived) code dictates that
it be EXPORT_SYMBOL_GPL().

Ok, I'll shut up now... ...really.

[PATCH 1/2] NET: Accurate packet scheduling for ATM/ADSL (kernel)

2006-06-14 Thread Jesper Dangaard Brouer


The Linux traffic control engine inaccurately calculates
transmission times for packets sent over ADSL links.  For
some packet sizes the error rises to over 50%.  This occurs
because ADSL uses ATM as its link layer transport, and ATM
transmits packets in fixed-size 53-byte cells.

This changes the kernel rate table lookup so that it can look up
packet transmission times over all ATM links, including ADSL,
with perfect accuracy. The accuracy depends on the rate
table that is calculated in userspace by the iproute2 command tc.
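
The cell arithmetic behind this can be sketched as follows. This is an illustration, not code from the patch: it assumes only that each 53-byte ATM cell carries 48 bytes of payload (the other 5 bytes are cell header), and the function names atm_cells and atm_wire_bytes are hypothetical. The length passed in is whatever must fit into cell payloads, including any encapsulation overhead and trailer.

```c
#define ATM_CELL_SIZE    53u  /* bytes on the wire per cell */
#define ATM_CELL_PAYLOAD 48u  /* payload bytes carried per cell */

/* Number of cells needed to carry len payload bytes, rounded up. */
static unsigned atm_cells(unsigned len)
{
    return (len + ATM_CELL_PAYLOAD - 1u) / ATM_CELL_PAYLOAD;
}

/* Bytes actually sent on the wire, counting the padded final cell. */
static unsigned atm_wire_bytes(unsigned len)
{
    return atm_cells(len) * ATM_CELL_SIZE;
}
```

A 49-byte payload needs two cells, so 106 bytes go on the wire; an estimate based on the 49 bytes alone would account for less than half of the real transmission time.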

A longer presentation of the patch, its rationale, what it
does and how to use it can be found here:
   http://www.stuart.id.au/russell/files/tc/tc-atm/

An earlier version of the patch, and a _detailed_ empirical
investigation of its effects can be found here:
   http://www.adsl-optimizer.dk/

Signed-off-by: Jesper Dangaard Brouer [EMAIL PROTECTED]
Signed-off-by: Russell Stuart [EMAIL PROTECTED]
---

diff -Nurp kernel-source-2.6.16.orig/include/linux/pkt_sched.h kernel-source-2.6.16/include/linux/pkt_sched.h
--- kernel-source-2.6.16.orig/include/linux/pkt_sched.h 2006-03-20 15:53:29.0 +1000
+++ kernel-source-2.6.16/include/linux/pkt_sched.h  2006-06-13 11:42:12.0 +1000
@@ -77,8 +77,9 @@ struct tc_ratespec
 {
unsigned char   cell_log;
unsigned char   __reserved;
-   unsigned short  feature;
-   short   addend;
+   unsigned short  feature;    /* Always 0 in pre-atm patch kernels */
+   char            cell_align; /* Always 0 in pre-atm patch kernels */
+   unsigned char   __unused;
unsigned short  mpu;
__u32   rate;
 };
diff -Nurp kernel-source-2.6.16.orig/include/net/sch_generic.h kernel-source-2.6.16/include/net/sch_generic.h
--- kernel-source-2.6.16.orig/include/net/sch_generic.h 2006-03-20 15:53:29.0 +1000
+++ kernel-source-2.6.16/include/net/sch_generic.h  2006-06-13 11:42:12.0 +1000
@@ -307,4 +307,18 @@ drop:
return NET_XMIT_DROP;
 }
 
+/* Lookup a qdisc_rate_table to determine how long it will take to send a
+   packet given its size.
+ */
+static inline u32 qdisc_l2t(struct qdisc_rate_table* rtab, int pktlen)
+{
+   int slot = pktlen + rtab->rate.cell_align;
+   if (slot < 0)
+   slot = 0;
+   slot >>= rtab->rate.cell_log;
+   if (slot > 255)
+   return rtab->data[255] + 1;
+   return rtab->data[slot];
+}
+
 #endif
diff -Nurp kernel-source-2.6.16.orig/net/sched/act_police.c kernel-source-2.6.16/net/sched/act_police.c
--- kernel-source-2.6.16.orig/net/sched/act_police.c    2006-03-20 15:53:29.0 +1000
+++ kernel-source-2.6.16/net/sched/act_police.c 2006-06-13 11:42:12.0 +1000
@@ -33,8 +33,8 @@
 #include <net/sock.h>
 #include <net/act_api.h>
 
-#define L2T(p,L)   ((p)->R_tab->data[(L)>>(p)->R_tab->rate.cell_log])
-#define L2T_P(p,L) ((p)->P_tab->data[(L)>>(p)->P_tab->rate.cell_log])
+#define L2T(p,L)   qdisc_l2t((p)->R_tab,L)
+#define L2T_P(p,L) qdisc_l2t((p)->P_tab,L)
 #define PRIV(a) ((struct tcf_police *) (a)->priv)
 
 /* use generic hash table */
diff -Nurp kernel-source-2.6.16.orig/net/sched/sch_cbq.c kernel-source-2.6.16/net/sched/sch_cbq.c
--- kernel-source-2.6.16.orig/net/sched/sch_cbq.c   2006-03-20 15:53:29.0 +1000
+++ kernel-source-2.6.16/net/sched/sch_cbq.c    2006-06-13 11:42:12.0 +1000
@@ -193,7 +193,7 @@ struct cbq_sched_data
 };
 
 
-#define L2T(cl,len)    ((cl)->R_tab->data[(len)>>(cl)->R_tab->rate.cell_log])
+#define L2T(cl,len)    qdisc_l2t((cl)->R_tab,len)
 
 
 static __inline__ unsigned cbq_hash(u32 h)
diff -Nurp kernel-source-2.6.16.orig/net/sched/sch_htb.c kernel-source-2.6.16/net/sched/sch_htb.c
--- kernel-source-2.6.16.orig/net/sched/sch_htb.c   2006-03-20 15:53:29.0 +1000
+++ kernel-source-2.6.16/net/sched/sch_htb.c    2006-06-13 11:42:12.0 +1000
@@ -206,12 +206,10 @@ struct htb_class
 static __inline__ long L2T(struct htb_class *cl,struct qdisc_rate_table *rate,
int size)
 { 
-    int slot = size >> rate->rate.cell_log;
-    if (slot > 255) {
+    long result = qdisc_l2t(rate, size);
+    if (result > rate->data[255])
cl->xstats.giants++;
-   slot = 255;
-    }
-    return rate->data[slot];
+    return result;
 }
 
 struct htb_sched
diff -Nurp kernel-source-2.6.16.orig/net/sched/sch_tbf.c kernel-source-2.6.16/net/sched/sch_tbf.c
--- kernel-source-2.6.16.orig/net/sched/sch_tbf.c   2006-03-20 15:53:29.0 +1000
+++ kernel-source-2.6.16/net/sched/sch_tbf.c    2006-06-13 11:42:12.0 +1000
@@ -132,8 +132,8 @@ struct tbf_sched_data
struct Qdisc    *qdisc; /* Inner qdisc, default - bfifo queue */
 };
 
-#define L2T(q,L)   ((q)->R_tab->data[(L)>>(q)->R_tab->rate.cell_log])
-#define L2T_P(q,L) ((q)->P_tab->data[(L)>>(q)->P_tab->rate.cell_log])
+#define L2T(q,L)   qdisc_l2t((q)->R_tab,L)
+#define L2T_P(q,L) qdisc_l2t((q)->P_tab,L)
 
 static int tbf_enqueue(struct sk_buff *skb, struct Qdisc* sch)
 {
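
The lookup that qdisc_l2t() performs in this patch can be tried out in userspace with a sketch like the one below. It assumes the operators missing from the archived patch text are the obvious ones (->, <, >>=, >), and it uses a hypothetical toy_rate_table in place of the kernel's struct qdisc_rate_table.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's qdisc_rate_table. */
struct toy_rate_table {
    unsigned char cell_log;   /* slot index = adjusted size >> cell_log */
    signed char   cell_align; /* added to the packet length before shifting */
    uint32_t      data[256];  /* transmission time per slot */
};

/* Map a packet length to a transmission time using the rate table. */
static uint32_t toy_l2t(const struct toy_rate_table *rtab, int pktlen)
{
    int slot = pktlen + rtab->cell_align;

    if (slot < 0)
        slot = 0;
    slot >>= rtab->cell_log;
    if (slot > 255)
        return rtab->data[255] + 1; /* oversized: one more than the last entry */
    return rtab->data[slot];
}
```

Returning data[255] + 1 for oversized packets lets a caller such as the HTB L2T() in the patch detect giants by comparing the result against data[255].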





[PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)

2006-06-14 Thread Jesper Dangaard Brouer


The Linux traffic control engine inaccurately calculates
transmission times for packets sent over ADSL links.  For
some packet sizes the error rises to over 50%.  This occurs
because ADSL uses ATM as its link layer transport, and ATM
transmits packets in fixed-size 53-byte cells.

This changes the userspace tool iproute2/tc by adding an
option to calculate traffic transmission times (rate table)
over all ATM links, including ADSL, with perfect accuracy.

A longer presentation of the patch, its rationale, what it
does and how to use it can be found here:
   http://www.stuart.id.au/russell/files/tc/tc-atm/

An earlier version of the patch, and a _detailed_ empirical
investigation of its effects can be found here:
   http://www.adsl-optimizer.dk/

Signed-off-by: Jesper Dangaard Brouer [EMAIL PROTECTED]
Signed-off-by: Russell Stuart [EMAIL PROTECTED]
---

diff -Nurp iproute2.orig/include/linux/pkt_sched.h iproute2/include/linux/pkt_sched.h
--- iproute2.orig/include/linux/pkt_sched.h 2005-12-10 09:27:44.0 +1000
+++ iproute2/include/linux/pkt_sched.h  2006-06-13 11:53:27.0 +1000
@@ -77,8 +77,9 @@ struct tc_ratespec
 {
unsigned char   cell_log;
unsigned char   __reserved;
-   unsigned short  feature;
-   short   addend;
+   unsigned short  feature;    /* Always 0 in pre-atm patch kernels */
+   char            cell_align; /* Always 0 in pre-atm patch kernels */
+   unsigned char   __unused;
unsigned short  mpu;
__u32   rate;
 };
diff -Nurp iproute2.orig/tc/m_police.c iproute2/tc/m_police.c
--- iproute2.orig/tc/m_police.c 2005-01-19 08:11:58.0 +1000
+++ iproute2/tc/m_police.c  2006-06-13 11:53:27.0 +1000
@@ -35,7 +35,7 @@ struct action_util police_action_util = 
 static void explain(void)
 {
fprintf(stderr, "Usage: ... police rate BPS burst BYTES[/BYTES] [ mtu BYTES[/BYTES] ]\n");
-   fprintf(stderr, "[ peakrate BPS ] [ avrate BPS ]\n");
+   fprintf(stderr, "[ peakrate BPS ] [ avrate BPS ] [ overhead OVERHEAD ] [ atm ]\n");
fprintf(stderr, "[ ACTIONTERM ]\n");
fprintf(stderr, "Old Syntax ACTIONTERM := action EXCEEDACT[/NOTEXCEEDACT] \n");
fprintf(stderr, "New Syntax ACTIONTERM := conform-exceed EXCEEDACT[/NOTEXCEEDACT] \n");
@@ -134,7 +134,10 @@ int act_parse_police(struct action_util 
__u32 ptab[256];
__u32 avrate = 0;
int presult = 0;
-   unsigned buffer=0, mtu=0, mpu=0;
+   unsigned buffer=0, mtu=0;
+   __u8 mpu=0;
+   __s8 overhead=0;
+   int atm=0;
int Rcell_log=-1, Pcell_log = -1; 
struct rtattr *tail;
 
@@ -184,7 +187,7 @@ int act_parse_police(struct action_util 
fprintf(stderr, "Double \"mpu\" spec\n");
return -1;
}
-   if (get_size(&mpu, *argv)) {
+   if (get_u8(&mpu, *argv, 10)) {
explain1("mpu");
return -1;
}
@@ -198,6 +201,18 @@ int act_parse_police(struct action_util 
explain1("rate");
return -1;
}
+   } else if (strcmp(*argv, "overhead") == 0) {
+   NEXT_ARG();
+   if (p.rate.rate) {
+   fprintf(stderr, "Double \"overhead\" spec\n");
+   return -1;
+   }
+   if (get_s8(&overhead, *argv, 10)) {
+   explain1("overhead");
+   return -1;
+   }
+   } else if (strcmp(*argv, "atm") == 0) {
+   atm = 1;
} else if (strcmp(*argv, "avrate") == 0) {
NEXT_ARG();
if (avrate) {
@@ -264,22 +279,12 @@ int act_parse_police(struct action_util 
}
 
if (p.rate.rate) {
-   if ((Rcell_log = tc_calc_rtable(p.rate.rate, rtab, Rcell_log, mtu, mpu)) < 0) {
-   fprintf(stderr, "TBF: failed to calculate rate table.\n");
-   return -1;
-   }
+   tc_calc_ratespec(&p.rate, rtab, p.rate.rate, Rcell_log, mtu, mpu, atm, overhead);
p.burst = tc_calc_xmittime(p.rate.rate, buffer);
-   p.rate.cell_log = Rcell_log;
-   p.rate.mpu = mpu;
}
p.mtu = mtu;
if (p.peakrate.rate) {
-   if ((Pcell_log = tc_calc_rtable(p.peakrate.rate, ptab, Pcell_log, mtu, mpu)) < 0) {
-   fprintf(stderr, "POLICE: failed to calculate peak rate table.\n");
-   return -1;
-   }
-   p.peakrate.cell_log = Pcell_log;
-   p.peakrate.mpu = mpu;
+

[PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL

2006-06-14 Thread Jesper Dangaard Brouer


The Linux traffic control engine inaccurately calculates
transmission times for packets sent over ADSL links.  For
some packet sizes the error rises to over 50%.  This occurs
because ADSL uses ATM as its link layer transport, and ATM
transmits packets in fixed-size 53-byte cells.

The following patches to iproute2 and the kernel add an
option to calculate traffic transmission times over all
ATM links, including ADSL, with perfect accuracy.

A longer presentation of the patch, its rationale, what it
does and how to use it can be found here:
   http://www.stuart.id.au/russell/files/tc/tc-atm/

An earlier version of the patch, and a _detailed_ empirical
investigation of its effects can be found here:
   http://www.adsl-optimizer.dk/

The patches are both backwards and forwards compatible.
This means unpatched kernels will work with a patched
version of iproute2, and an unpatched iproute2 will work
on patched kernels.


This is a combined effort of Jesper Brouer and Russell Stuart,
to get these patches into the upstream kernel.

Let the discussion start about what we need to change to get this
upstream?

We see this as a feature enhancement, and thus hope that it can be
queued in davem's net-2.6.18.git tree.

---
Regards,
 Jesper Brouer & Russell Stuart.




Re: [RFC/PATCH 1/2] in-kernel sockets API

2006-06-14 Thread Alan Cox

On Wed, 2006-06-14 at 00:07 -0600, Brian F. G. Bidulock wrote:
> I think that a policy that intentionally makes it hard for proprietary
> modules to be developed defeats the purpose of ultimate opening and merging.

It isn't policy, it's called copyright law.

> The interface currently under discussion is ultimately derived from the BSD
> socket-protocol interface, and IMHO should be EXPORT_SYMBOL instead of
> EXPORT_SYMBOL_GPL, if only because using _GPL serves no purpose here and can
> be defeated with 3 or 4 obvious (and probably existing) lines of code

You don't seem to understand copyright law either. The GPL, like all
copyright licenses, deals with the right to make copies and to create and
control derivative works. It's not defeated by four lines of code.

> I wrote similar wrappers for STREAMS TPI to Linux NET4 interface instead of
> using pointers directly quite a few years ago.  I doubt I was the first.

Is that a confession ;)

> There is nothing really so novel here that it deserves _GPL.

Copyright is not about novelty, you have it confused with the
theoretical (not actual) role of patents. Wrong kind of intellectual
monopoly right.

Alan


Re: [RFC/PATCH 1/2] in-kernel sockets API

2006-06-14 Thread Theodore Tso

On Tue, Jun 13, 2006 at 07:53:19PM -0500, Chase Venters wrote:
> > It is the lack of an ABI that is most frustrating to these users.
>
> And the presence of an ABI would be _very_ frustrating to core
> developers. Not only would these people suffer, everyone would --
> developer time would be wasted dealing with cruft, and forward
> progress would be slowed.

Note that just because an interface is EXPORT_SYMBOL doesn't mean that
the interface is guaranteed to be stable.  So folks who are arguing
that an interface shouldn't be usable by non-GPL applications because
we are therefore guaranteeing a stable API are making an unwarranted
assumption.

- Ted

Re: [PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)

2006-06-14 Thread Alan Cox

On Wed, 2006-06-14 at 11:40 +0200, Jesper Dangaard Brouer wrote:
> option to calculate traffic transmission times (rate table)
> over all ATM links, including ADSL, with perfect accuracy.

<Pedant>
Only if the lowest level is encoded in a time linear manner. If you are
using NRZ, NRZI etc at the bottom level then you may still be out...

</Pedant>

The other problem I see with this code is it is very tightly tied to ATM
cell sizes, not to solving the generic question of packetisation. I'm
not sure if that matters but for modern processors I'm also sceptical
that the clever computation is actually any faster than just doing the
maths, especially if something cache intensive is also running.

Alan


Re: [RFC/PATCH 1/2] in-kernel sockets API

2006-06-14 Thread Brian F. G. Bidulock

Alan,

On Wed, 14 Jun 2006, Alan Cox wrote:

> It isn't policy, it's called copyright law.

I know that I said I'd shut up, but I missed in TRIPS where it said
that symbols must be EXPORT_SYMBOL_GPL...  Could you point that out?
(Just kidding.)

> You don't seem to understand copyright law either. The GPL, like all
> copyright licenses, deals with the right to make copies and to create and
> control derivative works. It's not defeated by four lines of code.

The 3 or 4 lines of code that I wrote as an original expression before
the patch was submitted.

> Is that a confession ;)

No, just a declaration: the code in question was released under GPL
Version 2.

> Copyright is not about novelty, you have it confused with the
> theoretical (not actual) role of patents. Wrong kind of intellectual
> monopoly right.

Yes, perhaps I should have said "original" instead of "novel".  The patch
is not original as it was predated by equivalent (machine translatable)
original expressions.

Re: Remove Prism II support from Orinoco

2006-06-14 Thread Jiri Benc

On Tue, 13 Jun 2006 07:30:56 +0300, Faidon Liambotis wrote:
> Unfortunately, that workaround doesn't work so well when you want to
> have the ability to plug real orinoco (hermes) cards to your computer...
> In other words and unless I'm missing something, there isn't currently a
> way to have a Hermes card and a Prism II card both plugged in and working.

Do you know about /sys/bus/pci/drivers/*/bind and unbind?

http://lwn.net/Articles/143397/
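
A hedged sketch of driving those sysfs files from C. The bind and unbind attributes take a PCI device address written as text; the helper names, the driver name orinoco_pci used as an example, and the address 0000:02:00.0 in the note below are illustrative assumptions, not taken from this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "/sys/bus/pci/drivers/<driver>/<attr>"; 0 on success, -1 if buf is too small. */
static int pci_driver_attr_path(char *buf, size_t len,
                                const char *driver, const char *attr)
{
    int n = snprintf(buf, len, "/sys/bus/pci/drivers/%s/%s", driver, attr);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write value to the file at path; returns 0 on success, -1 on any error. */
static int sysfs_write(const char *path, const char *value)
{
    FILE *f;
    size_t len;
    int ok;

    assert(path != NULL && value != NULL);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    len = strlen(value);
    ok = fwrite(value, 1, len, f) == len;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

To move a card from one driver to another, one would write its address (for example 0000:02:00.0) to the first driver's unbind file and then to the second driver's bind file; both writes require root.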

-- 
Jiri Benc
SUSE Labs

Re: Refactor Netlink connector?

2006-06-14 Thread jamal


So what's the resolution on this? I actually have some cycles this coming
weekend that I was hoping to spend updating the doc instead.

cheers,
jamal

On Thu, 2006-01-06 at 10:24 -0400, James Morris wrote:
> On Thu, 1 Jun 2006, Thomas Graf wrote:
>
> > It shouldn't be hard to split what is implemented in nlmsg_route_perms[]
> > for NETLINK_ROUTE into the definitions of the generic netlink
> > operations, could look like this:
> >
> > struct genl_ops some_op = {
> >         [...]
> >         .perm = NETLINK_GENERIC_SOCKET__NLMSG_READ,
> > };
>
> We wouldn't need the socket class outside of SELinux, just the perm, so
> something like:
>
> NL_PERM_READ
>
> > int genl_peek_cmd(struct nlmsghdr *nlh)
> > {
> >         struct genlmsghdr *hdr = nlmsg_data(nlh);
> >
> >         if (nlh->nlmsg_len < nlmsg_msg_size(GENL_HDRLEN))
> >                 return -EINVAL;
> >
> >         return hdr->cmd;
> > }
>
> Unless I'm mistaken, people are already multiplexing commands inside genl
> commands (and if so, why even bother with registerable ops?).
>
>
> I'll look at it in more detail soon.
 
 


Re: [PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL

2006-06-14 Thread jamal


I have taken linux-kernel off the list.

Russell's site is inaccessible to me (I actually think this is related
to some DNS issues I may be having) and your master's thesis is too long to
glean in 2 minutes; so here's a question or two for you:

- Have you tried to do a long-lived session such as a large FTP and
seen how far off the deviation was? That would provide an interesting
data point.
- To be a devil's advocate (and not claim there is no issue), where do
you draw the line with overhead? 
For example, the smallest Ethernet packet is 64 bytes, of which 14 bytes are
Ethernet headers (overhead for IP) - and this is not counting CRC etc.
If you were to set an MTU of say 64 bytes and tried to do an HTTP or FTP
transfer, how accurate do you think the calculation would be? I would think
not very different.
Does it matter if it is accurate in the majority of cases?
- For further reflection: Have you considered the case where the rate
table has already been computed for some link speed in user space and
then somewhere post-config the physical link speed changes? This would
happen in the case where Ethernet autonegotiation (AN) is involved and the
partner makes some changes (use ethtool).

I would say the last bullet is a more interesting problem than a corner
case of some link layer technology that has high overhead.
Your work would be more interesting if it was generic for many link
layers instead of just ATM.


cheers,
jamal

On Wed, 2006-14-06 at 11:40 +0200, Jesper Dangaard Brouer wrote:
 The Linux traffic's control engine inaccurately calculates
 transmission times for packets sent over ADSL links.  For
 some packet sizes the error rises to over 50%.  This occurs
 because ADSL uses ATM as its link layer transport, and ATM
 transmits packets in fixed sized 53 byte cells.
 
 The following patches to iproute2 and the kernel add an
 option to calculate traffic transmission times over all
 ATM links, including ADSL, with perfect accuracy.
 
 A longer presentation of the patch, its rationale, what it
 does and how to use it can be found here:
http://www.stuart.id.au/russell/files/tc/tc-atm/
 
 An earlier version of the patch, and a _detailed_ empirical
 investigation of its effects can be found here:
http://www.adsl-optimizer.dk/
 
 The patches are both backwards and forwards compatible.
 This means unpatched kernels will work with a patched
 version of iproute2, and an unpatched iproute2 will work
 on patched kernels.
 
 
 This is a combined effort of Jesper Brouer and Russell Stuart,
 to get these patches into the upstream kernel.
 
 Let the discussion start about what we need to change to get this
 upstream?
 
 We see this as a feature enhancement, as thus hope that it can be
 queued in davem's net-2.6.18.git tree.
 
 ---
 Regards,
  Jesper Brouer  Russell Stuart.
 


Re: [PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL

2006-06-14 Thread Jesper Dangaard Brouer


On Wed, 2006-06-14 at 08:06 -0400, jamal wrote:

 Russell's site is inaccessible to me (I actually think this is related
 to some DNS issues i may be having) 

Strange, I have access to Russell's site.  Maybe its his redirect
feature that confuses your browser, try:
 http://ace-host.stuart.id.au/russell/files/tc/tc-atm/

 and your masters is too long to
 spend 2 minutes and glean it; so heres a question or two for you:

Yes, it is quite long and very detailed.  But it is worth reading (... says
the author himself ;-))


 - Have you tried to do a long-lived session such as a large FTP and 
 seen how far off the deviation was? That would provide some interesting
 data point.

The deviation can be calculated.  The impact is of course small for large
packets.  But the argument that bulk TCP transfers are not as badly
affected is wrong, because all the TCP ACK packets get the maximum penalty.

On an ADSL link with more than 8 bytes of overhead, a 40 byte TCP ACK will
use more than one ATM cell, causing 2 ATM cells to be sent that
consume 106 bytes, i.e. 62% overhead.  On a small upstream ADSL line
that hurts! (See thesis page 53, table 5.3 Overhead summary).
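
The arithmetic behind the 106 byte / 62% figure can be sketched as follows. This is only an illustration, not code from the patch: the helper names atm_wire_bytes() and atm_overhead_percent() are made up here, and "overhead" is assumed to mean all per-packet link bytes (encapsulation plus AAL5 trailer) that have to fit into the 48 byte cell payloads.

```c
#include <assert.h>

#define ATM_CELL_SIZE    53  /* bytes on the wire per cell */
#define ATM_CELL_PAYLOAD 48  /* payload bytes carried per cell */

/* Bytes put on the wire for one packet plus its per-packet link
 * overhead (hypothetical helper, not taken from the patch). */
static unsigned int atm_wire_bytes(unsigned int pkt_len, unsigned int overhead)
{
	unsigned int payload = pkt_len + overhead;
	unsigned int cells;

	assert(payload > 0);
	/* Round up to a whole number of cells. */
	cells = (payload + ATM_CELL_PAYLOAD - 1) / ATM_CELL_PAYLOAD;
	return cells * ATM_CELL_SIZE;
}

/* Share of the wire bytes that is not the IP packet itself,
 * in whole percent, rounded down. */
static unsigned int atm_overhead_percent(unsigned int pkt_len, unsigned int overhead)
{
	unsigned int wire = atm_wire_bytes(pkt_len, overhead);

	return (wire - pkt_len) * 100 / wire;
}
```

With exactly 8 bytes of overhead a 40 byte ACK still fits one cell (53 bytes); with anything above 8 it spills into a second cell, which is where the 106 bytes come from.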


 - To be a devil's advocate (and not claim there is no issue), where do
 you draw the line with overhead? 
 Example the smallest ethernet packet is 64 bytes of which 14 bytes are
 ethernet headers (overhead for IP) - and this is not counting CRC etc.
 If you were to set an MTU of say 64 bytes and tried to do a http or ftp,
 how accurate do you think the calculation would be? I would think not
 very different.

I do think we handle this situation, but I'm not quite sure that I fully
understand the question (sorry).


 Does it matter if it is accurate on the majority of the cases?
 - For further reflection: Have you considered the case where the rate
 table has already been considered on some link speed in user space and
 then somewhere post-config the physical link speed changes? This would
 happen in the case where ethernet AN is involved and the partner makes
 some changes (use ethtool). 

 I would say the last bullet is a more interesting problem than a corner
 case of some link layer technology that has high overhead.

We only claim to do magic on ATM/ADSL links... nothing else ;-)


 Your work would be more interesting if it was generic for many link
 layers instead of just ATM.

Well, we did consider doing so, but we thought that it would be harder to
get it into the kernel.

Actually that's the reason for the defines:
 #define ATM_CELL_SIZE       53
 #define ATM_CELL_PAYLOAD    48

Changing these should make it possible to adapt to any other SAR
(Segmentation And Reassembly) link layer.
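
To illustrate what "adapting to another SAR link layer" could look like if the two constants were parameters instead of defines, here is a rough sketch. The struct sar_link type and sar_wire_bytes() are invented for this example and are not part of the patch; the second link type in the usage note is purely fictional.

```c
#include <assert.h>

/* Hypothetical description of a segmenting link layer. */
struct sar_link {
	unsigned int cell_size;     /* bytes on the wire per cell */
	unsigned int cell_payload;  /* payload bytes carried per cell */
};

/* Wire bytes needed to carry len bytes of link-level payload. */
static unsigned int sar_wire_bytes(const struct sar_link *link, unsigned int len)
{
	unsigned int cells;

	assert(link && link->cell_payload > 0);
	assert(link->cell_size >= link->cell_payload);
	/* Every started cell is transmitted in full. */
	cells = (len + link->cell_payload - 1) / link->cell_payload;
	return cells * link->cell_size;
}
```

For ATM one would pass { 53, 48 }; a made-up link with 64 byte cells and 60 byte payloads would pass { 64, 60 } and reuse the same function unchanged.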


 On Wed, 2006-14-06 at 11:40 +0200, Jesper Dangaard Brouer wrote:
  The Linux traffic's control engine inaccurately calculates
  transmission times for packets sent over ADSL links.  For
  some packet sizes the error rises to over 50%.  This occurs
  because ADSL uses ATM as its link layer transport, and ATM
  transmits packets in fixed sized 53 byte cells.
  
  The following patches to iproute2 and the kernel add an
  option to calculate traffic transmission times over all
  ATM links, including ADSL, with perfect accuracy.
  
  A longer presentation of the patch, its rationale, what it
  does and how to use it can be found here:
 http://www.stuart.id.au/russell/files/tc/tc-atm/
  
  An earlier version of the patch, and a _detailed_ empirical
  investigation of its effects can be found here:
 http://www.adsl-optimizer.dk/
  
  The patches are both backwards and forwards compatible.
  This means unpatched kernels will work with a patched
  version of iproute2, and an unpatched iproute2 will work
  on patched kernels.
  
  
  This is a combined effort of Jesper Brouer and Russell Stuart,
  to get these patches into the upstream kernel.
  
  Let the discussion start about what we need to change to get this
  upstream?
  
  We see this as a feature enhancement, as thus hope that it can be
  queued in davem's net-2.6.18.git tree.
  
  ---
  Regards,
   Jesper Brouer  Russell Stuart.
  
 

Thanks for your comments :-)

-- 
Med venlig hilsen / Best regards
  Jesper Brouer
  ComX Networks A/S
  Linux Network developer
  Cand. Scient Datalog / MSc.
  Author of http://adsl-optimizer.dk




Re: netif_tx_disable and lockless TX

2006-06-14 Thread jamal

On Wed, 2006-31-05 at 19:52 +0200, Robert Olsson wrote:
 jamal writes:
 
   Latency-wise: TX completion interrupt provides the best latency.
   Processing in the poll() -aka softirq- was almost close to the hardirq
   variant. So if you can make things run in a softirq such as transmit
   one, then the numbers will likely stay the same.
  
  I don't remember that we tried a tasklet for TX a la Herbert's suggestion, but we 
  used tasklets for controlling RX processing to avoid hardirq livelock
  in pre-NAPI versions.
 

Hrm - it may have been a private thing i did then. I could swear we did
that experiment together ...
Perhaps Herbert's motivation was not really to optimize but rather to
get something unstuck in the transmit path state machine maybe in a
context of netconsole? The conditions for which that tasklet would even
run require a CPU collision to the transmit. Sorry, I didn't quite follow
the motivation/discussion that ended in that patch.

  Had variants of tulip driver with both TX cleaning at ->poll and TX
  cleaning at hardirq and didn't see any performance difference. The 
  ->poll was much cleaner but we kept Alexey's original work for tulip.
 

It certainly is cleaner - but I do recall the hardirq variant had better
latency, most observable under high packet rates aka small packets. 

   Sorry, I havent been following discussions on netchannels[1] so i am not
   qualified to comment on the replacement part Dave mentioned earlier.
   What I can say is the tx processing doesnt have to be part of the NAPI
   poll() and still use hardirq.
 
  Yes true but I see TX numbers with newer boards (wire rate small packets)
  with cleaning in ->poll. Also now linux is very safe in network overload 
  situations. Moving work to hardirq may change that.
 

Oh, I am not suggesting a change - i am a lot more conservative than
that ;- these areas are delicate (not code-delicate Acme ;-) but
rather what seems obvious requires a lot of experimental results first.

Robert, are your transmit results Intel or AMD based?

cheers,
jamal




Problems with xfrm (IPSec) and multicast

2006-06-14 Thread Roar Bjørgum Rotvik


Hi,

I have configured two Linux PC's to use IPSec to encrypt some mcast traffic, using ip 
xfrm. Each PC has two network cards, one connected to a LAN (unencrypted side, also 
called red side) and one connected to the other node (encrypted side, also called black side).

Currently the setup uses static keys in the SA entries, so IKE is not a problem.

|-- RED side --|------ BLACK network ------|-- RED side --|
+-------+   +---------+   +---------+   +-------+
| LAN A +---+ IPSEC-A +---+ IPSEC-B +---+ LAN B |
+-------+   +---------+   +---------+   +-------+
10.0.10.0/24      192.168.0.0/24      10.0.20.0/24

Configuration:
Kernel tested: Linux-2.6.16.13 + 2.6.17-rc4
LAN A: 10.0.10.0/24
LAN B: 10.0.20.0/24
IPSEC-A: RED IP: 10.0.10.1, BLACK IP: 192.168.0.1
IPSEC-B: RED IP: 10.0.20.1, BLACK IP: 192.168.0.2
RED mcast group used: 239.192.20.1
BLACK mcast group used: 239.192.10.1

IPSEC-A SA and SP entries:
[EMAIL PROTECTED] ~]# ip xfrm state
src 192.168.0.1 dst 239.192.10.1
proto esp spi 0x0001 reqid 0 mode tunnel
replay-window 4
auth sha1 0x01020301
enc aes 0x0001
encap type espinudp sport 4500 dport 4500 addr 0.0.0.0

[EMAIL PROTECTED] ~]# ip xfrm policy
src 0.0.0.0/0 dst 239.192.20.1/32
dir in priority 2147483648
tmpl src 192.168.0.1 dst 239.192.10.1
proto esp reqid 0 mode tunnel
src 10.0.10.0/24 dst 239.192.20.1/32
dir out priority 2147483648
tmpl src 192.168.0.1 dst 239.192.10.1
proto esp reqid 0 mode tunnel
src 0.0.0.0/0 dst 239.192.20.1/32
dir fwd priority 2147483648
tmpl src 192.168.0.1 dst 239.192.10.1
proto esp reqid 0 mode tunnel

(The entries for IPSEC-B are similar, but the address 192.168.0.1 is changed to the IPSEC-B 
BLACK IP 192.168.0.2).
And I have a small userspace app that opens a socket, binds to port 4500 and issues 
setsockopt (fd, SOL_UDP, UDP_ENCAP) so that the kernel will accept UDP encap ESP packets.


When I send multicast traffic from IPSEC-A (bound to the RED interface 10.0.10.1) to mcast 
group 239.192.20.1, the traffic matches the out SP entry and is encrypted according to the 
SA entry and sent as UDP encap ESP to mcast group 239.192.10.1 on the BLACK network.


On IPSEC-B the UDP encap ESP packet is decrypted and is visible for userspace processes. 
So far so good.


But then I start sending similar mcast traffic the other way, but from IPSEC-B (bound to 
IPSEC-B RED IP 10.0.20.1). This traffic is also encrypted and sent to IPSEC-A.


But this packet is not decrypted at IPSEC-A, it seems to disappear. The IP and UDP SNMP 
counters increase for the received UDP encap ESP packet, but I cannot see what happens to 
the packet after the UDP layer. Seems like it is dropped somewhere in XFRM?


By sending some more packets from IPSEC-B (roughly 5-8 more packets), these packets 
suddenly start to be decrypted at IPSEC-A and all is well. Until I start traffic the 
other way around again, when the same problem occurs at IPSEC-B.


So I cannot make encrypted multicast traffic flow both ways at the same time, and have 
no clue as to why the first packets after changing direction are dropped somewhere.


Anyone have a clue to this observed problem with linux xfrm and multicast or a better 
solution for encrypted multicast on linux 2.6.x?


Any help with this is appreciated and more info (tcpdump, snmp stats and so on) can be 
obtained if needed.


--
Roar Bjørgum Rotvik

Re: driver for pptp

2006-06-14 Thread Harald Welte

On Sat, Jun 03, 2006 at 11:06:19AM +0400, [EMAIL PROTECTED] wrote:
 I have developed the driver for Point-to-Point Tunneling Protocol (PPTP).

great news.  something that I always thought of as a nice-to-have. 

 I have published the project on http://accel-pptp.sourceforge.net/

Please don't expect Linux Kernel networking developers to actually go to
sourceforge, download and extract code that you want to have
reviewed/submitted.

Please read Documentation/SubmittingPatches (and CodingStyle) and submit
your kernel patch to netdev.

 Hope this driver will go to a kernel tree and will make linux more productive.

not without you pushing it actively and getting through review cycles
(which I hope you will!).

Some initial comments:

1) why wasn't it possible to use the PPPoX infrastructure of the kernel
   which is already being used by PPPoE ?  Or at least model it somehow
   similar to the existing PPPoE/PPPoX infrastructure?

2) why are you using a timer for asynchronous processing of GRE frames?
   First of all, why does it have to happen asynchronously at all?
   Secondly, why use a timer when there's nothing time related (or am
   I missing something)?  If deferred, out-of-context execution is
   required, there are other primitives such as tasklets.

3) you conflict with the ip_gre.c generic GRE encapsulation driver.  this
   is because both want to register a proto handler for GRE.  Ideally,
   there needs to be another demultiplex between the GRE protocol and its
   users.  The code registered for GRE would look at the packet and
   determine whether e.g. it is a PPTP GRE packet and then pass it on to
   the pptp module.

4) your code doesn't look nonlinear skb clean

5) why did you choose to implement  /dev/pptp rather than a socket family
   like the existing pppox/pppoe code?

6) lots of codingstyle issues
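
For point 3, the kind of demultiplex meant could look roughly like the sketch below. It is a user-space illustration, not kernel code: gre_classify() and the enum are invented names. What it relies on is that PPTP (RFC 2637) uses GRE version 1 with protocol type 0x880B, while ordinary GRE tunnels use version 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GRE_VERSION_MASK 0x07    /* low 3 bits of the second header byte */
#define GRE_PROTO_PPP    0x880B  /* protocol type carried by PPTP GRE */

/* Hypothetical result of the demux step. */
enum gre_user { GRE_USER_INVALID, GRE_USER_GENERIC, GRE_USER_PPTP };

/* Look at the first 4 bytes of a GRE header and decide who gets it. */
static enum gre_user gre_classify(const uint8_t *hdr, size_t len)
{
	unsigned int version, proto;

	assert(hdr != NULL);
	if (len < 4)
		return GRE_USER_INVALID;   /* too short for flags + protocol */

	version = hdr[1] & GRE_VERSION_MASK;
	proto = ((unsigned int)hdr[2] << 8) | hdr[3];  /* network byte order */

	if (version == 1 && proto == GRE_PROTO_PPP)
		return GRE_USER_PPTP;      /* hand to the pptp module */
	if (version == 0)
		return GRE_USER_GENERIC;   /* hand to ip_gre */
	return GRE_USER_INVALID;
}
```

A single registered GRE handler would run a check like this first and then call into whichever module claimed that class of packet.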

-- 
- Harald Welte [EMAIL PROTECTED]  http://gnumonks.org/

We all know Linux is great...it does infinite loops in 5 seconds. -- Linus



Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late

2006-06-14 Thread Kyle McMartin

On Tue, Jun 13, 2006 at 10:44:12PM -0600, Grant Grundler wrote:
 On Tue, Jun 13, 2006 at 08:33:22PM -0400, Jeff Garzik wrote:
  Grant Grundler wrote:
  o tulip_stop_rxtx() has to be called _after_ free_irq().
ie. v2 patch didn't fix the original race condition
and when under test, dies about as fast as the original code.
  
  You made the race window smaller, but it's still there.  The chip's DMA 
  engines should be stopped before you unregister the interrupt handler.
 
 Switching the order to be:
 tulip_stop_rxtx(tp);        /* Stop DMA */
 free_irq (dev->irq, dev);   /* no more races after this */
 

I think the correct sequence would be:

reset tulip interrupt mask
flush posted write

synchronize irq /* make sure we got 'em all */
tulip_stop_rxtx /* turn off dma */
free irq        /* bye bye */

The synchronize irq guarantees we shouldn't see another irq
generated by the card because it was held up somewhere.

Cheers,
Kyle M.

Offering presents for more routing tables!

2006-06-14 Thread Ben Greear


I asked for this feature several years ago and evidently
it is not trivial to increase the number of routing tables.

But, perhaps someone now has time & inclination?

I would like to have more (a few thousand) routing tables
available in the kernel so that I can use a routing table for
each of my many VLANs.  Currently, the netlink protocol only
specifies an 8-bit id for the routing table:
http://www.faqs.org/rfcs/rfc3549.html  Section 3.1.1
so a new netlink message would need to be created and the 'ip' tool
updated.  I think at least a 16-bit identifier should be used,
possibly a full 32 bits so we don't have to revisit this again for
a while!

The kernel itself would also need to be modified so that it can
have more routing tables.  I realize most people don't need a
large number of tables, so the maximum number should be configured
at either compile time or run time.

If I remember right, there are certain tables (253 - 255) that
are currently special in the kernel.  For complete backwards compatibility,
this hole would probably have to remain as it is, with the new
tables starting at 256.
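
One way the backwards-compatible numbering could work is sketched below. Everything here is hypothetical (struct table_ref and the helper names are invented, and no such netlink message exists in this proposal yet); the only facts assumed are the current special table values 0 (unspecified), 253 (default), 254 (main) and 255 (local).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Table ids with special meaning in current kernels. */
#define TBL_UNSPEC   0
#define TBL_DEFAULT  253
#define TBL_MAIN     254
#define TBL_LOCAL    255

/* Hypothetical wire form: old 8-bit field plus an optional 32-bit id. */
struct table_ref {
	uint8_t  legacy_id;     /* what an old tool would read */
	int      has_extended;  /* nonzero if extended_id is valid */
	uint32_t extended_id;   /* full id for tables >= 256 */
};

static int table_id_is_special(uint32_t id)
{
	return id == TBL_UNSPEC || (id >= TBL_DEFAULT && id <= TBL_LOCAL);
}

/* Ids below 256 keep using the old field, larger ones use the new one. */
static struct table_ref table_ref_encode(uint32_t id)
{
	struct table_ref r = { 0, 0, 0 };

	if (id <= 255) {
		r.legacy_id = (uint8_t)id;
	} else {
		r.legacy_id = TBL_UNSPEC;
		r.has_extended = 1;
		r.extended_id = id;
	}
	return r;
}

static uint32_t table_ref_decode(const struct table_ref *r)
{
	assert(r != NULL);
	return r->has_extended ? r->extended_id : r->legacy_id;
}
```

Old tools that only know the 8-bit field would keep seeing tables 1 to 255 exactly as before, and would see "unspecified" for the new ones.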

I would be willing to help test any resulting patches, and can
also offer bribes of money, hardware, beer, etc.

Thanks,
Ben

--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc  http://www.candelatech.com


Re: [PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)

2006-06-14 Thread Jesper Dangaard Brouer

On Wed, 2006-06-14 at 11:57 +0100, Alan Cox wrote:
 Ar Mer, 2006-06-14 am 11:40 +0200, ysgrifennodd Jesper Dangaard Brouer:
  option to calculate traffic transmission times (rate table)
  over all ATM links, including ADSL, with perfect accuracy.

 The other problem I see with this code is it is very tightly tied to ATM
 cell sizes, not to solving the generic question of packetisation. 

Well, we did consider doing so, but we thought that it would be harder to
get it into the kernel.

Actually that's the reason for the defines:
 #define ATM_CELL_SIZE       53
 #define ATM_CELL_PAYLOAD    48

Changing these should make it possible to adapt to any other SAR
(Segmentation And Reassembly) link layer.

 I'm
 not sure if that matters but for modern processors I'm also sceptical
 that the clever computation is actually any faster than just doing the
 maths, especially if something cache intensive is also running.

I guess you are referring to the rate table lookup system, which is based
upon array lookups.  I do think that the rate table array lookup system
is outdated, as memory access is the bottleneck on modern CPUs.
But it was designed by Alexey a long time ago, when the hardware
restrictions were different.  It also avoids floating point operations in
the kernel.

Thanks for your comments.

-- 
Med venlig hilsen / Best regards
  Jesper Brouer
  ComX Networks A/S
  Linux Network developer
  Cand. Scient Datalog / MSc.
  Author of http://adsl-optimizer.dk




Re: [RFC/PATCH 1/2] in-kernel sockets API

2006-06-14 Thread Harald Welte

On Tue, Jun 13, 2006 at 02:12:41PM -0700, Daniel Phillips wrote:
 
 This has the makings of a nice stable internal kernel api.  Why do we want
 to provide this nice stable internal api to proprietary modules?

because there is IMHO legally nothing we can do about it anyway.  Use of
an industry-standard API that is provided in multiple operating systems
is one of the clearest indications of some program _not_ being a
derivative work.

Whether we like it or not, it doesn't really matter if we export them
GPL-only or not.  Anybody using those socket API calls will be having an
easy time arguing in favor of non-derivative work.

The GPL doesn't extend beyond what copyright law allows you to do...

-- 
- Harald Welte [EMAIL PROTECTED]  http://gnumonks.org/

We all know Linux is great...it does infinite loops in 5 seconds. -- Linus



Re: http://bugzilla.kernel.org/show_bug.cgi?id=6197

2006-06-14 Thread Patrick McHardy

Michael Tokarev wrote:
 Patrick McHardy wrote:
 []
 
He patched his kernel with the IMQ device, which is known to cause all
kinds of weird problems.
 
 
 Which problems?  Known to whom?

Known to me (who wrote the original implementation of the current IMQ
device) and numerous people who were hit by them. IIRC it does some
invalid skb refcounting hacks which result in crashes in certain
scenarios - but I don't remember the exact details.

 I was considering using imq for our needs (not done yet), and from the
 FAQ at http://www.linuximq.net/faq.html (item #3, Is it stable?) it
 seems there's no problems except of gre tunnels and locally generated
 traffic...
 
 Googling for imq linux problem shows usual pile of various user
 support questions (how to configure.. what did I do wrong.. etc),
 but nothing relevant.

The lartc list had lots of reports of crashes. I guess "imq crash" or
"imq oops" will give better results.

 So... I'm curious whether the claim on linuximq.net site about the
 stability is true, or there in fact are some real issues...

From what I know these problems haven't been fixed. Current kernels
include Jamal's tc actions and the ifb-Device, which obsolete IMQ
anyway.


Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late

2006-06-14 Thread Grant Grundler

On Wed, Jun 14, 2006 at 09:05:06AM -0400, Kyle McMartin wrote:
 I think the correct sequence would be:
 
   reset tulip interrupt mask
   flush posted write
 
   synchronize irq /* make sure we got 'em all */

   tulip_stop_rxtx /* turn off dma */
   free irq/* bye bye */
 
 The synchronize irq guarantees we shouldn't see another irq
 generated by the card because it was held up somewhere.

Kyle,
synchronize_irq() only guarantees that the currently executing interrupt
handler completes before handing control back to the caller.
It does not guarantee IRQ signals still in flight are flushed.
Remember that IRQ lines are a sideband signal and not subject
to PCI data ordering rules.

thanks,
grant

Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late

2006-06-14 Thread Jeff Garzik


Grant Grundler wrote:

On Tue, Jun 13, 2006 at 08:33:22PM -0400, Jeff Garzik wrote:

Grant Grundler wrote:

o tulip_stop_rxtx() has to be called _after_ free_irq().
 ie. v2 patch didn't fix the original race condition
 and when under test, dies about as fast as the original code.
You made the race window smaller, but it's still there.  The chip's DMA 
engines should be stopped before you unregister the interrupt handler.


Switching the order to be:
tulip_stop_rxtx(tp);        /* Stop DMA */
free_irq (dev->irq, dev);   /* no more races after this */

still leaves us open to IRQs being delivered _after_ we've stopped DMA.


Correct.  And that is the preferred, natural, logical, obvious order:

1) Turn things off.
2) Wait for activity to cease.



That in turn allows the interrupt handler to re-enable DMA again.


Then that would be a problem to solve...  Some interrupt handlers will 
test netif_running() or a driver-specific shutting-down flag, 
specifically to avoid such behaviors.
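
A toy model of the shutting-down flag idea, for illustration only. This is plain single-threaded user-space C with invented names (struct fake_nic and friends), not driver code, and it ignores locking and memory ordering entirely. It only shows the control flow: a late interrupt still runs the handler, but the handler declines to restart DMA once the flag is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver's private state. */
struct fake_nic {
	bool dma_running;
	bool shutting_down;
	unsigned int irqs_handled;
};

/* Models the interrupt handler. */
static void fake_irq_handler(struct fake_nic *nic)
{
	assert(nic != NULL);
	nic->irqs_handled++;
	if (nic->shutting_down)
		return;               /* late IRQ: do not touch DMA again */
	nic->dma_running = true;      /* normal path: keep the rings going */
}

/* Models the close path: 1) turn things off, 2) wait for activity to cease. */
static void fake_close(struct fake_nic *nic)
{
	assert(nic != NULL);
	nic->shutting_down = true;    /* tell the handler first */
	nic->dma_running = false;     /* then stop DMA */
	/* a real driver would release its IRQ after this point */
}
```

In this model an interrupt that arrives after fake_close() is still counted, but leaves dma_running false.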


Jeff




Re: [PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL

2006-06-14 Thread Jesper Dangaard Brouer

On Wed, 2006-06-14 at 10:27 -0400, Phillip Susi wrote:
 Jesper Dangaard Brouer wrote:
  The Linux traffic's control engine inaccurately calculates
  transmission times for packets sent over ADSL links.  For
  some packet sizes the error rises to over 50%.  This occurs
  because ADSL uses ATM as its link layer transport, and ATM
  transmits packets in fixed sized 53 byte cells.
  
 
 I could have sworn that DSL uses its own framing protocol that is 
 similar to the frame/superframe structure of HDSL ( T1 ) lines and over 
 that you can run ATM or ethernet.  Or is it typically ethernet - ATM - 
 HDSL?

Nope, not according to the ADSL standards G.992.1 and G.992.2.

 In any case, why does the kernel care about the exact time that the IP 
 packet has been received and reassembled on the headend?

I think you have misunderstood what the rate table does...
(There is an explanation in the thesis page 57 section 6.1.2)
http://www.adsl-optimizer.dk/thesis/

-- 
Med venlig hilsen / Best regards
  Jesper Brouer
  ComX Networks A/S
  Linux Network developer
  Cand. Scient Datalog / MSc.
  Author of http://adsl-optimizer.dk




Re: [PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL

2006-06-14 Thread Phillip Susi


Jesper Dangaard Brouer wrote:

The Linux traffic's control engine inaccurately calculates
transmission times for packets sent over ADSL links.  For
some packet sizes the error rises to over 50%.  This occurs
because ADSL uses ATM as its link layer transport, and ATM
transmits packets in fixed sized 53 byte cells.



I could have sworn that DSL uses its own framing protocol that is 
similar to the frame/superframe structure of HDSL ( T1 ) lines and over 
that you can run ATM or ethernet.  Or is it typically ethernet - ATM - 
HDSL?


In any case, why does the kernel care about the exact time that the IP 
packet has been received and reassembled on the headend?




Re: [RFC/PATCH 1/2] in-kernel sockets API

2006-06-14 Thread Erik Mouw

On Wed, Jun 14, 2006 at 03:30:22PM +0200, Harald Welte wrote:
 On Tue, Jun 13, 2006 at 02:12:41PM -0700, Daniel Phillips wrote:
  
  This has the makings of a nice stable internal kernel api.  Why do we want
  to provide this nice stable internal api to proprietary modules?
 
 because there is IMHO legally nothing we can do about it anyway.  Use of
 an industry-standard API that is provided in multiple operating system
 is one of the clearest idnication of some program _not_ being a
 derivative work.

IMHO there is no industry-standard API for in-kernel use of sockets.
There is however one for user space.


Erik
(IANAL, etc)

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands

Re: Refactor Netlink connector?

2006-06-14 Thread James Morris

On Wed, 14 Jun 2006, jamal wrote:

 
 So what's the resolution on this? I actually have some cycles this coming
 weekend that I was hoping to spend updating the doc instead.

Haven't had a chance to look at it since.

-- 
James Morris
[EMAIL PROTECTED]

Re: [RFC/PATCH 1/2] in-kernel sockets API

2006-06-14 Thread Harald Welte

On Wed, Jun 14, 2006 at 04:29:04PM +0200, Erik Mouw wrote:
 On Wed, Jun 14, 2006 at 03:30:22PM +0200, Harald Welte wrote:
  On Tue, Jun 13, 2006 at 02:12:41PM -0700, Daniel Phillips wrote:
   
   This has the makings of a nice stable internal kernel api.  Why do we want
   to provide this nice stable internal api to proprietary modules?
  
  because there is IMHO legally nothing we can do about it anyway.  Use of
  an industry-standard API that is provided in multiple operating system
  is one of the clearest idnication of some program _not_ being a
  derivative work.
 
 IMHO there is no industry-standard API for in-kernel use of sockets.
 There is however one for user space.

it doesn't matter in what space you are.  If the API really is similar
enough, then any piece of code (no matter where it was originally
intended to run) will be able to work with any such socket API.

The whole point of this is: Where is the derivation of an existing work?
I can write a program against some BSD socket api somewhere, and I can
easily make it use the proposed in-kernel sockets API.  No derivation of
anything that is inside the kernel and GPL licensed.

 (IANAL, etc)

Neither am I, but I'm constantly dealing with legal questions related to
the GPL while running gpl-violations.org.

-- 
- Harald Welte [EMAIL PROTECTED]  http://gnumonks.org/

We all know Linux is great...it does infinite loops in 5 seconds. -- Linus



Re: [PATCH 0/2] NET: Accurate packet scheduling for ATM/ADSL

2006-06-14 Thread Andy Furniss


jamal wrote:

I have taken linux-kernel off the list.

Russell's site is inaccessible to me (I actually think this is related
to some DNS issues i may be having) and your masters is too long to
spend 2 minutes and glean it; so heres a question or two for you:

- Have you tried to do a long-lived session such as a large FTP and 
seen how far off the deviation was? That would provide some interesting

data point.
- To be a devil's advocate (and not claim there is no issue), where do
you draw the line with overhead? 


Me and many others have run a similar hack for years, there is also a 
userspace project still alive which does the same.


The difference is that without it I would need to sacrifice almost half 
my 288kbit atm/dsl showtime bandwidth to be sure of control.


With the modification I can run at 286kbit / 288 and know I will never 
have jitter worse than the bitrate latency of an mtu packet. The 286 
figure was chosen to allow a full buffer to drain / allow for timer 
inaccuracy etc. On a p200 with tsc, 2.6.12 it's never gone over for me 
- though talking of timers I notice on my desktop 2.6.16 I gain 2 
minutes a day now.


Andy.

[PATCH] bcm43xx: use softmac-suggested TX rate

2006-06-14 Thread Michael Buesch

Hi John,

Sorry, took a little bit longer than expected, but here it is. :)
Please queue for 2.6.18.

--

From: Daniel Drake [EMAIL PROTECTED]

Use Softmac-suggested TX ratecode:
ieee80211softmac_suggest_txrate()

Signed-off-by: Daniel Drake [EMAIL PROTECTED]
Signed-off-by: Michael Buesch [EMAIL PROTECTED]

Index: wireless-2.6/drivers/net/wireless/bcm43xx/bcm43xx_xmit.c
===
--- wireless-2.6.orig/drivers/net/wireless/bcm43xx/bcm43xx_xmit.c   2006-06-14 16:53:50.0 +0200
+++ wireless-2.6/drivers/net/wireless/bcm43xx/bcm43xx_xmit.c    2006-06-14 17:44:23.0 +0200
@@ -296,11 +296,14 @@
u16 control = 0;
u16 wsec_rate = 0;
u16 encrypt_frame;
+   const u16 ftype = WLAN_FC_GET_TYPE(le16_to_cpu(wireless_header->frame_ctl));
+   const int is_mgt = (ftype == IEEE80211_FTYPE_MGMT);
 
/* Now construct the TX header. */
memset(txhdr, 0, sizeof(*txhdr));
 
-   bitrate = bcm->softmac->txrates.default_rate;
+   bitrate = ieee80211softmac_suggest_txrate(bcm->softmac,
+   is_multicast_ether_addr(wireless_header->addr1), is_mgt);
ofdm_modulation = !(ieee80211_is_cck_rate(bitrate));
fallback_bitrate = bcm43xx_calc_fallback_rate(bitrate);
fallback_ofdm_modulation = !(ieee80211_is_cck_rate(fallback_bitrate));

-- 
Greetings Michael.

[PATCH 6/5] rt2x00: per-queue TX flow control

2006-06-14 Thread Jiri Benc

This is a patch for rt2x00 driver to do TX flow control.

It is compile-tested only.

Signed-off-by: Jiri Benc [EMAIL PROTECTED]

---
 drivers/net/wireless/d80211/rt2x00/rt2400pci.c |   26 ++---
 drivers/net/wireless/d80211/rt2x00/rt2500pci.c |   26 ++---
 drivers/net/wireless/d80211/rt2x00/rt2500usb.c |   18 +
 drivers/net/wireless/d80211/rt2x00/rt61pci.c   |   26 ++---
 drivers/net/wireless/d80211/rt2x00/rt73usb.c   |   18 +
 5 files changed, 85 insertions(+), 29 deletions(-)

--- dscape.orig/drivers/net/wireless/d80211/rt2x00/rt2400pci.c
+++ dscape/drivers/net/wireless/d80211/rt2x00/rt2400pci.c
@@ -1001,7 +1001,7 @@ rt2400pci_txdone(void *data)
struct data_entry   *entry;
struct txd  *txd;
int tx_status;
-   int ack;
+   int ack, wake, queue;
 
while (!rt2x00_ring_empty(ring)) {
entry = rt2x00_get_data_entry_done(ring);
@@ -1048,7 +1048,11 @@ rt2400pci_txdone(void *data)
rt2x00_set_field32(txd->word0, TXD_W0_VALID, 0);
entry->skb = NULL;
 
+   wake = rt2x00_ring_full(ring);
+   queue = entry->tx_status.control.queue;
rt2x00_ring_index_done_inc(ring);
+   if (wake)
+   ieee80211_wake_queue(ring->net_dev, queue);
}
 
/*
@@ -1541,24 +1545,31 @@ rt2400pci_tx(struct net_device *net_dev,
ERROR("Attempt to send packet over invalid queue %d.\n"
"Please file bug report to %s.\n",
control->queue, DRV_PROJECT);
-   return NET_XMIT_DROP;
+   dev_kfree_skb_any(skb);
+   return NETDEV_TX_OK;
}
 
-   if (rt2x00_ring_full(ring))
-   return NET_XMIT_DROP;
+   if (rt2x00_ring_full(ring)) {
+   ieee80211_stop_queue(net_dev, control->queue);
+   return NETDEV_TX_BUSY;
+   }
 
entry = rt2x00_get_data_entry(ring);
txd = entry->desc_addr;
 
-   if (rt2x00_get_field32(txd->word0, TXD_W0_OWNER_NIC)
-   || rt2x00_get_field32(txd->word0, TXD_W0_VALID))
-   return NET_XMIT_DROP;
+   if (rt2x00_get_field32(txd->word0, TXD_W0_OWNER_NIC) ||
+   rt2x00_get_field32(txd->word0, TXD_W0_VALID)) {
+   ieee80211_stop_queue(net_dev, control->queue);
+   return NETDEV_TX_BUSY;
+   }
 
memcpy(entry->data_addr, skb->data, skb->len);
rt2400pci_write_tx_desc(rt2x00pci, txd, skb, control);
entry->skb = skb;
 
rt2x00_ring_index_inc(ring);
+   if (rt2x00_ring_full(ring))
+   ieee80211_stop_queue(net_dev, control->queue);
 
rt2x00_register_read(rt2x00pci, TXCSR0, reg);
if (control->queue == IEEE80211_TX_QUEUE_DATA0)
@@ -1668,6 +1679,7 @@ rt2400pci_open(struct net_device *net_de
rt2x00_register_write(rt2x00pci, CSR8, reg);
 
SET_FLAG(rt2x00pci, RADIO_ENABLED);
+   ieee80211_start_queues(net_dev);
 
return 0;
 
--- dscape.orig/drivers/net/wireless/d80211/rt2x00/rt2500pci.c
+++ dscape/drivers/net/wireless/d80211/rt2x00/rt2500pci.c
@@ -1089,7 +1089,7 @@ rt2500pci_txdone(void *data)
struct data_entry   *entry;
struct txd  *txd;
int tx_status;
-   int ack;
+   int ack, wake, queue;
 
while (!rt2x00_ring_empty(ring)) {
entry = rt2x00_get_data_entry_done(ring);
@@ -1136,7 +1136,11 @@ rt2500pci_txdone(void *data)
rt2x00_set_field32(txd->word0, TXD_W0_VALID, 0);
entry->skb = NULL;
 
+   wake = rt2x00_ring_full(ring);
+   queue = entry->tx_status.control.queue;
rt2x00_ring_index_done_inc(ring);
+   if (wake)
+   ieee80211_wake_queue(ring->net_dev, queue);
}
 
/*
@@ -1664,24 +1668,31 @@ rt2500pci_tx(struct net_device *net_dev,
ERROR("Attempt to send packet over invalid queue %d.\n"
"Please file bug report to %s.\n",
control->queue, DRV_PROJECT);
-   return NET_XMIT_DROP;
+   dev_kfree_skb_any(skb);
+   return NETDEV_TX_OK;
}
 
-   if (rt2x00_ring_full(ring))
-   return NET_XMIT_DROP;
+   if (rt2x00_ring_full(ring)) {
+   ieee80211_stop_queue(net_dev, control->queue);
+   return NETDEV_TX_BUSY;
+   }
 
entry = rt2x00_get_data_entry(ring);
txd = entry->desc_addr;
 
-   if (rt2x00_get_field32(txd->word0, TXD_W0_OWNER_NIC)
-   || rt2x00_get_field32(txd->word0, TXD_W0_VALID))
-   return NET_XMIT_DROP;
+   if (rt2x00_get_field32(txd->word0, TXD_W0_OWNER_NIC) ||
+   rt2x00_get_field32(txd->word0,

Re: tcp_slow_start_after_idle

2006-06-14 Thread Rick Jones


+tcp_slow_start_after_idle - BOOLEAN
+   If set, provide RFC2861 behavior and time out the congestion
+   window after an idle period.  An idle period is defined at
+   the current RTO.  If unset, the congestion window will not
+   be timed out after an idle period.
+   Default: 1


Did you mean "defined as" rather than "defined at"?

Also, does the congestion window time out or does it decay?

Perhaps:

tcp_slow_start_after_idle - BOOLEAN
 If set, provide RFC2861 behavior and decay the congestion
 window after the connection has been idle for the connection's
 current RTO.  If unset, the congestion window will not decay
 when the connection has been idle.
 Default: 1
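
To make the "decay" wording concrete, here is a minimal user-space sketch in C of the RFC 2861 idea: the window is halved once for each full RTO of idle time, down to a floor. This is not the kernel's code. The function name decay_cwnd_after_idle, its parameters, and the millisecond units are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustration only, not kernel code: every name here is hypothetical.
 * Halve cwnd once for each full RTO the connection has been idle,
 * but never shrink it below the restart window.
 */
static uint32_t decay_cwnd_after_idle(uint32_t cwnd, uint32_t restart_cwnd,
				      uint32_t idle_ms, uint32_t rto_ms)
{
	/* Never raise cwnd: the lower bound is at most the current window. */
	uint32_t min_cwnd = restart_cwnd < cwnd ? restart_cwnd : cwnd;

	assert(rto_ms > 0);	/* caller must supply a valid RTO */

	while (idle_ms >= rto_ms && cwnd > min_cwnd) {
		cwnd >>= 1;
		idle_ms -= rto_ms;
	}
	return cwnd < min_cwnd ? min_cwnd : cwnd;
}
```

With a window of 40, a restart window of 4 and an RTO of 200 ms, 650 ms of idle time gives three halvings (40, 20, 10, 5), while a very long idle period bottoms out at the restart window rather than at zero.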


diff --git a/include/net/tcp.h b/include/net/tcp.h
index de88c54..bfc71f9 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -227,6 +227,7 @@ extern int sysctl_tcp_abc;
 extern int sysctl_tcp_mtu_probing;
 extern int sysctl_tcp_base_mss;
 extern int sysctl_tcp_workaround_signed_windows;
+extern int sysctl_tcp_slow_start_after_idle;



diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 743016b..be6d929 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -59,6 +59,9 @@ int sysctl_tcp_tso_win_divisor = 3;
 int sysctl_tcp_mtu_probing = 0;
 int sysctl_tcp_base_mss = 512;
 
+/* By default, RFC2861 behavior.  */

+int sysctl_tcp_slow_start_after_idle = 1;
+


Is this a candidate for readmostly?

rick jones

Re: tcp_slow_start_after_idle

2006-06-14 Thread Zach Brown

David Miller wrote:
 Bringing back up this old topic:
 
   http://marc.theaimsgroup.com/?l=linux-netdev&m=114564962420171&w=2
 
 I've decided to add this tunable to the net-2.6.18 tree, patch below.

Nice, thanks for the heads-up.  I'll pass the notice on to the guys who
were asking about this in that thread.

- z

Interrupt handling on SMP system

2006-06-14 Thread Majid Khan


Hi all,
 I have a few questions about how interrupt handling works on Linux
on SMP systems.

1. Which processor gets the interrupt when a new packet arrives? Is
there a policy mechanism that can steer the interrupt to an idle
processor? Do the processors share an interrupt line, with an
interrupt controller assigning each interrupt to a specific processor?

2. When a softirq is scheduled by the driver and picked up by a
ksoftirqd thread, is it processed entirely on the same
processor? Does the ksoftirqd/i thread process only the packets in the
i-th processor's queue?

Regards,
Majid

Re: [RFC/PATCH 1/2] in-kernel sockets API

2006-06-14 Thread Daniel Phillips


Hi Harald,

You wrote:

On Tue, Jun 13, 2006 at 02:12:41PM -0700, I wrote:


This has the makings of a nice stable internal kernel api.  Why do we want
to provide this nice stable internal api to proprietary modules?


because there is IMHO legally nothing we can do about it anyway.


Speaking as a former member of a grey market binary module vendor that
came in from the cold I can assure you that the distinction between EXPORT
and EXPORT_GPL _is_ meaningful.  That tainted flag makes it extremely
difficult to do deals with mainstream Linux companies and there is always
the fear that it will turn into a legal problem.  The latter bit tends to
make venture capitalists nervous.

That said, the EXPORT_GPL issue is not about black and white legal issues,
it is about gentle encouragement.  In this case we are offering a clumsy,
on-the-metal, guaranteed-to-change-and-make-you-edit-code interface to
non-GPL-compatible modules and a decent, stable (in the deserves to live
sense) interface for the pure of heart.  Gentle encouragement at exactly
the right level.

Did we settle the question of whether these particular exports should be
EXPORT_SYMBOL_GPL?

Regards,

Daniel

Re: [openib-general] [PATCH v2 1/2] iWARP Connection Manager.

2006-06-14 Thread Steve Wise

On Tue, 2006-06-13 at 16:46 -0500, Steve Wise wrote:
 On Tue, 2006-06-13 at 14:36 -0700, Sean Hefty wrote:
   Er...no. It will lose this event. Depending on the event...the carnage
   varies. We'll take a look at this.
  
  
  This behavior is consistent with the Infiniband CM (see
  drivers/infiniband/core/cm.c function cm_recv_handler()).  But I think
  we should at least log an error because a lost event will usually stall
  the rdma connection.
  
  I believe that there's a difference here.  For the Infiniband CM, an 
  allocation
  error behaves the same as if the received MAD were lost or dropped.  Since 
  MADs
  are unreliable anyway, it's not so much that an IB CM event gets lost, as it
  doesn't ever occur.  A remote CM should retry the send, which hopefully 
  allows
  the connection to make forward progress.
  
 
 hmm.  Ok.  I see.  I misunderstood the code in cm_recv_handler().
 
 Tom and I have been talking about what we can do to not drop the event.
 Stay tuned.

Here's a simple solution that solves the problem:  

For any given cm_id, there are a finite (and small) number of
outstanding CM events that can be posted.  So we just pre-allocate them
when the cm_id is created and keep them on a free list hanging off of
the cm_id struct.  Then the event handler function will pull from this
free list.  

The only case where there is any non-finite issue is on the passive
listening cm_id.  Each incoming connection request will consume a work
struct.  So based on client connects, we could run out of work structs.
However, the CMA has the concept of a backlog, which is defined as the
max number of pending unaccepted connection requests.  So we allocate
these work structs based on that number (or a computation based on that
number), and if we run out, we simply drop the incoming connection
request due to backlog overflow (I suggest we log the drop event too).
When a MPA connection request is dropped, the (IETF conforming) MPA
client will eventually time out the connection and the consumer can
retry.
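
A rough sketch of the pre-allocation idea, in user-space C, might look like the following. It is only an illustration under stated assumptions: struct cm_work, struct cm_pool and the cm_pool_* helpers are hypothetical names, not the IWCM data structures, and locking is left out. The point it shows is that the event-handler path only pops from a free list filled at creation time, so it cannot fail on allocation; when the list is empty the event is counted as dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-cm_id work/event item. */
struct cm_work {
	struct cm_work *next;	/* free-list link */
	int event;		/* placeholder payload */
};

/* Hypothetical stand-in for the part of a cm_id that owns the pool. */
struct cm_pool {
	struct cm_work *free_list;
	struct cm_work *storage;	/* one allocation backing all items */
	unsigned long dropped;		/* requests refused: pool exhausted */
};

/* Allocate every work item up front, when the id is created. */
static int cm_pool_init(struct cm_pool *pool, size_t count)
{
	size_t i;

	assert(pool != NULL);
	pool->free_list = NULL;
	pool->dropped = 0;
	pool->storage = count ? calloc(count, sizeof(*pool->storage)) : NULL;
	if (!pool->storage)
		return -1;
	for (i = 0; i < count; i++) {
		pool->storage[i].next = pool->free_list;
		pool->free_list = &pool->storage[i];
	}
	return 0;
}

/* Event-handler path: never allocates; a real handler would hold a lock. */
static struct cm_work *cm_pool_get(struct cm_pool *pool)
{
	struct cm_work *work = pool->free_list;

	if (!work) {
		pool->dropped++;	/* e.g. backlog overflow: log and drop */
		return NULL;
	}
	pool->free_list = work->next;
	work->next = NULL;
	return work;
}

/* Return an item once its event has been delivered. */
static void cm_pool_put(struct cm_pool *pool, struct cm_work *work)
{
	work->next = pool->free_list;
	pool->free_list = work;
}

static void cm_pool_destroy(struct cm_pool *pool)
{
	free(pool->storage);
	pool->storage = NULL;
	pool->free_list = NULL;
}
```

For a listening id the count passed to cm_pool_init() would be derived from the backlog, so running out of items corresponds to the backlog overflow case described above.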

Comments?




Re: [RFC/PATCH 1/2] in-kernel sockets API

2006-06-14 Thread Brian F. G. Bidulock

Daniel,

On Wed, 14 Jun 2006, Daniel Phillips wrote:
 
 Speaking as a former member of a grey market binary module vendor that
 came in from the cold I can assure you that the distinction between EXPORT
 and EXPORT_GPL _is_ meaningful.  That tainted flag makes it extremely
 difficult to do deals with mainstream Linux companies and there is always
 the fear that it will turn into a legal problem.  The latter bit tends to
 make venture capitalists nervous.
 

EXPORT_SYMBOL_GPL and the Tainted flag have nothing to do with each other.


Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late

2006-06-14 Thread Grant Grundler

On Wed, Jun 14, 2006 at 11:03:48AM -0400, Jeff Garzik wrote:
 Grant Grundler wrote:
 Switching the order to be:
 tulip_stop_rxtx(tp);        /* Stop DMA */
 free_irq (dev->irq, dev);   /* no more races after this */
 
 still leaves us open to IRQs being delivered _after_ we've stopped DMA.
 
 Correct.  And that is the preferred, natural, logical, obvious order:
 
 1) Turn things off.
 2) Wait for activity to cease.

Patch v3 does this in two stages:
1) turn off tulip interrupts
2) free_irq() calls synchronize_irq() to handle pending IRQs

then calls tulip_stop_rxtx() which:
1) tells tulip to stop DMA
2) poll until DMA completes

After this we can free remaining resources.

 That in turn allows the interrupt handler to re-enable DMA again.
 
 Then that would be a problem to solve...  Some interrupt handlers will 
 test netif_running() or a driver-specific shutting-down flag, 
 specifically to avoid such behaviors.

I'm not keen on adding more code to tulip_interrupt() routine
for something that rarely happens (compared to IRQs) and is handled
outside the interrupt routine.  I'm pretty sure stopping interrupts
before stopping DMA is sufficient.
Can you show an example where it doesn't work?

This is important since I'm going to propose a new Documentation/pci.txt
based on this experience.

thanks,
grant

Re: [PATCH 6/5] rt2x00: per-queue TX flow control

2006-06-14 Thread Ivo van Doorn

Hi,

On Wednesday 14 June 2006 18:36, Jiri Benc wrote:
 This is a patch for rt2x00 driver to do TX flow control.
 
 It is compile-tested only.

 Signed-off-by: Jiri Benc [EMAIL PROTECTED]

I'll put my comments for the rt2400pci driver only,
since the same changes are made for each rt2x00 driver.

 --- dscape.orig/drivers/net/wireless/d80211/rt2x00/rt2400pci.c
 +++ dscape/drivers/net/wireless/d80211/rt2x00/rt2400pci.c
 @@ -1001,7 +1001,7 @@ rt2400pci_txdone(void *data)
   struct data_entry   *entry;
   struct txd  *txd;
   int tx_status;
 - int ack;
 + int ack, wake, queue;
  
   while (!rt2x00_ring_empty(ring)) {
   entry = rt2x00_get_data_entry_done(ring);
 @@ -1048,7 +1048,11 @@ rt2400pci_txdone(void *data)
   rt2x00_set_field32(txd->word0, TXD_W0_VALID, 0);
   entry->skb = NULL;
  
 + wake = rt2x00_ring_full(ring);
 + queue = entry->tx_status.control.queue;
   rt2x00_ring_index_done_inc(ring);
 + if (wake)
 + ieee80211_wake_queue(ring->net_dev, queue);
   }

I fear this will not give the correct result, and it adds (unwanted)
overhead by checking whether the queue is full on every iteration.
Queue_full can be checked once, before the loop starts, and the
queue is best woken after all entries have been freed and a second
check confirms the queue is no longer full. (There is no guarantee
that there will be free entries in the queue when the while() loop ends.)
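
The ordering suggested here can be sketched in plain C as follows. This is a simplified user-space model, not rt2x00 code: struct tx_ring, RING_SIZE and the field names are made up for the example, and a boolean stands in for the stack's stop/wake queue calls. The full state is sampled once before the loop, and the queue is woken once afterwards, only if the ring was full and no longer is.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 4	/* arbitrary size for the illustration */

/* Hypothetical, simplified TX ring; not the rt2x00 data structure. */
struct tx_ring {
	unsigned int used;	/* entries currently owned by the ring */
	unsigned int completed;	/* entries the hardware has finished */
	bool queue_stopped;	/* stands in for stop/wake of the stack queue */
};

static bool ring_full(const struct tx_ring *r)
{
	return r->used == RING_SIZE;
}

/* Reap completed entries; returns how many were freed. */
static unsigned int txdone(struct tx_ring *r)
{
	bool was_full = ring_full(r);	/* sampled once, before the loop */
	unsigned int reaped = 0;

	assert(r->completed <= r->used);

	while (r->completed > 0) {
		r->completed--;
		r->used--;
		reaped++;
	}

	/* Wake once, and only if room actually appeared. */
	if (was_full && !ring_full(r))
		r->queue_stopped = false;

	return reaped;
}
```

If the loop frees nothing, the second check keeps the queue stopped, which covers the case mentioned in the parenthetical above.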

   /*
 @@ -1541,24 +1545,31 @@ rt2400pci_tx(struct net_device *net_dev,
   ERROR("Attempt to send packet over invalid queue %d.\n"
   "Please file bug report to %s.\n",
   control->queue, DRV_PROJECT);
 - return NET_XMIT_DROP;
 + dev_kfree_skb_any(skb);
 + return NETDEV_TX_OK;
   }
  
 - if (rt2x00_ring_full(ring))
 - return NET_XMIT_DROP;
 + if (rt2x00_ring_full(ring)) {
 + ieee80211_stop_queue(net_dev, control->queue);
 + return NETDEV_TX_BUSY;
 + }
  
   entry = rt2x00_get_data_entry(ring);
   txd = entry->desc_addr;
  
 - if (rt2x00_get_field32(txd->word0, TXD_W0_OWNER_NIC)
 - || rt2x00_get_field32(txd->word0, TXD_W0_VALID))
 - return NET_XMIT_DROP;
 + if (rt2x00_get_field32(txd->word0, TXD_W0_OWNER_NIC) ||
 + rt2x00_get_field32(txd->word0, TXD_W0_VALID)) {
 + ieee80211_stop_queue(net_dev, control->queue);
 + return NETDEV_TX_BUSY;
 + }

Not sure if I am happy with this one. When this check is made,
it occurs after the ring_full check. This means that when this statement
is true, the queue is not full. Instead it has more of a meaning that something
has gone wrong with the queue and this should not have happened.

But this is not really a problem in the patch itself, just a problem I only
now recognize thanks to your patch. ;)

For the time being I'll add a debug message, but I need to find a method
to clean up the ring if this occurs.
This check currently does not happen in the rt2570 and rt73 USB drivers,
but it is safer to add them in there as well.

   memcpy(entry->data_addr, skb->data, skb->len);
   rt2400pci_write_tx_desc(rt2x00pci, txd, skb, control);
   entry->skb = skb;
  
   rt2x00_ring_index_inc(ring);
 + if (rt2x00_ring_full(ring))
 + ieee80211_stop_queue(net_dev, control->queue);
  
   rt2x00_register_read(rt2x00pci, TXCSR0, reg);
   if (control->queue == IEEE80211_TX_QUEUE_DATA0)
 @@ -1668,6 +1679,7 @@ rt2400pci_open(struct net_device *net_de
   rt2x00_register_write(rt2x00pci, CSR8, reg);
  
   SET_FLAG(rt2x00pci, RADIO_ENABLED);
 + ieee80211_start_queues(net_dev);
  
   return 0;

Based on Jiri's patch for rt2x00 driver to do TX flow control.

Signed-off-by: Ivo van Doorn [EMAIL PROTECTED]

---

diff --git a/drivers/net/wireless/d80211/rt2x00/rt2400pci.c 
b/drivers/net/wireless/d80211/rt2x00/rt2400pci.c
index 8b856dd..946cf86 100644
--- a/drivers/net/wireless/d80211/rt2x00/rt2400pci.c
+++ b/drivers/net/wireless/d80211/rt2x00/rt2400pci.c
@@ -1002,6 +1002,12 @@ rt2400pci_txdone(void *data)
struct txd  *txd;
int tx_status;
int ack;
+   int ring_full;
+
+   /*
+* Store the current status of the ring.
+*/
+   ring_full = rt2x00_ring_full(ring);
 
while (!rt2x00_ring_empty(ring)) {
entry = rt2x00_get_data_entry_done(ring);
@@ -1062,6 +1068,16 @@ rt2400pci_txdone(void *data)
rt2x00pci->scan->status = SCANNING_READY;
complete(rt2x00pci->scan->completion);
}
+
+   /*
+* If the data ring was full before the txdone handler
+* we must make sure the packet queue in the d80211 stack
+* is reenabled when

Re: Remove Prism II support from Orinoco

2006-06-14 Thread Mike Kershaw

On Tue, Jun 13, 2006 at 09:24:49PM +0300, Jar wrote:
 It always loads itself with or without blacklist. That's why I have to 
 do 'rm -f orinoco*.* && depmod -a' when the new kernel arrives. Seems 
 that users are directed to use the insecure orinoco (wep) driver rather than 
 the secure hostap (wpa/wpa2,tkip,aes) driver for their prism2 hardware.

The hostap drivers are also much better behaved for rfmon than the
orinoco drivers for prism2.

-m

-- 
Mike Kershaw/Dragorn [EMAIL PROTECTED]
GPG Fingerprint: 3546 89DF 3C9D ED80 3381  A661 D7B2 8822 738B BDB1

Bus Error at 008BE426 while reading byte from DEADBEEF in User data space




Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late

2006-06-14 Thread Jeff Garzik


Grant Grundler wrote:

On Wed, Jun 14, 2006 at 11:03:48AM -0400, Jeff Garzik wrote:

Grant Grundler wrote:

Switching the order to be:
   tulip_stop_rxtx(tp);        /* Stop DMA */
   free_irq (dev->irq, dev);   /* no more races after this */

still leaves us open to IRQs being delivered _after_ we've stopped DMA.

Correct.  And that is the preferred, natural, logical, obvious order:

1) Turn things off.
2) Wait for activity to cease.


Patch v3 does this in two stages:
1) turn off tulip interrupts
2) free_irq() calls synchronize_irq() to handle pending IRQs

then calls tulip_stop_rxtx() which:
1) tells tulip to stop DMA
2) poll until DMA completes

After this we can free remaining resources.


You need to turn off the thing that generates work (DMA engine), before 
turning off the thing that reaps work (irq handler).




That in turn allows the interrupt handler to re-enable DMA again.
Then that would be a problem to solve...  Some interrupt handlers will 
test netif_running() or a driver-specific shutting-down flag, 
specifically to avoid such behaviors.


I'm not keen on adding more code to tulip_interrupt() routine
for something that rarely happens (compared to IRQs) and is handled
outside the interrupt routine.  I'm pretty sure stopping interrupts
before stopping DMA is sufficient.
Can you show an example where it doesn't work?


It should be completely obvious that the chip is still generating 
work...  You don't want to leave the hardware in a position where it has 
unacknowledged events.


Jeff



[patch] ipv4: fix lock usage in udp_ioctl

2006-06-14 Thread Heiko Carstens

From: Heiko Carstens [EMAIL PROTECTED]

Fix lock usage in udp_ioctl().

Signed-off-by: Heiko Carstens [EMAIL PROTECTED]
---

udp_poll() seems to have the same problem, right?

As reported by the lock validator:


[ BUG: illegal lock usage! ]

illegal {in-hardirq-W} - {hardirq-on-W} usage.
syslogd/739 [HC0[0]:SC0[1]:HE1:SE0] takes:
 (list-lock){++..}, at: [002e36d6] udp_ioctl+0x96/0x100
{in-hardirq-W} state was registered at:
  [00062128] lock_acquire+0x9c/0xc0
  [0036209e] _spin_lock_irqsave+0x66/0x84
  [002912ce] skb_dequeue+0x32/0xb0
  [00263160] qeth_qdio_output_handler+0x3e8/0xf8c
  [00219fdc] tiqdio_thinint_handler+0xde0/0x2234
  [0020448c] do_adapter_IO+0x5c/0xa8
  [0020842c] do_IRQ+0x13c/0x18c
  [000208a2] io_no_vtime+0x16/0x1c
  [0001978c] cpu_idle+0x1d0/0x20c
irq event stamp: 1694
hardirqs last  enabled at (1693): [003629c2] _spin_unlock_irqrestore+0x92/0xa8
hardirqs last disabled at (1692): [00362074] _spin_lock_irqsave+0x3c/0x84
softirqs last  enabled at (1682): [0028c7c4] release_sock+0xe4/0xf4
softirqs last disabled at (1694): [00361f7e] _spin_lock_bh+0x2e/0x70

other info that might help us debug this:
no locks held by syslogd/739.

stack backtrace:
0fd6c148 0de2f960 0002  
   0de2fa00 0de2f978 0de2f978 0001737c 
       
   0de2f960 000c 0de2f960 0de2f9d0 
   0036fe70 0001737c 0de2f960 0de2f9b0 
Call Trace:
([0001730a] show_trace+0x166/0x16c)
 [000173d6] show_stack+0xc6/0xf8
 [00017436] dump_stack+0x2e/0x3c
 [0005f978] print_usage_bug+0x23c/0x250
 [000607cc] mark_lock+0x594/0x714
 [000613be] __lock_acquire+0x252/0xf20
 [00062128] lock_acquire+0x9c/0xc0
 [00361fa8] _spin_lock_bh+0x58/0x70
 [002e36d6] udp_ioctl+0x96/0x100
 [002eadd6] inet_ioctl+0x72/0x11c
 [002893f2] sock_ioctl+0x1ca/0x2c0
 [000c13ee] do_ioctl+0x56/0xe0
 [000c14f2] vfs_ioctl+0x7a/0x384
 [000c184e] sys_ioctl+0x52/0x84
 [000e80a2] do_ioctl32_pointer+0x2a/0x3c
 [000e55c8] compat_sys_ioctl+0x168/0x378
 [00020338] sysc_noemu+0x10/0x16

diffstat:
 net/ipv4/udp.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 3f93292..b15a17b 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -740,7 +740,7 @@ int udp_ioctl(struct sock *sk, int cmd, 
unsigned long amount;
 
amount = 0;
-   spin_lock_bh(&sk->sk_receive_queue.lock);
+   spin_lock_irq(&sk->sk_receive_queue.lock);
skb = skb_peek(&sk->sk_receive_queue);
if (skb != NULL) {
/*
@@ -750,7 +750,7 @@ int udp_ioctl(struct sock *sk, int cmd, 
 */
amount = skb->len - sizeof(struct udphdr);
}
-   spin_unlock_bh(&sk->sk_receive_queue.lock);
+   spin_unlock_irq(&sk->sk_receive_queue.lock);
return put_user(amount, (int __user *)arg);
}
 

RE: [openib-general] [PATCH v2 1/2] iWARP Connection Manager.

2006-06-14 Thread Caitlin Bestler

[EMAIL PROTECTED] wrote:
 On Tue, 2006-06-13 at 16:46 -0500, Steve Wise wrote:
 On Tue, 2006-06-13 at 14:36 -0700, Sean Hefty wrote:
 Er...no. It will lose this event. Depending on the event...the
 carnage varies. We'll take a look at this.
 
 
 This behavior is consistent with the Infiniband CM (see
 drivers/infiniband/core/cm.c function cm_recv_handler()).  But I
 think we should at least log an error because a lost event will
 usually stall the rdma connection.
 
 I believe that there's a difference here.  For the Infiniband CM, an
 allocation error behaves the same as if the received MAD were lost
 or dropped.  Since MADs are unreliable anyway, it's not so much that
 an IB CM event gets lost, as it doesn't ever occur.  A remote CM
 should retry the send, which hopefully allows the
 connection to make forward progress.
 
 
 hmm.  Ok.  I see.  I misunderstood the code in cm_recv_handler().
 
 Tom and I have been talking about what we can do to not drop the
 event. Stay tuned.
 
 Here's a simple solution that solves the problem:
 
 For any given cm_id, there are a finite (and small) number of
 outstanding CM events that can be posted.  So we just
 pre-allocate them when the cm_id is created and keep them on
 a free list hanging off of the cm_id struct.  Then the event
 handler function will pull from this free list.
 
 The only case where there is any non-finite issue is on the
 passive listening cm_id.  Each incoming connection request
 will consume a work struct.  So based on client connects, we
 could run out of work structs.
 However, the CMA has the concept of a backlog, which is
 defined as the max number of pending unaccepted connection
 requests.  So we allocate these work structs based on that
 number (or a computation based on that number), and if we run
 out, we simply drop the incoming connection request due to
 backlog overflow (I suggest we log the drop event too).
 When a MPA connection request is dropped, the (IETF
 conforming) MPA client will eventually time out the
 connection and the consumer can retry.
 
 Comments?
 

If the IWCM cannot accept a Connection Request event from
the driver then *someone* should generate a non-peer reject
MPA Response frame. Since the IWCM does not have the resources
to relay the event, it probably does not have the resources
to generate the MPA Response frame either. So simply returning
an "I'm Busy" error and expecting the driver to handle it
makes sense to me.


Re: [PATCH 1/2] e1000: fix netpoll with NAPI

2006-06-14 Thread Neil Horman

On Mon, Jun 12, 2006 at 02:06:00PM -0400, Neil Horman wrote:
 On Mon, Jun 12, 2006 at 09:42:14AM -0700, Mitch Williams wrote:
  On Sun, 2006-06-11 at 17:13 -0700, Neil Horman wrote:
   Any further thoughts on this guys?  I still think my last solution
   solves all of the netpoll problems, and isn't going to have any
   noticeable impact on performance.
   
  I haven't had time to evaluate performance on your patch (sorry!), but
  after thinking about it, I agree that it should not have any noticeable
  impact.  OTOH, performance tuning is a funny thing, and things you think
  won't cause problems often do.
  
 That's ok, I just didn't hear from anyone on Friday, so I was curious as to
 where we were on this.  I don't have the ability to do any real world
 performance testing here, but I'll try to record the run time of the interrupt
 routine on a limited number of frames here.
 

Hey, as promised, I've done some rudimentary performance benchmarking of the
various ways we have talked about to solve this problem.  As I previously mentioned,
I didn't have the equipment to do any real full-scale testing here, so what I
did was take a read of the real time counter at the start and end of the
e1000_intr routine with various patches applied, and record the number of
ticks elapsed on the tsc during its run.  I did this on my single-CPU x86_64
machine here, using the latest unpatched e1000 driver as a base, and then
compared it to the e1000 driver with my patch and, separately, with a patch that
spinlocks the e1000_clean_rx_irq routine (so as to serialize the critical
section that would otherwise be subject to data corruption).  Here are my
results:

Base line: 
Avg. 8145 Ticks on the tsc.

With my patch: 
http://marc.theaimsgroup.com/?l=linux-netdev&m=114970807606096&w=2
Avg. 8159 Ticks on the tsc. (+0.17% increase)

With a spinlock added to e1000_clean_rx_irq:
Avg. 8238 Ticks on the tsc. (+1.1% increase)
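
For anyone who wants to check the percentages, they are just the relative increase over the baseline average. A tiny C helper for that arithmetic is below; percent_increase is a name made up for this note and is not part of any patch in this thread.

```c
#include <assert.h>

/* Relative increase of `measured` over `baseline`, in percent. */
static double percent_increase(double baseline, double measured)
{
	assert(baseline > 0.0);	/* a zero baseline has no meaningful ratio */
	return (measured - baseline) / baseline * 100.0;
}
```

Feeding it the averages above, (8159 - 8145) / 8145 comes to roughly 0.17%, and (8238 - 8145) / 8145 to roughly 1.1%.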

If you like I can send you the time stamp counter patch that I used, as well as
the patch which adds spinlocks to the clean routine.  Note that the free running
counter values will vary so you probably want to look at percentage increase.
Either way, I think both solutions have very little impact on interrupt run
time.  Given that my patch (granted, using my test methodology here) is the
faster of the two, and arguably the more correct in terms of not using the poll
controller method to receive frames, we should go with that patch.

Thoughts/opinions? 
Neil


-- 
/***
 *Neil Horman
 *Software Engineer
 *gpg keyid: 1024D / 0x92A74FA1 - http://pgp.mit.edu
 ***/

Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late

2006-06-14 Thread Francois Romieu

Grant Grundler [EMAIL PROTECTED] :
[...]
 I'm not keen on adding more code to tulip_interrupt() routine
 for something that rarely happens (compared to IRQs) and is handled
 outside the interrupt routine.  I'm pretty sure stopping interrupts
 before stopping DMA is sufficient.
 Can you show an example where it doesn't work?

Shared irq. 

The device has not quiesced, the kernel stops listening to it, and the
neighbor device receives a late interrupt from the network device.

-- 
Ueimor

Re: [RFC/PATCH 1/2] in-kernel sockets API

2006-06-14 Thread Sridhar Samudrala

On Wed, 2006-06-14 at 10:48 -0700, Daniel Phillips wrote:

 
 Did we settle the question of whether these particular exports should be
 EXPORT_SYMBOL_GPL?

When I submitted this patch, I didn't really think about the different
ways to export these symbols. I simply used the EXPORT_SYMBOL() that is 
used by all the other exports in net/socket.c including kernel_sendmsg()
and kernel_recvmsg().

I am OK with either option (EXPORT_SYMBOL or EXPORT_SYMBOL_GPL) and I will
leave it to David Miller to make that decision at this point.

Thanks
Sridhar


Re: [patch] ipv4: fix lock usage in udp_ioctl

2006-06-14 Thread David Miller

From: Heiko Carstens [EMAIL PROTECTED]
Date: Wed, 14 Jun 2006 21:43:05 +0200

 From: Heiko Carstens [EMAIL PROTECTED]

 Fix lock usage in udp_ioctl().

 Signed-off-by: Heiko Carstens [EMAIL PROTECTED]

More likely the qeth driver shouldn't call into the socket code in
hardware interrupt context.  From your logs that's what it seems is
happening.

The socket receive queue should only be touched in software
interrupt context, never in hardware interrupt context.  That's
why the locking does BH disabling at best.

type of sadb_x_kmprivate_reserved in pfkeyv2.h

2006-06-14 Thread Tushar Gohad


Hi David and folks,

In include/linux/pfkeyv2.h, is the type 'u_int32_t' for 
sadb_x_kmprivate_reserved intentional or just an error while bringing in 
the PF_KEY IPsec extensions from KAME?


struct sadb_x_kmprivate {
	uint16_t	sadb_x_kmprivate_len;
	uint16_t	sadb_x_kmprivate_exttype;
	u_int32_t	sadb_x_kmprivate_reserved;
} __attribute__((packed));

This is causing erroneous ipsec-tools builds. How does the 
__BIT_TYPES_DEFINED define work? Seems like u_int32_t does not get 
defined in include/linux/types.h when building a userland program such 
as ipsec-tools.


An easy fix is to change the type to uint32_t. Patch attached.
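
As a quick userland sanity check, a sketch like the one below builds with nothing but the C99 fixed-width types from stdint.h. The struct name and shortened field names here are invented for the example so that it does not clash with the real header; the compile-time check mirrors the size noted in the comment in pfkeyv2.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as struct sadb_x_kmprivate, using only <stdint.h> types. */
struct kmprivate_sketch {
	uint16_t	len;
	uint16_t	exttype;
	uint32_t	reserved;
} __attribute__((packed));

/* Compile-time check of the 8-byte size the header comment states. */
static_assert(sizeof(struct kmprivate_sketch) == 8, "unexpected size");

/* Runtime accessor so the size can also be checked from a test. */
static size_t kmprivate_sketch_size(void)
{
	return sizeof(struct kmprivate_sketch);
}
```

Unlike u_int32_t, which depends on the BSD-style typedefs being visible, uint32_t is available to any userland program that includes stdint.h.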

Thanks.
- Tushar

Source: MontaVista Software, Inc.
MR: 19039
Type: Defect Fix
Disposition: needs submitting to kernel.org
Signed-off-by: Tushar Gohad [EMAIL PROTECTED]
Description:
When bringing over the PF_KEY extensions for IPsec from the
KAME stack, this one field was probably overlooked (or left for some
other reason) as u_int32_t instead of uint32_t. Changing the type is
the easiest fix and is harmless.

Index: linux-p4/include/linux/pfkeyv2.h
===
--- linux-p4.orig/include/linux/pfkeyv2.h
+++ linux-p4/include/linux/pfkeyv2.h
@@ -159,7 +159,7 @@ struct sadb_spirange {
 struct sadb_x_kmprivate {
 	uint16_t	sadb_x_kmprivate_len;
 	uint16_t	sadb_x_kmprivate_exttype;
-	u_int32_t	sadb_x_kmprivate_reserved;
+	uint32_t	sadb_x_kmprivate_reserved;
 } __attribute__((packed));
 /* sizeof(struct sadb_x_kmprivate) == 8 */

[Ubuntu PATCH] IRDA: Add some IBM think pads

2006-06-14 Thread Randy Dunlap


From: Ben Collins [EMAIL PROTECTED]

[UBUNTU:nsc-ircc] Add some IBM think pads
Add Thinkpad T60/X60/Z60/T43/R52 Infrared driver support.

http://www.kernel.org/git/?p=linux/kernel/git/bcollins/ubuntu-dapper.git;a=commitdiff;h=7b8d2713435a9fb69719a282ba75e117f3f76a5b

Signed-off-by: Ben Collins [EMAIL PROTECTED]
---

--- a/drivers/net/irda/nsc-ircc.c
+++ b/drivers/net/irda/nsc-ircc.c
@@ -115,8 +115,12 @@ static nsc_chip_t chips[] = {
 	/* Contributed by Jan Frey - IBM A30/A31 */
 	{ "PC8739x", { 0x2e, 0x4e, 0x0 }, 0x20, 0xea, 0xff,
 	  nsc_ircc_probe_39x, nsc_ircc_init_39x },

-	{ "IBM", { 0x2e, 0x4e, 0x0 }, 0x20, 0xf4, 0xff,
-	  nsc_ircc_probe_39x, nsc_ircc_init_39x },
+	/* IBM ThinkPads using PC8738x (T60/X60/Z60) */
+	{ "IBM-PC8738x", { 0x2e, 0x4e, 0x0 }, 0x20, 0xf4, 0xff,
+	  nsc_ircc_probe_39x, nsc_ircc_init_39x },
+	/* IBM ThinkPads using PC8394T (T43/R52/?) */
+	{ "IBM-PC8394T", { 0x2e, 0x4e, 0x0 }, 0x20, 0xf9, 0xff,
+	  nsc_ircc_probe_39x, nsc_ircc_init_39x },
 	{ NULL }
 };


[Ubuntu PATCH] Make tulip driver not handle Davicom NICs

2006-06-14 Thread Randy Dunlap


Make tulip driver not handle Davicom NICs, let dmfe take over

Reference: https://launchpad.net/bugs/48287
Source URL of Patch:
http://www.kernel.org/git/?p=linux/kernel/git/bcollins/ubuntu-dapper.git;a=commitdiff;h=1804482911a71bee9114cae1c2079507a38e9e7f


--- linux-2.6.17-rc5/drivers/net/tulip/tulip_core.c	2006-06-05 09:20:30.0 +0800
+++ ubuntu-kernel/drivers/net/tulip/tulip_core.c	2006-06-05 09:56:55.0 +0800
@@ -223,8 +223,12 @@
{ 0x1259, 0xa120, PCI_ANY_ID, PCI_ANY_ID, 0, 0, COMET },
{ 0x11F6, 0x9881, PCI_ANY_ID, PCI_ANY_ID, 0, 0, COMPEX9881 },
{ 0x8086, 0x0039, PCI_ANY_ID, PCI_ANY_ID, 0, 0, I21145 },
+   /* dmfe module seems to handle these better. See:
+* https://launchpad.net/bugs/48287 */
+#if 0
{ 0x1282, 0x9100, PCI_ANY_ID, PCI_ANY_ID, 0, 0, DM910X },
{ 0x1282, 0x9102, PCI_ANY_ID, PCI_ANY_ID, 0, 0, DM910X },
+#endif
{ 0x1113, 0x1216, PCI_ANY_ID, PCI_ANY_ID, 0, 0, COMET },
{ 0x1113, 0x1217, PCI_ANY_ID, PCI_ANY_ID, 0, 0, MX98715 },
{ 0x1113, 0x9511, PCI_ANY_ID, PCI_ANY_ID, 0, 0, COMET },


[Ubuntu PATCH] forcedeth: Let the driver work when no PHY is found

2006-06-14 Thread Randy Dunlap


From: Ben Collins [EMAIL PROTECTED]

[UBUNTU:forcedeth] Let the driver work when no PHY is found

This matches breezy behavior.

Reference: https://launchpad.net/products/launchpad/+bug/45257
http://www.kernel.org/git/?p=linux/kernel/git/bcollins/ubuntu-dapper.git;a=commitdiff;h=38c1aaedc1f907e138698e54ceadeb9ae560b0d7

Signed-off-by: Ben Collins [EMAIL PROTECTED]
---

--- a/drivers/net/forcedeth.c
+++ b/drivers/net/forcedeth.c
@@ -2582,14 +2582,13 @@ static int __devinit nv_probe(struct pci
 		np->phy_oui = id1 | id2;
 		break;
 	}
-	if (i == 33) {
+
+	/* Let the damn card work if it can */
+	if (i == 33)
 		printk(KERN_INFO "%s: open: Could not find a valid PHY.\n",
 		       pci_name(pci_dev));
-		goto out_freering;
-	}
-	
-	/* reset it */
-	phy_init(dev);
+	else
+		phy_init(dev);
 
 	/* set default link speed settings */
 	np->linkspeed = NVREG_LINKSPEED_FORCE|NVREG_LINKSPEED_10;



Re: tcp_slow_start_after_idle

2006-06-14 Thread David Miller

From: Zach Brown [EMAIL PROTECTED]
Date: Wed, 14 Jun 2006 10:09:52 -0700

> Nice, thanks for the heads-up.  I'll pass the notice on to the guys who
> were asking about this in that thread.

Which Wall Street brokerage firm was it? :-)

That's basically who wants this stuff, people doing financial
transactions.  They seem to open up a connection, and just blast out
data periodically (with frequency  RTO, which is the whole problem)
and they want good latency results from that.

Re: PATCHv3 2.6.17-rc5 tulip free_irq() called too late

2006-06-14 Thread Grant Grundler

On Wed, Jun 14, 2006 at 10:47:20PM +0200, Francois Romieu wrote:
> Grant Grundler [EMAIL PROTECTED] :
> [...]
> > I'm not keen on adding more code to tulip_interrupt() routine
> > for something that rarely happens (compared to IRQs) and is handled
> > outside the interrupt routine.  I'm pretty sure stopping interrupts
> > before stopping DMA is sufficient.
> > Can you show an example where it doesn't work?
>
> Shared irq.
>
> The device has not quiesced, the kernel stop listening to it and the
> neighbor device receives a late interruption from the network device.

I thought we've worked through that already:
http://www.spinics.net/lists/netdev/msg05902.html

Patch v3 takes care of that problem.
The first step in the sequence is to mask IRQs on the tulip.
The neighbor device sharing the IRQ will not see any interrupts from
the tulip after that.

thanks,
grant
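
The ordering argument in this message can be modeled in user space. The
sketch below is a toy model and not tulip driver code: struct mock_nic and
all of its functions are hypothetical names, and the only property taken
from the thread is that interrupts are masked on the device first and DMA
is stopped afterwards. It shows why a late event inside the device no
longer reaches a shared interrupt line once the mask is cleared.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a NIC sitting on a shared interrupt line. */
struct mock_nic {
	uint32_t irq_mask;    /* event bits the device may signal (0 = all masked) */
	uint32_t irq_pending; /* events latched inside the device */
	bool dma_running;
};

/* The device drives the shared line only for unmasked pending events. */
static bool nic_asserts_irq(const struct mock_nic *nic)
{
	return (nic->irq_pending & nic->irq_mask) != 0;
}

/* A late event, e.g. a frame completing while DMA is winding down. */
static void nic_late_event(struct mock_nic *nic, uint32_t bits)
{
	nic->irq_pending |= bits;
}

/* Shutdown order discussed in the thread: mask first, then stop DMA. */
static void nic_shutdown(struct mock_nic *nic)
{
	nic->irq_mask = 0;        /* step 1: device can no longer interrupt */
	nic->dma_running = false; /* step 2: quiesce DMA */
}
```

In this model a handler belonging to the neighbor device would never be
invoked on behalf of the NIC after nic_shutdown(), even if events are still
latched in irq_pending.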

[Ubuntu PATCH] Broadcom wireless patch, PCIE/Mactel support

2006-06-14 Thread Randy Dunlap


From: Matthew Garrett [EMAIL PROTECTED]

Broadcom wireless patch, PCIE/Mactel support

http://www.kernel.org/git/?p=linux/kernel/git/bcollins/ubuntu-dapper.git;a=commitdiff;h=1373a8487e911b5ee204f4422ddea00929c8a4cc

This patch adds support for PCIE cores to the bcm43xx driver. This is
needed for wireless to work on the Intel iMacs. I've submitted it to
bcm43xx upstream.

(cherry picked from d88edf6a433074323a1805365a8dfc9c26fceae3 commit)
(cherry picked from 7dbd83ed3255fde4371edcbb6ad1d30f3e6ddf08 commit)
---

--- a/drivers/net/wireless/bcm43xx/bcm43xx.h
+++ b/drivers/net/wireless/bcm43xx/bcm43xx.h
@@ -202,6 +202,8 @@
#define BCM43xx_COREID_USB20_HOST   0x819
#define BCM43xx_COREID_USB20_DEV0x81a
#define BCM43xx_COREID_SDIO_HOST0x81b
+#define BCM43xx_COREID_PCIE0x820
+#define BCM43xx_COREID_CHIPCOMMON_NEW  0x900

/* Core Information Registers */
#define BCM43xx_CIR_BASE0xf00
--- a/drivers/net/wireless/bcm43xx/bcm43xx_main.c
+++ b/drivers/net/wireless/bcm43xx/bcm43xx_main.c
@@ -130,6 +130,8 @@ MODULE_PARM_DESC(fwpostfix, Postfix for
{ PCI_VENDOR_ID_BROADCOM, 0x4301, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
/* Broadcom 4307 802.11b */
{ PCI_VENDOR_ID_BROADCOM, 0x4307, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
+   /* Broadcom 4312 80211a/b/g */
+   { PCI_VENDOR_ID_BROADCOM, 0x4312, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
/* Broadcom 4318 802.11b/g */
{ PCI_VENDOR_ID_BROADCOM, 0x4318, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
/* Broadcom 4319 802.11a/b/g */
@@ -2580,7 +2582,8 @@ static int bcm43xx_probe_cores(struct bc
core_vendor = (sb_id_hi  0x)  16;

/* if present, chipcommon is always core 0; read the chipid from it */
-   if (core_id == BCM43xx_COREID_CHIPCOMMON) {
+	if (core_id == BCM43xx_COREID_CHIPCOMMON || 
+	core_id == BCM43xx_COREID_CHIPCOMMON_NEW) {

chip_id_32 = bcm43xx_read32(bcm, 0);
chip_id_16 = chip_id_32  0x;
bcm-core_chipcommon.available = 1;
@@ -2614,7 +2617,8 @@ static int bcm43xx_probe_cores(struct bc

/* ChipCommon with Core Rev =4 encodes number of cores,
 * otherwise consult hardcoded table */
-   if ((core_id == BCM43xx_COREID_CHIPCOMMON)  (core_rev = 4)) {
+   if (((core_id == BCM43xx_COREID_CHIPCOMMON)  (core_rev = 4)) ||
+core_id == BCM43xx_COREID_CHIPCOMMON_NEW) {
core_count = (chip_id_32  0x0F00)  24;
} else {
switch (chip_id_16) {
@@ -2686,6 +2690,7 @@ static int bcm43xx_probe_cores(struct bc
core = NULL;
switch (core_id) {
case BCM43xx_COREID_PCI:
+   case BCM43xx_COREID_PCIE:
core = bcm-core_pci;
if (core-available) {
printk(KERN_WARNING PFX Multiple PCI cores 
found.\n);
@@ -2724,6 +2729,7 @@ static int bcm43xx_probe_cores(struct bc
case 6:
case 7:
case 9:
+   case 10:
break;
default:
printk(KERN_ERR PFX Error: Unsupported 80211 core 
revision %u\n,
@@ -3002,7 +3008,7 @@ static int bcm43xx_setup_backplane_pci_c
if (err)
goto out;

-   if (bcm-core_pci.rev  6) {
+   if (bcm-core_pci.rev  6  bcm-core_pci.id != BCM43xx_COREID_PCIE) {
value = bcm43xx_read32(bcm, BCM43xx_CIR_SBINTVEC);
value |= (1  backplane_flag_nr);
bcm43xx_write32(bcm, BCM43xx_CIR_SBINTVEC, value);
@@ -3024,7 +3030,7 @@ static int bcm43xx_setup_backplane_pci_c
value |= BCM43xx_SBTOPCI2_PREFETCH | BCM43xx_SBTOPCI2_BURST;
bcm43xx_write32(bcm, BCM43xx_PCICORE_SBTOPCI2, value);

-   if (bcm-core_pci.rev  5) {
+   if (bcm-core_pci.rev  5  bcm-core_pci.id != BCM43xx_COREID_PCIE) {
value = bcm43xx_read32(bcm, BCM43xx_CIR_SBIMCONFIGLOW);
value |= (2  BCM43xx_SBIMCONFIGLOW_SERVICE_TOUT_SHIFT)
  BCM43xx_SBIMCONFIGLOW_SERVICE_TOUT_MASK;
@@ -3351,7 +3357,7 @@ static int bcm43xx_read_phyinfo(struct b
bcm-ieee-freq_band = IEEE80211_24GHZ_BAND;
break;
case BCM43xx_PHYTYPE_G:
-   if (phy_rev  7)
+   if (phy_rev  8)
phy_rev_ok = 0;
bcm-ieee-modulation = IEEE80211_OFDM_MODULATION |
IEEE80211_CCK_MODULATION;


Re: tcp_slow_start_after_idle

2006-06-14 Thread David Miller

From: Rick Jones [EMAIL PROTECTED]
Date: Wed, 14 Jun 2006 09:46:58 -0700

> Also, does the congestion window time out or does it decay?

The modification made to the cwnd is indeed a decay function,
but the event is a time out, and it is also termed a restart
in other writings and contexts.

I think it's all the same. :)

> > +/* By default, RFC2861 behavior.  */
> > +int sysctl_tcp_slow_start_after_idle = 1;
> > +

> Is this a candidate for readmostly?

All the networking sysctls are, we should do a sweep over them at some
point.

Thanks for reminding me.
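
The "decay triggered by a timeout" behavior described above can be sketched
as a small stand-alone function. This is an illustration in the spirit of
RFC 2861, not the kernel's implementation: the function name
cwnd_after_idle(), its parameters, and the millisecond units are
assumptions made for the example. The window is halved once for every RTO
of idle time beyond the first, and never drops below the restart window.

```c
#include <stdint.h>

/*
 * Hypothetical helper: congestion window to use after the connection
 * has been idle for idle_ms, given the current RTO.  When the knob
 * slow_start_after_idle is 0 the window is left untouched.
 */
static uint32_t cwnd_after_idle(uint32_t cwnd, uint32_t restart_cwnd,
				int64_t idle_ms, int64_t rto_ms,
				int slow_start_after_idle)
{
	if (!slow_start_after_idle || rto_ms <= 0)
		return cwnd;

	/* Never restart above the window we already had. */
	if (restart_cwnd > cwnd)
		restart_cwnd = cwnd;

	/* Halve once per elapsed RTO, stopping at the restart window. */
	while ((idle_ms -= rto_ms) > 0 && cwnd > restart_cwnd)
		cwnd >>= 1;

	return cwnd > restart_cwnd ? cwnd : restart_cwnd;
}
```

With a window of 64 segments, a restart window of 4 and an RTO of 200 ms,
an idle period of 500 ms gives 16 in this model, while an idle period
shorter than one RTO leaves the window at 64.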

Re: [patch] ipv4: fix lock usage in udp_ioctl

2006-06-14 Thread Herbert Xu

Heiko Carstens [EMAIL PROTECTED] wrote:
 
> As reported by the lock validator:
>
>
> [ BUG: illegal lock usage! ]
>
> illegal {in-hardirq-W} -> {hardirq-on-W} usage.
> syslogd/739 [HC0[0]:SC0[1]:HE1:SE0] takes:
> (list->lock){++..}, at: [002e36d6] udp_ioctl+0x96/0x100
> {in-hardirq-W} state was registered at:
>  [00062128] lock_acquire+0x9c/0xc0
>  [0036209e] _spin_lock_irqsave+0x66/0x84
>  [002912ce] skb_dequeue+0x32/0xb0
>  [00263160] qeth_qdio_output_handler+0x3e8/0xf8c
>  [00219fdc] tiqdio_thinint_handler+0xde0/0x2234
>  [0020448c] do_adapter_IO+0x5c/0xa8
>  [0020842c] do_IRQ+0x13c/0x18c
>  [000208a2] io_no_vtime+0x16/0x1c
>  [0001978c] cpu_idle+0x1d0/0x20c

This is bogus.  These two locks belong to two different queues and they
never intersect.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCH 1/2] e1000: fix netpoll with NAPI

2006-06-14 Thread Mitch Williams



On Wed, 14 Jun 2006, Neil Horman wrote:

> Hey, as promised, I've done some rudimentary performance benchmarking on
> various ways that we have talked about to solve this problem.  As I
> previously mentioned

We see the same results here, Neil.  However, we have a much less
invasive patch undergoing internal review, which we will post to
netdev once everybody is happy with it.  Basically, we just do our NAPI
scheduling on the real netdev structure instead of our polling netdev
in the case where we only have one RX queue.  Since this is the case for
all our currently shipping parts under Linux, netpoll works again across
the board.  It's a short-term fix because we do want to support multiple
queues going forward, but for now we need to get everybody working.

One of our engineers (on the I/O AT team) has been tasked with modifying
the Linux kernel to properly support multiple hardware queues (both TX and
RX).  We'll make sure that he looks at the netpoll interface as part of
that process.

Stay tuned for our impending patch.
-Mitch


Re: [PATCH 2/2] NET: Accurate packet scheduling for ATM/ADSL (userspace)

2006-06-14 Thread Russell Stuart

On Wed, 2006-06-14 at 11:57 +0100, Alan Cox wrote:
> The other problem I see with this code is it is very tightly tied to ATM
> cell sizes, not to solving the generic question of packetisation.

Others have made this point also.  I can't speak for Jesper,
but I did consider making it generic.  The issue was that 
doing so would add more code, but I don't personally know 
of any real world situation that would use the generic 
solution.  I didn't fancy the thought of arguing on these
lists for code that no one would actually use.

If someone could put up their hand and say "Hey, I need
this", then expanding the patch to accommodate them would
be a pleasure.  I like generic code too.


Russell
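
To make the "ATM-specific versus generic" trade-off above concrete, here is
a small sketch of the arithmetic involved. It is not taken from the patch:
the function names, the encap_overhead parameter and the decision to count
the 8-byte AAL5 trailer are assumptions for illustration. The only fixed
facts used are that an ATM cell is 53 bytes on the wire and carries 48
bytes of payload.

```c
#include <stddef.h>

enum {
	SKETCH_ATM_CELL_SIZE    = 53, /* bytes on the wire per cell */
	SKETCH_ATM_CELL_PAYLOAD = 48, /* payload bytes per cell */
	SKETCH_AAL5_TRAILER     = 8   /* assumed AAL5 trailer length */
};

/*
 * Generic form: bytes on the wire when len bytes are carried in
 * fixed-size units of unit_payload bytes that each cost unit_size.
 */
static size_t packetised_bytes(size_t len, size_t unit_payload,
			       size_t unit_size)
{
	size_t units = (len + unit_payload - 1) / unit_payload;

	return units * unit_size;
}

/* ATM/AAL5 special case built on top of the generic form. */
static size_t atm_wire_bytes(size_t pkt_len, size_t encap_overhead)
{
	return packetised_bytes(pkt_len + encap_overhead + SKETCH_AAL5_TRAILER,
				SKETCH_ATM_CELL_PAYLOAD, SKETCH_ATM_CELL_SIZE);
}
```

Under these assumptions a 40-byte packet with no extra encapsulation fits
in one cell (53 bytes on the wire), while a 41-byte packet needs two cells
(106 bytes). That step is why a scheduler that only counts packet bytes
misjudges the link rate on ADSL.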


Re: [PATCH] Make in-kernel hostap less annoying

2006-06-14 Thread Jouni Malinen

On Mon, Jun 12, 2006 at 03:13:02PM -0400, Kyle McMartin wrote:

> Most user don't want their kern.log/dmesg filled with
> debugging gibberish, and could turn it on if prompted.
>
> ( Example:
> wifi0: TXEXC - status=0x0004 ([Discon]) tx_control=000c
> retry_count=0 tx_rate=0 fc=0x0108 (Data::0 ToDS)
> A1=00:0f:66:43:d7:0a A2=00:05:3c:06:63:01 A3=33:33:00:00:00:16
> A4=00:00:00:00:00:00 )

I agree with removing these by default. However, I would prefer to do
this in a more selective manner than disabling all debugging information
at build time. This would probably involve going through all debug
messages using this mechanism, deciding for each whether it is reasonable
to enable by default, and ideally making this a run-time option.

> Also make hostap default to managed mode, instead of master mode, which
> has bitten a few users expecting it to behave like the orinoco driver
> it is replacing.

NAK. Host AP has been configured to use master mode by default for the
past six years and that is what most users would expect it to continue
to do. I do understand that this default differs from all drivers that
do not support AP mode, but I think it is too late to change this now.
The default could change once Host AP gets replaced with
net/d80211-based implementation for Prism2/2.5/3, but I would not change
this for Host AP driver.

> Two minor things I've been carrying around in my personal tree
> for quite some time. (This is only relevant to the in-kernel driver,
> I see no reason to change the out-of-tree driver.)

That would make the default mode even more confusing. I believe that
both versions should continue to use Master mode as the default unless
overridden by the user.

-- 
Jouni Malinen                                            PGP id EFC895FA
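
The "selective, run-time" approach suggested in this reply could look
roughly like the sketch below. It is a user-space illustration only, not
Host AP code: the class names DBG_TXEXC and DBG_ASSOC, the debug_mask
variable and the dbg_log() helper are all hypothetical. Each message
belongs to a class, and a mask that can be changed at run time decides
which classes are printed.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical message classes; a real driver would define its own. */
enum {
	DBG_TXEXC = 1u << 0, /* verbose TX exception dumps, off by default */
	DBG_ASSOC = 1u << 1  /* association events, on by default */
};

/* Run-time knob; in a driver this might be a module parameter. */
static unsigned int debug_mask = DBG_ASSOC;

/* Print only if the message class is enabled; returns chars written. */
static int dbg_log(unsigned int cls, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (!(debug_mask & cls))
		return 0;

	va_start(ap, fmt);
	n = vfprintf(stderr, fmt, ap);
	va_end(ap);
	return n;
}
```

With the default mask, a call such as dbg_log(DBG_TXEXC, ...) is silent,
and setting the DBG_TXEXC bit at run time turns the TXEXC dumps back on
without rebuilding.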

[patch] e100 statistic value rx_bytes error

2006-06-14 Thread Wei Dong

Hi All:
   While testing the Linux kernel (2.6.9-16), I found what may be a bug
in the e100 driver. See function e100_rx_indicate() at line 1847:
nic->net_stats.rx_bytes += actual_size;
Here, actual_size is the actual size of an Ethernet frame sans FCS, and
the e100 driver gets it from the skb. Because rx_bytes is a statistic
value for a NIC, I think rx_bytes should include the FCS (4 bytes).
The following is the patch for the function in e100.c:

diff -ruN old/drivers/net/e100.c new/drivers/net/e100.c
--- old/drivers/net/e100.c  2006-03-20 13:53:29.0 +0800
+++ new/drivers/net/e100.c  2006-06-15 11:16:04.0 +0800
@@ -1844,7 +1844,8 @@
 		dev_kfree_skb_any(skb);
 	} else {
 		nic->net_stats.rx_packets++;
-		nic->net_stats.rx_bytes += actual_size;
+		/* Don't forget FCS */
+		nic->net_stats.rx_bytes += actual_size + 4;
 		nic->netdev->last_rx = jiffies;
 		netif_receive_skb(skb);
 		if(work_done)

BR.
  Weidong

Signed-off-by: Weidong [EMAIL PROTECTED]


Re: [patch] e100 statistic value rx_bytes error

2006-06-14 Thread Auke Kok


Wei Dong wrote:

> Hi All:
>    When I test linux kernel(2.6.9-16), I found that maybe there is a bug
> in e100 driver. See function e100_rx_indicate() at line 1847:
> nic->net_stats.rx_bytes += actual_size;
> Here, actual_size is the actual size of an ethernent frame sans FCS.And
> the e100 driver gets it from skb. Because rx_bytes is a statistc value
> for a NIC, I think rx_bytes should include the FCS(4 bytes).
> The following is the patch for the function in e100.c


This is definitely not an issue, and I'm not in favor of changing it: it
has always been like this. Many drivers do it this way, mostly those
without real hardware counters (I count half a dozen or so at first
glance).

On top of that, we would be changing statistics numbers after x years of
the e100 driver. I'm sure everyone doing real performance work would
frown upon this.

Also, it's unlikely that every driver (or worse, every NIC in hardware)
accounts for the FCS in the rx_bytes count. It really wouldn't surprise
me if a driver (or chip) got this wrong here or there.

Bottom line: for e100, it's well known and easy to see that the driver
is counting skb sizes. That's consistent and I think we should keep it
that way.


Auke


PS please cc the driver maintainers when you post patches to a specific driver.
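
The size of the disagreement is easy to quantify with a toy counter. The
sketch below is not e100 code: struct rx_stats_sketch, rx_account() and the
include_fcs switch are hypothetical, and the only fact relied on is that
the Ethernet FCS is 4 bytes. It shows the two accounting conventions side
by side.

```c
#include <stdint.h>

enum { SKETCH_ETH_FCS_LEN = 4 }; /* Ethernet frame check sequence, bytes */

/* Hypothetical stand-in for the receive counters being discussed. */
struct rx_stats_sketch {
	uint64_t rx_packets;
	uint64_t rx_bytes;
};

/* Account one received frame, with or without the FCS in rx_bytes. */
static void rx_account(struct rx_stats_sketch *st, uint32_t skb_len,
		       int include_fcs)
{
	st->rx_packets++;
	st->rx_bytes += skb_len + (include_fcs ? SKETCH_ETH_FCS_LEN : 0);
}
```

For two frames of 60 and 1514 bytes, counting skb sizes gives 1574 bytes
and counting the FCS gives 1582. The difference is always 4 bytes per
packet, so either convention can be converted to the other from
rx_packets as long as a driver applies one of them consistently.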



> diff -ruN old/drivers/net/e100.c new/drivers/net/e100.c
> --- old/drivers/net/e100.c  2006-03-20 13:53:29.0 +0800
> +++ new/drivers/net/e100.c  2006-06-15 11:16:04.0 +0800
> @@ -1844,7 +1844,8 @@
> 		dev_kfree_skb_any(skb);
> 	} else {
> 		nic->net_stats.rx_packets++;
> -		nic->net_stats.rx_bytes += actual_size;
> +		/* Don't forget FCS */
> +		nic->net_stats.rx_bytes += actual_size + 4;
> 		nic->netdev->last_rx = jiffies;
> 		netif_receive_skb(skb);
> 		if(work_done)
>
> BR.
>   Weidong
>
> Signed-off-by: Weidong [EMAIL PROTECTED]


Re: [PATCH 2.6.17-rc6] Remove Prism II support from Orinoco

2006-06-14 Thread Pavel Roskin

Hello, John!

On Mon, 2006-06-12 at 11:24 -0400, John W. Linville wrote:
> On Mon, Jun 12, 2006 at 01:49:54AM +0300, Faidon Liambotis wrote:
>
> > Having two drivers supporting the same set of hardware seems pretty
> > pointless to me. Plus, it confuses hotplugging/automatic detection.
>
> This subject comes-up from time to time.  In fact, I'm pretty sure
> it came-up very recently w.r.t. orinoco and hostap.
>
> The consensus seems to be that drivers should have IDs for all devices
> they support, even if that means that some devices are supported by
> multiple drivers.  This leaves the choice of which driver to use in
> the hands of the user and/or distro.
>
> If the Orinoco guys want this patch, I'll consider it.  Otherwise,
> I'm not inclined to take it.

I really appreciate your position in this regard.

The patch in question was never submitted to the orinoco mailing list.
I believe any such changes should be discussed by people using the
driver and participating in its development.  It's not some minor change
or API update.

I'm ready to consider disabling some IDs conditionally, primarily for
systems that cannot use udev.  But it's far from the top of my TODO
list.  And I'm not sure it would actually help users of desktop
distributions.

-- 
Regards,
Pavel Roskin


Re: [PATCH 2.6.17-rc6] Remove Prism II support from Orinoco

2006-06-14 Thread Pavel Roskin

On Mon, 2006-06-12 at 17:10 -0700, Jesse Brandeburg wrote:
> my problem is that for my prism 2 adapter both drivers are loaded at
> which point neither of them works.  I'm running FC5, and i have to
> keep removing the orinoco*.ko files to keep them from loading, so I'm
> all for this patch.

I believe the right solution would be to do it in userspace.  The kernel
should not be deciding which driver is _better_ for the device.
I have yet to see any serious arguments why the kernel should be doing it.

As for the non-working drivers, this should be reported with sufficient
details.  I haven't seen any detailed reports of this problem.

-- 
Regards,
Pavel Roskin


Re: [patch] ipv4: fix lock usage in udp_ioctl

2006-06-14 Thread Ingo Molnar


* Herbert Xu [EMAIL PROTECTED] wrote:

> This is bogus.  These two locks belong to two different queues and
> they never intersect.

yeah - qeth does its own skb-queue management here, and it's done in an 
irq-safe manner.

Heiko, in qeth_main.c, could you do something like:

+ static struct lockdep_type_key qdio_out_skb_queue_key;

...
skb_queue_head_init(&card->qdio.out_qs[i]->bufs[j].
 skb_list);
+   lockdep_reinit_key(&card->qdio.out_qs[i]->bufs[j].skb_list,
   &qdio_out_skb_queue_key)

Ingo