Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Herbert Xu
Chris Snook [EMAIL PROTECTED] wrote:
 
 Because atomic operations are generally used for synchronization, which 
 requires 
 volatile behavior.  Most such codepaths currently use an inefficient 
 barrier(). 
  Some forget to and we get bugs, because people assume that atomic_read() 
 actually reads something, and atomic_write() actually writes something.  
 Worse, 
 these are architecture-specific, even compiler version-specific bugs that are 
 often difficult to track down.

I'm yet to see a single example from the current tree where
this patch series is the correct solution.  So far the only
example has been a buggy piece of code which has since been
fixed with a cpu_relax.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Realtek r8168 slow outbound transfer - potential fix/workaround

2007-08-15 Thread Bruce Cole

Francois Romieu wrote:

Bruce Cole [EMAIL PROTECTED] :
[...]
  
What's the status of this fix?  It (or something more refined) seems 
necessary to correct the current performance problems with this driver.



An explanation or something more refined would be welcome.

  
Are you indicating that you need more information on how to reproduce 
the problem?  I believe the problem is 100% reproducible if you 
configure a samba server with a realtek r8111 interface and dragdrop a 
file from that server to a windows host on the directly connected lan.  
This with both machine's host interfaces reporting 1000mbps full duplex 
and a gig-e hub between.


I've just reproduced using a linux client as well, so it would be easy 
to compare tcpdumps of the TCP sessions from both ends.  Note the 
abysmal performance when I transfer *from* the server, but the 
performance is fine when I transfer *to* the server:


 zpad(root): mount -t cifs -o username=cole //192.168.1.11/u4 /mnt1
 Password:
 zpad(root): cd /mnt1/test
 zpad(root): ls -l t
 -rw-r--r-- 1 cole cole 52428800 2007-08-14 23:52 t
 zpad(root): date; cp -pr t ~/test/; date
 Tue Aug 14 11:57:35 PM
 Wed Aug 15 12:01:11 AM
 zpad(root): ls -l ~/test/t
 -rw-r--r-- 1 cole cole 50M 2007-08-14 23:52 /home/cole/test/t
 zpad(root): date; cp -pr ~/test/t tt; date
 Wed Aug 15 12:02:11 AM
 cp: setting permissions for `tt': Permission denied
 Wed Aug 15 12:02:13 AM
 zpad(root): ls -l tt
 -rwx-- 1 cole cole 52428800 2007-08-14 23:52 tt*
 zpad(root):

This test is with the vanilla 2.6.23-rc3 driver with NAPI enabled.


If you're hoping to receive a better fix than the spin wait, I could 
probably help with that too but I'd need to see realtek's programming 
documentation for this chip.



[...]
  

I can troubleshoot in more detail if that would help get a proper fix
developed.



Can you try 2.6.23-rc3 with NAPI enabled and send a complete dmesg +
/proc/interrupts + ethtool -S output ?
  

Sure, see below:
dmesg:
Linux version 2.6.23-0.104.rc3.fc8 
([EMAIL PROTECTED]) (gcc version 4.1.2 20070723 
(Red Hat 4.1.2-17)) #1 SMP Mon Aug 13 15:58:43 EDT 2007

Command line: ro root=/dev/sda6 rhgb quiet
BIOS-provided physical RAM map:
BIOS-e820:  - 0009e800 (usable)
BIOS-e820: 0009f800 - 000a (reserved)
BIOS-e820: 000f - 0010 (reserved)
BIOS-e820: 0010 - 7fee (usable)
BIOS-e820: 7fee - 7fee3000 (ACPI NVS)
BIOS-e820: 7fee3000 - 7fef (ACPI data)
BIOS-e820: 7fef - 7ff0 (reserved)
BIOS-e820: f000 - f400 (reserved)
BIOS-e820: fec0 - 0001 (reserved)
Entering add_active_range(0, 0, 158) 0 entries of 3200 used
Entering add_active_range(0, 256, 524000) 1 entries of 3200 used
end_pfn_map = 1048576
DMI 2.4 present.
ACPI: RSDP 000F6FE0, 0014 (r0 GBT   )
ACPI: RSDT 7FEE3040, 003C (r1 GBTGBTUACPI 42302E31 GBTU  1010101)
ACPI: FACP 7FEE30C0, 0074 (r1 GBTGBTUACPI 42302E31 GBTU  1010101)
ACPI: DSDT 7FEE3180, 4B36 (r1 GBTGBTUACPI 1000 MSFT  10C)
ACPI: FACS 7FEE, 0040
ACPI: HPET 7FEE7E00, 0038 (r1 GBTGBTUACPI 42302E31 GBTU   98)
ACPI: MCFG 7FEE7E80, 003C (r1 GBTGBTUACPI 42302E31 GBTU  1010101)
ACPI: APIC 7FEE7D00, 0084 (r1 GBTGBTUACPI 42302E31 GBTU  1010101)
ACPI: SSDT 7FEE7F00, 015C (r1  PmRef  Cpu0Ist 3000 INTL 20040311)
ACPI: SSDT 7FEE8390, 0275 (r1  PmRefCpuPm 3000 INTL 20040311)
No NUMA configuration found
Faking a node at -7fee
Entering add_active_range(0, 0, 158) 0 entries of 3200 used
Entering add_active_range(0, 256, 524000) 1 entries of 3200 used
Bootmem setup node 0 -7fee
Zone PFN ranges:
 DMA 0 - 4096
 DMA324096 -  1048576
 Normal1048576 -  1048576
Movable zone start PFN for each node
early_node_map[2] active PFN ranges
   0:0 -  158
   0:  256 -   524000
On node 0 totalpages: 523902
 DMA zone: 96 pages used for memmap
 DMA zone: 2488 pages reserved
 DMA zone: 1414 pages, LIFO batch:0
 DMA32 zone: 12185 pages used for memmap
 DMA32 zone: 507719 pages, LIFO batch:31
 Normal zone: 0 pages used for memmap
 Movable zone: 0 pages used for memmap
ACPI: PM-Timer IO Port: 0x408
ACPI: Local APIC address 0xfee0
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] disabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] disabled)
ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec0] gsi_base[0])
IOAPIC[0]: apic_id 2, address 0xfec0, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Heiko Carstens
On Wed, Aug 15, 2007 at 02:49:03PM +0800, Herbert Xu wrote:
 Chris Snook [EMAIL PROTECTED] wrote:
  
  Because atomic operations are generally used for synchronization, which 
  requires 
  volatile behavior.  Most such codepaths currently use an inefficient 
  barrier(). 
   Some forget to and we get bugs, because people assume that atomic_read() 
  actually reads something, and atomic_write() actually writes something.  
  Worse, 
  these are architecture-specific, even compiler version-specific bugs that 
  are 
  often difficult to track down.
 
 I'm yet to see a single example from the current tree where
 this patch series is the correct solution.  So far the only
 example has been a buggy piece of code which has since been
 fixed with a cpu_relax.

Btw.: we still have

include/asm-i386/mach-es7000/mach_wakecpu.h:  while (!atomic_read(deassert));
include/asm-i386/mach-default/mach_wakecpu.h: while (!atomic_read(deassert));

Looks like they need to be fixed as well.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [70/2many] MAINTAINERS - ARPD SUPPORT

2007-08-15 Thread Alan Cox
On Wed, 15 Aug 2007 10:10:38 +0200
Stefan Richter [EMAIL PROTECTED] wrote:

 Alan Cox wrote:
   Actually I think the entire thing is a
   bad idea but thats another matter.
 
  Working off the git tree as it shows who actually is making
  changes/updating stuff recently and why which is a major clue when
  tracing bugs
 
 Adrian Bunk wrote in http://lkml.org/lkml/2007/6/30/107 :
 | I don't see how git could tell you to Cc net patches to
 | [EMAIL PROTECTED]
 
 Ditto with problem reports.
 
 For example, Cc'ing linux1394-devel on FireWire topics is a lot more
 important than Cc'ing me.

Tools question. And even if you need to keep a simple set of mappings by
hand for mailing list/files thats a fairly small set of patterns
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [70/2many] MAINTAINERS - ARPD SUPPORT

2007-08-15 Thread Stefan Richter
Alan Cox wrote:
  Actually I think the entire thing is a
  bad idea but thats another matter.

 Working off the git tree as it shows who actually is making
 changes/updating stuff recently and why which is a major clue when
 tracing bugs

Adrian Bunk wrote in http://lkml.org/lkml/2007/6/30/107 :
| I don't see how git could tell you to Cc net patches to
| [EMAIL PROTECTED]

Ditto with problem reports.

For example, Cc'ing linux1394-devel on FireWire topics is a lot more
important than Cc'ing me.
-- 
Stefan Richter
-=-=-=== =--- -
http://arcgraph.de/sr/
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: linux kernel 2.6.18-20 bug: rcu_read_unlock in __sock_create

2007-08-15 Thread Herbert Xu
oleg 123 [EMAIL PROTECTED] wrote:

 There is a bug in __sock_create() function (net/socket.c).
 If an error occur in LSM hook security_socket_post_create, then unneeded
 rcu_read_unlock() is made in __sock_create.

Well spotted! Please cc netdev@vger.kernel.org for networking
issues.

[NET]: Fix unbalanced rcu_read_unlock in __sock_create

The recent RCU work created an unbalanced rcu_read_unlock
in __sock_create.  This patch fixes that.  Reported by
oleg 123.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/socket.c b/net/socket.c
index ec07703..7d44453 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1168,7 +1168,7 @@ static int __sock_create(int family, int type, int 
protocol,
module_put(pf-owner);
err = security_socket_post_create(sock, family, type, protocol, kern);
if (err)
-   goto out_release;
+   goto out_sock_release;
*res = sock;
 
return 0;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Stefan Richter
Satyam Sharma wrote:
 [ BTW, why do we want the compiler to not optimize atomic_read()'s in
   the first place? Atomic ops guarantee atomicity, which has nothing
   to do with volatility -- users that expect volatility from
   atomic ops are the ones who must be fixed instead, IMHO. ]

LDD3 says on page 125:  The following operations are defined for the
type [atomic_t] and are guaranteed to be atomic with respect to all
processors of an SMP computer.

Doesn't atomic WRT all processors require volatility?
-- 
Stefan Richter
-=-=-=== =--- -
http://arcgraph.de/sr/
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] nl80211: remove VLAN type

2007-08-15 Thread Johannes Berg
There is no point in trying to handle source MAC address based VLANs in
the wireless stack, something like smacvlan modeled after macvlan
should be sufficient.

Signed-off-by: Johannes Berg [EMAIL PROTECTED]

---
We could just not keep the hole as well since people can't really be
using that header yet, but the version we have in previous kernels could
still be shipped somewhere so I think it should be kept compatible.

 include/linux/nl80211.h |   19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

--- wireless-dev.orig/include/linux/nl80211.h   2007-08-15 00:43:06.290200043 
+0200
+++ wireless-dev/include/linux/nl80211.h2007-08-15 00:45:51.860200043 
+0200
@@ -207,7 +207,6 @@ enum nl80211_attrs {
  * @NL80211_IFTYPE_ADHOC: independent BSS member
  * @NL80211_IFTYPE_STATION: managed BSS member
  * @NL80211_IFTYPE_AP: access point
- * @NL80211_IFTYPE_AP_VLAN: VLAN interface for access points
  * @NL80211_IFTYPE_WDS: wireless distribution interface
  * @NL80211_IFTYPE_MONITOR: monitor interface receiving all frames
  * @__NL80211_IFTYPE_AFTER_LAST: internal use
@@ -217,13 +216,17 @@ enum nl80211_attrs {
  *
  */
 enum nl80211_iftype {
-   NL80211_IFTYPE_UNSPECIFIED,
-   NL80211_IFTYPE_ADHOC,
-   NL80211_IFTYPE_STATION,
-   NL80211_IFTYPE_AP,
-   NL80211_IFTYPE_AP_VLAN,
-   NL80211_IFTYPE_WDS,
-   NL80211_IFTYPE_MONITOR,
+   NL80211_IFTYPE_UNSPECIFIED = 0,
+   NL80211_IFTYPE_ADHOC = 1,
+   NL80211_IFTYPE_STATION = 2,
+   NL80211_IFTYPE_AP = 3,
+   /*
+* there was AP_VLAN here but it was never used
+* keep the number unallocated for the benefit
+* of those people using old headers
+*/
+   NL80211_IFTYPE_WDS = 5,
+   NL80211_IFTYPE_MONITOR = 6,
 
/* keep last */
__NL80211_IFTYPE_AFTER_LAST


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [2.6 patch] remove Documentation/networking/net-modules.txt

2007-08-15 Thread Geert Uytterhoeven
On Tue, 14 Aug 2007, Adrian Bunk wrote:
 According to git, the only one who touched this file during the last
 5 years was me when removing drivers...
 
 modinfo offers less ancient information.
 
 Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

For the m68k net drivers:

Acked-by: Geert Uytterhoeven [EMAIL PROTECTED]

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say programmer or something like that.
-- Linus Torvalds
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[METH] Don't use GFP_DMA for zone allocation.

2007-08-15 Thread Ralf Baechle
IP32 doesn't even have a ZONE_DMA so no point in using GFP_DMA in any
IP32-specific device driver.

Signed-off-by: Ralf Baechle [EMAIL PROTECTED]

diff --git a/drivers/net/meth.c b/drivers/net/meth.c
index 92b403b..32bed6b 100644
--- a/drivers/net/meth.c
+++ b/drivers/net/meth.c
@@ -405,7 +405,7 @@ static void meth_rx(struct net_device* dev, unsigned long 
int_status)
priv-stats.rx_length_errors++;
skb = priv-rx_skbs[priv-rx_write];
} else {
-   skb = alloc_skb(METH_RX_BUFF_SIZE, GFP_ATOMIC | 
GFP_DMA);
+   skb = alloc_skb(METH_RX_BUFF_SIZE, GFP_ATOMIC);
if (!skb) {
/* Ouch! No memory! Drop packet on the 
floor */
DPRINTK(No mem: dropping packet\n);
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Herbert Xu
On Wed, Aug 15, 2007 at 12:35:31PM +0200, Stefan Richter wrote:
 
 LDD3 says on page 125:  The following operations are defined for the
 type [atomic_t] and are guaranteed to be atomic with respect to all
 processors of an SMP computer.
 
 Doesn't atomic WRT all processors require volatility?

Not at all.  We also require this to be atomic without any
hint of volatility.

extern int foo;
foo = 4;

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma


On Wed, 15 Aug 2007, Stefan Richter wrote:

 Satyam Sharma wrote:
  [ BTW, why do we want the compiler to not optimize atomic_read()'s in
the first place? Atomic ops guarantee atomicity, which has nothing
to do with volatility -- users that expect volatility from
atomic ops are the ones who must be fixed instead, IMHO. ]
 
 LDD3 says on page 125:  The following operations are defined for the
 type [atomic_t] and are guaranteed to be atomic with respect to all
 processors of an SMP computer.
 
 Doesn't atomic WRT all processors require volatility?

No, it definitely doesn't. Why should it?

Atomic w.r.t. all processors is just your normal, simple atomicity
for SMP systems (ensure that that object is modified / set / replaced
in main memory atomically) and has nothing to do with volatile
behaviour.

Volatile behaviour itself isn't consistently defined (at least
definitely not consistently implemented in various gcc versions across
platforms), but it is /expected/ to mean something like: ensure that
every such access actually goes all the way to memory, and is not
re-ordered w.r.t. to other accesses, as far as the compiler can take
care of these. The last as far as compiler can take care disclaimer
comes about due to CPUs doing their own re-ordering nowadays.

For example (say on i386):

(A)
$ cat tp1.c
int a;

void func(void)
{
a = 10;
a = 20;
}
$ gcc -Os -S tp1.c
$ cat tp1.s
...
movl$20, a
...

(B)
$ cat tp2.c
volatile int a;

void func(void)
{
a = 10;
a = 20;
}
$ gcc -Os -S tp2.c
$ cat tp2.s
...
movl$10, a
movl$20, a
...

(C)
$ cat tp3.c
int a;

void func(void)
{
*(volatile int *)a = 10;
*(volatile int *)a = 20;
}
$ gcc -Os -S tp3.c
$ cat tp3.s
...
movl$10, a
movl$20, a
...

In (A) the compiler optimized a = 10; away, but the actual store
of the final value 20 to a was still atomic. (B) and (C) also
exhibit volatile behaviour apart from the atomicity.

But as others replied, it seems some callers out there depend upon
atomic ops exhibiting volatile behaviour as well, so that answers
my initial question, actually. I haven't looked at the code Paul
pointed me at, but I wonder if that forget(x) macro would help
those cases. I'd wish to avoid the volatile primitive, personally.


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


set_multicast_list vs. set_rx_mode

2007-08-15 Thread Johannes Berg
Hey,

Is it intentional that in the case where set_rx_mode is assigned, you
still need to assign set_multicast_list even if it won't ever be called
as a flag for SIOCADDMULTI?

I was thinking of converting the wireless code to use set_rx_mode and
assign set_multicast_list only if the underlying hardware supports
multicast filtering, and it seems that is well-supported, but it does
seem a bit weird that set_multicast_list degrades to a flag.

johannes


signature.asc
Description: This is a digitally signed message part


Re: set_multicast_list vs. set_rx_mode

2007-08-15 Thread Patrick McHardy
Johannes Berg wrote:
 Is it intentional that in the case where set_rx_mode is assigned, you
 still need to assign set_multicast_list even if it won't ever be called
 as a flag for SIOCADDMULTI?
 
 I was thinking of converting the wireless code to use set_rx_mode and
 assign set_multicast_list only if the underlying hardware supports
 multicast filtering, and it seems that is well-supported, but it does
 seem a bit weird that set_multicast_list degrades to a flag.


Indeed, I missed that. It should check for !dev-set_multicast_list 
!dev-set_rx_mode before returning -EINVAL.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: set_multicast_list vs. set_rx_mode

2007-08-15 Thread Johannes Berg
On Wed, 2007-08-15 at 14:33 +0200, Patrick McHardy wrote:
 Johannes Berg wrote:
  Is it intentional that in the case where set_rx_mode is assigned, you
  still need to assign set_multicast_list even if it won't ever be called
  as a flag for SIOCADDMULTI?
  
  I was thinking of converting the wireless code to use set_rx_mode and
  assign set_multicast_list only if the underlying hardware supports
  multicast filtering, and it seems that is well-supported, but it does
  seem a bit weird that set_multicast_list degrades to a flag.
 
 
 Indeed, I missed that. It should check for !dev-set_multicast_list 
 !dev-set_rx_mode before returning -EINVAL.

Ok. Want me to send a patch?

And then the expected behaviour is that set_rx_mode will set
IFF_ALLMULTI if it can't honour the list, right?

johannes


signature.asc
Description: This is a digitally signed message part


Re: [RFC: -mm patch] improve the SSB dependencies

2007-08-15 Thread Michael Buesch
On Wednesday 15 August 2007 02:40:35 John W. Linville wrote:
 On Mon, Aug 13, 2007 at 12:44:02AM +0200, Adrian Bunk wrote:
  On Sun, Aug 12, 2007 at 02:00:26PM +0200, Michael Buesch wrote:
   Ok, I'll give it a try, with small modifications.
  
  Thanks.
  
   On Sunday 12 August 2007, Adrian Bunk wrote:
Additional changes in this patch:
- small help text changes
- B44_PCI is no longer usr visible (automatically enabled when possible)
   
   I think we want that to be selectable, as it's not needed
   on some embedded devices. And we need to save memory there.
  ...
  
  Makes sense, but then:
  
  config B44_PCI
  bool Broadcom 440x PCI device support if EMBEDDED
  ...
  default y
  ...
  
  I don't care about how many options we present if CONFIG_EMBEDDED=y, but 
  for the normal CONFIG_EMBEDDED=n case we should not bother the user with 
  this option.
 
 Was all this resolved?  Was there another patch?  If so, I missed it...

I am going to send one, soon.

-- 
Greetings Michael.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: set_multicast_list vs. set_rx_mode

2007-08-15 Thread Johannes Berg
On Wed, 2007-08-15 at 14:33 +0200, Patrick McHardy wrote:
 Johannes Berg wrote:
  Is it intentional that in the case where set_rx_mode is assigned, you
  still need to assign set_multicast_list even if it won't ever be called
  as a flag for SIOCADDMULTI?
  
  I was thinking of converting the wireless code to use set_rx_mode and
  assign set_multicast_list only if the underlying hardware supports
  multicast filtering, and it seems that is well-supported, but it does
  seem a bit weird that set_multicast_list degrades to a flag.
 
 
 Indeed, I missed that. It should check for !dev-set_multicast_list 
 !dev-set_rx_mode before returning -EINVAL.

Hmm. We don't really support multiple unicast addresses so I should
probably not use set_rx_mode. What is the meaning of
dev-change_rx_flags? It seems to be called with IFF_ALLMULTI but if it
is assigned and set_multicast_list is not then you also cannot add
multicast addresses via SIOCADDMULTI.

johannes


signature.asc
Description: This is a digitally signed message part


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Stefan Richter
Satyam Sharma wrote:
 On Wed, 15 Aug 2007, Stefan Richter wrote:
 Doesn't atomic WRT all processors require volatility?
 
 No, it definitely doesn't. Why should it?
 
 Atomic w.r.t. all processors is just your normal, simple atomicity
 for SMP systems (ensure that that object is modified / set / replaced
 in main memory atomically) and has nothing to do with volatile
 behaviour.
 
 Volatile behaviour itself isn't consistently defined (at least
 definitely not consistently implemented in various gcc versions across
 platforms), but it is /expected/ to mean something like: ensure that
 every such access actually goes all the way to memory, and is not
 re-ordered w.r.t. to other accesses, as far as the compiler can take
 care of these. The last as far as compiler can take care disclaimer
 comes about due to CPUs doing their own re-ordering nowadays.
 
 For example (say on i386):

[...]

 In (A) the compiler optimized a = 10; away, but the actual store
 of the final value 20 to a was still atomic. (B) and (C) also
 exhibit volatile behaviour apart from the atomicity.
 
 But as others replied, it seems some callers out there depend upon
 atomic ops exhibiting volatile behaviour as well, so that answers
 my initial question, actually. I haven't looked at the code Paul
 pointed me at, but I wonder if that forget(x) macro would help
 those cases. I'd wish to avoid the volatile primitive, personally.

So, looking at load instead of store, understand I correctly that in
your opinion

int b;

b = atomic_read(a);
if (b)
do_something_time_consuming();

b = atomic_read(a);
if (b)
do_something_more();

should be changed to explicitly forget(a) after
do_something_time_consuming?

If so, how about the following:

static inline void A(atomic_t *a)
{
int b = atomic_read(a);
if (b)
do_something_time_consuming();
}

static inline void B(atomic_t *a)
{
int b = atomic_read(a);
if (b)
do_something_more();
}

static void C(atomic_t *a)
{
A(a);
B(b);
}

Would this need forget(a) after A(a)?

(Is the latter actually answered in C99 or is it compiler-dependent?)
-- 
Stefan Richter
-=-=-=== =--- -
http://arcgraph.de/sr/
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Stefan Richter
I wrote:
 static inline void A(atomic_t *a)
 {
   int b = atomic_read(a);
   if (b)
   do_something_time_consuming();
 }
 
 static inline void B(atomic_t *a)
 {
   int b = atomic_read(a);
   if (b)
   do_something_more();
 }
 
 static void C(atomic_t *a)
 {
   A(a);
   B(b);
/* ^ typo */
B(a);
 }
 
 Would this need forget(a) after A(a)?
 
 (Is the latter actually answered in C99 or is it compiler-dependent?)


-- 
Stefan Richter
-=-=-=== =--- -
http://arcgraph.de/sr/
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-15 Thread Nick Piggin

Paul E. McKenney wrote:

On Tue, Aug 14, 2007 at 03:34:25PM +1000, Nick Piggin wrote:



Maybe it is the safe way to go, but it does obscure cases where there
is a real need for barriers.



I prefer burying barriers into other primitives.


When they should naturally be there, eg. locking or the RCU primitives,
I agree.

I don't like having them scattered in various just in case places,
because it makes both the users and the APIs hard to understand and
change.

Especially since several big architectures don't have volatile in their
atomic_get and _set, I think it would be a step backwards to add them in
as a just in case thin now (unless there is a better reason).



Many atomic operations are allowed to be reordered between CPUs, so
I don't have a good idea for the rationale to order them within the
CPU (also loads and stores to long and ptr types are not ordered like
this, although we do consider those to be atomic operations too).

barrier() in a way is like enforcing sequential memory ordering
between process and interrupt context, wheras volatile is just
enforcing coherency of a single memory location (and as such is
cheaper).



barrier() is useful, but it has the very painful side-effect of forcing
the compiler to dump temporaries.  So we do need something that is
not quite so global in effect.


Yep.



What do you think of this crazy idea?

/* Enforce a compiler barrier for only operations to location X.
* Call multiple times to provide an ordering between multiple
* memory locations. Other memory operations can be assumed by
* the compiler to remain unchanged and may be reordered
*/
#define order(x) asm volatile( : +m (x))



There was something very similar discussed earlier in this thread,
with quite a bit of debate as to exactly what the m flag should
look like.  I suggested something similar named ACCESS_ONCE in the
context of RCU (http://lkml.org/lkml/2007/7/11/664):


Oh, I missed that earlier debate. Will go have a look.



#define ACCESS_ONCE(x) (*(volatile typeof(x) *)(x))

The nice thing about this is that it works for both loads and stores.
Not clear that order() above does this -- I get compiler errors when
I try something like b = order(a) or order(a) = 1 using gcc 4.1.2.


As Arnd ponted out, order() is not supposed to be an lvalue, but a
statement like the rest of our memory ordering API.

As to your followup question of why to use it over ACCESS_ONCE. I
guess, aside from consistency with the rest of the barrier APIs, you
can use it in other primitives when you don't actually know what the
caller is going to do or if it even will make an access. You could
also use it between calls to _other_ primitives, etc... it just
seems more flexible to me, but I haven't actually used such a thing
in real code...

ACCESS_ONCE doesn't seem as descriptive. What it results in is the
memory location being loaded or stored (presumably once exactly),
but I think the more general underlying idea is a barrier point.

--
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] b44: Fix MII interface of the B44 driver

2007-08-15 Thread Michael Buesch
From: Aurelien Jarno [EMAIL PROTECTED]

The conversion of the B44 driver to SSB bus added the functions
__b44_{read,write}phy functions that have an argument phy_id. This gives
a way to fix the b44_mii_{read,write} functions used for MII interfaces.

The patch below does that. It allows B44 devices with integrated switch
to be configured from userland.

Signed-off-by: Aurelien Jarno [EMAIL PROTECTED]
Signed-off-by: Michael Buesch [EMAIL PROTECTED]

diff --git a/drivers/net/b44.c b/drivers/net/b44.c
index 36724fb..e6883f4 100644
--- a/drivers/net/b44.c
+++ b/drivers/net/b44.c
@@ -300,16 +300,11 @@ static inline int b44_writephy(struct b44 *bp, int reg, 
u32 val)
 }
 
 /* miilib interface */
-/* FIXME FIXME: phy_id is ignored, bp-phy_addr use is unconditional
- * due to code existing before miilib use was added to this driver.
- * Someone should remove this artificial driver limitation in
- * b44_{read,write}phy.  bp-phy_addr itself is fine (and needed).
- */
 static int b44_mii_read(struct net_device *dev, int phy_id, int location)
 {
u32 val;
struct b44 *bp = netdev_priv(dev);
-   int rc = b44_readphy(bp, location, val);
+   int rc = __b44_readphy(bp, phy_id, location, val);
if (rc)
return 0x;
return val;
@@ -319,7 +314,7 @@ static void b44_mii_write(struct net_device *dev, int 
phy_id, int location,
 int val)
 {
struct b44 *bp = netdev_priv(dev);
-   b44_writephy(bp, location, val);
+   __b44_writephy(bp, phy_id, location, val);
 }
 
 static int b44_phy_reset(struct b44 *bp)


-- 
Greetings Michael.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


b44: Fix Kconfig dependencies by using select

2007-08-15 Thread Michael Buesch
Signed-off-by: Michael Buesch [EMAIL PROTECTED]

Index: wireless-dev-new/drivers/net/Kconfig
===
--- wireless-dev-new.orig/drivers/net/Kconfig   2007-08-11 00:49:28.0 
+0200
+++ wireless-dev-new/drivers/net/Kconfig2007-08-15 15:29:36.0 
+0200
@@ -1454,7 +1454,7 @@ config APRICOT
 
 config B44
tristate Broadcom 440x/47xx ethernet support
-   depends on HAS_IOMEM
+   depends on SSB_POSSIBLE
select SSB
select MII
help
@@ -1462,27 +1462,28 @@ config B44
  or M and read the Ethernet-HOWTO, available from
  http://www.tldp.org/docs.html#howto.
 
- If you have a Broadcom 440x PCI device (and if you don't
- know, you _do_ have one) you must also select the options
- EISA, VLB, PCI and on board controllers above and
- Broadcom 440x PCI device support below.
-
  To compile this driver as a module, choose M here and read
  file:Documentation/networking/net-modules.txt.  The module will be
  called b44.
 
-config B44_PCI
-   bool Broadcom 440x PCI device support
-   depends on B44  NET_PCI
+# Auto-select SSB PCI-HOST support, if possible
+config B44_PCI_AUTOSELECT
+   bool
+   depends on B44  SSB_PCIHOST_POSSIBLE
select SSB_PCIHOST
+   default y
+
+# Auto-select SSB PCICORE driver, if possible
+config B44_PCICORE_AUTOSELECT
+   bool
+   depends on B44  SSB_DRIVER_PCICORE_POSSIBLE
select SSB_DRIVER_PCICORE
default y
-   help
- Support for Broadcom 440x PCI devices.
 
- Say Y, unless you know what you are doing.
- If you say N here I will _not_ listen to your
- bugreports!
+config B44_PCI
+   bool
+   depends on B44_PCI_AUTOSELECT  B44_PCICORE_AUTOSELECT
+   default y
 
 config FORCEDETH
tristate nForce Ethernet support
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Stefan Richter
On 8/15/2007 10:18 AM, Heiko Carstens wrote:
 On Wed, Aug 15, 2007 at 02:49:03PM +0800, Herbert Xu wrote:
 Chris Snook [EMAIL PROTECTED] wrote:
  
  Because atomic operations are generally used for synchronization, which 
  requires 
  volatile behavior.  Most such codepaths currently use an inefficient 
  barrier(). 
   Some forget to and we get bugs, because people assume that atomic_read() 
  actually reads something, and atomic_write() actually writes something.  
  Worse, 
  these are architecture-specific, even compiler version-specific bugs that 
  are 
  often difficult to track down.
 
 I'm yet to see a single example from the current tree where
 this patch series is the correct solution.  So far the only
 example has been a buggy piece of code which has since been
 fixed with a cpu_relax.
 
 Btw.: we still have
 
 include/asm-i386/mach-es7000/mach_wakecpu.h:  while (!atomic_read(deassert));
 include/asm-i386/mach-default/mach_wakecpu.h: while (!atomic_read(deassert));
 
 Looks like they need to be fixed as well.


I don't know if this here is affected:

/* drivers/ieee1394/ieee1394_core.h */
static inline unsigned int get_hpsb_generation(struct hpsb_host *host)
{
return atomic_read(host-generation);
}

/* drivers/ieee1394/nodemgr.c */
static int nodemgr_host_thread(void *__hi)
{
[...]

for (;;) {
[... sleep until bus reset event ...]

/* Pause for 1/4 second in 1/16 second intervals,
 * to make sure things settle down. */
g = get_hpsb_generation(host);
for (i = 0; i  4 ; i++) {
if (msleep_interruptible(63) ||
kthread_should_stop())
goto exit;

/* Now get the generation in which the node ID's we collect
 * are valid.  During the bus scan we will use this generation
 * for the read transactions, so that if another reset occurs
 * during the scan the transactions will fail instead of
 * returning bogus data. */

generation = get_hpsb_generation(host);

/* If we get a reset before we are done waiting, then
 * start the waiting over again */

if (generation != g)
g = generation, i = 0;
}

[... scan bus, using generation ...]

}
exit:
[...]
}



-- 
Stefan Richter
-=-=-=== =--- -
http://arcgraph.de/sr/
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma


On Wed, 15 Aug 2007, Stefan Richter wrote:

 Satyam Sharma wrote:
  On Wed, 15 Aug 2007, Stefan Richter wrote:
  Doesn't atomic WRT all processors require volatility?
  
  No, it definitely doesn't. Why should it?
  
  Atomic w.r.t. all processors is just your normal, simple atomicity
  for SMP systems (ensure that that object is modified / set / replaced
  in main memory atomically) and has nothing to do with volatile
  behaviour.
  
  Volatile behaviour itself isn't consistently defined (at least
  definitely not consistently implemented in various gcc versions across
  platforms), but it is /expected/ to mean something like: ensure that
  every such access actually goes all the way to memory, and is not
  re-ordered w.r.t. to other accesses, as far as the compiler can take
  care of these. The last as far as compiler can take care disclaimer
  comes about due to CPUs doing their own re-ordering nowadays.
  
  For example (say on i386):
 
 [...]
 
  In (A) the compiler optimized a = 10; away, but the actual store
  of the final value 20 to a was still atomic. (B) and (C) also
  exhibit volatile behaviour apart from the atomicity.
  
  But as others replied, it seems some callers out there depend upon
  atomic ops exhibiting volatile behaviour as well, so that answers
  my initial question, actually. I haven't looked at the code Paul
  pointed me at, but I wonder if that forget(x) macro would help
  those cases. I'd wish to avoid the volatile primitive, personally.
 
 So, looking at load instead of store, understand I correctly that in
 your opinion
 
   int b;
 
   b = atomic_read(a);
   if (b)
   do_something_time_consuming();
 
   b = atomic_read(a);
   if (b)
   do_something_more();
 
 should be changed to explicitly forget(a) after
 do_something_time_consuming?

No, I'd actually prefer something like what Christoph Lameter suggested,
i.e. users (such as above) who want volatile-like behaviour from atomic
ops can use alternative functions. How about something like:

#define atomic_read_volatile(v) \
({  \
forget((v)-counter);  \
((v)-counter); \
})

Or possibly, implement these volatile atomic ops variants in inline asm
like the patch that Sebastian Siewior has submitted on another thread just
a while back.

Of course, if we find there are more callers in the kernel who want the
volatility behaviour than those who don't care, we can re-define the
existing ops to such variants, and re-name the existing definitions to
somethine else, say atomic_read_nonvolatile for all I care.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma
Hi Stefan,


On Wed, 15 Aug 2007, Stefan Richter wrote:

 On 8/15/2007 10:18 AM, Heiko Carstens wrote:
  On Wed, Aug 15, 2007 at 02:49:03PM +0800, Herbert Xu wrote:
  Chris Snook [EMAIL PROTECTED] wrote:
   
   Because atomic operations are generally used for synchronization, which 
   requires 
   volatile behavior.  Most such codepaths currently use an inefficient 
   barrier(). 
Some forget to and we get bugs, because people assume that 
   atomic_read() 
   actually reads something, and atomic_write() actually writes something.  
   Worse, 
   these are architecture-specific, even compiler version-specific bugs 
   that are 
   often difficult to track down.
  
  I'm yet to see a single example from the current tree where
  this patch series is the correct solution.  So far the only
  example has been a buggy piece of code which has since been
  fixed with a cpu_relax.
  
  Btw.: we still have
  
  include/asm-i386/mach-es7000/mach_wakecpu.h:  while 
  (!atomic_read(deassert));
  include/asm-i386/mach-default/mach_wakecpu.h: while 
  (!atomic_read(deassert));
  
  Looks like they need to be fixed as well.
 
 
 I don't know if this here is affected:

Yes, I think it is. You're clearly expecting the read to actually happen
when you call get_hpsb_generation(). It's clearly not a busy-loop, so
cpu_relax() sounds pointless / wrong solution for this case, so I'm now
somewhat beginning to appreciate the motivation behind this series :-)

But as I said, there are ways to achieve the same goals of this series
without using volatile.

I think I'll submit a RFC/patch or two on this myself (will also fix
the code pieces listed here).


 /* drivers/ieee1394/ieee1394_core.h */
 static inline unsigned int get_hpsb_generation(struct hpsb_host *host)
 {
   return atomic_read(host-generation);
 }
 
 /* drivers/ieee1394/nodemgr.c */
 static int nodemgr_host_thread(void *__hi)
 {
   [...]
 
   for (;;) {
   [... sleep until bus reset event ...]
 
   /* Pause for 1/4 second in 1/16 second intervals,
* to make sure things settle down. */
   g = get_hpsb_generation(host);
   for (i = 0; i  4 ; i++) {
   if (msleep_interruptible(63) ||
   kthread_should_stop())
   goto exit;

Totally unrelated, but this looks weird. IMHO you actually wanted:

msleep_interruptible(63);
if (kthread_should_stop())
goto exit;

here, didn't you? Otherwise the thread will exit even when
kthread_should_stop() != TRUE (just because it received a signal),
and it is not good for a kthread to exit on its own if it uses
kthread_should_stop() or if some other piece of kernel code could
eventually call kthread_stop(tsk) on it.

Ok, probably the thread will never receive a signal in the first
place because it's spawned off kthreadd which ignores all signals
beforehand, but still ...

[PATCH] ieee1394: Fix kthread stopping in nodemgr_host_thread

The nodemgr host thread can exit on its own even when kthread_should_stop
is not true, on receiving a signal (might never happen in practice, as
it ignores signals). But considering kthread_stop() must not be mixed with
kthreads that can exit on their own, I think changing the code like this
is clearer. This change means the thread can cut its sleep short when
receive a signal but looking at the code around, that sounds okay (and
again, it might never actually recieve a signal in practice).

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]

---

 drivers/ieee1394/nodemgr.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/ieee1394/nodemgr.c b/drivers/ieee1394/nodemgr.c
index 2ffd534..981a7da 100644
--- a/drivers/ieee1394/nodemgr.c
+++ b/drivers/ieee1394/nodemgr.c
@@ -1721,7 +1721,8 @@ static int nodemgr_host_thread(void *__hi)
 * to make sure things settle down. */
g = get_hpsb_generation(host);
for (i = 0; i  4 ; i++) {
-   if (msleep_interruptible(63) || kthread_should_stop())
+   msleep_interruptible(63);
+   if (kthread_should_stop())
goto exit;
 
/* Now get the generation in which the node ID's we 
collect
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Paul E. McKenney
On Wed, Aug 15, 2007 at 07:17:29PM +0530, Satyam Sharma wrote:
 On Wed, 15 Aug 2007, Stefan Richter wrote:
  Satyam Sharma wrote:
   On Wed, 15 Aug 2007, Stefan Richter wrote:
   Doesn't atomic WRT all processors require volatility?
   
   No, it definitely doesn't. Why should it?
   
   Atomic w.r.t. all processors is just your normal, simple atomicity
   for SMP systems (ensure that that object is modified / set / replaced
   in main memory atomically) and has nothing to do with volatile
   behaviour.
   
   Volatile behaviour itself isn't consistently defined (at least
   definitely not consistently implemented in various gcc versions across
   platforms), but it is /expected/ to mean something like: ensure that
   every such access actually goes all the way to memory, and is not
   re-ordered w.r.t. to other accesses, as far as the compiler can take
   care of these. The last as far as compiler can take care disclaimer
   comes about due to CPUs doing their own re-ordering nowadays.
   
   For example (say on i386):
  
  [...]
  
   In (A) the compiler optimized a = 10; away, but the actual store
   of the final value 20 to a was still atomic. (B) and (C) also
   exhibit volatile behaviour apart from the atomicity.
   
   But as others replied, it seems some callers out there depend upon
   atomic ops exhibiting volatile behaviour as well, so that answers
   my initial question, actually. I haven't looked at the code Paul
   pointed me at, but I wonder if that forget(x) macro would help
   those cases. I'd wish to avoid the volatile primitive, personally.
  
  So, looking at load instead of store, understand I correctly that in
  your opinion
  
  int b;
  
  b = atomic_read(a);
  if (b)
  do_something_time_consuming();
  
  b = atomic_read(a);
  if (b)
  do_something_more();
  
  should be changed to explicitly forget(a) after
  do_something_time_consuming?
 
 No, I'd actually prefer something like what Christoph Lameter suggested,
 i.e. users (such as above) who want volatile-like behaviour from atomic
 ops can use alternative functions. How about something like:
 
 #define atomic_read_volatile(v)   \
   ({  \
   forget((v)-counter);  \
   ((v)-counter); \
   })

Wouldn't the above forget the value, throw it away, then forget
that it forgot it, giving non-volatile semantics?

 Or possibly, implement these volatile atomic ops variants in inline asm
 like the patch that Sebastian Siewior has submitted on another thread just
 a while back.

Given that you are advocating a change (please keep in mind that
atomic_read() and atomic_set() had volatile semantics on almost all
platforms), care to give some example where these historical volatile
semantics are causing a problem?

 Of course, if we find there are more callers in the kernel who want the
 volatility behaviour than those who don't care, we can re-define the
 existing ops to such variants, and re-name the existing definitions to
 somethine else, say atomic_read_nonvolatile for all I care.

Do we really need another set of APIs?  Can you give even one example
where the pre-existing volatile semantics are causing enough of a problem
to justify adding yet more atomic_*() APIs?

Thanx, Paul
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ofa-general] Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.

2007-08-15 Thread Steve Wise



David Miller wrote:

From: Sean Hefty [EMAIL PROTECTED]
Date: Thu, 09 Aug 2007 14:40:16 -0700


Steve Wise wrote:

Any more comments?
Does anyone have ideas on how to reserve the port space without using a 
struct socket?


How about we just remove the RDMA stack altogether?  I am not at all
kidding.  If you guys can't stay in your sand box and need to cause
problems for the normal network stack, it's unacceptable.  We were
told all along the if RDMA went into the tree none of this kind of
stuff would be an issue.


I think removing the RDMA stack is the wrong thing to do, and you 
shouldn't just threaten to yank entire subsystems because you don't like 
the technology.  Lets keep this constructive, can we?  RDMA should get 
the respect of any other technology in Linux.  Maybe its a niche in your 
opinion, but come on, there's more RDMA users than say, the sparc64 
port.  Eh?




These are exactly the kinds of problems for which people like myself
were dreading.  These subsystems have no buisness using the TCP port
space of the Linux software stack, absolutely none.



Ok, although IMO its the correct solution.  But I'll propose other 
solutions below.  I ask for your feedback (and everyones!) on these 
alternate solutions.



After TCP port reservation, what's next?  It seems an at least
bi-monthly event that the RDMA folks need to put their fingers
into something else in the normal networking stack.  No more.



The only other change requested and commited, if I recall correctly, was 
for netevents, and that enabled both Infiniband and iWARP to integrate 
with the neighbour subsystem.  I think that was a useful and needed 
change.  Prior to that, these subsystems were snooping ARP replies to 
trigger events.  That was back in 2.6.18 or 2.6.19 I think...



I will NACK any patch that opens up sockets to eat up ports or
anything stupid like that.


Got it.

Here are alternate solutions that avoid the need to share the port space:

Solution 1)

1) admins must setup an alias interface on the iwarp device for use with 
rdma.  This interface will have to be a separate subnet from the TCP 
used interface.  And with a canonical name that indicates its for rdma 
only.  Like eth2:iw or eth2:rdma.  There can be many of these per device.


2) admins make sure their sockets/tcp services don't use the interface 
configured in #1, and their rdma service do use said interface.


3) iwarp providers must translation binds to ipaddr 0.0.0.0 to the 
associated for rdma only ip addresses.  They can do this by searching 
for all aliases of the canonical name that are aliases of the TCP 
interface for their nic device.  Or: somehow not handle incoming 
connections to any address but the for rdma use addresses and instead 
pass them up and not offload them.


This will avoid the collisions as long as the above steps are followed.


Solution 2)

Another possibility would be for the driver to create two net devices 
(and hence two interace names) like eth2 and iw2, and artificially 
separate the RDMA stuff that way.


These two solutions are similar in that they create a rdma only interface.

Pros:
- is not intrusive into the core networking code
- very minimal changes needed and in the iwarp provider's code, who are 
the ones with this problem

- makes it clear which subnets are RDMA only

Cons:
- relies on system admin to set it up correctly.
- native stack can still use this rdma-only interface and the same 
port space issue will exist.



For the record, here are possible port-sharing solutions Dave sez he'll NAK:

Solution NAK-1)

The rdma-cma just allocates a socket and binds it to reserve TCP ports.

Pros:
- minimal changes needed to implement (always a plus in my mind :)
- simple, clean, and it works (KISS)
- if no RDMA is in use, there is no impact on the native stack
- no need for a seperate RDMA interface

Cons:
- wastes memory
- puts a TCP socket in the CLOSED state in the pcb tables.
- Dave will NAK it :)

Solution NAK-2)

Create a low-level sockets-agnostic port allocation service that is 
shared by both TCP and RDMA.  This way, the rdma-cm can reserve ports in 
an efficient manor instead of doing it via kernel_bind() using a sock 
struct.


Pros:
- probably the correct solution (my opinion :) if we went down the path 
of sharing port space

- if no RDMA is in use, there is no impact on the native stack
- no need for a separate RDMA interface

Cons:

- very intrusive change because the port allocations stuff is tightly 
bound to the host stack and sock struct, etc.

- Dave will NAK it :)


Steve.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-15 Thread Arnd Bergmann
On Wednesday 15 August 2007, Paul E. McKenney wrote:

 ACCESS_ONCE() is indeed intended to be used when actually loading or
 storing the variable.  That said, I must admit that it is not clear to me
 why you would want to add an extra order() rather than ACCESS_ONCE()ing
 one or both of the adjacent accesses to that same variable.
 
 So, what am I missing?

You're probably right, the only case I can construct is something like

if (ACCESS_ONCE(x)) {
...
ACCESS_ONCE(x)++;
}

which would be slightly less efficient than

if (x)
x++;
order(x);

because in the first case, you need to do two ordered accesses
but only one in the second case. However, I can't think of a case
where this actually makes a noticable difference in real life.

Arnd 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Herbert Xu
On Wed, Aug 15, 2007 at 08:05:38PM +0530, Satyam Sharma wrote:

  I don't know if this here is affected:
 
 Yes, I think it is. You're clearly expecting the read to actually happen
 when you call get_hpsb_generation(). It's clearly not a busy-loop, so
 cpu_relax() sounds pointless / wrong solution for this case, so I'm now
 somewhat beginning to appreciate the motivation behind this series :-)

Nope, we're calling schedule which is a rather heavy-weight
barrier.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/1] ipv6: corrects sended rtnetlink message

2007-08-15 Thread Milan Kocian
ipv6 sends a RTM_DELLINK netlink message on both events: NETDEV_DOWN,
NETDEV_UNREGISTER. Corrected by sending RTM_NEWLINK on NETDEV_DOWN event
and RTM_DELLINK on NETDEV_UNREGISTER event.
---

btw. I don't know why protocol sends a NL messages about iface state
changes. It comes from other sources. By contrast ipv6 doesn't send
no message on NETDEV_UP event !? IMO best possibility is deletion of
sending this message.


--- a/net/ipv6/addrconf.c   2007-08-13 11:35:23.481430029 +0200
+++ b/net/ipv6/addrconf.c   2007-08-13 12:47:01.825690662 +0200
@@ -2488,7 +2488,10 @@ static int addrconf_ifdown(struct net_de
 
/* Step 5: netlink notification of this interface */
idev-tstamp = jiffies;
-   inet6_ifinfo_notify(RTM_DELLINK, idev);
+   if (how)
+   inet6_ifinfo_notify(RTM_DELLINK, idev);
+   else 
+   inet6_ifinfo_notify(RTM_NEWLINK, idev);
 
/* Shot the device (if unregistered) */
 

Signed-off-by: Milan Kocian [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-15 Thread Michael Buesch
On Wednesday 15 August 2007 15:29:43 Arnd Bergmann wrote:
 On Wednesday 15 August 2007, Paul E. McKenney wrote:
 
  ACCESS_ONCE() is indeed intended to be used when actually loading or
  storing the variable.  That said, I must admit that it is not clear to me
  why you would want to add an extra order() rather than ACCESS_ONCE()ing
  one or both of the adjacent accesses to that same variable.
  
  So, what am I missing?
 
 You're probably right, the only case I can construct is something like
 
   if (ACCESS_ONCE(x)) {
   ...
   ACCESS_ONCE(x)++;
   }
 
 which would be slightly less efficient than
 
   if (x)
   x++;
   order(x);
 
 because in the first case, you need to do two ordered accesses
 but only one in the second case. However, I can't think of a case
 where this actually makes a noticable difference in real life.

How can this example actually get used in a sane and race-free
way? This would need locking around the whole if
statement. But locking is a barrier.

-- 
Greetings Michael.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Herbert Xu
On Wed, Aug 15, 2007 at 07:25:16AM -0700, Paul E. McKenney wrote:
 
 Do we really need another set of APIs?  Can you give even one example
 where the pre-existing volatile semantics are causing enough of a problem
 to justify adding yet more atomic_*() APIs?

Let's turn this around.  Can you give a single example where
the volatile semantics is needed in a legitimate way?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: skb_pull_rcsum - Fatal exception in interrupt

2007-08-15 Thread Evgeniy Polyakov
Hi Alan.

On Wed, Aug 15, 2007 at 04:07:23PM +0100, Alan J. Wylie ([EMAIL PROTECTED]) 
wrote:
 EIP: [c02b6fb2] skb_pull_rcsum+0x6d/0x71 SS:ESP 09068:c03e1ea4
 Kernel panic - not syncing: Fatal exception in interrupt

At least with this patch it should not panic.
More correct solution might be to use pskb_may_pull() or check aditional
length in llc_fixup_skb().
Actually if dmesg will show that there is something in fragments, it
should use pskb_may_pull(). The same bug exist in bridge and vlan, btw,
so it might be a solution to remove bug_on from skb_pull_rcsum() and
instead call may_pull?

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/net/802/psnap.c b/net/802/psnap.c
index 04ee43e..5f410e9 100644
--- a/net/802/psnap.c
+++ b/net/802/psnap.c
@@ -60,13 +60,24 @@ static int snap_rcv(struct sk_buff *skb, struct net_device 
*dev,
if (proto) {
/* Pass the frame on. */
skb-transport_header += 5;
+   if (skb-len  5 || skb-len - 5  skb-data_len) {
+   if (net_ratelimit())
+   printk(KERN_NOTICE Short packet: len: %u, 
+   data_len: %u.\n,
+   skb-len, skb-data_len);
+   goto err_out;
+   }
skb_pull_rcsum(skb, 5);
rc = proto-rcvfunc(skb, dev, snap_packet_type, orig_dev);
-   } else {
-   skb-sk = NULL;
-   kfree_skb(skb);
-   rc = 1;
}
+   
+   rcu_read_unlock();
+   return rc;
+
+err_out:
+   skb-sk = NULL;
+   kfree_skb(skb);
+   rc = 1;
 
rcu_read_unlock();
return rc;


-- 
Evgeniy Polyakov
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Paul E. McKenney
On Wed, Aug 15, 2007 at 11:33:36PM +0800, Herbert Xu wrote:
 On Wed, Aug 15, 2007 at 07:25:16AM -0700, Paul E. McKenney wrote:
  
  Do we really need another set of APIs?  Can you give even one example
  where the pre-existing volatile semantics are causing enough of a problem
  to justify adding yet more atomic_*() APIs?
 
 Let's turn this around.  Can you give a single example where
 the volatile semantics is needed in a legitimate way?

Sorry, but you are the one advocating for the change.

Nice try, though!  ;-)

Thanx, Paul
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Stefan Richter
Herbert Xu wrote:
 On Wed, Aug 15, 2007 at 08:05:38PM +0530, Satyam Sharma wrote:
 I don't know if this here is affected:

[...something like]
b = atomic_read(a);
for (i = 0; i  4; i++) {
msleep_interruptible(63);
c = atomic_read(a);
if (c != b) {
b = c;
i = 0;
}
}

 Nope, we're calling schedule which is a rather heavy-weight
 barrier.

How does the compiler know that msleep() has got barrier()s?
-- 
Stefan Richter
-=-=-=== =--- -
http://arcgraph.de/sr/
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Chris Snook

Herbert Xu wrote:

Chris Snook [EMAIL PROTECTED] wrote:
Because atomic operations are generally used for synchronization, which requires 
volatile behavior.  Most such codepaths currently use an inefficient barrier(). 
 Some forget to and we get bugs, because people assume that atomic_read() 
actually reads something, and atomic_write() actually writes something.  Worse, 
these are architecture-specific, even compiler version-specific bugs that are 
often difficult to track down.


I'm yet to see a single example from the current tree where
this patch series is the correct solution.  So far the only
example has been a buggy piece of code which has since been
fixed with a cpu_relax.


Part of the motivation here is to fix heisenbugs.  If I knew where they 
were, I'd be posting patches for them.  Unlike most bugs, where we want 
to expose them as obviously as possible, these can be extremely 
difficult to track down, and are often due to people assuming that the 
atomic_* operations have the same semantics they've historically had. 
Remember that until recently, all SMP architectures except s390 (which 
very few kernel developers outside of IBM, Red Hat, and SuSE do much 
work on) had volatile declarations for atomic_t.  Removing the volatile 
declarations from i386 and x86_64 may have created heisenbugs that won't 
manifest themselves until GCC 6.0 comes out and people start compiling 
kernels with -O5.  We should have consistent semantics for atomic_* 
operations.


The other motivation is to reduce the need for the barriers used to 
prevent/fix such problems which clobber all your registers, and instead 
force atomic_* operations to behave in the way they're actually used. 
After the (resubmitted) patchset is merged, we'll be able to remove a 
whole bunch of barriers, shrinking our source and our binaries, and 
improving performance.


-- Chris
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] improved xfrm_audit_log() patch

2007-08-15 Thread Joy Latten
On Tue, 2007-08-07 at 18:32 -0700, David Miller wrote:
From: Joy Latten [EMAIL PROTECTED]
Date: Thu, 2 Aug 2007 15:56:47 -0500

 @@ -426,10 +426,15 @@ struct xfrm_audit
  };
  
  #ifdef CONFIG_AUDITSYSCALL
 -extern void xfrm_audit_log(uid_t auid, u32 secid, int type, int result,
 -struct xfrm_policy *xp, struct xfrm_state *x);
 +extern void xfrm_audit_log(struct xfrm_audit audit_info, int result,
 +   __be32 flowid, struct xfrm_policy *xp, 
 +   struct xfrm_state *x, char *buf);

Passing audit_info as an aggregate argument puts them into
previous argument registers, or if they are not enough it
goes either partially of wholly onto the stack, depending
upon architecture.

In fact you've made the argument register usage worse than
in your previous revision. :-/

Perhaps you meant to pass struct xfrm_audit * instead?

Revised patch to pass pointer to struct xfrm_audit.
Sorry, I missed that.

Signed-off-by: Joy Latten [EMAIL PROTECTED]


diff -urpN linux-2.6.22/include/linux/audit.h 
linux-2.6.22.patch/include/linux/audit.h
--- linux-2.6.22/include/linux/audit.h  2007-08-14 18:13:53.0 -0500
+++ linux-2.6.22.patch/include/linux/audit.h2007-08-14 19:08:42.0 
-0500
@@ -112,6 +112,7 @@
 #define AUDIT_MAC_IPSEC_DELSA  1412/* Delete a XFRM state */
 #define AUDIT_MAC_IPSEC_ADDSPD 1413/* Add a XFRM policy */
 #define AUDIT_MAC_IPSEC_DELSPD 1414/* Delete a XFRM policy */
+#define AUDIT_MAC_IPSEC_EVENT  1415/* Audit IPSec events */
 
 #define AUDIT_FIRST_KERN_ANOM_MSG   1700
 #define AUDIT_LAST_KERN_ANOM_MSG1799
diff -urpN linux-2.6.22/include/net/xfrm.h linux-2.6.22.patch/include/net/xfrm.h
--- linux-2.6.22/include/net/xfrm.h 2007-08-14 18:13:53.0 -0500
+++ linux-2.6.22.patch/include/net/xfrm.h   2007-08-14 19:08:42.0 
-0500
@@ -426,10 +426,15 @@ struct xfrm_audit
 };
 
 #ifdef CONFIG_AUDITSYSCALL
-extern void xfrm_audit_log(uid_t auid, u32 secid, int type, int result,
-   struct xfrm_policy *xp, struct xfrm_state *x);
+extern void xfrm_audit_log(struct xfrm_audit *audit_info, int result,
+  __be32 flowid, struct xfrm_policy *xp, 
+  struct xfrm_state *x, char *buf);
+
+extern void xfrm_get_auditinfo(struct sk_buff *skb, 
+  struct xfrm_audit *audit_info);
 #else
-#define xfrm_audit_log(a,s,t,r,p,x) do { ; } while (0)
+#define xfrm_audit_log(a,r,f,p,s,b) do { ; } while (0)
+#define xfrm_get_auditinfo(s, a) do { ; } while (0)
 #endif /* CONFIG_AUDITSYSCALL */
 
 static inline void xfrm_pol_hold(struct xfrm_policy *policy)
diff -urpN linux-2.6.22/net/key/af_key.c linux-2.6.22.patch/net/key/af_key.c
--- linux-2.6.22/net/key/af_key.c   2007-08-14 18:13:53.0 -0500
+++ linux-2.6.22.patch/net/key/af_key.c 2007-08-14 19:08:42.0 -0500
@@ -1450,6 +1450,7 @@ static int pfkey_add(struct sock *sk, st
struct xfrm_state *x;
int err;
struct km_event c;
+   struct xfrm_audit audit_info;
 
x = pfkey_msg2xfrm_state(hdr, ext_hdrs);
if (IS_ERR(x))
@@ -1461,8 +1462,8 @@ static int pfkey_add(struct sock *sk, st
else
err = xfrm_state_update(x);
 
-   xfrm_audit_log(audit_get_loginuid(current-audit_context), 0,
-  AUDIT_MAC_IPSEC_ADDSA, err ? 0 : 1, NULL, x);
+   xfrm_get_auditinfo(0, audit_info);
+   xfrm_audit_log(audit_info, err ? 0 : 1, 0, 0, x, SAD-add);
 
if (err  0) {
x-km.state = XFRM_STATE_DEAD;
@@ -1487,6 +1488,7 @@ static int pfkey_delete(struct sock *sk,
struct xfrm_state *x;
struct km_event c;
int err;
+   struct xfrm_audit audit_info;
 
if (!ext_hdrs[SADB_EXT_SA-1] ||
!present_and_same_family(ext_hdrs[SADB_EXT_ADDRESS_SRC-1],
@@ -1515,8 +1517,9 @@ static int pfkey_delete(struct sock *sk,
c.event = XFRM_MSG_DELSA;
km_state_notify(x, c);
 out:
-   xfrm_audit_log(audit_get_loginuid(current-audit_context), 0,
-  AUDIT_MAC_IPSEC_DELSA, err ? 0 : 1, NULL, x);
+   xfrm_get_auditinfo(0, audit_info);
+   xfrm_audit_log(audit_info, err ? 0 : 1, 0, 0, x, SAD-delete);
+
xfrm_state_put(x);
 
return err;
@@ -1691,8 +1694,7 @@ static int pfkey_flush(struct sock *sk, 
if (proto == 0)
return -EINVAL;
 
-   audit_info.loginuid = audit_get_loginuid(current-audit_context);
-   audit_info.secid = 0;
+   xfrm_get_auditinfo(0, audit_info);
err = xfrm_state_flush(proto, audit_info);
if (err)
return err;
@@ -2182,6 +2184,7 @@ static int pfkey_spdadd(struct sock *sk,
struct xfrm_policy *xp;
struct km_event c;
struct sadb_x_sec_ctx *sec_ctx;
+   struct xfrm_audit audit_info;
 
if (!present_and_same_family(ext_hdrs[SADB_EXT_ADDRESS_SRC-1],
 

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Paul E. McKenney
On Wed, Aug 15, 2007 at 06:09:35PM +0200, Stefan Richter wrote:
 Herbert Xu wrote:
  On Wed, Aug 15, 2007 at 08:05:38PM +0530, Satyam Sharma wrote:
  I don't know if this here is affected:
 
 [...something like]
   b = atomic_read(a);
   for (i = 0; i  4; i++) {
   msleep_interruptible(63);
   c = atomic_read(a);
   if (c != b) {
   b = c;
   i = 0;
   }
   }
 
  Nope, we're calling schedule which is a rather heavy-weight
  barrier.
 
 How does the compiler know that msleep() has got barrier()s?

Because msleep_interruptible() is in a separate compilation unit,
the compiler has to assume that it might modify any arbitrary global.
In many cases, the compiler also has to assume that msleep_interruptible()
might call back into a function in the current compilation unit, thus
possibly modifying global static variables.

Thanx, Paul
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma


On Wed, 15 Aug 2007, Paul E. McKenney wrote:

 On Wed, Aug 15, 2007 at 06:09:35PM +0200, Stefan Richter wrote:
  Herbert Xu wrote:
   On Wed, Aug 15, 2007 at 08:05:38PM +0530, Satyam Sharma wrote:
   I don't know if this here is affected:
  
  [...something like]
  b = atomic_read(a);
  for (i = 0; i  4; i++) {
  msleep_interruptible(63);
  c = atomic_read(a);
  if (c != b) {
  b = c;
  i = 0;
  }
  }
  
   Nope, we're calling schedule which is a rather heavy-weight
   barrier.
  
  How does the compiler know that msleep() has got barrier()s?
 
 Because msleep_interruptible() is in a separate compilation unit,
 the compiler has to assume that it might modify any arbitrary global.
 In many cases, the compiler also has to assume that msleep_interruptible()
 might call back into a function in the current compilation unit, thus
 possibly modifying global static variables.

Yup, I've just verified this with a testcase. So a call to any function
outside of the current compilation unit acts as a compiler barrier. Cool.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma


On Wed, 15 Aug 2007, Paul E. McKenney wrote:

 On Wed, Aug 15, 2007 at 11:33:36PM +0800, Herbert Xu wrote:
  On Wed, Aug 15, 2007 at 07:25:16AM -0700, Paul E. McKenney wrote:
   
   Do we really need another set of APIs?  Can you give even one example
   where the pre-existing volatile semantics are causing enough of a problem
   to justify adding yet more atomic_*() APIs?
  
  Let's turn this around.  Can you give a single example where
  the volatile semantics is needed in a legitimate way?
 
 Sorry, but you are the one advocating for the change.

Not for i386 and x86_64 -- those have atomic ops without any volatile
semantics (currently as per existing definitions).
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] ethtool: add register dump support for intel 82598 chipsets (ixgbe driver)

2007-08-15 Thread Auke Kok
From: Nicholas Nunley [EMAIL PROTECTED]

Signed-off-by: Nicholas Nunley [EMAIL PROTECTED]
Signed-off-by: Auke Kok [EMAIL PROTECTED]
---

 Makefile.am|2 
 ethtool-util.h |2 
 ethtool.c  |1 
 ixgbe.c| 1017 
 4 files changed, 1021 insertions(+), 1 deletions(-)

diff --git a/Makefile.am b/Makefile.am
index 7bb58d8..43d1236 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -6,7 +6,7 @@ EXTRA_DIST = ethtool.8 ethtool.spec.in aclocal.m4 ChangeLog 
autogen.sh
 sbin_PROGRAMS = ethtool
 ethtool_SOURCES = ethtool.c ethtool-copy.h ethtool-util.h  \
  amd8111e.c de2104x.c e100.c e1000.c igb.c \
- fec_8xx.c ibm_emac.c ixgb.c natsemi.c \
+ fec_8xx.c ibm_emac.c ixgb.c ixgbe.c natsemi.c \
  pcnet32.c realtek.c tg3.c marvell.c vioc.c\
  smsc911x.c
 
diff --git a/ethtool-util.h b/ethtool-util.h
index 1621dfe..5572771 100644
--- a/ethtool-util.h
+++ b/ethtool-util.h
@@ -56,6 +56,8 @@ int ibm_emac_dump_regs(struct ethtool_drvinfo *info, struct 
ethtool_regs *regs);
 /* Intel(R) PRO/10GBe Gigabit Adapter Family */
 int ixgb_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
 
+int ixgbe_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
+
 /* Broadcom Tigon3 Ethernet controller */
 int tg3_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
 
diff --git a/ethtool.c b/ethtool.c
index 29562a7..651529e 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -1005,6 +1005,7 @@ static struct {
{ e1000, e1000_dump_regs },
{ igb, igb_dump_regs },
{ ixgb, ixgb_dump_regs },
+   { ixgbe, ixgbe_dump_regs },
{ natsemi, natsemi_dump_regs },
{ e100, e100_dump_regs },
{ amd8111e, amd8111e_dump_regs },
diff --git a/ixgbe.c b/ixgbe.c
new file mode 100644
index 000..03eba6a
--- /dev/null
+++ b/ixgbe.c
@@ -0,0 +1,1017 @@
+/* Copyright (c) 2007 Intel Corporation */
+#include stdio.h
+#include ethtool-util.h
+
+/* Register Bit Masks */
+#define IXGBE_FCTRL_SBP0x0002
+#define IXGBE_FCTRL_MPE0x0100
+#define IXGBE_FCTRL_UPE0x0200
+#define IXGBE_FCTRL_BAM0x0400
+#define IXGBE_FCTRL_PMCF   0x1000
+#define IXGBE_FCTRL_DPF0x2000
+#define IXGBE_FCTRL_RPFCE  0x4000
+#define IXGBE_FCTRL_RFCE   0x8000
+#define IXGBE_VLNCTRL_VET  0x
+#define IXGBE_VLNCTRL_CFI  0x1000
+#define IXGBE_VLNCTRL_CFIEN0x2000
+#define IXGBE_VLNCTRL_VFE  0x4000
+#define IXGBE_VLNCTRL_VME  0x8000
+#define IXGBE_LINKS_UP 0x4000
+#define IXGBE_LINKS_SPEED  0x2000
+#define IXGBE_SRRCTL_BSIZEPKT_MASK 0x007F
+#define IXGBE_HLREG0_TXCRCEN   0x0001
+#define IXGBE_HLREG0_RXCRCSTRP 0x0002
+#define IXGBE_HLREG0_JUMBOEN   0x0004
+#define IXGBE_HLREG0_TXPADEN   0x0400
+#define IXGBE_HLREG0_LPBK  0x8000
+#define IXGBE_RMCS_TFCE_802_3X 0x0008
+#define IXGBE_RMCS_TFCE_PRIORITY   0x0010
+
+int
+ixgbe_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs)
+{
+   u32 *regs_buff = (u32 *)regs-data;
+   u32 reg;
+   u8 i;
+   u8 version = (u8)(regs-version  24);
+
+   if (version != 1)
+   return -1;
+
+   reg = regs_buff[1065];
+   fprintf(stdout,
+   0x042A4: LINKS (Link Status register) 0x%08X\n
+  Link Status:   %s\n
+  Link Speed:%s\n,
+   reg,
+   reg  IXGBE_LINKS_UP  ? up   : down,
+   reg  IXGBE_LINKS_SPEED   ? 10G  : 1G);
+   
+   reg = regs_buff[515];
+   fprintf(stdout,
+   0x05080: FCTRL (Filter Control register)  0x%08X\n
+  Receive Flow Control Packets:  %s\n
+  Receive Priority Flow Control Packets: %s\n
+  Discard Pause Frames:  %s\n
+  Pass MAC Control Frames:   %s\n
+  Broadcast Accept:  %s\n
+  Unicast Promiscuous:   %s\n
+  Multicast Promiscuous: %s\n
+  Store Bad Packets: %s\n,
+   reg,
+   reg  IXGBE_FCTRL_RFCE? enabled  : disabled,
+   reg  IXGBE_FCTRL_RPFCE   ? enabled  : disabled,
+   reg  IXGBE_FCTRL_DPF ? enabled  : disabled,
+   reg  IXGBE_FCTRL_PMCF? enabled  : disabled,
+   reg  IXGBE_FCTRL_BAM ? enabled  : disabled,
+   reg  IXGBE_FCTRL_UPE ? enabled  : disabled,
+   reg  IXGBE_FCTRL_MPE ? enabled  : disabled,
+   reg  IXGBE_FCTRL_SBP ? enabled  : disabled);
+   
+   reg = regs_buff[516];
+   

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma


On Wed, 15 Aug 2007, Paul E. McKenney wrote:

 On Wed, Aug 15, 2007 at 10:48:28PM +0530, Satyam Sharma wrote:
  [...]
  Not for i386 and x86_64 -- those have atomic ops without any volatile
  semantics (currently as per existing definitions).
 
 I claim unit volumes with arm, and the majority of the architectures, but
 I cannot deny the popularity of i386 and x86_64 with many developers.  ;-)

Hmm, does arm really need that volatile int counter;? Hopefully RMK will
take a patch removing that volatile ... ;-)


 However, I am not aware of code in the kernel that would benefit
 from the compiler coalescing multiple atomic_set() and atomic_read()
 invocations, thus I don't see the downside to volatility in this case.
 Are there some performance-critical code fragments that I am missing?

I don't know, and yes, code with multiple atomic_set's and atomic_read's
getting optimized or coalesced does sound strange to start with. Anyway,
I'm not against volatile semantics per se. As replied elsewhere, I do
appreciate the motivation behind this series (to _avoid_ gotchas, not to
fix existing ones). Just that I'd like to avoid using volatile, for
aforementioned reasons, especially given that there are perfectly
reasonable alternatives to achieve the same desired behaviour.


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Potential u32 classifier bug.

2007-08-15 Thread Thomas Graf
* Waskiewicz Jr, Peter P [EMAIL PROTECTED] 2007-08-09 18:07
 My big question is: Has anyone recently used the 802_3 protocol in tc
 with u32 and actually gotten it to work?  I can't see how the
 u32_classify() code can look at the mac header, since it is using the
 network header accessor to start looking.  I think this is an issue with
 the classification code, but I'm looking to see if I'm doing something
 stupid before I really start digging into this mess.

There is this very horrible way of using the u32 classifier with a
negative offset to look into the ethernet header.

You might want to look into the cmp ematch which can be attached to
almost any classifier. It allows basing offsets on any layer thus
making ethernet header filtering trivial.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma
Hi Paul,


On Wed, 15 Aug 2007, Paul E. McKenney wrote:

 On Wed, Aug 15, 2007 at 07:17:29PM +0530, Satyam Sharma wrote:
  [...]
  No, I'd actually prefer something like what Christoph Lameter suggested,
  i.e. users (such as above) who want volatile-like behaviour from atomic
  ops can use alternative functions. How about something like:
  
  #define atomic_read_volatile(v) \
  ({  \
  forget((v)-counter);  \
  ((v)-counter); \
  })
 
 Wouldn't the above forget the value, throw it away, then forget
 that it forgot it, giving non-volatile semantics?

Nope, I don't think so. I wrote the following trivial testcases:
[ See especially tp4.c and tp4.s (last example). ]

==
$ cat tp1.c # Using volatile access casts

#define atomic_read(a)  (*(volatile int *)a)

int a;

void func(void)
{
a = 0;
while (atomic_read(a))
;
}
==
$ gcc -Os -S tp1.c; cat tp1.s

func:
pushl   %ebp
movl%esp, %ebp
movl$0, a
.L2:
movla, %eax
testl   %eax, %eax
jne .L2
popl%ebp
ret
==
$ cat tp2.c # Using nothing; gcc will optimize the whole loop away

#define forget(x)

#define atomic_read(a)  \
({  \
forget((a));   \
(a);\
})

int a;

void func(void)
{
a = 0;
while (atomic_read(a))
;
}
==
$ gcc -Os -S tp2.c; cat tp2.s

func:
pushl   %ebp
movl%esp, %ebp
popl%ebp
movl$0, a
ret
==
$ cat tp3.c # Using a full memory clobber barrier

#define forget(x)   asm volatile (:::memory)

#define atomic_read(a)  \
({  \
forget((a));   \
(a);\
})

int a;

void func(void)
{
a = 0;
while (atomic_read(a))
;
}
==
$ gcc -Os -S tp3.c; cat tp3.s

func:
pushl   %ebp
movl%esp, %ebp
movl$0, a
.L2:
cmpl$0, a
jne .L2
popl%ebp
ret
==
$ cat tp4.c # Using a forget(var) macro

#define forget(a)   __asm__ __volatile__ ( :=m (a) :m (a))

#define atomic_read(a)  \
({  \
forget(a);  \
(a);\
})

int a;

void func(void)
{
a = 0;
while (atomic_read(a))
;
}
==
$ gcc -Os -S tp4.c; cat tp4.s

func:
pushl   %ebp
movl%esp, %ebp
movl$0, a
.L2:
cmpl$0, a
jne .L2
popl%ebp
ret
==


Possibly these were too trivial to expose any potential problems that you
may have been referring to, so would be helpful if you could write a more
concrete example / sample code.


  Or possibly, implement these volatile atomic ops variants in inline asm
  like the patch that Sebastian Siewior has submitted on another thread just
  a while back.
 
 Given that you are advocating a change (please keep in mind that
 atomic_read() and atomic_set() had volatile semantics on almost all
 platforms), care to give some example where these historical volatile
 semantics are causing a problem?
 [...]
 Can you give even one example
 where the pre-existing volatile semantics are causing enough of a problem
 to justify adding yet more atomic_*() APIs?

Will take this to the other sub-thread ...


  Of course, if we find there are more callers in the kernel who want the
  volatility behaviour than those who don't care, we can re-define the
  existing ops to such variants, and re-name the existing definitions to
  somethine else, say atomic_read_nonvolatile for all I care.
 
 Do we really need another set of APIs?

Well, if there's one set of users who do care about volatile behaviour,
and another set that doesn't, it only sounds correct to provide both
those APIs, instead of forcing one behaviour on all users.


Thanks,
Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Paul E. McKenney
On Wed, Aug 15, 2007 at 10:48:28PM +0530, Satyam Sharma wrote:
 On Wed, 15 Aug 2007, Paul E. McKenney wrote:
  On Wed, Aug 15, 2007 at 11:33:36PM +0800, Herbert Xu wrote:
   On Wed, Aug 15, 2007 at 07:25:16AM -0700, Paul E. McKenney wrote:

Do we really need another set of APIs?  Can you give even one example
where the pre-existing volatile semantics are causing enough of a 
problem
to justify adding yet more atomic_*() APIs?
   
   Let's turn this around.  Can you give a single example where
   the volatile semantics is needed in a legitimate way?
  
  Sorry, but you are the one advocating for the change.
 
 Not for i386 and x86_64 -- those have atomic ops without any volatile
 semantics (currently as per existing definitions).

I claim unit volumes with arm, and the majority of the architectures, but
I cannot deny the popularity of i386 and x86_64 with many developers.  ;-)

However, I am not aware of code in the kernel that would benefit
from the compiler coalescing multiple atomic_set() and atomic_read()
invocations, thus I don't see the downside to volatility in this case.
Are there some performance-critical code fragments that I am missing?

Thanx, Paul
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] ethtool: add register dump support for intel 82575 chipsets (igb driver)

2007-08-15 Thread Auke Kok
From: Nicholas Nunley [EMAIL PROTECTED]

Signed-off-by: Nicholas Nunley [EMAIL PROTECTED]
Signed-off-by: Auke Kok [EMAIL PROTECTED]
---

 Makefile.am|2 
 ethtool-util.h |2 
 ethtool.c  |1 
 igb.c  |  864 
 4 files changed, 868 insertions(+), 1 deletions(-)

diff --git a/Makefile.am b/Makefile.am
index 9f4bbd3..7bb58d8 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -5,7 +5,7 @@ EXTRA_DIST = ethtool.8 ethtool.spec.in aclocal.m4 ChangeLog 
autogen.sh
 
 sbin_PROGRAMS = ethtool
 ethtool_SOURCES = ethtool.c ethtool-copy.h ethtool-util.h  \
- amd8111e.c de2104x.c e100.c e1000.c   \
+ amd8111e.c de2104x.c e100.c e1000.c igb.c \
  fec_8xx.c ibm_emac.c ixgb.c natsemi.c \
  pcnet32.c realtek.c tg3.c marvell.c vioc.c\
  smsc911x.c
diff --git a/ethtool-util.h b/ethtool-util.h
index f429555..1621dfe 100644
--- a/ethtool-util.h
+++ b/ethtool-util.h
@@ -30,6 +30,8 @@ int de2104x_dump_regs(struct ethtool_drvinfo *info, struct 
ethtool_regs *regs);
 /* Intel(R) PRO/1000 Gigabit Adapter Family */
 int e1000_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
 
+int igb_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
+
 /* RealTek PCI */
 int realtek_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
 
diff --git a/ethtool.c b/ethtool.c
index b04f747..29562a7 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -1003,6 +1003,7 @@ static struct {
{ r8169, realtek_dump_regs },
{ de2104x, de2104x_dump_regs },
{ e1000, e1000_dump_regs },
+   { igb, igb_dump_regs },
{ ixgb, ixgb_dump_regs },
{ natsemi, natsemi_dump_regs },
{ e100, e100_dump_regs },
diff --git a/igb.c b/igb.c
new file mode 100644
index 000..fadc22e
--- /dev/null
+++ b/igb.c
@@ -0,0 +1,864 @@
+/* Copyright (c) 2007 Intel Corporation */
+#include stdio.h
+#include ethtool-util.h
+
+/* Register Bit Masks */
+/* Device Control */
+#define E1000_CTRL_FD 0x0001  /* Full duplex.0=half; 1=full */
+#define E1000_CTRL_PRIOR  0x0004  /* Priority on PCI. 0=rx,1=fair */
+#define E1000_CTRL_GIOMASTERD 0x0008  /* GIO Master Disable*/
+#define E1000_CTRL_TME0x0010  /* Test mode. 0=normal,1=test */
+#define E1000_CTRL_SLE0x0020  /* Serial Link on 0=dis,1=en */
+#define E1000_CTRL_ASDE   0x0020  /* Auto-speed detect enable */
+#define E1000_CTRL_SLU0x0040  /* Set link up (Force Link) */
+#define E1000_CTRL_ILOS   0x0080  /* Invert Loss-Of Signal */
+#define E1000_CTRL_SPD_SEL0x0300  /* Speed Select Mask */
+#define E1000_CTRL_SPD_10 0x  /* Force 10Mb */
+#define E1000_CTRL_SPD_1000x0100  /* Force 100Mb */
+#define E1000_CTRL_SPD_1000   0x0200  /* Force 1Gb */
+#define E1000_CTRL_FRCSPD 0x0800  /* Force Speed */
+#define E1000_CTRL_FRCDPX 0x1000  /* Force Duplex */
+#define E1000_CTRL_SDP0_GPIEN 0x0001  /* General Purpose Interrupt 
Detection Enable for SDP0 */
+#define E1000_CTRL_SDP1_GPIEN 0x0002  /* General Purpose Interrupt 
Detection Enable for SDP1 */
+#define E1000_CTRL_SDP0_DATA  0x0004  /* SDP0 Data */
+#define E1000_CTRL_SDP1_DATA  0x0008  /* SDP1 Data */
+#define E1000_CTRL_ADVD3WUC   0x0010  /* D3Cold WakeUp Capability 
Advertisement Enable */
+#define E1000_CTRL_SDP0_WDE   0x0020  /* Watchdog Indication for SDP0 */
+#define E1000_CTRL_SDP1_IODIR 0x0040  /* SDP1 Directionality */
+#define E1000_CTRL_RST0x0400  /* Global reset */
+#define E1000_CTRL_RFCE   0x0800  /* Receive Flow Control enable */
+#define E1000_CTRL_TFCE   0x1000  /* Transmit flow control enable */
+#define E1000_CTRL_VME0x4000  /* IEEE VLAN mode enable */
+#define E1000_CTRL_PHY_RST0x8000  /* PHY Reset */
+
+/* Device Status */
+#define E1000_STATUS_FD  0x0001  /* Full duplex.0=half,1=full 
*/
+#define E1000_STATUS_LU  0x0002  /* Link up.0=no,1=link */
+#define E1000_STATUS_LANID   0x0008  /* LAN ID */
+#define E1000_STATUS_TXOFF   0x0010  /* transmission paused */
+#define E1000_STATUS_TBIMODE 0x0020  /* TBI mode */
+#define E1000_STATUS_SPEED_MASK  0x00C0  /* Speed Mask */
+#define E1000_STATUS_SPEED_100x  /* Speed 10Mb/s */
+#define E1000_STATUS_SPEED_100   0x0040  /* Speed 100Mb/s */
+#define E1000_STATUS_SPEED_1000  0x0080  /* Speed 1000Mb/s */
+#define E1000_STATUS_ASDV0x0300  /* Auto speed detect value */
+#define E1000_STATUS_PHYRA   0x0400  /* PHY Reset Asserted */
+#define E1000_STATUS_GIOMASTEREN 0x0008  /* GIO Master Enable Status */
+#define E1000_STATUS_DMA_CGEN0x8000  /* DMA clock gating Enable */
+
+/* Receive Control */
+#define E1000_RCTL_EN   

Re: [GENETLINK] some thoughts on the usage

2007-08-15 Thread Thomas Graf
* Richard MUSIL [EMAIL PROTECTED] 2007-08-10 10:45
 I have noticed that although ops for each family are the same (each
 device is functionally same) I cannot use same genl_ops struct for
 registration, because it uses internal member to link in list. Therefore
 it is necessary to allocate new genl_ops for each device and pass it to
 registration. But I cannot officially use this list to track those
 genl_ops (so I can properly destroy them later), because there is no
 interface. So I need to redo the management of the structures on my own.

The intended usage of the interface in your example would be to register
only one genetlink family, say tpm, register one set of operations
and then have an attribute in every message which specifies which TPM
device to use. This helps keeping the total number of genetlink families
down.

 The second inconvenience is that for each family I register, I also
 register basically same ops (basically means, the definitions, and doit,
 dumpit handlers are same, though the structures are at different
 addresses for reasons described above). When the handler receives the
 message it needs to associate the message with the actual device it is
 handling. This could be done through family lookup (using
 nlmsghdr::nlmsg_type), but I wondered if it would make sense to extend
 genl_family for user custom data pointer and then pass this custom data
 (or genl_family reference) to each handler (for example inside
 genl_info). It is already parsed by genetlink layer, so it should not
 slow things down.

That's not a bad idea, although I think we should try and keep the
generic netlink part as simple as possible. There is a family specific
header, referred to as user header in genl_info which is basically
what you're looking for with the custom header. I believe making the
generic netlink family aware of anything beyond family id and operations
id only complicates things.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread David Howells
Herbert Xu [EMAIL PROTECTED] wrote:

 Let's turn this around.  Can you give a single example where
 the volatile semantics is needed in a legitimate way?

Accessing H/W registers?  But apart from that...

David
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Potential u32 classifier bug.

2007-08-15 Thread Waskiewicz Jr, Peter P
 * Waskiewicz Jr, Peter P [EMAIL PROTECTED] 
 2007-08-15 11:02
   There is this very horrible way of using the u32 
 classifier with a 
   negative offset to look into the ethernet header.
  
  Based on this, it sounds like u32 using protocol 802_3 is broken?
 
 You might be expecting too much from u32. The protocol given 
 to u32 is just a filter, it doesn't imply anything beyond that.
 u32 has its usage the way it is, that's way we've added an 
 ematch rather than extending u32 itself.

Ok, that clarifies it a bit.  I've just found a few examples on the net,
one of which is in a TC filter manual
(http://tcn.hypert.net/tcmanual.pdf, section 2.2.1.3 at the bottom of
the section), that was using u32 to simply filter on dest MAC address
without anything elaborate.  Either it worked way back when, or it was a
bogus example.

Thanks again Thomas,

-PJ
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Potential u32 classifier bug.

2007-08-15 Thread Waskiewicz Jr, Peter P
 * Waskiewicz Jr, Peter P [EMAIL PROTECTED] 
 2007-08-09 18:07
  My big question is: Has anyone recently used the 802_3 
 protocol in tc 
  with u32 and actually gotten it to work?  I can't see how the
  u32_classify() code can look at the mac header, since it is 
 using the 
  network header accessor to start looking.  I think this is an issue 
  with the classification code, but I'm looking to see if I'm doing 
  something stupid before I really start digging into this mess.
 
 There is this very horrible way of using the u32 classifier 
 with a negative offset to look into the ethernet header.

Based on this, it sounds like u32 using protocol 802_3 is broken?

 You might want to look into the cmp ematch which can be 
 attached to almost any classifier. It allows basing offsets 
 on any layer thus making ethernet header filtering trivial.

I'll look at this.  Thanks Thomas for the response!

-PJ
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Potential u32 classifier bug.

2007-08-15 Thread Thomas Graf
* Waskiewicz Jr, Peter P [EMAIL PROTECTED] 2007-08-15 11:02
  There is this very horrible way of using the u32 classifier 
  with a negative offset to look into the ethernet header.
 
 Based on this, it sounds like u32 using protocol 802_3 is broken?

You might be expecting too much from u32. The protocol given
to u32 is just a filter, it doesn't imply anything beyond that.
u32 has its usage the way it is, that's way we've added an ematch
rather than extending u32 itself.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Paul E. McKenney
On Wed, Aug 15, 2007 at 07:19:57PM +0100, David Howells wrote:
 Herbert Xu [EMAIL PROTECTED] wrote:
 
  Let's turn this around.  Can you give a single example where
  the volatile semantics is needed in a legitimate way?
 
 Accessing H/W registers?  But apart from that...

Communicating between process context and interrupt/NMI handlers using
per-CPU variables.

Thanx, Paul
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] bonding: eliminate RTNL assertion spew

2007-08-15 Thread Mike Snitzer
I'd very much like to help out.  The rtnl assertion spew isn't
instilling confidence in customers I've been working with.  If you'd
like to send me patches in private I'd help test them ASAP.

Could you elaborate on the associated risk of _not_ fixing these
issues?  balance-alb _seems_ to be working even though these traces
occur on initialization.  But these rtnl traces are clearly more
generic than balance-alb.

regards,
Mike

On 8/14/07, Andy Gospodarek [EMAIL PROTECTED] wrote:
 On 8/14/07, Mike Snitzer [EMAIL PROTECTED] wrote:
  Andy,
 
  Is there an updated version of this patch?
 
  Please advise, thanks.
 
 

 Mike,

 There is a version that Jay and I have been testing and if you would
 like to help out, we could probably send you some patches.

 Jay has split the entire patch into smaller chunks, and we hope to
 post the entire thing quite soon (days not months).

 -andy

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool

Volatile behaviour itself isn't consistently defined (at least
definitely not consistently implemented in various gcc versions across
platforms),


It should be consistent across platforms; if not, file a bug please.


but it is /expected/ to mean something like: ensure that
every such access actually goes all the way to memory, and is not
re-ordered w.r.t. to other accesses, as far as the compiler can take
care of these. The last as far as compiler can take care disclaimer
comes about due to CPUs doing their own re-ordering nowadays.


You can *expect* whatever you want, but this isn't in line with
reality at all.

volatile _does not_ make accesses go all the way to memory.
volatile _does not_ prevent reordering wrt other accesses.

What volatile does are a) never optimise away a read (or write)
to the object, since the data can change in ways the compiler
cannot see; and b) never move stores to the object across a
sequence point.  This does not mean other accesses cannot be
reordered wrt the volatile access.

If the abstract machine would do an access to a volatile-
qualified object, the generated machine code will do that
access too.  But, for example, it can still be optimised
away by the compiler, if it can prove it is allowed to.

If you want stuff to go all the way to memory, you need some
architecture-specific flush sequence; to make a store globally
visible before another store, you need mb(); before some following
read, you need mb(); to prevent reordering you need a barrier.


Segher

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-15 Thread Segher Boessenkool
Well if there is only one memory location involved, then smp_rmb() 
isn't

going to really do anything anyway, so it would be incorrect to use it.


rmb() orders *any* two reads; that includes two reads from the same
location.


Consider that smp_rmb basically will do anything from flushing the
pipeline to invalidating loads speculatively executed out of order. 
AFAIK
it will not control the visibility of stores coming from other CPUs 
(that

is up to the cache coherency).


The writer side should typically use wmb() whenever the reader side
uses rmb(), sure.


Segher

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Paul E. McKenney
On Wed, Aug 15, 2007 at 08:31:25PM +0200, Segher Boessenkool wrote:
 How does the compiler know that msleep() has got barrier()s?
 
 Because msleep_interruptible() is in a separate compilation unit,
 the compiler has to assume that it might modify any arbitrary global.
 
 No; compilation units have nothing to do with it, GCC can optimise
 across compilation unit boundaries just fine, if you tell it to
 compile more than one compilation unit at once.

Last I checked, the Linux kernel build system did compile each .c file
as a separate compilation unit.

 What you probably mean is that the compiler has to assume any code
 it cannot currently see can do anything (insofar as allowed by the
 relevant standards etc.)

Indeed.

 In many cases, the compiler also has to assume that 
 msleep_interruptible()
 might call back into a function in the current compilation unit, thus
 possibly modifying global static variables.
 
 It most often is smart enough to see what compilation-unit-local
 variables might be modified that way, though :-)

Yep.  For example, if it knows the current value of a given such local
variable, and if all code paths that would change some other variable
cannot be reached given that current value of the first variable.
At least given that gcc doesn't know about multiple threads of execution!

Thanx, Paul
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[patch 0/4] Update network drivers to use devres

2007-08-15 Thread Brandon Philips
These patches update the network driver core, e100 and e1000 driver to use
devres.  Devres is a simple resource manager for device drivers, see
Documentation/driver-model/devres.txt for more information.

The use of devres will continue to be optional with this patch set and can be
applied to drivers where it makes sense.

This series includes all of the feedback from the final RFC.  Thanks :)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[patch 1/4] Update net core to use devres.

2007-08-15 Thread Brandon Philips
* netdev_pci_remove_one() can replace simple pci device remove
  functions

* devm_alloc_netdev() is like alloc_netdev but allocates memory using devres.

Signed-off-by: Brandon Philips [EMAIL PROTECTED]

---
 include/linux/etherdevice.h |5 ++
 include/linux/netdevice.h   |7 ++
 net/core/dev.c  |  105 +++-
 net/ethernet/eth.c  |8 +++
 4 files changed, 115 insertions(+), 10 deletions(-)

Index: linux-2.6/include/linux/netdevice.h
===
--- linux-2.6.orig/include/linux/netdevice.h
+++ linux-2.6/include/linux/netdevice.h
@@ -656,6 +656,7 @@ extern int  dev_queue_xmit(struct sk_buf
 extern int register_netdevice(struct net_device *dev);
 extern voidunregister_netdevice(struct net_device *dev);
 extern voidfree_netdev(struct net_device *dev);
+extern voidnetdev_pci_remove_one(struct pci_dev *pdev);
 extern voidsynchronize_net(void);
 extern int register_netdevice_notifier(struct notifier_block *nb);
 extern int unregister_netdevice_notifier(struct notifier_block 
*nb);
@@ -1085,8 +1086,14 @@ extern void  ether_setup(struct net_devi
 extern struct net_device *alloc_netdev_mq(int sizeof_priv, const char *name,
   void (*setup)(struct net_device *),
   unsigned int queue_count);
+extern struct net_device *devm_alloc_netdev_mq(struct device *dev,
+  int sizeof_priv, const char *name,
+  void (*setup)(struct net_device *),
+  unsigned int queue_count);
 #define alloc_netdev(sizeof_priv, name, setup) \
alloc_netdev_mq(sizeof_priv, name, setup, 1)
+#define devm_alloc_netdev(dev, sizeof_priv, name, setup) \
+   devm_alloc_netdev_mq(dev, sizeof_priv, name, setup, 1)
 extern int register_netdev(struct net_device *dev);
 extern voidunregister_netdev(struct net_device *dev);
 /* Functions used for secondary unicast and multicast support */
Index: linux-2.6/net/core/dev.c
===
--- linux-2.6.orig/net/core/dev.c
+++ linux-2.6/net/core/dev.c
@@ -89,6 +89,7 @@
 #include linux/interrupt.h
 #include linux/if_ether.h
 #include linux/netdevice.h
+#include linux/pci.h
 #include linux/etherdevice.h
 #include linux/notifier.h
 #include linux/skbuff.h
@@ -3658,18 +3659,46 @@ static struct net_device_stats *internal
 }
 
 /**
- * alloc_netdev_mq - allocate network device
- * @sizeof_priv:   size of private data to allocate space for
- * @name:  device name format string
- * @setup: callback to initialize device
- * @queue_count:   the number of subqueues to allocate
+ * devm_free_netdev - wrapper around free_netdev for devres
+ */
+static void devm_free_netdev(struct device *gendev, void *res)
+{
+   struct net_device **dev = (struct net_device **)res;
+   free_netdev(*dev);
+}
+
+/**
+ * register_netdev_devres - register netdev with a managed device
+ * @dev:   devres managed device responsible for the memory
+ * @netdev:pointer to netdev to be managed
  *
- * Allocates a struct net_device with private data area for driver use
- * and performs basic initialization.  Also allocates subquue structs
- * for each queue on the device at the end of the netdevice.
+ * Registers @netdev to the device @dev and calls free_netdev automatically
+ * when the device disappears
  */
-struct net_device *alloc_netdev_mq(int sizeof_priv, const char *name,
-   void (*setup)(struct net_device *), unsigned int queue_count)
+static void *register_netdev_devres(struct device *gendev,
+   struct net_device *dev)
+{
+   struct net_device **p;
+
+   p = devres_alloc(devm_free_netdev, sizeof(*p), GFP_KERNEL);
+
+   if (unlikely(!p))
+   return NULL;
+
+   *p = dev;
+   devres_add(gendev, p);
+
+   return dev;
+}
+
+/**
+ * __alloc_netdev_mq - does the work to allocate a network device
+ * @dev:   devres managed device responsible for mem.
+ * NULL if unmanaged
+ */
+struct net_device *__alloc_netdev_mq(struct device *gendev, int sizeof_priv,
+   const char *name, void (*setup)(struct net_device *),
+   unsigned int queue_count)
 {
void *p;
struct net_device *dev;
@@ -3706,8 +3735,44 @@ struct net_device *alloc_netdev_mq(int s
dev-get_stats = internal_stats;
setup(dev);
strcpy(dev-name, name);
+
+   /* If we are given a device then manage this netdev with devres */
+   if (gendev != NULL)
+   return register_netdev_devres(gendev, dev);
+
return dev;
 }
+
+/**
+ * 

[patch 2/4] Update e100 driver to use devres.

2007-08-15 Thread Brandon Philips
devres manages device resources and is currently used by all libata low level
drivers.   When devres is used in a driver the complexity of error handling in
the device probe and remove functions can be reduced.

In this case e100_free(), e100_remove() and all of the gotos in e100_probe have
been removed.  

Signed-off-by: Brandon Philips [EMAIL PROTECTED]

---
 drivers/net/e100.c |   78 +
 1 file changed, 20 insertions(+), 58 deletions(-)

Index: linux-2.6/drivers/net/e100.c
===
--- linux-2.6.orig/drivers/net/e100.c
+++ linux-2.6/drivers/net/e100.c
@@ -2517,18 +2517,11 @@ static int e100_do_ioctl(struct net_devi
 
 static int e100_alloc(struct nic *nic)
 {
-   nic-mem = pci_alloc_consistent(nic-pdev, sizeof(struct mem),
-   nic-dma_addr);
-   return nic-mem ? 0 : -ENOMEM;
-}
+   struct device *dev = nic-pdev-dev;
 
-static void e100_free(struct nic *nic)
-{
-   if(nic-mem) {
-   pci_free_consistent(nic-pdev, sizeof(struct mem),
-   nic-mem, nic-dma_addr);
-   nic-mem = NULL;
-   }
+   nic-mem = dmam_alloc_coherent(dev, sizeof(struct mem),
+   nic-dma_addr, GFP_ATOMIC);
+   return nic-mem ? 0 : -ENOMEM;
 }
 
 static int e100_open(struct net_device *netdev)
@@ -2554,8 +2547,10 @@ static int __devinit e100_probe(struct p
struct net_device *netdev;
struct nic *nic;
int err;
+   void __iomem **iomap;
+   int bar;
 
-   if(!(netdev = alloc_etherdev(sizeof(struct nic {
+   if (!(netdev = devm_alloc_etherdev(pdev-dev, sizeof(struct nic {
if(((1  debug) - 1)  NETIF_MSG_PROBE)
printk(KERN_ERR PFX Etherdev alloc failed, abort.\n);
return -ENOMEM;
@@ -2585,26 +2580,28 @@ static int __devinit e100_probe(struct p
nic-msg_enable = (1  debug) - 1;
pci_set_drvdata(pdev, netdev);
 
-   if((err = pci_enable_device(pdev))) {
+   if ((err = pcim_enable_device(pdev))) {
DPRINTK(PROBE, ERR, Cannot enable PCI device, aborting.\n);
-   goto err_out_free_dev;
+   return err;
}
 
if(!(pci_resource_flags(pdev, 0)  IORESOURCE_MEM)) {
DPRINTK(PROBE, ERR, Cannot find proper PCI device 
base address, aborting.\n);
err = -ENODEV;
-   goto err_out_disable_pdev;
+   return err;
}
 
-   if((err = pci_request_regions(pdev, DRV_NAME))) {
+   bar = use_io ? 1 : 0;
+   if((err = pcim_iomap_regions(pdev, 1  bar, DRV_NAME))) {
DPRINTK(PROBE, ERR, Cannot obtain PCI resources, aborting.\n);
-   goto err_out_disable_pdev;
+   return err;
}
+   iomap = pcim_iomap_table(pdev)[bar];
 
if((err = pci_set_dma_mask(pdev, DMA_32BIT_MASK))) {
DPRINTK(PROBE, ERR, No usable DMA configuration, aborting.\n);
-   goto err_out_free_res;
+   return err;
}
 
SET_MODULE_OWNER(netdev);
@@ -2613,12 +2610,6 @@ static int __devinit e100_probe(struct p
if (use_io)
DPRINTK(PROBE, INFO, using i/o access mode\n);
 
-   nic-csr = pci_iomap(pdev, (use_io ? 1 : 0), sizeof(struct csr));
-   if(!nic-csr) {
-   DPRINTK(PROBE, ERR, Cannot map device registers, aborting.\n);
-   err = -ENOMEM;
-   goto err_out_free_res;
-   }
 
if(ent-driver_data)
nic-flags |= ich;
@@ -2650,11 +2641,11 @@ static int __devinit e100_probe(struct p
 
if((err = e100_alloc(nic))) {
DPRINTK(PROBE, ERR, Cannot alloc driver memory, aborting.\n);
-   goto err_out_iounmap;
+   return err;
}
 
if((err = e100_eeprom_load(nic)))
-   goto err_out_free;
+   return err;
 
e100_phy_init(nic);
 
@@ -2664,8 +2655,7 @@ static int __devinit e100_probe(struct p
if (!eeprom_bad_csum_allow) {
DPRINTK(PROBE, ERR, Invalid MAC address from 
EEPROM, aborting.\n);
-   err = -EAGAIN;
-   goto err_out_free;
+   return -EAGAIN;
} else {
DPRINTK(PROBE, ERR, Invalid MAC address from EEPROM, 
you MUST configure one.\n);
@@ -2685,7 +2675,7 @@ static int __devinit e100_probe(struct p
strcpy(netdev-name, eth%d);
if((err = register_netdev(netdev))) {
DPRINTK(PROBE, ERR, Cannot register net device, aborting.\n);
-   goto err_out_free;
+   return err;
}
 
DPRINTK(PROBE, INFO, addr 0x%llx, irq %d, 
@@ -2695,36 +2685,8 @@ static int __devinit e100_probe(struct p

[patch 3/4] Implement devm_kcalloc

2007-08-15 Thread Brandon Philips
devm_kcalloc is a simple wrapper around devm_kzalloc for arrays.  This is
needed because kcalloc is often used in network devices. 

Signed-off-by: Brandon Philips [EMAIL PROTECTED]

---
 drivers/base/devres.c  |   16 
 include/linux/device.h |2 ++
 2 files changed, 18 insertions(+)

Index: linux-2.6/drivers/base/devres.c
===
--- linux-2.6.orig/drivers/base/devres.c
+++ linux-2.6/drivers/base/devres.c
@@ -630,6 +630,22 @@ void * devm_kzalloc(struct device *dev, 
 EXPORT_SYMBOL_GPL(devm_kzalloc);
 
 /**
+ * devm_kcalloc - resource-managed kcalloc
+ * @dev: Device to allocate memory for
+ * @n: number of elements.
+ * @size: element size.
+ * @flags: the type of memory to allocate.
+ */
+void *devm_kcalloc(struct device *dev, size_t n, size_t size,
+  gfp_t flags)
+{
+   if (n != 0  size  ULONG_MAX / n)
+   return NULL;
+   return devm_kzalloc(dev, n * size, flags);
+}
+EXPORT_SYMBOL_GPL(devm_kcalloc);
+
+/**
  * devm_kfree - Resource-managed kfree
  * @dev: Device this memory belongs to
  * @p: Memory to free
Index: linux-2.6/include/linux/device.h
===
--- linux-2.6.orig/include/linux/device.h
+++ linux-2.6/include/linux/device.h
@@ -402,6 +402,8 @@ extern int devres_release_group(struct d
 
 /* managed kzalloc/kfree for device drivers, no kmalloc, always use kzalloc */
 extern void *devm_kzalloc(struct device *dev, size_t size, gfp_t gfp);
+extern void *devm_kcalloc(struct device *dev, size_t n, size_t size,
+ gfp_t flags);
 extern void devm_kfree(struct device *dev, void *p);
 
 struct device {

-- 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[patch 4/4] Update e1000 driver to use devres.

2007-08-15 Thread Brandon Philips
Conversion of e1000 probe() and remove() to devres.

Signed-off-by: Brandon Philips [EMAIL PROTECTED]
---
 drivers/net/e1000/e1000.h  |1 
 drivers/net/e1000/e1000_main.c |   92 -
 2 files changed, 28 insertions(+), 65 deletions(-)

Index: linux-2.6/drivers/net/e1000/e1000_main.c
===
--- linux-2.6.orig/drivers/net/e1000/e1000_main.c
+++ linux-2.6/drivers/net/e1000/e1000_main.c
@@ -860,15 +860,14 @@ e1000_probe(struct pci_dev *pdev,
 {
struct net_device *netdev;
struct e1000_adapter *adapter;
-   unsigned long mmio_start, mmio_len;
-   unsigned long flash_start, flash_len;
+   unsigned long mmio_len, flash_len;
 
static int cards_found = 0;
static int global_quad_port_a = 0; /* global ksp3 port a indication */
int i, err, pci_using_dac;
uint16_t eeprom_data = 0;
uint16_t eeprom_apme_mask = E1000_EEPROM_APME;
-   if ((err = pci_enable_device(pdev)))
+   if ((err = pcim_enable_device(pdev)))
return err;
 
if (!(err = pci_set_dma_mask(pdev, DMA_64BIT_MASK)) 
@@ -878,20 +877,19 @@ e1000_probe(struct pci_dev *pdev,
if ((err = pci_set_dma_mask(pdev, DMA_32BIT_MASK)) 
(err = pci_set_consistent_dma_mask(pdev, DMA_32BIT_MASK))) {
E1000_ERR(No usable DMA configuration, aborting\n);
-   goto err_dma;
+   return err;
}
pci_using_dac = 0;
}
 
if ((err = pci_request_regions(pdev, e1000_driver_name)))
-   goto err_pci_reg;
+   return err;
 
pci_set_master(pdev);
 
-   err = -ENOMEM;
-   netdev = alloc_etherdev(sizeof(struct e1000_adapter));
+   netdev = devm_alloc_etherdev(pdev-dev, sizeof(struct e1000_adapter));
if (!netdev)
-   goto err_alloc_etherdev;
+   return -ENOMEM;
 
SET_MODULE_OWNER(netdev);
SET_NETDEV_DEV(netdev, pdev-dev);
@@ -903,13 +901,11 @@ e1000_probe(struct pci_dev *pdev,
adapter-hw.back = adapter;
adapter-msg_enable = (1  debug) - 1;
 
-   mmio_start = pci_resource_start(pdev, BAR_0);
mmio_len = pci_resource_len(pdev, BAR_0);
 
-   err = -EIO;
-   adapter-hw.hw_addr = ioremap(mmio_start, mmio_len);
+   adapter-hw.hw_addr = pcim_iomap(pdev, BAR_0, mmio_len);
if (!adapter-hw.hw_addr)
-   goto err_ioremap;
+   return -EIO;
 
for (i = BAR_1; i = BAR_5; i++) {
if (pci_resource_len(pdev, i) == 0)
@@ -943,8 +939,8 @@ e1000_probe(struct pci_dev *pdev,
 #endif
strncpy(netdev-name, pci_name(pdev), sizeof(netdev-name) - 1);
 
-   netdev-mem_start = mmio_start;
-   netdev-mem_end = mmio_start + mmio_len;
+   netdev-mem_start = pci_resource_start(pdev, BAR_0);
+   netdev-mem_end = netdev-mem_start + mmio_len;
netdev-base_addr = adapter-hw.io_base;
 
adapter-bd_number = cards_found;
@@ -952,16 +948,15 @@ e1000_probe(struct pci_dev *pdev,
/* setup the private structure */
 
if ((err = e1000_sw_init(adapter)))
-   goto err_sw_init;
+   return err;
 
err = -EIO;
/* Flash BAR mapping must happen after e1000_sw_init
 * because it depends on mac_type */
if ((adapter-hw.mac_type == e1000_ich8lan) 
   (pci_resource_flags(pdev, 1)  IORESOURCE_MEM)) {
-   flash_start = pci_resource_start(pdev, 1);
flash_len = pci_resource_len(pdev, 1);
-   adapter-hw.flash_address = ioremap(flash_start, flash_len);
+   adapter-hw.flash_address = pcim_iomap(pdev, 1, flash_len);
if (!adapter-hw.flash_address)
goto err_flashmap;
}
@@ -1163,29 +1158,11 @@ err_register:
 err_eeprom:
if (!e1000_check_phy_reset_block(adapter-hw))
e1000_phy_hw_reset(adapter-hw);
-
-   if (adapter-hw.flash_address)
-   iounmap(adapter-hw.flash_address);
 err_flashmap:
 #ifdef CONFIG_E1000_NAPI
for (i = 0; i  adapter-num_rx_queues; i++)
dev_put(adapter-polling_netdev[i]);
 #endif
-
-   kfree(adapter-tx_ring);
-   kfree(adapter-rx_ring);
-#ifdef CONFIG_E1000_NAPI
-   kfree(adapter-polling_netdev);
-#endif
-err_sw_init:
-   iounmap(adapter-hw.hw_addr);
-err_ioremap:
-   free_netdev(netdev);
-err_alloc_etherdev:
-   pci_release_regions(pdev);
-err_pci_reg:
-err_dma:
-   pci_disable_device(pdev);
return err;
 }
 
@@ -1224,21 +1201,6 @@ e1000_remove(struct pci_dev *pdev)
 
if (!e1000_check_phy_reset_block(adapter-hw))
e1000_phy_hw_reset(adapter-hw);
-
-   kfree(adapter-tx_ring);
-   kfree(adapter-rx_ring);
-#ifdef CONFIG_E1000_NAPI
-   kfree(adapter-polling_netdev);
-#endif
-
-   

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Paul E. McKenney
On Wed, Aug 15, 2007 at 11:25:05PM +0530, Satyam Sharma wrote:
 Hi Paul,
 On Wed, 15 Aug 2007, Paul E. McKenney wrote:
 
  On Wed, Aug 15, 2007 at 07:17:29PM +0530, Satyam Sharma wrote:
   [...]
   No, I'd actually prefer something like what Christoph Lameter suggested,
   i.e. users (such as above) who want volatile-like behaviour from atomic
   ops can use alternative functions. How about something like:
   
   #define atomic_read_volatile(v)   \
 ({  \
 forget((v)-counter);  \
 ((v)-counter); \
 })
  
  Wouldn't the above forget the value, throw it away, then forget
  that it forgot it, giving non-volatile semantics?
 
 Nope, I don't think so. I wrote the following trivial testcases:
 [ See especially tp4.c and tp4.s (last example). ]

Right.  I should have said wouldn't the compiler be within its rights
to forget the value, throw it away, then forget that it forgot it.
The value coming out of the #define above is an unadorned ((v)-counter),
which has no volatile semantics.

 ==
 $ cat tp1.c # Using volatile access casts
 
 #define atomic_read(a)(*(volatile int *)a)
 
 int a;
 
 void func(void)
 {
   a = 0;
   while (atomic_read(a))
   ;
 }
 ==
 $ gcc -Os -S tp1.c; cat tp1.s
 
 func:
   pushl   %ebp
   movl%esp, %ebp
   movl$0, a
 .L2:
   movla, %eax
   testl   %eax, %eax
   jne .L2
   popl%ebp
   ret
 ==
 $ cat tp2.c # Using nothing; gcc will optimize the whole loop away
 
 #define forget(x)
 
 #define atomic_read(a)\
   ({  \
   forget((a));   \
   (a);\
   })
 
 int a;
 
 void func(void)
 {
   a = 0;
   while (atomic_read(a))
   ;
 }
 ==
 $ gcc -Os -S tp2.c; cat tp2.s
 
 func:
   pushl   %ebp
   movl%esp, %ebp
   popl%ebp
   movl$0, a
   ret
 ==
 $ cat tp3.c # Using a full memory clobber barrier
 
 #define forget(x) asm volatile (:::memory)
 
 #define atomic_read(a)\
   ({  \
   forget((a));   \
   (a);\
   })
 
 int a;
 
 void func(void)
 {
   a = 0;
   while (atomic_read(a))
   ;
 }
 ==
 $ gcc -Os -S tp3.c; cat tp3.s
 
 func:
   pushl   %ebp
   movl%esp, %ebp
   movl$0, a
 .L2:
   cmpl$0, a
   jne .L2
   popl%ebp
   ret
 ==
 $ cat tp4.c # Using a forget(var) macro
 
 #define forget(a) __asm__ __volatile__ ( :=m (a) :m (a))
 
 #define atomic_read(a)\
   ({  \
   forget(a);  \
   (a);\
   })
 
 int a;
 
 void func(void)
 {
   a = 0;
   while (atomic_read(a))
   ;
 }
 ==
 $ gcc -Os -S tp4.c; cat tp4.s
 
 func:
   pushl   %ebp
   movl%esp, %ebp
   movl$0, a
 .L2:
   cmpl$0, a
   jne .L2
   popl%ebp
   ret
 ==
 
 Possibly these were too trivial to expose any potential problems that you
 may have been referring to, so would be helpful if you could write a more
 concrete example / sample code.

The trick is to have a sufficiently complicated expression to force
the compiler to run out of registers.  If the value is non-volatile,
it will refetch it (and expect it not to have changed, possibly being
disappointed by an interrupt handler running on that same CPU).

   Or possibly, implement these volatile atomic ops variants in inline asm
   like the patch that Sebastian Siewior has submitted on another thread just
   a while back.
  
  Given that you are advocating a change (please keep in mind that
  atomic_read() and atomic_set() had volatile semantics on almost all
  platforms), care to give some example where these historical volatile
  semantics are causing a problem?
  [...]
  Can you give even one example
  where the pre-existing volatile semantics are causing enough of a problem
  to justify adding yet more atomic_*() APIs?
 
 Will take this to the other sub-thread ...

OK.

   Of course, if we find there are more callers in the kernel who want the
   volatility behaviour than those who don't care, we can 

Re: [PATCH 1/1] bonding: eliminate RTNL assertion spew

2007-08-15 Thread Jay Vosburgh
Mike Snitzer [EMAIL PROTECTED] wrote:

I'd very much like to help out.  The rtnl assertion spew isn't
instilling confidence in customers I've been working with.  If you'd
like to send me patches in private I'd help test them ASAP.

I'll send you some stuff off-list in a bit.

Could you elaborate on the associated risk of _not_ fixing these
issues?  balance-alb _seems_ to be working even though these traces
occur on initialization.  But these rtnl traces are clearly more
generic than balance-alb.

There are really a couple of things going on.

One danger is that some network device drivers may sleep in
certain critical sections (set MAC address, for example) while bonding
holds some lock.  Most drivers don't have potential sleeps here, but a
few do.  The most notable as I recall are a subset of the tg3 devices,

The other danger is that some callback in the notifier call when
the MAC address changes may sleep.

These are both separate from the RTNL warnings, which are a
notification that the interface is being messed with, but RTNL isn't
held.  The danger here is that a concurrent, independent, operation
could acquire RTNL and simultaneously fiddle with the interface.

The ultimate problem with fixing it is that the locking in
bonding was implemented before these locking constraints existed, and
untangling the locking to conform to the new rules is fairly invovled.
Andy and I have been through several iterations of a final patch, and
we keep finding regressions.

-J

---
-Jay Vosburgh, IBM Linux Technology Center, [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-15 Thread Paul E. McKenney
On Wed, Aug 15, 2007 at 08:51:58PM +0200, Segher Boessenkool wrote:
 Well if there is only one memory location involved, then smp_rmb() 
 isn't
 going to really do anything anyway, so it would be incorrect to use it.
 
 rmb() orders *any* two reads; that includes two reads from the same
 location.

If the two reads are to the same location, all CPUs I am aware of
will maintain the ordering without need for a memory barrier.

Thanx, Paul

 Consider that smp_rmb basically will do anything from flushing the
 pipeline to invalidating loads speculatively executed out of order. 
 AFAIK
 it will not control the visibility of stores coming from other CPUs 
 (that
 is up to the cache coherency).
 
 The writer side should typically use wmb() whenever the reader side
 uses rmb(), sure.
 
 
 Segher
 
 -
 To unsubscribe from this list: send the line unsubscribe netdev in
 the body of a message to [EMAIL PROTECTED]
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma


On Wed, 15 Aug 2007, Segher Boessenkool wrote:

  Volatile behaviour itself isn't consistently defined (at least
  definitely not consistently implemented in various gcc versions across
  platforms),
 
 It should be consistent across platforms; if not, file a bug please.
 
  but it is /expected/ to mean something like: ensure that
  every such access actually goes all the way to memory, and is not
  re-ordered w.r.t. to other accesses, as far as the compiler can take
  ^
  (volatile)

(or, alternatively, other accesses to the same volatile object ...)

  care of these. The last as far as compiler can take care disclaimer
  comes about due to CPUs doing their own re-ordering nowadays.
 
 You can *expect* whatever you want, but this isn't in line with
 reality at all.
 
 volatile _does not_ prevent reordering wrt other accesses.
 [...]
 What volatile does are a) never optimise away a read (or write)
 to the object, since the data can change in ways the compiler
 cannot see; and b) never move stores to the object across a
 sequence point.  This does not mean other accesses cannot be
 reordered wrt the volatile access.
 
 If the abstract machine would do an access to a volatile-
 qualified object, the generated machine code will do that
 access too.  But, for example, it can still be optimised
 away by the compiler, if it can prove it is allowed to.

As (now) indicated above, I had meant multiple volatile accesses to
the same object, obviously.

BTW:

#define atomic_read(a)  (*(volatile int *)(a))
#define atomic_set(a,i) (*(volatile int *)(a) = (i))

int a;

void func(void)
{
int b;

b = atomic_read(a);
atomic_set(a, 20);
b = atomic_read(a);
}

gives:

func:
pushl   %ebp
movla, %eax
movl%esp, %ebp
movl$20, a
movla, %eax
popl%ebp
ret

so the first atomic_read() wasn't optimized away.


 volatile _does not_ make accesses go all the way to memory.
 [...]
 If you want stuff to go all the way to memory, you need some
 architecture-specific flush sequence; to make a store globally
 visible before another store, you need mb(); before some following
 read, you need mb(); to prevent reordering you need a barrier.

Sure, which explains the as far as the compiler can take care bit.
Poor phrase / choice of words, probably.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Bugme-new] [Bug 8891] New: in-kernel rpc generates broken RPCBPROC_GETVERSADDR v4 requests

2007-08-15 Thread Andrew Morton
On Wed, 15 Aug 2007 12:22:51 -0700 (PDT)
[EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=8891
 
Summary: in-kernel rpc generates broken RPCBPROC_GETVERSADDR v4
 requests
Product: Networking
Version: 2.5
  KernelVersion: 2.6.22.1
   Platform: All
 OS/Version: Linux
   Tree: Mainline
 Status: NEW
   Severity: normal
   Priority: P1
  Component: Other
 AssignedTo: [EMAIL PROTECTED]
 ReportedBy: [EMAIL PROTECTED]
 
 
 Most recent kernel where this bug did not occur: 2.6.20

Apparently a regression.

 Distribution: Fedora Core 6
 Hardware Environment: x86_64
 Software Environment: NFS
 Problem Description: When locking a file, an invalid RPCBPROC_GETVERSADDR
 procedure call is sent out
 
 Steps to reproduce:
 on the NFS server, run: tcpdump -xX -pni eth0 -s0 port 111 and src host
 theNFSclient
 on a client running 2.6.22.1, run: flock -x /nfsmount/somefile ls
 tcpdump will show:
 
 17:10:01.290655 IP 141.2.15.141.34572  141.2.1.1.sunrpc: P 0:168(168) ack 1
 win 183 nop,nop,timestamp 966695 115079499
 0x:  4500 00dc 53ad 4000 3f06 bcdc 8d02 0f8d  [EMAIL PROTECTED]
 0x0010:  8d02 0101 870c 006f cdcd 938d db00 c8a7 
 ...o
 0x0020:  8018 00b7 e59d  0101 080a 000e c027  ...'
 0x0030:  06db f94b 8000 00a4 fd48 2a07    ...K.H*.
 0x0040:   0002 0001 86a0  0004  0009  
 0x0050:   0001  0054  03c6  0007  ...T
 0x0060:  6572 6964 796b 6500      eridyke.
 0x0070:   000e    0001  0002  
 0x0080:   0003  0004  0005  0006  
 0x0090:   0007  0007  0009  000a  
 0x00a0:   000c  0014  001c    
 0x00b0:    0001 86b5  0004  0003  
 0x00c0:  7463 7000  0009 3134 312e 322e 312e  tcp.141.2.1.
 0x00d0:  3100   0004 7270 63621...rpcb
 
 Note the 141.2.1.1 in the output.
 
 According to RFC 1833, you can read here the following fields:
 
 RPCBPROC_GETVERSADDR version 4 procedure is being called
 r_prog == 0x000186b5 == 100021 == nfs.lockd
 r_vers == 4
 r_netid == (length 3) tcp
 r_addr == (length 9) 141.2.1.1
 r_owner == (length 4) rpcb
 
 This r_addr member is supposed to contain an universal address. Although I 
 have
 no source for that, the RFC clearly says a service can listen to an address,
 and RPCBPROC_GETVERSADDR is supposed to return an universal address too. From
 this I can conclude that an universal address is supposed to contain the port
 number, and other operating systems are using the format
 a.b.c.d.PortHighByte.PortLowByte (like in FTP, but with . instead of ,). I
 interpret the RFC 1833 so that the port number of the rpcbind, that is, 111, 
 is
 to be appended, so the correct value of r_addr would be 141.2.1.1.0.111. 
 This
 matches other implementations.
 
 Of course, all this does sound like nitpicking, and as the callee is not even
 supposed to look at the port number (this argument is only there so the callee
 can decide correctly which interface address to use for the response). 
 However,
 this omission triggers an apparently unpatched denial of service vulnerability
 against HP-UX 11.11's rpcbind and causes it to quit with a core dump and a bus
 error. When using a correctly formed universal address, or the empty string,
 there is no crash of HP-UX's rpcbind.
 
 And of course it is HP who has to fix the DoS bug - but Linux 2.6.22 triggers
 it by a RFC violation, which is a minor bug to be fixed on Linux's side. By 
 the
 way, does anyone have the right contacts to HP to report this bug? I do not
 have a HP service contract any more, and the only other support place I found
 is their public forum, where I of course can only provide limited information
 about the bug (basically, the same I am posting here).
 
 Maybe the bug is fixed in a more current kernel, I was only using Fedora's
 highly patched distribution kernel and compared it to the sources of the base
 version in the main tree, where I see the very same problem in the source 
 code.
 So I do not think it was Fedora who caused the problem, and am assigning it to
 mainline.
 

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Satyam Sharma
[ The Cc: list scares me. Should probably trim it. ]


On Wed, 15 Aug 2007, Paul E. McKenney wrote:

 On Wed, Aug 15, 2007 at 08:31:25PM +0200, Segher Boessenkool wrote:
  How does the compiler know that msleep() has got barrier()s?
  
  Because msleep_interruptible() is in a separate compilation unit,
  the compiler has to assume that it might modify any arbitrary global.
  
  No; compilation units have nothing to do with it, GCC can optimise
  across compilation unit boundaries just fine, if you tell it to
  compile more than one compilation unit at once.
 
 Last I checked, the Linux kernel build system did compile each .c file
 as a separate compilation unit.
 
  What you probably mean is that the compiler has to assume any code
  it cannot currently see can do anything (insofar as allowed by the
  relevant standards etc.)

I think this was just terminology confusion here again. Isn't any code
that it cannot currently see the same as another compilation unit,
and wouldn't the compilation unit itself expand if we ask gcc to
compile more than one unit at once? Or is there some more specific
definition for compilation unit (in gcc lingo, possibly?)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-15 Thread Segher Boessenkool

Well if there is only one memory location involved, then smp_rmb()
isn't
going to really do anything anyway, so it would be incorrect to use 
it.


rmb() orders *any* two reads; that includes two reads from the same
location.


If the two reads are to the same location, all CPUs I am aware of
will maintain the ordering without need for a memory barrier.


That's true of course, although there is no real guarantee for that.


Segher

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-15 Thread Segher Boessenkool

Well if there is only one memory location involved, then smp_rmb()
isn't
going to really do anything anyway, so it would be incorrect to use
it.


rmb() orders *any* two reads; that includes two reads from the same
location.


If the two reads are to the same location, all CPUs I am aware of
will maintain the ordering without need for a memory barrier.


That's true of course, although there is no real guarantee for that.


A CPU that did not provide this property (cache coherence) would be
most emphatically reviled.


That doesn't have anything to do with coherency as far as I can see.

It's just about the order in which a CPU (speculatively) performs the 
loads

(which isn't necessarily the same as the order in which it executes the
corresponding instructions, even).


So we are pretty safe assuming that CPUs
will provide it.


Yeah, pretty safe.  I just don't like undocumented assumptions :-)


Segher

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Paul E. McKenney
On Thu, Aug 16, 2007 at 01:24:42AM +0530, Satyam Sharma wrote:
 [ The Cc: list scares me. Should probably trim it. ]

Trim away!  ;-)

 On Wed, 15 Aug 2007, Paul E. McKenney wrote:
 
  On Wed, Aug 15, 2007 at 08:31:25PM +0200, Segher Boessenkool wrote:
   How does the compiler know that msleep() has got barrier()s?
   
   Because msleep_interruptible() is in a separate compilation unit,
   the compiler has to assume that it might modify any arbitrary global.
   
   No; compilation units have nothing to do with it, GCC can optimise
   across compilation unit boundaries just fine, if you tell it to
   compile more than one compilation unit at once.
  
  Last I checked, the Linux kernel build system did compile each .c file
  as a separate compilation unit.
  
   What you probably mean is that the compiler has to assume any code
   it cannot currently see can do anything (insofar as allowed by the
   relevant standards etc.)
 
 I think this was just terminology confusion here again. Isn't any code
 that it cannot currently see the same as another compilation unit,
 and wouldn't the compilation unit itself expand if we ask gcc to
 compile more than one unit at once? Or is there some more specific
 definition for compilation unit (in gcc lingo, possibly?)

This is indeed my understanding -- compilation unit is whatever the
compiler looks at in one go.  I have heard the word module used for
the minimal compilation unit covering a single .c file and everything
that it #includes, but there might be a better name for this.

Thanx, Paul
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-15 Thread Paul E. McKenney
On Wed, Aug 15, 2007 at 11:30:05PM +1000, Nick Piggin wrote:
 Paul E. McKenney wrote:
 On Tue, Aug 14, 2007 at 03:34:25PM +1000, Nick Piggin wrote:
 
 Maybe it is the safe way to go, but it does obscure cases where there
 is a real need for barriers.
 
 
 I prefer burying barriers into other primitives.
 
 When they should naturally be there, eg. locking or the RCU primitives,
 I agree.
 
 I don't like having them scattered in various just in case places,
 because it makes both the users and the APIs hard to understand and
 change.

I certainly agree that we shouldn't do volatile just for the fun of it,
and also that use of volatile should be quite rare.

 Especially since several big architectures don't have volatile in their
 atomic_get and _set, I think it would be a step backwards to add them in
 as a just in case thin now (unless there is a better reason).

Good point, except that I would expect gcc's optimization to continue
to improve.  I would like the kernel to be able to take advantage of
improved optimization, which means that we are going to have to make
a few changes.  Adding volatile to atomic_get() and atomic_set() is
IMHO one of those changes.

 Many atomic operations are allowed to be reordered between CPUs, so
 I don't have a good idea for the rationale to order them within the
 CPU (also loads and stores to long and ptr types are not ordered like
 this, although we do consider those to be atomic operations too).
 
 barrier() in a way is like enforcing sequential memory ordering
 between process and interrupt context, wheras volatile is just
 enforcing coherency of a single memory location (and as such is
 cheaper).
 
 
 barrier() is useful, but it has the very painful side-effect of forcing
 the compiler to dump temporaries.  So we do need something that is
 not quite so global in effect.
 
 Yep.
 
 What do you think of this crazy idea?
 
 /* Enforce a compiler barrier for only operations to location X.
 * Call multiple times to provide an ordering between multiple
 * memory locations. Other memory operations can be assumed by
 * the compiler to remain unchanged and may be reordered
 */
 #define order(x) asm volatile( : +m (x))
 
 There was something very similar discussed earlier in this thread,
 with quite a bit of debate as to exactly what the m flag should
 look like.  I suggested something similar named ACCESS_ONCE in the
 context of RCU (http://lkml.org/lkml/2007/7/11/664):
 
 Oh, I missed that earlier debate. Will go have a look.
 
  #define ACCESS_ONCE(x) (*(volatile typeof(x) *)(x))
 
 The nice thing about this is that it works for both loads and stores.
 Not clear that order() above does this -- I get compiler errors when
 I try something like b = order(a) or order(a) = 1 using gcc 4.1.2.
 
 As Arnd ponted out, order() is not supposed to be an lvalue, but a
 statement like the rest of our memory ordering API.
 
 As to your followup question of why to use it over ACCESS_ONCE. I
 guess, aside from consistency with the rest of the barrier APIs, you
 can use it in other primitives when you don't actually know what the
 caller is going to do or if it even will make an access. You could
 also use it between calls to _other_ primitives, etc... it just
 seems more flexible to me, but I haven't actually used such a thing
 in real code...
 
 ACCESS_ONCE doesn't seem as descriptive. What it results in is the
 memory location being loaded or stored (presumably once exactly),
 but I think the more general underlying idea is a barrier point.

OK, first, I am not arguing that ACCESS_ONCE() can replace all current
uses of barrier().  Similarly, rcu_dereference(), rcu_assign_pointer(),
and the various RCU versions of the list-manipulation API in no way
replaced all uses of explicit memory barriers.  However, I do believe that
these API are much easier to use (where they apply, of course) than were
the earlier idioms involving explicit memory barriers.

But we of course need to keep the explicit memory and compiler barriers
for other situations.

Thanx, Paul
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-15 Thread Paul E. McKenney
On Wed, Aug 15, 2007 at 09:46:55PM +0200, Segher Boessenkool wrote:
 Well if there is only one memory location involved, then smp_rmb()
 isn't
 going to really do anything anyway, so it would be incorrect to use 
 it.
 
 rmb() orders *any* two reads; that includes two reads from the same
 location.
 
 If the two reads are to the same location, all CPUs I am aware of
 will maintain the ordering without need for a memory barrier.
 
 That's true of course, although there is no real guarantee for that.

A CPU that did not provide this property (cache coherence) would be
most emphatically reviled.  So we are pretty safe assuming that CPUs
will provide it.

Thanx, Paul
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Christoph Lameter
On Wed, 15 Aug 2007, Stefan Richter wrote:

 LDD3 says on page 125:  The following operations are defined for the
 type [atomic_t] and are guaranteed to be atomic with respect to all
 processors of an SMP computer.
 
 Doesn't atomic WRT all processors require volatility?

Atomic operations only require exclusive access to the cacheline while the 
value is modified.


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [NFS] [Bugme-new] [Bug 8891] New: in-kernel rpc generates broken RPCBPROC_GETVERSADDR v4 requests

2007-08-15 Thread Chuck Lever

Andrew Morton wrote:

On Wed, 15 Aug 2007 12:22:51 -0700 (PDT)
[EMAIL PROTECTED] wrote:


http://bugzilla.kernel.org/show_bug.cgi?id=8891

   Summary: in-kernel rpc generates broken RPCBPROC_GETVERSADDR v4
requests
   Product: Networking
   Version: 2.5
 KernelVersion: 2.6.22.1
  Platform: All
OS/Version: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Other
AssignedTo: [EMAIL PROTECTED]
ReportedBy: [EMAIL PROTECTED]


Most recent kernel where this bug did not occur: 2.6.20


Apparently a regression.


RPCBIND v4 requests are an experimental feature.  They can be disabled 
via a kernel build option (CONFIG_SUNRPC_BIND34).  I think these are 
left enabled in Fedora to find just the type of problem before v3 and v4 
becomes the default.



Distribution: Fedora Core 6
Hardware Environment: x86_64
Software Environment: NFS
Problem Description: When locking a file, an invalid RPCBPROC_GETVERSADDR
procedure call is sent out

Steps to reproduce:
on the NFS server, run: tcpdump -xX -pni eth0 -s0 port 111 and src host
theNFSclient
on a client running 2.6.22.1, run: flock -x /nfsmount/somefile ls
tcpdump will show:

17:10:01.290655 IP 141.2.15.141.34572  141.2.1.1.sunrpc: P 0:168(168) ack 1
win 183 nop,nop,timestamp 966695 115079499
0x:  4500 00dc 53ad 4000 3f06 bcdc 8d02 0f8d  [EMAIL PROTECTED]
0x0010:  8d02 0101 870c 006f cdcd 938d db00 c8a7 
...o

0x0020:  8018 00b7 e59d  0101 080a 000e c027  ...'
0x0030:  06db f94b 8000 00a4 fd48 2a07    ...K.H*.
0x0040:   0002 0001 86a0  0004  0009  
0x0050:   0001  0054  03c6  0007  ...T
0x0060:  6572 6964 796b 6500      eridyke.
0x0070:   000e    0001  0002  
0x0080:   0003  0004  0005  0006  
0x0090:   0007  0007  0009  000a  
0x00a0:   000c  0014  001c    
0x00b0:    0001 86b5  0004  0003  
0x00c0:  7463 7000  0009 3134 312e 322e 312e  tcp.141.2.1.
0x00d0:  3100   0004 7270 63621...rpcb

Note the 141.2.1.1 in the output.

According to RFC 1833, you can read here the following fields:

RPCBPROC_GETVERSADDR version 4 procedure is being called
r_prog == 0x000186b5 == 100021 == nfs.lockd
r_vers == 4
r_netid == (length 3) tcp
r_addr == (length 9) 141.2.1.1
r_owner == (length 4) rpcb

This r_addr member is supposed to contain an universal address. Although I have
no source for that, the RFC clearly says a service can listen to an address,
and RPCBPROC_GETVERSADDR is supposed to return an universal address too. From
this I can conclude that an universal address is supposed to contain the port
number, and other operating systems are using the format
a.b.c.d.PortHighByte.PortLowByte (like in FTP, but with . instead of ,). I
interpret the RFC 1833 so that the port number of the rpcbind, that is, 111, is
to be appended, so the correct value of r_addr would be 141.2.1.1.0.111. This
matches other implementations.


That RFC is unclear on exactly what a universal address should look 
like.  However speidel's interpretation is not unreasonable, and I'm 
glad he was able to check this against other implementations that I 
don't have.  My unit testing against Solaris did not reveal this issue.


I'll look into the problem.
begin:vcard
fn:Chuck Lever
n:Lever;Chuck
org:Oracle Corporation;Corporate Architecture: Linux Projects Group
adr:;;1015 Granger Avenue;Ann Arbor;MI;48104;USA
title:Principal Member of Staff
tel;work:+1 248 614 5091
x-mozilla-html:FALSE
url:http://oss.oracle.com/~cel
version:2.1
end:vcard



Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool

How does the compiler know that msleep() has got barrier()s?


Because msleep_interruptible() is in a separate compilation unit,
the compiler has to assume that it might modify any arbitrary global.


No; compilation units have nothing to do with it, GCC can optimise
across compilation unit boundaries just fine, if you tell it to
compile more than one compilation unit at once.

What you probably mean is that the compiler has to assume any code
it cannot currently see can do anything (insofar as allowed by the
relevant standards etc.)

In many cases, the compiler also has to assume that 
msleep_interruptible()

might call back into a function in the current compilation unit, thus
possibly modifying global static variables.


It most often is smart enough to see what compilation-unit-local
variables might be modified that way, though :-)


Segher

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
I think this was just terminology confusion here again. Isn't any 
code

that it cannot currently see the same as another compilation unit,
and wouldn't the compilation unit itself expand if we ask gcc to
compile more than one unit at once? Or is there some more specific
definition for compilation unit (in gcc lingo, possibly?)


This is indeed my understanding -- compilation unit is whatever the
compiler looks at in one go.  I have heard the word module used for
the minimal compilation unit covering a single .c file and everything
that it #includes, but there might be a better name for this.


Yes, that's what's called compilation unit :-)

[/me double checks]

Erm, the C standard actually calls it translation unit.

To be exact, to avoid any more confusion:

5.1.1.1/1:
A C program need not all be translated at the same time. The
text of the program is kept in units called source files, (or
preprocessing files) in this International Standard. A source
file together with all the headers and source files included
via the preprocessing directive #include is known as a
preprocessing translation unit. After preprocessing, a
preprocessing translation unit is called a translation unit.



Segher

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool

What volatile does are a) never optimise away a read (or write)
to the object, since the data can change in ways the compiler
cannot see; and b) never move stores to the object across a
sequence point.  This does not mean other accesses cannot be
reordered wrt the volatile access.

If the abstract machine would do an access to a volatile-
qualified object, the generated machine code will do that
access too.  But, for example, it can still be optimised
away by the compiler, if it can prove it is allowed to.


As (now) indicated above, I had meant multiple volatile accesses to
the same object, obviously.


Yes, accesses to volatile objects are never reordered with
respect to each other.


BTW:

#define atomic_read(a)  (*(volatile int *)(a))
#define atomic_set(a,i) (*(volatile int *)(a) = (i))

int a;

void func(void)
{
int b;

b = atomic_read(a);
atomic_set(a, 20);
b = atomic_read(a);
}

gives:

func:
pushl   %ebp
movla, %eax
movl%esp, %ebp
movl$20, a
movla, %eax
popl%ebp
ret

so the first atomic_read() wasn't optimized away.


Of course.  It is executed by the abstract machine, so
it will be executed by the actual machine.  On the other
hand, try

b = 0;
if (b)
b = atomic_read(a);

or similar.


Segher

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool

What you probably mean is that the compiler has to assume any code
it cannot currently see can do anything (insofar as allowed by the
relevant standards etc.)


I think this was just terminology confusion here again. Isn't any code
that it cannot currently see the same as another compilation unit,


It is not; try  gcc -combine  or the upcoming link-time optimisation
stuff, for example.


and wouldn't the compilation unit itself expand if we ask gcc to
compile more than one unit at once? Or is there some more specific
definition for compilation unit (in gcc lingo, possibly?)


compilation unit is a C standard term.  It typically boils down
to single .c file.


Segher

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-15 Thread Paul E. McKenney
On Wed, Aug 15, 2007 at 10:13:49PM +0200, Segher Boessenkool wrote:
 Well if there is only one memory location involved, then smp_rmb()
 isn't
 going to really do anything anyway, so it would be incorrect to use
 it.
 
 rmb() orders *any* two reads; that includes two reads from the same
 location.
 
 If the two reads are to the same location, all CPUs I am aware of
 will maintain the ordering without need for a memory barrier.
 
 That's true of course, although there is no real guarantee for that.
 
 A CPU that did not provide this property (cache coherence) would be
 most emphatically reviled.
 
 That doesn't have anything to do with coherency as far as I can see.
 
 It's just about the order in which a CPU (speculatively) performs the 
 loads
 (which isn't necessarily the same as the order in which it executes the
 corresponding instructions, even).

Please check the definition of cache coherence.

Summary: the CPU is indeed within its rights to execute loads and stores
to a single variable out of order, -but- only if it gets the same result
that it would have obtained by executing them in order.  Which means that
any reordering of accesses by a single CPU to a single variable will be
invisible to the software.

 So we are pretty safe assuming that CPUs
 will provide it.
 
 Yeah, pretty safe.  I just don't like undocumented assumptions :-)

Can't help you there!  ;-)

Thanx, Paul
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch 08/18] 3c59x: check return of pci_enable_device()

2007-08-15 Thread Steffen Klassert
On Wed, Aug 15, 2007 at 06:30:00PM +0200, Steffen Klassert wrote:
 On Tue, Aug 14, 2007 at 10:54:32AM +0100, Mark Hindley wrote:
  On Tue, Aug 14, 2007 at 01:33:26AM -0400, Jeff Garzik wrote:
   I would strongly prefer that vortex_up return a value, since all the 
   important callers of this function can themselves return an error back 
   to the system.
   
   we can definitely return a meaningful return value here, if 
   pci_enable_device() fails, and I would rather not apply a patch that 
   fails to propagate a serious condition (pci_enable_device failure is 
   indeed serious) when it is possible to do so
   
  
  OK. Any comments on this revised version? I have only ignored the return of
  vortex_up in vortex_error. It is not immediately clear what to do if
  vortex_up still fails there after a pci reset.
  
  Mark
  
  
  diff --git a/drivers/net/3c59x.c b/drivers/net/3c59x.c
  index 001c66d..a1dfd6b 100644
  --- a/drivers/net/3c59x.c
  +++ b/drivers/net/3c59x.c
  @@ -705,7 +705,7 @@ static struct {
   
   static int vortex_probe1(struct device *gendev, void __iomem *ioaddr, int 
  irq,
 int chip_idx, int card_idx);
  -static void vortex_up(struct net_device *dev);
  +static int vortex_up(struct net_device *dev);
   static void vortex_down(struct net_device *dev, int final);
   static int vortex_open(struct net_device *dev);
   static void mdio_sync(void __iomem *ioaddr, int bits);
  @@ -841,8 +841,11 @@ static int vortex_resume(struct pci_dev *pdev)
  return -EBUSY;
  }
  if (netif_running(dev)) {
  -   vortex_up(dev);
  -   netif_device_attach(dev);
  +   err = vortex_up(dev);
  +   if (err)
  +   return err;
  +   else  
  +   netif_device_attach(dev);
  }
  }
  return 0;
 
 I think we should free the requested irq if vortex_up really fails here.
 

I was wrong, this will be done in vortex_close. So it is OK as it is.

Steffen 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
Of course, if we find there are more callers in the kernel who want 
the

volatility behaviour than those who don't care, we can re-define the
existing ops to such variants, and re-name the existing definitions 
to

somethine else, say atomic_read_nonvolatile for all I care.


Do we really need another set of APIs?


Well, if there's one set of users who do care about volatile behaviour,
and another set that doesn't, it only sounds correct to provide both
those APIs, instead of forcing one behaviour on all users.


But since there currently is only one such API, and there are
users expecting the stronger behaviour, the only sane thing to
do is let the API provide that behaviour.  You can always add
a new API with weaker behaviour later, and move users that are
okay with it over to that new API.


Segher

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool

No; compilation units have nothing to do with it, GCC can optimise
across compilation unit boundaries just fine, if you tell it to
compile more than one compilation unit at once.


Last I checked, the Linux kernel build system did compile each .c file
as a separate compilation unit.


I have some patches to use -combine -fwhole-program for Linux.
Highly experimental, you need a patched bleeding edge toolchain.
If there's interest I'll clean it up and put it online.

David Woodhouse had some similar patches about a year ago.


In many cases, the compiler also has to assume that
msleep_interruptible()
might call back into a function in the current compilation unit, thus
possibly modifying global static variables.


It most often is smart enough to see what compilation-unit-local
variables might be modified that way, though :-)


Yep.  For example, if it knows the current value of a given such local
variable, and if all code paths that would change some other variable
cannot be reached given that current value of the first variable.


Or the most common thing: if neither the address of the translation-
unit local variable nor the address of any function writing to that
variable can escape from that translation unit, nothing outside
the translation unit can write to the variable.

At least given that gcc doesn't know about multiple threads of 
execution!


Heh, only about the threads it creates itself (not relevant to
the kernel, for sure :-) )


Segher

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


skge and sky2 stop working after a few minutes.

2007-08-15 Thread Sébastien Judenherc
Hello,
I can see some very strange problem with sky2 and skge interfaces with 2.6.22,
and also 2.6.23-rc2/3.
After 8-9 minutes, the interfaces stop working. ethtool reports that the link is
down and the only way to make the interfaces usable again is
removing/reinserting the module or running ethtool -r.
This occurs for both sky2 and skge for two PCs connected to a gigabit switch.
The other PCs on the same switch continue to work fine (most are 100Mbit/s). It
also occurs on a third PC with skge, directly connected to a 100Mbit/s interface
over a cross over cable. I didn't notice the problem with 2.6.19.

After 8-9 minutes, the interface stops working. ethtool reports that the link is
down and the only way to make the interface usable again is removing/reinserting
the module or running ethtool -r.

For the skge interface connected to the gigabit switch, here is what ethtool
reports:

Settings for eth0:
Supported ports: [ TP ]
Supported link modes:   10baseT/Half 10baseT/Full 
100baseT/Half 100baseT/Full 
1000baseT/Half 1000baseT/Full 
Supports auto-negotiation: Yes
Advertised link modes:  10baseT/Half 10baseT/Full 
100baseT/Half 100baseT/Full 
1000baseT/Half 1000baseT/Full 
Advertised auto-negotiation: Yes
Speed: Unknown! (65535)
Duplex: Unknown! (255)
Port: Twisted Pair
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: pg
Wake-on: g
Current message level: 0x0037 (55)
Link detected: no
NIC statistics:
 tx_bytes: 0
 rx_bytes: 0
 tx_broadcast: 0
 rx_broadcast: 0
 tx_multicast: 0
 rx_multicast: 0
 tx_unicast: 0
 rx_unicast: 0
 tx_mac_pause: 0
 rx_mac_pause: 0
 collisions: 0
 multi_collisions: 0
 aborted: 0
 late_collision: 0
 fifo_underrun: 0
 fifo_overflow: 0
 rx_toolong: 0
 rx_jabber: 0
 rx_runt: 0
 rx_too_long: 0
 rx_fcs_error: 0

and here is what dmesg says just after the hangs with debug=16 for skge:
...
eth0: tx queued, slot 127, len 1098
skge eth0: tx done slot 127
skge eth0: rx slot 394 status 0x420100 len 66
skge eth0: rx slot 395 status 0x420100 len 66
eth0: tx queued, slot 0, len 42
skge eth0: tx done slot 0
eth0: tx queued, slot 1, len 74
skge eth0: tx done slot 1
skge eth0: rx slot 396 status 0x420100 len 66
eth0: tx queued, slot 2, len 1414
skge eth0: tx done slot 2
skge eth0: rx slot 397 status 0x420100 len 66
eth0: tx queued, slot 3, len 270
skge eth0: tx done slot 3
skge eth0: rx slot 398 status 0x420100 len 66
eth0: tx queued, slot 4, len 74
skge eth0: tx done slot 4
skge eth0: rx slot 399 status 0x420100 len 66
skge eth0: phy interrupt status 0x400 0x814c
skge eth0: Link is down.

I would be happy to help.

regards
sébastien.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
Possibly these were too trivial to expose any potential problems that 
you
may have been referring to, so would be helpful if you could write a 
more

concrete example / sample code.


The trick is to have a sufficiently complicated expression to force
the compiler to run out of registers.


You can use -ffixed-XXX to keep the testcase simple.


Segher

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-15 Thread Segher Boessenkool

Please check the definition of cache coherence.


Which of the twelve thousand such definitions?  :-)

Summary: the CPU is indeed within its rights to execute loads and 
stores
to a single variable out of order, -but- only if it gets the same 
result
that it would have obtained by executing them in order.  Which means 
that

any reordering of accesses by a single CPU to a single variable will be
invisible to the software.


I'm still not sure if that applies to all architectures.
Doesn't matter anyway, let's kill this thread :-)


Segher

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: linux kernel 2.6.18-20 bug: rcu_read_unlock in __sock_create

2007-08-15 Thread David Miller
From: Herbert Xu [EMAIL PROTECTED]
Date: Wed, 15 Aug 2007 16:33:35 +0800

 [NET]: Fix unbalanced rcu_read_unlock in __sock_create
 
 The recent RCU work created an unbalanced rcu_read_unlock
 in __sock_create.  This patch fixes that.  Reported by
 oleg 123.
 
 Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Patch applied, thanks Herbert.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] [IPv6]: Invalid semicolon after if statement

2007-08-15 Thread Ilpo Järvinen

A similar fix to netfilter from Eric Dumazet inspired me to
look around a bit by using some grep/sed stuff as looking for
this kind of bugs seemed easy to automate. This is one of them
I found where it looks like this semicolon is not valid.

Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED]
---
 net/ipv6/ipv6_sockglue.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index d684639..761a910 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -820,7 +820,7 @@ static int ipv6_getsockopt_sticky(struct sock *sk, struct 
ipv6_txoptions *opt,
return 0;
 
len = min_t(unsigned int, len, ipv6_optlen(hdr));
-   if (copy_to_user(optval, hdr, len));
+   if (copy_to_user(optval, hdr, len))
return -EFAULT;
return ipv6_optlen(hdr);
 }
-- 
1.5.0.6

Re: [PATCH] [IPv6]: Invalid semicolon after if statement

2007-08-15 Thread David Miller
From: Ilpo_Järvinen [EMAIL PROTECTED]
Date: Thu, 16 Aug 2007 00:57:00 +0300 (EEST)

 A similar fix to netfilter from Eric Dumazet inspired me to
 look around a bit by using some grep/sed stuff as looking for
 this kind of bugs seemed easy to automate. This is one of them
 I found where it looks like this semicolon is not valid.
 
 Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED]

Yikes!  Makes you want to audit the entire tree for these
things :-)))

Patch applied, thanks a lot Ilpo!
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2.6.24 1/5]S2io: Enable all the error and alarm indications

2007-08-15 Thread Ramkrishna Vepa
- Added support to unmask entire set of device errors and alarms.

Signed-off-by: Sivakumar Subramani [EMAIL PROTECTED]
Signed-off-by: Santosh Rastapur [EMAIL PROTECTED]
Signed-off-by: Ramkrishna Vepa [EMAIL PROTECTED]
---
diff -Nurp orig/drivers/net/s2io.c patch1/drivers/net/s2io.c
--- orig/drivers/net/s2io.c 2007-08-15 08:04:06.0 -0700
+++ patch1/drivers/net/s2io.c   2007-08-15 08:55:53.0 -0700
@@ -892,8 +892,9 @@ static void free_shared_mem(struct s2io_
k++;
}
kfree(mac_control-rings[i].ba[j]);
-   nic-mac_control.stats_info-sw_stat.mem_freed  
+= (sizeof(struct buffAdd) * 
-   (rxd_count[nic-rxd_mode] + 1));
+   nic-mac_control.stats_info-sw_stat.mem_freed 
+= 
+   (sizeof(struct buffAdd) * 
+   (rxd_count[nic-rxd_mode] + 1));
}
kfree(mac_control-rings[i].ba);
nic-mac_control.stats_info-sw_stat.mem_freed += 
@@ -1732,6 +1733,365 @@ static int s2io_link_fault_indication(st
return MAC_RMAC_ERR_TIMER;
 }
 
+void en_dis_err_alarms(struct s2io_nic *nic, u16 mask, int flag)
+{
+   struct XENA_dev_config __iomem *bar0 = nic-bar0;
+   register u64 val64 = 0, temp64 = 0, gen_int_mask = 0;
+
+   if (mask  TX_DMA_INTR) {
+   gen_int_mask |= TXDMA_INT_M;
+
+   if (flag == ENABLE_INTRS) {
+
+   val64 = TXDMA_TDA_INT|TXDMA_PFC_INT|TXDMA_PCC_INT
+   |TXDMA_TTI_INT|TXDMA_LSO_INT|TXDMA_TPA_INT
+   |TXDMA_SM_INT;
+   temp64 = readq(bar0-txdma_int_mask);
+   temp64 = ~((u64) val64);
+   writeq(temp64, bar0-txdma_int_mask);
+
+   val64 = PFC_ECC_DB_ERR|PFC_SM_ERR_ALARM|PFC_MISC_0_ERR
+   |PFC_MISC_1_ERR|PFC_PCIX_ERR|PFC_ECC_SG_ERR;
+   temp64 = readq(bar0-pfc_err_mask);
+   temp64 = ~((u64) val64);
+   writeq(temp64, bar0-pfc_err_mask);
+
+   val64 = TDA_Fn_ECC_DB_ERR|TDA_SM0_ERR_ALARM
+   |TDA_SM1_ERR_ALARM|TDA_Fn_ECC_SG_ERR|
+   TDA_PCIX_ERR;
+   temp64 = readq(bar0-tda_err_mask);
+   temp64 = ~((u64) val64);
+   writeq(temp64, bar0-tda_err_mask);
+
+   val64 = PCC_FB_ECC_DB_ERR|PCC_TXB_ECC_DB_ERR
+   |PCC_SM_ERR_ALARM|PCC_WR_ERR_ALARM|PCC_N_SERR
+   |PCC_6_COF_OV_ERR|PCC_7_COF_OV_ERR
+   |PCC_6_LSO_OV_ERR|PCC_7_LSO_OV_ERR
+   |PCC_FB_ECC_SG_ERR|PCC_TXB_ECC_SG_ERR;
+   temp64 = readq(bar0-pcc_err_mask);
+   temp64 = ~((u64) val64);
+   writeq(temp64, bar0-pcc_err_mask);
+
+   val64 = TTI_SM_ERR_ALARM|TTI_ECC_SG_ERR|TTI_ECC_DB_ERR;
+   temp64 = readq(bar0-tti_err_mask);
+   temp64 = ~((u64) val64);
+   writeq(temp64, bar0-tti_err_mask);
+
+
+   val64 = LSO6_ABORT|LSO7_ABORT|LSO6_SM_ERR_ALARM
+   |LSO7_SM_ERR_ALARM|LSO6_SEND_OFLOW
+   |LSO7_SEND_OFLOW;
+   temp64 = readq(bar0-lso_err_mask);
+   temp64 = ~((u64) val64);
+   writeq(temp64, bar0-lso_err_mask);
+
+   val64 = TPA_SM_ERR_ALARM|TPA_TX_FRM_DROP;
+   temp64 = readq(bar0-tpa_err_mask);
+   temp64 = ~((u64) val64);
+   writeq(temp64, bar0-tpa_err_mask);
+
+   val64 = SM_SM_ERR_ALARM;
+   temp64 = readq(bar0-sm_err_mask);
+   temp64 = ~((u64) val64);
+   writeq(temp64, bar0-sm_err_mask);
+   }
+   else {
+   val64 = TXDMA_TDA_INT|TXDMA_PFC_INT|TXDMA_PCC_INT
+   |TXDMA_TTI_INT|TXDMA_LSO_INT|TXDMA_TPA_INT
+   |TXDMA_SM_INT;
+   temp64 = readq(bar0-txdma_int_mask);
+   temp64 |= ((u64) val64);
+   writeq(temp64, bar0-txdma_int_mask);
+
+   val64 = PFC_ECC_DB_ERR|PFC_SM_ERR_ALARM|PFC_MISC_0_ERR
+   |PFC_MISC_1_ERR|PFC_PCIX_ERR|PFC_ECC_SG_ERR;
+   temp64 = readq(bar0-pfc_err_mask);
+   temp64 |= ((u64) val64);
+   writeq(temp64, bar0-pfc_err_mask);
+
+

[PATCH 2.6.24 2/5]S2io: Handle and monitor all of the device errors and alarms

2007-08-15 Thread Ramkrishna Vepa
- Added support to poll entire set of device errors and alarams.
- Replaced alarm_intr_handler() with s2io_handle_errors().
- Added statistic counters to monitor the alarms.

Signed-off-by: Sivakumar Subramani [EMAIL PROTECTED]
Signed-off-by: Santosh Rastapur [EMAIL PROTECTED]
Signed-off-by: Ramkrishna Vepa [EMAIL PROTECTED]
---
diff -Nurp patch1/drivers/net/s2io.c patch2/drivers/net/s2io.c
--- patch1/drivers/net/s2io.c   2007-08-15 08:55:53.0 -0700
+++ patch2/drivers/net/s2io.c   2007-08-15 08:57:44.0 -0700
@@ -263,7 +263,14 @@ static char ethtool_driver_stats_keys[][
{serious_err_cnt},
{soft_reset_cnt},
{fifo_full_cnt},
-   {ring_full_cnt},
+   {ring_0_full_cnt},
+   {ring_1_full_cnt},
+   {ring_2_full_cnt},
+   {ring_3_full_cnt},
+   {ring_4_full_cnt},
+   {ring_5_full_cnt},
+   {ring_6_full_cnt},
+   {ring_7_full_cnt},
(alarm_transceiver_temp_high),
(alarm_transceiver_temp_low),
(alarm_laser_bias_current_high),
@@ -303,7 +310,24 @@ static char ethtool_driver_stats_keys[][
(rx_tcode_fcs_err_cnt),
(rx_tcode_buf_size_err_cnt),
(rx_tcode_rxd_corrupt_cnt),
-   (rx_tcode_unkn_err_cnt)
+   (rx_tcode_unkn_err_cnt),
+   {tda_err_cnt},
+   {pfc_err_cnt},
+   {pcc_err_cnt},
+   {tti_err_cnt},
+   {tpa_err_cnt},
+   {sm_err_cnt},
+   {lso_err_cnt},
+   {mac_tmac_err_cnt},
+   {mac_rmac_err_cnt},
+   {xgxs_txgxs_err_cnt},
+   {xgxs_rxgxs_err_cnt},
+   {rc_err_cnt},
+   {prc_pcix_err_cnt},
+   {rpa_err_cnt},
+   {rda_err_cnt},
+   {rti_err_cnt},
+   {mc_err_cnt}
 };
 
 #define S2IO_XENA_STAT_LEN sizeof(ethtool_xena_stats_keys)/ ETH_GSTRING_LEN
@@ -3478,135 +3502,6 @@ static void s2io_updt_xpak_counter(struc
 }
 
 /**
- *  alarm_intr_handler - Alarm Interrrupt handler
- *  @nic: device private variable
- *  Description: If the interrupt was neither because of Rx packet or Tx
- *  complete, this function is called. If the interrupt was to indicate
- *  a loss of link, the OSM link status handler is invoked for any other
- *  alarm interrupt the block that raised the interrupt is displayed
- *  and a H/W reset is issued.
- *  Return Value:
- *  NONE
-*/
-
-static void alarm_intr_handler(struct s2io_nic *nic)
-{
-   struct net_device *dev = (struct net_device *) nic-dev;
-   struct XENA_dev_config __iomem *bar0 = nic-bar0;
-   register u64 val64 = 0, err_reg = 0;
-   u64 cnt;
-   int i;
-   if (atomic_read(nic-card_state) == CARD_DOWN)
-   return;
-   if (pci_channel_offline(nic-pdev))
-   return;
-   nic-mac_control.stats_info-sw_stat.ring_full_cnt = 0;
-   /* Handling the XPAK counters update */
-   if(nic-mac_control.stats_info-xpak_stat.xpak_timer_count  72000) {
-   /* waiting for an hour */
-   nic-mac_control.stats_info-xpak_stat.xpak_timer_count++;
-   } else {
-   s2io_updt_xpak_counter(dev);
-   /* reset the count to zero */
-   nic-mac_control.stats_info-xpak_stat.xpak_timer_count = 0;
-   }
-
-   /* Handling link status change error Intr */
-   if (s2io_link_fault_indication(nic) == MAC_RMAC_ERR_TIMER) {
-   err_reg = readq(bar0-mac_rmac_err_reg);
-   writeq(err_reg, bar0-mac_rmac_err_reg);
-   if (err_reg  RMAC_LINK_STATE_CHANGE_INT) {
-   schedule_work(nic-set_link_task);
-   }
-   }
-
-   /* Handling Ecc errors */
-   val64 = readq(bar0-mc_err_reg);
-   writeq(val64, bar0-mc_err_reg);
-   if (val64  (MC_ERR_REG_ECC_ALL_SNG | MC_ERR_REG_ECC_ALL_DBL)) {
-   if (val64  MC_ERR_REG_ECC_ALL_DBL) {
-   nic-mac_control.stats_info-sw_stat.
-   double_ecc_errs++;
-   DBG_PRINT(INIT_DBG, %s: Device indicates ,
- dev-name);
-   DBG_PRINT(INIT_DBG, double ECC error!!\n);
-   if (nic-device_type != XFRAME_II_DEVICE) {
-   /* Reset XframeI only if critical error */
-   if (val64  (MC_ERR_REG_MIRI_ECC_DB_ERR_0 |
-MC_ERR_REG_MIRI_ECC_DB_ERR_1)) {
-   netif_stop_queue(dev);
-   schedule_work(nic-rst_timer_task);
-   nic-mac_control.stats_info-sw_stat.
-   soft_reset_cnt++;
-   }
-   }
-   } else {
-   nic-mac_control.stats_info-sw_stat.
-   single_ecc_errs++;
-   }
-   }
-
-   /* In case of a serious error, the device will be Reset. */
-   val64 = 

[PATCH 2.6.24 3/5]S2io: Cleanup - removed unused variable, intr_type

2007-08-15 Thread Ramkrishna Vepa
- Removed the unused variable, intr_type, in device private structure.

Signed-off-by: Sivakumar Subramani [EMAIL PROTECTED]
Signed-off-by: Santosh Rastapur [EMAIL PROTECTED]
Signed-off-by: Ramkrishna Vepa [EMAIL PROTECTED]
---
diff -Nurp patch2/drivers/net/s2io.c patch3/drivers/net/s2io.c
--- patch2/drivers/net/s2io.c   2007-08-15 08:57:44.0 -0700
+++ patch3/drivers/net/s2io.c   2007-08-15 08:57:32.0 -0700
@@ -1611,7 +1611,7 @@ static int init_nic(struct s2io_nic *nic
 
val64 = RTI_DATA2_MEM_RX_UFC_A(0x1) |
RTI_DATA2_MEM_RX_UFC_B(0x2) ;
-   if (nic-intr_type == MSI_X)
+   if (nic-config.intr_type == MSI_X)
val64 |= (RTI_DATA2_MEM_RX_UFC_C(0x20) | \
RTI_DATA2_MEM_RX_UFC_D(0x40));
else
@@ -1749,7 +1749,7 @@ static int init_nic(struct s2io_nic *nic
 
 static int s2io_link_fault_indication(struct s2io_nic *nic)
 {
-   if (nic-intr_type != INTA)
+   if (nic-config.intr_type != INTA)
return MAC_RMAC_ERR_TIMER;
if (nic-device_type == XFRAME_II_DEVICE)
return LINK_UP_DOWN_INTERRUPT;
@@ -3774,7 +3774,7 @@ static int s2io_set_swapper(struct s2io_
 SWAPPER_CTRL_RXF_W_FE |
 SWAPPER_CTRL_XMSI_FE |
 SWAPPER_CTRL_STATS_FE | SWAPPER_CTRL_STATS_SE);
-   if (sp-intr_type == INTA)
+   if (sp-config.intr_type == INTA)
val64 |= SWAPPER_CTRL_XMSI_SE;
writeq(val64, bar0-swapper_ctrl);
 #else
@@ -3797,7 +3797,7 @@ static int s2io_set_swapper(struct s2io_
 SWAPPER_CTRL_RXF_W_FE |
 SWAPPER_CTRL_XMSI_FE |
 SWAPPER_CTRL_STATS_FE | SWAPPER_CTRL_STATS_SE);
-   if (sp-intr_type == INTA)
+   if (sp-config.intr_type == INTA)
val64 |= SWAPPER_CTRL_XMSI_SE;
writeq(val64, bar0-swapper_ctrl);
 #endif
@@ -4071,7 +4071,7 @@ static int s2io_open(struct net_device *
netif_carrier_off(dev);
sp-last_link_state = 0;
 
-   if (sp-intr_type == MSI_X) {
+   if (sp-config.intr_type == MSI_X) {
int ret = s2io_enable_msi_x(sp);
 
if (!ret) {
@@ -4103,12 +4103,12 @@ static int s2io_open(struct net_device *
DBG_PRINT(ERR_DBG,
  %s: MSI-X requested but failed to enable\n,
  dev-name);
-   sp-intr_type = INTA;
+   sp-config.intr_type = INTA;
}
}
 
/* NAPI doesn't work well with MSI(X) */
-if (sp-intr_type != INTA) {
+if (sp-config.intr_type != INTA) {
if(sp-config.napi)
sp-config.napi = 0;
}
@@ -4132,7 +4132,7 @@ static int s2io_open(struct net_device *
return 0;
 
 hw_init_failed:
-   if (sp-intr_type == MSI_X) {
+   if (sp-config.intr_type == MSI_X) {
if (sp-entries) {
kfree(sp-entries);
sp-mac_control.stats_info-sw_stat.mem_freed 
@@ -7017,18 +7017,18 @@ static int s2io_add_isr(struct s2io_nic 
struct net_device *dev = sp-dev;
int err = 0;
 
-   if (sp-intr_type == MSI_X)
+   if (sp-config.intr_type == MSI_X)
ret = s2io_enable_msi_x(sp);
if (ret) {
DBG_PRINT(ERR_DBG, %s: Defaulting to INTA\n, dev-name);
-   sp-intr_type = INTA;
+   sp-config.intr_type = INTA;
}
 
/* Store the values of the MSIX table in the struct s2io_nic structure 
*/
store_xmsi_data(sp);
 
/* After proper initialization of H/W, register ISR */
-   if (sp-intr_type == MSI_X) {
+   if (sp-config.intr_type == MSI_X) {
int i, msix_tx_cnt=0,msix_rx_cnt=0;
 
for (i=1; (sp-s2io_entries[i].in_use == MSIX_FLG); i++) {
@@ -7080,7 +7080,7 @@ static int s2io_add_isr(struct s2io_nic 
printk(MSI-X-TX %d entries enabled\n,msix_tx_cnt);
printk(MSI-X-RX %d entries enabled\n,msix_rx_cnt);
}
-   if (sp-intr_type == INTA) {
+   if (sp-config.intr_type == INTA) {
err = request_irq((int) sp-pdev-irq, s2io_isr, IRQF_SHARED,
sp-name, dev);
if (err) {
@@ -7097,7 +7097,7 @@ static void s2io_rem_isr(struct s2io_nic
struct net_device *dev = sp-dev;
struct swStat *stats = sp-mac_control.stats_info-sw_stat;
 
-   if (sp-intr_type == MSI_X) {
+   if (sp-config.intr_type == MSI_X) {
int i;
u16 msi_control;
 
@@ -7270,7 +7270,7 @@ static int s2io_card_up(struct s2io_nic 
 
/* Add interrupt service routine */
if (s2io_add_isr(sp) != 0) {
-   if (sp-intr_type == MSI_X)
+   if (sp-config.intr_type == MSI_X)
s2io_rem_isr(sp);

[PATCH 2.6.24 4/5]S2io: Check for CARD_DOWN before handling traffic

2007-08-15 Thread Ramkrishna Vepa
- Added check to return from the traffic handling function, if the card status 
  is DOWN.

Signed-off-by: Sivakumar Subramani [EMAIL PROTECTED]
Signed-off-by: Santosh Rastapur [EMAIL PROTECTED]
Signed-off-by: Ramkrishna Vepa [EMAIL PROTECTED]
---
diff -Nurp patch3/drivers/net/s2io.c patch4/drivers/net/s2io.c
--- patch3/drivers/net/s2io.c   2007-08-15 08:57:32.0 -0700
+++ patch4/drivers/net/s2io.c   2007-08-15 08:42:14.0 -0700
@@ -2927,6 +2927,11 @@ static int s2io_poll(struct net_device *
int i;
 
atomic_inc(nic-isr_cnt);
+   if (unlikely(atomic_read(nic-card_state) == CARD_DOWN)) {
+   atomic_dec(nic-isr_cnt);
+   return IRQ_NONE;
+   }
+
mac_control = nic-mac_control;
config = nic-config;
 
@@ -3062,12 +3067,6 @@ static void rx_intr_handler(struct ring_
struct RxD3* rxdp3;
 
spin_lock(nic-rx_lock);
-   if (atomic_read(nic-card_state) == CARD_DOWN) {
-   DBG_PRINT(INTR_DBG, %s: %s going down for reset\n,
- __FUNCTION__, dev-name);
-   spin_unlock(nic-rx_lock);
-   return;
-   }
 
get_info = ring_data-rx_curr_get_info;
get_block = get_info.block_index;
@@ -4404,6 +4403,10 @@ static irqreturn_t s2io_msix_ring_handle
struct s2io_nic *sp = ring-nic;
 
atomic_inc(sp-isr_cnt);
+   if (unlikely(atomic_read(sp-card_state) == CARD_DOWN)){
+   atomic_dec(sp-isr_cnt);
+   return IRQ_HANDLED;
+   }
 
rx_intr_handler(ring);
s2io_chk_rx_buffers(sp, ring-ring_no);
@@ -4418,6 +4421,10 @@ static irqreturn_t s2io_msix_fifo_handle
struct s2io_nic *sp = fifo-nic;
 
atomic_inc(sp-isr_cnt);
+   if (unlikely(atomic_read(sp-card_state) == CARD_DOWN)){
+   atomic_dec(sp-isr_cnt);
+   return IRQ_HANDLED;
+   }
tx_intr_handler(fifo);
atomic_dec(sp-isr_cnt);
return IRQ_HANDLED;
@@ -4896,6 +4903,11 @@ static irqreturn_t s2io_isr(int irq, voi
return IRQ_NONE;
 
atomic_inc(sp-isr_cnt);
+   if (unlikely(atomic_read(sp-card_state) == CARD_DOWN)) {
+   atomic_dec(sp-isr_cnt);
+   return IRQ_NONE;
+   }
+
mac_control = sp-mac_control;
config = sp-config;
 



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2.6.24 5/5]S2io: Optimize isr fast path

2007-08-15 Thread Ramkrishna Vepa
- Optimized interrupt routine fast path.

Signed-off-by: Sivakumar Subramani [EMAIL PROTECTED]
Signed-off-by: Santosh Rastapur [EMAIL PROTECTED]
Signed-off-by: Ramkrishna Vepa [EMAIL PROTECTED]
---
diff -Nurp patch4/drivers/net/s2io.c patch5/drivers/net/s2io.c
--- patch4/drivers/net/s2io.c   2007-08-15 08:42:14.0 -0700
+++ patch5/drivers/net/s2io.c   2007-08-15 08:42:51.0 -0700
@@ -84,7 +84,7 @@
 #include s2io.h
 #include s2io-regs.h
 
-#define DRV_VERSION 2.0.26.1
+#define DRV_VERSION 2.0.26.2
 
 /* S2io Driver name  version. */
 static char s2io_driver_name[] = Neterion;
@@ -4917,71 +4917,76 @@ static irqreturn_t s2io_isr(int irq, voi
 * 1. Rx of packet.
 * 2. Tx complete.
 * 3. Link down.
-* 4. Error in any functional blocks of the NIC.
 */
reason = readq(bar0-general_int_status);
 
-   if (!reason) {
-   /* The interrupt was not raised by us. */
+   if (unlikely(reason == S2IO_MINUS_ONE) ) {
+   /* Nothing much can be done. Get out */
atomic_dec(sp-isr_cnt);
-   return IRQ_NONE;
-   }
-   else if (unlikely(reason == S2IO_MINUS_ONE) ) {
-   /* Disable device and get out */
-   atomic_dec(sp-isr_cnt);
-   return IRQ_NONE;
+   return IRQ_HANDLED;
}
 
-   if (napi) {
-   if (reason  GEN_INTR_RXTRAFFIC) {
-   if ( likely ( netif_rx_schedule_prep(dev)) ) {
-   __netif_rx_schedule(dev);
-   writeq(S2IO_MINUS_ONE, bar0-rx_traffic_mask);
+   if (reason  (GEN_INTR_RXTRAFFIC |
+   GEN_INTR_TXTRAFFIC | GEN_INTR_TXPIC))
+   {
+   writeq(S2IO_MINUS_ONE, bar0-general_int_mask);
+
+   if (config-napi) {
+   if (reason  GEN_INTR_RXTRAFFIC) {
+   if ( likely (netif_rx_schedule_prep(dev)) ) {
+   __netif_rx_schedule(dev);
+   writeq(S2IO_MINUS_ONE,
+   bar0-rx_traffic_mask);
+   } else
+   writeq(S2IO_MINUS_ONE,
+   bar0-rx_traffic_int);
}
-   else
+   } else {
+   /*
+* rx_traffic_int reg is an R1 register, writing all 1's
+* will ensure that the actual interrupt causing bit
+* get's cleared and hence a read can be avoided.
+*/
+   if (reason  GEN_INTR_RXTRAFFIC)
writeq(S2IO_MINUS_ONE, bar0-rx_traffic_int);
+
+   for (i = 0; i  config-rx_ring_num; i++)
+   rx_intr_handler(mac_control-rings[i]);
}
-   } else {
+
/*
-* Rx handler is called by default, without checking for the
-* cause of interrupt.
-* rx_traffic_int reg is an R1 register, writing all 1's
+* tx_traffic_int reg is an R1 register, writing all 1's
 * will ensure that the actual interrupt causing bit get's
 * cleared and hence a read can be avoided.
 */
-   if (reason  GEN_INTR_RXTRAFFIC)
-   writeq(S2IO_MINUS_ONE, bar0-rx_traffic_int);
+   if (reason  GEN_INTR_TXTRAFFIC)
+   writeq(S2IO_MINUS_ONE, bar0-tx_traffic_int);
 
-   for (i = 0; i  config-rx_ring_num; i++) {
-   rx_intr_handler(mac_control-rings[i]);
-   }
-   }
+   for (i = 0; i  config-tx_fifo_num; i++)
+   tx_intr_handler(mac_control-fifos[i]);
 
-   /*
-* tx_traffic_int reg is an R1 register, writing all 1's
-* will ensure that the actual interrupt causing bit get's
-* cleared and hence a read can be avoided.
-*/
-   if (reason  GEN_INTR_TXTRAFFIC)
-   writeq(S2IO_MINUS_ONE, bar0-tx_traffic_int);
+   if (reason  GEN_INTR_TXPIC)
+   s2io_txpic_intr_handle(sp);
 
-   for (i = 0; i  config-tx_fifo_num; i++)
-   tx_intr_handler(mac_control-fifos[i]);
+   /*
+* Reallocate the buffers from the interrupt handler itself.
+*/
+   if (!config-napi) {
+   for (i = 0; i  config-rx_ring_num; i++)
+   s2io_chk_rx_buffers(sp, i);
+   }
+   writeq(sp-general_int_mask, bar0-general_int_mask);
+   readl(bar0-general_int_status);
 
-   if (reason  GEN_INTR_TXPIC)
-   s2io_txpic_intr_handle(sp);
-   /*
-* If the Rx 

[PATCH 2.6.24 1/2]S2io: Change kmalloc+memset to k[zc]alloc

2007-08-15 Thread Ramkrishna Vepa
- Changed kmalloc+memset to k[zc]alloc as per Mariusz's patch
  [EMAIL PROTECTED]

Signed-off-by: Sivakumar Subramani [EMAIL PROTECTED]
Signed-off-by: Ramkrishna Vepa [EMAIL PROTECTED]
---
diff -urpN org/drivers/net/s2io.c patch1/drivers/net/s2io.c
--- org/drivers/net/s2io.c  2007-08-09 17:28:19.0 +0530
+++ patch1/drivers/net/s2io.c   2007-08-10 11:53:46.0 +0530
@@ -556,7 +556,7 @@ static int init_shared_mem(struct s2io_n
for (i = 0; i  config-tx_fifo_num; i++) {
int fifo_len = config-tx_cfg[i].fifo_len;
int list_holder_size = fifo_len * sizeof(struct list_info_hold);
-   mac_control-fifos[i].list_info = kmalloc(list_holder_size,
+   mac_control-fifos[i].list_info = kzalloc(list_holder_size,
  GFP_KERNEL);
if (!mac_control-fifos[i].list_info) {
DBG_PRINT(INFO_DBG,
@@ -564,7 +564,6 @@ static int init_shared_mem(struct s2io_n
return -ENOMEM;
}
mem_allocated += list_holder_size;
-   memset(mac_control-fifos[i].list_info, 0, list_holder_size);
}
for (i = 0; i  config-tx_fifo_num; i++) {
int page_num = TXD_MEM_PAGE_CNT(config-tx_cfg[i].fifo_len,
@@ -3883,9 +3882,9 @@ static int s2io_enable_msi_x(struct s2io
u16 msi_control; /* Temp variable */
int ret, i, j, msix_indx = 1;
 
-   nic-entries = kmalloc(MAX_REQUESTED_MSI_X * sizeof(struct msix_entry),
+   nic-entries = kcalloc(MAX_REQUESTED_MSI_X, sizeof(struct msix_entry),
   GFP_KERNEL);
-   if (nic-entries == NULL) {
+   if (!nic-entries) {
DBG_PRINT(INFO_DBG, %s: Memory allocation failed\n, \
__FUNCTION__);
nic-mac_control.stats_info-sw_stat.mem_alloc_fail_cnt++;
@@ -3893,12 +3892,11 @@ static int s2io_enable_msi_x(struct s2io
}
nic-mac_control.stats_info-sw_stat.mem_allocated 
+= (MAX_REQUESTED_MSI_X * sizeof(struct msix_entry));
-   memset(nic-entries, 0,MAX_REQUESTED_MSI_X * sizeof(struct msix_entry));
 
nic-s2io_entries =
-   kmalloc(MAX_REQUESTED_MSI_X * sizeof(struct s2io_msix_entry),
+   kcalloc(MAX_REQUESTED_MSI_X, sizeof(struct s2io_msix_entry),
   GFP_KERNEL);
-   if (nic-s2io_entries == NULL) {
+   if (!nic-s2io_entries) {
DBG_PRINT(INFO_DBG, %s: Memory allocation failed\n, 
__FUNCTION__);
nic-mac_control.stats_info-sw_stat.mem_alloc_fail_cnt++;
@@ -3909,8 +3907,6 @@ static int s2io_enable_msi_x(struct s2io
}
 nic-mac_control.stats_info-sw_stat.mem_allocated 
+= (MAX_REQUESTED_MSI_X * sizeof(struct s2io_msix_entry));
-   memset(nic-s2io_entries, 0,
-  MAX_REQUESTED_MSI_X * sizeof(struct s2io_msix_entry));
 
for (i=0; i MAX_REQUESTED_MSI_X; i++) {
nic-entries[i].entry = i;



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2.6.24 2/2]S2io: Removed unused feature - bimodal interrupts

2007-08-15 Thread Ramkrishna Vepa
- Removed bimodal interrupt support - unused feature

Signed-off-by: Sivakumar Subramani [EMAIL PROTECTED]
Signed-off-by: Ramkrishna Vepa [EMAIL PROTECTED]
---
diff -urpN patch1/drivers/net/s2io.c patch2/drivers/net/s2io.c
--- patch1/drivers/net/s2io.c   2007-08-10 11:53:46.0 +0530
+++ patch2/drivers/net/s2io.c   2007-08-10 11:54:21.0 +0530
@@ -84,7 +84,7 @@
 #include s2io.h
 #include s2io-regs.h
 
-#define DRV_VERSION 2.0.26.3
+#define DRV_VERSION 2.0.26.4
 
 /* S2io Driver name  version. */
 static char s2io_driver_name[] = Neterion;
@@ -447,7 +447,6 @@ S2IO_PARM_INT(mc_pause_threshold_q4q7, 1
 S2IO_PARM_INT(shared_splits, 0);
 S2IO_PARM_INT(tmac_util_period, 5);
 S2IO_PARM_INT(rmac_util_period, 5);
-S2IO_PARM_INT(bimodal, 0);
 S2IO_PARM_INT(l3l4hdr_size, 128);
 /* Frequency of Rx desc syncs expressed as power of 2 */
 S2IO_PARM_INT(rxsync_frequency, 3);
@@ -1559,90 +1558,57 @@ static int init_nic(struct s2io_nic *nic
time++;
}
 
-   if (nic-config.bimodal) {
-   int k = 0;
-   for (k = 0; k  config-rx_ring_num; k++) {
-   val64 = TTI_CMD_MEM_WE | TTI_CMD_MEM_STROBE_NEW_CMD;
-   val64 |= TTI_CMD_MEM_OFFSET(0x38+k);
-   writeq(val64, bar0-tti_command_mem);
-
+   /* RTI Initialization */
+   if (nic-device_type == XFRAME_II_DEVICE) {
/*
-* Once the operation completes, the Strobe bit of the command
-* register will be reset. We poll for this particular condition
-* We wait for a maximum of 500ms for the operation to complete,
-* if it's not complete by then we return error.
-   */
-   time = 0;
-   while (TRUE) {
-   val64 = readq(bar0-tti_command_mem);
-   if (!(val64  TTI_CMD_MEM_STROBE_NEW_CMD)) {
-   break;
-   }
-   if (time  10) {
-   DBG_PRINT(ERR_DBG,
-   %s: TTI init Failed\n,
-   dev-name);
-   return -1;
-   }
-   time++;
-   msleep(50);
-   }
-   }
-   } else {
+* Programmed to generate Apprx 500 Intrs per
+* second
+*/
+   int count = (nic-config.bus_speed * 125)/4;
+   val64 = RTI_DATA1_MEM_RX_TIMER_VAL(count);
+   } else
+   val64 = RTI_DATA1_MEM_RX_TIMER_VAL(0xFFF);
 
-   /* RTI Initialization */
-   if (nic-device_type == XFRAME_II_DEVICE) {
-   /*
-* Programmed to generate Apprx 500 Intrs per
-* second
-*/
-   int count = (nic-config.bus_speed * 125)/4;
-   val64 = RTI_DATA1_MEM_RX_TIMER_VAL(count);
-   } else {
-   val64 = RTI_DATA1_MEM_RX_TIMER_VAL(0xFFF);
-   }
-   val64 |= RTI_DATA1_MEM_RX_URNG_A(0xA) |
-   RTI_DATA1_MEM_RX_URNG_B(0x10) |
-   RTI_DATA1_MEM_RX_URNG_C(0x30) | 
RTI_DATA1_MEM_RX_TIMER_AC_EN;
-
-   writeq(val64, bar0-rti_data1_mem);
-
-   val64 = RTI_DATA2_MEM_RX_UFC_A(0x1) |
-   RTI_DATA2_MEM_RX_UFC_B(0x2) ;
-   if (nic-config.intr_type == MSI_X)
-   val64 |= (RTI_DATA2_MEM_RX_UFC_C(0x20) | \
+   val64 |= RTI_DATA1_MEM_RX_URNG_A(0xA) |
+   RTI_DATA1_MEM_RX_URNG_B(0x10) |
+   RTI_DATA1_MEM_RX_URNG_C(0x30) | RTI_DATA1_MEM_RX_TIMER_AC_EN;
+
+   writeq(val64, bar0-rti_data1_mem);
+
+   val64 = RTI_DATA2_MEM_RX_UFC_A(0x1) |
+   RTI_DATA2_MEM_RX_UFC_B(0x2) ;
+   if (nic-config.intr_type == MSI_X)
+   val64 |= (RTI_DATA2_MEM_RX_UFC_C(0x20) | \
RTI_DATA2_MEM_RX_UFC_D(0x40));
-   else
-   val64 |= (RTI_DATA2_MEM_RX_UFC_C(0x40) | \
+   else
+   val64 |= (RTI_DATA2_MEM_RX_UFC_C(0x40) | \
RTI_DATA2_MEM_RX_UFC_D(0x80));
-   writeq(val64, bar0-rti_data2_mem);
+   writeq(val64, bar0-rti_data2_mem);
 
-   for (i = 0; i  config-rx_ring_num; i++) {
-   val64 = RTI_CMD_MEM_WE | RTI_CMD_MEM_STROBE_NEW_CMD
-   | RTI_CMD_MEM_OFFSET(i);
-   writeq(val64, bar0-rti_command_mem);
+   for (i = 0; i  config-rx_ring_num; i++) {
+   val64 = RTI_CMD_MEM_WE | RTI_CMD_MEM_STROBE_NEW_CMD
+   | RTI_CMD_MEM_OFFSET(i);
+   

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Paul E. McKenney
On Wed, Aug 15, 2007 at 10:52:53PM +0200, Segher Boessenkool wrote:
 I think this was just terminology confusion here again. Isn't any 
 code
 that it cannot currently see the same as another compilation unit,
 and wouldn't the compilation unit itself expand if we ask gcc to
 compile more than one unit at once? Or is there some more specific
 definition for compilation unit (in gcc lingo, possibly?)
 
 This is indeed my understanding -- compilation unit is whatever the
 compiler looks at in one go.  I have heard the word module used for
 the minimal compilation unit covering a single .c file and everything
 that it #includes, but there might be a better name for this.
 
 Yes, that's what's called compilation unit :-)
 
 [/me double checks]
 
 Erm, the C standard actually calls it translation unit.
 
 To be exact, to avoid any more confusion:
 
 5.1.1.1/1:
 A C program need not all be translated at the same time. The
 text of the program is kept in units called source files, (or
 preprocessing files) in this International Standard. A source
 file together with all the headers and source files included
 via the preprocessing directive #include is known as a
 preprocessing translation unit. After preprocessing, a
 preprocessing translation unit is called a translation unit.

I am OK with translation and compilation being near-synonyms.  ;-)

Thanx, Paul
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Paul E. McKenney
On Wed, Aug 15, 2007 at 11:05:35PM +0200, Segher Boessenkool wrote:
 No; compilation units have nothing to do with it, GCC can optimise
 across compilation unit boundaries just fine, if you tell it to
 compile more than one compilation unit at once.
 
 Last I checked, the Linux kernel build system did compile each .c file
 as a separate compilation unit.
 
 I have some patches to use -combine -fwhole-program for Linux.
 Highly experimental, you need a patched bleeding edge toolchain.
 If there's interest I'll clean it up and put it online.
 
 David Woodhouse had some similar patches about a year ago.

Sounds exciting...  ;-)

 In many cases, the compiler also has to assume that
 msleep_interruptible()
 might call back into a function in the current compilation unit, thus
 possibly modifying global static variables.
 
 It most often is smart enough to see what compilation-unit-local
 variables might be modified that way, though :-)
 
 Yep.  For example, if it knows the current value of a given such local
 variable, and if all code paths that would change some other variable
 cannot be reached given that current value of the first variable.
 
 Or the most common thing: if neither the address of the translation-
 unit local variable nor the address of any function writing to that
 variable can escape from that translation unit, nothing outside
 the translation unit can write to the variable.

But there is usually at least one externally callable function in
a .c file.

 At least given that gcc doesn't know about multiple threads of 
 execution!
 
 Heh, only about the threads it creates itself (not relevant to
 the kernel, for sure :-) )

;-)

Thanx, Paul
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/4] Add ETHTOOL_[GS]FLAGS sub-ioctls

2007-08-15 Thread David Miller
From: Jeff Garzik [EMAIL PROTECTED]
Date: Fri, 10 Aug 2007 16:56:17 -0400

 All this is currently checked into the 'eflags' branch of
 git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git
 
 But when everybody is happy with it, IMO we should get it into 
 net-2.6.24.git, as it enables LRO.

I've backed out the ETHTOOL LRO stuff from Auke, and applied your
patches to net-2.6.24

We can get rid of that NETIF_F_LRO thing, since as you observed
it is indeed superfluous.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH for 2.6.24] SCTP: Rewrite of sctp buffer management code

2007-08-15 Thread David Miller
From: Vlad Yasevich [EMAIL PROTECTED]
Date: Mon, 13 Aug 2007 09:24:00 -0400

 Sorry about that.  Not sure what happened to that patch.  Here is
 the good patch with witespace cleanups.

Applied to net-2.6.24, thanks for fixing this patch up.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Please pull 'upstream-davem' branch of wireless-2.6

2007-08-15 Thread David Miller
From: John W. Linville [EMAIL PROTECTED]
Date: Tue, 14 Aug 2007 20:34:10 -0400

 Individual patches available here:
 
   
 http://www.kernel.org/pub/linux/kernel/people/linville/wireless-2.6/upstream-davem

John, what I'm going to do is wait for Linus to pull in the
2.6.23 mac80211 fixes you submitted yesterday, then rebase
the net-2.6.24 tree and add these fixes on top.

Take care.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  1   2   >