RE: [PATCH] s2io ppc64 fix for readq/writeq

2006-11-06 Thread Ramkrishna Vepa


> -Original Message-
> From: Roland Dreier [mailto:[EMAIL PROTECTED]
> Sent: Monday, November 06, 2006 12:55 PM
> To: Christoph Hellwig
> Cc: Ramkrishna Vepa; Benjamin Herrenschmidt; Jeff Garzik; Linus
Torvalds;
> netdev@vger.kernel.org; [EMAIL PROTECTED]
> Subject: Re: [PATCH] s2io ppc64 fix for readq/writeq
> 
>  > For consistencies sake we really want to have readq() and writeq()
> available
>  > on all platforms.  I remember that some IB cards require it to
actually
>  > be a 64bit transactions, otherwise they have to do funny
workarounds.
>  > I think the best solution is to define ARCH_HAS_ATOMIC_READQ_WRITEQ
>  > and let drivers do their workarounds based on that.
>  >
>  > I've Cc'ed Roland because he should be able to explain the IB issue
in
>  > details.
> 
> The issue I know about is drivers/infiniband/hw/mthca.  The card has
> 64-bit "doorbell registers", and the restriction is that if you write
> the doorbell write two 32-bit writes, you can't write anything else on
> the same register page in between writing the two halves.  Since
> different CPUs might be doing stuff on the same doorbell page at the
> same time, there are two things we can do:
>  - If writeq() exists then use that and assume it will generate only a
>single bus transaction that can't let anything sneak in the
>middle.  (That's a fairly safe assumption because the devices being
>driven are either 64-bit PCI-X or PCIe only)
>  - If writeq() doesn't exist, use a spinlock to protect access to each
>doorbell page.
> 
> ARCH_HAS_ATOMIC_READQ_WRITEQ would be fine for that, but of course the
> tricky thing is writing down the exact semantics that "HAS_ATOMIC" is
> actually promising.
> 
>  - R.
[Ram] If the writes broken up into 32 bit writes they are posted to the
bridge and need to be flushed with a lock around the whole access. This
is in the domain of the driver and need not be part of the platform
specific code. 

Ram

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] s2io ppc64 fix for readq/writeq

2006-11-06 Thread Benjamin Herrenschmidt

> Generally the kernel code should write the two 32-bit chunks to the 
> memory-mapped region in order (low dword first), and let things take 
> care of themselves from there.
> 
> That's pretty much the implementation that -every- driver copies, when 
> they need readq/writeq to work on a 32-bit platform.

What do you mean by low dword first ? For example, the implementation
in the s2io driver does:

static inline u64 readq(void __iomem *addr)
{
u64 ret = 0;
ret = readl(addr + 4);
ret <<= 32;
ret |= readl(addr);

return ret;
}

static inline void writeq(u64 val, void __iomem *addr)
{
writel((u32) (val), addr);
writel((u32) (val >> 32), (addr + 4));
}

As you can see, it reads the -second- dword first (high order dword in
little endian), but writes the first dword first (low order dword in
little endian).

If there is any logic here, it's card specific.

Or is this really what PCI does when doing 64 bits accesses on a 32 bits
PCI bus ? I would have expected the later (what write does) but this
driver does it reverse on reads.

I'm tempted to go to the simple

#define readq readq for now until we clear that up.
 
Ben.


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] s2io ppc64 fix for readq/writeq

2006-11-06 Thread Roland Dreier
 > For consistencies sake we really want to have readq() and writeq() available
 > on all platforms.  I remember that some IB cards require it to actually
 > be a 64bit transactions, otherwise they have to do funny workarounds.
 > I think the best solution is to define ARCH_HAS_ATOMIC_READQ_WRITEQ
 > and let drivers do their workarounds based on that.
 > 
 > I've Cc'ed Roland because he should be able to explain the IB issue in
 > details.

The issue I know about is drivers/infiniband/hw/mthca.  The card has
64-bit "doorbell registers", and the restriction is that if you write
the doorbell write two 32-bit writes, you can't write anything else on
the same register page in between writing the two halves.  Since
different CPUs might be doing stuff on the same doorbell page at the
same time, there are two things we can do:
 - If writeq() exists then use that and assume it will generate only a
   single bus transaction that can't let anything sneak in the
   middle.  (That's a fairly safe assumption because the devices being
   driven are either 64-bit PCI-X or PCIe only)
 - If writeq() doesn't exist, use a spinlock to protect access to each
   doorbell page.

ARCH_HAS_ATOMIC_READQ_WRITEQ would be fine for that, but of course the
tricky thing is writing down the exact semantics that "HAS_ATOMIC" is
actually promising.

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] s2io ppc64 fix for readq/writeq

2006-11-06 Thread Christoph Hellwig
On Mon, Nov 06, 2006 at 03:33:19PM -0500, Ramkrishna Vepa wrote:
> The 64 bit io operation on the IA64 platform is a 64 bit transaction on
> the pci bus and is optimal to leave it as such. I prefer Jeff's
> suggestion  - 
> 
> guaranteeing that a "good enough for drivers" readq() and writeq() exist
> on all platforms even 32-bit platforms where the operation isn't
> inherently atomic.

For consistencies sake we really want to have readq() and writeq() available
on all platforms.  I remember that some IB cards require it to actually
be a 64bit transactions, otherwise they have to do funny workarounds.
I think the best solution is to define ARCH_HAS_ATOMIC_READQ_WRITEQ
and let drivers do their workarounds based on that.

I've Cc'ed Roland because he should be able to explain the IB issue in
details.

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] s2io ppc64 fix for readq/writeq

2006-11-06 Thread Ramkrishna Vepa
The 64 bit io operation on the IA64 platform is a 64 bit transaction on
the pci bus and is optimal to leave it as such. I prefer Jeff's
suggestion  - 

guaranteeing that a "good enough for drivers" readq() and writeq() exist
on all platforms even 32-bit platforms where the operation isn't
inherently atomic.

Ram

> -Original Message-
> From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]
> On Behalf Of Benjamin Herrenschmidt
> Sent: Monday, November 06, 2006 1:57 AM
> To: Jeff Garzik
> Cc: Linus Torvalds; netdev@vger.kernel.org
> Subject: Re: [PATCH] s2io ppc64 fix for readq/writeq
> 
> On Mon, 2006-11-06 at 04:55 -0500, Jeff Garzik wrote:
> > Benjamin Herrenschmidt wrote:
> > > On Mon, 2006-11-06 at 01:50 -0800, Linus Torvalds wrote:
> > >> On Mon, 6 Nov 2006, Benjamin Herrenschmidt wrote:
> > >>> Anyway, what do you think of Jeff proposal to just implement
them as
> two
> > >>> 32 bits operations ? My arch guy side screams at the idea, but
if,
> > >>> indeed, drivers generally cope fine with it, I suppose that's
ok.
> > >> Last I saw, that's how normal PCI will split the IO anyway, so I
> guess it
> > >> makes sense.
> > >
> > > Hrm.. true indeed. I'll implement them that way for ppc32 then.
> >
> > Bonus points if you want to find-and-kill where individual drivers
did
> >
> > #ifndef readq
> > implement readq and writeq by hand...
> > #endif
> 
> Yes, well, we would have to make sure all archs have them defined
> first though, but I suppose I can have a look later this week, maybe
> tomorrow. Shouldn't be too hard :)
> 
> Ben.
> 
> 
> 
> -
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] s2io ppc64 fix for readq/writeq

2006-11-06 Thread Benjamin Herrenschmidt
On Mon, 2006-11-06 at 04:55 -0500, Jeff Garzik wrote:
> Benjamin Herrenschmidt wrote:
> > On Mon, 2006-11-06 at 01:50 -0800, Linus Torvalds wrote:
> >> On Mon, 6 Nov 2006, Benjamin Herrenschmidt wrote:
> >>> Anyway, what do you think of Jeff proposal to just implement them as two
> >>> 32 bits operations ? My arch guy side screams at the idea, but if,
> >>> indeed, drivers generally cope fine with it, I suppose that's ok.
> >> Last I saw, that's how normal PCI will split the IO anyway, so I guess it 
> >> makes sense.
> > 
> > Hrm.. true indeed. I'll implement them that way for ppc32 then.
> 
> Bonus points if you want to find-and-kill where individual drivers did
> 
>   #ifndef readq
>   implement readq and writeq by hand...
>   #endif

Yes, well, we would have to make sure all archs have them defined
first though, but I suppose I can have a look later this week, maybe
tomorrow. Shouldn't be too hard :)

Ben.



-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] s2io ppc64 fix for readq/writeq

2006-11-06 Thread Jeff Garzik

Benjamin Herrenschmidt wrote:

On Mon, 2006-11-06 at 01:50 -0800, Linus Torvalds wrote:

On Mon, 6 Nov 2006, Benjamin Herrenschmidt wrote:

Anyway, what do you think of Jeff proposal to just implement them as two
32 bits operations ? My arch guy side screams at the idea, but if,
indeed, drivers generally cope fine with it, I suppose that's ok.
Last I saw, that's how normal PCI will split the IO anyway, so I guess it 
makes sense.


Hrm.. true indeed. I'll implement them that way for ppc32 then.


Bonus points if you want to find-and-kill where individual drivers did

#ifndef readq
implement readq and writeq by hand...
#endif

:)

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] s2io ppc64 fix for readq/writeq

2006-11-06 Thread Benjamin Herrenschmidt
On Mon, 2006-11-06 at 01:50 -0800, Linus Torvalds wrote:
> 
> On Mon, 6 Nov 2006, Benjamin Herrenschmidt wrote:
> >
> > Anyway, what do you think of Jeff proposal to just implement them as two
> > 32 bits operations ? My arch guy side screams at the idea, but if,
> > indeed, drivers generally cope fine with it, I suppose that's ok.
> 
> Last I saw, that's how normal PCI will split the IO anyway, so I guess it 
> makes sense.

Hrm.. true indeed. I'll implement them that way for ppc32 then.

Ben.


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] s2io ppc64 fix for readq/writeq

2006-11-06 Thread Linus Torvalds


On Mon, 6 Nov 2006, Benjamin Herrenschmidt wrote:
>
> Anyway, what do you think of Jeff proposal to just implement them as two
> 32 bits operations ? My arch guy side screams at the idea, but if,
> indeed, drivers generally cope fine with it, I suppose that's ok.

Last I saw, that's how normal PCI will split the IO anyway, so I guess it 
makes sense.

Linus
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] s2io ppc64 fix for readq/writeq

2006-11-06 Thread Benjamin Herrenschmidt
On Mon, 2006-11-06 at 01:37 -0800, Linus Torvalds wrote:
> 
> On Mon, 6 Nov 2006, Jeff Garzik wrote:
> >
> > This seems a bit ugly.  Could you add
> > 
> > #define readq readq
> > 
> > to your platform instead?
> 
> Heartily agreed. MUCH better than adding unrelated #if defined() stuff, 
> whether arch-related or otherwise.

I agree it's less ugly, though I still don't like it much :-)

Anyway, what do you think of Jeff proposal to just implement them as two
32 bits operations ? My arch guy side screams at the idea, but if,
indeed, drivers generally cope fine with it, I suppose that's ok.

Ben.


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] s2io ppc64 fix for readq/writeq

2006-11-06 Thread Linus Torvalds


On Mon, 6 Nov 2006, Jeff Garzik wrote:
>
> This seems a bit ugly.  Could you add
> 
>   #define readq readq
> 
> to your platform instead?

Heartily agreed. MUCH better than adding unrelated #if defined() stuff, 
whether arch-related or otherwise.

Linus
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] s2io ppc64 fix for readq/writeq

2006-11-06 Thread Jeff Garzik

Benjamin Herrenschmidt wrote:

This is why I said "good enough for drivers".  This is _key_.

I have run into several [PCI] devices with 64-bit registers, and 
__none__ of them had requirements such that the Linux platform code 
-must- provide an atomic readq/writeq.  Probably because everybody wants 
to support 32-bit platforms with their devices.


What you call "fairly bogus" is precisely what drivers need.  These 
devices with 64-bit registers just don't need the atomicity that arch 
developers harp about :)


Is there any consistency in that case in which half need to be
read/written first ? Or none of these ever had side effects ?


Generally the kernel code should write the two 32-bit chunks to the 
memory-mapped region in order (low dword first), and let things take 
care of themselves from there.


That's pretty much the implementation that -every- driver copies, when 
they need readq/writeq to work on a 32-bit platform.


Jeff



-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] s2io ppc64 fix for readq/writeq

2006-11-06 Thread Benjamin Herrenschmidt
> This is why I said "good enough for drivers".  This is _key_.
> 
> I have run into several [PCI] devices with 64-bit registers, and 
> __none__ of them had requirements such that the Linux platform code 
> -must- provide an atomic readq/writeq.  Probably because everybody wants 
> to support 32-bit platforms with their devices.
> 
> What you call "fairly bogus" is precisely what drivers need.  These 
> devices with 64-bit registers just don't need the atomicity that arch 
> developers harp about :)

Is there any consistency in that case in which half need to be
read/written first ? Or none of these ever had side effects ?

Ben.


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] s2io ppc64 fix for readq/writeq

2006-11-06 Thread Jeff Garzik

Benjamin Herrenschmidt wrote:

This seems a bit ugly.  Could you add

#define readq readq

to your platform instead?


That's ugly too imho but I suppose I can do it :-)

I generally think it's a bug in the kernel-wide API, if use of said API 
requires arch-specific ifdefs.


Yes. I agree. In that specific case, I suppose what you propose is the
least ugly of the solutions. HAVE_ARCH_* is pretty much out of fascion
(and I tend to agree with Linus that it's not pretty anyway).

Actually, I tend to think in that specific case that the driver defining
something called readq and writeq based on a pair of readl's and
writel's is fairly bogus though.

Or maybe the problem could be solved another way, by guaranteeing that a 
"good enough for drivers" readq() and writeq() exist on all platforms, 
even 32-bit platforms where the operation isn't inherently atomic.


I'd rather not provide readq/writeq if they aren't atomic.


This is why I said "good enough for drivers".  This is _key_.

I have run into several [PCI] devices with 64-bit registers, and 
__none__ of them had requirements such that the Linux platform code 
-must- provide an atomic readq/writeq.  Probably because everybody wants 
to support 32-bit platforms with their devices.


What you call "fairly bogus" is precisely what drivers need.  These 
devices with 64-bit registers just don't need the atomicity that arch 
developers harp about :)


Jeff


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] s2io ppc64 fix for readq/writeq

2006-11-06 Thread Benjamin Herrenschmidt

> This seems a bit ugly.  Could you add
> 
>   #define readq readq
> 
> to your platform instead?

That's ugly too imho but I suppose I can do it :-)

> I generally think it's a bug in the kernel-wide API, if use of said API 
> requires arch-specific ifdefs.

Yes. I agree. In that specific case, I suppose what you propose is the
least ugly of the solutions. HAVE_ARCH_* is pretty much out of fascion
(and I tend to agree with Linus that it's not pretty anyway).

Actually, I tend to think in that specific case that the driver defining
something called readq and writeq based on a pair of readl's and
writel's is fairly bogus though.

> Or maybe the problem could be solved another way, by guaranteeing that a 
> "good enough for drivers" readq() and writeq() exist on all platforms, 
> even 32-bit platforms where the operation isn't inherently atomic.

I'd rather not provide readq/writeq if they aren't atomic.

Ben.


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] s2io ppc64 fix for readq/writeq

2006-11-05 Thread Jeff Garzik

Benjamin Herrenschmidt wrote:

The s2io driver is redefining it's own readq/writeq based on
readl/writel when the platform doesn't provide native ones. However, it
currently does so by testing #ifndef readq. While that works for now, we
are about to change ppc64 to use inline functions rather that macros for
all those IO accessors which will break that test. This fixes it. I
don't have anything less ugly at hand unfortunately.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

The patch changing ppc64 own definition is scheduled to go in 2.6.20 when
the merge window opens, so it would be nice if this patch could go in a
similar timeframe, provided that you agree with it of course. It can go
earlier as it won't break current ppc64.

Index: linux-cell/drivers/net/s2io.h
===
--- linux-cell.orig/drivers/net/s2io.h  2006-10-13 17:23:49.0 +1000
+++ linux-cell/drivers/net/s2io.h   2006-11-06 13:19:32.0 +1100
@@ -862,8 +862,10 @@ struct s2io_nic {
 #define RESET_ERROR 1;
 #define CMD_ERROR   2;
 
-/*  OS related system calls */

-#ifndef readq
+/* OS related system calls. Note that ppc64 has readq defined as
+ * an inline, not a macro
+ */
+#if !defined(CONFIG_PPC64) && !defined(readq)
 static inline u64 readq(void __iomem *addr)
 {
u64 ret = 0;
@@ -875,7 +877,7 @@ static inline u64 readq(void __iomem *ad
 }
 #endif
 
-#ifndef writeq

+#if !defined(CONFIG_PPC64) && !defined(writeq)
 static inline void writeq(u64 val, void __iomem *addr)


This seems a bit ugly.  Could you add

#define readq readq

to your platform instead?

I generally think it's a bug in the kernel-wide API, if use of said API 
requires arch-specific ifdefs.


Or maybe the problem could be solved another way, by guaranteeing that a 
"good enough for drivers" readq() and writeq() exist on all platforms, 
even 32-bit platforms where the operation isn't inherently atomic.


Jeff


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] s2io ppc64 fix for readq/writeq

2006-11-05 Thread Benjamin Herrenschmidt
The s2io driver is redefining it's own readq/writeq based on
readl/writel when the platform doesn't provide native ones. However, it
currently does so by testing #ifndef readq. While that works for now, we
are about to change ppc64 to use inline functions rather that macros for
all those IO accessors which will break that test. This fixes it. I
don't have anything less ugly at hand unfortunately.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

The patch changing ppc64 own definition is scheduled to go in 2.6.20 when
the merge window opens, so it would be nice if this patch could go in a
similar timeframe, provided that you agree with it of course. It can go
earlier as it won't break current ppc64.

Index: linux-cell/drivers/net/s2io.h
===
--- linux-cell.orig/drivers/net/s2io.h  2006-10-13 17:23:49.0 +1000
+++ linux-cell/drivers/net/s2io.h   2006-11-06 13:19:32.0 +1100
@@ -862,8 +862,10 @@ struct s2io_nic {
 #define RESET_ERROR 1;
 #define CMD_ERROR   2;
 
-/*  OS related system calls */
-#ifndef readq
+/* OS related system calls. Note that ppc64 has readq defined as
+ * an inline, not a macro
+ */
+#if !defined(CONFIG_PPC64) && !defined(readq)
 static inline u64 readq(void __iomem *addr)
 {
u64 ret = 0;
@@ -875,7 +877,7 @@ static inline u64 readq(void __iomem *ad
 }
 #endif
 
-#ifndef writeq
+#if !defined(CONFIG_PPC64) && !defined(writeq)
 static inline void writeq(u64 val, void __iomem *addr)
 {
writel((u32) (val), addr);



-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html