RE: [PATCH] s2io ppc64 fix for readq/writeq
> -Original Message- > From: Roland Dreier [mailto:[EMAIL PROTECTED] > Sent: Monday, November 06, 2006 12:55 PM > To: Christoph Hellwig > Cc: Ramkrishna Vepa; Benjamin Herrenschmidt; Jeff Garzik; Linus Torvalds; > netdev@vger.kernel.org; [EMAIL PROTECTED] > Subject: Re: [PATCH] s2io ppc64 fix for readq/writeq > > > For consistencies sake we really want to have readq() and writeq() > available > > on all platforms. I remember that some IB cards require it to actually > > be a 64bit transactions, otherwise they have to do funny workarounds. > > I think the best solution is to define ARCH_HAS_ATOMIC_READQ_WRITEQ > > and let drivers do their workarounds based on that. > > > > I've Cc'ed Roland because he should be able to explain the IB issue in > > details. > > The issue I know about is drivers/infiniband/hw/mthca. The card has > 64-bit "doorbell registers", and the restriction is that if you write > the doorbell write two 32-bit writes, you can't write anything else on > the same register page in between writing the two halves. Since > different CPUs might be doing stuff on the same doorbell page at the > same time, there are two things we can do: > - If writeq() exists then use that and assume it will generate only a >single bus transaction that can't let anything sneak in the >middle. (That's a fairly safe assumption because the devices being >driven are either 64-bit PCI-X or PCIe only) > - If writeq() doesn't exist, use a spinlock to protect access to each >doorbell page. > > ARCH_HAS_ATOMIC_READQ_WRITEQ would be fine for that, but of course the > tricky thing is writing down the exact semantics that "HAS_ATOMIC" is > actually promising. > > - R. [Ram] If the writes broken up into 32 bit writes they are posted to the bridge and need to be flushed with a lock around the whole access. This is in the domain of the driver and need not be part of the platform specific code. Ram - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] s2io ppc64 fix for readq/writeq
> Generally the kernel code should write the two 32-bit chunks to the > memory-mapped region in order (low dword first), and let things take > care of themselves from there. > > That's pretty much the implementation that -every- driver copies, when > they need readq/writeq to work on a 32-bit platform. What do you mean by low dword first ? For example, the implementation in the s2io driver does: static inline u64 readq(void __iomem *addr) { u64 ret = 0; ret = readl(addr + 4); ret <<= 32; ret |= readl(addr); return ret; } static inline void writeq(u64 val, void __iomem *addr) { writel((u32) (val), addr); writel((u32) (val >> 32), (addr + 4)); } As you can see, it reads the -second- dword first (high order dword in little endian), but writes the first dword first (low order dword in little endian). If there is any logic here, it's card specific. Or is this really what PCI does when doing 64 bits accesses on a 32 bits PCI bus ? I would have expected the later (what write does) but this driver does it reverse on reads. I'm tempted to go to the simple #define readq readq for now until we clear that up. Ben. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] s2io ppc64 fix for readq/writeq
> For consistencies sake we really want to have readq() and writeq() available > on all platforms. I remember that some IB cards require it to actually > be a 64bit transactions, otherwise they have to do funny workarounds. > I think the best solution is to define ARCH_HAS_ATOMIC_READQ_WRITEQ > and let drivers do their workarounds based on that. > > I've Cc'ed Roland because he should be able to explain the IB issue in > details. The issue I know about is drivers/infiniband/hw/mthca. The card has 64-bit "doorbell registers", and the restriction is that if you write the doorbell write two 32-bit writes, you can't write anything else on the same register page in between writing the two halves. Since different CPUs might be doing stuff on the same doorbell page at the same time, there are two things we can do: - If writeq() exists then use that and assume it will generate only a single bus transaction that can't let anything sneak in the middle. (That's a fairly safe assumption because the devices being driven are either 64-bit PCI-X or PCIe only) - If writeq() doesn't exist, use a spinlock to protect access to each doorbell page. ARCH_HAS_ATOMIC_READQ_WRITEQ would be fine for that, but of course the tricky thing is writing down the exact semantics that "HAS_ATOMIC" is actually promising. - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] s2io ppc64 fix for readq/writeq
On Mon, Nov 06, 2006 at 03:33:19PM -0500, Ramkrishna Vepa wrote: > The 64 bit io operation on the IA64 platform is a 64 bit transaction on > the pci bus and is optimal to leave it as such. I prefer Jeff's > suggestion - > > guaranteeing that a "good enough for drivers" readq() and writeq() exist > on all platforms even 32-bit platforms where the operation isn't > inherently atomic. For consistencies sake we really want to have readq() and writeq() available on all platforms. I remember that some IB cards require it to actually be a 64bit transactions, otherwise they have to do funny workarounds. I think the best solution is to define ARCH_HAS_ATOMIC_READQ_WRITEQ and let drivers do their workarounds based on that. I've Cc'ed Roland because he should be able to explain the IB issue in details. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] s2io ppc64 fix for readq/writeq
The 64 bit io operation on the IA64 platform is a 64 bit transaction on the pci bus and is optimal to leave it as such. I prefer Jeff's suggestion - guaranteeing that a "good enough for drivers" readq() and writeq() exist on all platforms even 32-bit platforms where the operation isn't inherently atomic. Ram > -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > On Behalf Of Benjamin Herrenschmidt > Sent: Monday, November 06, 2006 1:57 AM > To: Jeff Garzik > Cc: Linus Torvalds; netdev@vger.kernel.org > Subject: Re: [PATCH] s2io ppc64 fix for readq/writeq > > On Mon, 2006-11-06 at 04:55 -0500, Jeff Garzik wrote: > > Benjamin Herrenschmidt wrote: > > > On Mon, 2006-11-06 at 01:50 -0800, Linus Torvalds wrote: > > >> On Mon, 6 Nov 2006, Benjamin Herrenschmidt wrote: > > >>> Anyway, what do you think of Jeff proposal to just implement them as > two > > >>> 32 bits operations ? My arch guy side screams at the idea, but if, > > >>> indeed, drivers generally cope fine with it, I suppose that's ok. > > >> Last I saw, that's how normal PCI will split the IO anyway, so I > guess it > > >> makes sense. > > > > > > Hrm.. true indeed. I'll implement them that way for ppc32 then. > > > > Bonus points if you want to find-and-kill where individual drivers did > > > > #ifndef readq > > implement readq and writeq by hand... > > #endif > > Yes, well, we would have to make sure all archs have them defined > first though, but I suppose I can have a look later this week, maybe > tomorrow. Shouldn't be too hard :) > > Ben. > > > > - > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] s2io ppc64 fix for readq/writeq
On Mon, 2006-11-06 at 04:55 -0500, Jeff Garzik wrote: > Benjamin Herrenschmidt wrote: > > On Mon, 2006-11-06 at 01:50 -0800, Linus Torvalds wrote: > >> On Mon, 6 Nov 2006, Benjamin Herrenschmidt wrote: > >>> Anyway, what do you think of Jeff proposal to just implement them as two > >>> 32 bits operations ? My arch guy side screams at the idea, but if, > >>> indeed, drivers generally cope fine with it, I suppose that's ok. > >> Last I saw, that's how normal PCI will split the IO anyway, so I guess it > >> makes sense. > > > > Hrm.. true indeed. I'll implement them that way for ppc32 then. > > Bonus points if you want to find-and-kill where individual drivers did > > #ifndef readq > implement readq and writeq by hand... > #endif Yes, well, we would have to make sure all archs have them defined first though, but I suppose I can have a look later this week, maybe tomorrow. Shouldn't be too hard :) Ben. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] s2io ppc64 fix for readq/writeq
Benjamin Herrenschmidt wrote: On Mon, 2006-11-06 at 01:50 -0800, Linus Torvalds wrote: On Mon, 6 Nov 2006, Benjamin Herrenschmidt wrote: Anyway, what do you think of Jeff proposal to just implement them as two 32 bits operations ? My arch guy side screams at the idea, but if, indeed, drivers generally cope fine with it, I suppose that's ok. Last I saw, that's how normal PCI will split the IO anyway, so I guess it makes sense. Hrm.. true indeed. I'll implement them that way for ppc32 then. Bonus points if you want to find-and-kill where individual drivers did #ifndef readq implement readq and writeq by hand... #endif :) - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] s2io ppc64 fix for readq/writeq
On Mon, 2006-11-06 at 01:50 -0800, Linus Torvalds wrote: > > On Mon, 6 Nov 2006, Benjamin Herrenschmidt wrote: > > > > Anyway, what do you think of Jeff proposal to just implement them as two > > 32 bits operations ? My arch guy side screams at the idea, but if, > > indeed, drivers generally cope fine with it, I suppose that's ok. > > Last I saw, that's how normal PCI will split the IO anyway, so I guess it > makes sense. Hrm.. true indeed. I'll implement them that way for ppc32 then. Ben. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] s2io ppc64 fix for readq/writeq
On Mon, 6 Nov 2006, Benjamin Herrenschmidt wrote: > > Anyway, what do you think of Jeff proposal to just implement them as two > 32 bits operations ? My arch guy side screams at the idea, but if, > indeed, drivers generally cope fine with it, I suppose that's ok. Last I saw, that's how normal PCI will split the IO anyway, so I guess it makes sense. Linus - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] s2io ppc64 fix for readq/writeq
On Mon, 2006-11-06 at 01:37 -0800, Linus Torvalds wrote: > > On Mon, 6 Nov 2006, Jeff Garzik wrote: > > > > This seems a bit ugly. Could you add > > > > #define readq readq > > > > to your platform instead? > > Heartily agreed. MUCH better than adding unrelated #if defined() stuff, > whether arch-related or otherwise. I agree it's less ugly, though I still don't like it much :-) Anyway, what do you think of Jeff proposal to just implement them as two 32 bits operations ? My arch guy side screams at the idea, but if, indeed, drivers generally cope fine with it, I suppose that's ok. Ben. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] s2io ppc64 fix for readq/writeq
On Mon, 6 Nov 2006, Jeff Garzik wrote: > > This seems a bit ugly. Could you add > > #define readq readq > > to your platform instead? Heartily agreed. MUCH better than adding unrelated #if defined() stuff, whether arch-related or otherwise. Linus - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] s2io ppc64 fix for readq/writeq
Benjamin Herrenschmidt wrote: This is why I said "good enough for drivers". This is _key_. I have run into several [PCI] devices with 64-bit registers, and __none__ of them had requirements such that the Linux platform code -must- provide an atomic readq/writeq. Probably because everybody wants to support 32-bit platforms with their devices. What you call "fairly bogus" is precisely what drivers need. These devices with 64-bit registers just don't need the atomicity that arch developers harp about :) Is there any consistency in that case in which half need to be read/written first ? Or none of these ever had side effects ? Generally the kernel code should write the two 32-bit chunks to the memory-mapped region in order (low dword first), and let things take care of themselves from there. That's pretty much the implementation that -every- driver copies, when they need readq/writeq to work on a 32-bit platform. Jeff - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] s2io ppc64 fix for readq/writeq
> This is why I said "good enough for drivers". This is _key_. > > I have run into several [PCI] devices with 64-bit registers, and > __none__ of them had requirements such that the Linux platform code > -must- provide an atomic readq/writeq. Probably because everybody wants > to support 32-bit platforms with their devices. > > What you call "fairly bogus" is precisely what drivers need. These > devices with 64-bit registers just don't need the atomicity that arch > developers harp about :) Is there any consistency in that case in which half need to be read/written first ? Or none of these ever had side effects ? Ben. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] s2io ppc64 fix for readq/writeq
Benjamin Herrenschmidt wrote: This seems a bit ugly. Could you add #define readq readq to your platform instead? That's ugly too imho but I suppose I can do it :-) I generally think it's a bug in the kernel-wide API, if use of said API requires arch-specific ifdefs. Yes. I agree. In that specific case, I suppose what you propose is the least ugly of the solutions. HAVE_ARCH_* is pretty much out of fascion (and I tend to agree with Linus that it's not pretty anyway). Actually, I tend to think in that specific case that the driver defining something called readq and writeq based on a pair of readl's and writel's is fairly bogus though. Or maybe the problem could be solved another way, by guaranteeing that a "good enough for drivers" readq() and writeq() exist on all platforms, even 32-bit platforms where the operation isn't inherently atomic. I'd rather not provide readq/writeq if they aren't atomic. This is why I said "good enough for drivers". This is _key_. I have run into several [PCI] devices with 64-bit registers, and __none__ of them had requirements such that the Linux platform code -must- provide an atomic readq/writeq. Probably because everybody wants to support 32-bit platforms with their devices. What you call "fairly bogus" is precisely what drivers need. These devices with 64-bit registers just don't need the atomicity that arch developers harp about :) Jeff - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] s2io ppc64 fix for readq/writeq
> This seems a bit ugly. Could you add > > #define readq readq > > to your platform instead? That's ugly too imho but I suppose I can do it :-) > I generally think it's a bug in the kernel-wide API, if use of said API > requires arch-specific ifdefs. Yes. I agree. In that specific case, I suppose what you propose is the least ugly of the solutions. HAVE_ARCH_* is pretty much out of fascion (and I tend to agree with Linus that it's not pretty anyway). Actually, I tend to think in that specific case that the driver defining something called readq and writeq based on a pair of readl's and writel's is fairly bogus though. > Or maybe the problem could be solved another way, by guaranteeing that a > "good enough for drivers" readq() and writeq() exist on all platforms, > even 32-bit platforms where the operation isn't inherently atomic. I'd rather not provide readq/writeq if they aren't atomic. Ben. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] s2io ppc64 fix for readq/writeq
Benjamin Herrenschmidt wrote: The s2io driver is redefining it's own readq/writeq based on readl/writel when the platform doesn't provide native ones. However, it currently does so by testing #ifndef readq. While that works for now, we are about to change ppc64 to use inline functions rather that macros for all those IO accessors which will break that test. This fixes it. I don't have anything less ugly at hand unfortunately. Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]> --- The patch changing ppc64 own definition is scheduled to go in 2.6.20 when the merge window opens, so it would be nice if this patch could go in a similar timeframe, provided that you agree with it of course. It can go earlier as it won't break current ppc64. Index: linux-cell/drivers/net/s2io.h === --- linux-cell.orig/drivers/net/s2io.h 2006-10-13 17:23:49.0 +1000 +++ linux-cell/drivers/net/s2io.h 2006-11-06 13:19:32.0 +1100 @@ -862,8 +862,10 @@ struct s2io_nic { #define RESET_ERROR 1; #define CMD_ERROR 2; -/* OS related system calls */ -#ifndef readq +/* OS related system calls. Note that ppc64 has readq defined as + * an inline, not a macro + */ +#if !defined(CONFIG_PPC64) && !defined(readq) static inline u64 readq(void __iomem *addr) { u64 ret = 0; @@ -875,7 +877,7 @@ static inline u64 readq(void __iomem *ad } #endif -#ifndef writeq +#if !defined(CONFIG_PPC64) && !defined(writeq) static inline void writeq(u64 val, void __iomem *addr) This seems a bit ugly. Could you add #define readq readq to your platform instead? I generally think it's a bug in the kernel-wide API, if use of said API requires arch-specific ifdefs. Or maybe the problem could be solved another way, by guaranteeing that a "good enough for drivers" readq() and writeq() exist on all platforms, even 32-bit platforms where the operation isn't inherently atomic. Jeff - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html