Re: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2007-10-18 Thread Mikulas Patocka
> > > > You already must not place any data structures into WC memory --- for > > > > example, spinlocks wouldn't work there. > > > > > > What do you mean "already"? > > > > I mean "in current kernel" (I checked it in 2.6.22) > > Ahh, that's not "current kernel", though ;) > > 4071c718555d955a

Re: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2007-10-17 Thread Nick Piggin
On Wed, Oct 17, 2007 at 01:51:17PM +0800, Herbert Xu wrote: > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > > Also, for non-wb memory. I don't think the Intel document referenced > > says anything about this, but the AMD document says that loads can pass > > loads (page 8, rule b). > > > > This is

Re: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2007-10-17 Thread Nick Piggin
On Wed, Oct 17, 2007 at 02:30:32AM +0200, Mikulas Patocka wrote: > > > You already must not place any data structures into WC memory --- for > > > example, spinlocks wouldn't work there. > > > > What do you mean "already"? > > I mean "in current kernel" (I checked it in 2.6.22) Ahh, that's not

Re: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2007-10-16 Thread Herbert Xu
Nick Piggin <[EMAIL PROTECTED]> wrote: > > Also, for non-wb memory. I don't think the Intel document referenced > says anything about this, but the AMD document says that loads can pass > loads (page 8, rule b). > > This is why our rmb() is still an lfence. BTW, Xen (in particular, the code in dr
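The point quoted above (loads from non-WB memory may pass other loads, so rmb() stays an lfence) can be illustrated with a small user-space sketch. This is not the kernel's rmb() definition: `demo_rmb`, `struct demo_msg` and `demo_read` are hypothetical names made up for illustration, and the `_mm_lfence()` intrinsic stands in for the raw instruction.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>              /* _mm_lfence() emits LFENCE */
#define demo_rmb() _mm_lfence()
#else
/* Assumption: on non-SSE2 builds fall back to a full barrier (GCC/Clang builtin). */
#define demo_rmb() __sync_synchronize()
#endif

/* Hypothetical message slot: a producer writes payload, then sets ready. */
struct demo_msg {
    volatile int ready;
    volatile int payload;
};

/*
 * Read the payload only after observing the flag. The read barrier keeps
 * the payload load from being satisfied before the flag load, which is
 * the ordering rmb() is asked to provide.
 * Returns 1 and stores the payload in *out if the slot was ready, else 0.
 */
static int demo_read(const struct demo_msg *m, int *out)
{
    if (!m->ready)
        return 0;
    demo_rmb();
    *out = m->payload;
    return 1;
}
```

For ordinary write-back memory the thread argues that the hardware already orders loads, so the fence in this sketch matters only for the non-WB case under discussion.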

Re: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2007-10-16 Thread Mikulas Patocka
> > You already must not place any data structures into WC memory --- for > > example, spinlocks wouldn't work there. > > What do you mean "already"? I mean "in current kernel" (I checked it in 2.6.22) > If we already have drivers loading data from > WC memory, then rmb() needs to order them, w

Re: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2007-10-16 Thread Nick Piggin
On Wed, Oct 17, 2007 at 01:05:16AM +0200, Mikulas Patocka wrote: > > > I see, AMD says that WC memory loads can be out-of-order. > > > > > > There is very little usability to it --- framebuffer and AGP aperture is > > > the only piece of memory that is WC and no kernel structures are placed > >

Re: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2007-10-16 Thread Mikulas Patocka
> > I see, AMD says that WC memory loads can be out-of-order. > > > > There is very little usability to it --- framebuffer and AGP aperture is > > the only piece of memory that is WC and no kernel structures are placed > > there, so it is possible to remove that lfence. > > No. In Linux kernel,

Re: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2007-10-16 Thread Nick Piggin
On Tue, Oct 16, 2007 at 12:33:54PM +0200, Mikulas Patocka wrote: > > > On Tue, 16 Oct 2007, Nick Piggin wrote: > > > > > The cpus also have an explicit set of instructions that deliberately do > > > > unordered stores/loads, and s/lfence etc are mostly designed for those. > > > > > > I know ab

Re: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2007-10-16 Thread Mikulas Patocka
On Tue, 16 Oct 2007, Nick Piggin wrote: > > > The cpus also have an explicit set of instructions that deliberately do > > > unordered stores/loads, and s/lfence etc are mostly designed for those. > > > > I know about unordered stores (movnti & similar) --- they basically use > > write-combini

Re: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2007-10-16 Thread Mikulas Patocka
On Mon, 15 Oct 2007, H. Peter Anvin wrote: > Mikulas Patocka wrote: > > > > I know about unordered stores (movnti & similar) --- they basically use > > write-combining method on memory that is normally write-back --- and they > > need sfence. But which one instruction does unordered load and need

Re: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2007-10-15 Thread Nick Piggin
On Tue, Oct 16, 2007 at 12:08:01AM +0200, Mikulas Patocka wrote: > > On Mon, 15 Oct 2007 22:47:42 +0200 (CEST) > > Mikulas Patocka <[EMAIL PROTECTED]> wrote: > > > > > > According to latest memory ordering specification documents from > > > > Intel and AMD, both manufacturers are committed to in-o

Re: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2007-10-15 Thread H. Peter Anvin
Mikulas Patocka wrote: > I know about unordered stores (movnti & similar) --- they basically use write-combining method on memory that is normally write-back --- and they need sfence. But which one instruction does unordered load and needs lfence? PREFETCHNTA. -hpa
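The quoted remark about movnti and sfence can be sketched in user-space C. This is only an illustration under stated assumptions: `fill_nt` is a hypothetical helper, and the SSE2 intrinsic `_mm_stream_si32()` is used as the compiler-level form of MOVNTI; nothing here is taken from the patch under discussion.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* _mm_stream_si32 (MOVNTI) and _mm_sfence (SFENCE) */
#define DEMO_HAVE_NT 1
#else
#define DEMO_HAVE_NT 0   /* assumption: plain stores on non-SSE2 builds */
#endif

/*
 * Hypothetical helper: fill buf[0..n) with value using non-temporal
 * stores, then fence. Non-temporal stores are weakly ordered, so the
 * SFENCE is what makes them visible before any store issued afterwards
 * (for example, a "data is ready" flag).
 */
static void fill_nt(int *buf, size_t n, int value)
{
    for (size_t i = 0; i < n; i++) {
#if DEMO_HAVE_NT
        _mm_stream_si32(&buf[i], value);   /* write-combining style store */
#else
        buf[i] = value;
#endif
    }
#if DEMO_HAVE_NT
    _mm_sfence();                          /* order the NT stores */
#endif
}
```

A caller would invoke `fill_nt(buf, n, value)` and only then publish a flag to other CPUs; ordinary stores to write-back memory would not need the fence.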

Re: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2007-10-15 Thread Mikulas Patocka
> On Mon, 15 Oct 2007 22:47:42 +0200 (CEST) > Mikulas Patocka <[EMAIL PROTECTED]> wrote: > > > > According to latest memory ordering specification documents from > > > Intel and AMD, both manufacturers are committed to in-order loads > > > from cacheable memory for the x86 architecture. Hence, smp

Re: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2007-10-15 Thread Arjan van de Ven
On Mon, 15 Oct 2007 22:47:42 +0200 (CEST) Mikulas Patocka <[EMAIL PROTECTED]> wrote: > > According to latest memory ordering specification documents from > > Intel and AMD, both manufacturers are committed to in-order loads > > from cacheable memory for the x86 architecture. Hence, smp_rmb() > > m