Re: Intel Memory Ordering White Paper

2007-09-19 Thread Andi Kleen

Jesse Barnes <[EMAIL PROTECTED]> writes:
> 
> It's really both (1) and (2).  This document will become part of the 
> regular manuals when the next version is published.  And yes, 
> processors may do something different internally, but software can rely 
> on the behavior described by the rules in the document.

... until the first erratum comes around. With the multitude of x86
cores being introduced all the time (how many did Intel alone just announce
at IDF?), that is going to happen sooner or later.

i386 with full legacy enabled already has to care about old PPros and 
those seriously violate write ordering.

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Intel Memory Ordering White Paper

2007-09-19 Thread Jesse Barnes

On Wednesday, September 12, 2007 11:26 am Dr. David Alan Gilbert wrote:
> * Jesse Barnes ([EMAIL PROTECTED]) wrote:
> > FYI, we just released a new white paper describing memory ordering
> > for Intel processors:
> > http://developer.intel.com/products/processor/manuals/index.htm
> >
> > Should help answer some questions about some of the ordering
> > primitives we use on i386 and x86_64.
>
> Hi Jesse,
>   Thanks for letting everyone know about that paper, however - it
> has confused me somewhat; there seem to be differences in that
> description and that described in the 'Intel 64 and IA-32
> Architectures Software Developer's Manual' and I'd like to understand
> whether this paper is designed just to explain points or is actually
> intended to change what can be expected of the processor.
>
> That ordering doc states:
> 'Loads are not reordered with other loads'
>
> Vol3a section 7.2.1 of the architecture manual states:
>
> 'Reads can be carried out speculatively and in any order.'
>
> Is this a:
>   1) Change in the definition of the architecture that existing
> processors actually follow anyway.
>   2) A difference between what the processor does and what is visible
> to the software (the intro to this paper does seem to emphasize
> software visibility more than the architecture manual).
>   3) Some other difference I haven't spotted.

It's really both (1) and (2).  This document will become part of the 
regular manuals when the next version is published.  And yes, 
processors may do something different internally, but software can rely 
on the behavior described by the rules in the document.

Jesse

Re: Intel Memory Ordering White Paper

2007-09-12 Thread Dr. David Alan Gilbert

* Jesse Barnes ([EMAIL PROTECTED]) wrote:
> FYI, we just released a new white paper describing memory ordering for 
> Intel processors:
> http://developer.intel.com/products/processor/manuals/index.htm
> 
> Should help answer some questions about some of the ordering primitives 
> we use on i386 and x86_64.

Hi Jesse,
  Thanks for letting everyone know about that paper, however - it
has confused me somewhat; there seem to be differences in that
description and that described in the 'Intel 64 and IA-32 Architectures
Software Developer's Manual' and I'd like to understand whether
this paper is designed just to explain points or is actually 
intended to change what can be expected of the processor.

That ordering doc states:
'Loads are not reordered with other loads'

Vol3a section 7.2.1 of the architecture manual states:

'Reads can be carried out speculatively and in any order.'

Is this a:
  1) Change in the definition of the architecture that existing
processors actually follow anyway.
  2) A difference between what the processor does and what is visible
to the software (the intro to this paper does seem to emphasize
software visibility more than the architecture manual).
  3) Some other difference I haven't spotted.

The other thing that made me think about it was that the Itanium
Architecture Software Dev Manual vol2 2.1.2 states that the Itanium
uses ld.acq/st.rel (acquire/release) references to
'operate according to the IA-32 ordering model', which I think means
that all those loads are in order relative to all the other acquire
loads?

Dave

-- 
 -Open up your eyes, open up your mind, open up your code ---   
/ Dr. David Alan Gilbert| Running GNU/Linux on Alpha,68K| Happy  \ 
\ gro.gilbert @ treblig.org | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
 \ _|_ http://www.treblig.org   |___/

Re: Intel Memory Ordering White Paper

2007-09-08 Thread H. Peter Anvin


Nick Piggin wrote:
> smp_rmb() should not need to do anything because loads are done
> in order anyway. Both AMD and Intel have committed to this now.
>
> The important point is that they *appear* to be done in order. AFAIK,
> the CPUs can still do speculative and out of order loads, but throw
> out the results if they could be wrong.


Is there anything even semiofficial from VIA?  Not that the x86 
architecture isn't pretty much definable as the AMD-Intel consensus...


-hpa


Re: Intel Memory Ordering White Paper

2007-09-08 Thread Alan Cox

> AMD processors guarantee loads are ordered and stores are ordered
> (with exceptions of non-temporal, and non-wb policy).
> 
> As for the others that do out of order stores, are any of them SMP?

IDT winchip isn't, Geode isn't

Re: Intel Memory Ordering White Paper

2007-09-08 Thread dean gaudet

On Sat, 8 Sep 2007, Petr Vandrovec wrote:

> dean gaudet wrote:
> > On Sun, 9 Sep 2007, Nick Piggin wrote:
> > 
> > > I've also heard that string operations do not follow the normal ordering,
> > > but that's just with respect to individual loads/stores in the one
> > > operation, I hope? And they will still follow ordering rules WRT
> > > surrounding loads and stores?
> > 
> > see section 7.2.3 of intel volume 3A...
> > 
> > "Code dependent upon sequential store ordering should not use the string
> > operations for the entire data structure to be stored. Data and semaphores
> > should be separated. Order dependent code should use a discrete semaphore
> > uniquely stored to after any string operations to allow correctly ordered
> > data to be seen by all processors."
> > 
> > i think we need sfence after things like copy_page, clear_page, and possibly
> > copy_user... at least on intel processors with fast strings option enabled.
> 
> I do not think so.  I believe that the authors are trying to say that
> 
> struct { uint8 lock; uint8 data; } x;
> 
> lea (x.data),%edi
> mov $2,%ecx
> std
> rep movsb
> 
> to set both data and lock does not guarantee that x.lock will be set after
> x.data and that you should do
> 
> lea (x.data),%edi
> std
> movsb
> movsb  # or mov (%esi),%al; mov %al,(%edi), but movsb looks discrete enough to me
> 
> instead (and yes, I know that my example is silly).

no it's worse than that -- intel fast string stores can become globally 
visible in any order at all w.r.t. normal loads or stores... so take all 
those great examples in their recent whitepaper and throw out all the 
ordering guarantees for addresses on different cachelines if any of the 
stores are rep string.

for example transitive store ordering for locations on multiple cachelines 
is not guaranteed at all.  the kernel could return a zero page and one 
core could see the zeroes out of order with another core performing some 
sort of lockless data structure operation.

fast strings don't break ordering from the point of view of the core 
performing the rep string operation, but externally there are no 
guarantees (it's right there in the docs).

-dean
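
The pattern the quoted manual text recommends (bulk data written first, then a separate, discrete flag store) can be sketched roughly as follows. This is a minimal illustration in C11 atomics, not kernel code: the `sketch_` names and the page size constant are hypothetical, and `memset` merely stands in for a string operation such as the one behind clear_page. Whether the release store below is enough after fast-string stores on a given CPU, or whether an explicit sfence is also required, is exactly the open question in this thread; the sketch does not settle it.

```c
#include <stdatomic.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096   /* hypothetical size, for illustration only */

struct sketch_page {
    unsigned char data[SKETCH_PAGE_SIZE];
    atomic_int ready;   /* the "discrete semaphore", kept apart from the data */
};

/* Producer: bulk-fill the data, then publish it with one separate store. */
static void sketch_publish_zero_page(struct sketch_page *p)
{
    /* The compiler or libc may implement this with a rep string operation. */
    memset(p->data, 0, sizeof p->data);

    /* Discrete flag store, issued only after the bulk fill. */
    atomic_store_explicit(&p->ready, 1, memory_order_release);
}

/*
 * Consumer: returns 0 if the page is not published yet, 1 if it is
 * published and every byte is zero, -1 if a non-zero byte is seen.
 */
static int sketch_check_zero_page(struct sketch_page *p)
{
    if (!atomic_load_explicit(&p->ready, memory_order_acquire))
        return 0;
    for (size_t i = 0; i < sizeof p->data; i++)
        if (p->data[i] != 0)
            return -1;
    return 1;
}
```

The point of the layout is that the flag is never written by the same string operation that writes the data, so the consumer only trusts the data after observing the flag.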

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Petr Vandrovec


dean gaudet wrote:
> On Sun, 9 Sep 2007, Nick Piggin wrote:
>
> > I've also heard that string operations do not follow the normal ordering, but
> > that's just with respect to individual loads/stores in the one operation, I
> > hope? And they will still follow ordering rules WRT surrounding loads and
> > stores?
>
> see section 7.2.3 of intel volume 3A...
>
> "Code dependent upon sequential store ordering should not use the string
> operations for the entire data structure to be stored. Data and semaphores
> should be separated. Order dependent code should use a discrete semaphore
> uniquely stored to after any string operations to allow correctly ordered
> data to be seen by all processors."
>
> i think we need sfence after things like copy_page, clear_page, and
> possibly copy_user... at least on intel processors with fast strings
> option enabled.


I do not think so.  I believe that the authors are trying to say that

struct { uint8 lock; uint8 data; } x;

lea (x.data),%edi
mov $2,%ecx
std
rep movsb

to set both data and lock does not guarantee that x.lock will be set 
after x.data and that you should do


lea (x.data),%edi
std
movsb
movsb  # or mov (%esi),%al; mov %al,(%edi), but movsb looks discrete enough to me


instead (and yes, I know that my example is silly).
Petr


Re: Intel Memory Ordering White Paper

2007-09-08 Thread dean gaudet

On Sun, 9 Sep 2007, Nick Piggin wrote:

> I've also heard that string operations do not follow the normal ordering, but
> that's just with respect to individual loads/stores in the one operation, I
> hope? And they will still follow ordering rules WRT surrounding loads and
> stores?

see section 7.2.3 of intel volume 3A...

"Code dependent upon sequential store ordering should not use the string 
operations for the entire data structure to be stored. Data and semaphores 
should be separated. Order dependent code should use a discrete semaphore 
uniquely stored to after any string operations to allow correctly ordered 
data to be seen by all processors."

i think we need sfence after things like copy_page, clear_page, and 
possibly copy_user... at least on intel processors with fast strings 
option enabled.

-dean

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Nick Piggin

On Saturday 08 September 2007 20:29, Alan Cox wrote:
> On Fri, 7 Sep 2007 15:26:50 -0700
>
> Jesse Barnes <[EMAIL PROTECTED]> wrote:
> > FYI, we just released a new white paper describing memory ordering for
> > Intel processors:
> > http://developer.intel.com/products/processor/manuals/index.htm
> >
> > Should help answer some questions about some of the ordering primitives
> > we use on i386 and x86_64.
>
> Nice - but it appears to be 64bit only - and indeed it appears to be
> untrue for real 32bit because of the Pentium Pro fencing errata.

As I said, we're not doing anything special in barriers for the ppro errata
today anyway.

> The kernel also runs on IDT Winchip, Cyrix and AMD processors not all of
> which have exactly the same behaviour (the IDT Winchip as we run it
> profoundly differs)

AMD processors guarantee loads are ordered and stores are ordered
(with exceptions of non-temporal, and non-wb policy).

As for the others that do out of order stores, are any of them SMP?

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Nick Piggin

On Saturday 08 September 2007 20:30, Alan Cox wrote:
> On Sat, 8 Sep 2007 18:54:57 +1000
>
> Nick Piggin <[EMAIL PROTECTED]> wrote:
> > On Saturday 08 September 2007 08:26, Jesse Barnes wrote:
> > > FYI, we just released a new white paper describing memory ordering for
> > > Intel processors:
> > > http://developer.intel.com/products/processor/manuals/index.htm
> > >
> > > Should help answer some questions about some of the ordering primitives
> > > we use on i386 and x86_64.
> >
> > So, can we finally noop smp_rmb and smp_wmb on x86?
>
> Nakked-by: Alan Cox <[EMAIL PROTECTED]>
>
> You can only no-op it on 64bit Intel processors. On 32bit it needs to be
> conditional on whether your processor family (or back compat for it) as
> the Pentium Pro has some serious store ordering errata (hence the way it
> needs lock decb for spin_unlock)

We already noop smp_wmb on i386 even when CONFIG_X86_PPRO_FENCE.

I'm not sure if either erratum can be solved completely by adding lock ops
in barrier instructions anyway: they both seem to involve situations where
there is just a single problematic cacheline in question.

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Nick Piggin

On Saturday 08 September 2007 20:19, Andi Kleen wrote:
> On Friday 07 September 2007 21:57:35 Nick Piggin wrote:
> > > > Anyway, the lfence should be able to go away without so much trouble.
> > >
> > > You mean sfence? lfence in rmb is definitely needed.
> >
> > I mean lfence in smp_rmb().
>
> One point of rmb is to stop speculative loads and I don't think we
> can get that without lfence.

smp_rmb() should not need to do anything because loads are done
in order anyway. Both AMD and Intel have committed to this now.

The important point is that they *appear* to be done in order. AFAIK,
the CPUs can still do speculative and out of order loads, but throw
out the results if they could be wrong.
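
The distinction being argued here can be sketched as follows, assuming GCC-style inline assembly. The `sketch_` names are hypothetical and are not the kernel's definitions. If loads appear in program order to software, the SMP read barrier only has to stop the compiler from reordering accesses, while a mandatory read barrier could still keep the lfence instruction.

```c
/* Hypothetical names for illustration; not the kernel's actual macros. */

/* Compiler-only barrier: emits no instruction, but prevents the compiler
 * from moving memory accesses across this point. */
#define sketch_barrier() __asm__ __volatile__("" ::: "memory")

/* SMP read barrier under the "loads appear in order" rule. */
#define sketch_smp_rmb() sketch_barrier()

/* Mandatory read barrier that keeps the fence instruction on x86. */
#if defined(__x86_64__) || defined(__i386__)
#define sketch_rmb() __asm__ __volatile__("lfence" ::: "memory")
#else
#define sketch_rmb() sketch_barrier()
#endif

struct sketch_message {
    int payload;
    int ready;
};

/*
 * Reader side of a flag/payload handoff: load the flag first, then the
 * payload. Returns 1 and fills *out when the flag is set, otherwise 0
 * and leaves *out untouched.
 */
static int sketch_try_read(volatile struct sketch_message *m, int *out)
{
    if (!m->ready)
        return 0;
    sketch_smp_rmb();   /* keep the payload load after the flag load */
    *out = m->payload;
    return 1;
}
```

With this definition the barrier costs nothing at run time; it only constrains code generation, which is the whole of what "noop smp_rmb" would mean in practice.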

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Alan Cox

On Sat, 8 Sep 2007 18:54:57 +1000
Nick Piggin <[EMAIL PROTECTED]> wrote:

> On Saturday 08 September 2007 08:26, Jesse Barnes wrote:
> > FYI, we just released a new white paper describing memory ordering for
> > Intel processors:
> > http://developer.intel.com/products/processor/manuals/index.htm
> >
> > Should help answer some questions about some of the ordering primitives
> > we use on i386 and x86_64.
> 
> So, can we finally noop smp_rmb and smp_wmb on x86?

Nakked-by: Alan Cox <[EMAIL PROTECTED]>

You can only no-op it on 64bit Intel processors. On 32bit it needs to be
conditional on your processor family (or back compat for it), as the
Pentium Pro has some serious store ordering errata (hence the way it
needs lock decb for spin_unlock)

Alan
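
To illustrate the spin_unlock point: a minimal sketch in C11 atomics, with hypothetical `sketch_` names, contrasting an unlock done with a plain store against one done with an atomic read-modify-write. On x86, compilers typically emit an ordinary mov for the release store and a lock-prefixed instruction for the read-modify-write, which is the general shape of the locked unlock mentioned for the Pentium Pro. This is not the kernel's spinlock code.

```c
#include <stdatomic.h>

struct sketch_spinlock {
    atomic_int locked;   /* 0 = free, 1 = held */
};

/* Try to take the lock once; returns 1 on success, 0 if already held. */
static int sketch_trylock(struct sketch_spinlock *l)
{
    return atomic_exchange_explicit(&l->locked, 1,
                                    memory_order_acquire) == 0;
}

/* Unlock with a plain release store (relies on in-order stores). */
static void sketch_unlock_plain(struct sketch_spinlock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/* Unlock with an atomic read-modify-write (a locked operation on x86). */
static void sketch_unlock_locked(struct sketch_spinlock *l)
{
    atomic_fetch_sub_explicit(&l->locked, 1, memory_order_release);
}
```

Both variants leave the lock word at 0; the difference is only in which instruction performs the store and therefore in how much the code depends on the CPU honouring store ordering.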

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Alan Cox

On Fri, 7 Sep 2007 15:26:50 -0700
Jesse Barnes <[EMAIL PROTECTED]> wrote:

> FYI, we just released a new white paper describing memory ordering for 
> Intel processors:
> http://developer.intel.com/products/processor/manuals/index.htm
> 
> Should help answer some questions about some of the ordering primitives 
> we use on i386 and x86_64.

Nice - but it appears to be 64bit only - and indeed it appears to be
untrue for real 32bit because of the Pentium Pro fencing errata.

The kernel also runs on IDT Winchip, Cyrix and AMD processors not all of
which have exactly the same behaviour (the IDT Winchip as we run it
profoundly differs)

Alan

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Andi Kleen

On Friday 07 September 2007 21:57:35 Nick Piggin wrote:

> 
> > > Anyway, the lfence should be able to go away without so much trouble.
> >
> > You mean sfence? lfence in rmb is definitely needed.
> 
> I mean lfence in smp_rmb().

One point of rmb is to stop speculative loads and I don't think we 
can get that without lfence.

-Andi


Re: Intel Memory Ordering White Paper

2007-09-08 Thread Nick Piggin

On Saturday 08 September 2007 18:53, Andi Kleen wrote:
> On Friday 07 September 2007 20:13:12 Nick Piggin wrote:
> > On Sunday 09 September 2007 03:48, Nick Piggin wrote:
> > > There is some suggestion in the source code that non-temporal stores
> > > (movntq) are weakly ordered. But AFAIKS from the documents, it is
> > > ordered when operating on wb memory. What's the situation there?
> >
> > Sorry, it looks from the AMD document like nontemporal stores to wb
> > memory can go out of order.
>
> Yes, that is how NT stores are defined.
>
> > If this is the case, we can either retain the sfence in smp_wmb(), or
> > noop it, and put explicit sfences around any place that performs
> > nontemporal stores...
>
> We do this already, but in most cases it doesn't matter anyways. We AFAIK
> do not rely on any ordering for copy_*_user for example. There are not
> that many users of nt so it's not a huge issue.

OK, but we just don't want to be making lots of little exceptions. For
bulk copies, I don't see it being a big issue to always sfence around
them (it would be a relatively minor cost).


> > Anyway, the lfence should be able to go away without so much trouble.
>
> You mean sfence? lfence in rmb is definitely needed.

I mean lfence in smp_rmb().


> sfence on x86-64 is not strictly needed, but also shouldn't hurt very much
> so I always kept it in.
>
> -Andi

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Andi Kleen

On Friday 07 September 2007 20:13:12 Nick Piggin wrote:
> On Sunday 09 September 2007 03:48, Nick Piggin wrote:
> 
> > There is some suggestion in the source code that non-temporal stores
> > (movntq) are weakly ordered. But AFAIKS from the documents, it is ordered
> > when operating on wb memory. What's the situation there?
> 
> Sorry, it looks from the AMD document like nontemporal stores to wb
> memory can go out of order.

Yes, that is how NT stores are defined.
 
> If this is the case, we can either retain the sfence in smp_wmb(), or noop
> it, and put explicit sfences around any place that performs nontemporal
> stores...

We do this already, but in most cases it doesn't matter anyways. We AFAIK
do not rely on any ordering for copy_*_user for example. There are not
that many users of nt so it's not a huge issue.

> 
> Anyway, the lfence should be able to go away without so much trouble.

You mean sfence? lfence in rmb is definitely needed.

sfence on x86-64 is not strictly needed, but also shouldn't hurt very much 
so I always kept it in.

-Andi

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Nick Piggin

On Sunday 09 September 2007 03:48, Nick Piggin wrote:

> There is some suggestion in the source code that non-temporal stores
> (movntq) are weakly ordered. But AFAIKS from the documents, it is ordered
> when operating on wb memory. What's the situation there?

Sorry, it looks from the AMD document like nontemporal stores to wb
memory can go out of order. It is a bit hard to decipher what the types
mean.

If this is the case, we can either retain the sfence in smp_wmb(), or noop
it, and put explicit sfences around any place that performs nontemporal
stores...

Anyway, the lfence should be able to go away without so much trouble.
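
The second option (noop the barrier and put explicit sfences around nontemporal stores) might look roughly like this. It is a sketch under stated assumptions: the `sketch_` names are hypothetical, the SSE2 intrinsic `_mm_stream_si32` (movnti) is used in place of the movntq mentioned above, and a plain-store fallback is provided for builds without SSE2.

```c
#include <stddef.h>
#include <stdatomic.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* _mm_stream_si32; pulls in _mm_sfence as well */
#endif

/* Copy n ints, using non-temporal stores where available, then fence so
 * that the copied data is ordered before any store that follows. */
static void sketch_nt_copy(int *dst, const int *src, size_t n)
{
#if defined(__SSE2__)
    for (size_t i = 0; i < n; i++)
        _mm_stream_si32(&dst[i], src[i]);   /* weakly ordered NT store */
    _mm_sfence();                           /* explicit fence after the bulk copy */
#else
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];                    /* ordinary stores as a fallback */
    atomic_thread_fence(memory_order_release);
#endif
}

/* Copy, then publish a flag that readers check before touching dst. */
static void sketch_nt_publish(int *dst, const int *src, size_t n,
                              atomic_int *ready)
{
    sketch_nt_copy(dst, src, n);
    atomic_store_explicit(ready, 1, memory_order_release);
}
```

Keeping the fence inside the copy routine means callers never see the weak ordering, which is what allows the general-purpose write barrier to stay free of sfence.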

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Nick Piggin

On Sunday 09 September 2007 03:34, Nick Piggin wrote:
> On Saturday 08 September 2007 09:20, Linus Torvalds wrote:
> > On Sat, 8 Sep 2007, Nick Piggin wrote:
> > > So, can we finally noop smp_rmb and smp_wmb on x86?
> >
> > Did AMD already release their version? If so, we should probably add a
> > commit that does that in somewhat early 2.6.24 rc, and add the pointers
> > to the whitepapers in the commit message.
>
> http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf
>
> AMD64 Architecture Programmer's Manual Volume 2: System Programming
> section 7.2: Multiprocessor Memory Access Ordering, a paragraph on the
> first page says
>
> "Loads do not pass previous loads (loads are not re-ordered). Stores do
> not pass previous stores (stores are not re-ordered)"
>
> So, yes, it should be easy to do.

There is some suggestion in the source code that non-temporal stores
(movntq) are weakly ordered. But AFAIKS from the documents, it is ordered
when operating on wb memory. What's the situation there?

I've also heard that string operations do not follow the normal ordering, but
that's just with respect to individual loads/stores in the one operation, I
hope? And they will still follow ordering rules WRT surrounding loads and
stores?

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Nick Piggin

On Saturday 08 September 2007 09:20, Linus Torvalds wrote:
> On Sat, 8 Sep 2007, Nick Piggin wrote:
> > So, can we finally noop smp_rmb and smp_wmb on x86?
>
> Did AMD already release their version? If so, we should probably add a
> commit that does that in somewhat early 2.6.24 rc, and add the pointers to
> the whitepapers in the commit message.

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf

AMD64 Architecture Programmer's Manual Volume 2: System Programming
section 7.2: Multiprocessor Memory Access Ordering, a paragraph on the
first page says

"Loads do not pass previous loads (loads are not re-ordered). Stores do
not pass previous stores (stores are not re-ordered)"

So, yes, it should be easy to do.

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Nick Piggin

On Saturday 08 September 2007 09:20, Linus Torvalds wrote:
 On Sat, 8 Sep 2007, Nick Piggin wrote:
  So, can we finally noop smp_rmb and smp_wmb on x86?

 Did AMD already release their version? If so, we should probably add a
 commit that does that in somewhat early 2.6.24 rc, and add the pointers to
 the whitepapers in the commit message.

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf

AMD64 Architecture Programmer's Manual Volume 2: System Programming
section 7.2: Multiprocessor Memory Access Ordering, a paragraph on the
first page says

Loads do not pass previous loads (loads are not re-ordered). Stores do
not pass previous stores (stores are not re-ordered)

So, yes, it should be easy to do.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Nick Piggin

On Sunday 09 September 2007 03:34, Nick Piggin wrote:
 On Saturday 08 September 2007 09:20, Linus Torvalds wrote:
  On Sat, 8 Sep 2007, Nick Piggin wrote:
   So, can we finally noop smp_rmb and smp_wmb on x86?
 
  Did AMD already release their version? If so, we should probably add a
  commit that does that in somewhat early 2.6.24 rc, and add the pointers
  to the whitepapers in the commit message.

 http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/245
93.pdf

 AMD64 Architecture Programmer's Manual Volume 2: System Programming
 section 7.2: Multiprocessor Memory Access Ordering, a paragraph on the
 first page says

 Loads do not pass previous loads (loads are not re-ordered). Stores do
 not pass previous stores (stores are not re-ordered)

 So, yes, it should be easy to do.

There is some suggestion in the source code that non-temporal stores
(movntq) are weakly ordered. But AFAIKS from the documents, it is ordered
when operating on wb memory. What's the situation there?

I've also heard that string operations do not follow the normal ordering, but
that's just with respect to individual loads/stores in the one operation, I
hope? And they will still follow ordering rules WRT surrounding loads and
stores?

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Nick Piggin

On Sunday 09 September 2007 03:48, Nick Piggin wrote:

> There is some suggestion in the source code that non-temporal stores
> (movntq) are weakly ordered. But AFAIKS from the documents, it is ordered
> when operating on wb memory. What's the situation there?

Sorry, it looks from the AMD document like nontemporal stores to wb
memory can go out of order. It is a bit hard to decipher what the types
mean.

If this is the case, we can either retain the sfence in smp_wmb(), or noop
it, and put explicit sfences around any place that performs nontemporal
stores...

Anyway, the lfence should be able to go away without so much trouble.
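
A minimal user-space sketch of the second option above: keep the sfence next to the non-temporal stores instead of inside smp_wmb(). This is not kernel code. It assumes GCC or Clang with SSE2 intrinsics and uses _mm_stream_si32 (movnti) in place of the movntq mentioned above. The names nt_fill and fill_and_publish are hypothetical, and the plain-store fallback exists only so the sketch builds on targets without SSE2.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_stream_si32(), _mm_sfence() */
#endif

/* Hypothetical helper: fill dst[0..n) with value, using non-temporal
 * stores where SSE2 is available and plain stores otherwise. */
static void nt_fill(int *dst, size_t n, int value)
{
    assert(dst != NULL);
    for (size_t i = 0; i < n; i++) {
#if defined(__SSE2__)
        _mm_stream_si32(&dst[i], value);  /* movnti: weakly ordered store */
#else
        dst[i] = value;
#endif
    }
#if defined(__SSE2__)
    /* Explicit fence at the NT-store site, so callers do not depend on
     * smp_wmb() containing an sfence. */
    _mm_sfence();
#endif
}

/* Hypothetical publisher: the flag store must not become visible
 * before the buffer contents. */
static void fill_and_publish(int *dst, size_t n, int value,
                             volatile int *ready)
{
    nt_fill(dst, n, value);
    *ready = 1;
}
```

With the fence kept inside the helper, the rest of the code can treat stores as ordered, which is the property a no-op smp_wmb() would rely on.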

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Andi Kleen

On Friday 07 September 2007 20:13:12 Nick Piggin wrote:
> On Sunday 09 September 2007 03:48, Nick Piggin wrote:
> >
> > There is some suggestion in the source code that non-temporal stores
> > (movntq) are weakly ordered. But AFAIKS from the documents, it is ordered
> > when operating on wb memory. What's the situation there?
>
> Sorry, it looks from the AMD document like nontemporal stores to wb
> memory can go out of order.

Yes, that is how NT stores are defined.

> If this is the case, we can either retain the sfence in smp_wmb(), or noop
> it, and put explicit sfences around any place that performs nontemporal
> stores...

We do this already, but in most cases it doesn't matter anyways. We AFAIK
do not rely on any ordering for copy_*_user for example. There are not
that many users of nt so it's not a huge issue.

> Anyway, the lfence should be able to go away without so much trouble.

You mean sfence? lfence in rmb is definitely needed.

sfence on x86-64 is not strictly needed, but also shouldn't hurt very much 
so I always kept it in.

-Andi

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Nick Piggin

On Saturday 08 September 2007 18:53, Andi Kleen wrote:
> On Friday 07 September 2007 20:13:12 Nick Piggin wrote:
> > On Sunday 09 September 2007 03:48, Nick Piggin wrote:
> > > There is some suggestion in the source code that non-temporal stores
> > > (movntq) are weakly ordered. But AFAIKS from the documents, it is
> > > ordered when operating on wb memory. What's the situation there?
> >
> > Sorry, it looks from the AMD document like nontemporal stores to wb
> > memory can go out of order.
>
> Yes, that is how NT stores are defined.
>
> > If this is the case, we can either retain the sfence in smp_wmb(), or
> > noop it, and put explicit sfences around any place that performs
> > nontemporal stores...
>
> We do this already, but in most cases it doesn't matter anyways. We AFAIK
> do not rely on any ordering for copy_*_user for example. There are not
> that many users of nt so it's not a huge issue.

OK, but we just don't want to be making lots of little exceptions. For
bulk copies, I don't see it being a big issue to always sfence around
them (it would be a relatively minor cost).

> > Anyway, the lfence should be able to go away without so much trouble.
>
> You mean sfence? lfence in rmb is definitely needed.

I mean lfence in smp_rmb().

> sfence on x86-64 is not strictly needed, but also shouldn't hurt very much
> so I always kept it in.
>
> -Andi

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Andi Kleen

On Friday 07 September 2007 21:57:35 Nick Piggin wrote:

 
> > > Anyway, the lfence should be able to go away without so much trouble.
> >
> > You mean sfence? lfence in rmb is definitely needed.
>
> I mean lfence in smp_rmb().

One point of rmb is to stop speculative loads and I don't think we 
can get that without lfence.

-Andi


Re: Intel Memory Ordering White Paper

2007-09-08 Thread Alan Cox

On Fri, 7 Sep 2007 15:26:50 -0700
Jesse Barnes [EMAIL PROTECTED] wrote:

> FYI, we just released a new white paper describing memory ordering for
> Intel processors:
> http://developer.intel.com/products/processor/manuals/index.htm
>
> Should help answer some questions about some of the ordering primitives
> we use on i386 and x86_64.

Nice - but it appears to be 64bit only - and indeed it appears to be
untrue for real 32bit because of the Pentium Pro fencing errata.

The kernel also runs on IDT Winchip, Cyrix and AMD processors not all of
which have exactly the same behaviour (the IDT Winchip as we run it
profoundly differs)

Alan

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Alan Cox

On Sat, 8 Sep 2007 18:54:57 +1000
Nick Piggin [EMAIL PROTECTED] wrote:

> On Saturday 08 September 2007 08:26, Jesse Barnes wrote:
> > FYI, we just released a new white paper describing memory ordering for
> > Intel processors:
> > http://developer.intel.com/products/processor/manuals/index.htm
> >
> > Should help answer some questions about some of the ordering primitives
> > we use on i386 and x86_64.
>
> So, can we finally noop smp_rmb and smp_wmb on x86?

Nakked-by: Alan Cox [EMAIL PROTECTED]

You can only no-op it on 64bit Intel processors. On 32bit it needs to be
conditional on whether your processor family (or back compat for it) as
the Pentium Pro has some serious store ordering errata (hence the way it
needs lock decb for spin_unlock)

Alan

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Nick Piggin

On Saturday 08 September 2007 20:19, Andi Kleen wrote:
> On Friday 07 September 2007 21:57:35 Nick Piggin wrote:
> > > > Anyway, the lfence should be able to go away without so much trouble.
> > >
> > > You mean sfence? lfence in rmb is definitely needed.
> >
> > I mean lfence in smp_rmb().
>
> One point of rmb is to stop speculative loads and I don't think we
> can get that without lfence.

smp_rmb() should not need to do anything because loads are done
in order anyway. Both AMD and Intel have committed to this now.

The important point is that they *appear* to be done in order. AFAIK,
the CPUs can still do speculative and out of order loads, but throw
out the results if they could be wrong.
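
A minimal sketch of what "smp_rmb() does not need to do anything" would look like in practice, assuming x86 with ordinary write-back memory and no non-temporal or string stores. The barrier() macro here is a user-space stand-in written with GCC-style inline asm. The sketch_ prefixed macros and the mailbox names are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Compiler-only barrier: emits no instruction, but stops the compiler
 * from moving memory accesses across it. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Assumption: the CPU already keeps loads ordered with loads and stores
 * ordered with stores, so only the compiler has to be restrained. */
#define sketch_smp_rmb() barrier()
#define sketch_smp_wmb() barrier()

struct mailbox {
    int payload;
    volatile int ready;
};

/* Writer side: payload store first, then the flag store. */
static void mailbox_post(struct mailbox *m, int value)
{
    assert(m != NULL);
    m->payload = value;
    sketch_smp_wmb();
    m->ready = 1;
}

/* Reader side: flag load first, then the payload load.
 * Returns 1 and fills *out if a message was available, else 0. */
static int mailbox_poll(struct mailbox *m, int *out)
{
    assert(m != NULL && out != NULL);
    if (!m->ready)
        return 0;
    sketch_smp_rmb();
    *out = m->payload;
    return 1;
}
```

The hardware may still execute the payload load speculatively ahead of the flag load. The claim above is only that the result is discarded if it could have been observed out of order.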

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Nick Piggin

On Saturday 08 September 2007 20:30, Alan Cox wrote:
> On Sat, 8 Sep 2007 18:54:57 +1000
> Nick Piggin [EMAIL PROTECTED] wrote:
> > On Saturday 08 September 2007 08:26, Jesse Barnes wrote:
> > > FYI, we just released a new white paper describing memory ordering for
> > > Intel processors:
> > > http://developer.intel.com/products/processor/manuals/index.htm
> > >
> > > Should help answer some questions about some of the ordering primitives
> > > we use on i386 and x86_64.
> >
> > So, can we finally noop smp_rmb and smp_wmb on x86?
>
> Nakked-by: Alan Cox [EMAIL PROTECTED]
>
> You can only no-op it on 64bit Intel processors. On 32bit it needs to be
> conditional on whether your processor family (or back compat for it) as
> the Pentium Pro has some serious store ordering errata (hence the way it
> needs lock decb for spin_unlock)

We already noop smp_wmb on i386 even when CONFIG_X86_PPRO_FENCE.

I'm not sure if either errata can be solved completely by adding lock ops
in barrier instructions anyway: they both seem to involve situations where
there is just a single problematic cacheline in question.

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Nick Piggin

On Saturday 08 September 2007 20:29, Alan Cox wrote:
> On Fri, 7 Sep 2007 15:26:50 -0700
> Jesse Barnes [EMAIL PROTECTED] wrote:
> > FYI, we just released a new white paper describing memory ordering for
> > Intel processors:
> > http://developer.intel.com/products/processor/manuals/index.htm
> >
> > Should help answer some questions about some of the ordering primitives
> > we use on i386 and x86_64.
>
> Nice - but it appears to be 64bit only - and indeed it appears to be
> untrue for real 32bit because of the Pentium Pro fencing errata.

As I said, we're not doing anything special in barriers for the ppro errata
today anyway.

> The kernel also runs on IDT Winchip, Cyrix and AMD processors not all of
> which have exactly the same behaviour (the IDT Winchip as we run it
> profoundly differs)

AMD processors guarantee loads are ordered and stores are ordered
(with exceptions of non-temporal, and non-wb policy).

As for the others that do out of order stores, are any of them SMP?

Re: Intel Memory Ordering White Paper

2007-09-08 Thread dean gaudet

On Sun, 9 Sep 2007, Nick Piggin wrote:

> I've also heard that string operations do not follow the normal ordering, but
> that's just with respect to individual loads/stores in the one operation, I
> hope? And they will still follow ordering rules WRT surrounding loads and
> stores?

see section 7.2.3 of intel volume 3A...

Code dependent upon sequential store ordering should not use the string 
operations for the entire data structure to be stored. Data and semaphores 
should be separated. Order dependent code should use a discrete semaphore 
uniquely stored to after any string operations to allow correctly ordered 
data to be seen by all processors.

i think we need sfence after things like copy_page, clear_page, and 
possibly copy_user... at least on intel processors with fast strings 
option enabled.

-dean

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Petr Vandrovec


dean gaudet wrote:

> On Sun, 9 Sep 2007, Nick Piggin wrote:
>
> > I've also heard that string operations do not follow the normal ordering, but
> > that's just with respect to individual loads/stores in the one operation, I
> > hope? And they will still follow ordering rules WRT surrounding loads and
> > stores?
>
> see section 7.2.3 of intel volume 3A...
>
> Code dependent upon sequential store ordering should not use the string
> operations for the entire data structure to be stored. Data and semaphores
> should be separated. Order dependent code should use a discrete semaphore
> uniquely stored to after any string operations to allow correctly ordered
> data to be seen by all processors.
>
> i think we need sfence after things like copy_page, clear_page, and
> possibly copy_user... at least on intel processors with fast strings
> option enabled.

I do not think.  I believe that authors are trying to say that

struct { uint8 lock; uint8 data; } x;

lea (x.data),%edi
mov $2,%ecx
std
rep movsb

to set both data and lock does not guarantee that x.lock will be set
after x.data and that you should do

lea (x.data),%edi
std
movsb
movsb  # or mov (%esi),%al; mov %al,(%edi), but movsb looks discrete enough to me


instead (and yes, I know that my example is silly).
Petr


Re: Intel Memory Ordering White Paper

2007-09-08 Thread dean gaudet

On Sat, 8 Sep 2007, Petr Vandrovec wrote:

> dean gaudet wrote:
> > On Sun, 9 Sep 2007, Nick Piggin wrote:
> >
> > > I've also heard that string operations do not follow the normal ordering,
> > > but that's just with respect to individual loads/stores in the one
> > > operation, I hope? And they will still follow ordering rules WRT
> > > surrounding loads and stores?
> >
> > see section 7.2.3 of intel volume 3A...
> >
> > Code dependent upon sequential store ordering should not use the string
> > operations for the entire data structure to be stored. Data and semaphores
> > should be separated. Order dependent code should use a discrete semaphore
> > uniquely stored to after any string operations to allow correctly ordered
> > data to be seen by all processors.
> >
> > i think we need sfence after things like copy_page, clear_page, and possibly
> > copy_user... at least on intel processors with fast strings option enabled.
>
> I do not think.  I believe that authors are trying to say that
>
> struct { uint8 lock; uint8 data; } x;
>
> lea (x.data),%edi
> mov $2,%ecx
> std
> rep movsb
>
> to set both data and lock does not guarantee that x.lock will be set after
> x.data and that you should do
>
> lea (x.data),%edi
> std
> movsb
> movsb  # or mov (%esi),%al; mov %al,(%edi), but movsb looks discrete enough to me
>
> instead (and yes, I know that my example is silly).

no it's worse than that -- intel fast string stores can become globally 
visible in any order at all w.r.t. normal loads or stores... so take all 
those great examples in their recent whitepaper and throw out all the 
ordering guarantees for addresses on different cachelines if any of the 
stores are rep string.

for example transitive store ordering for locations on multiple cachelines 
is not guaranteed at all.  the kernel could return a zero page and one 
core could see the zeroes out of order with another core performing some 
sort of lockless data structure operation.

fast strings don't break ordering from the point of view of the core 
performing the rep string operation, but externally there are no 
guarantees (it's right there in the docs).

-dean
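
A minimal user-space sketch of the pattern the quoted manual text recommends: do the bulk store with a string operation, then write a discrete flag as a separate store. It assumes GCC-style inline asm on x86 and falls back to memset() elsewhere. The names bulk_zero, zero_page and zero_and_publish are hypothetical. The sfence marks where the fence suggested earlier in the thread would go. Whether it is strictly required is the open question here, so treat it as illustrative, not authoritative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define SKETCH_X86 1
#else
#define SKETCH_X86 0
#endif

/* Hypothetical helper: zero n bytes with a rep string store on x86. */
static void bulk_zero(void *dst, size_t n)
{
#if SKETCH_X86
    /* rep stosb: the individual byte stores of one string operation
     * may become visible to other processors out of order. */
    __asm__ __volatile__("rep stosb"
                         : "+D"(dst), "+c"(n)
                         : "a"(0)
                         : "memory");
#else
    memset(dst, 0, n);
#endif
}

struct zero_page {
    unsigned char data[256];
    volatile int ready;  /* discrete flag, never written by the string op */
};

/* Hypothetical publisher: string store, optional fence, then the flag. */
static void zero_and_publish(struct zero_page *p)
{
    assert(p != NULL);
    p->ready = 0;
    bulk_zero(p->data, sizeof p->data);
#if SKETCH_X86
    __asm__ __volatile__("sfence" ::: "memory");  /* the debated fence */
#endif
    p->ready = 1;  /* separate store, after the string operation */
}
```

The point of the layout is that the flag is not part of the region written by the string operation, so a reader that sees ready == 1 is checking a store that was issued on its own.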

Re: Intel Memory Ordering White Paper

2007-09-08 Thread Alan Cox

> AMD processors guarantee loads are ordered and stores are ordered
> (with exceptions of non-temporal, and non-wb policy).
>
> As for the others that do out of order stores, are any of them SMP?

IDT winchip isn't, Geode isn't

Re: Intel Memory Ordering White Paper

2007-09-08 Thread H. Peter Anvin


Nick Piggin wrote:

> smp_rmb() should not need to do anything because loads are done
> in order anyway. Both AMD and Intel have committed to this now.
>
> The important point is that they *appear* to be done in order. AFAIK,
> the CPUs can still do speculative and out of order loads, but throw
> out the results if they could be wrong.


Is there anything even semiofficial from VIA?  Not that the x86 
architecture isn't pretty much definable as the AMD-Intel consensus...


-hpa


Re: Intel Memory Ordering White Paper

2007-09-07 Thread Linus Torvalds

On Sat, 8 Sep 2007, Nick Piggin wrote:
> 
> So, can we finally noop smp_rmb and smp_wmb on x86?

Did AMD already release their version? If so, we should probably add a 
commit that does that in somewhat early 2.6.24 rc, and add the pointers to 
the whitepapers in the commit message.

Linus

Re: Intel Memory Ordering White Paper

2007-09-07 Thread Nick Piggin

On Saturday 08 September 2007 08:26, Jesse Barnes wrote:
> FYI, we just released a new white paper describing memory ordering for
> Intel processors:
> http://developer.intel.com/products/processor/manuals/index.htm
>
> Should help answer some questions about some of the ordering primitives
> we use on i386 and x86_64.

So, can we finally noop smp_rmb and smp_wmb on x86?
Index: linux-2.6/include/asm-i386/system.h
===
--- linux-2.6.orig/include/asm-i386/system.h
+++ linux-2.6/include/asm-i386/system.h
@@ -286,7 +286,7 @@ static inline unsigned long get_limit(un
 
 #ifdef CONFIG_SMP
 #define smp_mb()	mb()
-#define smp_rmb()	rmb()
+#define smp_rmb()	barrier()
 #define smp_wmb()	wmb()
 #define smp_read_barrier_depends()	read_barrier_depends()
 #define set_mb(var, value) do { (void) xchg(&var, value); } while (0)
Index: linux-2.6/include/asm-x86_64/system.h
===
--- linux-2.6.orig/include/asm-x86_64/system.h
+++ linux-2.6/include/asm-x86_64/system.h
@@ -141,8 +141,8 @@ static inline void write_cr8(unsigned lo
 
 #ifdef CONFIG_SMP
 #define smp_mb()	mb()
-#define smp_rmb()	rmb()
-#define smp_wmb()	wmb()
+#define smp_rmb()	barrier()
+#define smp_wmb()	barrier()
 #define smp_read_barrier_depends()	do {} while(0)
 #else
 #define smp_mb()	barrier()

