Re: Volatile vs Non-Volatile Spin Locks on SMP.

2005-07-18 Thread Joe Seigh

Joe Seigh wrote:

For synchronization you need memory barriers in most cases and the only
way to get these is using assembler since there are no C or gcc intrinsics
for these yet.  For inline assembler, the convention seems to be to use
the volatile attribute, which I take as meaning no code movement across
the inline assembler code.  If it doesn't mean that then a lot of stuff
is broken AFAICT.



Usenet rule #1.  If you don't find something in the documentation, you
will find it after you post about it.  Volatile does seem to be documented
somewhat in the gcc docs
http://gcc.gnu.org/onlinedocs/gcc-4.0.1/gcc/Extended-Asm.html#Extended-Asm

I was using "memory" in the clobber list as the main thing to keep optimization
from occurring across inline asm.  This seems to say that you also need to say
"volatile" to tell the compiler that you really mean it.

"If your assembler instructions access memory in an unpredictable fashion, add 
`memory' to the list of clobbered registers. This will cause GCC to not keep memory 
values cached in registers across the assembler instruction and not optimize stores or 
loads to that memory. You will also want to add the volatile keyword if the memory 
affected is not listed in the inputs or outputs of the asm, as the `memory' clobber does 
not count as a side-effect of the asm. If you know how large the accessed memory is, you 
can add it as input or output but if this is not known, you should add `memory'."
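
As a rough illustration of the clobber described in that passage, here is a
minimal sketch of a compiler-only barrier built from an empty volatile asm with
"memory" in the clobber list. The names compiler_barrier, publish, consume,
data and ready are made up for this example and do not come from the thread or
from the kernel. The asm emits no instruction at all, so it only constrains
what the compiler may cache or move; it is not a CPU memory fence.

```c
/* Empty volatile asm with a "memory" clobber: GCC must not keep memory
 * values cached in registers across it, and must not move loads or
 * stores across it.  No machine instruction is generated. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

static int data;   /* hypothetical payload */
static int ready;  /* hypothetical "payload is valid" flag */

/* Store the payload, then the flag, in that order as far as the
 * compiler is concerned. */
void publish(int value)
{
    data = value;
    compiler_barrier();
    ready = 1;
}

/* Returns 1 and fills *out if the flag is set, otherwise returns 0. */
int consume(int *out)
{
    if (!ready)
        return 0;
    compiler_barrier();
    *out = data;
    return 1;
}
```

On a weakly ordered processor a hardware barrier instruction would still be
needed in addition to this.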

Also this needs to be looked at, i.e. does "sequence" mean in program order or
with no interleaved C statements?

"Similarly, you can't expect a sequence of volatile asm instructions to remain 
perfectly consecutive. If you want consecutive output, use a single asm. Also, GCC will 
perform some optimizations across a volatile asm instruction; GCC does not “forget 
everything” when it encounters a volatile asm instruction the way some other compilers 
do."

One of the problems with volatile in C was that the compiler could move code
around the volatile accesses and even accesses to other volatile variables.
This was a problem that Java had and which they fixed with JSR-133 so you could
actually do useful things with volatile in Java.  It's just worse in C since C
has nowhere near as useful or clear definitions to work with.  The only
reason you can get away with something like

  do {
    while (lock != 0);
  } while (!testandset());  // interlocked test and set

is that the correctness of the code isn't affected by how the compiler treats
the test for lock != 0 as long as it terminates in a finite amount of time.  Or
by the fact that that's not the best way to do a spin wait on hyperthreaded
Intel processors.  Intel recommends you use a PAUSE instruction in the wait loop.
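
For what it's worth, the loop above with a PAUSE in the wait could be sketched
roughly as follows. This is only an illustration: demo_spinlock_t,
cpu_relax_hint and the demo_spin_* functions are hypothetical names, and the
__sync_lock_test_and_set / __sync_lock_release builtins used for the
interlocked part come from later GCC releases than the 4.0.1 docs linked
above, so they are an assumption here rather than what anyone in this thread
used.

```c
/* Hypothetical lock type; deliberately NOT declared volatile. */
typedef struct { int locked; } demo_spinlock_t;

/* Spin-wait hint.  On x86 this is the PAUSE instruction; elsewhere it
 * falls back to an empty asm.  Either way the "memory" clobber makes
 * the compiler re-read l->locked on every loop iteration. */
static inline void cpu_relax_hint(void)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("pause" ::: "memory");
#else
    __asm__ __volatile__("" ::: "memory");
#endif
}

/* One interlocked attempt: returns 1 if the lock was acquired. */
static inline int demo_spin_trylock(demo_spinlock_t *l)
{
    /* Atomically writes 1 and returns the previous value. */
    return __sync_lock_test_and_set(&l->locked, 1) == 0;
}

static inline void demo_spin_lock(demo_spinlock_t *l)
{
    do {
        /* Plain reads while the lock looks busy, with the hint. */
        while (l->locked != 0)
            cpu_relax_hint();
    } while (!demo_spin_trylock(l));
}

static inline void demo_spin_unlock(demo_spinlock_t *l)
{
    /* Writes 0 with release semantics. */
    __sync_lock_release(&l->locked);
}
```

Note that the lock word needs no volatile qualifier here; the clobber in the
hint is what keeps the inner test from being hoisted out of the loop.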

Anyway it looks like I'll have to do a little more augury on the gcc docs. :)


--
Joe Seigh

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Volatile vs Non-Volatile Spin Locks on SMP.

2005-07-17 Thread Joe Seigh

[EMAIL PROTECTED] wrote:

Hello,

By using the volatile keyword for the spin lock defined in spinlock_t, it
seems Linux chooses to always reload the value of spin locks from cache
instead of using the content from registers. This may be helpful for
synchronization between multiple threads on a single CPU.

I use two Xeon CPUs with HyperThreading disabled on both. I think the MESI
protocol will enforce cache coherency and update the spin lock value
automatically between these two CPUs. So maybe we don't need to use
volatile any more, right?

Based on this, I rebuilt the Intel e1000 Gigabit network card driver with
volatile removed, but I didn't notice any performance improvement.

Any idea about this?



Volatile is meaningless as far as threading is concerned.  Technically, its
meaning is implementation defined and since for Linux we're talking about
gcc, I suppose someone could claim it has some meaning although most of us
will have no way of verifying those claims.  You might see some usage
of volatile in the Linux kernel which makes it appear as though it
has some meaning but you might want to be careful in depending on that
since there's no way of knowing if your interpretation of the meaning
is the same as what the authors of that code have in mind.

For synchronization you need memory barriers in most cases and the only
way to get these is using assembler since there are no C or gcc intrinsics
for these yet.  For inline assembler, the convention seems to be to use
the volatile attribute, which I take as meaning no code movement across
the inline assembler code.  If it doesn't mean that then a lot of stuff
is broken AFAICT.

Assuming you're doing this in assembler, using volatile on the C declaration
will have no effect on performance in this case.  You're seeing the most
"recent" value due to the cache implementation.

--
Joe Seigh


Re: Volatile vs Non-Volatile Spin Locks on SMP.

2005-07-16 Thread Robert Hancock

[EMAIL PROTECTED] wrote:

Hello,

By using the volatile keyword for the spin lock defined in spinlock_t, it
seems Linux chooses to always reload the value of spin locks from cache
instead of using the content from registers. This may be helpful for
synchronization between multiple threads on a single CPU.

I use two Xeon CPUs with HyperThreading disabled on both. I think the MESI
protocol will enforce cache coherency and update the spin lock value
automatically between these two CPUs. So maybe we don't need to use
volatile any more, right?


The value must always be loaded from memory. If the value is cached in a 
register it will not update when another CPU changes it.
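
One common way to force the load from memory described here, without marking
the variable itself volatile, is to read it through a volatile-qualified
pointer at the point of use. The following is a minimal sketch under that
assumption; READ_FRESH and poll_flag are hypothetical names invented for this
example, and __typeof__ is a GCC extension.

```c
/* Read x through a volatile-qualified lvalue so the compiler has to
 * perform an actual load each time instead of reusing a register copy. */
#define READ_FRESH(x) (*(volatile __typeof__(x) *)&(x))

/* Polls *flag up to max_spins times.
 * Returns 1 if a nonzero value was observed, 0 otherwise. */
int poll_flag(int *flag, unsigned long max_spins)
{
    unsigned long i;

    for (i = 0; i < max_spins; i++) {
        if (READ_FRESH(*flag) != 0)
            return 1;
    }
    return 0;
}
```

This only addresses the register-caching issue; it says nothing about
ordering against other memory accesses.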


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Volatile vs Non-Volatile Spin Locks on SMP.

2005-07-15 Thread multisyncfe991

Hello,

By using the volatile keyword for the spin lock defined in spinlock_t, it
seems Linux chooses to always reload the value of spin locks from cache
instead of using the content from registers. This may be helpful for
synchronization between multiple threads on a single CPU.

I use two Xeon CPUs with HyperThreading disabled on both. I think the MESI
protocol will enforce cache coherency and update the spin lock value
automatically between these two CPUs. So maybe we don't need to use
volatile any more, right?

Based on this, I rebuilt the Intel e1000 Gigabit network card driver with
volatile removed, but I didn't notice any performance improvement.

Any idea about this?

Thanks a lot,

Liang


