The approach exploiting "locked" CMPXCHG8B is ready.
The patches are attached to HARMONY-2092.

This solution is about 1.6 times slower than JRockit in a special microbenchmark that makes heavy use of volatile variables of type long.


George Timoshenko wrote:

I had a question in the JIRA about this issue: why don't we use the "lock"
prefix for atomic access?

well...

Originally we split every 64-bit memory access into two 32-bit accesses.
It makes no sense to put the #LOCK prefix on them, because there is a gap between the two.
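
To illustrate the gap, here is a minimal single-threaded sketch in C. It only models the problem and is not the code the JIT emits. The names split64, store_lo, store_hi, load_both and torn_value are made up for this example. It shows the value a reader would see if it ran exactly between the two 32-bit stores.

```c
#include <stdint.h>

/* Hypothetical model: a 64-bit slot kept as the two 32-bit halves
   that a 32-bit code generator would access separately. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} split64;

/* First half of a split 64-bit store: low 32 bits only. */
void store_lo(split64 *s, uint64_t v) { s->lo = (uint32_t)v; }

/* Second half of a split 64-bit store: high 32 bits only. */
void store_hi(split64 *s, uint64_t v) { s->hi = (uint32_t)(v >> 32); }

/* A reader combining both halves into one 64-bit value. */
uint64_t load_both(const split64 *s) {
    return ((uint64_t)s->hi << 32) | s->lo;
}

/* Returns what a reader observes if it runs in the gap between the
   two half-stores of new_v over a slot that held old_v. */
uint64_t torn_value(uint64_t old_v, uint64_t new_v) {
    split64 s;
    store_lo(&s, old_v);
    store_hi(&s, old_v);   /* slot now holds old_v completely */
    store_lo(&s, new_v);   /* first 32-bit store of the update */
    /* the gap: the high half still belongs to old_v */
    return load_both(&s);
}
```

With old_v = 0x00000000FFFFFFFF and new_v = 0x0000000100000000 the reader gets 0, which is neither the old nor the new value. A #LOCK prefix on each 32-bit store would not change that, since each half is already written atomically on its own.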

We can only apply #LOCK to an instruction that reads or writes the whole 64 bits.

The bad news is that the only such instruction that accepts #LOCK (according to the IA-32 spec) is CMPXCHG8B. MOVQ, MOVSD and the like cannot be used with #LOCK.

This monster (CMPXCHG8B) requires 4 registers:

EAX
EBX
ECX
EDX

and EFLAGS as well.
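
For reference, the instruction compares EDX:EAX with the 64-bit memory operand. If they are equal it stores ECX:EBX there and sets ZF, otherwise it loads the current memory value into EDX:EAX and clears ZF. Below is a rough sketch in C of how an atomic 64-bit load and store can be built on top of a 64-bit compare-and-swap. It is an illustration and not the code from the patch. The function names load64_atomic and store64_atomic are made up, and it assumes GCC or Clang, whose __sync_val_compare_and_swap builtin should compile to LOCK CMPXCHG8B for a 64-bit operand on 32-bit x86.

```c
#include <stdint.h>

/* Atomic 64-bit load built on compare-and-swap (hypothetical helper).
   Expected and new value are both 0: if the slot holds 0, then 0 is
   written back (no visible change); otherwise the compare fails.
   In both cases the builtin returns the current 64-bit contents. */
uint64_t load64_atomic(volatile uint64_t *p) {
    return __sync_val_compare_and_swap(p, (uint64_t)0, (uint64_t)0);
}

/* Atomic 64-bit store built on compare-and-swap (hypothetical helper).
   Retries until the value we expect to replace matches the slot. */
void store64_atomic(volatile uint64_t *p, uint64_t v) {
    uint64_t seen = load64_atomic(p);
    for (;;) {
        uint64_t prev = __sync_val_compare_and_swap(p, seen, v);
        if (prev == seen) {
            return;        /* swap succeeded, v is now in the slot */
        }
        seen = prev;       /* someone else changed it, retry */
    }
}
```

Note that even a plain load turns into a locked read-modify-write here, which is one reason such accesses are more expensive than ordinary moves.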

I am not sure that using CMPXCHG8B will be faster than artificially making volatile fields always synchronized.




