Re: sun.misc.Unsafe.getAndAddInt implementation, question.

2017-10-16 Thread Gil Tene
The bytecode doesn't matter. It's not the javac compiler that will be doing 
the optimizations you should be worried about. It's the JIT compilers in 
the JVM. The javac-generated bytecode is only executed by the interpreter. 
The bytecode is eventually transformed to machine code by the JIT compiler, 
during which it will undergo aggressive optimization.

The "CPU" you should worry about and model in your mind is not x86, SPARC, 
or ARM. It's the JVM's execution engine the JIT-generated machine code that 
does most of the actual execution, and that "CPU" will reorder the code 
more aggressively than any HW CPU ever would. The JIT's optimizing 
transformations include arbitrary and massive re-ordereing, reshaping, 
folding-together, and completely eliminating big parts off your apparent 
bytecode instructions. And the JIT will do all those as long as it can 
prove that the transformations are allowed.

On Monday, October 16, 2017 at 3:30:13 PM UTC+1, John Hening wrote:
>
> Thanks Gil Tene! 
>
> You obviously right. The read is not volatile so the compiler is allowed 
> to reorder it. Moreover, the read is not volatile, so the compiler assumes 
> that noone changes the source of getInt(). So, it can hoist out curr.
>
>
> The last question (sorry for being inquisitive)  is:
>
> Let's assume that the compiler generated bytecode equivalent to 
>
> public final int getAndAddInt(Object ptr, long offset, int value) {
>  int curr;
>  do {
> curr = this.getInt(ptr, offset); (1)
>  } while(!this.compareAndSwapInt(ptr, offset, curr, curr + value)); (2)
>  return curr;
> }
>
>
>
> so it wasn't optimized. 
>
> Now, it seems to work correctly. But, note that CPU can make some 
> reordering. We are lucky because the CPU cannot reorder here: there is a 
> data dependency: (1) -> (2). So, on every sensible (with data dependency) 
> CPU it works, yes?
>
> W dniu poniedziałek, 16 października 2017 12:13:07 UTC+2 użytkownik Gil 
> Tene napisał:
>>
>> Ok. So the question below (ignoring other optimizations in the JVM that 
>> are specific to this method) is "If I were doing this myself in some other 
>> method, would this logic be valid if Unsafe.getIntVolatile() could be be 
>> replaced with Unsafe.getInt()?"
>>
>> The answer IMO is "no".
>>
>> The issue here is that unlike e.g. AtomicInteger.compareAndSet(), which 
>> is explicitly specified to include the behavior of a volatile read on the 
>> field involved, Unsafe.compareAndSwapInt() does not make any claims about 
>> exhibiting volatile read semantics. As a result, if you replace 
>> Unsafe.getIntVolatile() with Unsafe.getInt(), the resulting code:
>>
>> public final int getAndAddInt(Object ptr, long offset, int value) {
>>  int curr;
>>  do {
>> curr = this.getInt(ptr, offset); (1)
>>  } while(!this.compareAndSwapInt(ptr, offset, curr, curr + value)); (2)
>>  return curr;
>> }
>>
>> Can be validly transformed by the optimizer to:
>>
>> public final int getAndAddInt(Object ptr, long offset, int value) {
>>  int curr = this.getInt(ptr, offset); (1)
>>  do {  
>>  } while(!this.compareAndSwapInt(ptr, offset, curr, curr + value)); (2)
>>  return curr;
>> }
>>
>> Because:
>>
>> (a) The optimizer can prove that if the compareAndSwapInt ever actually 
>> wrote to the field, the method would return and curr wouldn't be read again.
>> (b) Since the read of curr is not volatile, and the read in 
>> Unsafe.compareAndSwapInt() is not required to act like a volatile read, all 
>> the reads of curr can be reordered with the all the reads in the 
>> compareAndSwapInt() calls, which means that they can be folded together and 
>> hoisted out of the loop.
>>
>> If this valid optimization happened, the resulting code would get stuck 
>> in an infinite loop if another thread modified the field between the read 
>> of curr and the compareAndSwapInt call, and that is obviously not the 
>> intended behavior of getAndAddInt()...
>>
>> On Sunday, October 15, 2017 at 2:12:30 AM UTC-7, John Hening wrote:
>>>
>>> Gil Tene, thanks you very much. 
>>>
>>> Ok, so does it mean that Unsafe.getIntVolatile() could be be replaced 
>>> with Unsafe.getInt()?
>>>
>>> W dniu niedziela, 15 października 2017 01:34:34 UTC+2 użytkownik Gil 
>>> Tene napisał:

 A simple answer would be that the field is treated by the method as a 
 volatile, and the code is simply staying consistent with that notion. Is 
 an 
 optimization possible here? Possibly. Probably. But does it matter? No. 
 The 
 source code involved is not performance critical, and is not worth 
 optimizing. The interpreter may be running this logic, but no hot path 
 would be executing the actual logic in this code... 

 Why?  Because the java code you see there is NOT what the hot code 
 would be doing on (most) JVMs. Specifically, optimizing JITs can and will 
 identify and intrinsify the method, replacing it's body with code that 
 does 
 whatever they want it to do. They don't have 

Re: sun.misc.Unsafe.getAndAddInt implementation, question.

2017-10-16 Thread John Hening
Thanks Gil Tene! 

You obviously right. The read is not volatile so the compiler is allowed to 
reorder it. Moreover, the read is not volatile, so the compiler assumes 
that noone changes the source of getInt(). So, it can hoist out curr.


The last question (sorry for being inquisitive)  is:

Let's assume that the compiler generated bytecode equivalent to 

public final int getAndAddInt(Object ptr, long offset, int value) {
 int curr;
 do {
curr = this.getInt(ptr, offset); (1)
 } while(!this.compareAndSwapInt(ptr, offset, curr, curr + value)); (2)
 return curr;
}



so it wasn't optimized. 

Now, it seems to work correctly. But, note that CPU can make some 
reordering. We are lucky because the CPU cannot reorder here: there is a 
data dependency: (1) -> (2). So, on every sensible (with data dependency) 
CPU it works. 

W dniu poniedziałek, 16 października 2017 12:13:07 UTC+2 użytkownik Gil 
Tene napisał:
>
> Ok. So the question below (ignoring other optimizations in the JVM that 
> are specific to this method) is "If I were doing this myself in some other 
> method, would this logic be valid if Unsafe.getIntVolatile() could be be 
> replaced with Unsafe.getInt()?"
>
> The answer IMO is "no".
>
> The issue here is that unlike e.g. AtomicInteger.compareAndSet(), which is 
> explicitly specified to include the behavior of a volatile read on the 
> field involved, Unsafe.compareAndSwapInt() does not make any claims about 
> exhibiting volatile read semantics. As a result, if you replace 
> Unsafe.getIntVolatile() with Unsafe.getInt(), the resulting code:
>
> public final int getAndAddInt(Object ptr, long offset, int value) {
>  int curr;
>  do {
> curr = this.getInt(ptr, offset); (1)
>  } while(!this.compareAndSwapInt(ptr, offset, curr, curr + value)); (2)
>  return curr;
> }
>
> Can be validly transformed by the optimizer to:
>
> public final int getAndAddInt(Object ptr, long offset, int value) {
>  int curr = this.getInt(ptr, offset); (1)
>  do {  
>  } while(!this.compareAndSwapInt(ptr, offset, curr, curr + value)); (2)
>  return curr;
> }
>
> Because:
>
> (a) The optimizer can prove that if the compareAndSwapInt ever actually 
> wrote to the field, the method would return and curr wouldn't be read again.
> (b) Since the read of curr is not volatile, and the read in 
> Unsafe.compareAndSwapInt() is not required to act like a volatile read, all 
> the reads of curr can be reordered with the all the reads in the 
> compareAndSwapInt() calls, which means that they can be folded together and 
> hoisted out of the loop.
>
> If this valid optimization happened, the resulting code would get stuck in 
> an infinite loop if another thread modified the field between the read of 
> curr and the compareAndSwapInt call, and that is obviously not the intended 
> behavior of getAndAddInt()...
>
> On Sunday, October 15, 2017 at 2:12:30 AM UTC-7, John Hening wrote:
>>
>> Gil Tene, thanks you very much. 
>>
>> Ok, so does it mean that Unsafe.getIntVolatile() could be be replaced 
>> with Unsafe.getInt()?
>>
>> W dniu niedziela, 15 października 2017 01:34:34 UTC+2 użytkownik Gil Tene 
>> napisał:
>>>
>>> A simple answer would be that the field is treated by the method as a 
>>> volatile, and the code is simply staying consistent with that notion. Is an 
>>> optimization possible here? Possibly. Probably. But does it matter? No. The 
>>> source code involved is not performance critical, and is not worth 
>>> optimizing. The interpreter may be running this logic, but no hot path 
>>> would be executing the actual logic in this code... 
>>>
>>> Why?  Because the java code you see there is NOT what the hot code would 
>>> be doing on (most) JVMs. Specifically, optimizing JITs can and will 
>>> identify and intrinsify the method, replacing it's body with code that does 
>>> whatever they want it to do. They don't have to perform any of the actual 
>>> the logic in the method, as long as they make sure the method's performs 
>>> it's intended (contracted) function. and that contracted functionality is 
>>> to perform a getAndAddInt on a field, treating it logically as a volatile.
>>>
>>> For example, on x86 there is support for atomic add via the XADD 
>>> instruction. Using XADD for this method's functionality has multiple 
>>> advantages over doing the as-coded CAS loop. And most optimizing JITs will 
>>> [transparently] use an XADD in place of a CAS in this case and get rid of 
>>> the loop altogether.
>>>
>>> On Saturday, October 14, 2017 at 6:58:17 AM UTC-7, John Hening wrote:

 Hello


 it is an implementation from sun.misc.Unsafe.



 public final int getAndAddInt(Object ptr, long offset, int value) {
  int curr;
  do {
 curr = this.getIntVolatile(ptr, offset); (1)
  } while(!this.compareAndSwapInt(ptr, offset, curr, curr + value)); (2)

  return curr;}





 Why there is 
 Unsafe.getIntVolatile()

  called instead of

Re: sun.misc.Unsafe.getAndAddInt implementation, question.

2017-10-16 Thread Gil Tene
Ok. So the question below (ignoring other optimizations in the JVM that are 
specific to this method) is "If I were doing this myself in some other 
method, would this logic be valid if Unsafe.getIntVolatile() could be be 
replaced with Unsafe.getInt()?"

The answer IMO is "no".

The issue here is that unlike e.g. AtomicInteger.compareAndSet(), which is 
explicitly specified to include the behavior of a volatile read on the 
field involved, Unsafe.compareAndSwapInt() does not make any claims about 
exhibiting volatile read semantics. As a result, if you replace 
Unsafe.getIntVolatile() with Unsafe.getInt(), the resulting code:

public final int getAndAddInt(Object ptr, long offset, int value) {
 int curr;
 do {
curr = this.getInt(ptr, offset); (1)
 } while(!this.compareAndSwapInt(ptr, offset, curr, curr + value)); (2)
 return curr;
}

Can be validly transformed by the optimizer to:

public final int getAndAddInt(Object ptr, long offset, int value) {
 int curr = this.getInt(ptr, offset); (1)
 do {  
 } while(!this.compareAndSwapInt(ptr, offset, curr, curr + value)); (2)
 return curr;
}

Because:

(a) The optimizer can prove that if the compareAndSwapInt ever actually 
wrote to the field, the method would return and curr wouldn't be read again.
(b) Since the read of curr is not volatile, and the read in 
Unsafe.compareAndSwapInt() is not required to act like a volatile read, all 
the reads of curr can be reordered with the all the reads in the 
compareAndSwapInt() calls, which means that they can be folded together and 
hoisted out of the loop.

If this valid optimization happened, the resulting code would get stuck in 
an infinite loop if another thread modified the field between the read of 
curr and the compareAndSwapInt call, and that is obviously not the intended 
behavior of getAndAddInt()...

On Sunday, October 15, 2017 at 2:12:30 AM UTC-7, John Hening wrote:
>
> Gil Tene, thanks you very much. 
>
> Ok, so does it mean that Unsafe.getIntVolatile() could be be replaced with 
> Unsafe.getInt()?
>
> W dniu niedziela, 15 października 2017 01:34:34 UTC+2 użytkownik Gil Tene 
> napisał:
>>
>> A simple answer would be that the field is treated by the method as a 
>> volatile, and the code is simply staying consistent with that notion. Is an 
>> optimization possible here? Possibly. Probably. But does it matter? No. The 
>> source code involved is not performance critical, and is not worth 
>> optimizing. The interpreter may be running this logic, but no hot path 
>> would be executing the actual logic in this code... 
>>
>> Why?  Because the java code you see there is NOT what the hot code would 
>> be doing on (most) JVMs. Specifically, optimizing JITs can and will 
>> identify and intrinsify the method, replacing it's body with code that does 
>> whatever they want it to do. They don't have to perform any of the actual 
>> the logic in the method, as long as they make sure the method's performs 
>> it's intended (contracted) function. and that contracted functionality is 
>> to perform a getAndAddInt on a field, treating it logically as a volatile.
>>
>> For example, on x86 there is support for atomic add via the XADD 
>> instruction. Using XADD for this method's functionality has multiple 
>> advantages over doing the as-coded CAS loop. And most optimizing JITs will 
>> [transparently] use an XADD in place of a CAS in this case and get rid of 
>> the loop altogether.
>>
>> On Saturday, October 14, 2017 at 6:58:17 AM UTC-7, John Hening wrote:
>>>
>>> Hello
>>>
>>>
>>> it is an implementation from sun.misc.Unsafe.
>>>
>>>
>>>
>>> public final int getAndAddInt(Object ptr, long offset, int value) {
>>>  int curr;
>>>  do {
>>> curr = this.getIntVolatile(ptr, offset); (1)
>>>  } while(!this.compareAndSwapInt(ptr, offset, curr, curr + value)); (2)
>>>
>>>  return curr;}
>>>
>>>
>>>
>>>
>>>
>>> Why there is 
>>> Unsafe.getIntVolatile()
>>>
>>>  called instead of 
>>> Unsafe.getInt()
>>>
>>>  here?
>>>
>>>
>>> I am basically familiar with memory models, memory barriers etc., but, 
>>> perhaps I don't see any something important. 
>>>
>>>
>>> *getIntVolatile* means here: ensure the order of execution: (1) -> (2)
>>>
>>>
>>> It looks something like:
>>>
>>>
>>> curr = read(); 
>>> acquire();
>>> CAS operation
>>>
>>>
>>>
>>> Obviously, acquire() depends on CPU, for example on x86 it is empty, on ARM 
>>> it is a memory barrier, etc. 
>>>
>>>
>>> My question/misunderstanding:
>>>
>>>
>>> For my eye the order is ensured by data dependency between read of *(ptr + 
>>> offset)* and *CAS* operation on it. So, I don't see a reason to worry about 
>>> memory (re)ordering. 
>>>
>>>
>>>
>>>

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to mechanical-sympathy+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.