Re: Volatile semantic for failed/noop atomic operations

2019-10-08 Thread Vitaly Davidovich
On Sat, Oct 5, 2019 at 11:41 AM Steven Stewart-Gallus <
stevenselectronicm...@gmail.com> wrote:

> Couldn't you do a compare and compare and swap? With VarHandles something
> like
>
> if (ACTION.getOpaque(this) != expected) return false;
> return compareAndExchange(this, expected, newValue) == expected;
>
> Not sure I got this correct
>
In general you could.  This is sometimes referred to as TTAS (test and
test-and-set).

In this thread’s example, the “checks the count via a volatile load” I
mentioned is a similar effect; under some conditions, and opaque load might
be slightly cheaper than a volatile one (on x86 at least).  The difference
would only really be a result of the optimizer doing other scheduling
around the opaque load, but not around a volatile one.

The right/best approach for Simone’s case depends on specifics (don’t they
all?! :)).  I didn’t actually pick up on how often the termination protocol
triggers - I assumed it’s an uncommon/slow path.

>
>
> On Saturday, September 14, 2019 at 11:29:00 AM UTC-7, Vitaly Davidovich
> wrote:
>>
>> Unlike C++, where you can specify mem ordering for failure and success
>> separately, Java doesn’t allow that.  But, the mem ordering is the same for
>> failure/success there.  Unfortunately it doesn’t look like the javadocs
>> mention that, but I recall Doug Lea saying that’s the case on the
>> concurrency-interest mailing list (btw that’s probably the more appropriate
>> channel for this Java-centric question).
>>
>> For your case, it seems like an AtomicReference is more
>> appropriate.  terminate() sets it, then checks the count via a volatile
>> load (or maybe it can decrement() itself?); if zero, CAS null into the
>> action field to take/claim the action.  decrement() likewise tries to claim
>> the action via a CAS.  The snippet you have now would allow for concurrent
>> action execution, which is likely unsafe/wrong.
>>
>
>> On Fri, Sep 13, 2019 at 3:08 AM Simone Bordet 
>> wrote:
>>
>>> Hi,
>>>
>>> I have an atomic counter that gets incremented and decremented over
>>> time (non monotonically).
>>> At a certain point, I would like to enter a termination protocol where
>>> increments are not possible anymore and I set an action to run if/when
>>> the counter reaches zero.
>>> Trivial when using synchronized/lock, but I'd like to give it a try
>>> without them.
>>>
>>> class A {
>>>   private final AtomicLong counter;
>>>   // Non-volatile
>>>   private Runnable action;
>>>
>>>   void terminate(Runnable action) {
>>> this.action = action;
>>> // Volatile write needed here for visibility.
>>> if (counter.addAndGet(0) == 0) {
>>>   action.run();
>>> }
>>>   }
>>>
>>>   void decrement() {
>>> // Volatile read required to see this.action.
>>> if (counter.decrementAndGet() == 0) {
>>>   Runnable a = this.action;
>>>   if (a != null) {
>>> a.run()
>>>   }
>>> }
>>>   }
>>> }
>>>
>>> Is addAndGet(0) a volatile write? Can the write be optimized away?
>>> Similarly (although not relevant for this particular example), a
>>> _failed_ compareAndSet() has the semantic of a volatile write even if
>>> the set part was not done because the compare part failed?
>>>
>>> Thanks!
>>>
>>> --
>>> Simone Bordet
>>> ---
>>> Finally, no matter how good the architecture and design are,
>>> to deliver bug-free software with optimal performance and reliability,
>>> the implementation technique must be flawless.   Victoria Livschitz
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "mechanical-sympathy" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to mechanical-sympathy+unsubscr...@googlegroups.com.
>>> To view this discussion on the web, visit
>>> https://groups.google.com/d/msgid/mechanical-sympathy/CAFWmRJ3qGJ_qqrXmAHNDZ6ro01BQwe8czHZP7b-SoZ%2BrULhJAw%40mail.gmail.com
>>> .
>>>
>> --
>> Sent from my phone
>>
> --
> You received this message because you are subscribed to the Google Groups
> "mechanical-sympathy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to mechanical-sympathy+unsubscr...@googlegroups.com.
> To view this discussion on the web, visit
> https://groups.google.com/d/msgid/mechanical-sympathy/43729dfa-1e73-4318-b446-e80c81422b6e%40googlegroups.com
> 
> .
>
-- 
Sent from my phone

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to mechanical-sympathy+unsubscr...@googlegroups.com.
To view this discussion on the web, visit 
https://groups.google.com/d/msgid/mechanical-sympathy/CAHjP37Fjb01ReqeVtj26JxT%2BKhhsVVRpZx7J-CpDXi_ZDpF7ow%40mail.gmail.com.


Re: Volatile semantic for failed/noop atomic operations

2019-10-05 Thread Gil Tene
That's actually the common idiom for performant concurrent code that 
occasionally needs to do something atomic, but usually (the hot case) 
doesn't. Without the first read check, there is a good likelihood 
(depending on how the CPU and prefetchers work for CAS, but very probably) 
that you will run into serious false sharing contention on the line under 
high load on multiple cores, as the CAS will bring the line in exclusive 
and invalidate it in other caches, while the first read check will bring it 
in shared, and the invalidation will only happen if you actually need to 
change the value.

On Saturday, October 5, 2019 at 8:41:26 AM UTC-7, Steven Stewart-Gallus 
wrote:
>
> Couldn't you do a compare and compare and swap? With VarHandles something 
> like
>
> if (ACTION.getOpaque(this) != expected) return false;
> return compareAndExchange(this, expected, newValue) == expected;
>
> Not sure I got this correct
>
> On Saturday, September 14, 2019 at 11:29:00 AM UTC-7, Vitaly Davidovich 
> wrote:
>>
>> Unlike C++, where you can specify mem ordering for failure and success 
>> separately, Java doesn’t allow that.  But, the mem ordering is the same for 
>> failure/success there.  Unfortunately it doesn’t look like the javadocs 
>> mention that, but I recall Doug Lea saying that’s the case on the 
>> concurrency-interest mailing list (btw that’s probably the more appropriate 
>> channel for this Java-centric question).
>>
>> For your case, it seems like an AtomicReference is more 
>> appropriate.  terminate() sets it, then checks the count via a volatile 
>> load (or maybe it can decrement() itself?); if zero, CAS null into the 
>> action field to take/claim the action.  decrement() likewise tries to claim 
>> the action via a CAS.  The snippet you have now would allow for concurrent 
>> action execution, which is likely unsafe/wrong.
>>
>> On Fri, Sep 13, 2019 at 3:08 AM Simone Bordet  
>> wrote:
>>
>>> Hi,
>>>
>>> I have an atomic counter that gets incremented and decremented over
>>> time (non monotonically).
>>> At a certain point, I would like to enter a termination protocol where
>>> increments are not possible anymore and I set an action to run if/when
>>> the counter reaches zero.
>>> Trivial when using synchronized/lock, but I'd like to give it a try
>>> without them.
>>>
>>> class A {
>>>   private final AtomicLong counter;
>>>   // Non-volatile
>>>   private Runnable action;
>>>
>>>   void terminate(Runnable action) {
>>> this.action = action;
>>> // Volatile write needed here for visibility.
>>> if (counter.addAndGet(0) == 0) {
>>>   action.run();
>>> }
>>>   }
>>>
>>>   void decrement() {
>>> // Volatile read required to see this.action.
>>> if (counter.decrementAndGet() == 0) {
>>>   Runnable a = this.action;
>>>   if (a != null) {
>>> a.run()
>>>   }
>>> }
>>>   }
>>> }
>>>
>>> Is addAndGet(0) a volatile write? Can the write be optimized away?
>>> Similarly (although not relevant for this particular example), a
>>> _failed_ compareAndSet() has the semantic of a volatile write even if
>>> the set part was not done because the compare part failed?
>>>
>>> Thanks!
>>>
>>> -- 
>>> Simone Bordet
>>> ---
>>> Finally, no matter how good the architecture and design are,
>>> to deliver bug-free software with optimal performance and reliability,
>>> the implementation technique must be flawless.   Victoria Livschitz
>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "mechanical-sympathy" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to mechanical-sympathy+unsubscr...@googlegroups.com.
>>> To view this discussion on the web, visit 
>>> https://groups.google.com/d/msgid/mechanical-sympathy/CAFWmRJ3qGJ_qqrXmAHNDZ6ro01BQwe8czHZP7b-SoZ%2BrULhJAw%40mail.gmail.com
>>> .
>>>
>> -- 
>> Sent from my phone
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to mechanical-sympathy+unsubscr...@googlegroups.com.
To view this discussion on the web, visit 
https://groups.google.com/d/msgid/mechanical-sympathy/bf62a5dc-0c89-44cf-aa36-65f0655d0b3a%40googlegroups.com.


Re: Volatile semantic for failed/noop atomic operations

2019-10-05 Thread Steven Stewart-Gallus
Couldn't you do a compare and compare and swap? With VarHandles something 
like

if (ACTION.getOpaque(this) != expected) return false;
return compareAndExchange(this, expected, newValue) == expected;

Not sure I got this correct

On Saturday, September 14, 2019 at 11:29:00 AM UTC-7, Vitaly Davidovich 
wrote:
>
> Unlike C++, where you can specify mem ordering for failure and success 
> separately, Java doesn’t allow that.  But, the mem ordering is the same for 
> failure/success there.  Unfortunately it doesn’t look like the javadocs 
> mention that, but I recall Doug Lea saying that’s the case on the 
> concurrency-interest mailing list (btw that’s probably the more appropriate 
> channel for this Java-centric question).
>
> For your case, it seems like an AtomicReference is more 
> appropriate.  terminate() sets it, then checks the count via a volatile 
> load (or maybe it can decrement() itself?); if zero, CAS null into the 
> action field to take/claim the action.  decrement() likewise tries to claim 
> the action via a CAS.  The snippet you have now would allow for concurrent 
> action execution, which is likely unsafe/wrong.
>
> On Fri, Sep 13, 2019 at 3:08 AM Simone Bordet  > wrote:
>
>> Hi,
>>
>> I have an atomic counter that gets incremented and decremented over
>> time (non monotonically).
>> At a certain point, I would like to enter a termination protocol where
>> increments are not possible anymore and I set an action to run if/when
>> the counter reaches zero.
>> Trivial when using synchronized/lock, but I'd like to give it a try
>> without them.
>>
>> class A {
>>   private final AtomicLong counter;
>>   // Non-volatile
>>   private Runnable action;
>>
>>   void terminate(Runnable action) {
>> this.action = action;
>> // Volatile write needed here for visibility.
>> if (counter.addAndGet(0) == 0) {
>>   action.run();
>> }
>>   }
>>
>>   void decrement() {
>> // Volatile read required to see this.action.
>> if (counter.decrementAndGet() == 0) {
>>   Runnable a = this.action;
>>   if (a != null) {
>> a.run()
>>   }
>> }
>>   }
>> }
>>
>> Is addAndGet(0) a volatile write? Can the write be optimized away?
>> Similarly (although not relevant for this particular example), a
>> _failed_ compareAndSet() has the semantic of a volatile write even if
>> the set part was not done because the compare part failed?
>>
>> Thanks!
>>
>> -- 
>> Simone Bordet
>> ---
>> Finally, no matter how good the architecture and design are,
>> to deliver bug-free software with optimal performance and reliability,
>> the implementation technique must be flawless.   Victoria Livschitz
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "mechanical-sympathy" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to mechanical-sympathy+unsubscr...@googlegroups.com .
>> To view this discussion on the web, visit 
>> https://groups.google.com/d/msgid/mechanical-sympathy/CAFWmRJ3qGJ_qqrXmAHNDZ6ro01BQwe8czHZP7b-SoZ%2BrULhJAw%40mail.gmail.com
>> .
>>
> -- 
> Sent from my phone
>

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to mechanical-sympathy+unsubscr...@googlegroups.com.
To view this discussion on the web, visit 
https://groups.google.com/d/msgid/mechanical-sympathy/43729dfa-1e73-4318-b446-e80c81422b6e%40googlegroups.com.


Re: Volatile semantic for failed/noop atomic operations

2019-09-25 Thread Francesco Nigro
> I've never heard of failed CAS being cheaper
I'm referring to this old (but good) article: 
https://blogs.oracle.com/dave/biased-locking-in-hotspot

Il giorno sabato 14 settembre 2019 21:41:56 UTC+2, Vitaly Davidovich ha 
scritto:
>
> On x86, I’ve never heard of failed CAS being cheaper.  In theory, cache 
> snooping can inform the core whether it’s xchg would succeed without going 
> through the RFO dance.  But, to perform the actual xchg it would need 
> ownership regardless (if not already owned/exclusive).
>
> Sharing ordinary mutable memory isn’t scalable, forget locks :) Just doing 
> plain loads/stores of shared memory will kill scalability (and perf) due to 
> cache coherence traffic.
>
> ARM uses LL/SC for CAS, and is much more susceptible to failures (eg even 
> if the value is expected, if the line has been refreshed after the LL part, 
> it fails).  But, I’m not overly familiar with ARM internals/mechanics.
>
> On Sat, Sep 14, 2019 at 3:20 PM Francesco Nigro  > wrote:
>
>> Although not mentioned (neither on Doug's cookbook) I have always 
>> supposed was more likely the c++ default for both weak/strong CAS ie 
>> seq_cst memory ordering.
>> To make this question more mechanical sympathy focused: on hardware level 
>> what happen? I would be curious to know both x86/arm versions for this, 
>> what kind of memory reordering are guaranteed...
>>
>> Most people says that a failed CAS cost much less then one that succeed, 
>> but just like a load or more? 
>> Probably the success will cause invalidation of all the caches that 
>> reference the cache line changed, but the failed case should lock (on x86) 
>> it and AFAIK locks (hw or SW) are not well known to be scalable and 
>> performant :)
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "mechanical-sympathy" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to mechanical-sympathy+unsubscr...@googlegroups.com .
>> To view this discussion on the web, visit 
>> https://groups.google.com/d/msgid/mechanical-sympathy/caebab48-f88f-477d-bc8e-95702e45bc76%40googlegroups.com
>> .
>>
> -- 
> Sent from my phone
>

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to mechanical-sympathy+unsubscr...@googlegroups.com.
To view this discussion on the web, visit 
https://groups.google.com/d/msgid/mechanical-sympathy/54efeb7e-04a2-48eb-a4f8-8e97556ccc03%40googlegroups.com.


Re: Volatile semantic for failed/noop atomic operations

2019-09-15 Thread Simone Bordet
Hi,

On Sun, Sep 15, 2019 at 2:24 AM Vitaly Davidovich  wrote:
> In your snippet, terminate() sets the field - it can be visible immediately 
> (or terminate() thread is descheduled).  decrement runs, decrements to 0, 
> sees action != null.  terminate() also now sees 0, and likewise proceeds to 
> run the action.
>
> The AR I was suggesting would allow ownership protocol over the action.  Read 
> it, if not null, try to CAS a null in there.  If CAS succeeds, you’ve claimed 
> ownership.  If it fails, someone else claimed it.  Whoever claims it runs the 
> action.

I see, thanks!

-- 
Simone Bordet
---
Finally, no matter how good the architecture and design are,
to deliver bug-free software with optimal performance and reliability,
the implementation technique must be flawless.   Victoria Livschitz

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to mechanical-sympathy+unsubscr...@googlegroups.com.
To view this discussion on the web, visit 
https://groups.google.com/d/msgid/mechanical-sympathy/CAFWmRJ1o9FM0wbNm5TWk_wzR59RCevgBxoM9%3D4L%3DEtO3VmtSFw%40mail.gmail.com.


Re: Volatile semantic for failed/noop atomic operations

2019-09-14 Thread Simone Bordet
Hi,

On Sat, Sep 14, 2019 at 8:28 PM Vitaly Davidovich  wrote:
>
> Unlike C++, where you can specify mem ordering for failure and success 
> separately, Java doesn’t allow that.  But, the mem ordering is the same for 
> failure/success there.  Unfortunately it doesn’t look like the javadocs 
> mention that,

Unfortunately so.

> but I recall Doug Lea saying that’s the case on the concurrency-interest 
> mailing list (btw that’s probably the more appropriate channel for this 
> Java-centric question).
>
> For your case, it seems like an AtomicReference is more appropriate.

But then I would have 2 volatile accesses instead of one.

> terminate() sets it, then checks the count via a volatile load (or maybe it 
> can decrement() itself?); if zero, CAS null into the action field to 
> take/claim the action.  decrement() likewise tries to claim the action via a 
> CAS.  The snippet you have now would allow for concurrent action execution, 
> which is likely unsafe/wrong.

I don't see the difference between 2 atomics and my snippet WRT
concurrent execution.
They can both be executed concurrently with arbitrary interleaving, no?

I think my snippet is safe as long as in terminate() there is a
volatile write after setting the action (it is admittedly weird to add
zero just to obtain a memory effect, I could have used
VarHandle.releaseFence() if I were in JDK 11), since the action would
be visible from decrement() because it does a volatile read on the
same variable, no?

-- 
Simone Bordet
---
Finally, no matter how good the architecture and design are,
to deliver bug-free software with optimal performance and reliability,
the implementation technique must be flawless.   Victoria Livschitz

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to mechanical-sympathy+unsubscr...@googlegroups.com.
To view this discussion on the web, visit 
https://groups.google.com/d/msgid/mechanical-sympathy/CAFWmRJ0gh54BKzEQcbTAyb%3DRpa41c8mpiQSGVc5xsYno-qbzVw%40mail.gmail.com.


Re: Volatile semantic for failed/noop atomic operations

2019-09-14 Thread Vitaly Davidovich
On x86, I’ve never heard of failed CAS being cheaper.  In theory, cache
snooping can inform the core whether it’s xchg would succeed without going
through the RFO dance.  But, to perform the actual xchg it would need
ownership regardless (if not already owned/exclusive).

Sharing ordinary mutable memory isn’t scalable, forget locks :) Just doing
plain loads/stores of shared memory will kill scalability (and perf) due to
cache coherence traffic.

ARM uses LL/SC for CAS, and is much more susceptible to failures (eg even
if the value is expected, if the line has been refreshed after the LL part,
it fails).  But, I’m not overly familiar with ARM internals/mechanics.

On Sat, Sep 14, 2019 at 3:20 PM Francesco Nigro  wrote:

> Although not mentioned (neither on Doug's cookbook) I have always supposed
> was more likely the c++ default for both weak/strong CAS ie seq_cst memory
> ordering.
> To make this question more mechanical sympathy focused: on hardware level
> what happen? I would be curious to know both x86/arm versions for this,
> what kind of memory reordering are guaranteed...
>
> Most people says that a failed CAS cost much less then one that succeed,
> but just like a load or more?
> Probably the success will cause invalidation of all the caches that
> reference the cache line changed, but the failed case should lock (on x86)
> it and AFAIK locks (hw or SW) are not well known to be scalable and
> performant :)
>
> --
> You received this message because you are subscribed to the Google Groups
> "mechanical-sympathy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to mechanical-sympathy+unsubscr...@googlegroups.com.
> To view this discussion on the web, visit
> https://groups.google.com/d/msgid/mechanical-sympathy/caebab48-f88f-477d-bc8e-95702e45bc76%40googlegroups.com
> .
>
-- 
Sent from my phone

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to mechanical-sympathy+unsubscr...@googlegroups.com.
To view this discussion on the web, visit 
https://groups.google.com/d/msgid/mechanical-sympathy/CAHjP37HDQLSMhX6oriRWS%3DWkfUDy9fGe41zjFWwzm9zO%3DFJbZg%40mail.gmail.com.


Re: Volatile semantic for failed/noop atomic operations

2019-09-14 Thread Francesco Nigro
Although not mentioned (neither on Doug's cookbook) I have always supposed was 
more likely the c++ default for both weak/strong CAS ie seq_cst memory ordering.
To make this question more mechanical sympathy focused: on hardware level what 
happen? I would be curious to know both x86/arm versions for this, what kind of 
memory reordering are guaranteed...

Most people says that a failed CAS cost much less then one that succeed, but 
just like a load or more? 
Probably the success will cause invalidation of all the caches that reference 
the cache line changed, but the failed case should lock (on x86) it and AFAIK 
locks (hw or SW) are not well known to be scalable and performant :)

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to mechanical-sympathy+unsubscr...@googlegroups.com.
To view this discussion on the web, visit 
https://groups.google.com/d/msgid/mechanical-sympathy/caebab48-f88f-477d-bc8e-95702e45bc76%40googlegroups.com.