On Fri, Sep 13, 2019 at 7:28 PM Andrew Barnert via Python-ideas <
python-ideas@python.org> wrote:

> First, I’m pretty sure that, contrary to your claims, C++ does not support
> this. C++ doesn’t even support shared memory out of the box. The
> third-party Boost library does provide it—as long as you only care about
> systems that correctly support POSIX shared memory, and Windows, and as
> long as you either


I am not sure what your point is here. Shared memory support in C++ was not
the OP's point. <atomic> support in C++ has been there since C++11. I do not
know whether it was always supported historically, but a simple smoke test
on gcc on Linux and MSVC on Windows, compiling:
```
#include <atomic>

int main(int argc, char** argv)
{
    std::atomic<int> a(0);
    a++;        // atomic read-modify-write increment
    return a;   // implicit atomic load
}
```
gives the expected result - both compilers emit "lock xadd" into the code
without any additional effort (gcc with --std=c++11 and MSVC with
/std:c++14, since it does not accept c++11).


> Using atomics means that your statistics can be out of sync. You could,
> for example, see up-to-the-nanosecond bad requests count but out-of-date
> total requests and therefore calculate an incorrect request success rate
> (for an extreme example, it could be 1/0) and do the wrong thing. You can
> fix this by coming up with an ordering discipline for updates, and adding
> an extra value for total updates that you can use to manage the potential
> error bounds, but this is hardly simple.
>

Yes, using only atomics you cannot get a "frozen" global state, as the
values will be changing behind your back, but again, the OP did not claim
that this was his goal.
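
For the record, here is a minimal sketch of the effect you describe (the
counter names are invented for illustration): two independently updated
atomics can momentarily disagree, giving exactly the 1/0 extreme you
mention.
```
#include <atomic>

// Hypothetical counters, invented for illustration.
std::atomic<long> total_requests(0);
std::atomic<long> bad_requests(0);

void record_bad_request()
{
    // Two independent atomic ops: a reader can observe the state in between.
    bad_requests.fetch_add(1, std::memory_order_relaxed);
    total_requests.fetch_add(1, std::memory_order_relaxed);
}

double bad_request_rate()
{
    // Each load is atomic, but the pair is not a consistent snapshot:
    // a reader running between the two increments above can see
    // bad_requests == 1 while total_requests == 0.
    long bad = bad_requests.load(std::memory_order_relaxed);
    long total = total_requests.load(std::memory_order_relaxed);
    return total ? static_cast<double>(bad) / total : 0.0;
}

int main()
{
    record_bad_request();
    return bad_request_rate() > 0.0 ? 0 : 1;
}
```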


> By contrast, just grabbing a lock around every call to `update_stats` and
> `read_stats` is trivial.
>

It is trivial, and it can kill performance if there is a lot of contention.
I can imagine that grabbing the lock can be as fast as an atomic when there
is no contention (the lock itself being implemented with an atomic), but
under contention, stopping a thread seems an order of magnitude more
expensive than locking the bus for one atomic op. We may argue about some
particular implementation and its memory access pattern (and memory
partitioning and contention), but without knowing all the details of the
OP's app, we can just as well argue about the weather.
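
To be clear, I agree the lock-based version is trivial to write; a minimal
sketch (update_stats and read_stats are hypothetical names taken from your
message):
```
#include <mutex>
#include <utility>

// Plain counters guarded by one mutex; readers and writers both take the
// lock, so every read sees a consistent pair.
struct Stats {
    std::mutex m;
    long total_requests = 0;
    long bad_requests = 0;
};

void update_stats(Stats& s, bool bad)
{
    std::lock_guard<std::mutex> guard(s.m);
    ++s.total_requests;
    if (bad) ++s.bad_requests;
}

// Returns a consistent snapshot of both counters.
std::pair<long, long> read_stats(Stats& s)
{
    std::lock_guard<std::mutex> guard(s.m);
    return std::make_pair(s.total_requests, s.bad_requests);
}

int main()
{
    Stats s;
    update_stats(s, true);
    return read_stats(s).second == 1 ? 0 : 1;
}
```
On typical implementations the uncontended lock_guard costs about one atomic
compare-and-swap, which is why I would expect the two approaches to be
comparable until contention appears.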


> Meanwhile, a single lock plus dozens of nonatomic writes is going to be a
> lot faster than dozens of atomic writes (especially if you don’t have a
> library flexible enough to let you do acquire-release semantics instead of
> fully-atomic semantics), as well as being simpler. Not to mention that
> having dozens of atomics likely to be allocated near each other may well
> mean stalling the
> cache 5 times for each contention or false contention instead of just once,
> but with a lock you don’t need to even consider that, much less figure out
> how to test for it and fix it.
>

I would expect one would need quite a lot of atomic operations to equal the
cost of one blocked thread, but that is just a gut feeling. Did you run any
benchmarks on this?
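
I have not, but a rough micro-benchmark is easy to sketch; something like
the following (entirely illustrative, and the results will of course depend
on the platform, the thread count, and the contention pattern):
```
#include <atomic>
#include <chrono>
#include <cstdio>
#include <mutex>
#include <thread>
#include <vector>

// Runs `iters` increments per thread across `nthreads` threads and
// returns the elapsed wall-clock time in seconds.
template <typename Fn>
double time_increments(int nthreads, long iters, Fn inc)
{
    auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> threads;
    for (int t = 0; t < nthreads; ++t)
        threads.emplace_back([&] { for (long i = 0; i < iters; ++i) inc(); });
    for (auto& th : threads) th.join();
    std::chrono::duration<double> dt = std::chrono::steady_clock::now() - start;
    return dt.count();
}

int main()
{
    const int nthreads = 4;
    const long iters = 1000000L;

    std::atomic<long> atomic_counter(0);
    long plain_counter = 0;
    std::mutex m;

    // Contended atomic increments vs. contended mutex-guarded increments.
    double t_atomic = time_increments(nthreads, iters, [&] {
        atomic_counter.fetch_add(1, std::memory_order_relaxed);
    });
    double t_mutex = time_increments(nthreads, iters, [&] {
        std::lock_guard<std::mutex> guard(m);
        ++plain_counter;
    });

    std::printf("atomic: %.3fs  mutex: %.3fs\n", t_atomic, t_mutex);
    return 0;
}
```
My gut says the mutex version falls behind as the thread count grows, but as
said above, this is exactly the kind of thing one should measure rather than
guess.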

>

> When the lock isn’t acceptable, it’s because there’s too much contention
> on the lock—and there would have been too much contention on the atomic
> writes


So we do not differ much in our understanding after all. You just assume
that there won't be (a lot of) contention. I do not know; maybe the OP does.

Richard