[Bug libstdc++/115402] std::atomic_ref compile-error in compare_exchange_[weak/strong]() and wait()

2024-06-14 Thread lewissbaker.opensource at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115402

--- Comment #7 from Lewis Baker  ---
This paper has now been published and is available at:
https://isocpp.org/files/papers/P3323R0.html

[Bug libstdc++/115402] New: std::atomic_ref compile-error in compare_exchange_[weak/strong]() and wait()

2024-06-09 Thread lewissbaker.opensource at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115402

Bug ID: 115402
   Summary: std::atomic_ref compile-error in
compare_exchange_[weak/strong]() and wait()
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: lewissbaker.opensource at gmail dot com
  Target Milestone: ---

See https://godbolt.org/z/q4jYvPaah

The following code-snippet fails to compile:

 volatile int vi = 0;
 std::atomic_ref vref(vi);
 int val = vref.load(); // ok
 vref.exchange(val); // ok
 vref.compare_exchange_weak(val, 0); // error
 vref.compare_exchange_strong(val, 0); // error
 vref.wait(0); // error

Fails with error messages like:

.../include/c++/15.0.0/bits/atomic_base.h: In instantiation of 'bool
std::__atomic_ref<_Tp, true, false>::compare_exchange_weak(_Tp&, _Tp,
std::memory_order, std::memory_order) const [with _Tp = volatile int]':
/opt/compiler-explorer/gcc-trunk-20240609/include/c++/15.0.0/bits/atomic_base.h:1674:30:
  required from 'bool std::__atomic_ref<_Tp, true,
false>::compare_exchange_weak(_Tp&, _Tp, std::memory_order) const [with _Tp =
volatile int]'
 1674 | return compare_exchange_weak(__expected, __desired, __order,
  |~^~~~
 1675 |  __cmpexch_failure_order(__order));
  |  ~
:8:31:   required from here
8 | vref.compare_exchange_weak(val, 0); // error
  | ~~^~~~
.../include/c++/15.0.0/bits/atomic_base.h:1656:58: error: binding reference of
type 'std::__atomic_impl::_Val&' {aka 'int&'} to 'volatile int'
discards qualifiers
 1656 | return __atomic_impl::compare_exchange_weak(
  |~~^
 1657 |  _M_ptr, __expected, __desired, __success, __failure);
  |  
.../include/c++/15.0.0/bits/atomic_base.h:1119:52: note:   initializing
argument 2 of 'bool std::__atomic_impl::compare_exchange_weak(_Tp*, _Val<_Tp>&,
_Val<_Tp>, std::memory_order, std::memory_order, bool) [with bool _AtomicRef =
true; _Tp = volatile int; _Val<_Tp> = int]'
 1119 |   compare_exchange_weak(_Tp* __ptr, _Val<_Tp>& __expected,
  | ~~~^~

[Bug libstdc++/106183] New: std::atomic::wait might deadlock on platforms without platform_wait()

2022-07-04 Thread lewissbaker.opensource at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106183

Bug ID: 106183
   Summary: std::atomic::wait might deadlock on platforms without
platform_wait()
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: lewissbaker.opensource at gmail dot com
  Target Milestone: ---

I have been doing some research on implementations of std::atomic::notify_all()
and std::atomic::wait() as part of a C++ paper I've been working on.

I've recently been studying the libc++ implementation and I think I have
discovered a potential bug in the implementation for platforms that do not have
__GLIBCXX_HAVE_PLATFORM_WAIT defined (i.e. that don't have futex syscall or
similar) and for std::atomic types where T is different from
__platform_wait_t.

I believe there is potential for a thread calling x.wait(old) to fail to be
unblocked by a call by another thread to x.notify_all() after modifying the
value to something other than 'old'.

I have reduced the current implementation of the std::atomic::wait() and
std::atomic::notify_all() functions and I believe the code currently in
trunk to be effectively equivalent to the following code-snippet:

--
using __platform_wait_t = std::uint64_t;

struct __waiter_pool {
  std::atomic<__platform_wait_t> _M_wait{0};
  std::mutex _M_mut;
  std::atomic<__platform_wait_t> _M_ver{0};
  std::condition_variable _M_cv;

  static __waiter_pool& _S_for(void* __addr) noexcept {
constexpr uintptr_t __count = 16;
static __waiter_pool __x[__count];
uintptr_t __key = (((uintptr_t)__addr) >> 2) % __count;
return __x[__key];
  }
};

template
bool __atomic_compare(const _Tp& __a, const _Tp& __b) noexcept {
  return std::memcmp(std::addressof(__a), std::addressof(__b), sizeof(_Tp)) ==
0;
}

template
void atomic::wait(T __old, memory_order __mo = memory_order_seq_cst)
noexcept {
  __waiter_pool& __w = __waiter_pool::_S_for(this);
  __w._M_wait.fetch_add(1, std::memory_order_seq_cst);
  do {
__platform_wait_t __val1 = __w._M_ver.load(std::memory_order_acquire);
if (!__atomic_compare(__old, this->load(__mo))) {
return;
}

__platform__wait_t __val2 = __w._M_ver.load(std::memory_order_seq_cst);
// < BUG: problem if notify_all() is executed at this point
if (__val2 == __val1) {
lock_guard __lk(__w._M_mtx);
__w._M_cv.wait(__lk);
}

  } while (__atomic_compare(__old, this->load(__mo)));

  __w._M_wait.fetch_sub(1, std::memory_order_release);
}

void atomic::notify_all() noexcept {
__waiter_pool& __w = __waiter_pool::_S_for(this);
__w._M_ver.fetch_add(1, memory_order_seq_cst);
if (__w._M_wait.load(memory_order_seq_cst) != 0) {
__w._M_cv.notify_all();
}
}
---

The wait() method reads the _M_ver value then checks whether the value being
waited on has changed
and then if it has not, then reads the _M_ver value again. If the two values
read from _M_ver are
the same then we can infer that there has not been an intervening call to
notify_all().

However, after checking that _M_ver has not changed, it then (and only then)
proceeds to acquire
the lock on the _M_mut mutex and then waits on the _M_cv condition variable.

The problem occurs if the waiting thread happens to be delayed between reading
_M_ver for the second
time and blocking inside the call to  _M_cv.wait() (indicated with a comment).
If this happens, it may be possible then for another thread that was supposed
to unblock this thread
to then modify the atomic value, call notify_all() which increments _M_ver and
calls _M_cv.notify_all(),
all before the waiting thread acquires the mutex and blocks on the
condition-variable.

If this happens and no other thread is subsequently going to call notify_all()
on the atomic variable
then it's possible the call to wait() will block forever as it missed its
wake-up call.

The solution here is to do more work while holding the mutex.

I haven't fully verified the correctness of the following code, but I think it
should help to
avoid the missed wake-ups situations that are possible in the current
implementation. It does
come at a higher synchronisation cost, however, as notifying threads also need
to acquire the
mutex.

-

template
void atomic::wait(T __old, memory_order __mo = memory_order_seq_cst)
noexcept {
  __waiter_pool& __w = __waiter_pool::_S_for(this);
  __w._M_wait.fetch_add(1, std::memory_order_seq_cst);
  do {
__platform_wait_t __val1 = __w._M_ver.load(std::memory_order_acquire);
if (!__atomic_compare(__old, this->load(__mo))) {
return;
}

__platform__wait_t __val2 = __w._M_ver.load(std::memory_order_seq_cst);
if (__val2 == __val1) {
lock_guard __lk(__w._M_mtx);
// read again under protection of the lock
__val2 = __w._M_ver.load(std::memor

[Bug c++/101247] New: ICE when using static constexpr bool defined in base-class in derived class constructor requires-clause

2021-06-28 Thread lewissbaker.opensource at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101247

Bug ID: 101247
   Summary: ICE when using static constexpr bool defined in
base-class in derived class constructor
requires-clause
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: lewissbaker.opensource at gmail dot com
  Target Milestone: ---

Compiling the following code with GCC trunk and flags: -std=c++20

```
template
struct in_place_type_t {};

template
inline constexpr in_place_type_t in_place_type;

template
inline constexpr bool outer_helper_v = true;

template
struct foo {
struct type;

template
static constexpr bool helper_v = outer_helper_v;
};

template
struct foo::type {
template
requires helper_v
type(in_place_type_t) {}
};

int main() {
foo::type x(in_place_type);
}
```

results in an internal compiler-error

```
: In substitution of 'template  requires  helper_v
foo::type::type(in_place_type_t) [with U = {int}]':
:26:40:   required from here
:21:18: internal compiler error: tree check: accessed elt 1 of
'tree_vec' with 0 elts in tsubst_pack_expansion, at cp/pt.c:13084
   21 | requires helper_v
  |  ^~~
0x1d4c179 internal_error(char const*, ...)
???:0
0x685746 tree_vec_elt_check_failed(int, int, char const*, int, char const*)
???:0
0x99e0bc tsubst_pack_expansion(tree_node*, tree_node*, int, tree_node*)
???:0
0x9afa62 tsubst_template_args(tree_node*, tree_node*, int, tree_node*)
???:0
0x9b064b tsubst_argument_pack(tree_node*, tree_node*, int, tree_node*)
???:0
0x9afa04 tsubst_template_args(tree_node*, tree_node*, int, tree_node*)
???:0
0x983ebd tsubst(tree_node*, tree_node*, int, tree_node*)
???:0
0x7abcda constraints_satisfied_p(tree_node*, tree_node*)
???:0
0x9c26c3 fn_type_unification(tree_node*, tree_node*, tree_node*, tree_node*
const*, unsigned int, tree_node*, unification_kind_t, int, conversion**, bool,
bool)
???:0
0x74abc9 build_new_method_call(tree_node*, tree_node*, vec**, tree_node*, int, tree_node**, int)
???:0
0x74c8c0 build_special_member_call(tree_node*, tree_node*, vec**, tree_node*, int, int)
???:0
0x8611c8 build_aggr_init(tree_node*, tree_node*, int, int)
???:0
0x81f0d9 cp_finish_decl(tree_node*, tree_node*, bool, tree_node*, int)
???:0
0x94f36d c_parse_file()
???:0
0xad2922 c_common_parse_file()
???:0
```

See https://godbolt.org/z/ebnxfsMs9 for an example.

[Bug c++/97452] [coroutines] incorrect sequencing of await_resume() when multiple co_await expressions occur in a single statement

2021-04-09 Thread lewissbaker.opensource at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97452

--- Comment #9 from Lewis Baker  ---
> In terms of the standard do you think this is technically undefined behaviour?

Yes, I think this is something that Gor was looking into as a wording issue
that could do with some clarification.

I think the suggestion was something along the lines of adding some wording to
ensure that the evaluation of a an await-expression was sequenced atomically
with respect to the evaluation of other expressions in the statement.

[Bug libstdc++/99537] New: Wrong memory_order used in stop_token ref-counting

2021-03-10 Thread lewissbaker.opensource at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99537

Bug ID: 99537
   Summary: Wrong memory_order used in stop_token ref-counting
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: lewissbaker.opensource at gmail dot com
  Target Milestone: ---

In the implementation of stop_token the _Stop_state_t implements
reference-counting for tracking shared ownership of the stop-state.

This is done via two methods:

  void
  _M_add_owner() noexcept
  {
_M_owners.fetch_add(1, memory_order::relaxed);
  }

  void
  _M_release_ownership() noexcept
  {
if (_M_owners.fetch_sub(1, memory_order::release) == 1)
  delete this;
  }

Other than initialising the _M_owners atomic value to 1, these are the only two
accesses of the _M_owners variable.

The 'fetch_sub()' operation in _M_release_ownership() should be using
memory_order::acq_rel instead of memory_order::release. The use of 'release'
only is insufficient as it does not synchronise with any corresponding
'acquire' operation.

With the current implementation, it's possible that a prior write to one of the
_M_value or _M_head data-members by a thread releasing the second-to-last
reference might not be visible to another thread that releases the last
reference and frees the memory, resulting in potential write to freed memory.

[Bug c++/97452] New: [coroutines] incorrect sequencing of await_resume() when multiple co_await expressions occur in a single statement

2020-10-15 Thread lewissbaker.opensource at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97452

Bug ID: 97452
   Summary: [coroutines] incorrect sequencing of await_resume()
when multiple co_await expressions occur in a single
statement
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: lewissbaker.opensource at gmail dot com
  Target Milestone: ---

See https://godbolt.org/z/9Kj9o8

Compile the following program with GCC trunk and flags: -std=c++20


#include 
#include 
#include 
#include 
#include 

struct task {
struct promise_type {
task get_return_object() noexcept {
return
task{std::coroutine_handle::from_promise(*this)};
}
std::suspend_always initial_suspend() noexcept {
std::puts("task initial_suspend");
return {};
}

struct yield_awaiter {
bool await_ready() noexcept { return false; }
auto await_suspend(std::coroutine_handle h) noexcept
{
std::puts("task yielding/returning value");
return std::exchange(h.promise().continuation, {});
}
void await_resume() noexcept {
std::puts("task resumed");
}
};

yield_awaiter yield_value(int x) noexcept {
value = x;
return {};
}

void return_value(int x) noexcept {
value = x;
}

yield_awaiter final_suspend() noexcept {
return {};
}

[[noreturn]] void unhandled_exception() noexcept {
std::terminate();
}

int value;
std::coroutine_handle<> continuation;
};

explicit task(std::coroutine_handle coro) noexcept
: coro(coro)
{}

task(task&& t) noexcept
: coro(std::exchange(t.coro, {}))
{}

~task() {
if (coro) coro.destroy();
}

__attribute__((noinline)) bool await_ready() {
std::puts("task::await_ready");
return false;
}

__attribute__((noinline)) auto await_suspend(std::coroutine_handle<> h)
noexcept {
std::puts("task::await_suspend");
coro.promise().continuation = h;
return coro;
}

__attribute__((noinline)) int await_resume() {
std::puts("task::await_resume");
return coro.promise().value;
}

std::coroutine_handle coro;
};

task two_values() {
co_yield 1;
co_return 2;
}

task example() {
task t = two_values();
int result = co_await t + co_await t;
std::printf("result = %i (should be 3)\n", result);
std::fflush(stdout);
co_return result;
}

int main() {
task t = example();
t.coro.resume();
assert(t.coro.done());
}


Expected output:

task initial_suspend
task initial_suspend
task::await_ready
task::await_suspend
task yielding/returning value
task::await_resume
task::await_ready
task::await_suspend
task resumed
task yielding/returning value
task::await_resume
result = 3 (should be 3)

Actual output:

task initial_suspend
task initial_suspend
task::await_ready
task::await_suspend
task yielding/returning value
task::await_ready
task::await_suspend
task resumed
task yielding/returning value
task::await_resume
task::await_resume
result = 4 (should be 3)


Note that the output indicates the the compiler is only calling await_resume()
on both awaiters immediately before evaluating the plus-expression rather than
evaluating await_resume() on the first awaiter immediately upon resuming from
the first suspend-point and before evaluating await_ready() for the second
awaiter.