Very nice detective work!
I'm very rusty in this memory ordering business, so don't have a very
useful review, and rather have a question below.

--
Nadav Har'El
n...@scylladb.com


On Sun, May 9, 2021 at 11:13 PM Waldemar Kozaczuk <jwkozac...@gmail.com>
wrote:

> As the issue #1134 explains, some unit tests hang in the do_main_thread
> while shutting down when OSv runs with 2 or more CPUs on aarch64. This
> happens even with the patch to address the issue #1123 applied.
>
> After connecting with gdb, the thread dump always looks very similar
> where the do_main_thread() thread waits on semaphore in the
> osv::rcu_flush()
> method that calls rcu_defer() to force execution of any outstanding
> callbacks and that callback never seems be executed.
> The threads of cpu_quiescent_state_thread type (one for each CPU)
> seem to be stuck waiting to converge to progress to the next generation.
> More specifically it seems that even though a condition involving
> the atomic variable `_generation` is true, one of the threads is stuck
> in sched::thread::wait_until(). This seems to indicate that most likely
> even though this thread was woken, it could not see correct value
> of `_generation` atomic variable modified by other thread on other CPUs.
>
> This looks somewhat similar to the issue #1123 in that it is caused by
> memory order related deficiencies in the code that come to light when
> running on aarch64 which comes with weak memory model.
>
> To address those, this patch modifies the memory order qualifier
> for some atomic operations in rcu.cc to make sure that the changes made
> on one CPU are visible on another one. More specifically the
> cpu_quiescent_state_thread
> threads interact with each other while converging on next generation by
> writing and reading the _request, _requested and _generation atomics.
> Currently all operations on those use the memory_order_relaxed qualifier
> which only guarantees atomicity but not visibility. To make sure that
> the changes to _request, _requested and _generation are visible across
> CPUs in sched::thread::wait_until(), we modify the memory modifier
> in relevant places to memory_order_acquire (when reading),
> memory_order_release (when writing)
> and memory_order_acq_rel for the CAS operation. These changes create
> proper acquire-release relationships between relevant read/write
> operations on those atomics and thus enforce necessary memory barrier
> instructions generated around those.
>
> Before this patch, the tst-evenfd.so test would hang under 100 runs
> with 2 vCPUs with the symptoms described by #1134. After this patch
> I could execute the same test over 13K times.
>
> Fixes #1134
>
> Signed-off-by: Waldemar Kozaczuk <jwkozac...@gmail.com>
> ---
>  core/rcu.cc | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/core/rcu.cc b/core/rcu.cc
> index d6f587b8..95969109 100644
> --- a/core/rcu.cc
> +++ b/core/rcu.cc
> @@ -73,7 +73,7 @@
> cpu_quiescent_state_thread::cpu_quiescent_state_thread(sched::cpu* cpu)
>  void cpu_quiescent_state_thread::request(uint64_t generation)
>  {
>      auto r = _request.load(std::memory_order_relaxed);
> -    while (generation > r && !_request.compare_exchange_weak(r,
> generation, std::memory_order_relaxed)) {
> +    while (generation > r && !_request.compare_exchange_weak(r,
> generation, std::memory_order_acq_rel)) {
>
         // nothing to do
>      }
>      _t->wake();
> @@ -90,7 +90,7 @@ void cpu_quiescent_state_thread::set_generation(uint64_t
> generation)
>      // Wake the quiescent threads who might be interested in my
> _generation.
>      for (auto cqst : cpu_quiescent_state_threads) {
>          if (cqst != this &&
> -                cqst->_requested.load(std::memory_order_relaxed)) {
> +                cqst->_requested.load(std::memory_order_acquire)) {
>              cqst->_t->wake();
>          }
>      }
> @@ -135,7 +135,7 @@ void cpu_quiescent_state_thread::do_work()
>          }
>          if (toclean) {
>              auto g = next_generation.fetch_add(1,
> std::memory_order_relaxed) + 1;
> -            _requested.store(true, std::memory_order_relaxed);
> +            _requested.store(true, std::memory_order_release);
>

I sadly have to admit I'm too rusty in memory ordering to check this code
:-(

I see that this place is the only memory_order_release you did. I would
think that this means that whatever we wrote *before* this release, becomes
visible for code that does a memory_order_acquire on _requested. But what
did we write in this code before the release? For example, we only write
_generation a few lines down...

Again, I'm probably missing something because I'm so rusty in this...


             // copy cpu_quiescent_state_threads to prevent a hotplugged cpu
>              // from changing the number of cpus we request a new
> generation on,
>              // and the number of cpus we wait on
> @@ -152,7 +152,7 @@ void cpu_quiescent_state_thread::do_work()
>              while (true) {
>                  sched::thread::wait_until([&cqsts, &g, this] {
>                      return ( (_generation.load(std::memory_order_relaxed)
> <
> -                                _request.load(std::memory_order_relaxed))
> +                                _request.load(std::memory_order_acquire))
>                               || all_at_generation(cqsts, g)); });
>                  auto r = _request.load(std::memory_order_relaxed);
>                  if (_generation.load(std::memory_order_relaxed) < r) {
> @@ -177,7 +177,7 @@ void cpu_quiescent_state_thread::do_work()
>              // wants to clean up, or we are woken to clean up our
> callbacks
>              sched::thread::wait_until([=] {
>                  return (_generation.load(std::memory_order_relaxed) <
> -                        _request.load(std::memory_order_relaxed)) ||
> +                        _request.load(std::memory_order_acquire)) ||
>
>  percpu_callbacks->ncallbacks[percpu_callbacks->buf]; });
>              auto r = _request.load(std::memory_order_relaxed);
>              if (_generation.load(std::memory_order_relaxed) < r) {
> --
> 2.30.2
>
> --
> You received this message because you are subscribed to the Google Groups
> "OSv Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to osv-dev+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/osv-dev/20210509201256.876696-1-jwkozaczuk%40gmail.com
> .
>

-- 
You received this message because you are subscribed to the Google Groups "OSv 
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to osv-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/osv-dev/CANEVyjtTW%2Bh2-SO8ykxwaQ%2B8%2BNQDh3vav9POjkbGjvaFtdHUJQ%40mail.gmail.com.

Reply via email to