On Thu, Apr 27, 2023 at 09:22:50AM -0700, Patrick O'Neill wrote:
> This patchset aims to make the RISCV atomics implementation stronger
> than the recommended mapping present in table A.6 of the ISA manual.
> https://github.com/riscv/riscv-isa-manual/blob/c7cf84547b3aefacab5463add1734c1602b67a49/src/memory.tex#L1083-L1157
>  
> 
> Context
> ---------
> GCC defined RISC-V mappings [1] before the Memory Model task group
> finalized their work and provided the ISA Manual Table A.6/A.7 mappings[2].
> 
> For at least a year now, we've known that the mappings were different,
> but it wasn't clear if these unique mappings had correctness issues.
> 
> Andrea Parri found an issue with the GCC mappings, showing that
> atomic_compare_exchange_weak_explicit(-,-,-,release,relaxed) mappings do
> not enforce release ordering guarantees. (Meaning the GCC mappings have
> a correctness issue).
>   https://inbox.sourceware.org/gcc-patches/Y1GbJuhcBFpPGJQ0@andrea/ 
> 
> Why not A.6?
> ---------
> We can update our mappings now, so the obvious choice would be to
> implement Table A.6 (what LLVM implements/ISA manual recommends).
> 
> The reason why that isn't the best path forward for GCC is due to a
> proposal by Hans Boehm to add L{d|w|b|h}.aq/rl and S{d|w|b|h}.aq/rl.
> 
> For context, there is discussion about fast-tracking the addition of
> these instructions. The RISCV architectural review committee supports
> adopting a "new and common atomics ABI for gcc and LLVM toochains ...
> that assumes the addition of the preceding instructions”. That common
> ABI is likely to be A.7.
>   https://lists.riscv.org/g/tech-privileged/message/1284 
> 
> Transitioning from A.6 to A.7 will cause an ABI break. We can hedge
> against that risk by emitting a conservative fence after SEQ_CST stores
> to make the mapping compatible with both A.6 and A.7.
> 
> What does a mapping compatible with both A.6 & A.7 look like?
> ---------
> It is exactly the same as Table A.6, but SEQ_CST stores have a trailing
> fence rw,rw. It's strictly stronger than Table A.6.
> 
> Microbenchmark
> ---------
> Hans Boehm helpfully wrote a microbenchmark [3] that uses ARM to give a
> rough estimate for the performance benefits/penalties of the different
> mappings. The microbenchmark is single threaded and almost-write-only.
> This case seems unlikely but is useful for getting a rough idea of the
> workload that would be impacted the most.
> 
> Testcases
> -------
> Control: A simple volatile store. This is most similar to a relaxed
> store.
> Release Store: This is most similar to Sw.rl (one of the instructions in
> Hans' proposal).
> Store with release fence: This is most similar to the mapping present in
> Table A.6.
> Store with two fences: This is most similar to the compatibility mapping
> present in this patchset.
> 
> Machines
> -------
> Intel(R) Core(TM) i7-8650U (sanity check only): x86 TSO
> Cortex A53 (Raspberry pi): ARM In order core
> Cortex A55 (Pixel 6 Pro): ARM In order core
> Cortex A76 (Pixel 6 Pro): ARM Out of order core
> Cortex X1 (Pixel 6 Pro): ARM Out of order core
> 
> Microbenchmark Results [4]
> --------
> Units are nsecs per iteration.
> 
> Sanity check
> Machine          CONTROL   REL_STORE   STORE_REL_FENCE   STORE_TWO_FENCE
> -------          -------   ---------   ---------------   ---------------
> Intel i7-8650U 1.34812   1.30038     1.2933            18.0474
> 
> 
> Machine       CONTROL   REL_STORE   STORE_REL_FENCE   STORE_TWO_FENCE
> -------       -------   ---------   ---------------   ---------------
> Cortex A53    7.15224   10.7282     7.15221           10.013
> Cortex A55    2.77965   8.89654     4.44787           7.78331
> Cortex A76    1.78021   1.86095     5.33088           8.88462
> Cortex X1     2.14252   2.14258     4.32982           7.05234
> 
> Reordered tests (using -r flag on microbenchmark)
> Machine       CONTROL   REL_STORE   STORE_REL_FENCE   STORE_TWO_FENCE
> -------       -------   ---------   ---------------   ---------------
> Cortex A53    7.15227   10.7282     7.16113           10.034
> Cortex A55    2.78024   8.89574     4.44844           7.78428
> Cortex A76    1.77686   1.81081     5.3301            8.88346
> Cortex X1     2.14254   2.14251     4.3273            7.05239
> 
> Benchmark Interpretation
> --------
> As expected, out of order machines are significantly faster with the
> REL_STORE mappings. Unexpectedly, the in-order machines are
> significantly slower with REL_STORE rather than REL_STORE_FENCE.
> 
> Most machines in the wild are expected to use Table A.7 once the
> instructions are introduced. 
> Incurring this added cost now will make it easier for compiled RISC-V
> binaries to transition to the A.7 memory model mapping.
> 
> The performance benefits of moving to A.7 can be more clearly seen using
> an almost-all-load microbenchmark (included on page 3 of Hans’
> proposal). The code for that microbenchmark is attached below [5].
>   
> https://lists.riscv.org/g/tech-unprivileged/attachment/382/0/load-acquire110422.pdf
>  
>   https://lists.riscv.org/g/tech-unprivileged/topic/92916241 
> 
> Caveats
> --------
> This is a very synthetic microbenchmark that represents what is expected
> to be a very unlikely workload. Nevertheless, it's helpful to see the
> worst-case price we are paying for compatibility. 
> 
> “All times include an entire loop iteration, indirect dispatch and all.
> The benchmark alternates tests, but does not lock CPU frequency. Since a
> single core was in use, I expect this was running at basically full
> speed. Any throttling affected everything more or less uniformly.”
> - Hans Boehm
> 
> Patchset overview
> --------
> Patch 1 simplifies the memmodel to ignore MEMMODEL_SYNC_* cases (legacy
> cases that aren't handled differently for RISC-V).
> Patches 2-6 make the mappings strictly stronger.
> Patches 7-9 weaken the mappings to be in line with table A.6 of the ISA
> manual.
> Patch 11 adds some basic conformance tests to ensure the implemented
> mapping matches table A.6 with stronger SEQ_CST stores.
> 
> Conformance test cases notes
> --------
> The conformance tests in this patch are a good sanity check but do not
> guarantee exactly following Table A.6. It checks that the right
> instructions are emitted (ex. fence rw,r) but not the order of those
> instructions.
> 
> LLVM mapping notes
> --------
> LLVM emits corresponding fences for atomic_signal_fence instructions.
> This seems to be an oversight since AFAIK atomic_signal_fence acts as a
> compiler directive. GCC does not emit any fences for atomic_signal_fence
> instructions.
> 
> Future work
> --------
> There still remains some work to be done in this space after this
> patchset fixes the correctness of the GCC mappings. 
> * Look into explicitly handling subword loads/stores.
> * Look into using AMOSWAP.rl for store words/doubles.
> * L{b|h|w|d}.aq/rl & S{b|h|w|d}.aq/rl support once ratified.
> * zTSO mappings.
> 
> Prior Patchsets
> --------
> Patchset v1:
>   https://gcc.gnu.org/pipermail/gcc-patches/2022-April/592950.html 
> 
> Patchset v2:
>   https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615264.html 
> 
> Patchset v3:
>   https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615431.html 
> 
> Patchset v4:
>   https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615748.html 
> 
> Changelogs
> --------
> Changes for v2:
> * Use memmodel_base rather than a custom simplify_memmodel function
>   (Inspired by Christoph Muellner's patch 1/9)
> * Move instruction styling change from [v1 5/7] to [v2 3/8] to reduce
>   [v2 6/8]'s complexity
> * Eliminated %K flag for atomic store introduced in v1 in favor of
>   if/else
> * Rebase/test
> 
> Changes for v3:
> * Use a trailing fence for atomic stores to be compatible with table A.7
> * Emit an optimized fence r,rw following a SEQ_CST load
> * Consolidate tests in [PATCH v3 10/10]
> * Add tests for basic A.6 conformance
> 
> Changes for v4:
> * Update cover letter to cover more of the reasoning behind moving to a
>   compatibility mapping
> * Improve conformance testcases patch assertions and add new
>   compare-exchange testcases
> 
> Changes for v5:
> * Update cover letter to cover more context and reasoning behind moving
>   to a compatibility mapping
> * Rebase to include the subword-atomic cases introduced here:
>   https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616080.html
> * Add basic amo-add subword atomic testcases
> * Reformat changelogs
> * Fix misc. whitespace issues
> 
> [1] GCC port with mappings merged 06 Feb 2017
>   
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=09cae7507d9e88f2b05cf3a9404bf181e65ccbac
> 
> [2] A.6 mappings added to ISA manual 12 Dec 2017
> https://github.com/riscv/riscv-isa-manual/commit/9da1a115bcc4fe327f35acceb851d4850d12e9fa
>  
> 
> [3] Hans Boehm almost-all-store Microbenchmark:
> // Copyright 2023 Google LLC.
> // SPDX-License-Identifier: Apache-2.0
> 
> #include <atomic>
> #include <iostream>
> #include <time.h>
> 
> static constexpr int INNER_ITERS = 10'000'000;
> static constexpr int OUTER_ITERS = 20;
> static constexpr int N_TESTS = 4;
> 
> volatile int the_volatile(17);
> std::atomic<int> the_atomic(17);
> 
> void test1(int i) {
>   the_volatile = i;
> }
> 
> void test2(int i) {
>   the_atomic.store(i, std::memory_order_release);
> }
> 
> void test3(int i) {
>   atomic_thread_fence(std::memory_order_release);
>   the_atomic.store(i, std::memory_order_relaxed);
> }
> 
> void test4(int i) {
>   atomic_thread_fence(std::memory_order_release);
>   the_atomic.store(i, std::memory_order_relaxed);
>   atomic_thread_fence(std::memory_order_seq_cst);
> }
> 
> typedef void (*int_func)(int);
> 
> uint64_t getnanos() {
>   struct timespec result;
>   if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &result) != 0) {
>       std::cerr << "clock_gettime() failed\n";
>       exit(1);
>   }
>   return (uint64_t)result.tv_nsec + 1'000'000'000 * (uint64_t)result.tv_sec;
> }
> 
> int_func tests[N_TESTS] = { test1, test2, test3, test4 };
> const char *test_names[N_TESTS] =
>       { "control", "release store", "store with release fence", "store with 
> two fences" };
> uint64_t total_time[N_TESTS] = { 0 };
> 
> int main(int argc, char **argv) {
>   struct timespec res;
>   if (clock_getres(CLOCK_PROCESS_CPUTIME_ID, &res) != 0) {
>       std::cerr << "clock_getres() failed\n";
>       exit(1);
>   } else {
>       std::cout << "nsec resolution = " << res.tv_nsec << std::endl;
>   }
>   if (argc == 2 && argv[1][0] == 'r') {
>       // Run tests in reverse order.
>       for (int i = 0; i < N_TESTS / 2; ++i) {
>       std::swap(tests[i], tests[N_TESTS - 1 - i]);
>       std::swap(test_names[i], test_names[N_TESTS - 1 - i]);
>       }
>   }
>   for (int i = 0; i < OUTER_ITERS; ++i) {
>       // Alternate tests to minimize bias due to thermal throttling.
>       for (int j = 0; j < N_TESTS; ++j) {
>       uint64_t start_time = getnanos();
>       for (int k = 1; k <= INNER_ITERS; ++k) {
>       tests[j](k); // Provides memory accesses between tests.
>       }
>       // Ignore first iteration for all tests. The first iteration of the 
> first test is
>       // empirically slightly slower.
>       if (i != 0) {
>       total_time[j] += getnanos() - start_time;
>       }
>       if ((tests[j] == test1 ? the_volatile : the_atomic.load()) != 
> INNER_ITERS) {
>       std::cerr << "result check failed, test = " << j << ", " << 
> the_volatile << std::endl;
>       exit(1);
>       }
>       }
>   }
>   for (int i = 0; i < N_TESTS; ++i) {
>       double nsecs_per_iter = (double) total_time[i] / INNER_ITERS / 
> (OUTER_ITERS - 1);
>       std::cout << test_names[i] << " took " << nsecs_per_iter << " nseconds 
> per iteration\n";
>   }
>   exit(0);
> }
> 
> [4] Hans Boehm Raw Microbenchmark Results
> Intel(R) Core(TM) i7-8650U (sanity check only):
> 
> hboehm@hboehm-glaptop0:~/tests$ ./a.out
> nsec resolution = 1
> control took 1.34812 nseconds per iteration
> release store took 1.30038 nseconds per iteration
> store with release fence took 1.2933 nseconds per iteration
> store with two fences took 18.0474 nseconds per iteration
> 
> Cortex A53 (Raspberry pi)
> hboehm@rpi3-20210823:~/tests$ ./a.out
> nsec resolution = 1
> control took 7.15224 nseconds per iteration
> release store took 10.7282 nseconds per iteration
> store with release fence took 7.15221 nseconds per iteration
> store with two fences took 10.013 nseconds per iteration
> hboehm@rpi3-20210823:~/tests$ ./a.out -r
> nsec resolution = 1
> control took 7.15227 nseconds per iteration
> release store took 10.7282 nseconds per iteration
> store with release fence took 7.16133 nseconds per iteration
> store with two fences took 10.034 nseconds per iteration
> 
> Cortex A55 (Pixel 6 Pro)
> 
> raven:/data/tmp # taskset 0f ./release-timer
> nsec resolution = 1
> control took 2.77965 nseconds per iteration
> release store took 8.89654 nseconds per iteration
> store with release fence took 4.44787 nseconds per iteration
> store with two fences took 7.78331 nseconds per iteration
> raven:/data/tmp # taskset 0f ./release-timer -r                               
>                                          
> nsec resolution = 1
> control took 2.78024 nseconds per iteration
> release store took 8.89574 nseconds per iteration
> store with release fence took 4.44844 nseconds per iteration
> store with two fences took 7.78428 nseconds per iteration
> 
> Cortex A76 (Pixel 6 Pro)
> raven:/data/tmp # taskset 30 ./release-timer -r                               
>                                          
> nsec resolution = 1
> control took 1.77686 nseconds per iteration
> release store took 1.81081 nseconds per iteration
> store with release fence took 5.3301 nseconds per iteration
> store with two fences took 8.88346 nseconds per iteration
> raven:/data/tmp # taskset 30 ./release-timer                                  
>                                          
> nsec resolution = 1
> control took 1.78021 nseconds per iteration
> release store took 1.86095 nseconds per iteration
> store with release fence took 5.33088 nseconds per iteration
> store with two fences took 8.88462 nseconds per iteration
> 
> Cortex X1 (Pixel 6 Pro)
> raven:/data/tmp # taskset c0 ./release-timer                                  
>                                          
> nsec resolution = 1
> control took 2.14252 nseconds per iteration
> release store took 2.14258 nseconds per iteration
> store with release fence took 4.32982 nseconds per iteration
> store with two fences took 7.05234 nseconds per iteration
> raven:/data/tmp # taskset c0 ./release-timer -r                               
>                                          
> nsec resolution = 1
> control took 2.14254 nseconds per iteration
> release store took 2.14251 nseconds per iteration
> store with release fence took 4.3273 nseconds per iteration
> store with two fences took 7.05239 nseconds per iteration
> 
> [5] Hans Boehm almost-all-load Microbenchmark:
> // Copyright 2023 Google LLC.
> // SPDX-License-Identifier: Apache-2.0
> 
> #include <atomic>
> #include <iostream>
> #include <time.h>
> 
> static constexpr int INNER_ITERS = 10'000'000;
> static constexpr int OUTER_ITERS = 20;
> static constexpr int N_TESTS = 4;
> 
> volatile int the_volatile(17);
> std::atomic<int> the_atomic(17);
> 
> int test1() {
>   return the_volatile;
> }
> 
> int test2() {
>   return the_atomic.load(std::memory_order_acquire);
> }
> 
> int test3() {
>   int result = the_atomic.load(std::memory_order_relaxed);
>   atomic_thread_fence(std::memory_order_acquire);
>   return result;
> }
> 
> int test4() {
>   atomic_thread_fence(std::memory_order_seq_cst);
>   int result = the_atomic.load(std::memory_order_relaxed);
>   atomic_thread_fence(std::memory_order_acquire);
>   return result;
> }
> 
> typedef int (*int_func)();
> 
> uint64_t getnanos() {
>   struct timespec result;
>   if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &result) != 0) {
>       std::cerr << "clock_gettime() failed\n";
>       exit(1);
>   }
>   return (uint64_t)result.tv_nsec + 1'000'000'000 * (uint64_t)result.tv_sec;
> }
> 
> int_func tests[N_TESTS] = { test1, test2, test3, test4 };
> const char *test_names[N_TESTS] =
>       { "control", "acquire load", "load with acquire fence", "load with two 
> fences" };
> uint64_t total_time[N_TESTS] = { 0 };
> 
> uint sum, last_sum = 0;
> 
> int main(int argc, char **argv) {
>   struct timespec res;
>   if (clock_getres(CLOCK_PROCESS_CPUTIME_ID, &res) != 0) {
>       std::cerr << "clock_getres() failed\n";
>       exit(1);
>   } else {
>       std::cout << "nsec resolution = " << res.tv_nsec << std::endl;
>   }
>   if (argc == 2 && argv[1][0] == 'r') {
>       // Run tests in reverse order.
>       for (int i = 0; i < N_TESTS / 2; ++i) {
>       std::swap(tests[i], tests[N_TESTS - 1 - i]);
>       std::swap(test_names[i], test_names[N_TESTS - 1 - i]);
>       }
>   }
>   for (int i = 0; i < OUTER_ITERS; ++i) {
>       // Alternate tests to minimize bias due to thermal throttling.
>       for (int j = 0; j < N_TESTS; ++j) {
>       sum = 0;
>       uint64_t start_time = getnanos();
>       for (int k = 0; k < INNER_ITERS; ++k) {
>       sum += tests[j](); // Provides memory accesses between tests.
>       }
>       // Ignore first iteration for all tests. The first iteration of the 
> first test is
>       // empirically slightly slower.
>       if (i != 0) {
>       total_time[j] += getnanos() - start_time;
>       }
>       if (sum == 0 || last_sum != 0 && sum != last_sum) {
>       std::cerr << "result check failed";
>       exit(1);
>       }
>       last_sum = sum;
>       }
>   }
>   for (int i = 0; i < N_TESTS; ++i) {
>       double nsecs_per_iter = (double) total_time[i] / INNER_ITERS / 
> (OUTER_ITERS - 1);
>       std::cout << test_names[i] << " took " << nsecs_per_iter << " nseconds 
> per iteration\n";
>   }
>   exit(0);
> }
> 
> Patrick O'Neill (11):
>   RISC-V: Eliminate SYNC memory models
>   RISC-V: Enforce Libatomic LR/SC SEQ_CST
>   RISC-V: Enforce subword atomic LR/SC SEQ_CST
>   RISC-V: Enforce atomic compare_exchange SEQ_CST
>   RISC-V: Add AMO release bits
>   RISC-V: Strengthen atomic stores
>   RISC-V: Eliminate AMO op fences
>   RISC-V: Weaken LR/SC pairs
>   RISC-V: Weaken mem_thread_fence
>   RISC-V: Weaken atomic loads
>   RISC-V: Table A.6 conformance tests
> 
>  gcc/config/riscv/riscv-protos.h               |   3 +
>  gcc/config/riscv/riscv.cc                     |  66 ++++--
>  gcc/config/riscv/sync.md                      | 194 ++++++++++++------
>  .../riscv/amo-table-a-6-amo-add-1.c           |   8 +
>  .../riscv/amo-table-a-6-amo-add-2.c           |   8 +
>  .../riscv/amo-table-a-6-amo-add-3.c           |   8 +
>  .../riscv/amo-table-a-6-amo-add-4.c           |   8 +
>  .../riscv/amo-table-a-6-amo-add-5.c           |   8 +
>  .../riscv/amo-table-a-6-compare-exchange-1.c  |  10 +
>  .../riscv/amo-table-a-6-compare-exchange-2.c  |  10 +
>  .../riscv/amo-table-a-6-compare-exchange-3.c  |  10 +
>  .../riscv/amo-table-a-6-compare-exchange-4.c  |  10 +
>  .../riscv/amo-table-a-6-compare-exchange-5.c  |  10 +
>  .../riscv/amo-table-a-6-compare-exchange-6.c  |  11 +
>  .../riscv/amo-table-a-6-compare-exchange-7.c  |  10 +
>  .../gcc.target/riscv/amo-table-a-6-fence-1.c  |   8 +
>  .../gcc.target/riscv/amo-table-a-6-fence-2.c  |  10 +
>  .../gcc.target/riscv/amo-table-a-6-fence-3.c  |  10 +
>  .../gcc.target/riscv/amo-table-a-6-fence-4.c  |  10 +
>  .../gcc.target/riscv/amo-table-a-6-fence-5.c  |  10 +
>  .../gcc.target/riscv/amo-table-a-6-load-1.c   |   9 +
>  .../gcc.target/riscv/amo-table-a-6-load-2.c   |  11 +
>  .../gcc.target/riscv/amo-table-a-6-load-3.c   |  11 +
>  .../gcc.target/riscv/amo-table-a-6-store-1.c  |   9 +
>  .../gcc.target/riscv/amo-table-a-6-store-2.c  |  11 +
>  .../riscv/amo-table-a-6-store-compat-3.c      |  11 +
>  .../riscv/amo-table-a-6-subword-amo-add-1.c   |   9 +
>  .../riscv/amo-table-a-6-subword-amo-add-2.c   |   9 +
>  .../riscv/amo-table-a-6-subword-amo-add-3.c   |   9 +
>  .../riscv/amo-table-a-6-subword-amo-add-4.c   |   9 +
>  .../riscv/amo-table-a-6-subword-amo-add-5.c   |   9 +
>  gcc/testsuite/gcc.target/riscv/pr89835.c      |   9 +
>  libgcc/config/riscv/atomic.c                  |   4 +-
>  33 files changed, 467 insertions(+), 75 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-5.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-1.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-2.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-3.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-4.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-5.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-6.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-7.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-load-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-load-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-load-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-store-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-store-2.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/amo-table-a-6-store-compat-3.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-1.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-2.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-3.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-4.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr89835.c

These changes address and fix all the issues I reported/I'm aware of,
thank you!

Tested-by: Andrea Parri <and...@rivosinc.com>

  Andrea

Reply via email to