Re: -fprofile-update=atomic vs. 32-bit architectures

Sebastian Huber Wed, 07 Dec 2022 01:55:31 -0800



On 04.11.22 09:27, Sebastian Huber wrote:

Hello,
even recent 32-bit architectures such as RISC-V do not support 64-bit atomic operations. Using -fprofile-update=atomic for the 32-bit RISC-V RV32GC ISA yields:
warning: target does not support atomic profile update, single mode is selected
For multi-threaded applications it is quite important to use atomic counter increments to get valid coverage data. I think this fallback is not really good. Maybe we should consider using this approach from Jakub Jelinek for 32-bit architectures lacking 64-bit atomic operations:
if (__atomic_add_fetch_4 ((unsigned int *) &val, 1, __ATOMIC_RELAXED) == 0)
  __atomic_fetch_add_4 (((unsigned int *) &val) + 1, 1, __ATOMIC_RELAXED);
https://patchwork.ozlabs.org/project/gcc/patch/19c4a81d-6ecd-8c6e-b641-e257c1959...@suse.cz/#1447334
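A minimal sketch of this split increment, assuming GCC's generic __atomic builtins. The struct split_counter and the function names are hypothetical and only illustrate the idea; naming the two halves explicitly avoids depending on the target's endianness, which the pointer arithmetic above implicitly does:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical view of a 64-bit counter as two 32-bit words.  */
struct split_counter
{
  uint32_t low;
  uint32_t high;
};

/* Increment the low word atomically; carry into the high word only
   when the low word wrapped around to 0.  */
void
split_counter_inc (struct split_counter *c)
{
  if (__atomic_add_fetch (&c->low, 1, __ATOMIC_RELAXED) == 0)
    __atomic_fetch_add (&c->high, 1, __ATOMIC_RELAXED);
}

/* Combine both words.  The result is only reliable once no increments
   are in flight, e.g. when the counters are dumped at exit.  */
uint64_t
split_counter_value (struct split_counter *c)
{
  uint32_t low = __atomic_load_n (&c->low, __ATOMIC_RELAXED);
  uint32_t high = __atomic_load_n (&c->high, __ATOMIC_RELAXED);
  return ((uint64_t) high << 32) | low;
}

/* Single-threaded self-check of the carry propagation.  */
void
split_counter_selfcheck (void)
{
  struct split_counter c = { 0xffffffffu, 0u };
  split_counter_inc (&c);
  assert (c.low == 0u && c.high == 1u);
}
```

Between the two atomic operations another thread may briefly observe a wrapped low word with a stale high word, so the combined value is only eventually correct.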
Last year I added the TARGET_GCOV_TYPE_SIZE target hook to optionally reduce the gcov type size to 32 bits. I am not really sure if this was a good idea. Longer running executables may observe counter overflows leading to invalid coverage data. If someone wants atomic updates, then the updates should be atomic even if this means using a library implementation (libatomic).
What about the following approach if -fprofile-update=atomic is given:

1. Use 64-bit atomics if available.

2. Use
if (__atomic_add_fetch_4 ((unsigned int *) &val, 1, __ATOMIC_RELAXED) == 0)
  __atomic_fetch_add_4 (((unsigned int *) &val) + 1, 1, __ATOMIC_RELAXED);
if 32-bit atomics are available.

This approach works fine for the edge counters in gimple_gen_edge_profiler() because we don't have to read the counter value. We just have to do an increment. In gimple_gen_time_profiler() we have to do this:


/* Emit: counters[0] = ++__gcov_time_profiler_counter.  */

So here we have to do an atomic increment and fetch the value. This doesn't work with the approach above. For example, let thread A increment the lower part from 0xfffffffe to 0xffffffff, then let thread B increment the lower part from 0xffffffff to 0x0 and the higher part from 0x7 to 0x8, then let thread A read the higher part as 0x8. Thread A would then get 0x8_ffffffff instead of the correct 0x7_ffffffff.


3. Else use a library call (libatomic).
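The problematic interleaving for the time profiler can be replayed deterministically in a single thread. This is only a sketch under the assumption that an increment-and-fetch would be assembled from the two 32-bit halves; the function name and local variables are hypothetical, and the comments mark which thread would perform each step:

```c
#include <assert.h>
#include <stdint.h>

/* Replay the interleaving step by step and return the 64-bit value
   that thread A would assemble from its two separate accesses.  */
uint64_t
replay_split_fetch_race (void)
{
  uint32_t low = 0xfffffffeu;
  uint32_t high = 0x7u;

  /* Thread A: low part 0xfffffffe -> 0xffffffff, no carry needed.  */
  uint32_t a_low = __atomic_add_fetch (&low, 1, __ATOMIC_RELAXED);
  assert (a_low == 0xffffffffu);

  /* Thread B: low part 0xffffffff -> 0x0, so carry high 0x7 -> 0x8.  */
  if (__atomic_add_fetch (&low, 1, __ATOMIC_RELAXED) == 0)
    __atomic_fetch_add (&high, 1, __ATOMIC_RELAXED);

  /* Thread A: reads the high part only now and sees B's carry.  */
  uint32_t a_high = __atomic_load_n (&high, __ATOMIC_RELAXED);

  return ((uint64_t) a_high << 32) | a_low;
}
```

The function returns 0x8_ffffffff, while the value that matches thread A's own increment is 0x7_ffffffff. A single 64-bit atomic operation, native or provided by libatomic, does not have this window.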


--
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Court of registration: Amtsgericht München
Registration number: HRB 157899
Managing directors with power of representation: Peter Rasmussen, Thomas Dörfler
Our privacy policy can be found here:
https://embedded-brains.de/datenschutzerklaerung/
