On 09/02/2018 07:57 AM, Cesar Philippidis wrote: > On 09/01/2018 12:04 PM, Tom de Vries wrote: >> On 08/31/2018 04:14 PM, Cesar Philippidis wrote: > >>> Is this patch OK for trunk? >>> >> >> Well, how did you test this ( >> https://gcc.gnu.org/contribute.html#patches : "Bootstrapping and >> testing. State the host and target combinations you used to do proper >> testing as described above, and the results of your testing.") ? > > I tested the standalone nvptx compiler. I'll retest with libgomp with > -misa=sm_35. Bootstrapping won't help much here, unfortunately. >>> +++ b/gcc/testsuite/gcc.target/nvptx/atomic_fetch-1.c >>> @@ -0,0 +1,24 @@ >>> +/* Test the nvptx atomic instructions for __atomic_fetch_OP for SM_35 >>> + targets. */ >>> + >>> +/* { dg-do compile } */ >>> +/* { dg-options "-O2 -misa=sm_35" } */ >>> + >>> +int >>> +main() >>> +{ >>> + unsigned long long a = ~0; >>> + unsigned b = 0xa; >>> + >>> + __atomic_fetch_add (&a, b, 0); >>> + __atomic_fetch_and (&a, b, 0); >>> + __atomic_fetch_or (&a, b, 0); >>> + __atomic_fetch_xor (&a, b, 0); >>> + >>> + return a; >>> +} >>> + >>> +/* { dg-final { scan-assembler "atom.add.u64" } } */ >>> +/* { dg-final { scan-assembler "atom.b64.and" } } */ >>> +/* { dg-final { scan-assembler "atom.b64.or" } } */ >>> +/* { dg-final { scan-assembler "atom.b64.xor" } } */ >>> -- 2.17.1 >>> >> >> Hmm, the add.u64 vs b64.and looks odd (and the scan-assembler-not >> testcase does not use this difference, so that needs to be fixed, or for >> bonus points, changed into a scan-assembler testcase). >> >> The documentation uses "op.type", we should fix the compiler to emit >> that consistently. Separate patch that fixes that pre-approved. > > ACK. I think there are a lot of other cases like that in the BE. > >> This is ok (with, as I mentioned above, the SI part split off into a >> separate patch), on the condition that you test libgomp with >> -foffload=-misa=sm_35.
Adding -foffload=misa=sm_35 didn't work because the host gcc doesn't support the -misa flag. When I forced the nvptx BE to set TARGET_SM35 to always be true, I ran into problems with SM_30 code linking against SM_35 code. Therefore, I don't think this patch is ready for trunk yet. By the way, is -misa really necessary for atomic_fetch_<logic><mode>? Looking at the PTX documentation I see <https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#changes-in-ptx-isa-version-3-1>: PTX ISA version 3.1 introduces the following new features: * Support for sm_35 target architecture. * Extends atomic and reduction instructions to perform 64-bit {and, or, xor} operations, and 64-bit integer {min, max} operations. Is there a table for which list which GPUs are compatible with which instructions? Thanks, Cesar