Hi!

Many thanks, Alexandre and Tobias, for working this out together!


On 2020-06-23T06:50:26-0300, Alexandre Oliva <ol...@adacore.com> wrote:
> On Jun  9, 2020, Thomas Schwinge <tho...@codesourcery.com> wrote:
>
>> Previously, for '-foffload=nvptx-none -foffload=-fdump-rtl-mach
>> -save-temps -o ./nvptx-merged-loop.exe', GCC produced the expected
>> 'nvptx-merged-loop.o.307r.mach'.
>
> I believe the patch I've just installed fixes the UNRESOLVED results
> caused by not finding dump files.

Yes, confirmed, thanks!

A few small issues remain; I'll respond to those elsewhere in this
thread.


>> Consider 'libgomp.oacc-c-c++-common/pr85381-2.c':
>
>>     /* { dg-additional-options "-save-temps" } */
>
>>     /* { dg-final { scan-assembler-times "bar.sync" 2 } } */
>
>> This expects to scan the PTX offloading compilation assembler code (not
>> host code!), expecting that nvptx offloading code assembly is produced
>> after the host code, and thus overwrites the latter file.  (Yes, that's
>> certainly ugly/fragile...)
>
> I'm afraid this will need further adjusting in the testsuite, as we'll
> store the nvptx asm saved aux output in a separate file.
> scan-assembler-times will no longer work for this purpose, we'll need
> something that knows how to find the offloaded asm.

So, that's (now?) easy enough to repair.  I've pushed "[testsuite]
Replace fragile 'scan-assembler' with 'scan-offload-rtl' in
'libgomp.oacc-c-c++-common/pr85381*.c'", see attached.
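
For illustration only, here is a minimal sketch of the pattern the test
cases now follow.  The function 'fill_and_check' and its loop are
hypothetical and not part of the patch; only the two directives are taken
from it, and the expected count of 2 is a placeholder, as the real number
depends on the loop structure.  A complete test would add a 'main' that
returns the result of 'fill_and_check'.

```c
/* Hypothetical test body; only the dg- directives below come from the patch.  */
/* { dg-do run { target openacc_nvidia_accel_selected } } */
/* Request the nvptx 'mach' RTL dump from the offload compiler directly,
   instead of relying on '-save-temps' assembler output.  */
/* { dg-additional-options "-foffload=-fdump-rtl-mach" } */

#define N 1024

/* Fill v[i] = 2 * i inside an OpenACC parallel loop, then verify on the
   host.  Returns 0 on success, 1 on mismatch.  Without '-fopenacc' the
   pragma is ignored and the loop simply runs on the host.  */
int
fill_and_check (void)
{
  int v[N];

#pragma acc parallel loop copyout (v)
  for (int i = 0; i < N; i++)
    v[i] = 2 * i;

  for (int i = 0; i < N; i++)
    if (v[i] != 2 * i)
      return 1;

  return 0;
}

/* Scan the offload RTL dump, not the host assembler file.  The count is a
   placeholder for this sketch.  */
/* { dg-final { scan-offload-rtl-dump-times "nvptx_barsync" 2 "mach" } } */
```

The point is that the scan no longer depends on which compilation happens
to write the assembler file last.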


Regards
 Thomas


-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
From 8a8efad09811b5c08cfd9d469c6f1b6ba0c848f1 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <tho...@codesourcery.com>
Date: Tue, 30 Jun 2020 05:24:17 +0200
Subject: [PATCH] [testsuite] Replace fragile 'scan-assembler' with
 'scan-offload-rtl' in 'libgomp.oacc-c-c++-common/pr85381*.c'

These test cases use directives similar to:

    /* { dg-additional-options "-save-temps" } */

    /* { dg-final { scan-assembler-times "bar.sync" 2 } } */

This expects to scan the PTX offloading compilation assembler code (not host
code!), expecting that nvptx offloading code assembly is produced after the
host code, and thus overwrites the latter file.  (Yes, that's certainly
ugly/fragile...)

..., and this broke with recent commit 1dedc12d186a110854537e1279b4e6c29f2df35a
"revamp dump and aux output names" plus fix-up commit
efc16503ca10bc0e934e0bace5777500e4dc757a "handle dumpbase in offloading, adjust
testsuite" (short summary: file names changed), so let's finally make that
robust.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/pr85381-2.c: Replace fragile
	'scan-assembler' with 'scan-offload-rtl'.
	* testsuite/libgomp.oacc-c-c++-common/pr85381-3.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pr85381-4.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pr85381-5.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pr85381.c: Likewise.
---
 libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c | 4 ++--
 libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-3.c | 4 ++--
 libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-4.c | 4 ++--
 libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-5.c | 4 ++--
 libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381.c   | 4 ++--
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c
index 6570c64afff5..84b9c01443e5 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-2.c
@@ -1,6 +1,6 @@
-/* { dg-additional-options "-save-temps" } */
 /* { dg-do run { target openacc_nvidia_accel_selected } }
    { dg-skip-if "" { *-*-* } { "*" } { "-O2" } } */
+/* { dg-additional-options "-foffload=-fdump-rtl-mach" } */
 
 int
 main (void)
@@ -33,4 +33,4 @@ main (void)
 
    so the loop is not recognized as empty loop (which we detect by seeing if
    joining immediately follows forked).  */
-/* { dg-final { scan-assembler-times "bar.sync" 2 } } */
+/* { dg-final { scan-offload-rtl-dump-times "nvptx_barsync" 2 "mach" } } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-3.c
index c5d1c5add68e..cddbf2719067 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-3.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-3.c
@@ -1,6 +1,6 @@
-/* { dg-additional-options "-save-temps -w" } */
 /* { dg-do run { target openacc_nvidia_accel_selected } }
    { dg-skip-if "" { *-*-* } { "*" } { "-O2" } } */
+/* { dg-additional-options "-foffload=-fdump-rtl-mach" } */
 
 int a;
 #pragma acc declare create(a)
@@ -32,4 +32,4 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-assembler-not "bar.sync" } } */
+/* { dg-final { scan-offload-rtl-dump-not "nvptx_barsync" "mach" } } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-4.c
index d955d79718df..e1679444172c 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-4.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-4.c
@@ -1,6 +1,6 @@
-/* { dg-additional-options "-save-temps -w" } */
 /* { dg-do run { target openacc_nvidia_accel_selected } }
    { dg-skip-if "" { *-*-* } { "*" } { "-O2" } } */
+/* { dg-additional-options "-foffload=-fdump-rtl-mach" } */
 
 #define n 1024
 
@@ -24,4 +24,4 @@ main (void)
 /* Atm, %ntid.y is broadcast from one loop to the next, so there are 2 bar.syncs
    for that (the other two are there for the same reason as in pr85381-2.c).
    Todo: Recompute %ntid.y instead of broadcasting it. */
-/* { dg-final { scan-assembler-times "bar.sync" 4 } } */
+/* { dg-final { scan-offload-rtl-dump-times "nvptx_barsync" 4 "mach" } } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-5.c
index 61e7e48f0c93..26ca5093c47d 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-5.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381-5.c
@@ -1,6 +1,6 @@
-/* { dg-additional-options "-save-temps" } */
 /* { dg-do run { target openacc_nvidia_accel_selected } }
    { dg-skip-if "" { *-*-* } { "*" } { "-O2" } } */
+/* { dg-additional-options "-foffload=-fdump-rtl-mach" } */
 
 #define n 1024
 
@@ -21,4 +21,4 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-assembler-not "bar.sync" } } */
+/* { dg-final { scan-offload-rtl-dump-not "nvptx_barsync" "mach" } } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381.c
index 2864dfcf3cb1..eda87743625b 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85381.c
@@ -1,6 +1,6 @@
-/* { dg-additional-options "-save-temps" } */
 /* { dg-do run { target openacc_nvidia_accel_selected } }
    { dg-skip-if "" { *-*-* } { "*" } { "-O2" } } */
+/* { dg-additional-options "-foffload=-fdump-rtl-mach" } */
 
 int
 main (void)
@@ -15,4 +15,4 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-assembler-not "bar.sync" } } */
+/* { dg-final { scan-offload-rtl-dump-not "nvptx_barsync" "mach" } } */
-- 
2.27.0
