Long standing issue but as top-level 'asm' statement were rare, it did not show 
up.
However, the fix for PR108969 in commit r14-321-g9a41d2cdbcd added code

+#elif defined(_GLIBCXX_SYMVER_GNU)
+  __extension__ __asm (".globl _ZSt21ios_base_library_initv");
q
libstdc++-v3/include/std/iostream. This was then duly written by the 
offloading-device
lto1 for digestion by the device-target assembler. While the llvm-mc linker 
user by
GCN did accept .globl, nvptx's ptxas did choke on it.

Additionally, as the assembly was already written for offloading, the output was
lost on the host when using LTO for not only for offload but for real (i.e. 
with -flto).

Has someone an idea how to check whether the offloading-code assembler does not
contain the _ZSt21ios_base_library_initv while the host-side (before or after 
LTO)
should contain it, but only with _GLIBCXX_SYMVER_GNU?
Otherwise, the testcase tests only and at least whether it breaks with nvptx
as ptxas does not like the symbol.

* * *

Tested (manually + running the OvO and sollve-testsuite) on x86-64-gnu-linux 
with nvptx
offloading and with "make check -k" on x86-64-gnu-linux, albeit without 
offloading configured.
The installed-build regtesting of "make check-target-libgomp" seems to be 
currently broken
as it does run all checking code (check_effective_target...) but does not seem 
to find
any actual testcase to be run, probably a side effect of the recent testsuite 
changes.

OK for mainline and GCC 13?

Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
LTO: Fix writing of toplevel asm with offloading [PR109816]

When offloading was enabled, top-level 'asm' were added to the offloading section,
confusing assemblers which did not support the syntax. Additionally, with offloading
and -flto, the top-level assembler code did not end up in the host files.

As r14-321-g9a41d2cdbcd added top-level 'asm' to some libstdc++ header files, the issue became
more apparent, causing fails with nvptx for C++ testcases.

	PR libstdc++/109816

gcc/ChangeLog:
	* lto-cgraph.cc (output_symtab): Guard lto_output_toplevel_asms by
	'!lto_stream_offload_p'.

libgomp/ChangeLog:

	* testsuite/libgomp.c++/target-map-class-1.C: New test.
	* testsuite/libgomp.c++/target-map-class-2.C: New test.

 gcc/lto-cgraph.cc                                  |  2 +-
 libgomp/testsuite/libgomp.c++/target-map-class-1.C | 98 ++++++++++++++++++++++
 libgomp/testsuite/libgomp.c++/target-map-class-2.C |  6 ++
 3 files changed, 105 insertions(+), 1 deletion(-)

diff --git a/gcc/lto-cgraph.cc b/gcc/lto-cgraph.cc
index 805c785..aed5e9d 100644
--- a/gcc/lto-cgraph.cc
+++ b/gcc/lto-cgraph.cc
@@ -1020,7 +1020,7 @@ output_symtab (void)
      When doing WPA we must output every asm just once.  Since we do not partition asm
      nodes at all, output them to first output.  This is kind of hack, but should work
      well.  */
-  if (!asm_nodes_output)
+  if (!asm_nodes_output && !lto_stream_offload_p)
     {
       asm_nodes_output = true;
       lto_output_toplevel_asms ();
diff --git a/libgomp/testsuite/libgomp.c++/target-map-class-1.C b/libgomp/testsuite/libgomp.c++/target-map-class-1.C
new file mode 100644
index 0000000..ad4802d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c++/target-map-class-1.C
@@ -0,0 +1,98 @@
+/* PR middle-end/109816  */
+
+/* This variant: without -flto, see target-map-class-2.C for -flto. */
+
+/* iostream.h adds 'globl _ZSt21ios_base_library_initv' with _GLIBCXX_SYMVER_GNU,
+   but it shouldn't end up in the offload assembly but only in the host assembly. */
+
+/* Example based on sollve_vv's test_target_data_map_classes.cpp; however,
+   relevant is only the 'include' and not the actual executable code.  */
+
+#include <iostream>
+#include <omp.h>
+
+using namespace std;
+
+#define N 1000
+
+struct A
+{
+  int *h_array;
+  int size, sum;
+
+  A (int *array, const int s) : h_array(array), size(s), sum(0) { }
+  ~A() { h_array = NULL; }
+};
+
+void
+test_map_tofrom_class_heap ()
+{
+  int *array = new int[N];
+  A *obj = new A (array, N);
+
+  #pragma omp target map(from: array[:N]) map(tofrom: obj[:1])
+    {
+      int *tmp_h_array = obj->h_array;
+      obj->h_array = array;
+      int tmp = 0;
+      for (int i = 0; i < N; ++i)
+	{
+	  obj->h_array[i] = 4*i;
+	  tmp += 3;
+	}
+      obj->h_array = tmp_h_array;
+      obj->sum = tmp;
+    }
+
+  for (int i = 0; i < N; ++i)
+    if (obj->h_array[i] != 4*i)
+      __builtin_abort ();
+
+  if (3*N != obj->sum)
+    {
+      std::cout << "sum: " << obj->sum << std::endl;
+      __builtin_abort ();
+    }
+
+  delete obj;
+  delete[] array;
+}
+
+void
+test_map_tofrom_class_stack ()
+{
+  int array[N];
+  A obj(array, N);
+
+  #pragma omp target map(from: array[:N]) map(tofrom: obj)
+    {
+      int *tmp_h_array = obj.h_array;
+      obj.h_array = array;
+      int tmp = 0;
+      for (int i = 0; i < N; ++i)
+	{
+	  obj.h_array[i] = 7*i;
+	  tmp += 5;
+	}
+      obj.h_array = tmp_h_array;
+      obj.sum = tmp;
+    }
+
+  for (int i = 0; i < N; ++i)
+    if (obj.h_array[i] != 7*i)
+      __builtin_abort ();
+
+  if (5*N != obj.sum)
+    {
+      std::cout << "sum: " << obj.sum << std::endl;
+      __builtin_abort ();
+    }
+}
+
+int
+main()
+{
+  test_map_tofrom_class_heap();
+  test_map_tofrom_class_stack();
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c++/target-map-class-2.C b/libgomp/testsuite/libgomp.c++/target-map-class-2.C
new file mode 100644
index 0000000..1ef20f7
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c++/target-map-class-2.C
@@ -0,0 +1,6 @@
+/* { dg-additional-options "-flto" }  */
+/* PR middle-end/109816  */
+
+/* This variant: with -flto, see target-map-class-1.C for without -flto. */
+
+#include "target-map-class-1.C"

Reply via email to