[Bug target/118794] The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..

2025-04-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118794

Thomas Schwinge  changed:

               What|Removed                       |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                      |RESOLVED
   Target Milestone|---                           |15.0
         Resolution|---                           |FIXED

--- Comment #13 from Thomas Schwinge  ---
This code now works with (the upcoming) GCC 15, without requiring optimizations
enabled, or host-side '-fno-exceptions'.  Tested host, Nvidia GPU offloading,
AMD GPU offloading.

[Bug target/118794] The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..

2025-04-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118794

--- Comment #12 from GCC Commits  ---
The trunk branch has been updated by Thomas Schwinge :

https://gcc.gnu.org/g:fe283dba774be57b705a7a871b000d2894d2e553

commit r15-9470-gfe283dba774be57b705a7a871b000d2894d2e553
Author: Thomas Schwinge 
Date:   Fri Mar 28 09:20:49 2025 +0100

GCN, nvptx: Support '-mfake-exceptions', and use it for offloading
compilation [PR118794]

With '-mfake-exceptions' enabled, the user-visible behavior in presence of
exception handling constructs changes such that the compile-time
'sorry, unimplemented: exception handling not supported' is skipped, code
generation proceeds, and instead, exception handling constructs 'abort' at
run time.  (..., or don't, if they're in dead code.)

PR target/118794
gcc/
* config/gcn/gcn.opt (-mfake-exceptions): Support.
* config/nvptx/nvptx.opt (-mfake-exceptions): Likewise.
* config/gcn/gcn.md (define_expand "exception_receiver"): Use it.
* config/nvptx/nvptx.md (define_expand "exception_receiver"):
Likewise.
* config/gcn/mkoffload.cc (main): Set it.
* config/nvptx/mkoffload.cc (main): Likewise.
* config/nvptx/nvptx.cc (nvptx_assemble_integer): Special handling for
'SYMBOL_REF's.
* except.cc (expand_dw2_landing_pad_for_region): Don't generate
bogus code for (default)
'#define EH_RETURN_DATA_REGNO(N) INVALID_REGNUM'.
libgcc/
* config/gcn/unwind-gcn.c (_Unwind_Resume): New.
* config/nvptx/unwind-nvptx.c (_Unwind_Resume): Likewise.
gcc/testsuite/
* g++.target/gcn/exceptions-bad_cast-2.C: Set
'-mno-fake-exceptions'.
* g++.target/gcn/exceptions-pr118794-1.C: Likewise.
* g++.target/gcn/exceptions-throw-2.C: Likewise.
* g++.target/nvptx/exceptions-bad_cast-2.C: Likewise.
* g++.target/nvptx/exceptions-pr118794-1.C: Likewise.
* g++.target/nvptx/exceptions-throw-2.C: Likewise.
* g++.target/gcn/exceptions-bad_cast-2_-mfake-exceptions.C: New.
* g++.target/gcn/exceptions-pr118794-1_-mfake-exceptions.C:
Likewise.
* g++.target/gcn/exceptions-throw-2_-mfake-exceptions.C: Likewise.
* g++.target/nvptx/exceptions-bad_cast-2_-mfake-exceptions.C:
Likewise.
* g++.target/nvptx/exceptions-pr118794-1_-mfake-exceptions.C:
Likewise.
* g++.target/nvptx/exceptions-throw-2_-mfake-exceptions.C:
Likewise.
libgomp/
* testsuite/libgomp.c++/target-exceptions-bad_cast-2-offload-sorry-GCN.C:
Set '-foffload-options=-mno-fake-exceptions'.
* testsuite/libgomp.c++/target-exceptions-bad_cast-2-offload-sorry-nvptx.C:
Likewise.
* testsuite/libgomp.c++/target-exceptions-pr118794-1-offload-sorry-GCN.C:
Likewise.
* testsuite/libgomp.c++/target-exceptions-pr118794-1-offload-sorry-nvptx.C:
Likewise.
* testsuite/libgomp.c++/target-exceptions-throw-2-offload-sorry-GCN.C:
Likewise.
* testsuite/libgomp.c++/target-exceptions-throw-2-offload-sorry-nvptx.C:
Likewise.
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-2-offload-sorry-GCN.C:
Likewise.
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-2-offload-sorry-nvptx.C:
Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-2-offload-sorry-GCN.C:
Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-2-offload-sorry-nvptx.C:
Likewise.
* testsuite/libgomp.c++/target-exceptions-bad_cast-2.C: Adjust.
* testsuite/libgomp.c++/target-exceptions-pr118794-1.C: Likewise.
* testsuite/libgomp.c++/target-exceptions-throw-2.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-2.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-2.C: Likewise.
* testsuite/libgomp.c++/target-exceptions-throw-2-O0.C: New.
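
To illustrate the behavior the commit message describes, here is a minimal sketch of an exception handling construct inside an OpenMP 'target' region. The function name 'checked_divide' and its logic are hypothetical and are not taken from the attached main.cpp or from the new test cases. Going by the commit message only: with '-mfake-exceptions' (which mkoffload now sets for offloading compilation), a construct like this should no longer be rejected at compile time, and would 'abort' at run time if the device actually reached the 'throw'. On the host, exceptions keep working as usual.

```cpp
// Hypothetical example: an exception handling construct in a 'target' region.
int checked_divide(int a, int b)
{
    int result = 0;
    // Without OpenMP enabled this pragma is ignored and the block runs on the host.
    #pragma omp target map(tofrom: result)
    {
        try {
            if (b == 0)
                throw 0;      // per the commit message: 'abort's on the device if reached
            result = a / b;   // the normal path never touches the 'throw'
        } catch (int) {
            result = -1;      // reached on the host, where exceptions are real
        }
    }
    return result;
}
```

Per the ChangeLog above, passing '-foffload-options=-mno-fake-exceptions' is what the '-offload-sorry-' test cases use to get the previous compile-time 'sorry, unimplemented' diagnostic back.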

[Bug target/118794] The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..

2025-04-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118794

--- Comment #11 from GCC Commits  ---
The trunk branch has been updated by Thomas Schwinge :

https://gcc.gnu.org/g:aa3e72f943032e5f074b2bd2fd06d130dda8760b

commit r15-9463-gaa3e72f943032e5f074b2bd2fd06d130dda8760b
Author: Thomas Schwinge 
Date:   Thu Mar 27 23:06:37 2025 +0100

Add test cases for exception handling constructs in dead code for GCN,
nvptx target and OpenMP 'target' offloading [PR118794]

PR target/118794
gcc/testsuite/
* g++.target/gcn/exceptions-pr118794-1.C: New.
* g++.target/nvptx/exceptions-pr118794-1.C: Likewise.
libgomp/
* testsuite/libgomp.c++/target-exceptions-pr118794-1.C: New.
* testsuite/libgomp.c++/target-exceptions-pr118794-1-offload-sorry-GCN.C:
Likewise.
* testsuite/libgomp.c++/target-exceptions-pr118794-1-offload-sorry-nvptx.C:
Likewise.

[Bug target/118794] The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..

2025-03-27 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118794

Thomas Schwinge  changed:

               What|Removed                       |Added
----------------------------------------------------------------------------
   Last reconfirmed|                              |2025-03-27
           Keywords|                              |openacc, openmp
     Ever confirmed|0                             |1
             Target|nvptx                         |GCN, nvptx
           Assignee|unassigned at gcc dot gnu.org |tschwinge at gcc dot gnu.org
                 CC|                              |tschwinge at gcc dot gnu.org
             Status|UNCONFIRMED                   |ASSIGNED

--- Comment #10 from Thomas Schwinge  ---
Indeed, for '-O0' compilation, you currently need host-side '-fno-exceptions' to
prevent GCN as well as nvptx offloading compilation from failing with 'sorry,
unimplemented: exception handling not supported', which I'm aware we should
resolve properly.

With optimizations enabled, or '-O0 -fno-exceptions', this code now works with
(the upcoming) GCC 15: compiles, executes successfully (tested Nvidia GPU),
produces output (that I've not verified in detail, but have 'diff'ed to host
output: matches, apart from a few numeric differences for E-14 etc. numbers;
probably rounding differences).

[Bug target/118794] The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..

2025-02-10 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118794

Richard Biener  changed:

               What|Removed                       |Added
----------------------------------------------------------------------------
             Target|                              |nvptx

--- Comment #9 from Richard Biener  ---
Hmm, try -fno-exceptions?  It might be because of SJLJ EH - what target are you
offloading to?  I suppose nvptx?

Try declaring sqrt as noexcept (or nothrow, don't know C++ enough off-head).
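
One way to read the 'noexcept' suggestion is a thin wrapper around the library call, sketched below. The name 'sqrt_noexcept' is hypothetical, and nothing in this thread confirms that such a wrapper avoids the offloading error; it only shows what the declaration would look like.

```cpp
#include <cmath>

// Hypothetical wrapper: promises the compiler that this call never throws,
// so callers need no exception handling edges around it.
template <typename T>
inline T sqrt_noexcept(T x) noexcept
{
    return std::sqrt(x);
}
```

The reporter's code would then call 'sqrt_noexcept(norm)' instead of 'sqrt(norm)'.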

[Bug target/118794] The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..

2025-02-07 Thread schulz.benjamin at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118794

--- Comment #8 from Benjamin Schulz  ---
With the following, the compiler is satisfied:

// Normalize v
T norm=0;
// T norm=fabs(gpu_dot_product_w(v,v));

T normc= sqrt(norm);

// const T normc=norm;
#pragma omp parallel for
for (size_t i = 0; i < pext0; ++i)
{
    v(i,pstrv0)= v(i,pstrv0)/normc;
}

However, gpu_dot_product_w is called earlier, as

T dot_pr=gpu_dot_product_w(u,v);

which does not involve any nonlocal gotos. It is defined like this:

template <typename T>
inline T gpu_dot_product_w(const datastruct<T>& vec1, const datastruct<T>& vec2)
{
    const size_t n=vec1.pextents[0];
    const size_t strv1=vec1.pstrides[0];
    const size_t strv2=vec2.pstrides[0];
    T result=0;

#pragma omp parallel for reduction(+:result)
    for (size_t i = 0; i < n; ++i)
    {
        result += vec1(i,strv1) * vec2(i,strv2);
    }
    return result;
}

and the operators are these:

#pragma omp begin declare target
template <typename T>
inline T& datastruct<T>::operator()(const size_t row, const size_t stride)
{
    return pdata[row * stride];
}
#pragma omp end declare target

None of this has anything to do with gotos; pstrides are pointers to non-STL
arrays.

I do not know what is going on here...
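
For readers without the attachment, the fragments quoted in this comment can be put together into a self-contained sketch. The 'datastruct' below is a hypothetical stand-in (the real definition is in the attached main.cpp and is not shown in this thread), and the function name 'normalize_vector' is also an assumption. The loops use plain 'omp parallel for' as quoted; compiled without OpenMP, the pragmas are ignored and the code runs serially on the host.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for the reporter's 'datastruct'.
template <typename T>
struct datastruct {
    T *pdata;               // element storage
    std::size_t *pextents;  // extents per dimension (non-STL array)
    std::size_t *pstrides;  // strides per dimension (non-STL array)

    T &operator()(std::size_t row, std::size_t stride)
    { return pdata[row * stride]; }
    const T &operator()(std::size_t row, std::size_t stride) const
    { return pdata[row * stride]; }
};

// Dot product as quoted in the comment above.
template <typename T>
inline T gpu_dot_product_w(const datastruct<T> &vec1, const datastruct<T> &vec2)
{
    const std::size_t n = vec1.pextents[0];
    const std::size_t strv1 = vec1.pstrides[0];
    const std::size_t strv2 = vec2.pstrides[0];
    T result = 0;
    #pragma omp parallel for reduction(+:result)
    for (std::size_t i = 0; i < n; ++i)
        result += vec1(i, strv1) * vec2(i, strv2);
    return result;
}

// Normalization step: the variant with fabs() and sqrt() that the
// reporter says did not compile for offloading.
template <typename T>
inline void normalize_vector(datastruct<T> &v)
{
    const std::size_t pext0 = v.pextents[0];
    const std::size_t pstrv0 = v.pstrides[0];
    const T norm = std::fabs(gpu_dot_product_w(v, v));
    const T normc = std::sqrt(norm);
    #pragma omp parallel for
    for (std::size_t i = 0; i < pext0; ++i)
        v(i, pstrv0) = v(i, pstrv0) / normc;
}
```

For example, normalizing the vector (3, 4) with extent 2 and stride 1 gives a dot product of 25 and a result close to (0.6, 0.8).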

[Bug target/118794] The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..

2025-02-07 Thread schulz.benjamin at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118794

--- Comment #6 from Benjamin Schulz  ---
Hi, thanks for the fast reply. Unfortunately, none of these work...

Yes, adding the -fno-math-errno option still raises this error, even if I
put it into -offload...

Even if I try -foffload= -fno-math-errno, the assert does not work either, and
the __builtin_unreachable option does not work either.

One problem is that there is no unsigned double in C++...

Strangely, not even this compiles:

T norm=fabs(gpu_dot_product_w(v,v));

T normc= sqrt(norm);

// const T normc=norm;
#pragma omp parallel for
for (size_t i = 0; i < pext0; ++i)
{
    v(i,pstrv0)= v(i,pstrv0)/normc;
}

Is there another division by zero problem?

I do not really know what is going on here...

[Bug target/118794] The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..

2025-02-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118794

--- Comment #4 from Andrew Pinski  ---
Try -fno-math-errno ? Or add:
[[assert(norm>=0)]];

or:
if (norm>=0)
  __builtin_unreachable();

[Bug target/118794] The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..

2025-02-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118794

--- Comment #5 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #4)
> Try -fno-math-errno ? Or add:
> [[assert(norm>=0)]];
> 
> or:
> if (norm>=0)
>   __builtin_unreachable();

Sorry the if statement should have been:
if (norm<0)
  __builtin_unreachable();

But you should get the idea.
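
A minimal sketch of the corrected hint, wrapped in a helper so it can be tried in isolation. The name 'sqrt_nonneg' is hypothetical. '__builtin_unreachable' is a GCC/Clang builtin; the 'assert' is an added debug-build check and is not part of the suggestion. Note that the reporter states elsewhere in this thread that this hint did not make the error go away.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: tells the compiler that 'norm' is never negative,
// so the errno-setting error path of sqrt() can be treated as dead code.
inline double sqrt_nonneg(double norm)
{
    assert(norm >= 0);            // checked only when NDEBUG is not defined
    if (norm < 0)
        __builtin_unreachable();  // undefined behavior if this is ever reached
    return std::sqrt(norm);
}
```

Passing a negative value to this helper is undefined behavior in release builds, so it is only appropriate where the caller can guarantee the precondition.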

[Bug target/118794] The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..

2025-02-07 Thread schulz.benjamin at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118794

--- Comment #3 from Benjamin Schulz  ---
gcc --version
gcc (Gentoo 14.2.1_p20241221 p7) 14.2.1 20241221

[Bug target/118794] The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..

2025-02-07 Thread schulz.benjamin at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118794

--- Comment #2 from Benjamin Schulz  ---
Created attachment 60424
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60424&action=edit
cmakelists.txt

[Bug target/118794] The attached c++ openmp offload code fails, because the c sqrt function makes nonlocal gotos..

2025-02-07 Thread schulz.benjamin at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118794

--- Comment #1 from Benjamin Schulz  ---
Created attachment 60423
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60423&action=edit
main.cpp