Re: [PATCH v4] libgo: Don't use pt_regs member in mcontext_t
On Sat, Apr 2, 2022 at 1:21 AM Sören Tempel wrote:
>
> Thanks for committing a first fix! Unfortunately, your changes don't
> work on ppc64le musl: you are still using .regs on ppc64le, but the
> include of asm/ptrace.h (as added in the v1 of my patch) is missing.
> Hence, your patch fails to compile on ppc64le musl with the following
> error message:
>
> go-signal.c:230:63: error: invalid use of undefined type 'struct pt_regs'
>   230 | ret.sigpc = ((ucontext_t*)(context))->uc_mcontext.regs->nip;
>
> If you want to continue using .regs on ppc64le an include of
> asm/ptrace.h is needed since both glibc and musl declare `struct
> pt_regs` as an incomplete type (with glibc asm/ptrace.h is included
> indirectly by other headers used by go-signal.c it seems).
>
> See https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587520.html
>
> Would be nice if this could be fixed :)

Sorry, I guess I misread your patch.

What is the right standalone code for the PPC64 musl case?

Thanks.

Ian
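The incomplete-type error quoted above can be reproduced in isolation. The following is only a minimal sketch with hypothetical names (`struct regs_demo`, `struct mcontext_demo`, `demo_get_pc`); it does not use the real musl, glibc or kernel declarations. It shows that a pointer to a forward-declared struct is fine, while a member access through it needs the full definition to be visible, which is the role asm/ptrace.h plays in the thread.

```c
/* Hypothetical stand-ins; not the real musl/glibc/kernel declarations. */
struct regs_demo;                     /* forward declaration: incomplete type */

struct mcontext_demo {
    struct regs_demo *regs;           /* a pointer to an incomplete type is allowed */
};

/* At this point an expression such as `mc->regs->nip` would be rejected
   with "invalid use of undefined type 'struct regs_demo'". */

/* Plays the role of the header that supplies the full definition. */
struct regs_demo {
    unsigned long nip;
};

/* With the definition visible, the member access compiles. */
unsigned long demo_get_pc(const struct mcontext_demo *mc)
{
    return mc->regs->nip;
}
```

Moving `demo_get_pc` above the full definition of `struct regs_demo` should produce the same kind of diagnostic as the one reported for go-signal.c.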
[Bug fortran/105138] [7,8,9,10,11,12,F95] Bogus error when function name does not shadow an intrinsic when RESULT clause is used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105138

--- Comment #8 from kargl at gcc dot gnu.org ---
This patch fixes the error. The comment in the patch explains it.

diff --git a/gcc/fortran/intrinsic.cc b/gcc/fortran/intrinsic.cc
index 52e5f4ed39e..ec833667dbe 100644
--- a/gcc/fortran/intrinsic.cc
+++ b/gcc/fortran/intrinsic.cc
@@ -1167,6 +1167,11 @@ gfc_is_intrinsic (gfc_symbol* sym, int subroutine_flag, locus loc)
       || sym->attr.if_source == IFSRC_IFBODY)
     return false;
 
+  /* If the function has a result-name and it's recursive, it cannot be
+     an intrinsic subprogram.  */
+  if (sym->result && sym->attr.recursive)
+    return false;
+
   if (subroutine_flag)
     isym = gfc_find_subroutine (sym->name);
   else
gcc-11-20220402 is now available
Snapshot gcc-11-20220402 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/11-20220402/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 11 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch releases/gcc-11 revision 5f587c81bc558942d2988f5e2965a72471f5c202

You'll find:

 gcc-11-20220402.tar.xz               Complete GCC

  SHA256=2da5aa7ae438b27987a1ccaf8f779cb4a9c1f7f5ab7f91ae5336aa112b54a985
  SHA1=5c8ee76fedec6ac8112d8a351721ae26045489cf

Diffs from 11-20220326 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-11
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.
[Bug libquadmath/105101] incorrect rounding for sqrtq
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105101

Michael_S changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |already5chosen at yahoo dot com

--- Comment #4 from Michael_S ---
If you want a quick fix for immediate shipment, you can take this:

#include <math.h>
#include <quadmath.h>

__float128 quick_and_dirty_sqrtq(__float128 x)
{
  if (isnanq(x))
    return x;

  if (x==0)
    return x;

  if (x < 0)
    return nanq("");

  if (isinfq(x))
    return x;

  int xExp;
  x = frexpq(x, &xExp);
  if (xExp & 1)
    x *= 2.0; // x in [0.5:2.0)

  __float128 r = (__float128)(1.0/sqrt((double)x)); // r=rsqrt(x) estimate (53 bits)
  r *= 1.5 - r*r*x*0.5; // NR iteration improves precision of r to 105.4 bit

  __float128 y  = x*r;  // y=sqrt(x) estimate (105.4 bits)

  // extended-precision NR iteration
  __float128 yH = (double)y;
  __float128 yL = y - yH;

  __float128 deltaX = x - yH*yH;
  deltaX -= yH*yL*2;
  deltaX -= yL*yL;
  y += deltaX*r*0.5; // improve precision of y to ~210.2 bits. Not enough for perfect rounding, but not too bad

  return ldexpq(y, xExp >> 1);
}

It is very slow, even slower than what you have now, which by itself is quite astonishingly slow.
It is also not sufficiently precise for correct rounding in all cases.
But, at least, the worst error is something like (0.5+2**-98) ULP, so you are unlikely to ever be caught by black-box testing.
Its biggest advantage is extreme portability.
It should run on all platforms where double==IEEE binary64 and __float128 == IEEE binary128.

Maybe in a few days I'll have a better variant for "good" 64-bit platforms, i.e. for those where we have __int128.
It would be 15-25 times faster than the variant above, and rounding would be mathematically correct rather than just "impossible to be caught" like above.
But it would not run everywhere.
Also, I want to give it away under MIT or BSD license, rather than under GPL.
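The Newton-Raphson (NR) step that the comment relies on can be illustrated without libquadmath. What follows is only a sketch in plain double precision, with hypothetical helper names (`demo_rsqrt_step`, `demo_sqrt_from_estimate`); it is not the code proposed in the bug report. It assumes x > 0 and a starting estimate that is already close to 1/sqrt(x), and shows that each step roughly squares the relative error, which is why one step takes a 53-bit estimate to about 105 bits in the quad-precision version.

```c
/* One Newton-Raphson step for r ~ 1/sqrt(x):
   r' = r * (1.5 - 0.5 * x * r * r).
   If r has relative error e, r' has relative error of about 1.5 * e * e. */
static double demo_rsqrt_step(double x, double r)
{
    return r * (1.5 - 0.5 * x * r * r);
}

/* Refine a rough estimate r0 of 1/sqrt(x), then return x * r ~ sqrt(x).
   Assumes x > 0 and that r0 is reasonably close to 1/sqrt(x). */
double demo_sqrt_from_estimate(double x, double r0, int steps)
{
    double r = r0;
    for (int i = 0; i < steps; i++)
        r = demo_rsqrt_step(x, r);
    return x * r;
}
```

For example, starting from the crude estimate 0.7 for 1/sqrt(2), a handful of steps should bring `demo_sqrt_from_estimate(2.0, 0.7, 4)` to within rounding error of sqrt(2). Working with the reciprocal square root keeps divisions out of the iteration, which is presumably why the original code takes that route.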
[Bug fortran/105138] [7,8,9,10,11,12,F95] Bogus error when function name does not shadow an intrinsic when RESULT clause is used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105138

--- Comment #7 from kargl at gcc dot gnu.org ---
It seems the problem is that gfortran does not know that a function name is local to its own scoping unit when a result-name is used.

First, if the function is contained in a module it seems to work correctly. Consider,

  module foo
    implicit none
    contains
    recursive function log_gamma(z) result(res)
      complex,intent(in) :: z
      complex :: res
      complex x
      if (real(z) == 0) then
         res = 42
      else
         x = (0,1)
         res = log_gamma(x)
      end if
    end function log_gamma
  end module foo

  program bar
    use foo
    implicit none
    complex z
    z = log_gamma(cmplx(2.,3.))
    print *, z
  end program bar

% gfcx -Wall -o z a.f90
a.f90:4:3:

    4 |   recursive function log_gamma(z) result(res)
      |   1
Warning: 'log_gamma' declared at (1) may shadow the intrinsic of the same name.  In order to call the intrinsic, explicit INTRINSIC declarations may be required. [-Wintrinsic-shadow]
% ./z
 (42.000,0.)

Now, consider

  recursive function log_gamma(z) result(res)
    complex,intent(in) :: z
    complex :: res
    complex x
    if (real(z) == 0) then
       res = 42
    else
       x = (0,1)
       res = log_gamma(x)
    end if
  end function log_gamma

  program bar
    implicit none
    complex z
    z = log_gamma(cmplx(2.,3.))
    print *, z
  end program bar

% gfcx -Wall -o z a.f90
a.f90:1:0:

    1 | recursive function log_gamma(z) result(res)
      |
Warning: 'log_gamma' declared at (1) is also the name of an intrinsic.  It can only be called via an explicit interface or if declared EXTERNAL. [-Wintrinsic-shadow]
a.f90:9:22:

    9 |      res = log_gamma(x)
      |                      1
Error: 'x' argument of 'log_gamma' intrinsic at (1) must be REAL
a.f90:16:17:

   16 |   z = log_gamma(cmplx(2.,3.))
      |                 1
Error: 'x' argument of 'log_gamma' intrinsic at (1) must be REAL

I believe the first error is wrong. The local function name should block the intrinsic name. The second error is correct, because the recursive function requires an explicit interface in program bar and gfortran is picking up the intrinsic function.

If the function is modified to

  recursive function log_gamma(z) result(res)
    complex,intent(in) :: z
    complex :: res
    complex x
    if (real(z) == 0) then
       res = 42
    else
       block
         complex, external :: log_gamma
         x = (0,1)
         res = log_gamma(x)
       end block
    end if
  end function log_gamma

then the first error message does not occur. The block...end block should not be required. The second error message remains as it should.

If the program is modified to

  program bar
    implicit none
    complex z
    complex, external :: log_gamma
    z = log_gamma(cmplx(2.,3.))
    print *, z
  end program bar

or

  program bar
    implicit none
    interface
      recursive function log_gamma(z) result(res)
        complex,intent(in) :: z
        complex :: res
      end function log_gamma
    end interface
    complex z
    z = log_gamma(cmplx(2.,3.))
    print *, z
  end program bar

it compiles and runs with the block...end block modified function.

For completeness, if log_gamma() is a contained routine within the program it compiles and runs.

  program bar
    implicit none
    complex z
    z = log_gamma(cmplx(2.,3.))
    print *, z
  contains
    recursive function log_gamma(z) result(res)
      complex,intent(in) :: z
      complex :: res
      complex x
      if (real(z) == 0) then
         res = 42
      else
         x = (0,1)
         res = log_gamma(x)
      end if
    end function log_gamma
  end program bar

% gfcx -Wall -o z a.f90
a.f90:7:4:

    7 |    recursive function log_gamma(z) result(res)
      |    1
Warning: 'log_gamma' declared at (1) may shadow the intrinsic of the same name.  In order to call the intrinsic, explicit INTRINSIC declarations may be required. [-Wintrinsic-shadow]
% ./z
 (42.000,0.)
[Bug tree-optimization/105139] New: GCC produces vmovw instruction with an incorrect argument for -O3 -march=sapphirerapids
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105139

            Bug ID: 105139
           Summary: GCC produces vmovw instruction with an incorrect
                    argument for -O3 -march=sapphirerapids
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vsevolod.livinskiy at gmail dot com
  Target Milestone: ---

Link to the Compiler Explorer: https://godbolt.org/z/9GTPqWfn8

It looks like GCC produced a vmovw instruction with an incorrect argument
(https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html)

Reproducer:
extern long c[];
extern int d[];
long a;

long e(long f) { return f < a ? f : a; }

void g() {
  for (signed b = 0; b < 4028643; b++)
    d[b] = e((char)(~c[b]));
}

Error:
>$ g++ -O3 -march=sapphirerapids -c func.cpp
/tmp/ccB2zLYr.s: Assembler messages:
/tmp/ccB2zLYr.s:92: Error: operand type mismatch for `vmovw'

gcc version 12.0.1 20220401 (git://gcc.gnu.org/git/gcc.git:master 15d683d4f0b390b27c54a7c92c6e4f33195bdc93)

P.S. I'm not sure if "tree-optimization" is the correct classification for this fault.
[Bug fortran/105138] [7,8,9,10,11,12,F95] Bogus error when function name does not shadow an intrinsic when RESULT clause is used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105138

--- Comment #6 from anlauf at gcc dot gnu.org ---
Workaround:

module m
  interface LOG_GAMMA
     module procedure LOG_GAMMA_
  end interface LOG_GAMMA
contains
  RECURSIVE FUNCTION LOG_GAMMA_(Z) RESULT(RES)
    COMPLEX,INTENT(IN) :: Z
    COMPLEX :: RES
    RES = LOG_GAMMA_(Z)
  END FUNCTION LOG_GAMMA_
end module m
[Bug fortran/105138] [7,8,9,10,11,12,F95] Bogus error when function name does not shadow an intrinsic when RESULT clause is used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105138 --- Comment #5 from Vladimir Fuka --- In that case some compiler or linker magic happens after that, because the correct code is executed.
[Bug fortran/105138] [7,8,9,10,11,12,F95] Bogus error when function name does not shadow an intrinsic when RESULT clause is used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105138

--- Comment #4 from kargl at gcc dot gnu.org ---
(In reply to Vladimir Fuka from comment #2)
> As mentioned, the correct function is called in case everything is REAL
>

Actually, the correct function isn't called. See the parse tree that I posted for log_gamma. For your floor example, the parse tree contains

  code:
  IF (> floor:z 0)
    ASSIGN floor:res __convert_i4_r4[[((__floor4_r4[[(((- floor:z 1.e0)) ((arg not-present)))]]))]]
  ELSE
    ASSIGN floor:res 0
  ENDIF

Notice, the __floor4_r4 is the intrinsic routine not the user-defined floor.
[Bug fortran/105138] [7,8,9,10,11,12,F95] Bogus error when function name does not shadow an intrinsic when RESULT clause is used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105138

kargl at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P4
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
                 CC|                            |kargl at gcc dot gnu.org
   Target Milestone|---                         |13.0
   Last reconfirmed|                            |2022-04-02

--- Comment #3 from kargl at gcc dot gnu.org ---
F2018, 15.6.2.2, page 319

   If RESULT appears, the name of the function result of the function is
   result-name and all occurrences of the function name in execution-part
   statements in its scope refer to the function itself.

If I change COMPLEX to REAL in the first example, I get

% gfcx -c -fdump-parse-tree

Namespace: A-H: (REAL 4) I-N: (INTEGER 4) O-Z: (REAL 4)
procedure name = log_gamma
  symtree: 'log_gamma'   || symbol: 'log_gamma'
    type spec : (REAL 4)
    attributes: (PROCEDURE INTRINSIC-PROC FUNCTION RECURSIVE)
    result: res
    Formal arglist: z
  symtree: 'res'         || symbol: 'res'
    type spec : (REAL 4)
    attributes: (VARIABLE RESULT)
  symtree: 'z'           || symbol: 'z'
    type spec : (REAL 4)
    attributes: (VARIABLE DUMMY(IN))

  code:
  ASSIGN log_gamma:res __lgamma_4[[((log_gamma:z))]]

The attributes for log_gamma include INTRINSIC-PROC, which is clearly wrong.
Re: -stdlib=libc++?
Hi Iain, May I ask why we need to specify --with-gxx-libcxx-include-dir= at compile/configure time of GCC? While in clang equivalent, -stdlib= doesn't require so. thanks, Shivam On Sat, Apr 2, 2022 at 7:32 PM Shivam Gupta wrote: > Hi Iain, > > Thank you for the quick response and the effort to make that feature > available. > > When I reconfigured/build GCC > with --with-gxx-libcxx-include-dir=/usr/include/c++/v1/ , -stdlib= option > is now available to take libc++. > > thanks, > Shivam. > > On Sat, Apr 2, 2022 at 3:21 PM Iain Sandoe wrote: > >> Hi Shivam, >> >> > On 2 Apr 2022, at 06:57, Shivam Gupta wrote: >> > >> > I saw your last year's mail for the same topic on the GCC mailing list - >> https://gcc.gnu.org/pipermail/gcc/2020-March/000230.html. >> >> The patch was applied to GCC-11 (so is available one GCC-11 branch and >> will be on GCC-12 when that is released). >> > >> > I tried today but this option is still not available. >> >> The option has to be configured when the compiler is built, that also >> means that you have to install (and point the configure to) a suitable set >> of libc++ headers from the LLVM project (e.g. there is a set here: >> https://github.com/iains/llvm-project/tree/9.0.1-gcc-stdlib). >> >> Generally, GCC is very compatible with the libc++ headers (the changes I >> made on that branch were mostly to deal with being in std:: for >> GCC and std::experimental:: for LLVM-9). For LLVM libc++ earlier than 9 >> there is a missing symbol that GCC uses - but that can be worked around too. >> >> There have been some changes in more recent (in particular, LLVM-14/main) >> libc++ that should make it more compatible. >> >> Of course, you should pick a version of the libc++ headers than matches >> the version used on your system (9 was used for quite a long time, but >> recent xcode headers are newer). >> >> Given that this involves cross-project sources and choosing a suitable >> set, probably it is a job for the distributions (e.g. 
homebrew, macports >> etc) to arrange or, for self-built compilers, following in the general >> comments above. >> >> FWIW, I have used this to build quite a few OSS projects on a number of >> Darwin versions (hence the comment about GCC being very compatible with >> libc++). >> >> thanks, >> Iain. >> >>
[Bug fortran/105138] [7,8,9,10,11,12,F95] Bogus error when function name does not shadow an intrinsic when RESULT clause is used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105138

--- Comment #2 from Vladimir Fuka ---
As mentioned, the correct function is called in case everything is REAL

  a = floor(5.0)
  print *, a
contains
  RECURSIVE FUNCTION FLOOR(Z) RESULT(RES)
    REAL,INTENT(IN) :: Z
    REAL :: RES
    if (z>0) then
      RES = FLOOR(Z - 1)
    else
      RES = 0
    end if
  END FUNCTION FLOOR
end

> gfortran-12 shadow.f90
> ./a.out
 0.
[Bug c++/104865] Wrong code for conditional expression on VAX or with -ffast-math
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104865 --- Comment #7 from Maciej W. Rozycki --- Well, it's not clear to me whether the reserved operand as defined by the VAX floating-point architecture ought be considered an sNaN given that there is no qNaN. Also a reserved operand causes a fault with any FP instruction, even data moves (though one can move a reserved operand bit pattern with an integer move of the right width, observing that there is a single register file for both integer and FP arithmetic, and that of course FP operations can be directly performed on memory as well). In any case there's probably more than one bug here.
[Bug fortran/105138] [7,8,9,10,11,12,F95] Bogus error when function name does not shadow an intrinsic when RESULT clause is used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105138

--- Comment #1 from Vladimir Fuka ---
After renaming the function to LOG, one gets

RECURSIVE FUNCTION LOG(Z) RESULT(RES)
  COMPLEX,INTENT(IN) :: Z
  COMPLEX :: RES
  RES = LOG(Z);
END FUNCTION LOG

> gfortran-12 -c shadow.f90
/tmp/ccbpyhxl.s: Assembler messages:
/tmp/ccbpyhxl.s:3: Error: junk at end of line, first unrecognized character is `('
/tmp/ccbpyhxl.s:4: Error: unrecognized symbol type ""
/tmp/ccbpyhxl.s:4: Error: junk at end of line, first unrecognized character is `('
/tmp/ccbpyhxl.s:5: Error: invalid character '(' in mnemonic
/tmp/ccbpyhxl.s:36: Error: expected comma after name `__' in .size directive
[Bug fortran/105138] New: [7,8,9,10,11,12,F95] Bogus error when function name does not shadow an intrinsic when RESULT clause is used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105138

            Bug ID: 105138
           Summary: [7,8,9,10,11,12,F95] Bogus error when function name
                    does not shadow an intrinsic when RESULT clause is
                    used
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vladimir.fuka at gmail dot com
  Target Milestone: ---

Reported at Stackoverflow by Denis Cousineau
https://stackoverflow.com/questions/71718480/gfortran-compiler-error-for-a-code-from-reputable-source/71718729?noredirect=1#comment126746262_71718729

RECURSIVE FUNCTION LOG_GAMMA(Z) RESULT(RES)
  COMPLEX,INTENT(IN) :: Z
  COMPLEX :: RES
  RES = LOG_GAMMA(Z);
END FUNCTION LOG_GAMMA

> gfortran-12 shadow.f90
shadow.f90:4:18:

    4 |   RES = LOG_GAMMA(Z);
      |                  1
Error: ‘x’ argument of ‘log_gamma’ intrinsic at (1) must be REAL

When the argument type agrees, the correct function is called.
[Bug c++/104865] Wrong code for conditional expression on VAX or with -ffast-math
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104865 --- Comment #6 from Jonathan Wakely --- OK, maybe I should not have used __builtin_nan in the test. The bug is in the rest of the code though, isn't it? Replace the __builtin_nan with a function returning the same sNaN, does the test still fail? (I can't check myself right now).
[Bug c++/104865] Wrong code for conditional expression on VAX or with -ffast-math
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104865 --- Comment #5 from Maciej W. Rozycki --- Wrong question then. Should `__builtin_nan' even compile on non-IEEE-754 FP targets that don't have a qNaN? And I'll reply to myself. According to our manual: "-- Built-in Function: double __builtin_nan (const char *str) This is an implementation of the ISO C99 function 'nan'." and then according to ISO C99: "The nan functions return a quiet NaN, if available, with content indicated through tagp. If the implementation does not support quiet NaNs, the functions return zero." so firstly __builtin_isnan(__builtin_nan("")) is supposed to return 0 with the VAX target (because obviously 0.0 is not a NaN), and secondly the compiled program is wrong as `_ZL3nan' is supposed to be set to all-zeros (which is the representation of 0.0 datum with the VAX floating-point format), and then `c1' and `c2' must likewise be both 0. Both asserts are supposed to fail with the VAX target then (and similarly PDP-11, which has a similar FP format).
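The C99 wording quoted above can be probed directly on a given target. This is a minimal sketch, not the test case from the bug report; the helper name `demo_builtin_nan_is_nan` is made up for illustration, and the sketch assumes a GCC-compatible compiler with default floating-point flags (no -ffast-math). On an IEEE 754 target the function is expected to return nonzero; by the argument in the comment above, a target without quiet NaNs such as VAX would be expected to return 0.

```c
/* Returns nonzero if the value produced by __builtin_nan("") is a NaN
   on this target.  The helper name is hypothetical. */
int demo_builtin_nan_is_nan(void)
{
    /* volatile keeps the comparison from being folded at compile time */
    volatile double v = __builtin_nan("");

    /* only a NaN compares unequal to itself */
    return v != v;
}
```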
[Bug c++/104865] Wrong code for conditional expression on VAX or with -ffast-math
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104865 --- Comment #4 from Jonathan Wakely --- They can still have NaNs.
Re: [PATCH] wwwdocs: fedora-devel-list archives changes
On Tue, 15 Mar 2022, Jonathan Wakely wrote: >> It appears redhat.com has lost Fedora mailing list archives, which are >> now at lists.fedoraproject.org using completely different tooling. >> >>Jakub, is there a better way than the patch below? > This looks right to me, I don't think there's a better way to link to > those archives. Thank you, Jonathan. I now pushed my patch. Gerald
Re: -stdlib=libc++?
Hi Iain, Thank you for the quick response and the effort to make that feature available. When I reconfigured/build GCC with --with-gxx-libcxx-include-dir=/usr/include/c++/v1/ , -stdlib= option is now available to take libc++. thanks, Shivam. On Sat, Apr 2, 2022 at 3:21 PM Iain Sandoe wrote: > Hi Shivam, > > > On 2 Apr 2022, at 06:57, Shivam Gupta wrote: > > > > I saw your last year's mail for the same topic on the GCC mailing list - > https://gcc.gnu.org/pipermail/gcc/2020-March/000230.html. > > The patch was applied to GCC-11 (so is available one GCC-11 branch and > will be on GCC-12 when that is released). > > > > I tried today but this option is still not available. > > The option has to be configured when the compiler is built, that also > means that you have to install (and point the configure to) a suitable set > of libc++ headers from the LLVM project (e.g. there is a set here: > https://github.com/iains/llvm-project/tree/9.0.1-gcc-stdlib). > > Generally, GCC is very compatible with the libc++ headers (the changes I > made on that branch were mostly to deal with being in std:: for > GCC and std::experimental:: for LLVM-9). For LLVM libc++ earlier than 9 > there is a missing symbol that GCC uses - but that can be worked around too. > > There have been some changes in more recent (in particular, LLVM-14/main) > libc++ that should make it more compatible. > > Of course, you should pick a version of the libc++ headers than matches > the version used on your system (9 was used for quite a long time, but > recent xcode headers are newer). > > Given that this involves cross-project sources and choosing a suitable > set, probably it is a job for the distributions (e.g. homebrew, macports > etc) to arrange or, for self-built compilers, following in the general > comments above. > > FWIW, I have used this to build quite a few OSS projects on a number of > Darwin versions (hence the comment about GCC being very compatible with > libc++). > > thanks, > Iain. > >
[Bug middle-end/105137] New: Missed optimization 64-bit adds and shifts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105137

            Bug ID: 105137
           Summary: Missed optimization 64-bit adds and shifts
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: andre.schackier at gmail dot com
  Target Milestone: ---

Given the following source code [godbolt](https://godbolt.org/z/8KMMhefqY)

#include <stdint.h>

typedef __int128_t int128_t;

int64_t foo(int128_t a, int64_t b, int cond) {
    if (cond) {
        a += ((int128_t)b) << 64;
    }
    return a >> 64;
}

int64_t bar(int128_t a, int64_t b, int cond) {
    int64_t r = a >> 64;
    if (cond) {
        r += b;
    }
    return r;
}

Compiling with "-O3" we get:

foo:
        mov     rax, rsi
        mov     rsi, rdi
        mov     rdi, rax
        test    ecx, ecx
        je      .L2
        xor     r8d, r8d
        add     rsi, r8
        adc     rdi, rdx
.L2:
        mov     rax, rdi
        ret
bar:
        add     rdx, rsi
        mov     rax, rsi
        test    ecx, ecx
        cmovne  rax, rdx
        ret

Although both functions do the same thing, gcc generates worse code for foo.

Credits: This was found entirely by Trevor Spiteri and reported at the llvm-project here: https://github.com/llvm/llvm-project/issues/54718
[Bug c++/104865] Wrong code for conditional expression on VAX or with -ffast-math
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104865 Maciej W. Rozycki changed: What|Removed |Added CC||macro at orcam dot me.uk --- Comment #3 from Maciej W. Rozycki --- Should `__builtin_nan' even compile on non-IEEE-754 FP targets?
[Bug middle-end/105136] New: [11/12] Missed optimization regression with 32-bit adds and shifts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105136

            Bug ID: 105136
           Summary: [11/12] Missed optimization regression with 32-bit
                    adds and shifts
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: andre.schackier at gmail dot com
  Target Milestone: ---

Given the following source code [godbolt](https://godbolt.org/z/daTxMYWKo)

#include <stdint.h>

int32_t foo(int64_t a, int32_t b, int cond) {
    if (cond) {
        a += ((int64_t)b) << 32;
    }
    return a >> 32;
}

int32_t bar(int64_t a, int32_t b, int cond) {
    int32_t r = a >> 32;
    if (cond) {
        r += b;
    }
    return r;
}

and compiling with "-O3" we get the following assembly:

foo:
        sal     rsi, 32
        mov     rax, rdi
        add     rax, rsi
        test    edx, edx
        cmove   rax, rdi
        shr     rax, 32
        ret
bar:
        sar     rdi, 32
        add     esi, edi
        test    edx, edx
        mov     eax, esi
        cmove   eax, edi
        ret

With gcc-10.3 we get for bar:

bar:
        sar     rdi, 32
        test    edx, edx
        lea     eax, [rsi+rdi]
        cmove   eax, edi
        ret

Also note that neither version recognizes that foo does the same as bar.

Credits: This was found entirely by Trevor Spiteri and reported at the llvm-project here: https://github.com/llvm/llvm-project/issues/54718
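Why the two functions in these reports compute the same result can be checked on a few inputs. The following is only a sketch, not the reporter's code: the names `demo_add_then_shift` and `demo_shift_then_add` are made up, and it uses unsigned types so that wraparound is well defined in C. Adding `b << 32` changes only the upper half of `a`, and nothing carries in from the lower half, so the upper half of the sum equals the upper half of `a` plus `b`, modulo 2^32.

```c
#include <stdint.h>

/* Shape of foo: add into the upper half, then extract the upper half. */
uint32_t demo_add_then_shift(uint64_t a, uint32_t b, int cond)
{
    if (cond)
        a += (uint64_t)b << 32;   /* low 32 bits of a are unchanged */
    return (uint32_t)(a >> 32);
}

/* Shape of bar: extract the upper half first, then add in 32 bits. */
uint32_t demo_shift_then_add(uint64_t a, uint32_t b, int cond)
{
    uint32_t r = (uint32_t)(a >> 32);
    if (cond)
        r += b;                   /* wraps modulo 2^32, like the upper half above */
    return r;
}
```

The same reasoning applies to the 64-bit variant with `__int128`, where the shift is by 64 instead of 32.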
[committed] wwwdocs: gcc-11: Switch from to using ids.
I - or rather the w3 validator :) - realized that the use of is deprecated, so use id attributes instead. Gerald --- htdocs/gcc-11/changes.html | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html index 8e6d4ec8..c9eb2295 100644 --- a/htdocs/gcc-11/changes.html +++ b/htdocs/gcc-11/changes.html @@ -1097,7 +1097,7 @@ is built as FAT libraries containing both 32 bit and 64 bit objects. -GCC 11.1 +GCC 11.1 This is the https://gcc.gnu.org/bugzilla/buglist.cgi?bug_status=RESOLVEDresolution=FIXEDtarget_milestone=11.0;>list of problem reports (PRs) from GCC's bug tracking system that are @@ -1106,7 +1106,7 @@ complete (that is, it is possible that some PRs that have been fixed are not listed here). -GCC 11.2 +GCC 11.2 This is the https://gcc.gnu.org/bugzilla/buglist.cgi?bug_status=RESOLVEDresolution=FIXEDtarget_milestone=11.2;>list of problem reports (PRs) from GCC's bug tracking system that are @@ -1116,7 +1116,7 @@ are not listed here). -GCC 11.3 +GCC 11.3 Target Specific Changes -- 2.35.1
Re: [PATCH 4/5] openmp: Use libgomp memory allocation functions with unified shared memory.
On 02/04/2022 13:04, Andrew Stubbs wrote:
> This additional patch adds transformation for omp_target_alloc. The
> OpenMP 5.0 document says that addresses allocated this way need to work
> without is_device_ptr. The easiest way to make that work is to make them
> USM addresses.

Actually, reading on, it says "Every device address allocated through OpenMP device memory routines is a valid host pointer", so USM is the correct answer.

Andrew
Re: [PATCH 4/5] openmp: Use libgomp memory allocation functions with unified shared memory.
On 08/03/2022 11:30, Hafiz Abid Qadeer wrote:
> This patch changes calls to malloc/free/calloc/realloc and operator new to
> memory allocation functions in libgomp with
> allocator=ompx_unified_shared_mem_alloc.

This additional patch adds transformation for omp_target_alloc. The OpenMP 5.0 document says that addresses allocated this way need to work without is_device_ptr. The easiest way to make that work is to make them USM addresses.

I will commit this to OG11 shortly.

Andrew

openmp: Do USM transform for omp_target_alloc

OpenMP 5.0 says that omp_target_alloc should return USM addresses.

gcc/ChangeLog:

	* omp-low.c (usm_transform): Transform omp_target_alloc and
	omp_target_free.

libgomp/ChangeLog:

	* testsuite/libgomp.c/usm-6.c: Add omp_target_alloc.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/usm-2.c: Add omp_target_alloc.
	* c-c++-common/gomp/usm-3.c: Add omp_target_alloc.

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 4e8ab9e4ca0..9235eafd1d7 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -15880,7 +15880,8 @@ usm_transform (gimple_stmt_iterator *gsi_p, bool *,
 	  if ((strcmp (name, "malloc") == 0)
 	       || (fndecl_built_in_p (fndecl, BUILT_IN_NORMAL)
 		   && DECL_FUNCTION_CODE (fndecl) == BUILT_IN_MALLOC)
-	      || DECL_IS_REPLACEABLE_OPERATOR_NEW_P (fndecl))
+	      || DECL_IS_REPLACEABLE_OPERATOR_NEW_P (fndecl)
+	      || strcmp (name, "omp_target_alloc") == 0)
 	    {
 	      tree omp_alloc_type
 		= build_function_type_list (ptr_type_node, size_type_node,
@@ -15952,7 +15953,8 @@ usm_transform (gimple_stmt_iterator *gsi_p, bool *,
 		   || (fndecl_built_in_p (fndecl, BUILT_IN_NORMAL)
 		       && DECL_FUNCTION_CODE (fndecl) == BUILT_IN_FREE)
 		   || (DECL_IS_OPERATOR_DELETE_P (fndecl)
-		       && DECL_IS_REPLACEABLE_OPERATOR (fndecl)))
+		       && DECL_IS_REPLACEABLE_OPERATOR (fndecl))
+		   || strcmp (name, "omp_target_free") == 0)
 	    {
 	      tree omp_free_type
 		= build_function_type_list (void_type_node, ptr_type_node,
diff --git a/gcc/testsuite/c-c++-common/gomp/usm-2.c b/gcc/testsuite/c-c++-common/gomp/usm-2.c
index 64dbb6be131..8c20ef94e69 100644
--- a/gcc/testsuite/c-c++-common/gomp/usm-2.c
+++ b/gcc/testsuite/c-c++-common/gomp/usm-2.c
@@ -12,6 +12,8 @@ void *aligned_alloc (__SIZE_TYPE__, __SIZE_TYPE__);
 void *calloc(__SIZE_TYPE__, __SIZE_TYPE__);
 void *realloc(void *, __SIZE_TYPE__);
 void free (void *);
+void *omp_target_alloc (__SIZE_TYPE__, int);
+void omp_target_free (void *, int);
 
 #ifdef __cplusplus
 }
@@ -24,16 +26,21 @@ foo ()
   void *p2 = realloc(p1, 30);
   void *p3 = calloc(4, 15);
   void *p4 = aligned_alloc(16, 40);
+  void *p5 = omp_target_alloc(50, 1);
   free (p2);
   free (p3);
   free (p4);
+  omp_target_free (p5, 1);
 }
 
 /* { dg-final { scan-tree-dump-times "omp_alloc \\(20, 10\\)" 1 "usm_transform"  } } */
 /* { dg-final { scan-tree-dump-times "omp_realloc \\(.*, 30, 10, 10\\)" 1 "usm_transform"  } } */
 /* { dg-final { scan-tree-dump-times "omp_calloc \\(4, 15, 10\\)" 1 "usm_transform"  } } */
 /* { dg-final { scan-tree-dump-times "omp_aligned_alloc \\(16, 40, 10\\)" 1 "usm_transform"  } } */
-/* { dg-final { scan-tree-dump-times "omp_free" 3 "usm_transform"  } } */
+/* { dg-final { scan-tree-dump-times "omp_alloc \\(50, 10\\)" 1 "usm_transform"  } } */
+/* { dg-final { scan-tree-dump-times "omp_free" 4 "usm_transform"  } } */
 /* { dg-final { scan-tree-dump-not " free"  "usm_transform"  } } */
 /* { dg-final { scan-tree-dump-not " aligned_alloc"  "usm_transform"  } } */
 /* { dg-final { scan-tree-dump-not " malloc"  "usm_transform"  } } */
+/* { dg-final { scan-tree-dump-not " omp_target_alloc"  "usm_transform"  } } */
+/* { dg-final { scan-tree-dump-not " omp_target_free"  "usm_transform"  } } */
diff --git a/gcc/testsuite/c-c++-common/gomp/usm-3.c b/gcc/testsuite/c-c++-common/gomp/usm-3.c
index 934582ea5fd..2b0cbb45e27 100644
--- a/gcc/testsuite/c-c++-common/gomp/usm-3.c
+++ b/gcc/testsuite/c-c++-common/gomp/usm-3.c
@@ -10,6 +10,8 @@ void *aligned_alloc (__SIZE_TYPE__, __SIZE_TYPE__);
 void *calloc(__SIZE_TYPE__, __SIZE_TYPE__);
 void *realloc(void *, __SIZE_TYPE__);
 void free (void *);
+void *omp_target_alloc (__SIZE_TYPE__, int);
+void omp_target_free (void *, int);
 
 #ifdef __cplusplus
 }
@@ -22,16 +24,21 @@ foo ()
   void *p2 = realloc(p1, 30);
   void *p3 = calloc(4, 15);
   void *p4 = aligned_alloc(16, 40);
+  void *p5 = omp_target_alloc(50, 1);
   free (p2);
   free (p3);
   free (p4);
+  omp_target_free (p5, 1);
 }
 
 /* { dg-final { scan-tree-dump-times "omp_alloc \\(20, 10\\)" 1 "usm_transform"  } } */
 /* { dg-final { scan-tree-dump-times "omp_realloc \\(.*, 30, 10, 10\\)" 1 "usm_transform"  } } */
 /* { dg-final { scan-tree-dump-times
Re: [PATCH] mips: Fix an ICE caused by r12-7962
On Sat, Apr 02, 2022 at 06:53:55PM +0800, Xi Ruoyao wrote:
> I made a mistake in r12-7962 and it causes an ICE running
> g++.dg-struct-layout-1 tests.  The fix and a reduced test are included
> in this patch.  Ok for trunk?
>
> DECL_SIZE(x) is NULL if x is a flexible array member, but I forgot to
> check it in r12-7962.  Then if we increase the size of a struct with
> flexible array member (by using aligned attribute), the code will
> dereference NULL trying to use the "size" of the flexible array member.
>
> gcc/
>
> 	* config/mips/mips.cc (mips_function_arg): Check if DECL_SIZE is
> 	NULL before dereferencing it.
>
> gcc/testsuite/
>
> 	* gcc.target/mips/pr102024-4.c: New test.

Ok, sorry for not catching that.
All other targets guard such integer_zerop (DECL_SIZE (...)) uses with
either DECL_SIZE (...) != NULL_TREE or DECL_BIT_FIELD, so this is the only
such spot.

> diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
> index a6dd1e9e7b6..079bb03968a 100644
> --- a/gcc/config/mips/mips.cc
> +++ b/gcc/config/mips/mips.cc
> @@ -6082,7 +6082,8 @@ mips_function_arg (cumulative_args_t cum_v, const function_arg_info &arg)
>  		 an ABI change.  */
>  	      if (DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD (field))
>  		continue;
> -	      if (integer_zerop (DECL_SIZE (field)))
> +	      if (DECL_SIZE (field)
> +		  && integer_zerop (DECL_SIZE (field)))
>  		{
>  		  zero_width_field_abi_change = true;
>  		  continue;
> diff --git a/gcc/testsuite/gcc.target/mips/pr102024-4.c b/gcc/testsuite/gcc.target/mips/pr102024-4.c
> new file mode 100644
> index 000..2147cc769d0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/mips/pr102024-4.c
> @@ -0,0 +1,10 @@
> +// { dg-do compile }
> +// { dg-options "-mabi=64 -mhard-float" }
> +
> +struct __attribute__((aligned(16))) test {
> +  int x[0];
> +  double b;
> +  int f[];
> +};
> +
> +void check(struct test) {} // { dg-message "the ABI for passing a value containing zero-width fields before an adjacent 64-bit floating-point field was changed in GCC 12.1" }
> --
> 2.35.1
>

	Jakub
[PATCH] mips: Fix an ICE caused by r12-7962
I made a mistake in r12-7962 and it causes an ICE running
g++.dg-struct-layout-1 tests. The fix and a reduced test are included in
this patch. Ok for trunk?

DECL_SIZE(x) is NULL if x is a flexible array member, but I forgot to
check it in r12-7962. Then if we increase the size of a struct with
flexible array member (by using aligned attribute), the code will
dereference NULL trying to use the "size" of the flexible array member.

gcc/

    * config/mips/mips.cc (mips_function_arg): Check if DECL_SIZE is
    NULL before dereferencing it.

gcc/testsuite/

    * gcc.target/mips/pr102024-4.c: New test.
---
 gcc/config/mips/mips.cc                    |  3 ++-
 gcc/testsuite/gcc.target/mips/pr102024-4.c | 10 ++++++++++
 2 files changed, 12 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/pr102024-4.c

diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index a6dd1e9e7b6..079bb03968a 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -6082,7 +6082,8 @@ mips_function_arg (cumulative_args_t cum_v, const function_arg_info )
                  an ABI change.  */
               if (DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD (field))
                 continue;
-              if (integer_zerop (DECL_SIZE (field)))
+              if (DECL_SIZE (field)
+                  && integer_zerop (DECL_SIZE (field)))
                 {
                   zero_width_field_abi_change = true;
                   continue;
diff --git a/gcc/testsuite/gcc.target/mips/pr102024-4.c b/gcc/testsuite/gcc.target/mips/pr102024-4.c
new file mode 100644
index 0000000..2147cc769d0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/pr102024-4.c
@@ -0,0 +1,10 @@
+// { dg-do compile }
+// { dg-options "-mabi=64 -mhard-float" }
+
+struct __attribute__((aligned(16))) test {
+  int x[0];
+  double b;
+  int f[];
+};
+
+void check(struct test) {} // { dg-message "the ABI for passing a value containing zero-width fields before an adjacent 64-bit floating-point field was changed in GCC 12.1" }
--
2.35.1
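For readers who do not have the GCC sources at hand, the failure mode in this patch can be sketched outside the compiler. This is only a rough model and not GCC code: `size_node`, `field_decl`, `is_zero` and `is_zero_width_field` are hypothetical stand-ins for the size tree, the FIELD_DECL, `integer_zerop` and the patched condition.

```cpp
// Hypothetical stand-ins, not GCC's real data structures.
struct size_node {
  long bits;  // size of the field in bits
};

struct field_decl {
  // A null pointer models a flexible array member, which has no size.
  const size_node *size;
};

// Models a predicate that dereferences its argument unconditionally.
inline bool is_zero (const size_node *n)
{
  return n->bits == 0;
}

// The unguarded form would be `return is_zero (f.size);`, which is
// undefined behaviour when f.size is null.  The guarded form below
// mirrors the shape of the patched condition: test for null first.
inline bool is_zero_width_field (const field_decl &f)
{
  return f.size != nullptr && is_zero (f.size);
}
```

With the guard in place, a field without a size is simply treated as "not zero-width" instead of crashing.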
[Bug libstdc++/105128] source_location compile error for latest clang 15
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105128

Jakub Jelinek changed:

           What    |Removed                     |Added
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from Jakub Jelinek ---
Fixed.
[Bug libstdc++/105128] source_location compile error for latest clang 15
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105128

--- Comment #4 from CVS Commits ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:2a82301d409d3aa0e0b3b884e4c6daeaa0486d6b

commit r12-7968-g2a82301d409d3aa0e0b3b884e4c6daeaa0486d6b
Author: Jakub Jelinek
Date:   Sat Apr 2 12:49:38 2022 +0200

    libstdc++: Tweak source_location for clang trunk [PR105128]

    Apparently clang trunk implemented __builtin_source_location(), but the
      using __builtin_ret_type = decltype(__builtin_source_location());
    which has been added for it isn't enough, they also need the
    std::source_location::__impl class to be defined (but incomplete seems
    to be good enough) before the builtin is used.

    The following has been tested on godbolt with clang trunk (old version
    fails with
    error: 'std::source_location::__impl' was not found; it must be defined
    before '__builtin_source_location' is called
    and some follow-up errors), getting back to just void * instead of
    __builtin_ret_type and commenting out using doesn't work either and
    just struct __impl; before using __builtin_ret_type doesn't work too.

    2022-04-02  Jakub Jelinek

            PR libstdc++/105128
            * include/std/source_location (std::source_location::__impl): Move
            definition before using __builtin_ret_type.
Re: [PATCH] libstdc++: Tweak source_location for clang trunk [PR105128]
On Sat, 2 Apr 2022, 10:32 Jakub Jelinek via Libstdc++, < libstd...@gcc.gnu.org> wrote: > Hi! > > Apparently clang trunk implemented __builtin_source_location(), but the > using __builtin_ret_type = decltype(__builtin_source_location()); > which has been added for it isn't enough, they also need the > std::source_location::__impl class to be defined (but incomplete seems > to be good enough) before the builtin is used. > > The following has been tested on godbolt with clang trunk (old version > fails with > error: 'std::source_location::__impl' was not found; it must be defined > before '__builtin_source_location' is called > and some follow-up errors), getting back to just void * instead of > __builtin_ret_type and commenting out using doesn't work either and > just struct __impl; before using __builtin_ret_type doesn't work too. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > OK, thanks. > 2022-04-02 Jakub Jelinek > > PR libstdc++/105128 > * include/std/source_location (std::source_location::__impl): Move > definition before using __builtin_ret_type. > > --- libstdc++-v3/include/std/source_location2022-02-25 > 10:46:53.275178858 +0100 > +++ libstdc++-v3/include/std/source_location2022-04-01 > 19:36:02.056236397 +0200 > @@ -43,6 +43,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION >{ >private: > using uint_least32_t = __UINT_LEAST32_TYPE__; > +struct __impl > +{ > + const char* _M_file_name; > + const char* _M_function_name; > + unsigned _M_line; > + unsigned _M_column; > +}; > using __builtin_ret_type = decltype(__builtin_source_location()); > >public: > @@ -76,14 +83,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > { return _M_impl ? _M_impl->_M_function_name : ""; } > >private: > -struct __impl > -{ > - const char* _M_file_name; > - const char* _M_function_name; > - unsigned _M_line; > - unsigned _M_column; > -}; > - > const __impl* _M_impl = nullptr; >}; > > > Jakub > >
Re: -stdlib=libc++?
Hi Shivam,

> On 2 Apr 2022, at 06:57, Shivam Gupta wrote:
>
> I saw your last year's mail for the same topic on the GCC mailing list
> - https://gcc.gnu.org/pipermail/gcc/2020-March/000230.html.

The patch was applied to GCC-11 (so it is available on the GCC-11 branch and will be in GCC-12 when that is released).

> I tried today but this option is still not available.

The option has to be configured when the compiler is built. That also means that you have to install (and point the configure to) a suitable set of libc++ headers from the LLVM project (e.g. there is a set here: https://github.com/iains/llvm-project/tree/9.0.1-gcc-stdlib).

Generally, GCC is very compatible with the libc++ headers (the changes I made on that branch were mostly to deal with being in std:: for GCC and std::experimental:: for LLVM-9). For LLVM libc++ earlier than 9 there is a missing symbol that GCC uses, but that can be worked around too. There have been some changes in more recent (in particular, LLVM-14/main) libc++ that should make it more compatible.

Of course, you should pick a version of the libc++ headers that matches the version used on your system (9 was used for quite a long time, but recent Xcode headers are newer).

Given that this involves cross-project sources and choosing a suitable set, it is probably a job for the distributions (e.g. homebrew, macports etc.) to arrange or, for self-built compilers, a matter of following the general comments above.

FWIW, I have used this to build quite a few OSS projects on a number of Darwin versions (hence the comment about GCC being very compatible with libc++).

thanks,
Iain.
[PATCH] libstdc++: Tweak source_location for clang trunk [PR105128]
Hi!

Apparently clang trunk implemented __builtin_source_location(), but the
  using __builtin_ret_type = decltype(__builtin_source_location());
which has been added for it isn't enough, they also need the
std::source_location::__impl class to be defined (but incomplete seems
to be good enough) before the builtin is used.

The following has been tested on godbolt with clang trunk (old version
fails with
error: 'std::source_location::__impl' was not found; it must be defined
before '__builtin_source_location' is called
and some follow-up errors), getting back to just void * instead of
__builtin_ret_type and commenting out using doesn't work either and
just struct __impl; before using __builtin_ret_type doesn't work too.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-04-02  Jakub Jelinek

	PR libstdc++/105128
	* include/std/source_location (std::source_location::__impl): Move
	definition before using __builtin_ret_type.

--- libstdc++-v3/include/std/source_location	2022-02-25 10:46:53.275178858 +0100
+++ libstdc++-v3/include/std/source_location	2022-04-01 19:36:02.056236397 +0200
@@ -43,6 +43,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
   private:
     using uint_least32_t = __UINT_LEAST32_TYPE__;
+    struct __impl
+    {
+      const char* _M_file_name;
+      const char* _M_function_name;
+      unsigned _M_line;
+      unsigned _M_column;
+    };
     using __builtin_ret_type = decltype(__builtin_source_location());
 
   public:
@@ -76,14 +83,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     { return _M_impl ? _M_impl->_M_function_name : ""; }
 
   private:
-    struct __impl
-    {
-      const char* _M_file_name;
-      const char* _M_function_name;
-      unsigned _M_line;
-      unsigned _M_column;
-    };
-
     const __impl* _M_impl = nullptr;
   };

	Jakub
[PATCH] i386: Fix up ix86_expand_vector_init_general [PR105123]
Hi!

The following testcase is miscompiled on ia32.
The problem is that at -O0 we end up with:
  vector(4) short unsigned int _1;
  short unsigned int u.0_3;
...
  _1 = {u.0_3, u.0_3, u.0_3, u.0_3};
statement (dead) which is wrongly expanded.
elt is (subreg:HI (reg:SI 83 [ u.0_3 ]) 0), tmp_mode SImode,
so after convert_mode we start with word (reg:SI 83 [ u.0_3 ]).
The intent is to manually broadcast that value to 2 SImode parts,
but because we pass word as target to expand_simple_binop, it will
overwrite (reg:SI 83 [ u.0_3 ]) and we end up with 0:
   10: {r83:SI=r83:SI<<0x10;clobber flags:CC;}
   11: {r83:SI=r83:SI|r83:SI;clobber flags:CC;}
   12: {r83:SI=r83:SI<<0x10;clobber flags:CC;}
   13: {r83:SI=r83:SI|r83:SI;clobber flags:CC;}
   14: clobber r110:V4HI
   15: r110:V4HI#0=r83:SI
   16: r110:V4HI#4=r83:SI
as the two ors do nothing and two shifts each by 16 left shift it all
away.
The following patch fixes that by using NULL_RTX target, so we expand it as
   10: {r110:SI=r83:SI<<0x10;clobber flags:CC;}
   11: {r111:SI=r110:SI|r83:SI;clobber flags:CC;}
   12: {r112:SI=r83:SI<<0x10;clobber flags:CC;}
   13: {r113:SI=r112:SI|r83:SI;clobber flags:CC;}
   14: clobber r114:V4HI
   15: r114:V4HI#0=r111:SI
   16: r114:V4HI#4=r113:SI
instead.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Another possibility would be to pass NULL_RTX only when word == elt
and word otherwise, where word would necessarily be a pseudo from the first
shift after passing NULL_RTX there once or pass NULL_RTX for the shift and
word for ior.

2022-04-02  Jakub Jelinek

	PR target/105123
	* config/i386/i386-expand.cc (ix86_expand_vector_init_general): Avoid
	using word as target for expand_simple_binop when doing ASHIFT and
	IOR.

	* gcc.target/i386/pr105123.c: New test.

--- gcc/config/i386/i386-expand.cc.jj 2022-03-19 13:52:53.0 +0100
+++ gcc/config/i386/i386-expand.cc 2022-04-01 16:51:27.253154191 +0200
@@ -15830,9 +15830,9 @@ quarter:
           else
             {
               word = expand_simple_binop (tmp_mode, ASHIFT, word, shift,
-                                          word, 1, OPTAB_LIB_WIDEN);
+                                          NULL_RTX, 1, OPTAB_LIB_WIDEN);
               word = expand_simple_binop (tmp_mode, IOR, word, elt,
-                                          word, 1, OPTAB_LIB_WIDEN);
+                                          NULL_RTX, 1, OPTAB_LIB_WIDEN);
             }
         }
 
--- gcc/testsuite/gcc.target/i386/pr105123.c.jj 2022-04-01 16:56:44.549625810 +0200
+++ gcc/testsuite/gcc.target/i386/pr105123.c 2022-04-01 16:56:33.569782511 +0200
@@ -0,0 +1,22 @@
+/* PR target/105123 */
+/* { dg-do run { target sse2_runtime } } */
+/* { dg-options "-msse2" } */
+/* { dg-additional-options "-mtune=i686" { target ia32 } } */
+
+typedef unsigned short __attribute__((__vector_size__ (4 * sizeof (unsigned short)))) V;
+
+V
+foo (unsigned short u, V v)
+{
+  return __builtin_shuffle (u * v, v);
+}
+
+int
+main ()
+{
+  V x = foo (1, (V) { 0, 1, 2, 3 });
+  for (unsigned i = 0; i < 4; i++)
+    if (x[i] != i)
+      __builtin_abort ();
+  return 0;
+}

	Jakub
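The clobbering described in this patch can be modelled in plain C++, as a rough sketch rather than anything resembling the real RTL expander. The function names are hypothetical, and the shared register r83 is modelled by two references to the same variable.

```cpp
#include <cstdint>

// Models the fixed expansion: every intermediate result goes to a fresh
// temporary, so the original element value stays intact.
std::uint32_t broadcast_fresh (std::uint16_t u)
{
  const std::uint32_t elt = u;              // zero-extended element
  const std::uint32_t shifted = elt << 16;  // fresh temporary for the shift
  return shifted | elt;                     // fresh temporary for the ior
}

// Models the miscompile: word and elt name the same storage, and each
// operation writes its result back into that storage.
std::uint32_t broadcast_clobbered (std::uint16_t u)
{
  std::uint32_t reg = u;      // plays the role of r83
  std::uint32_t &word = reg;
  std::uint32_t &elt = reg;   // elt is just another view of the same register
  word = word << 16;          // also destroys elt
  word = word | elt;          // reg | reg, which changes nothing
  return word;
}
```

For an input of 0x1234 the first function yields 0x12341234, while the second yields 0x12340000. Repeating the clobbering step for the second SImode part shifts the remaining bits out as well, which matches the all-zero result described in the mail.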
[Bug middle-end/105135] New: [11/12 Regression] Optimization regression for handrolled branchless assignment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105135

            Bug ID: 105135
           Summary: [11/12 Regression] Optimization regression for
                    handrolled branchless assignment
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: andre.schackier at gmail dot com
  Target Milestone: ---

Given the following source code [godbolt](https://godbolt.org/z/rrP3bqGW7):

```cpp
char to_lower_1(const char c) { return c + ((c >= 'A' && c <= 'Z') * 32); }

char to_lower_2(const char c) { return c + (((c >= 'A') & (c <= 'Z')) * 32); }

char to_lower_3(const char c) {
    if (c >= 'A' && c <= 'Z') {
        return c + 32;
    }
    return c;
}
```

compiling with `-O3` produces the following assembly

```asm
to_lower_1(char):
        lea     eax, [rdi-65]
        cmp     al, 25
        setbe   al
        sal     eax, 5
        add     eax, edi
        ret
to_lower_2(char):
        lea     eax, [rdi-65]
        cmp     al, 25
        setbe   al
        sal     eax, 5
        add     eax, edi
        ret
to_lower_3(char):
        lea     edx, [rdi-65]
        lea     eax, [rdi+32]
        cmp     dl, 26
        cmovnb  eax, edi
        ret
```

Note that gcc-10.3 did produce the same assembly for all 3 functions while
gcc-11 and trunk do not.
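Since the report is about code generation only, it may help to confirm that the three variants really are interchangeable. The sketch below repeats the functions from the report and adds a hypothetical helper, `variants_agree`, that compares them over every `char` value; the helper is not part of the bug report.

```cpp
#include <climits>

char to_lower_1 (const char c) { return c + ((c >= 'A' && c <= 'Z') * 32); }

char to_lower_2 (const char c) { return c + (((c >= 'A') & (c <= 'Z')) * 32); }

char to_lower_3 (const char c)
{
  if (c >= 'A' && c <= 'Z')
    return c + 32;
  return c;
}

// Hypothetical helper: returns true when all three variants return the
// same result for every value a char can hold.
bool variants_agree ()
{
  for (int i = CHAR_MIN; i <= CHAR_MAX; ++i)
    {
      const char c = static_cast<char> (i);
      if (to_lower_1 (c) != to_lower_2 (c) || to_lower_2 (c) != to_lower_3 (c))
        return false;
    }
  return true;
}
```

If the helper returns true, any difference in the generated assembly is purely an optimization matter and not a difference in behaviour.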
Re: [PATCH v4] libgo: Don't use pt_regs member in mcontext_t
Hi Ian,

Thanks for committing a first fix! Unfortunately, your changes don't
work on ppc64le musl: you are still using .regs on ppc64le, but the
include of asm/ptrace.h (as added in the v1 of my patch) is missing.
Hence, your patch fails to compile on ppc64le musl with the following
error message:

	go-signal.c:230:63: error: invalid use of undefined type 'struct pt_regs'
	  230 |         ret.sigpc = ((ucontext_t*)(context))->uc_mcontext.regs->nip;

If you want to continue using .regs on ppc64le, an include of
asm/ptrace.h is needed, since both glibc and musl declare `struct
pt_regs` as an incomplete type (with glibc, asm/ptrace.h seems to be
included indirectly by other headers used by go-signal.c).

See https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587520.html

Would be nice if this could be fixed :)

Sincerely,
Sören

Ian Lance Taylor wrote:
> On Thu, Mar 31, 2022 at 9:41 AM Sören Tempel wrote:
> >
> > Ping.
> >
> > Would be nice to get this integrated since this is one of the changes needed to
> > make gccgo work with musl libc. Let me know if the patch needs to be revised
> > further.
>
> I went with a simpler solution, more verbose but easier to read. Now
> committed to mainline. Please let me know if you have any problems
> with this. Thanks.
>
> Ian
> fad0ecb68c08512ac24852b6d5264cdb9809dc6d
> diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
> index afaccb0e9e6..f93eaf48e28 100644
> --- a/gcc/go/gofrontend/MERGE
> +++ b/gcc/go/gofrontend/MERGE
> @@ -1,4 +1,4 @@
> -7f33baa09a8172bb2c5f1ca0435d9efe3e194c9b
> +45108f37070afb696b069768700e39a269f1fecb
>
>  The first line of this file holds the git revision number of the last
>  merge done from the gofrontend repository.
> diff --git a/libgo/runtime/go-signal.c b/libgo/runtime/go-signal.c
> index 0cb90304730..9c919e1568a 100644
> --- a/libgo/runtime/go-signal.c
> +++ b/libgo/runtime/go-signal.c
> @@ -231,7 +231,14 @@ getSiginfo(siginfo_t *info, void *context __attribute__((unused)))
>  #elif defined(__alpha__) && defined(__linux__)
>  	ret.sigpc = ((ucontext_t*)(context))->uc_mcontext.sc_pc;
>  #elif defined(__PPC__) && defined(__linux__)
> +	// For some reason different libc implementations use
> +	// different names.
> +#if defined(__PPC64__) || defined(__GLIBC__)
>  	ret.sigpc = ((ucontext_t*)(context))->uc_mcontext.regs->nip;
> +#else
> +	// Assumed to be ppc32 musl.
> +	ret.sigpc = ((ucontext_t*)(context))->uc_mcontext.gregs[32];
> +#endif
>  #elif defined(__PPC__) && defined(_AIX)
>  	ret.sigpc = ((ucontext_t*)(context))->uc_mcontext.jmp_context.iar;
>  #elif defined(__aarch64__) && defined(__linux__)
> @@ -347,6 +354,7 @@ dumpregs(siginfo_t *info __attribute__((unused)), void *context __attribute__((u
>  		mcontext_t *m = &((ucontext_t*)(context))->uc_mcontext;
>  		int i;
>
> +#if defined(__PPC64__) || defined(__GLIBC__)
>  		for (i = 0; i < 32; i++)
>  			runtime_printf("r%d %X\n", i, m->regs->gpr[i]);
>  		runtime_printf("pc  %X\n", m->regs->nip);
> @@ -355,6 +363,16 @@ dumpregs(siginfo_t *info __attribute__((unused)), void *context __attribute__((u
>  		runtime_printf("lr  %X\n", m->regs->link);
>  		runtime_printf("ctr %X\n", m->regs->ctr);
>  		runtime_printf("xer %X\n", m->regs->xer);
> +#else
> +		for (i = 0; i < 32; i++)
> +			runtime_printf("r%d %X\n", i, m->gregs[i]);
> +		runtime_printf("pc  %X\n", m->gregs[32]);
> +		runtime_printf("msr %X\n", m->gregs[33]);
> +		runtime_printf("cr  %X\n", m->gregs[38]);
> +		runtime_printf("lr  %X\n", m->gregs[36]);
> +		runtime_printf("ctr %X\n", m->gregs[35]);
> +		runtime_printf("xer %X\n", m->gregs[37]);
> +#endif
>  	}
>  #elif defined(__PPC__) && defined(_AIX)
>  	{
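The compile error reported in this message comes down to dereferencing a pointer to an incomplete type. A small model of that situation is sketched below; all names (`regs_model`, `mcontext_model`, `program_counter`) are hypothetical and are not the real libc or kernel declarations.

```cpp
// Hypothetical miniature of the situation described in this thread.
struct regs_model;                 // incomplete: only the name is known

struct mcontext_model {
  regs_model *regs;                // holding a pointer to it is fine
};

// At this point a function such as
//   unsigned long bad (const mcontext_model &m) { return m.regs->nip; }
// would be rejected, because the members of regs_model are not known yet.

// Plays the role of the header that supplies the full definition.
struct regs_model {
  unsigned long nip;
};

// Now the type is complete, so member access compiles.
unsigned long program_counter (const mcontext_model &m)
{
  return m.regs->nip;
}
```

In the model, moving `program_counter` above the full definition of `regs_model` reproduces the same kind of diagnostic, which is why the include order matters for go-signal.c.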