date:20210606

[Bug fortran/100855] pow run time gfortran vs ifort

2021-06-06 Thread nadavhalahmi560 at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100855

--- Comment #10 from Nadav Halahmi  ---
(In reply to Dominique d'Humieres from comment #9)
> I don't know if the test is coming from a real world problem. The modified
> test
> 
> program power
> implicit none
> 
> real :: sum, sum1, n, q
> integer :: i, j
> integer :: limit
> real :: start, finish
> 
> sum = 0d0
> sum1 = 0d0
> limit = 1
> n = 2.0
> q = 0.5
> call CPU_TIME(start)
> do i=1, limit
> n = n*q
> sum1 = sum1 + (i ** (0.05 + n))
> end do
> do i=1, limit
> sum = sum + (i ** 0.05)
> end do
> sum = sum1 + (limit-1)*sum
> call CPU_TIME(finish)
> print *, sum, n, sum1
> print '("Time = ",f6.3," seconds.")',finish-start
> end program power
> 
> yields
> 
>150945680.   0.   15095.7852
> Time =  0.000 seconds.

What did you try to show here?

[Bug c++/100929] New: gcc fails to optimize less to min for SIMD code

2021-06-06 Thread denis.yaroshevskij at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100929

Bug ID: 100929
   Summary: gcc fails to optimize less to min for SIMD code
   Product: gcc
   Version: og10 (devel/omp/gcc-10)
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: denis.yaroshevskij at gmail dot com
  Target Milestone: ---

Stand alone float - x86 example:
https://godbolt.org/z/vr3cjvY5G

Using a library x86 float, int, aarch64: https://godbolt.org/z/zPP48vzrq

less + blend or greater + blend should become min/max.

[Bug c/100920] bogus warnings with -Wscalar-storage-order

2021-06-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100920

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Eric Botcazou :

https://gcc.gnu.org/g:a589877a0036fc2f66b7a957859940c53efdc7c9

commit r12-1242-ga589877a0036fc2f66b7a957859940c53efdc7c9
Author: Eric Botcazou 
Date:   Sun Jun 6 11:37:45 2021 +0200

Fix thinko in new warning on type punning for storage order purposes

In C, unlike in Ada, the storage order of arrays is that of their component
type, so you need to look at it when deciding to warn.  And the PR complains
about a bogus warning on the assignment of a pointer returned by alloca or
malloc, so this also fixes that.

gcc/c
PR c/100920
* c-decl.c (finish_struct): Fix thinko in previous change.
* c-typeck.c (convert_for_assignment): Do not warn on pointer
assignment and initialization for storage order purposes if the
RHS is a call to a DECL_IS_MALLOC function.
gcc/testsuite/
* gcc.dg/sso-14.c: New test.

[Bug c/100920] bogus warnings with -Wscalar-storage-order

2021-06-06 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100920

Eric Botcazou  changed:

   What|Removed |Added

   Target Milestone|--- |12.0
 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #5 from Eric Botcazou  ---
Thanks for reporting the problem.

[Bug libfortran/98301] random_init() is broken

2021-06-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98301

--- Comment #13 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Andre Vehreschild:

https://gcc.gnu.org/g:002745ca3668fc5e87c22acc81caaeaaadf9c47a

commit r11-8515-g002745ca3668fc5e87c22acc81caaeaaadf9c47a
Author: Andre Vehreschild 
Date:   Sun Jun 6 12:06:31 2021 +0200

PR fortran/98301 - random_init() is broken

Correct implementation of random_init() when -fcoarray=lib is given.
Backport from mainline.

2021-06-06  Andre Vehreschild  
Steve Kargl  

gcc/fortran/ChangeLog:

PR fortran/98301
* trans-decl.c (gfc_build_builtin_function_decls): Move decl.
* trans-intrinsic.c (conv_intrinsic_random_init): Use bool for
lib-call of caf_random_init instead of logical (4-byte).
* trans.h: Add tree var for random_init.

libgfortran/ChangeLog:

PR fortran/98301
* caf/libcaf.h (_gfortran_caf_random_init): New function.
* caf/single.c (_gfortran_caf_random_init): New function.
* gfortran.map: Added fndecl.
* intrinsics/random_init.f90: Implement random_init.

[Bug target/100930] New: PPC: Missing builtins for P9 vextsb2w, vextsb2w, vextsb2d, vextsh2d, vextsw2d

2021-06-06 Thread jens.seifert at de dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100930

Bug ID: 100930
   Summary: PPC: Missing builtins for P9 vextsb2w, vextsb2w,
vextsb2d, vextsh2d, vextsw2d
   Product: gcc
   Version: 8.3.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jens.seifert at de dot ibm.com
  Target Milestone: ---

Using the same names as xlC would be appreciated:
vec_extsbd, vec_extsbw, vec_extshd, vec_extshw, vec_extswd

[Bug rtl-optimization/40772] generating rendundant moves from second byte of 32b/64b register

2021-06-06 Thread roger at nextmovesoftware dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40772

Roger Sayle  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 CC||roger at nextmovesoftware dot com
   Target Milestone|--- |7.0
 Status|UNCONFIRMED |RESOLVED

--- Comment #5 from Roger Sayle  ---
This issue has been fixed since gcc 7; the compiler now stores the high-byte
register ah/bh/dh etc directly to memory.  The original tst2b.c testcase when
compiled with -O3 -march=k8 -fno-tree-vectorize looks like:
test:
.LFB0:
        .cfi_startproc
        leal    1(%rdi), %edx
        movl    %edi, %eax
        movb    %ah, data(%rip)
        addl    $15, %eax
        movb    %dh, data+1(%rip)
        leal    2(%rdi), %edx
        movb    %ah, data+15(%rip)
        movb    %dh, data+2(%rip)
        leal    3(%rdi), %edx
        movb    %dh, data+3(%rip)
        leal    4(%rdi), %edx
        movb    %dh, data+4(%rip)
        leal    5(%rdi), %edx
        movb    %dh, data+5(%rip)
        leal    6(%rdi), %edx
        movb    %dh, data+6(%rip)
        leal    7(%rdi), %edx
        movb    %dh, data+7(%rip)
        leal    8(%rdi), %edx
        movb    %dh, data+8(%rip)
        leal    9(%rdi), %edx
        movb    %dh, data+9(%rip)
        leal    10(%rdi), %edx
        movb    %dh, data+10(%rip)
        leal    11(%rdi), %edx
        movb    %dh, data+11(%rip)
        leal    12(%rdi), %edx
        movb    %dh, data+12(%rip)
        leal    13(%rdi), %edx
        movb    %dh, data+13(%rip)
        leal    14(%rdi), %edx
        movb    %dh, data+14(%rip)
        ret

[Bug bootstrap/29482] libcpp/configure - no usable dependency style found

2021-06-06 Thread nicolas at debian dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29482

Nicolas Boulenguez  changed:

   What|Removed |Added

 CC||nicolas at debian dot org

--- Comment #9 from Nicolas Boulenguez  ---
Hello.

I had the failure with GCC-10.2.1, only when running `autoreconf -f -i .
fixincludes gcc subdirs...` before `./configure`.

For each subdir in turn, autoreconf checks if the subdirectory uses libtool or
automake.  If so, it installs depcomp in . (../ from the subdir), else removes
./depcomp (breaking the build of other subdirectories).

Changing the order of autoreconf arguments so that the last one depends on
automake fixed the problem for me.

I am not sure if this is a bug, or where to report it, but documenting the
work-around here may be useful to other GCC users.

[Bug fortran/100907] Bind(c): failure handling wide character

2021-06-06 Thread dominiq at lps dot ens.fr via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100907

--- Comment #5 from Dominique d'Humieres  ---
> It seems that Mac OS doesn't have the full set of C11 standard headers... :-(

Shouldn't the C11 standard headers be provided by GCC12?

Nevertheless the test compiles with the new version of the new C companion.
The same is true for 100910 and 100914.

[Bug driver/69471] "-march=native" unintentionally breaks further -march/-mtune flags

2021-06-06 Thread hjl.tools at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69471

H.J. Lu  changed:

   What|Removed |Added

   Target Milestone|9.5 |9.3
 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #20 from H.J. Lu  ---
Fixed in GCC 9.3 and above.  GCC 8 branch is closed.

[Bug target/100931] New: [x86-64] Failure to optimize 2 32-bit stores converted to a 64-bit store into using movabs instead of loading from a constant

2021-06-06 Thread gabravier at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100931

Bug ID: 100931
   Summary: [x86-64] Failure to optimize 2 32-bit stores converted
to a 64-bit store into using movabs instead of loading
from a constant
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gabravier at gmail dot com
  Target Milestone: ---

void g(int *p) {
  *p = 2;
  p[1] = 2;
}

void h(long long *p)
{
  *p = 0x200000002;
}

g compiles to this on GCC on plenty of architectures:

g(int*):
  mov rax, QWORD PTR .LC0[rip]
  mov QWORD PTR [rdi], rax
  ret

.LC0:
  .long 2
  .long 2

h is equivalent to g (aliasing notwithstanding) and instead compiles to this:

h(long long*):
  movabs rax, 8589934594
  mov QWORD PTR [rdi], rax
  ret

g has been compiled differently from h since GCC 10. 

I'm somewhat doubtful about filing this bug, actually. I personally think that h
will be faster and that g is simply a regression from GCC 9, but I can't really
be sure there isn't some architecture-specific reason to use a separate
constant, especially since this transformation seems to occur only on specific
architectures (generic, core2, nehalem, westmere, sandybridge, ivybridge,
haswell, broadwell, znver1, znver2 and znver3).
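
To illustrate why g and h should end up storing the same bits, here is a minimal C sketch. The helper names pack_two_u32 and two_stores_as_u64 are made up for this illustration and do not come from the report. The sketch only checks the arithmetic (two 32-bit stores of 2 versus one 64-bit store of 8589934594); it says nothing about which instruction sequence is faster.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Value that a little-endian 64-bit load would see after storing
   `lo` at p[0] and `hi` at p[1] (hypothetical helper). */
uint64_t pack_two_u32(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Perform the two 32-bit stores into a buffer, then read the buffer
   back as a single 64-bit value.  The result depends on endianness in
   general, but not when both halves are equal, as in g(). */
uint64_t two_stores_as_u64(uint32_t first, uint32_t second)
{
    uint32_t words[2] = { first, second };
    uint64_t combined;

    assert(sizeof combined == sizeof words);
    memcpy(&combined, words, sizeof combined);
    return combined;
}
```

With both halves equal to 2, pack_two_u32(2, 2) and two_stores_as_u64(2, 2) both give 0x200000002, which is the decimal 8589934594 that appears in the movabs above.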

[Bug c++/100929] gcc fails to optimize less to min for SIMD code

2021-06-06 Thread glisse at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100929

--- Comment #1 from Marc Glisse  ---
Please attach your testcases to the bug report. godbolt links are nice
complements, but not considered sufficient here.

We don't lower the comparison or the blend in GIMPLE (yet). I think Hongtao Liu
is doing blends right now. I don't know if there would be issues for
comparisons (with -ftrapping-math for instance?).

If you write (x

[Bug target/100929] gcc fails to optimize less to min for SIMD code

2021-06-06 Thread glisse at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100929

Marc Glisse  changed:

   What|Removed |Added

Version|og10 (devel/omp/gcc-10) |11.1.0
   Keywords||missed-optimization
  Component|c++ |target
   Severity|normal  |enhancement
 Target||x86_64-*-*

[Bug middle-end/100593] [ELF] -fno-pic: Use GOT to take address of an external default visibility function

2021-06-06 Thread hjl.tools at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100593

H.J. Lu  changed:

   What|Removed |Added

 CC||hjl.tools at gmail dot com

--- Comment #12 from H.J. Lu  ---
We should handle it in the whole Linux software stack:

https://gitlab.com/x86-psABIs/x86-64-ABI/-/issues/8

not just in compiler.

[Bug rtl-optimization/95405] Unnecessary stores with std::optional

2021-06-06 Thread gabravier at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95405

Gabriel Ravier  changed:

   What|Removed |Added

 CC||gabravier at gmail dot com

--- Comment #2 from Gabriel Ravier  ---
Welp, I've tried to convert this to a simplified form, but I can't seem to get
the same output regardless of how close I get in terms of GIMPLE output.

With this code:

struct opbeb {};

union opbs {
    opbeb empty_byte;
    long value;
};

struct opb {
    opbs payload;
    bool engaged;
};

struct op : public opb {
};

struct ob {
    op payload;
};

struct o {
    ob base;
};

o foo();

long bar()
{
    struct o r = foo();
    if (__builtin_expect_with_probability((*(const ob *)&r).payload.engaged
            != 0, 1, .66))
        return (long &)*(long *)&r;
    else
        return 0;
}

I get this final GIMPLE (i.e. -fdump-tree-optimized):

;; Function bar (_Z3barv, funcdef_no=9255, decl_uid=109154, cgraph_uid=6606,
symbol_order=6814)

Removing basic block 5
long int bar ()
{
  struct o r;
  bool _1;
  long int _4;
  long int _7;

  <bb 2> [local count: 1073741824]:
  r = foo ();
  _1 = MEM[(const struct ob *)&r].payload.D.109140.engaged;
  if (_1 != 0)
    goto <bb 3>; [66.00%]
  else
    goto <bb 4>; [34.00%]

  <bb 3> [local count: 708669601]:
  _7 = MEM[(long int &)&r];

  <bb 4> [local count: 1073741824]:
  # _4 = PHI <_7(3), 0(2)>
  r ={v} {CLOBBER};
  return _4;

}

Which seems to be almost exactly identical to the one I get from the real
std::optional:

;; Function bar (_Z3barv, funcdef_no=6084, decl_uid=49565, cgraph_uid=5869,
symbol_order=5916)

Removing basic block 5
long int bar ()
{
  struct optional r;
  long int _1;
  bool _4;
  long int _5;

  <bb 2> [local count: 1073741824]:
  r = foo ();
  _4 = MEM[(const struct _Optional_base *)&r]._M_payload.D.50442._M_engaged;
  if (_4 != 0)
    goto <bb 3>; [66.00%]
  else
    goto <bb 4>; [34.00%]

  <bb 3> [local count: 708669601]:
  _5 = MEM[(long int &)&r];

  <bb 4> [local count: 1073741824]:
  # _1 = PHI <_5(3), 0(2)>
  r ={v} {CLOBBER};
  return _1;

}

Literally the only differences I can see are that variables are declared in a
different order, and that some variable names are different.

Yet the assembly output for my version optimizes the store to memory away just
fine, and the std::optional output still fails to optimize the store to memory.

Is the (very minor) difference here really this significant, or is there
something I can't see in the GIMPLE output that results in the differences? I
tried to delve into the RTL, though I failed to really understand what was
going on (though I could see significant differences between what I wrote and
the original example there).
I've also checked the assembly, and as far as I can see, there is no functional
difference between what I wrote and the original one; LLVM even produces the
exact same assembly for both.

I've also tried to rule out the difference in variable declaration placement
and naming by rewriting what I wrote into GIMPLE and modifying it to
correspond to the original example as well as possible, with this being my best
effort:

long int __GIMPLE (ssa,guessed_local(1073741824))
bar ()
{
  struct o r;
  long int _1;
  bool _4;
  long int _7;

  __BB(2,guessed_local(1073741824)):
  r = foo ();
  _4 = __MEM  ((const struct ob *)&r).payload.base.engaged;
  if (_4 != _Literal (bool) 0)
goto __BB3(guessed(88583700));
  else
goto __BB4(guessed(45634028));

  __BB(3,guessed_local(708669601)):
  _7 = __MEM  (&r);
  goto __BB4(precise(134217728));

  __BB(4,guessed_local(1073741824)):
  _1 = __PHI (__BB3: _7, __BB2: 0l);
  r = _Literal (struct o) {};
  return _1;

}

But it still gets optimized well, as expected, unlike the original, which is
rather mind-boggling to me, unless there really is a bunch of GIMPLE
information that isn't part of the outputted form.

PS: LLVM optimizes the original example and what I wrote perfectly fine to the
same assembly code.

[Bug fortran/100907] Bind(c): failure handling wide character

2021-06-06 Thread jrfsousa at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100907

--- Comment #6 from José Rui Faustino de Sousa  ---
> Shouldn't the C11 standard headers be provided by GCC12?
> 

AFAIK gcc uses the system's libc. On Linux the default will be GNU libc "glibc";
on Mac OS the default libc will be BSD libc, which is missing some of the
headers... Or so it says in the GNU portability library "gnulib" documentation...

Thank you very much.

Best regards,
José Rui

[Bug target/100909] [12 Regression] powerpc64le: Regression causing unexpected error with IBM long double

2021-06-06 Thread marxin at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100909

Martin Liška  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

--- Comment #2 from Martin Liška  ---
Mine, I've got a patch for it.

[Bug other/100932] New: autoconf error: possibly undefined macro: GCC_AC_ENABLE_DECIMAL_FLOAT

2021-06-06 Thread nicolas at debian dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100932

Bug ID: 100932
   Summary: autoconf error: possibly undefined macro:
GCC_AC_ENABLE_DECIMAL_FLOAT
   Product: gcc
   Version: 11.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: nicolas at debian dot org
  Target Milestone: ---

Hello.
When I attempt to autoreconf(2.69) the gcc/ subdirectory of 10.2.1 or 11.1.0, I
get:
  configure.ac:886: error: possibly undefined macro:
GCC_AC_ENABLE_DECIMAL_FLOAT
  configure.ac:1499: error: possibly undefined macro:
GCC_AC_FUNC_MMAP_BLACKLIST
There is a slight possibility that the error is caused by local patches (Debian
experimental), but this trivial change fixes the issue:

--- a/src/gcc/configure.ac
+++ b/src/gcc/configure.ac
@@ -25,6 +25,7 @@

 AC_INIT
 AC_CONFIG_SRCDIR(tree.c)
+AC_CONFIG_MACRO_DIRS(../config)
 AC_CONFIG_HEADER(auto-host.h:config.in)

 gcc_version=`cat $srcdir/BASE-VER`

The documentation seems to recommend AC_CONFIG_MACRO_DIRS anyway.

[Bug c++/67829] Bogus "ambiguous template instantiation" error with partial specializations involving a template template parameter

2021-06-06 Thread ppalka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67829

Patrick Palka  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |ppalka at gcc dot gnu.org
 CC||ppalka at gcc dot gnu.org

[Bug other/100933] New: install cannot stat include-fixed/limits.h

2021-06-06 Thread nicolas at debian dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100933

Bug ID: 100933
   Summary: install cannot stat include-fixed/limits.h
   Product: gcc
   Version: 10.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: nicolas at debian dot org
  Target Milestone: ---

Hello.

I have been bitten by the exact bug described at:
https://gcc.gnu.org/legacy-ml/gcc/2013-04/msg00171.html

The work-around described there worked for me: run 'make && make install'
directly instead of via wrappers (dh_auto_build and dh_auto_install) that parse
'make -n' before normal operation.

The issue seems difficult, but please at least provide a hint in the error
message at install time.  Without this post, I would probably never have found
a work-around.

Thanks.

[Bug middle-end/100593] [ELF] -fno-pic: Use GOT to take address of an external default visibility function

2021-06-06 Thread i at maskray dot me via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100593

--- Comment #13 from Fangrui Song  ---
(In reply to H.J. Lu from comment #12)
> We should handle it in the whole Linux software stack:
> 
> https://gitlab.com/x86-psABIs/x86-64-ABI/-/issues/8
> 
> not just in compiler.

It is great that you have the desire to fix these fundamental issues :)

I think a GNU_PROPERTY marker is over-engineering. See
https://gitlab.com/x86-psABIs/x86-64-ABI/-/issues/8 for details. Many things
(including this and PR98112) can be changed today. Once
-fno-direct-access-external-data/-fno-direct-access-external-function becomes
the prevailing -fno-pic default, make ld warn by default about
R_*_COPY/canonical PLT entries. After a while (say one or two years), let glibc
ld.so warn about R_*_COPY/canonical PLT entries.

[Bug c/100902] pointer attachment issues

2021-06-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100902

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:7fa4db39b6bcd207bd2b52023ff6b155bd15

commit r12-1246-g7fa4db39b6bcd207bd2b52023ff6b155bd15
Author: Jakub Jelinek 
Date:   Sun Jun 6 19:37:06 2021 +0200

openmp: Call c_omp_adjust_map_clauses even for combined target [PR100902]

When looking at in_reduction support for target, I've noticed that
c_omp_adjust_map_clauses is not called for the combined target case.

The following patch fixes it.

Unfortunately, there are other issues.

One is (also mentioned in the PR) that currently the pointer attachment
stuff seems to be clause ordering dependent (the standard says that clause
ordering on the same construct does not matter), the baz and qux cases
in the PR are rejected while when swapped it is accepted.
Note, the order of clauses in GCC really is treated as insignificant
initially and only later on the compiler can adjust the ordering (e.g. when
we sort map clauses based on what they refer to etc.) and in particular,
clauses from parsing is reverse of the order in user code, while
c_omp_split_clauses performed for combined/composite constructs typically
reverses that ordering, i.e. makes it follow the user code ordering.

And another one is I'm slightly afraid c_omp_adjust_map_clauses might
misbehave in templates, though haven't tried to verify it with testcases.
When processing_template_decl, the non-dependent clauses will be handled
usually the same as when not in a template, but dependent clauses aren't
processed or only limited processing is done there, and rest is deferred
till later.  From quick skimming of c_omp_adjust_map_clauses, it seems
it might not be very happy about non-processed map clauses that might
still have the TREE_LIST representation of array sections, or might
not have finalized decls or base decls etc.
So, for this I wonder if cp_parser_omp_target (and other cp/parser.c
callers of c_omp_adjust_map_clauses) shouldn't call it only
if (!processing_template_decl) - perhaps you could add
cp_omp_adjust_map_clauses wrapper that would be
if (!processing_template_decl)
  c_omp_adjust_map_clauses (...);
- and call c_omp_adjust_map_clauses from within pt.c after the clauses
are tsubsted and finish_omp_clauses is called again.

2021-06-06  Jakub Jelinek  

PR c/100902
* c-parser.c (c_parser_omp_target): Call c_omp_adjust_map_clauses
even when target is combined with other constructs.

* parser.c (cp_parser_omp_target): Call c_omp_adjust_map_clauses
even when target is combined with other constructs.

* c-c++-common/gomp/pr100902-1.c: New test.

[Bug rtl-optimization/95405] Unnecessary stores with std::optional

2021-06-06 Thread glisse at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95405

--- Comment #3 from Marc Glisse  ---
For a self-contained version, see below. Notice how the extra constructor in
_Optional_payload_base changes the generated code, or storing directly a
_Optional_payload_base instead of _Optional_payload in optional

struct _Optional_payload_base {
  long _M_value;
  bool _M_engaged = false;
  _Optional_payload_base() = default;
  ~_Optional_payload_base() = default;
  _Optional_payload_base(const _Optional_payload_base&) = default;
  _Optional_payload_base(_Optional_payload_base&&) = default;

  _Optional_payload_base(double,float);
};

struct _Optional_payload : _Optional_payload_base { };

struct optional
{
  _Optional_payload _M_payload;
};

optional foo();
long bar()
{
  auto r = foo();
  if (r._M_payload._M_engaged)
    return r._M_payload._M_value;
  else
    return 0L;
}

[Bug rtl-optimization/95405] Unnecessary stores with std::optional

2021-06-06 Thread gabravier at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95405

--- Comment #4 from Gabriel Ravier  ---
Ah, I see. Didn't think there was a constructor involved and/or that GIMPLE
would keep it implicit like this...

[Bug rtl-optimization/95405] Unnecessary stores with std::optional

2021-06-06 Thread glisse at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95405

--- Comment #5 from Marc Glisse  ---
GIMPLE doesn't know about calling conventions, that's something that only
"appears" during expansion to RTL.
Still, I don't claim to understand what is going on here.

[Bug target/100929] gcc fails to optimize less to min for SIMD code

2021-06-06 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100929

--- Comment #2 from Andrew Pinski  ---
Original x86_64 testcase:

#include <immintrin.h>

__m256 if_else(__m256 x, __m256 y) {
  __m256 mask = _mm256_cmp_ps(y, x, _CMP_LT_OQ);
  return _mm256_blendv_ps(x, y, mask);
}

__m256 min(__m256 x, __m256 y) {
  return _mm256_min_ps(x, y);
}

----- CUT -----
Note that the other testcase uses eve, and I have no idea where that comes
from.
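
As a scalar illustration of the pattern in this testcase, here is a minimal C sketch, not a statement about what GCC does internally. select_less models the compare-plus-blend in if_else for one lane, and minps_model models the x86 MINPS rule as I understand it from the Intel manual (the second operand is returned unless the first is strictly less). All function names are hypothetical. Under that assumption the select matches min with the operands swapped, i.e. min(y, x), for every input including NaN and signed zeros, while min(x, y) can differ when a NaN is involved.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* One lane of: mask = cmp(y, x, LT); return blendv(x, y, mask); */
float select_less(float x, float y)
{
    return (y < x) ? y : x;
}

/* Assumed scalar model of MINPS: second operand unless a < b. */
float minps_model(float a, float b)
{
    return (a < b) ? a : b;
}

/* Equality that treats NaN as equal to NaN and tells +0.0 from -0.0. */
static bool same_float(float a, float b)
{
    if (isnan(a) || isnan(b))
        return isnan(a) && isnan(b);
    return a == b && (signbit(a) != 0) == (signbit(b) != 0);
}

/* Count the (x, y) pairs where select_less(x, y) and
   minps_model(y, x) disagree. */
int count_mismatches(const float *vals, int n)
{
    int mismatches = 0;

    assert(vals != 0 && n > 0);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (!same_float(select_less(vals[i], vals[j]),
                            minps_model(vals[j], vals[i])))
                mismatches++;
    return mismatches;
}
```

For example, select_less(NAN, 1.0f) is NaN while minps_model(NAN, 1.0f) is 1.0f, so the operand order matters if the transformation is to be valid without -ffast-math style assumptions.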

[Bug target/100931] [x86-64] Failure to optimize 2 32-bit stores converted to a 64-bit store into using movabs instead of loading from a constant

2021-06-06 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100931

--- Comment #1 from Andrew Pinski  ---
SLP is happening.
This is just a cost model issue as -mtune=intel works.

[Bug bootstrap/100932] autoconf error: possibly undefined macro: GCC_AC_ENABLE_DECIMAL_FLOAT

2021-06-06 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100932

Andrew Pinski  changed:

   What|Removed |Added

  Component|other   |bootstrap
   Keywords||build

--- Comment #1 from Andrew Pinski  ---
I suspect most people already do this routinely:
autoconf -I../config

[Bug bootstrap/100933] install cannot stat include-fixed/limits.h

2021-06-06 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100933

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WONTFIX
  Component|other   |bootstrap

--- Comment #1 from Andrew Pinski  ---
I doubt we are going to fix "make -n", as it is just a tool for debugging
makefiles rather than something that should actually be used.

[Bug tree-optimization/100934] New: wrong code at -O3 on x86_64-linux-gnu

2021-06-06 Thread zhendong.su at inf dot ethz.ch via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100934

Bug ID: 100934
   Summary: wrong code at -O3 on x86_64-linux-gnu
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zhendong.su at inf dot ethz.ch
  Target Milestone: ---

It seems to affect all versions since GCC 8.4 (but not GCC 8.3). 

[583] % gcctk -v
Using built-in specs.
COLLECT_GCC=gcctk
COLLECT_LTO_WRAPPER=/local/suz-local/software/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/12.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk/configure --disable-bootstrap
--prefix=/local/suz-local/software/local/gcc-trunk --enable-languages=c,c++
--disable-werror --enable-multilib --with-system-zlib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 12.0.0 20210606 (experimental) [master revision
28c62475050:a6bc26893ec:a589877a0036fc2f66b7a957859940c53efdc7c9] (GCC) 
[584] % 
[584] % gcctk -O2 small.c; ./a.out
[585] % 
[585] % gcctk -O3 small.c
[586] % ./a.out
Segmentation fault
[587] % 
[587] % cat small.c
int a, b, c, d, e;
int main() {
  int f = 0, g = 0;
  for (; f < 2; f++) {
    int h, i;
    for (h = 0; h < 2; h++) {
      b = e = g ? a % g : 0;
      c = d;
      for (i = 0; i < 1; i++)
        g = 0;
      for (; g < 2; g++)
        ;
    }
  }
  return 0;
}

[Bug tree-optimization/100923] wrong code at -O2 and above on x86_64-linux-gnu

2021-06-06 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100923

Andrew Pinski  changed:

   What|Removed |Added

Summary|wrong code at -Os and above |wrong code at -O2 and above
   |on x86_64-linux-gnu |on x86_64-linux-gnu
 Ever confirmed|0   |1
   Last reconfirmed||2021-06-06
 Status|UNCONFIRMED |NEW
   Keywords||alias, wrong-code

--- Comment #2 from Andrew Pinski  ---
- working (-O2 -fno-strict-aliasing)
+ not working (-O2 -fstrict-aliasing)

-  l.1_3 = l;
-  e.2_5 = e;
-  f.3_6 = f;
-  *e.2_5 = f.3_6;
-  _7 = *l.1_3;
-  if (_7 != 0)

+  l.1_4 = l;
+  _5 = *l.1_4;
+  e.2_6 = e;
+  f.3_7 = f;
+  *e.2_6 = f.3_7;
+  if (_5 != 0)

So we swapped around the store to *e and the load from *l.
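
A minimal C sketch of why that swap matters, assuming the two pointers may legitimately alias. The testcase itself is not shown in this comment, so the int types and the function names below are assumptions made for illustration only; they merely mirror the two orderings in the dumps above.

```c
#include <assert.h>
#include <stddef.h>

/* Order from the working dump: store through e, then load through l. */
int store_then_load(int *e, const int *l, int f)
{
    assert(e != NULL && l != NULL);
    *e = f;
    return *l;
}

/* Order from the non-working dump: load through l, then store through e. */
int load_then_store(int *e, const int *l, int f)
{
    assert(e != NULL && l != NULL);
    int loaded = *l;
    *e = f;
    return loaded;
}
```

When e and l point to different objects both functions return the same value, but when they point to the same object the first returns the freshly stored f and the second returns the stale value, so the reordering is only valid if the compiler can prove the two accesses do not alias.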

[Bug tree-optimization/100923] [9/10/11/12 Regression] wrong code at -O2 and above on x86_64-linux-gnu

2021-06-06 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100923

Andrew Pinski  changed:

   What|Removed |Added

Summary|wrong code at -O2 and above |[9/10/11/12 Regression]
   |on x86_64-linux-gnu |wrong code at -O2 and above
   ||on x86_64-linux-gnu
   Target Milestone|--- |9.5

[Bug target/100930] PPC: Missing builtins for P9 vextsb2w, vextsb2w, vextsb2d, vextsh2d, vextsw2d

2021-06-06 Thread wschmidt at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100930

Bill Schmidt  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2021-06-06
 Ever confirmed|0   |1
 CC||bergner at gcc dot gnu.org,
   ||segher at gcc dot gnu.org

--- Comment #1 from Bill Schmidt  ---
Hi Jens,

The old xlC names are nonstandard.  The agreed-upon names between GCC and
OpenXL are vec_signexti, vec_signextll, and vec_signextq, having result type
vector signed int, vector signed long long, and vector signed __int128,
respectively.  vec_signextq is available only for P10.

Unfortunately these aren't yet implemented (their absence was discovered not
too long ago), so we still have work to do here. :(

Confirmed.
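
For readers unfamiliar with these instructions, here is a scalar C sketch of what the requested operations compute per element, assuming (as I read the ISA 3.0 description) that vextsb2w and vextsh2w sign-extend the low-order byte or halfword of each word element. The function names are hypothetical and are not the builtin names discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sign-extend the low 8 bits of a 32-bit element to 32 bits. */
int32_t sign_extend_byte_to_word(uint32_t word)
{
    uint32_t low = word & 0xFFu;
    /* Flip the sign bit, then subtract it: avoids relying on
       implementation-defined narrowing conversions. */
    return (int32_t)(low ^ 0x80u) - 0x80;
}

/* Sign-extend the low 16 bits of a 32-bit element to 32 bits. */
int32_t sign_extend_half_to_word(uint32_t word)
{
    uint32_t low = word & 0xFFFFu;
    return (int32_t)(low ^ 0x8000u) - 0x8000;
}

/* Apply the byte-to-word extension to each of four word lanes. */
void extend_bytes_to_words(const uint32_t in[4], int32_t out[4])
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < 4; i++)
        out[i] = sign_extend_byte_to_word(in[i]);
}
```

For instance, an element holding 0x123456FF becomes -1 after the byte-to-word extension, since only the low byte 0xFF contributes.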

[Bug tree-optimization/100923] [9/10/11/12 Regression] wrong code at -O2 and above on x86_64-linux-gnu

2021-06-06 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100923

--- Comment #3 from Andrew Pinski  ---
So FRE thinks:
  *e.2_6 = f.3_7;

Does not modify:

  _9 = *l.1_4;

[Bug tree-optimization/100934] [9/10/11/12 Regression] wrong code at -O3 during unrolling

2021-06-06 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100934

Andrew Pinski  changed:

   What|Removed |Added

 Target||x86_64-linux-gnu
 Status|UNCONFIRMED |NEW
Summary|wrong code at -O3 on|[9/10/11/12 Regression]
   |x86_64-linux-gnu|wrong code at -O3 during
   ||unrolling
   Keywords||wrong-code
   Last reconfirmed||2021-06-06
  Known to fail||12.0
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
The complete unroller is adding a __builtin_unreachable and then that becomes
the only thing.

[Bug d/100935] New: d: T.alignof ignores explicit align(N) type alignment

2021-06-06 Thread ibuclaw at gdcproject dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100935

Bug ID: 100935
   Summary: d: T.alignof ignores explicit align(N) type alignment
   Product: gcc
   Version: 9.4.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: d
  Assignee: ibuclaw at gdcproject dot org
  Reporter: ibuclaw at gdcproject dot org
  Target Milestone: ---

T.alignof currently always returns the natural alignment of a type:

align(8) struct Aligned { int a; }
static assert(Aligned.alignof == 8); // fails, 4
align(1) struct Packed { int a; }
static assert(Packed.alignof == 1);  // fails, 4
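
For comparison only, a rough C analogue of the expected behaviour is sketched below. This is not D and says nothing about how gdc implements T.alignof; it assumes a C11 compiler and uses the GNU packed attribute, and the accessor function names are made up for this example.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Over-aligned struct: the member alignment raises the struct's alignment. */
struct Aligned { alignas(8) int a; };

/* GNU extension: a packed struct has alignment 1. */
struct Packed { int a; } __attribute__((packed));

static_assert(alignof(struct Aligned) == 8, "explicit alignment honoured");
static_assert(alignof(struct Packed) == 1, "packed alignment honoured");

size_t aligned_alignment(void) { return alignof(struct Aligned); }
size_t packed_alignment(void)  { return alignof(struct Packed); }
```

Here the alignment query reflects the explicit request in both cases, which is the behaviour the static asserts in the D example expect.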

[Bug target/100936] New: %p and %P modifiers should not emit segment overrides

2021-06-06 Thread ubizjak at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100936

Bug ID: 100936
   Summary: %p and %P modifiers should not emit segment overrides
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ubizjak at gmail dot com
  Target Milestone: ---

The following testcase:

--cut here--
__seg_gs int var = 123;

static int
*foo (void)
{
  int *addr;

  asm ("lea %p1, %0" : "=r"(addr) : "m"(var));

  return addr;
}

static int
bar (int *addr)
{
  int val;

  asm ("mov %%gs:%1, %0" : "=r"(val) : "m"(*addr));

  return val;
}

int
baz (void)
{
  int *addr = foo();
  int val = bar (addr);

  return val;
}
--cut here--

emits an assembler warning when compiled for an x86 target:

gcc -O2 -c lea.c
lea.c: Assembler messages:
lea.c:8: Warning: segment override on `lea' is ineffectual

$ objdump -d lea.o

lea.o: file format elf64-x86-64


Disassembly of section .text:

 :
   0:   65 48 8d 04 25 00 00    lea    %gs:0x0,%rax
   7:   00 00
   9:   65 8b 00                mov    %gs:(%rax),%eax
   c:   c3                      retq

The problem is with the %p operand modifier, which should emit the raw symbol name:

   P -- if PIC, print an @PLT suffix.  For -fno-plt, load function
address from GOT.
   p -- print raw symbol name.

but it also emits the symbol's segment override. As the example above shows,
this makes it impossible to use LEA to load the symbol's address into a register.

There is a similar problem with the %P modifier when trying to CALL or JMP to
an overridden symbol, e.g.:

call %gs:zzz
jmp %gs:zzz

call.s:1: Warning: skipping prefixes on `call'
call.s:2: Warning: skipping prefixes on `jmp'

[Bug target/100936] %p and %P modifiers should not emit segment overrides

2021-06-06 Thread ubizjak at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100936

--- Comment #1 from Uroš Bizjak  ---
Proposed patch:

--cut here--
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 04649b42122..0773a4a9ba8 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13531,7 +13531,7 @@ ix86_print_operand_punct_valid_p (unsigned char code)

 static void
 ix86_print_operand_address_as (FILE *file, rtx addr,
-  addr_space_t as, bool no_rip)
+  addr_space_t as, bool raw)
 {
   struct ix86_address parts;
   rtx base, index, disp;
@@ -13570,7 +13570,7 @@ ix86_print_operand_address_as (FILE *file, rtx addr,
   else
 gcc_assert (ADDR_SPACE_GENERIC_P (parts.seg));

-  if (!ADDR_SPACE_GENERIC_P (as))
+  if (!ADDR_SPACE_GENERIC_P (as) && !raw)
 {
   if (ASSEMBLER_DIALECT == ASM_ATT)
putc ('%', file);
@@ -13589,7 +13589,7 @@ ix86_print_operand_address_as (FILE *file, rtx addr,
 }

   /* Use one byte shorter RIP relative addressing for 64bit mode.  */
-  if (TARGET_64BIT && !base && !index && !no_rip)
+  if (TARGET_64BIT && !base && !index && !raw)
 {
   rtx symbol = disp;

--cut here--

[Bug target/100929] gcc fails to optimize less to min for SIMD code

2021-06-06 Thread denis.yaroshevskij at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100929

--- Comment #3 from Denis Yaroshevskiy  ---
> Please attach your testcases to the bug report.

Is what @Andrew Pinski copied enough? I can attach the same code as a file.

> I don't know if there would be issues for comparisons (with -ftrapping-math 
> for instance?).

-ftrapping-math causes clang to stop doing this optimisation.

I can see that clang does it, so I assume `nans` are OK without this flag. For
ints this is for sure OK.

> Note the other testcase is using eve which I have no idea what it is coming 
> from.

Using eve was just much easier than writing this with intrinsics.

The point was:

vpcmpgtd    ymm2, ymm0, ymm1
vpblendvb   ymm0, ymm0, ymm1, ymm2

should become

vpminsd ymm0, ymm1, ymm0

And on arm:

cmgt    v2.4s, v0.4s, v1.4s
bit v0.16b, v1.16b, v2.16b

should become
smin    v0.4s, v1.4s, v0.4s

And
fcmgt   v2.4s, v0.4s, v1.4s
bit v0.16b, v1.16b, v2.16b

should become
fmin    v0.4s, v1.4s, v0.4s


I don't really know how it is done in `gcc` - but all these examples look like
the same issue. If it is very helpful to write all of them as intrinsics, I
can.
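
The integer case can be sketched in plain C++ without intrinsics. This is only an illustration of the equivalence being asked for, not the attached testcase: the lane count `kLanes` and the names `select_less` and `lane_min` are assumptions, and the loops stand in for what the vector instructions do per lane.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical lane count, e.g. eight 32-bit ints in one ymm register.
constexpr std::size_t kLanes = 8;
using Lanes = std::array<int, kLanes>;

// "less + blend": compute a per-lane mask, then pick a or b from the mask.
Lanes select_less(const Lanes& a, const Lanes& b) {
    Lanes out{};
    for (std::size_t i = 0; i < kLanes; ++i) {
        const bool mask = b[i] < a[i];
        out[i] = mask ? b[i] : a[i];
    }
    return out;
}

// The single operation the compare + blend pair should collapse to.
Lanes lane_min(const Lanes& a, const Lanes& b) {
    Lanes out{};
    for (std::size_t i = 0; i < kLanes; ++i) {
        out[i] = std::min(a[i], b[i]);
    }
    return out;
}

// For integers the two forms agree on every lane.
void check_equivalence() {
    const Lanes a{1, 5, -3, 7, 0, 0, 9, -8};
    const Lanes b{2, 4, -3, -7, 1, -1, 9, 8};
    assert(select_less(a, b) == lane_min(a, b));
}
```

For floating-point lanes the equivalence additionally depends on NaN and trapping behaviour, which is what the -ftrapping-math question is about.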

[Bug driver/100937] New: configure: Add --enable-default-semantic-interposition

2021-06-06 Thread i at maskray dot me via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100937

Bug ID: 100937
   Summary: configure: Add --enable-default-semantic-interposition
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: driver
  Assignee: unassigned at gcc dot gnu.org
  Reporter: i at maskray dot me
  Target Milestone: ---

Add a configure option --enable-default-semantic-interposition to customize
-f(no-)semantic-interposition default.

The suppression of interprocedural optimizations and inlining for such
default visibility non-vague-linkage function definitions is the biggest
difference between -fPIE/-fPIC.

Distributions may want to enable default -fno-semantic-interposition to
reclaim the lost performance from -fPIC (e.g. CPython is said to be 27% faster;
Clang is 3% faster).

[Bug driver/100937] configure: Add --enable-default-semantic-interposition

2021-06-06 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100937

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |WONTFIX
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #1 from Andrew Pinski  ---
NO. This is wrong for many reasons. First it makes portability a pain.

[Bug driver/100937] configure: Add --enable-default-semantic-interposition

2021-06-06 Thread i at maskray dot me via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100937

Fangrui Song  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|WONTFIX                     |---
             Status|RESOLVED                    |UNCONFIRMED

--- Comment #2 from Fangrui Song  ---
How is it a portability problem?

clang -fpic has always been allowing interprocedural optimizations for
non-vague-linkage function definitions. FreeBSD uses clang and software works
with no problem.




For a vague-linkage function definition, a call site in the same
translation unit may inline the callee. Whether
-fno-semantic-interposition is enabled/disabled has no effect.

For a non-vague-linkage function definition, by default
(-fsemantic-interposition) the -fpic mode does not allow a call site
in the same translation unit to inline the callee or perform other
interprocedural optimizations.
-fno-semantic-interposition re-enables interprocedural optimizations.

If a caller inlines a callee, using LD_PRELOAD to interpose the callee
will not affect the caller, but many other LD_PRELOAD usages still work.
We consider this small LD_PRELOAD limitation a good trade-off for the speedup.
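
The two kinds of definitions can be sketched as follows. All function names here are hypothetical and the comments only restate the behaviour described above; the code is not taken from any patch.

```cpp
#include <cassert>

// Non-vague linkage: an ordinary external definition with default visibility.
// Under -fpic with the default -fsemantic-interposition, a call from the same
// translation unit is kept as a real call so that the symbol can still be
// interposed at run time (for example via LD_PRELOAD).
int scale(int x) { return 3 * x; }

// Vague linkage: an inline function. As described above, a call site in the
// same translation unit may inline it regardless of the option.
inline int offset(int x) { return x + 1; }

// Whether scale() is inlined here is what -f(no-)semantic-interposition
// controls; the computed result is the same either way.
int caller(int x) { return scale(x) + offset(x); }

void check_caller() { assert(caller(2) == 9); }
```

Compiling this with `-O2 -fpic` and then again with `-fno-semantic-interposition` added, and comparing the code generated for `caller`, is one way to observe the difference.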

[Bug driver/100937] configure: Add --enable-default-semantic-interposition

2021-06-06 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100937

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |WONTFIX
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #3 from Andrew Pinski  ---
>clang -fpic has always been allowing interprocedural optimizations for 
>non-vague-linkage function definitions. FreeBSD uses clang and software works 
>with no problem.

That does not mean clang is correct here.
clang breaks ELF assumptions and that is all I am going to say.  If you want to
break ELF fine, FreeBSD can break those.  But there is still a portability
issue between distros using different options like this.

[Bug driver/100937] configure: Add --enable-default-semantic-interposition

2021-06-06 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100937

--- Comment #4 from Andrew Pinski  ---
Also your patch did not change the documentation of the option.
Plus the documentation is clear that changing the default is most likely not
wanted at all:
https://gcc.gnu.org/onlinedocs/gcc-11.1.0/gcc/Optimize-Options.html#index-fsemantic-interposition

[Bug libstdc++/100475] semiregular-box's constructor uses wrong list-initialization

2021-06-06 Thread hewillk at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100475

--- Comment #7 from 康桓瑋  ---
(In reply to CVS Commits from comment #6)
> The master branch has been updated by Patrick Palka :
> 
> https://gcc.gnu.org/g:fe993b469c528230d9a01e1ae2208610f960dd9f
> 
> commit r12-856-gfe993b469c528230d9a01e1ae2208610f960dd9f
> Author: Patrick Palka 
> Date:   Tue May 18 00:28:44 2021 -0400
> 
> libstdc++: Fix up semiregular-box partial specialization [PR100475]
> 
> This makes the in-place constructor of our partial specialization of
> __box for already-semiregular types perform
> direct-non-list-initialization
> (in accordance with the specification of the primary template), and
> additionally makes the member function data() use std::__addressof.
> 
> libstdc++-v3/ChangeLog:
> 
> PR libstdc++/100475
> * include/std/ranges (__box::__box): Use non-list-initialization
> in member initializer list of in-place constructor of the
> partial specialization for semiregular types.
> (__box::operator->): Use std::__addressof.
> * testsuite/std/ranges/adaptors/detail/semiregular_box.cc
> (test02): New test.
> * testsuite/std/ranges/single_view.cc (test04): New test.


I think that even list-initialization with a single parameter should be changed
to direct-non-list-initialization to avoid bugs in some uncommon situations.



#include <ranges>

struct S {
  S() = default;
  S(std::initializer_list) = delete;
  S(const S&) {}
};
S obj;

auto l = std::initializer_list{{}, {}};
auto x = std::views::single(obj);
auto y = std::views::single(std::move(l));

https://godbolt.org/z/7nePj6Y57
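
The general hazard is easiest to see with `std::vector`, whose `initializer_list` constructor wins under list-initialization. This sketch is independent of the `__box` code, and the helper names are hypothetical:

```cpp
#include <cassert>
#include <vector>

// List-initialization: the initializer_list<int> constructor is preferred,
// so this builds a one-element vector holding n.
std::vector<int> make_with_braces(int n) { return std::vector<int>{n}; }

// Direct-non-list-initialization: the size constructor is selected,
// so this builds n value-initialized elements.
std::vector<int> make_with_parens(int n) { return std::vector<int>(n); }

void check_initialization_forms() {
    assert(make_with_braces(3).size() == 1);
    assert(make_with_parens(3).size() == 3);
}
```

The same single argument therefore selects different constructors depending on the form of initialization, which is why a wrapper that is specified to use parentheses should not use braces.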

[Bug target/82735] _mm256_zeroupper does not invalidate previously computed registers

2021-06-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82735

--- Comment #17 from CVS Commits  ---
The master branch has been updated by hongtao Liu :

https://gcc.gnu.org/g:16465ceb06cc1f65cfca3c0eb2c1ee27ab03bdfd

commit r12-1252-g16465ceb06cc1f65cfca3c0eb2c1ee27ab03bdfd
Author: liuhongt 
Date:   Tue Jun 1 09:00:57 2021 +0800

CALL_INSN may not be a real function call.

Use "used" flag for CALL_INSN to indicate it's a fake call. If it's a
fake call, it won't have its own function stack.

gcc/ChangeLog

PR target/82735
* df-scan.c (df_get_call_refs): When call_insn is a fake call,
it won't use stack pointer reg.
* final.c (leaf_function_p): When call_insn is a fake call, it
won't affect caller as a leaf function.
* reg-stack.c (callee_clobbers_any_stack_reg): New.
(subst_stack_regs): When call_insn doesn't clobber any stack
reg, don't clear the arguments.
* rtl.c (shallow_copy_rtx): Don't clear flag used when orig is
a insn.
* shrink-wrap.c (requires_stack_frame_p): No need for stack
frame for a fake call.
* rtl.h (FAKE_CALL_P): New macro.

[Bug target/82735] _mm256_zeroupper does not invalidate previously computed registers

2021-06-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82735

--- Comment #18 from CVS Commits  ---
The master branch has been updated by hongtao Liu :

https://gcc.gnu.org/g:9a90b311f22956addaf4f5f9bdb3592afd45083f

commit r12-1253-g9a90b311f22956addaf4f5f9bdb3592afd45083f
Author: liuhongt 
Date:   Tue Jun 1 09:09:44 2021 +0800

Fix _mm256_zeroupper by representing the instructions as call_insns in
which the call has a special vzeroupper ABI.

When __builtin_ia32_vzeroupper is called explicitly, the corresponding
vzeroupper pattern does not carry any CLOBBERS or SETs before LRA,
which leads to incorrect optimization in pass_reload. In order to
solve this problem, this patch refine instructions as call_insns in
which the call has a special vzeroupper ABI.

gcc/ChangeLog:

PR target/82735
* config/i386/i386-expand.c (ix86_expand_builtin): Remove
assignment of cfun->machine->has_explicit_vzeroupper.
* config/i386/i386-features.c
(ix86_add_reg_usage_to_vzerouppers): Delete.
(ix86_add_reg_usage_to_vzeroupper): Ditto.
(rest_of_handle_insert_vzeroupper): Remove
ix86_add_reg_usage_to_vzerouppers, add df_analyze at the end
of the function.
(gate): Remove cfun->machine->has_explicit_vzeroupper.
* config/i386/i386-protos.h (ix86_expand_avx_vzeroupper):
Declared.
* config/i386/i386.c (ix86_insn_callee_abi): New function.
(ix86_initialize_callee_abi): Ditto.
(ix86_expand_avx_vzeroupper): Ditto.
(ix86_hard_regno_call_part_clobbered): Adjust for vzeroupper
ABI.
(TARGET_INSN_CALLEE_ABI): Define as ix86_insn_callee_abi.
(ix86_emit_mode_set): Call ix86_expand_avx_vzeroupper
directly.
* config/i386/i386.h (struct GTY(()) machine_function): Delete
has_explicit_vzeroupper.
* config/i386/i386.md (enum unspec): New member
UNSPEC_CALLEE_ABI.
(ABI_DEFAULT,ABI_VZEROUPPER,ABI_UNKNOWN): New
define_constants for insn callee abi index.
* config/i386/predicates.md (vzeroupper_pattern): Adjust.
* config/i386/sse.md (UNSPECV_VZEROUPPER): Deleted.
(avx_vzeroupper): Call ix86_expand_avx_vzeroupper.
(*avx_vzeroupper): Rename to ..
(avx_vzeroupper_callee_abi): .. this, and adjust pattern as
call_insn which has a special vzeroupper ABI.
(*avx_vzeroupper_1): Deleted.

gcc/testsuite/ChangeLog:

PR target/82735
* gcc.target/i386/pr82735-1.c: New test.
* gcc.target/i386/pr82735-2.c: New test.
* gcc.target/i386/pr82735-3.c: New test.
* gcc.target/i386/pr82735-4.c: New test.
* gcc.target/i386/pr82735-5.c: New test.

[Bug target/82735] _mm256_zeroupper does not invalidate previously computed registers

2021-06-06 Thread crazylht at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82735

--- Comment #19 from Hongtao.liu  ---
Fixed in GCC12.

[Bug target/69199] Incorrect prototypes for AVX512 unaligned load/store builtin functions

2021-06-06 Thread crazylht at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69199

--- Comment #2 from Hongtao.liu  ---
I can confirm it has already been fixed by r7-104

[Bug gcov-profile/100938] New: [GCOV] Coverage changes when a statement is divided in multiple lines

2021-06-06 Thread njuwy at smail dot nju.edu.cn via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100938

Bug ID: 100938
   Summary: [GCOV] Coverage changes when a statement is divided in
multiple lines
   Product: gcc
   Version: 10.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: gcov-profile
  Assignee: unassigned at gcc dot gnu.org
  Reporter: njuwy at smail dot nju.edu.cn
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/10.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure -enable-checking=release -enable-languages=c,c++
-disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.2.0 (GCC) 

$ cat test.c
int fn5(int x){
return -x;
}
int fn2(int x,int y){
return x+3-y;
}
int fn6(int x,int y){
return x+y;
}
int fn7(int x){
return 2*x;
}

int main()
{
int t1,t2,t3,t4,t5=1;
int b=1,y;
 y = fn5(b && fn2(t1=t2,fn6(fn7(t3) < t4,t5)));
 y = fn5(b && fn2(t1=t2,
fn6(fn7(t3) < t4,t5)));
}

$ gcc -O0 --coverage test.c;./a.out;gcov test;cat test.c.gcov
File 'test.c'
Lines executed:100.00% of 14
Creating 'test.c.gcov'

        -:    0:Source:test.c
        -:    0:Graph:test.gcno
        -:    0:Data:test.gcda
        -:    0:Runs:1
        2:    1:int fn5(int x){
        2:    2:return -x;
        -:    3:}
        2:    4:int fn2(int x,int y){
        2:    5:return x+3-y;
        -:    6:}
        2:    7:int fn6(int x,int y){
        2:    8:return x+y;
        -:    9:}
        2:   10:int fn7(int x){
        2:   11:return 2*x;
        -:   12:}
        -:   13:
        1:   14:int main()
        -:   15:{
        1:   16:int t1,t2,t3,t4,t5=1;
        1:   17:int b=1,y;
       1*:   18: y = fn5(b && fn2(t1=t2,fn6(fn7(t3) < t4,t5)));
       2*:   19: y = fn5(b && fn2(t1=t2,
        1:   20:fn6(fn7(t3) < t4,t5)));
        -:   21:}

Lines 18 and 19 should be executed the same number of times.

[Bug target/100931] [x86-64] Failure to optimize 2 32-bit stores converted to a 64-bit store into using movabs instead of loading from a constant

2021-06-06 Thread crazylht at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100931

Hongtao.liu  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #2 from Hongtao.liu  ---
Which options do you use?

With -O2 -march=x86-64, gcc generates the same asm for g and h:

https://godbolt.org/z/Wx5eG39aG

[Bug target/100885] [12 Regression] ICE: in extract_constrain_insn, at recog.c:2671: insn does not satisfy its constraints: {sse4_1_zero_extendv8qiv8hi2}

2021-06-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100885

--- Comment #6 from CVS Commits  ---
The master branch has been updated by hongtao Liu :

https://gcc.gnu.org/g:be5efe9c12cb852c788f74f8555e6ab8d755479b

commit r12-1254-gbe5efe9c12cb852c788f74f8555e6ab8d755479b
Author: liuhongt 
Date:   Thu Jun 3 16:38:32 2021 +0800

Fix ICE of insn does not satisfy its constraints.

evex encoding vpmovzxbx needs both AVX512BW and AVX512VL which means
constraint "Yw" should be used instead of constraint "v".

gcc/ChangeLog:

PR target/100885
* config/i386/sse.md (*sse4_1_zero_extendv8qiv8hi2_3): Refine
constraints.
(v4siv4di2): Delete constraints for define_expand.

gcc/testsuite/ChangeLog:

PR target/100885
* g++.target/i386/pr100885.C: New test.

[Bug c/100939] New: Missing warning with misplaced attribute declaration in struct, enum, or union definition

2021-06-06 Thread johnfbennett at protonmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100939

Bug ID: 100939
   Summary: Missing warning with misplaced attribute declaration
in struct, enum, or union definition
   Product: gcc
   Version: 11.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: johnfbennett at protonmail dot com
  Target Milestone: ---

$ cat misplacedattribute.c
struct samplestruct {
int member1;
int member2;
};

int main(void) {
struct __attribute__((__unused__)) samplestruct samplestruct;

return 0;
}
$ gcc -Wall misplacedattribute.c
misplacedattribute.c: In function ‘main’:
misplacedattribute.c:7:50: warning: unused variable ‘samplestruct’
[-Wunused-variable]
7 |  struct __attribute__((__unused__)) samplestruct samplestruct;
  |  ^~~~
$

[Bug target/100885] [12 Regression] ICE: in extract_constrain_insn, at recog.c:2671: insn does not satisfy its constraints: {sse4_1_zero_extendv8qiv8hi2}

2021-06-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100885

--- Comment #7 from CVS Commits  ---
The releases/gcc-11 branch has been updated by hongtao Liu :

https://gcc.gnu.org/g:c064e787b10069e3de56bd3d0d1a34a1a09086ea

commit r11-8517-gc064e787b10069e3de56bd3d0d1a34a1a09086ea
Author: liuhongt 
Date:   Thu Jun 3 16:38:32 2021 +0800

Fix ICE of insn does not satisfy its constraints.

evex encoding vpmovzxbx needs both AVX512BW and AVX512VL which means
constraint "Yw" should be used instead of constraint "v".

gcc/ChangeLog:

PR target/100885
* config/i386/sse.md (*sse4_1_zero_extendv8qiv8hi2_3): Refine
constraints.
(v4siv4di2): Delete constraints for define_expand.

gcc/testsuite/ChangeLog:

PR target/100885
* g++.target/i386/pr100885.C: New test.

[Bug libstdc++/100940] New: views::take and views::drop should not define _S_has_simple_extra_args

2021-06-06 Thread hewillk at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100940

Bug ID: 100940
   Summary: views::take and views::drop should not define
_S_has_simple_extra_args
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hewillk at gmail dot com
  Target Milestone: ---

For views::take and views::drop, we need to perfectly forward the incoming
argument in some uncommon situations:

#include <ranges>

struct Five {
  operator int() && { return 5; }
} five;

extern int x[10];
auto r = x | std::views::take(five);

https://godbolt.org/z/MEsssWGEh
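
Why the value category matters can be shown without `<ranges>`. In this sketch `Five` is copied from the report, while `convert` and `check_conversion` are hypothetical helpers; the code says nothing about how libstdc++ stores or forwards the argument internally.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Five {
    // Conversion is only available on rvalues.
    operator int() && { return 5; }
};

// Perfectly forwards its argument into the implicit conversion to int.
template <typename T>
int convert(T&& value) {
    return std::forward<T>(value);
}

void check_conversion() {
    // An rvalue Five converts to int; lvalue or const lvalue access does not.
    static_assert(std::is_convertible<Five, int>::value, "rvalue converts");
    static_assert(!std::is_convertible<Five&, int>::value, "lvalue does not");
    static_assert(!std::is_convertible<const Five&, int>::value,
                  "const lvalue does not");
    assert(convert(Five{}) == 5);
}
```

An adaptor that hands its stored argument on as a const lvalue cannot perform this conversion, whereas one that forwards it as an rvalue can.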

[Bug target/100885] [12 Regression] ICE: in extract_constrain_insn, at recog.c:2671: insn does not satisfy its constraints: {sse4_1_zero_extendv8qiv8hi2}

2021-06-06 Thread crazylht at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100885

--- Comment #8 from Hongtao.liu  ---
Fixed in trunk.

[Bug rtl-optimization/88770] Redundant load opt. or CSE pessimizes code

2021-06-06 Thread crazylht at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88770

--- Comment #3 from Hongtao.liu  ---
Wouldn't pass_store_merging be a better place to handle this optimization?
Currently store-merging only merges .a and .b, and fails to merge .c and .d.

202t.store-merging

void caller ()
{
  struct guu D.4030;
  struct guu D.4029;

   [local count: 1073741824]:
  MEM  [(int *)&D.4029] = 21474836483;
  D.4029.c = 7.0e+0;
  D.4029.d = 9;
  test (D.4029);
  MEM  [(int *)&D.4030] = 21474836483;
  D.4030.c = 7.0e+0;
  D.4030.d = 9;
  test (D.4030);
  D.4029 ={v} {CLOBBER};
  D.4030 ={v} {CLOBBER};
  return;

[Bug c++/54835] [C++11][DR 1518] Explicit default constructors not respected during copy-list-initialization

2021-06-06 Thread rs2740 at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54835

TC  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rs2740 at gmail dot com

--- Comment #21 from TC  ---
(In reply to David Friberg from comment #19)
> 
> P0398R0 [1] describes the final resolution to CWG 1518, after which the
> following example is arguably well-formed:
> 

It's not. Explicitness of a constructor is not considered when forming implicit
conversion sequences from a braced-init-list, and therefore the assignment is
ambiguous because {} can convert to either S or tag_t, even though the latter
is ill-formed if actually used.

[Bug target/100941] New: wrong code with __builtin_shufflevector() with -mavx512f

2021-06-06 Thread zsojka at seznam dot cz via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100941

Bug ID: 100941
   Summary: wrong code with __builtin_shufflevector() with
-mavx512f
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 50957
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50957&action=edit
reduced testcase

Output:
$ x86_64-pc-linux-gnu-gcc testcase.c -Wno-psabi
$ ./a.out
$ x86_64-pc-linux-gnu-gcc testcase.c -mavx512f
$ ./a.out 
Aborted

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r12-1254-20210607112745-gbe5efe9c12c-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/12.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r12-1254-20210607112745-gbe5efe9c12c-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.0 20210607 (experimental) (GCC)

[Bug target/100929] gcc fails to optimize less to min for SIMD code

2021-06-06 Thread glisse at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100929

--- Comment #4 from Marc Glisse  ---
(In reply to Denis Yaroshevskiy from comment #3)
> Is what @Andrew Pinski copied enough?

I think so (it is missing the command line), although one example with an
integer type could also help in case floats turn out to have a different issue.

> -ftrapping-math causes clang to stop doing this optimisation.

Note that -ftrapping-math is on by default with gcc (PR 54192), but
-fno-trapping-math wouldn't solve your problem, we are missing other things.

[Bug target/69199] Incorrect prototypes for AVX512 unaligned load/store builtin functions

2021-06-06 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69199

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |7.0
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #3 from Andrew Pinski  ---
Fixed so closing.
