[Bug ipa/103171] [12 Regression] ICE Segmentation fault since r12-2523-g13586172d0b70c9d

2021-11-21 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103171

Jan Hubicka  changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org

--- Comment #1 from Jan Hubicka  ---
This looks like the variable is optimized out because ipa-cp removes the
reference to it, yet it is later used in code.  Perhaps another instance of a
missed transform from ipa-cp?

[Bug tree-optimization/103223] [12 regression] Access attribute dropped when ipa-sra is applied

2021-11-21 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103223

--- Comment #11 from hubicka at kam dot mff.cuni.cz ---
> Xeon(R) Platinum 8358 (IceLake) (64C 128T 512G):
> BenchMarks       Copies  RunTime1  RunTime2  Rate1  Rate2  Compare
> 548.exchange2_r  128     479       913       700    367    -47.57%
> 
> Xeon(R) Gold 6252 (CascadeLake) (48C 96T 192G):
> BenchMarks       Copies  RunTime1  RunTime2  Rate1  Rate2  Compare
> 548.exchange2_r  96      643       1240      391    203    -48.08%

I filed PR103227 to track this problem.  There seem to be two
issues visible on exchange2.  First, ipa-sra changes the order in
which the inliner visits functions, and this makes a difference
in inlining decisions.  Second, ipa-sra causes some constant
propagation info to be lost.  Martin and I are looking into this.

[Bug fortran/102043] Wrong array types used for negative stride accesses

2021-11-21 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102043

--- Comment #24 from rguenther at suse dot de  ---
On Fri, 19 Nov 2021, mikael at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102043
> 
> --- Comment #23 from Mikael Morin  ---
> (In reply to Richard Biener from comment #21)
> > (In reply to Bernhard Reutner-Fischer from comment #17)
> > > Do we want to address arrays always at position 0 (maybe to help graphite 
> > > ?)
> > 
> > Helping graphite (and other loop optimizers) would be to not lower
> > multi-dimensional accesses to a single dimension (I think that's what
> > Sandra's patches try to do).
> 
> Or maybe graphite can be taught to handle flattened array access?
> 
> Anyway, does the middle-end support out-of-order array access?
> Namely for an array arr(4, 5, 6), arr(:, 1, :) is an array of size (4, 6).
> Does the middle-end type system support this?

Support as in not miscompile - yes.  Dependence analysis is only
helped if the shapes of the accesses to the same storage are
compatible - a two dimensional and a three dimensional access are not.
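
To make the shape question concrete, here is a minimal sketch in plain C with
the column-major indexing written out by hand.  The names offset_3d and
offset_section are made up for illustration and are not GCC internals.  It
only shows that the section arr(:, 1, :) addresses the same storage as the
full array, with strides 1 and 4*5.

```c
#include <assert.h>
#include <stddef.h>

/* Extents of the example array arr(4, 5, 6) from the question above. */
enum { N1 = 4, N2 = 5, N3 = 6 };

/* Column-major (Fortran-style) offset of the 1-based element (i, j, k). */
static size_t offset_3d(size_t i, size_t j, size_t k)
{
    assert(i >= 1 && i <= N1 && j >= 1 && j <= N2 && k >= 1 && k <= N3);
    return (i - 1) + (size_t)N1 * ((j - 1) + (size_t)N2 * (k - 1));
}

/* The section arr(:, 1, :) seen as a 4 x 6 array sec(i, k): same storage,
   stride 1 in i and stride N1 * N2 in k. */
static size_t offset_section(size_t i, size_t k)
{
    assert(i >= 1 && i <= N1 && k >= 1 && k <= N3);
    return (i - 1) + (k - 1) * (size_t)(N1 * N2);
}

/* Returns 1 when both views address the same element for every (i, k). */
static int section_matches_full_array(void)
{
    for (size_t k = 1; k <= N3; k++)
        for (size_t i = 1; i <= N1; i++)
            if (offset_section(i, k) != offset_3d(i, 1, k))
                return 0;
    return 1;
}
```

The two-dimensional view and the three-dimensional view compute the same
offsets, but they are different access shapes, which is the case described
above as not helping dependence analysis.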

> In any case, it’s not for gcc 12.

Agreed.

> > The lower bound doesn't really matter here and
> > is well-handled by all code.
> 
> Well, unless the lower bound is negative. ;-)

Even a negative lower bound is OK.  It is only problematic if you
provide an index that accesses the array below the (negative)
lower bound ;)  But note!  In GCC, a[i] with a having a lower
bound of, say, -100 will access storage at the array base + (i - -100);
the lower bound is just there to make the effective index >= 0 again,
so it can't be used (on its own) to solve the present issue
as far as I understand, since you _do_ need to access the
storage at an effective index < 0.
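
A small sketch of that arithmetic, assuming nothing beyond what is said above.
The helper names effective_index and load_with_lower_bound are hypothetical and
only model the index computation, not what the middle-end emits.

```c
#include <assert.h>
#include <stddef.h>

/* Effective zero-based index for a[i] when a has lower bound lb. */
static ptrdiff_t effective_index(ptrdiff_t i, ptrdiff_t lb)
{
    return i - lb;
}

/* Read a[i] from storage holding n elements that start at lower bound lb. */
static int load_with_lower_bound(const int *storage, size_t n,
                                 ptrdiff_t lb, ptrdiff_t i)
{
    ptrdiff_t eff = effective_index(i, lb);
    /* An index below lb would give eff < 0, i.e. an out-of-bounds access. */
    assert(eff >= 0 && (size_t)eff < n);
    return storage[eff];
}
```

With lb = -100, index -100 maps to effective index 0 and -99 to 1; there is no
index that yields a negative effective index without going below the bound.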

[Bug tree-optimization/103088] [12 regression] 500.perlbench from spec 2017 fails since r12-4698

2021-11-21 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103088

--- Comment #22 from rguenther at suse dot de  ---
On Fri, 19 Nov 2021, aldyh at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103088
> 
> --- Comment #19 from Aldy Hernandez  ---
> 
> The problem is this construct in Perl_do_ncmp:
> 
>   if (lnv < rnv)
> return -1;
>   if (lnv > rnv)
> return 1;
>   if (lnv == rnv)
> return 0;
>   return 2;
> 
> These are all doubles.  The code is depending on a pair of values that are
> neither <, >, nor ==, being a NAN.

It would be nice from a QOI point of view (aka DWIM) that even with
-ffast-math we'd recognize this as a form of FP classification since
"obviously" the intent was to handle the NaN special case.  Not that
I see any reasonable way to represent the NaN case in -ffast-math IL ...

> I think we can keep this PR closed.  Don't use -ffast-math unless followed by
> -fno-unsafe-math-optimizations.

Agreed.
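
For readers who want to try the construct in isolation, here is a sketch of the
quoted comparison chain next to a variant that spells out the NaN case.  The
function names are made up and this is not the Perl source.  The values in the
comments assume default IEEE semantics; under -ffinite-math-only the compiler
may assume there are no NaNs, so neither form is guaranteed to return 2.

```c
#include <math.h>

/* Same shape as the construct quoted from Perl_do_ncmp. */
static int ncmp(double lnv, double rnv)
{
    if (lnv < rnv)
        return -1;
    if (lnv > rnv)
        return 1;
    if (lnv == rnv)
        return 0;
    return 2;   /* reached only when at least one operand is a NaN */
}

/* Variant that states the NaN classification up front. */
static int ncmp_explicit(double lnv, double rnv)
{
    if (isnan(lnv) || isnan(rnv))
        return 2;
    return (lnv > rnv) - (lnv < rnv);
}
```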

[Bug tree-optimization/103223] [12 regression] Access attribute dropped when ipa-sra is applied

2021-11-21 Thread admin at levyhsu dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103223

Levy  changed:

   What|Removed |Added

 CC||admin at levyhsu dot com

--- Comment #10 from Levy  ---
Hi Jan

Just want to provide a status report: this commit
ecdf414bd89e6ba251f6b3f494407139b4dbae0e seems to have caused about a 50%
regression when running multi-copy 548.exchange2_r with march_native_ofast_lto
on spec2017:

Xeon(R) Platinum 8358 (IceLake) (64C 128T 512G):
BenchMarks       Copies  RunTime1  RunTime2  Rate1  Rate2  Compare
548.exchange2_r  128     479       913       700    367    -47.57%

Xeon(R) Gold 6252 (CascadeLake) (48C 96T 192G):
BenchMarks       Copies  RunTime1  RunTime2  Rate1  Rate2  Compare
548.exchange2_r  96      643       1240      391    203    -48.08%

Best
Levy

[Bug tree-optimization/79534] [9/10/11/12 Regression] tree-ifcombine aarch64 performance regression with trunk@245151

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79534

--- Comment #19 from Andrew Pinski  ---
I am seeing almost a 10% performance drop from GCC 7 to the trunk, even
at -O2.  This is on an OcteonTX2:
-O2:
7.4.0:
2150.408184
trunk:
2032.971945

-O2  -fno-tree-vectorize:
trunk:
2028.648257

-O3:
7.4.0:
2207.640167
trunk:
2049.228275

[Bug tree-optimization/101474] SRA sometimes produces worse code with inline functions (seen with -fipa-icf sometimes)

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101474

--- Comment #4 from Andrew Pinski  ---
Note ICF behavior and inlining is what PR 96252 is about.

[Bug ipa/96252] [10/11/12 Regression] mis-optimization where identical functions have very different codegen since gcc 10

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96252

Andrew Pinski  changed:

   What|Removed |Added

 CC||vegard.nossum at oracle dot com

--- Comment #8 from Andrew Pinski  ---
*** Bug 101474 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/101474] SRA sometimes produces worse code with inline functions (seen with -fipa-icf sometimes)

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101474

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #3 from Andrew Pinski  ---
This is a dup of bug 96252.

*** This bug has been marked as a duplicate of bug 96252 ***

[Bug tree-optimization/92342] [10/11/12 Regression] a small missed transformation into x?b:0

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92342

Andrew Pinski  changed:

   What|Removed |Added

URL||https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585112.html
   Keywords||patch

--- Comment #25 from Andrew Pinski  ---
Submitted the sequence here:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585111.html
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585112.html

[Bug tree-optimization/92342] [10/11/12 Regression] a small missed transformation into x?b:0

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92342

--- Comment #24 from Andrew Pinski  ---
(In reply to Segher Boessenkool from comment #4)
> Btw, try
> 
> int h(int a, int b, int c, int d)
> {
>   return (c & -(a==b)) | (d & -(a!=b));
> }
> 
> to see we have some way to go here.

I filed that as PR 103354 which I will handle after I submit the patches for
this one.

[Bug target/103275] [11/12 Regression] don't generate kmov with IE model relocations

2021-11-21 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103275

--- Comment #17 from Hongtao.liu  ---
Fixed in GCC12 and GCC11.

[Bug target/103275] [11/12 Regression] don't generate kmov with IE model relocations

2021-11-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103275

--- Comment #16 from CVS Commits  ---
The releases/gcc-11 branch has been updated by hongtao Liu
:

https://gcc.gnu.org/g:eb8ff3cbc09e029ca0cbd0d8b09bcaba162ab95a

commit r11-9257-geb8ff3cbc09e029ca0cbd0d8b09bcaba162ab95a
Author: liuhongt 
Date:   Wed Nov 17 15:48:37 2021 +0800

Don't allow mask/sse/mmx mov in TLS code sequences.

Following a change in the assembler (see [1]), this patch disallows
mask/sse/mmx mov in TLS code sequences, which require integer MOV instructions.

[1]
https://sourceware.org/git/?p=binutils-gdb.git;a=patch;h=d7e3e627027fcf37d63e284144fe27ff4eba36b5

gcc/ChangeLog:

PR target/103275
* config/i386/constraints.md (Bk): New
define_memory_constraint.
* config/i386/i386-protos.h (ix86_gpr_tls_address_pattern_p):
Declare.
* config/i386/i386.c (ix86_gpr_tls_address_pattern_p): New
function.
* config/i386/i386.md (*movsi_internal): Don't allow
mask/sse/mmx move in TLS code sequences.
(*movdi_internal): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr103275.c: New test.

[Bug target/103275] [11/12 Regression] don't generate kmov with IE model relocations

2021-11-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103275

--- Comment #15 from CVS Commits  ---
The master branch has been updated by hongtao Liu :

https://gcc.gnu.org/g:b5844cb0bc8c7d9be2ff1ecded249cad82b9b71c

commit r12-5445-gb5844cb0bc8c7d9be2ff1ecded249cad82b9b71c
Author: liuhongt 
Date:   Wed Nov 17 15:48:37 2021 +0800

Don't allow mask/sse/mmx mov in TLS code sequences.

Following a change in the assembler (see [1]), this patch disallows
mask/sse/mmx mov in TLS code sequences, which require integer MOV instructions.

[1]
https://sourceware.org/git/?p=binutils-gdb.git;a=patch;h=d7e3e627027fcf37d63e284144fe27ff4eba36b5

gcc/ChangeLog:

PR target/103275
* config/i386/constraints.md (Bk): New
define_memory_constraint.
* config/i386/i386-protos.h (ix86_gpr_tls_address_pattern_p):
Declare.
* config/i386/i386.c (ix86_gpr_tls_address_pattern_p): New
function.
* config/i386/i386.md (*movsi_internal): Don't allow
mask/sse/mmx move in TLS code sequences.
(*movdi_internal): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr103275.c: New test.

[Bug tree-optimization/103354] missed optimization with & and | and compares

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103354

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2021-11-22
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
After my patch for PR 92342, we get:


  _4 = a_9(D) == b_10(D) ? c_11(D) : 0;
  _8 = a_9(D) != b_10(D) ? d_12(D) : 0;
  _13 = _4 | _8;

A few patterns are needed here really:

One simple pattern is:
(simplify
 (bit_ior:c
  (cond @0 @1 integer_zero_p)
  (cond @0 integer_zero_p @2))
 (cond @0 @1 @2))

But the next one is harder and requires a double for loop. I will handle it
soon.
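
For reference, a C-level sketch of why the two conditionals can fold into one.
The function names are invented for illustration; the first function just
mirrors the three GIMPLE statements above, and the check runs over a small
grid of values rather than proving anything.

```c
/* C rendering of the GIMPLE above:
     _4 = a == b ? c : 0;  _8 = a != b ? d : 0;  _13 = _4 | _8;  */
static int ior_of_conds(int a, int b, int c, int d)
{
    int t4 = (a == b) ? c : 0;
    int t8 = (a != b) ? d : 0;
    return t4 | t8;
}

/* The single conditional the combined patterns should end up with. */
static int single_cond(int a, int b, int c, int d)
{
    return (a == b) ? c : d;
}

/* Returns 1 if both forms agree for every combination of sample values. */
static int forms_agree(void)
{
    static const int vals[] = { -7, -1, 0, 1, 2, 42 };
    const int n = (int)(sizeof vals / sizeof vals[0]);
    for (int a = 0; a < n; a++)
        for (int b = 0; b < n; b++)
            for (int c = 0; c < n; c++)
                for (int d = 0; d < n; d++)
                    if (ior_of_conds(vals[a], vals[b], vals[c], vals[d])
                        != single_cond(vals[a], vals[b], vals[c], vals[d]))
                        return 0;
    return 1;
}
```

Exactly one of the two conditions is true, so one operand of the | is always
0 and the other is the selected value.  The simple pattern above only covers
the case where both conds share the same @0; matching a == b against a != b is
the harder part.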

[Bug tree-optimization/103354] New: missed optimization with & and | and compares

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103354

Bug ID: 103354
   Summary: missed optimization with & and | and compares
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

Take:
int h(int a, int b, int c, int d)
{
  return (c & -(a==b)) | (d & -(a!=b));
}

---- CUT ----
We currently don't optimize the above down to just:
int h(int a, int b, int c, int d)
{
  return (a==b) ? c : d;
}

This is from https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92342#c4

[Bug target/103353] New: Indefinite recursion when compiling -mmma requiring testcase w/ -maltivec

2021-11-21 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103353

Bug ID: 103353
   Summary: Indefinite recursion when compiling -mmma requiring
testcase w/ -maltivec
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: error-recovery, ice-on-invalid-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---
Target: powerpc-*-linux-gnu

g++-12.0.0-alpha2024 snapshot (g:3057f1ab737582a9fb37a3fb967ed8bf3659f2f4)
ICEs because of stack exhaustion when compiling
gcc/testsuite/gcc.target/powerpc/pr101849.c w/ -maltivec:

% powerpc-e300c3-linux-gnu-gcc-12.0.0 -maltivec -c
gcc/testsuite/gcc.target/powerpc/pr101849.c
gcc/testsuite/gcc.target/powerpc/pr101849.c: In function 'foo':
gcc/testsuite/gcc.target/powerpc/pr101849.c:11:12: error: '__builtin_vsx_lxvp'
requires the '-mmma' option
   11 |   dst[0] = __builtin_vsx_lxvp (0, (__vector_pair *)(void *)x);
  |^~
powerpc-e300c3-linux-gnu-gcc-12.0.0: internal compiler error: Segmentation
fault signal terminated program cc1

(gdb) where 10
#0  0x0093fc34 in ggc_internal_alloc (size=size@entry=24,
f=f@entry=0x0, s=s@entry=0, n=n@entry=1)
at
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/ggc-page.c:1278
#1  0x00a8b94b in ggc_alloc ()
at
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/ggc.h:178
#2  start_sequence ()
at
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/emit-rtl.c:5476
#3  0x00ac850b in emit_move_multi_word (mode=E_OOmode, x=, y=0x775fe978)
at
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/expr.c:3889
#4  0x00ac3d3f in emit_move_insn (x=x@entry=0x76511060,
y=y@entry=0x775fe978)
at
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/expr.c:4128
#5  0x00a9ce04 in copy_to_reg (x=x@entry=0x775fe978)
at
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/explow.c:625
#6  0x00a88908 in operand_subword_force (op=op@entry=0x775fe978,
offset=offset@entry=..., mode=mode@entry=E_OOmode)
at
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/emit-rtl.c:1802
#7  0x00ac861e in emit_move_multi_word (mode=E_OOmode, x=, y=0x775fe978)
at
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/expr.c:3918
#8  0x00ac3d3f in emit_move_insn (x=x@entry=0x76511018,
y=y@entry=0x775fe978)
at
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/expr.c:4128
#9  0x00a9ce04 in copy_to_reg (x=x@entry=0x775fe978)
at
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/explow.c:625
(More stack frames follow...)

(gdb) where -10
#882893 0x009d29e5 in cgraph_node::expand (this=0x777c8440)
at
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/context.h:48
#882894 cgraph_node::expand (this=0x777c8440)
at
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cgraphunit.c:1781
#882895 0x009d3df6 in output_in_order ()
at
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cgraphunit.c:2135
#882896 symbol_table::compile (this=0x777c)
at
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cgraphunit.c:2353
#882897 symbol_table::compile (this=0x777c)
at
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cgraphunit.c:2267
#882898 0x009d6d07 in symbol_table::finalize_compilation_unit
(this=0x777c)
at
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cgraphunit.c:2537
#882899 0x00e8b4f7 in compile_file ()
at
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/toplev.c:479
#882900 0x0080d376 in do_compile (no_backend=false)
at
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/toplev.c:2156
#882901 toplev::main (this=this@entry=0x7fffd9f6, argc=,
argc@entry=12, argv=,
argv@entry=0x7fffdb48)
at
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/toplev.c:2308
#882902 

[Bug tree-optimization/92342] [10/11/12 Regression] a small missed transformation into x?b:0

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92342

--- Comment #23 from Andrew Pinski  ---
Created attachment 51846
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51846&action=edit
Patch which I am testing

Note it is really two patches, one which fixes up the multiply case and then
one which does the same for &- too.
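
As a reading aid, here is a sketch of the two source forms as I understand
them from the comment (a multiply by the comparison result, and an AND with
the negated comparison result) next to the x?b:0 form from the bug title.  The
function names are hypothetical and the attached patch itself is not
reproduced here.

```c
/* Multiply form: the comparison yields 0 or 1. */
static int via_multiply(int a, int b, int x)
{
    return x * (a == b);
}

/* "&-" form: negating the comparison yields 0 or -1 (all bits set). */
static int via_mask(int a, int b, int x)
{
    return x & -(a == b);
}

/* The conditional form both of the above are equivalent to. */
static int via_cond(int a, int b, int x)
{
    return (a == b) ? x : 0;
}
```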

[Bug tree-optimization/92342] [10/11/12 Regression] a small missed transformation into x?b:0

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92342

Andrew Pinski  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org
 Status|NEW |ASSIGNED
  Component|rtl-optimization|tree-optimization

--- Comment #22 from Andrew Pinski  ---
Mine, we should do this transformation on the gimple level really. Let me do
that.

[Bug target/97984] [10/11 Regression] Worse code for -O3 than -O2 on aarch64 vector multiply-add

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97984

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||needs-bisection
  Known to work||12.0
Summary|[10/11/12 Regression] Worse code for -O3 than -O2 on aarch64 vector multiply-add |[10/11 Regression] Worse code for -O3 than -O2 on aarch64 vector multiply-add

--- Comment #3 from Andrew Pinski  ---
The cost model on the trunk seems to have been fixed.

[Bug rtl-optimization/103350] wrong code with -Os -fno-tree-ter on aarch64-unknown-linux-gnu

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103350

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2021-11-21
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Confirmed.

[Bug tree-optimization/103351] [12 Regression] '-fcompare-debug' failure (length) at -O2

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103351

--- Comment #2 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #1)
> Confirmed, the first (major) difference shows up in 199t.cddce3.

So maybe introduced by r12-5301.

[Bug tree-optimization/103351] [12 Regression] '-fcompare-debug' failure (length) at -O2

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103351

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2021-11-21
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Target Milestone|--- |12.0

--- Comment #1 from Andrew Pinski  ---
Confirmed, the first (major) difference shows up in 199t.cddce3.



   if (&__str._M_local_buf == _13)
 goto ; [30.00%]
   else
-goto ; [70.00%]
+goto ; [70.00%]

[local count: 286689070]:
   _1 = &_10->D.4203.name;
   if (_1 != __trans_tmp_1_14(D))
-goto ; [90.00%]
+goto ; [90.00%]
   else
-goto ; [10.00%]
-
-   [local count: 28668907]:
-  goto ; [100.00%]
+goto ; [10.00%]

-   [local count: 258020162]:
+   [local count: 258020162]:
   Trans_NS___cxx11_basic_string::size (_1);
   pretmp_16 = __str._M_dataplus._M_p;
   goto ; [100.00%]

-   [local count: 668941172]:
+   [local count: 668941172]:
   if (__trans_tmp_2_18(D) != 0)
-goto ; [50.00%]
+goto ; [50.00%]
   else
 goto ; [50.00%]

-   [local count: 334470586]:
-  # DEBUG this => &__str
-  # DEBUG __p => &__str._M_local_buf
-  # DEBUG INLINE_ENTRY _M_data
+   [local count: 334470586]:
   __str._M_dataplus._M_p = &__str._M_local_buf;
-  goto ; [100.00%]
+
+   [local count: 334470586]:

[Bug middle-end/85619] Inconsistent descriptions for new warning options in GCC 8.1.0

2021-11-21 Thread julien at trigofacile dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85619

--- Comment #4 from Julien ÉLIE  ---
Following up on that issue.  I've just checked against GCC 11 documentation.
The two points are still open.
Thanks beforehand.

[Bug middle-end/82798] Inconsistent descriptions for warning options in documentation

2021-11-21 Thread julien at trigofacile dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82798

--- Comment #5 from Julien ÉLIE  ---
Following up on that bug report, I've checked GCC 11 documentation.
Only 3/, part of 5/ (-Wmultistatement-macros) and 6/ have been fixed.
Points 1/, 2/, 4/, part of 5/ (-Wreorder) and 7/ remain.
Thanks beforehand.

[Bug bootstrap/50064] Failure in stage 2 (lazy binding) on openbsd

2021-11-21 Thread julien at trigofacile dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50064

Julien ÉLIE  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #1 from Julien ÉLIE  ---
Probably too old to be worth investigating.
As the author of the bug, I suggest closing it.

[Bug c/83011] -Wformat-truncation=2 difficult to avoid for non-constant bounds

2021-11-21 Thread julien at trigofacile dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83011

--- Comment #8 from Julien ÉLIE  ---
"I do agree that the warning in this case is too difficult to deal with and
should be adjusted so I'm going to confirm this report on that basis."
FWIW, the same warning is still present in GCC 10.2.1.

[Bug c/83011] -Wformat-truncation=2 difficult to avoid for non-constant bounds

2021-11-21 Thread julien at trigofacile dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83011

--- Comment #7 from Julien ÉLIE  ---
*** Bug 103352 has been marked as a duplicate of this bug. ***

[Bug c/103352] Wrong computation of -Wformat-truncation=2

2021-11-21 Thread julien at trigofacile dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103352

Julien ÉLIE  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Julien ÉLIE  ---
Duplicate, sorry.

*** This bug has been marked as a duplicate of bug 83011 ***

[Bug c/103352] New: Wrong computation of -Wformat-truncation=2

2021-11-21 Thread julien at trigofacile dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103352

Bug ID: 103352
   Summary: Wrong computation of -Wformat-truncation=2
   Product: gcc
   Version: 10.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: julien at trigofacile dot com
  Target Milestone: ---

When building
https://raw.githubusercontent.com/InterNetNews/inn/main/lib/timer.c I have a
warning with -Wformat-truncation=2

len = 52 * timer_count + 27 + (prefix == NULL ? 0 : strlen(prefix)) + 1;
buf = xmalloc(len);
if (prefix == NULL)
rc = 0;
else
rc = snprintf(buf, len, "%s ", prefix);

timer.c: In function ‘TMRsummary’:
timer.c:395:36: warning: ‘ ’ directive output may be truncated writing 1 byte
into a region of size between 0 and 1 [-Wformat-truncation=]
  395 | rc = snprintf(buf, len, "%s ", prefix);
  |^
timer.c:395:14: note: ‘snprintf’ output 2 or more bytes (assuming 3) into a
destination of size 1
  395 | rc = snprintf(buf, len, "%s ", prefix);
  |  ^


It seems that "len" is considered to be of length 1.  I bet it comes from the
"+ 1" at the end of the declaration of "len".
If I move the "+ 27" at the end of the declaration of "len", the warning
disappears.

Isn't there something to improve in the way the computation is done for that
warning?

Thanks beforehand.
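
A self-contained sketch of the reported pattern, for anyone who wants to
experiment with it.  format_prefix is a hypothetical helper, malloc stands in
for xmalloc, and the runtime truncation check is my addition.  This is not a
claim about what silences the warning; whether GCC warns depends on the
version and flags used.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Allocates a buffer sized like "len" in the report and writes "prefix "
   into it.  Returns NULL on allocation failure, output error or truncation;
   the caller frees the result. */
static char *format_prefix(const char *prefix, size_t timer_count)
{
    size_t len = 52 * timer_count + 27
                 + (prefix == NULL ? 0 : strlen(prefix)) + 1;
    char *buf = malloc(len);
    if (buf == NULL)
        return NULL;
    buf[0] = '\0';
    if (prefix != NULL) {
        int rc = snprintf(buf, len, "%s ", prefix);
        if (rc < 0 || (size_t)rc >= len) {   /* error or truncated output */
            free(buf);
            return NULL;
        }
    }
    return buf;
}
```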

[Bug debug/103351] New: [12 Regression] '-fcompare-debug' failure (length) at -O2

2021-11-21 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103351

Bug ID: 103351
   Summary: [12 Regression] '-fcompare-debug' failure (length) at
-O2
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
CC: aoliva at gcc dot gnu.org
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu

Created attachment 51845
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51845&action=edit
auto-reduced testcase

Compiler output:
$ x86_64-pc-linux-gnu-g++ -O2 -fcompare-debug testcase.C -save-temps
x86_64-pc-linux-gnu-g++: error: testcase.C: '-fcompare-debug' failure (length)

$ x86_64-pc-linux-gnu-g++ -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest/bin/x86_64-pc-linux-gnu-g++
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r12-5437-20211121132234-g09a4ffb72aa-checking-yes-rtl-df-extra-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/12.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu
--host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu
--with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r12-5437-20211121132234-g09a4ffb72aa-checking-yes-rtl-df-extra-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.0 20211121 (experimental) (GCC)

[Bug debug/103315] Gfortran DW_AT_Rank expression not emitting correct rank value.

2021-11-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103315

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:da17c304e22ba256eba0b03710aa329115163b08

commit r12-5442-gda17c304e22ba256eba0b03710aa329115163b08
Author: Jakub Jelinek 
Date:   Sun Nov 21 21:08:04 2021 +0100

fortran, debug: Fix up DW_AT_rank [PR103315]

For DW_AT_rank we were emitting
        .uleb128 0x4    # DW_AT_rank
        .byte   0x97    # DW_OP_push_object_address
        .byte   0x23    # DW_OP_plus_uconst
        .uleb128 0x1c
        .byte   0x6     # DW_OP_deref
on 64-bit and
        .uleb128 0x4    # DW_AT_rank
        .byte   0x97    # DW_OP_push_object_address
        .byte   0x23    # DW_OP_plus_uconst
        .uleb128 0x10
        .byte   0x6     # DW_OP_deref
on 32-bit.  I think this is wrong, as dtype.rank field in the descriptor
has unsigned char type, not pointer type nor pointer sized integral.
E.g. if we have a
        REAL :: a(..)
dummy argument, which is passed as a reference to the function descriptor,
we want to evaluate a->dtype.rank.  The above DWARF expressions perform
        *(uintptr_t *)(a + 0x1c)
and
        *(uintptr_t *)(a + 0x10)
respectively.  The following patch changes those to:
        .uleb128 0x5    # DW_AT_rank
        .byte   0x97    # DW_OP_push_object_address
        .byte   0x23    # DW_OP_plus_uconst
        .uleb128 0x1c
        .byte   0x94    # DW_OP_deref_size
        .byte   0x1
and
        .uleb128 0x5    # DW_AT_rank
        .byte   0x97    # DW_OP_push_object_address
        .byte   0x23    # DW_OP_plus_uconst
        .uleb128 0x10
        .byte   0x94    # DW_OP_deref_size
        .byte   0x1
which perform
        *(unsigned char *)(a + 0x1c)
and
        *(unsigned char *)(a + 0x10)
respectively.

2021-11-21  Jakub Jelinek  

PR debug/103315
* trans-types.c (gfc_get_array_descr_info): Use DW_OP_deref_size 1
instead of DW_OP_deref for DW_AT_rank.
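
To see why the width of the load matters, here is a sketch that models the two
reads on a fake descriptor buffer.  The buffer layout, the 0xAB filler and the
helper names are assumptions made for the demonstration; only the 0x1c offset
comes from the commit message.

```c
#include <stdint.h>
#include <string.h>

enum { RANK_OFFSET = 0x1c };   /* 64-bit offset quoted in the commit message */

/* Models DW_OP_deref: load a pointer-sized value at base + offset. */
static uintptr_t read_pointer_sized(const unsigned char *base, size_t offset)
{
    uintptr_t v;
    memcpy(&v, base + offset, sizeof v);
    return v;
}

/* Models DW_OP_deref_size 1: load a single byte at base + offset. */
static unsigned char read_one_byte(const unsigned char *base, size_t offset)
{
    return base[offset];
}

/* Fill a fake descriptor so the bytes next to the rank field are non-zero. */
static void make_fake_descriptor(unsigned char *buf, size_t size,
                                 unsigned char rank)
{
    memset(buf, 0xAB, size);
    buf[RANK_OFFSET] = rank;
}
```

With a rank of 3 stored in the one-byte field, the single-byte read gives 3,
while the pointer-sized read also picks up the neighbouring bytes and gives
some other value, on either endianness.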

[Bug c++/101180] [12 Regression] Rejected code since r12-299-ga0fdff3cf33f7284

2021-11-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101180

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:364539710f828851b9fac51c39033cd09aa620de

commit r12-5441-g364539710f828851b9fac51c39033cd09aa620de
Author: Jakub Jelinek 
Date:   Sun Nov 21 21:06:23 2021 +0100

i386: Fix up handling of target attribute [PR101180]

As shown in the testcase below, if a function has multiple target attributes
(rather than a single one with one or more arguments) or if a function
gets one target attribute on one declaration and another one on another
declaration, on x86 their effect is not combined into
DECL_FUNCTION_SPECIFIC_TARGET, but instead only the last processed target
attribute wins.  aarch64 handles this right, the following patch follows
what it does, i.e. only start with target_option_default_node if
DECL_FUNCTION_SPECIFIC_TARGET is previously NULL (i.e. the first target
attribute being processed on a function) and otherwise start from the
previous DECL_FUNCTION_SPECIFIC_TARGET.

2021-11-21  Jakub Jelinek  

PR c++/101180
* config/i386/i386-options.c (ix86_valid_target_attribute_p): If
fndecl already has DECL_FUNCTION_SPECIFIC_TARGET, use that as base
instead of target_option_default_node.

* gcc.target/i386/pr101180.c: New test.
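
A toy model of the behaviour change described in the commit message, for
illustration only.  The bitmask representation, the fn_decl struct and both
function names are invented here; the real code works on option nodes in
ix86_valid_target_attribute_p.

```c
/* Toy model: a target option set is a bitmask of ISA features. */
enum { ISA_SSE42 = 1 << 0, ISA_AVX = 1 << 1 };

typedef struct {
    int has_specific;    /* models DECL_FUNCTION_SPECIFIC_TARGET != NULL */
    unsigned specific;   /* models the per-function option set */
} fn_decl;

/* Models target_option_default_node. */
static const unsigned default_options = 0;

/* Behaviour described as wrong: every attribute starts from the defaults,
   so only the last processed attribute survives. */
static void apply_attr_last_wins(fn_decl *fn, unsigned attr_bits)
{
    fn->specific = default_options | attr_bits;
    fn->has_specific = 1;
}

/* Behaviour described as the fix: start from the previous per-function
   set when there is one, otherwise from the defaults. */
static void apply_attr_combined(fn_decl *fn, unsigned attr_bits)
{
    unsigned base = fn->has_specific ? fn->specific : default_options;
    fn->specific = base | attr_bits;
    fn->has_specific = 1;
}
```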

[Bug c++/103349] ICE in potential_constant_expression_1, at cp/constexpr.c:9104 (sorry, unimplemented: unexpected AST of kind omp_masked)

2021-11-21 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103349

Jakub Jelinek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
 CC||jakub at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
   Last reconfirmed||2021-11-21

[Bug rtl-optimization/103350] New: wrong code with -Os -fno-tree-ter on aarch64-unknown-linux-gnu

2021-11-21 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103350

Bug ID: 103350
   Summary: wrong code with -Os -fno-tree-ter on
aarch64-unknown-linux-gnu
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: aarch64-unknown-linux-gnu

Created attachment 51844
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51844&action=edit
reduced testcase

Output:
$ aarch64-unknown-linux-gnu-gcc -Os -fno-tree-ter testcase.c -static
$ qemu-aarch64 -- ./a.out 
qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted

$ aarch64-unknown-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-aarch64/bin/aarch64-unknown-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r12-5436-20211121114008-gdc915b361bb-checking-yes-rtl-df-extra-aarch64/bin/../libexec/gcc/aarch64-unknown-linux-gnu/12.0.0/lto-wrapper
Target: aarch64-unknown-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--with-cloog --with-ppl --with-isl
--with-sysroot=/usr/aarch64-unknown-linux-gnu --build=x86_64-pc-linux-gnu
--host=x86_64-pc-linux-gnu --target=aarch64-unknown-linux-gnu
--with-ld=/usr/bin/aarch64-unknown-linux-gnu-ld
--with-as=/usr/bin/aarch64-unknown-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r12-5436-20211121114008-gdc915b361bb-checking-yes-rtl-df-extra-aarch64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.0 20211121 (experimental) (GCC)

[Bug tree-optimization/103088] [12 regression] 500.perlbench from spec 2017 fails since r12-4698

2021-11-21 Thread aldyh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103088

--- Comment #21 from Aldy Hernandez  ---
One last comment.

A smaller hammer than -fno-unsafe-math-optimizations may be
-fno-finite-math-only, which allows for the problematic NAN behavior in
Perl_do_ncmp: it still allows the inlining, but does not munge the
comparisons.  For that matter, the test passes for ref with it.

Though someone more knowledgeable with perl+math should opine here.  I don't
know if NAN / INF are the only issues in this test wrt -ffast-math.
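
To illustrate the NaN concern, here is a minimal sketch of a three-way numeric
comparison in the spirit of Perl_do_ncmp.  The name ncmp_sketch and its return
values are assumptions made for illustration; this is not Perl's actual code.
The "unordered" result is only reachable when an operand is NaN, which is the
case a compiler is allowed to assume away under -ffinite-math-only.

```c
#include <math.h>   /* provides NAN for callers that want to test the unordered case */

/* Hypothetical three-way compare (not Perl's actual implementation).
   Returns -1, 0 or 1 for ordered operands and 2 when the operands are
   unordered, i.e. when at least one of them is NaN. */
static int ncmp_sketch(double a, double b)
{
    if (a < b)
        return -1;
    if (a > b)
        return 1;
    if (a == b)
        return 0;
    /* Only reachable for NaN operands; with -ffinite-math-only the
       compiler may treat this path as dead code. */
    return 2;
}
```

Built without -ffast-math (or with -fno-finite-math-only), a call such as
ncmp_sketch(NAN, 1.0) takes the final return.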

[Bug fortran/99061] [10/11/12 Regression] ICE in gfc_conv_intrinsic_atan2d, at fortran/trans-intrinsic.c:4728

2021-11-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99061

--- Comment #7 from CVS Commits  ---
The master branch has been updated by Harald Anlauf :

https://gcc.gnu.org/g:8fef6f720a5a0a056abfa986ba870bb406ab4716

commit r12-5440-g8fef6f720a5a0a056abfa986ba870bb406ab4716
Author: Harald Anlauf 
Date:   Sun Nov 21 19:29:27 2021 +0100

Fortran: fix lookup for gfortran builtin math intrinsics used by DEC
extensions

gcc/fortran/ChangeLog:

PR fortran/99061
* trans-intrinsic.c (gfc_lookup_intrinsic): Helper function for
looking up gfortran builtin intrinsics.
(gfc_conv_intrinsic_atrigd): Use it.
(gfc_conv_intrinsic_cotan): Likewise.
(gfc_conv_intrinsic_cotand): Likewise.
(gfc_conv_intrinsic_atan2d): Likewise.

gcc/testsuite/ChangeLog:

PR fortran/99061
* gfortran.dg/dec_math_5.f90: New test.

Co-authored-by: Steven G. Kargl 

[Bug fortran/47720] problems with makefile dependency generation using -M

2021-11-21 Thread aldot at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47720

Bernhard Reutner-Fischer  changed:

   What|Removed |Added

 CC||aldot at gcc dot gnu.org

--- Comment #9 from Bernhard Reutner-Fischer  ---
To avoid the duplicate module names, we would have to remember which modules we
emitted already (or maybe deps_add_ not when opening and reading the module but
way later).

But, as said, is this really a problem?

As to the fact that -M... requires cpp, we could enable cpp when seeing -M,
yes.
I'm not sure about the implications this has on the source we're reading
though. Maybe none.

[Bug c++/103349] New: ICE in potential_constant_expression_1, at cp/constexpr.c:9104 (sorry, unimplemented: unexpected AST of kind omp_masked)

2021-11-21 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103349

Bug ID: 103349
   Summary: ICE in potential_constant_expression_1, at
cp/constexpr.c:9104 (sorry, unimplemented: unexpected
AST of kind omp_masked)
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code, openmp
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---

g++-12.0.0-alpha2024 snapshot (g:3057f1ab737582a9fb37a3fb967ed8bf3659f2f4)
ICEs when compiling the following testcase, reduced from
test/OpenMP/masked_codegen.cpp from the clang 13.0.0 test suite, w/ -fopenmp:

int c;

void
lambda_masked (int a, int b)
{
  [=] ()
  {
#pragma omp masked
c = a + b;
  };
}

% g++-12.0.0 -fopenmp -c zvbd9hfy.cpp
zvbd9hfy.cpp: In lambda function:
zvbd9hfy.cpp:10:3: sorry, unimplemented: unexpected AST of kind omp_masked
   10 |   };
  |   ^
zvbd9hfy.cpp:10:3: internal compiler error: in potential_constant_expression_1,
at cp/constexpr.c:9104
0x9776f3 potential_constant_expression_1
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/constexpr.c:9104
0x978f51 potential_constant_expression_1
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/constexpr.c:8445
0x97a16f potential_constant_expression_1(tree_node*, bool, bool, bool, int)
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/constexpr.c:9126
0x97a16f potential_constant_expression(tree_node*)
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/constexpr.c:9135
0x9e7d4b finish_function(bool)
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/decl.c:17740
0xa24a2a finish_lambda_function(tree_node*)
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/lambda.c:1559
0xabb92e cp_parser_lambda_body
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/parser.c:11668
0xabb92e cp_parser_lambda_expression
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/parser.c:10987
0xabc4d3 cp_parser_primary_expression
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/parser.c:5674
0xabf18e cp_parser_postfix_expression
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/parser.c:7580
0xaa696a cp_parser_binary_expression
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/parser.c:9910
0xaa751a cp_parser_assignment_expression
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/parser.c:10214
0xaa91c9 cp_parser_expression
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/parser.c:10384
0xaac9d8 cp_parser_expression_statement
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/parser.c:12578
0xab9b3a cp_parser_statement
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/parser.c:12374
0xababed cp_parser_statement_seq_opt
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/parser.c:12726
0xabacc8 cp_parser_compound_statement
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/parser.c:12675
0xadb733 cp_parser_function_body
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/parser.c:24899
0xadb733 cp_parser_ctor_initializer_opt_and_function_body
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/parser.c:24950
0xadc700 cp_parser_function_definition_after_declarator
    /var/tmp/portage/sys-devel/gcc-12.0.0_alpha2024/work/gcc-12-2024/gcc/cp/parser.c:31080

[Bug ipa/97783] [9/10/11/12 Regression] Optimizer assumes global static variable cannot be updated by external function, even though function is passed address of local functions

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97783

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|INVALID |MOVED
   See Also||https://sourceware.org/bugzilla/show_bug.cgi?id=28611

[Bug c++/77513] -Wzero-as-null-pointer-constant vs 0, nullptr, NULL and __null

2021-11-21 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77513

Jonathan Wakely  changed:

   What|Removed |Added

   Last reconfirmed|2021-08-27 00:00:00 |2021-11-21

--- Comment #13 from Jonathan Wakely  ---
An interesting case from PR 103347 where the pedwarn about the NSDMI is
suppressed because GCC thinks the initializer is in a system header:

#include 
struct test {
void *x = NULL; //invalid in C++03 mode
};
int main() {}

This should be rejected with -pedantic-errors, but g++ is silent unless you
also add -Wsystem-headers.

[Bug c++/103347] Non-static data member initialization is erroneously allowed in C++03 mode

2021-11-21 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103347

--- Comment #4 from Jonathan Wakely  ---
(In reply to Andrew Pinski from comment #2)
> That is, GCC does warn about the following case:
> struct test {
> int x = 0;
> };
> int main() {}

And you get an error with -pedantic-errors

[Bug ipa/97783] [9/10/11/12 Regression] Optimizer assumes global static variable cannot be updated by external function, even though function is passed address of local functions

2021-11-21 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97783

Jan Hubicka  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|ASSIGNED|RESOLVED

--- Comment #6 from Jan Hubicka  ---
The testcase is invalid: function glob has the leaf attribute, but it is not
a leaf, since it calls back and modifies a static variable.

I filed https://sourceware.org/bugzilla/show_bug.cgi?id=28611

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2021-11-21 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 93358, which changed state.

Bug 93358 Summary: [10/11/12 Regression] 447.dealII regresses by 15% after r10-6025-gf5b25e15165adde1356af42e9066ab75c5b37a19
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93358

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug lto/93358] [10/11/12 Regression] 447.dealII regresses by 15% after r10-6025-gf5b25e15165adde1356af42e9066ab75c5b37a19

2021-11-21 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93358

Jan Hubicka  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Jan Hubicka  ---
According to the plot linked the regression was fixed in July 2020

[Bug ipa/103227] [12 Regression] 58% exchange2 regression with -Ofast -march=native on zen3 since r12-5223-gecdf414bd89e6ba251f6b3f494407139b4dbae0e

2021-11-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103227

--- Comment #10 from CVS Commits  ---
The master branch has been updated by Jan Hubicka :

https://gcc.gnu.org/g:0f5afb626381d19bfced30bc19cf3b03867fa6f5

commit r12-5439-g0f5afb626381d19bfced30bc19cf3b03867fa6f5
Author: Jan Hubicka 
Date:   Sun Nov 21 16:15:41 2021 +0100

Improve base tracking in ipa-modref

On the exchange2 benchmark we miss some useful propagation because modref gives
up very early on analyzing accesses through pointers.  For example in
int test (int *a)
{
  int i;
  for (i=0; a[i];i++);
  return i+a[i];
}

We are not able to determine that a[i] accesses are relative to a.
This is because get_access requires the SSA name that is in MEM_REF to be
PARM_DECL while in other places we use the ipa-prop helper to work out the
proper base pointers.

This patch commonizes the code in get_access and parm_map_for_arg so both
use the check properly and extends it to also figure out that newly
allocated memory is not a side effect to caller.

gcc/ChangeLog:

2021-11-21  Jan Hubicka  

PR ipa/103227
* ipa-modref.c (parm_map_for_arg): Rename to ...
(parm_map_for_ptr): .. this one; handle static chain and calls to
malloc functions.
(modref_access_analysis::get_access): Use parm_map_for_ptr.
(modref_access_analysis::process_fnspec): Update.
(modref_access_analysis::analyze_load): Update.
(modref_access_analysis::analyze_store): Update.

gcc/testsuite/ChangeLog:

2021-11-21  Jan Hubicka  

PR ipa/103227
* gcc.dg/tree-ssa/modref-15.c: New test.

[Bug testsuite/103264] [12 regression] gcc.dg/tree-prof/merge_block.c fails after r12-5236

2021-11-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103264

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Jan Hubicka :

https://gcc.gnu.org/g:c8260767aa3b41017b075d8fde3a4065fa637db7

commit r12-5438-gc8260767aa3b41017b075d8fde3a4065fa637db7
Author: Jan Hubicka 
Date:   Sun Nov 21 16:13:40 2021 +0100

Fix failure merge_block.c testcase

gcc/testsuite/ChangeLog:

2021-11-21  Jan Hubicka  

PR ipa/103264
* gcc.dg/tree-prof/merge_block.c: Add -fno-ipa-modref

[Bug c/103348] Bad code generated for fabs(long double) under aarch64

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103348

--- Comment #4 from Andrew Pinski  ---
(In reply to jacob navia from comment #3)

> As per standard c99 fabs is a generic function.

NO, you need to include tgmath.h for generic math functions.
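
A minimal sketch of the difference, assuming only <math.h> is included (the
helper names abs_via_double and abs_long_double are made up for this
illustration).  Plain fabs takes and returns double, so a long double argument
is converted first; fabsl operates on long double directly.

```c
#include <math.h>

/* What a call to fabs() from <math.h> amounts to for a long double
   argument: the value is converted to double before the call, so any
   extra precision is dropped. */
static long double abs_via_double(long double x)
{
    return fabs((double)x);
}

/* The long double variant keeps the full precision of its argument. */
static long double abs_long_double(long double x)
{
    return fabsl(x);
}
```

On targets where long double is wider than double, the two helpers return
different values for an input such as -2.3L, which matches the truncate/extend
pair seen in the reported assembly.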

[Bug c/103348] Bad code generated for fabs(long double) under aarch64

2021-11-21 Thread jacob at jacob dot remcomp.fr via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103348

--- Comment #3 from jacob navia  ---
1) The complete program is as follows:
#include 

int main(void)
{
long double ld = -2.3L;
ld = fabs(ld);
}

Compiler flags
gcc -S -c -std=c99 tafbs1.c

As per standard c99 fabs is a generic function.
After your remark, gcc generates correct code when I explicitly include


This is surprising




[Bug c++/77513] -Wzero-as-null-pointer-constant vs 0, nullptr, NULL and __null

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77513

Andrew Pinski  changed:

   What|Removed |Added

 CC||fchelnokov at gmail dot com

--- Comment #12 from Andrew Pinski  ---
*** Bug 103347 has been marked as a duplicate of this bug. ***

[Bug c++/103347] Non-static data member initialization is erroneously allowed in C++03 mode

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103347

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #3 from Andrew Pinski  ---
Dup of bug 77513

*** This bug has been marked as a duplicate of bug 77513 ***

[Bug c++/103347] Non-static data member initialization is erroneously allowed in C++03 mode

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103347

--- Comment #2 from Andrew Pinski  ---
That is, GCC does warn about the following case:
struct test {
int x = 0;
};
int main() {}

[Bug c++/103347] Non-static data member initialization is erroneously allowed in C++03 mode

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103347

--- Comment #1 from Andrew Pinski  ---
Macro tracking is getting in the way of emitting the warning, since NULL
comes from a system header.  There is already a dup of this bug.

[Bug tree-optimization/99520] Failure to detect bswap pattern

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99520

--- Comment #8 from Andrew Pinski  ---
(In reply to Richard Biener from comment #7)
> I think there was a recent duplicate.

Yes I think PR 98953.

[Bug c/103348] Bad code generated for fabs(long double) under aarch64

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103348

Andrew Pinski  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |INVALID

--- Comment #2 from Andrew Pinski  ---
#include 


long double 
f(long double ld) 
{
ld = fabsl(ld);
return ld;
}

Works correctly.

[Bug middle-end/31531] A microoptimization of isnegative of signed integer

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31531

Andrew Pinski  changed:

   What|Removed |Added

URL||https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585088.html
   Keywords||patch

--- Comment #18 from Andrew Pinski  ---
Patch posted:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585088.html

[Bug middle-end/103344] mulshift does not work when divisor is larger than 100 on 32 bits target.

2021-11-21 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103344

Roger Sayle  changed:

   What|Removed |Added

 CC||roger at nextmovesoftware dot com

--- Comment #6 from Roger Sayle  ---
mulshift's decision is based on the divisor and target CPU's timings (-mtune).
For example, -m32 uses mulshift for the udivmod divisor 1008, but not for 1000
(i.e. it's far more subtle than "doesn't work when divisor is larger than
100").

Unfortunately the (pruned-search) algorithm that GCC's expand uses makes it
impossible to print out the costs of different alternatives, instead all that
we know is that it was unable to find any instruction sequence that it
considered cheaper than calling __udivmoddi4.  (and it may have considered
thousands of potential sequences, see PR87256).
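
For readers unfamiliar with the transformation being discussed, here is a
minimal sketch of a multiply-and-shift division by the constant 1000.  It uses
a 32-bit dividend for brevity, whereas the report concerns 64-bit division on
a 32-bit target; the function name div1000_mulshift is made up, and the
sequence GCC's expander actually picks depends on its cost model.

```c
#include <stdint.h>

/* Divide a 32-bit unsigned value by 1000 without a division instruction.
   274877907 is ceil(2^38 / 1000).  The rounding error of that constant
   (56 / 2^38 per unit of the divisor) is small enough that the result
   is exact for every 32-bit input. */
static uint32_t div1000_mulshift(uint32_t n)
{
    return (uint32_t)(((uint64_t)n * 274877907u) >> 38);
}
```

On a 32-bit target the 64-bit analogue needs a 64x64->128 multiply built from
several 32-bit multiplies, which is where the cost comparison against calling
__udivmoddi4 comes in.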

[Bug c/103348] Bad code generated for fabs(long double) under aarch64

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103348

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
 Ever confirmed|0   |1
   Last reconfirmed||2021-11-21

--- Comment #1 from Andrew Pinski  ---
Full testcase?
Is this c or c++?
Also if this is c, you should be using fabsl for long double.

[Bug c/103348] New: Bad code generated for fabs(long double) under aarch64

2021-11-21 Thread jacob at jacob dot remcomp.fr via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103348

Bug ID: 103348
   Summary: Bad code generated for fabs(long double) under aarch64
   Product: gcc
   Version: 6.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jacob at jacob dot remcomp.fr
  Target Milestone: ---

Machine aarch64 linux
when compiling:
long double ld;

ld = fabs(ld);

The compiler generates:
ldr     q0, [x29, 16]
bl      __trunctfdf2
fabs    d0, d0
bl      __extenddftf2
str     q0, [x29, 16]
You truncate the 128-bit long double to 64 bits, then extend it back to 128
bits! You lose ALL PRECISION!

Incredible

[Bug tree-optimization/103345] missed optimization: add/xor individual bytes to form a word

2021-11-21 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103345

Roger Sayle  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
  Component|other   |tree-optimization
 Ever confirmed|0   |1
 CC||roger at nextmovesoftware dot com
   Last reconfirmed||2021-11-21

--- Comment #1 from Roger Sayle  ---
The ior form is perceived in tree-ssa's bswap pass implemented in
gimple-ssa-store-merging.c.
32 bit load in target endianness found at: _16 = MEM  [(const
uint8_t *)ptr_15(D)];
32-bit nop implementations found: 1
My guess is that it should be trivial to handle the PLUS_EXPR and BIT_XOR_EXPR
tree codes at the same time as BIT_IOR_EXPR.
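
The reason the three tree codes can be treated alike here is that the shifted
bytes occupy disjoint bit ranges, so no carry and no cancellation can occur.
A small sketch of that equivalence (the combine_* names are made up for this
illustration and are not part of GCC):

```c
#include <stdint.h>

/* Assemble a 32-bit value from four bytes, least significant byte first.
   The four shifted terms never share a set bit position, so |, + and ^
   all produce the same word. */
static uint32_t combine_or(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint32_t combine_add(const uint8_t *p)
{
    return (uint32_t)p[0] + ((uint32_t)p[1] << 8)
         + ((uint32_t)p[2] << 16) + ((uint32_t)p[3] << 24);
}

static uint32_t combine_xor(const uint8_t *p)
{
    return (uint32_t)p[0] ^ ((uint32_t)p[1] << 8)
         ^ ((uint32_t)p[2] << 16) ^ ((uint32_t)p[3] << 24);
}
```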

[Bug testsuite/103264] [12 regression] gcc.dg/tree-prof/merge_block.c fails after r12-5236

2021-11-21 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103264

--- Comment #3 from Jan Hubicka  ---
Aha, I see it.  The difference with IPA modref is that we now eliminate the
loop before profile instrumentation.  This leads to a difference because the
function t() is never executed and thus we use a guessed profile on it.

Now without modref we are not able to optimize out the loop and we actually
profile it and then scale down the profile to zero.  This makes the cunrolli
updating bug disappear and also shows weird behaviour of profile scaling -
when we get the count to 0 we should maintain the local profile.

So to get back to the original behaviour we want to add -fno-ipa-modref, but
there are at least two other issues - cunrolli misupdating the profile (which,
if I recall, may be partly intentional) and profile scaling killing the local
profile when dropping the count to 0.

[Bug c++/103347] New: Non-static data member initialization is erroneously allowed in C++03 mode

2021-11-21 Thread fchelnokov at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103347

Bug ID: 103347
   Summary: Non-static data member initialization is erroneously
allowed in C++03 mode
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fchelnokov at gmail dot com
  Target Milestone: ---

This code is invalid in C++03 mode:
```
#include 
struct test {
void *x = NULL; //invalid in C++03 mode
};
int main() {}
```
It is rejected by Clang, but not by GCC. Demo:
https://gcc.godbolt.org/z/113EWvzzx

Related discussion: https://stackoverflow.com/q/49618716/7325599

[Bug testsuite/103264] [12 regression] gcc.dg/tree-prof/merge_block.c fails after r12-5236

2021-11-21 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103264

--- Comment #2 from Jan Hubicka  ---
OK, the test wants to verify that main() became a single basic block that has
the correct frequency.  This is still true.  What we get wrong is that the
function t() gets mismatches:
int t ()
{
  int i;
  int _7;
  int _10;
  int _13;

;;   basic block 2, loop depth 0, count 312727306 (estimated locally, globally
0), probably never executed
;;prev block 0, next block 3, flags: (NEW, REACHABLE, VISITED)
;;pred:   ENTRY [always]  count:312727306 (estimated locally, globally
0) (FALLTHRU,EXECUTABLE)
  _7 = a[0];
  if (_7 != 0)
goto ; [5.50%]
  else
goto ; [94.50%]
;;succ:   6 [5.5% (guessed)]  count:1722 (estimated locally,
globally 0) (TRUE_VALUE,EXECUTABLE)
;;3 [94.5% (guessed)]  count:295527304 (estimated locally,
globally 0) (FALSE_VALUE,EXECUTABLE)

;;   basic block 3, loop depth 0, count 299827304 (estimated locally, globally
0), probably never executed
;;   Invalid sum of incoming counts 295527304 (estimated locally, globally 0),
should be 299827304 (estimated locally, globally 0)
;;prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED)
;;pred:   2 [94.5% (guessed)]  count:295527304 (estimated locally,
globally 0) (FALSE_VALUE,EXECUTABLE)
  _10 = a[1];
  if (_10 != 0)
goto ; [5.50%]
  else
goto ; [94.50%]
;;succ:   6 [5.5% (guessed)]  count:16490502 (estimated locally,
globally 0) (TRUE_VALUE,EXECUTABLE)
;;4 [94.5% (guessed)]  count:283336802 (estimated locally,
globally 0) (FALSE_VALUE,EXECUTABLE)

;;   basic block 4, loop depth 0, count 287459432 (estimated locally, globally
0), probably never executed
;;   Invalid sum of incoming counts 283336802 (estimated locally, globally 0),
should be 287459432 (estimated locally, globally 0)
;;prev block 3, next block 5, flags: (NEW, REACHABLE, VISITED)
;;pred:   3 [94.5% (guessed)]  count:283336802 (estimated locally,
globally 0) (FALSE_VALUE,EXECUTABLE)
  _13 = a[2];
  if (_13 != 0)
goto ; [5.50%]
  else
goto ; [94.50%]
;;succ:   6 [5.5% (guessed)]  count:15810269 (estimated locally,
globally 0) (TRUE_VALUE,EXECUTABLE)
;;5 [94.5% (guessed)]  count:271649163 (estimated locally,
globally 0) (FALSE_VALUE,EXECUTABLE)

;;   basic block 5, loop depth 0, count 271649163 (estimated locally, globally
0), probably never executed
;;prev block 4, next block 6, flags: (NEW)
;;pred:   4 [94.5% (guessed)]  count:271649163 (estimated locally,
globally 0) (FALSE_VALUE,EXECUTABLE)
;;succ:   6 [always]  count:271649163 (estimated locally, globally 0)
(FALLTHRU)

;;   basic block 6, loop depth 0, count 312727306 (estimated locally, globally
0), probably never executed
;;   Invalid sum of incoming counts 321149936 (estimated locally, globally 0),
should be 312727306 (estimated locally, globally 0)
;;prev block 5, next block 1, flags: (NEW, REACHABLE, VISITED)
;;pred:   3 [5.5% (guessed)]  count:16490502 (estimated locally,
globally 0) (TRUE_VALUE,EXECUTABLE)
;;5 [always]  count:271649163 (estimated locally, globally 0)
(FALLTHRU)
;;4 [5.5% (guessed)]  count:15810269 (estimated locally,
globally 0) (TRUE_VALUE,EXECUTABLE)
;;2 [5.5% (guessed)]  count:1722 (estimated locally,
globally 0) (TRUE_VALUE,EXECUTABLE)
  # i_5 = PHI <1(3), 3(5), 2(4), 0(2)>
  return i_5;
;;succ:   EXIT [always]  count:312727306 (estimated locally, globally
0)

}

which appears in cunrolli. I don't think this is related to the modref change.

Martin, can you please bisect it?

[Bug middle-end/103344] mulshift does not work when divisor is larger than 100 on 32 bits target.

2021-11-21 Thread unlvsur at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103344

--- Comment #5 from cqwrteur  ---
(In reply to Andrew Pinski from comment #4)
> Again you misunderstood. There is a cost model for doing the multiply-shift
> instead of the division, and if it says the function call will be faster, it
> uses that. There is also a trade-off with respect to code size. This is one
> area where gcc could be improved in printing out what choices it made and
> why, but that is a different issue.
> Again, how do you know if the function will be slower than the multiply and
> shift here?

dragonbox!

[Bug tree-optimization/102720] [12 regression] gcc.dg/tree-ssa/ldist-strlen-1.c and ldist-strlen-2.c fail after r12-4324

2021-11-21 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102720

Jan Hubicka  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #13 from Jan Hubicka  ---
Fixed.

[Bug c++/103346] New: ICE on template specialization via alias template with a non-type parameter pack, and fold expression on lambda calls

2021-11-21 Thread ap.vanzanten at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103346

Bug ID: 103346
   Summary: ICE on template specialization via alias template with
a non-type parameter pack, and fold expression on
lambda calls
   Product: gcc
   Version: 11.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ap.vanzanten at gmail dot com
  Target Milestone: ---

Command: g++ -std=c++20 

Code:


struct Item {};

template 
struct Sequence;

template 
using ItemSequence = Sequence;

template 
constexpr auto f() {
  constexpr auto l = [](Item item) { return item; };
  return ItemSequence{};
}



Output:


: In substitution of 'template using ItemSequence =
Sequence [with Item ...items = {f::l((const
Item)items)...}]':
:12:34:   required from here
:7:37: internal compiler error: Segmentation fault
7 | using ItemSequence = Sequence;
  | ^
0x1786229 internal_error(char const*, ...)
???:0
0xfa6740 strip_array_types(tree_node*)
???:0
0x82f438 cp_type_quals(tree_node const*)
???:0
0x7e8a1b tsubst_pack_expansion(tree_node*, tree_node*, int, tree_node*)
???:0
0x7ef14a tsubst_template_args(tree_node*, tree_node*, int, tree_node*)
???:0
0x7ef72c tsubst_argument_pack(tree_node*, tree_node*, int, tree_node*)
???:0
0x7ef12c tsubst_template_args(tree_node*, tree_node*, int, tree_node*)
???:0
0x7dd7cc tsubst(tree_node*, tree_node*, int, tree_node*)
???:0
0x7dd8a7 tsubst(tree_node*, tree_node*, int, tree_node*)
???:0
0x7f40b2 instantiate_template(tree_node*, tree_node*, int)
???:0
0x7dde4d tsubst(tree_node*, tree_node*, int, tree_node*)
???:0
0x7e6702 lookup_template_class(tree_node*, tree_node*, tree_node*, tree_node*,
int, int)
???:0
0x80a52d finish_template_type(tree_node*, tree_node*, int)
???:0
0x7c47db c_parse_file()
???:0
0x896762 c_common_parse_file()
???:0
Please submit a full bug report,
...



Note the issue is not present when moving the lambda 'l' out of the function
'f' (into global scope), or when removing the 'ItemSequence' alias, and simply
using 'Sequence' directly.

See also https://godbolt.org/z/Yqnnod4W6 for a live example.

[Bug target/102117] s390: Inefficient code for 64x64=128 signed multiply for <= z13

2021-11-21 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102117

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:dc915b361bbc99da83fc53db7f7e0e28d0ce12c8

commit r12-5436-gdc915b361bbc99da83fc53db7f7e0e28d0ce12c8
Author: Roger Sayle 
Date:   Sun Nov 21 11:40:08 2021 +

Tweak tree-ssa-math-opts.c to solve PR target/102117.

This patch resolves PR target/102117 on s390.  The problem is that
some of the functionality of GCC's RTL expanders is no longer triggered
following the transition to tree SSA form.  On s390, unsigned widening
multiplications are converted into WIDEN_MULT_EXPR (aka w* in tree dumps),
but signed widening multiplies are left in their original form, which
alas doesn't benefit from the clever logic in expand_widening_mult.

The fix is to teach convert_mult_to_widen that RTL expansion can
synthesize a signed widening multiplication if the target provides
a suitable umul_widen_optab.

On s390-linux-gnu with -O2 -m64, the code in the bugzilla PR currently
generates:

imul128:
stmg    %r12,%r13,96(%r15)
srag    %r0,%r4,63
srag    %r1,%r3,63
lgr     %r13,%r3
mlgr    %r12,%r4
msgr    %r1,%r4
msgr    %r0,%r3
lgr     %r4,%r12
agr     %r1,%r0
lgr     %r5,%r13
agr     %r4,%r1
stmg    %r4,%r5,0(%r2)
lmg     %r12,%r13,96(%r15)
br      %r14

but with this patch should now generate the more efficient:

imul128:
lgr     %r1,%r3
mlgr    %r0,%r4
srag    %r5,%r3,63
ngr     %r5,%r4
srag    %r4,%r4,63
sgr     %r0,%r5
ngr     %r4,%r3
sgr     %r0,%r4
stmg    %r0,%r1,0(%r2)
br      %r14
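
The shorter sequence appears to rely on the usual identity for deriving a
signed widening product from an unsigned one: take the unsigned product and
subtract each operand from the high half when the other operand is negative.
A minimal sketch at a smaller width (32x32->64 rather than 64x64->128), with a
made-up function name; this is an illustration of the identity, not the code
in tree-ssa-math-opts.c or expand_widening_mult:

```c
#include <stdint.h>

/* Signed 32x32->64 multiply built from an unsigned 32x32->64 multiply.
   Returns the two's-complement bit pattern of (int64_t)a * b. */
static uint64_t smul_wide_via_unsigned(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a;
    uint32_t ub = (uint32_t)b;
    uint64_t p = (uint64_t)ua * ub;      /* unsigned widening multiply */
    uint32_t hi = (uint32_t)(p >> 32);

    /* Correct the high half: a negative operand was read as value + 2^32,
       which added the other operand (times 2^32) to the product. */
    if (a < 0)
        hi -= ub;
    if (b < 0)
        hi -= ua;

    return ((uint64_t)hi << 32) | (uint32_t)p;
}
```

The srag/ngr/sgr triples in the listing above compute the same two corrections
without branches, by masking one operand with the sign of the other.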

2021-11-21  Roger Sayle  
Robin Dapp  

gcc/ChangeLog
PR target/102117
* tree-ssa-math-opts.c (convert_mult_to_widen): Recognize
signed WIDEN_MULT_EXPR if the target supports umul_widen_optab.

gcc/testsuite/ChangeLog
PR target/102117
* gcc.target/s390/mul-wide.c: New test case.
* gcc.target/s390/umul-wide.c: New test case.

[Bug other/103345] New: missed optimization: add/xor individual bytes to form a word

2021-11-21 Thread gcc at rjk dot terraraq.uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103345

Bug ID: 103345
   Summary: missed optimization: add/xor individual bytes to form
a word
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gcc at rjk dot terraraq.uk
  Target Milestone: ---

All code generated with godbolt's idea of 'trunk'. See
https://godbolt.org/z/Wcj61PKKG

Source:

#include 

uint32_t load_le_32_or(const uint8_t *ptr)
{
  return ((uint32_t)ptr[0]) | ((uint32_t)ptr[1] << 8) | ((uint32_t)ptr[2] <<
16) | ((uint32_t)ptr[3] << 24);
}

uint32_t load_le_32_add(const uint8_t *ptr)
{
  return ((uint32_t)ptr[0]) + ((uint32_t)ptr[1] << 8) + ((uint32_t)ptr[2] <<
16) + ((uint32_t)ptr[3] << 24);
}


uint32_t load_le_32_xor(const uint8_t *ptr)
{
  return ((uint32_t)ptr[0]) ^ ((uint32_t)ptr[1] << 8) ^ ((uint32_t)ptr[2] <<
16) ^ ((uint32_t)ptr[3] << 24);
}

The ^ version is admittedly a bit of an odd choice but the + version is a
reasonably natural way to write the code.


Code on gcc -O2:

load_le_32_or:
mov eax, DWORD PTR [rdi]
ret
load_le_32_add:
movzx   eax, BYTE PTR [rdi+1]
movzx   edx, BYTE PTR [rdi+2]
sal eax, 8
sal edx, 16
add eax, edx
movzx   edx, BYTE PTR [rdi]
add eax, edx
movzx   edx, BYTE PTR [rdi+3]
sal edx, 24
add eax, edx
ret
load_le_32_xor:
movzx   eax, BYTE PTR [rdi+1]
movzx   edx, BYTE PTR [rdi+2]
sal eax, 8
sal edx, 16
xor eax, edx
movzx   edx, BYTE PTR [rdi]
xor eax, edx
movzx   edx, BYTE PTR [rdi+3]
sal edx, 24
xor eax, edx
ret


Code on clang -O2:

load_le_32_or:  # @load_le_32_or
mov eax, dword ptr [rdi]
ret
load_le_32_add: # @load_le_32_add
mov eax, dword ptr [rdi]
ret
load_le_32_xor: # @load_le_32_xor
mov eax, dword ptr [rdi]
ret

[Bug middle-end/103344] mulshift does not work when divisor is larger than 100 on 32 bits target.

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103344

--- Comment #4 from Andrew Pinski  ---
Again you misunderstood. There is a cost model for doing the multiply-shift
instead of the division, and if it says the function call will be faster, it
uses that. There is also a trade-off with respect to code size. This is one
area where gcc could be improved in printing out what choices it made and
why, but that is a different issue.
Again, how do you know if the function will be slower than the multiply and
shift here?

[Bug middle-end/103344] mulshift does not work when divisor is larger than 100 on 32 bits target.

2021-11-21 Thread unlvsur at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103344

--- Comment #3 from cqwrteur  ---
(In reply to Andrew Pinski from comment #2)
> Sounds like a cost model issue. Are you sure it is faster?

division is very very slow. And you use software emulation which is even
slower.

[Bug tree-optimization/103344] mulshift does not work when divisor is larger than 100 on 32 bits target.

2021-11-21 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103344

--- Comment #2 from Andrew Pinski  ---
Sounds like a cost model issue. Are you sure it is faster?

[Bug tree-optimization/103344] mulshift does not work when divisor is larger than 100 on 32 bits target.

2021-11-21 Thread unlvsur at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103344

--- Comment #1 from cqwrteur  ---
https://godbolt.org/z/v57vxqq7E

while 64bits target works

[Bug tree-optimization/103344] New: mulshift does not work when divisor is larger than 100 on 32 bits target.

2021-11-21 Thread unlvsur at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103344

Bug ID: 103344
   Summary: mulshift does not work when divisor is larger than 100
on 32 bits target.
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: unlvsur at live dot com
  Target Milestone: ---

https://godbolt.org/z/x3zx738he