[Bug target/101142] [11/12 regression] Regression due to supporting bitwise operators on AVX512 masks.

2021-06-20 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101142

--- Comment #4 from Hongtao.liu  ---
(In reply to Andrew Pinski from comment #3)
> The exact command line to hit this issue is:
>  -O3 -march=skylake-avx512

Yes, thanks for the clarification.

g++ byteswap.cpp test.cpp -march=skylake-avx512 -O3 -DDTYPE32

[Bug target/100866] PPC: Inefficient code for vec_revb(vector unsigned short) < P9

2021-06-20 Thread jens.seifert at de dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100866

--- Comment #9 from Jens Seifert  ---
I know that if I used the vec_perm builtin as an end user, you would need to
fulfill the LE specification, but you can always optimize the code as you like
as long as it still produces correct results.

load constant
xxlnor constant

can always be transformed to 

load inverse constant.

[Bug target/101142] [11/12 regression] Regression due to supporting bitwise operators on AVX512 masks.

2021-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101142

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |11.2
   Keywords||missed-optimization
   Last reconfirmed||2021-06-21
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #3 from Andrew Pinski  ---
The exact command line to hit this issue is:
 -O3 -march=skylake-avx512

[Bug debug/101141] Fedora glibc debuginfo .dwz contains a partial unit with needed debuginfo but which is not imported

2021-06-20 Thread roc at ocallahan dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101141

--- Comment #3 from roc at ocallahan dot org  ---
Filed https://sourceware.org/bugzilla/show_bug.cgi?id=28000

[Bug target/101142] [11/12 regression] Regression due to supporting bitwise operators on AVX512 masks.

2021-06-20 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101142

--- Comment #2 from Hongtao.liu  ---
I'm working on a patch which slightly disparages the mask register alternative
for bitwise operations (using "?k" in alternatives). It can prevent mask bitwise
instruction generation when the input is not allocated to mask registers.

Also, when the allocno cost of GENERAL_REGS is the same as that of MASK_REGS,
allocate MASK_REGS first, since it has already been disparaged. This is for
testcases like the one below, where the inputs are allocated to mask registers
and mask bitwise instructions should therefore be used.

#include <immintrin.h>
volatile __mmask8 foo;
void
foo_orb (__m512i a, __m512i b, __m512i c, __m512i d)
{
  __mmask8 m1 = _mm512_cmp_epi64_mask (a, b, 2);
  __mmask8 m2 = _mm512_cmp_epi64_mask (c, d, 4);
  foo = m1 | m2;
}

vpcmpq  $2, %zmm1, %zmm0, %k0
vpcmpq  $4, %zmm3, %zmm2, %k1
korb    %k1, %k0, %k2
kmovb   %k2, foo(%rip)
ret
foo:

[Bug debug/101141] Fedora glibc debuginfo .dwz contains a partial unit with needed debuginfo but which is not imported

2021-06-20 Thread roc at ocallahan dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101141

roc at ocallahan dot org  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|WAITING |RESOLVED

--- Comment #2 from roc at ocallahan dot org  ---
Good point. It looks correct in the generated .so so it's probably a DWZ bug.
I'll repost it there. Sorry for the noise.

[Bug target/101142] [11/12 regression] Regression due to supporting bitwise operators on AVX512 masks.

2021-06-20 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101142

--- Comment #1 from Hongtao.liu  ---
Created attachment 51041
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51041&action=edit
byteswap.cpp

[Bug target/101142] New: [11/12 regression] Regression due to supporting bitwise operators on AVX512 masks.

2021-06-20 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101142

Bug ID: 101142
   Summary: [11/12 regression] Regression due to  supporting
bitwise operators on AVX512 masks.
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: crazylht at gmail dot com
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-*-* i?86-*-*

Created attachment 51040
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51040&action=edit
test.cpp

Options to reproduce the regression: g++ byteswap.cpp test.cpp -march=native -O3
-DDTYPE32

The regression is due to r11-2796-g388cb292a94f98a276548cd6ce01285cf36d17df,
which supports bitwise operators for AVX512 masks. byteswap.cpp contains a lot
of bitwise operations and the register pressure is very high, so LRA allocates
some bitwise operations to mask registers to avoid spills. The problem is that
mask bitwise instructions have 1/4 the throughput of their GPR versions, which
causes the regression.

[Bug debug/101141] Fedora glibc debuginfo .dwz contains a partial unit with needed debuginfo but which is not imported

2021-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101141

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2021-06-21
 Status|UNCONFIRMED |WAITING

--- Comment #1 from Andrew Pinski  ---
This sounds like either a binutils issue or a https://sourceware.org/dwz/ issue
or a Fedora issue.
Can you rebuild glibc in Fedora and look at the resulting glibc.so file before
the dwz and debug info gets stripped?

[Bug debug/101141] New: Fedora glibc debuginfo .dwz contains a partial unit with needed debuginfo but which is not imported

2021-06-20 Thread roc at ocallahan dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101141

Bug ID: 101141
   Summary: Fedora glibc debuginfo .dwz contains a partial unit
with needed debuginfo but which is not imported
   Product: gcc
   Version: 11.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: roc at ocallahan dot org
  Target Milestone: ---

The Fedora 34 package glibc-debuginfo-2.33-16.fc34.x86_64 has glibc
symbols in
/usr/lib/debug/.build-id/08/1490fc18239fa63189a53e526a68ee5d19c571.debug whose
.gnu_debugaltlink is /usr/lib/debug/usr/.dwz/glibc-2.33-16.fc34.x86_64.

/usr/lib/debug/.build-id/08/1490fc18239fa63189a53e526a68ee5d19c571.debug has a
compilation unit that imports the partial unit at 0x1f63c from the .dwz file:

UNIT:
< 0><0x000c GOFF=0x000795fa>  DW_TAG_compile_unit
DW_AT_producer 
<.debug_str(sup)+0x0001200e>
DW_AT_language  DW_LANG_C11
DW_AT_name  global-locale.c
DW_AT_comp_dir 
/usr/src/debug/glibc-2.33-16.fc34.x86_64/locale
DW_AT_stmt_list
<.debug_line+0xce8a>
< 1><0x001e GOFF=0x0007960c>DW_TAG_imported_unit
  DW_AT_import   
<.debug_info(sup)+0x0001f63c>

That partial unit contains a variable declaration whose DW_AT_specification
(0x1f0d2) lives in another partial compilation unit:

UNIT:
< 0><0x000c GOFF=0x0001f63c>  DW_TAG_partial_unit
DW_AT_stmt_list
<.debug_line+0x>
...
< 1><0x0038 GOFF=0x0001f668>DW_TAG_variable
  DW_AT_specification
<.debug_info+0x0001f0d2>
  DW_AT_decl_file 0x013b
/usr/src/debug/glibc-2.33-16.fc34.x86_64/locale/global-locale.c
  DW_AT_decl_line 0x0040
  DW_AT_decl_column   0x0013
  DW_AT_location  len 0x000a:
0e9b: DW_OP_const8u 0 DW_OP_form_tls_address

UNIT:
< 0><0x000c GOFF=0x0001f0cd>  DW_TAG_partial_unit
DW_AT_stmt_list
<.debug_line+0x>
< 1><0x0011 GOFF=0x0001f0d2>DW_TAG_variable
  DW_AT_name 
__libc_tsd_LOCALE
  DW_AT_decl_file 0x0089
/usr/src/debug/glibc-2.33-16.fc34.x86_64/locale/localeinfo.h
  DW_AT_decl_line 0x00e1
  DW_AT_decl_column   0x0001
  DW_AT_type 
<.debug_info+0x00035fb3>
  DW_AT_external  yes
  DW_AT_declaration   yes

However, the partial unit at 0x1f0cd is not imported anywhere in
/usr/lib/debug/.build-id/08/1490fc18239fa63189a53e526a68ee5d19c571.debug as far
as I can tell. I think the compilation unit at global-locale.c, at least,
should be importing it.

[Bug target/100866] PPC: Inefficient code for vec_revb(vector unsigned short) < P9

2021-06-20 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100866

--- Comment #8 from luoxhu at gcc dot gnu.org ---
(In reply to Jens Seifert from comment #7)
> Regarding vec_revb for vector unsigned int. I agree that
> revb:
> .LFB0:
> .cfi_startproc
> vspltish %v1,8
> vspltisw %v0,-16
> vrlh %v2,%v2,%v1
> vrlw %v2,%v2,%v0
> blr
> 
> works. But in this case, I would prefer the vperm approach assuming that the
> loaded constant for the permute vector can be re-used multiple times.
> But please get rid of the xxlnor 32,32,32. That does not make sense after
> loading a constant. Change the constant that need to be loaded.

xxlnor is an LE-specific requirement (it is not generated when building with
-mbig). We need to turn the index {0,1,2,3} into {31,30,29,28} for vperm usage;
without it the result is incorrect:

 6|    0x1630 <+16>:    lvx     v0,0,r9
 7+>   0x1634 <+20>:    xxlnor  vs32,vs32,vs32
 8|    0x1638 <+24>:    vperm   v2,v2,v2,v0
 9|    0x163c <+28>:    blr

(gdb)
0x1634 in revb ()
2: /x $vs34.uint128 = 0x42345678323456782234567812345678
5: /x $vs32.uint128 = 0xc0d0e0f08090a0b0405060700010203
(gdb) si
0x1638 in revb ()
2: /x $vs34.uint128 = 0x42345678323456782234567812345678
5: /x $vs32.uint128 = 0xf3f2f1f0f7f6f5f4fbfaf9f8fffefdfc
(gdb) si
0x163c in revb ()
2: /x $vs34.uint128 = 0x78563442785634327856342278563412
5: /x $vs32.uint128 = 0xf3f2f1f0f7f6f5f4fbfaf9f8fffefdfc



Quoted from the ISA:

vperm VRT,VRA,VRB,VRC

vsrc.qword[0] ← VSR[VRA+32]
vsrc.qword[1] ← VSR[VRB+32]
do i = 0 to 15
index ← VSR[VRC+32].byte[i].bit[3:7]
VSR[VRT+32].byte[i] ← src.byte[index]
end

Let the source vector be the concatenation of the
contents of VSR[VRA+32] followed by the contents of
VSR[VRB+32].
For each integer value i from 0 to 15, do the following.
Let index be the value specified by bits 3:7 of byte
element i of VSR[VRC+32].
The contents of byte element index of src are
placed into byte element i of VSR[VRT+32].
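
The gdb session and the quoted pseudo-code can be cross-checked with a small host-side emulation. This is only a sketch: `Vec`, `vperm`, `xxlnor_self`, `gdb_data`, `gdb_control` and `check_examples` are hypothetical names introduced here, bytes are stored in the ISA's big-endian element order (element 0 is the most significant byte of the uint128 value gdb prints), and nothing below is taken from the GCC sources.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// 16-byte vector register; element 0 is the most significant byte of the
// uint128 value printed by gdb (big-endian element numbering).
using Vec = std::array<std::uint8_t, 16>;

// Follows the quoted vperm pseudo-code: the source is VRA followed by VRB,
// and bits 3:7 (the low five bits) of each control byte select a source byte.
Vec vperm(const Vec& vra, const Vec& vrb, const Vec& vrc) {
  Vec vrt{};
  for (std::size_t i = 0; i < 16; ++i) {
    const unsigned index = vrc[i] & 0x1fu;
    vrt[i] = index < 16 ? vra[index] : vrb[index - 16];
  }
  return vrt;
}

// xxlnor x,x,x computes ~(x | x), i.e. the bitwise complement of x.
Vec xxlnor_self(const Vec& x) {
  Vec r{};
  for (std::size_t i = 0; i < 16; ++i) r[i] = static_cast<std::uint8_t>(~x[i]);
  return r;
}

// Register contents copied from the gdb session ($vs34 and $vs32).
Vec gdb_data() {
  return Vec{{0x42, 0x34, 0x56, 0x78, 0x32, 0x34, 0x56, 0x78,
              0x22, 0x34, 0x56, 0x78, 0x12, 0x34, 0x56, 0x78}};
}
Vec gdb_control() {
  return Vec{{0x0c, 0x0d, 0x0e, 0x0f, 0x08, 0x09, 0x0a, 0x0b,
              0x04, 0x05, 0x06, 0x07, 0x00, 0x01, 0x02, 0x03}};
}

void check_examples() {
  const Vec d = gdb_data();
  // Complemented control vector: every 32-bit word is byte-reversed.
  const Vec swapped = vperm(d, d, xxlnor_self(gdb_control()));
  assert(swapped[0] == 0x78 && swapped[3] == 0x42);
  // Uncomplemented control vector: only the word order is reversed.
  const Vec plain = vperm(d, d, gdb_control());
  assert(plain[0] == 0x12 && plain[3] == 0x78);
}
```

In the low five bits, complementing a control byte maps index k to 31 - k, which is why the emulation reproduces the final $vs34 value only with the complemented vector. It also suggests that a constant stored already complemented would select the same bytes, which is the transformation asked for in comment #7.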

RISC-V: Parsing custom extension that is version 0

2021-06-20 Thread Robert Balas via Gcc-bugs
When giving gcc a -march string with a custom extension of
version 0 (for example pulpv0), gcc will assign it the
default version of 2p0.

In gcc/common/config/riscv/riscv-common.c the function
riscv_subset_list::parsing_subset_version falls back to the
default version (2p0) when parsing if the major and minor version
are both zero (which is the case for the string "pulpv0"). This
means both "pulpv0" and "pulpv2" will get assigned the version
2p0. Looks wrong to me.

Robert
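
One way to avoid the ambiguity described above is to record whether a version was present in the string at all, instead of treating "major and minor are both zero" as "no version given". The following is a minimal sketch of that idea only; `SubsetVersion`, `parse_subset` and `check_examples` are hypothetical names, the version syntax is assumed to be a trailing `<major>` or `<major>p<minor>`, and none of this is the code in riscv-common.c.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type; not a GCC data structure.
struct SubsetVersion {
  std::string name;
  unsigned major_version = 0;
  unsigned minor_version = 0;
  bool explicit_version = false;  // true when the string carried a version
};

SubsetVersion parse_subset(const std::string& ext,
                           unsigned default_major = 2,
                           unsigned default_minor = 0) {
  auto is_digit = [&ext](std::size_t i) {
    return std::isdigit(static_cast<unsigned char>(ext[i])) != 0;
  };
  auto to_number = [&ext](std::size_t begin, std::size_t end) {
    return static_cast<unsigned>(std::stoul(ext.substr(begin, end - begin)));
  };

  // [first, last) is the trailing run of digits, if any.
  const std::size_t last = ext.size();
  std::size_t first = last;
  while (first > 0 && is_digit(first - 1)) --first;

  SubsetVersion v;
  if (first == last) {
    // No version in the string: only this case falls back to the default.
    v.name = ext;
    v.major_version = default_major;
    v.minor_version = default_minor;
    return v;
  }

  v.explicit_version = true;
  if (first >= 2 && ext[first - 1] == 'p' && is_digit(first - 2)) {
    // "<major>p<minor>" form, for example "2p0".
    const std::size_t major_end = first - 1;
    std::size_t major_begin = major_end;
    while (major_begin > 0 && is_digit(major_begin - 1)) --major_begin;
    v.major_version = to_number(major_begin, major_end);
    v.minor_version = to_number(first, last);
    v.name = ext.substr(0, major_begin);
  } else {
    // "<major>" only; an explicit 0 stays 0.
    v.major_version = to_number(first, last);
    v.name = ext.substr(0, first);
  }
  return v;
}

void check_examples() {
  // "pulpv0" and "pulpv2" no longer collapse to the same version.
  assert(parse_subset("pulpv0").major_version == 0);
  assert(parse_subset("pulpv2").major_version == 2);
  assert(!parse_subset("pulpv").explicit_version);
}
```

With an explicit flag, the default version is applied only when the flag is false, so a genuine version 0 survives parsing.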


[Bug fortran/100971] ICE: Bad IO basetype (7)

2021-06-20 Thread jvdelisle2 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100971

Jerry DeLisle  changed:

   What|Removed |Added

 CC||jvdelisle2 at gmail dot com

--- Comment #1 from Jerry DeLisle  ---
I can confirm this bug.

[Bug rtl-optimization/46235] inefficient bittest code generation

2021-06-20 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46235

Roger Sayle  changed:

   What|Removed |Added

 CC||roger at nextmovesoftware dot com
 Resolution|--- |FIXED
   Target Milestone|--- |12.0
 Status|UNCONFIRMED |RESOLVED

--- Comment #7 from Roger Sayle  ---
Fixed on mainline.

[Bug libstdc++/101136] msdosdjgpp toolchain cannot find std::wstring_view

2021-06-20 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101136

--- Comment #2 from Jonathan Wakely  ---
This is because the _GLIBCXX_USE_WCHAR_T macro is not defined, because
 etc are not complete on the target, so we don't have e.g. wcslen and
other wchar_t functions.

However, the wchar_t type is always defined for C++ and so the template can be
used with it.

So not a bug, but the expected behaviour.

See mails from me last year about enabling more of the library without needing
, which would fix this.

[Bug c++/101140] New: [modules] no matching function for call to ‘operator new(sizetype, void*)’

2021-06-20 Thread ensadc at mailnesia dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101140

Bug ID: 101140
   Summary: [modules] no matching function for call to ‘operator
new(sizetype, void*)’
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ensadc at mailnesia dot com
  Target Milestone: ---

$ cat new.hpp
extern "C++" {
void* operator new(__SIZE_TYPE__, void* p);
}

$ cat foo.cpp
module;
#include "new.hpp"
export module foo;

export template<typename T>
T* construct_at(T* p) {
return ::new((void*)p) T();
}

$ cat bar.cpp
export module bar;

import foo;

void f(int* p) {
construct_at(p);
}

$ g++ -std=c++20 -fmodules-ts -c foo.cpp
$ g++ -std=c++20 -fmodules-ts -c bar.cpp
In module foo, imported at bar.cpp:3:
foo.cpp: In instantiation of ‘T* construct_at@foo(T*) [with T = int]’:
bar.cpp:6:17:   required from here
foo.cpp:7:12: error: no matching function for call to ‘operator new(sizetype,
void*)’
7 | return ::new((void*)p) T();
  |^~~
: note: candidate: ‘void* operator new(long unsigned int)’
: note:   candidate expects 1 argument, 2 provided
: note: candidate: ‘void* operator new(long unsigned int,
std::align_val_t)’
: note:   no known conversion for argument 2 from ‘void*’ to
‘std::align_val_t’
bar.cpp:1:8: warning: not writing module ‘bar’ due to errors
1 | export module bar;
  |^~



The error disappears if "new.hpp" is included or imported in `bar.cpp`.

I originally encountered this problem when using `std::construct_at` (defined
in `<memory>`) in a module.

[Bug target/101132] [11/12 regression] [MIPS/MSA] internal compiler error: in do_store_flag, at expr.c:12541

2021-06-20 Thread xry111 at mengyan1223 dot wang via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101132

--- Comment #4 from Xi Ruoyao  ---
(In reply to Xi Ruoyao from comment #3)
> Another testcase (produced by cvise from mesa-21.1.3):

Flag: -O3 -mmsa -fno-trapping-math

[Bug target/101132] [11/12 regression] [MIPS/MSA] internal compiler error: in do_store_flag, at expr.c:12541

2021-06-20 Thread xry111 at mengyan1223 dot wang via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101132

--- Comment #3 from Xi Ruoyao  ---
Another testcase (produced by cvise from mesa-21.1.3):

unsigned float3_to_rgb9e5_gc_0;
util_format_r9g9b9e5_float_pack_rgba_float_dst_row_bc_0;
util_format_r9g9b9e5_float_pack_rgba_float_dst_row() {
  unsigned x;
  char *dst = util_format_r9g9b9e5_float_pack_rgba_float_dst_row;
  for (; x; x += 1) {
int __trans_tmp_1, maxrgb_0;
struct {
  unsigned u
} f, max;
if (f.u > 80)
  __trans_tmp_1 = 0;
else
  __trans_tmp_1 = max.u;
maxrgb_0 = __trans_tmp_1 > float3_to_rgb9e5_gc_0
   ?: util_format_r9g9b9e5_float_pack_rgba_float_dst_row_bc_0;
*dst = maxrgb_0;
dst += 4;
  }
}

[Bug target/101132] [11/12 regression] [MIPS/MSA] internal compiler error: in do_store_flag, at expr.c:12541

2021-06-20 Thread xry111 at mengyan1223 dot wang via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101132

Xi Ruoyao  changed:

   What|Removed |Added

  Component|middle-end  |target

--- Comment #2 from Xi Ruoyao  ---
This seems mips specific.

[Bug tree-optimization/66787] gcc fails tail call elimination

2021-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66787

Andrew Pinski  changed:

   What|Removed |Added

  Known to fail||8.3.0
  Component|c++ |tree-optimization
   Keywords||missed-optimization
  Known to work||12.0

--- Comment #3 from Andrew Pinski  ---
Looks like it is fixed on the trunk:

t9.cc.044t.tailr1:Eliminated tail recursion in bb 4 : _14 = Array::computeSubSize (this_10(D), _5);
t9.cc.044t.tailr1:Eliminated tail recursion in bb 7 : BlendingTable::create
(this_16(D), _11, 255, 255);
t9.cc.044t.tailr1:Eliminated tail recursion in bb 5 : BlendingTable::create
(this_16(D), dst_15, _10, 255);
t9.cc.044t.tailr1:Eliminated tail recursion in bb 3 : BlendingTable::create
(this_16(D), dst_15, src_13, _9);
t9.cc.044t.tailr1:Eliminated tail recursion in bb 8 : BlendingTable::print
(this_14(D), _10, 255, 255);
t9.cc.044t.tailr1:Eliminated tail recursion in bb 6 : BlendingTable::print
(this_14(D), dst_17, _9, 255);
t9.cc.044t.tailr1:Eliminated tail recursion in bb 4 : BlendingTable::print
(this_14(D), dst_17, src_16, _8);

It was not working in GCC 8.0 though:
t9.cc.043t.tailr1:Eliminated tail recursion in bb 4 : _15 = Array::computeSubSize (this_12, _5);
t9.cc.043t.tailr1:Eliminated tail recursion in bb 10 : BlendingTable::create
(this_16, _11, 255, 255);
t9.cc.043t.tailr1:Eliminated tail recursion in bb 8 : BlendingTable::create
(this_16, dst_15, _10, 255);
t9.cc.043t.tailr1:Eliminated tail recursion in bb 6 : BlendingTable::create
(this_16, dst_15, src_13, _9);

[Bug middle-end/62062] Missed optimization: write ptr reloaded in each loop iteration

2021-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62062

Andrew Pinski  changed:

   What|Removed |Added

  Component|inline-asm  |middle-end

--- Comment #4 from Andrew Pinski  ---
for write_run_char_ptr_ref we get:

   [local count: 955630225]:
  # n_18 = PHI 
  _2 = *p_10(D);
  _3 = _2 + 1;
  *p_10(D) = _3;
  *_2 = _13;
  n_8 = n_18 + -1;
  if (n_8 != -1(OVF))
goto ; [89.00%]
  else
goto ; [11.00%]

So I think this is still correct based on the aliasing rules ...
That is, a store through a char* may write to the object that a reference/pointer
of type (char*&) refers to.

Now we could version this loop for aliasing, but I don't know how much it would
benefit in general.
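
To make the aliasing argument concrete, here is a sketch of the two shapes involved. The body of `write_run_char_ptr_ref` is an assumption reconstructed from the GIMPLE above, not the reporter's source, and `write_run_local_copy` and `same_result` are hypothetical helpers. The two functions agree only when the written bytes do not overlap the pointer object itself, which is exactly the case the compiler cannot prove.

```cpp
#include <cassert>
#include <cstring>

// Assumed shape of the function in the report: the pointer is loaded and
// stored through the reference on every iteration, because the char store
// is allowed to modify the bytes of the pointer object that p refers to.
void write_run_char_ptr_ref(char*& p, char c, int n) {
  for (int i = 0; i < n; ++i) {
    *p = c;  // char store: may alias the pointer object behind p
    ++p;     // load and store of the pointer through the reference
  }
}

// Hypothetical rewrite: the pointer lives in a local whose address never
// escapes, so the char stores cannot alias it and it can stay in a register.
void write_run_local_copy(char*& p, char c, int n) {
  char* q = p;
  for (int i = 0; i < n; ++i) {
    *q = c;
    ++q;
  }
  p = q;
}

// Runs both variants on separate buffers and compares the outcome.
bool same_result(int n) {
  char a[16] = {};
  char b[16] = {};
  assert(n >= 0 && n <= 16);
  char* pa = a;
  char* pb = b;
  write_run_char_ptr_ref(pa, 'x', n);
  write_run_local_copy(pb, 'x', n);
  return pa - a == n && pb - b == n && std::memcmp(a, b, sizeof a) == 0;
}
```

Versioning the loop for aliasing would amount to a runtime check that selects the second form when the destination range and the pointer object do not overlap.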

[Bug middle-end/61621] Normal enum switch slower than test case.

2021-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61621

Andrew Pinski  changed:

   What|Removed |Added

  Component|c++ |middle-end
   Keywords||missed-optimization

--- Comment #3 from Andrew Pinski  ---
Note I think this might have been improved already.

Also I think there is an issue here which is not mentioned: in the case of
"instructions", there might be a few most-used ones which can be pulled out of
the case table.
Also, the test function below could even use a constant table and a load
from it indexed by instructions[i] (with the result then added to value).
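
The constant-table idea can be sketched as follows. The test function from the report is not shown in this message, so the enum, the per-instruction deltas and all function names here are hypothetical; the sketch only illustrates replacing a switch whose cases each add a constant with a table load.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical instruction set; each instruction adds a constant to value.
enum Instruction { kInc, kDec, kAddTwo, kNop, kInstructionCount };

// Switch form: one case per instruction.
int run_switch(const Instruction* instructions, std::size_t n) {
  int value = 0;
  for (std::size_t i = 0; i < n; ++i) {
    switch (instructions[i]) {
      case kInc:    value += 1; break;
      case kDec:    value -= 1; break;
      case kAddTwo: value += 2; break;
      default:      break;
    }
  }
  return value;
}

// Table form: a load from a constant table indexed by instructions[i],
// with the loaded delta added to value.
int run_table(const Instruction* instructions, std::size_t n) {
  static const int kDelta[kInstructionCount] = {1, -1, 2, 0};
  int value = 0;
  for (std::size_t i = 0; i < n; ++i) value += kDelta[instructions[i]];
  return value;
}

// Runs one small program through either variant.
int run_example(bool use_table) {
  const Instruction program[] = {kInc, kInc, kAddTwo, kDec, kNop};
  const std::size_t n = sizeof program / sizeof program[0];
  assert(n == 5);
  return use_table ? run_table(program, n) : run_switch(program, n);
}
```

The table form applies only when every case reduces to the same operation with a different constant; cases with distinct side effects would still need a branch.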

[Bug tree-optimization/94956] Unable to remove impossible ffs() test for zero

2021-06-20 Thread steinar+gcc at gunderson dot no via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94956

--- Comment #7 from Steinar H. Gunderson  ---
To wrap this up, confirming that GCC 11 does well on my benchmark:

BM_Chain2054529 iterations  18781 ns/iter   GCC 10, asm bsfq
BM_Chain2044584 iterations  22509 ns/iter   GCC 10, ffsll()
BM_Chain2049753 iterations  20216 ns/iter   GCC 11, asm bsfq
BM_Chain2053346 iterations  18816 ns/iter   GCC 11, ffsll()
BM_Chain2064926 iterations  15747 ns/iter   Clang 12, asm bsfq
BM_Chain2071208 iterations  14374 ns/iter   Clang 12, ffsll()

So basically for 11+, the ffsll() statement does better than the bsfq
statement, whereas it used to do markedly worse.

Clang does even better, but I can live with that. :-)
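
For readers without the benchmark at hand, one plausible shape of the pattern is sketched below. It assumes the caller guarantees a nonzero argument and expresses that with `__builtin_unreachable()`; the function names are hypothetical and the benchmark's actual code is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Index of the lowest set bit. The caller guarantees x != 0, so the
// "return 0 for a zero input" path inside ffsll() is impossible and the
// compiler is free to drop the corresponding test.
int lowest_set_bit(std::uint64_t x) {
  if (x == 0) __builtin_unreachable();
  return __builtin_ffsll(static_cast<long long>(x)) - 1;
}

void check_examples() {
  assert(lowest_set_bit(1) == 0);
  assert(lowest_set_bit(8) == 3);
  assert(lowest_set_bit(0x30) == 4);
}
```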

[Bug tree-optimization/101139] Unable to remove double byteswap in fast path

2021-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101139

--- Comment #2 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #1)
>   if (b_13 < h.0_15)
> goto ; [51.12%]
>   else
> goto ; [48.88%]
> 
>[local count: 548896825]:
>   _16 = (short unsigned int) f$ab_14;
>   _17 = (int) _16;
>   _18 = __builtin_bswap16 (_17);
>   goto ; [100.00%]
> 
>[local count: 524845000]:
>   k_22 = i ();
>   c.1_23 = (short unsigned int) k_22;
>   _24 = (int) c.1_23;
>   _25 = __builtin_bswap16 (_24);
> 
>[local count: 1073741824]:
>   # prephitmp_32 = PHI <_18(4), _25(5)>
> 
> Basically the same issue as PR 13563.

Once that issue is fixed we should get:
  if (b_13 < h.0_15)
goto ; [51.12%]
  else
goto ; [48.88%]

   [local count: 548896825]:
  _16 = (short unsigned int) f$ab_14;
  goto ; [100.00%]

   [local count: 524845000]:
  k_22 = i ();
  c.1_23 = (short unsigned int) k_22;

   [local count: 1073741824]:
  # c.1_23 = PHI <_16(4), c.1_23(5)>

  _24 = (int) c.1_23;
  prephitmp_32 = __builtin_bswap16 (_32);
  _2 = (int) prephitmp_32;
  _3 = __builtin_bswap16 (_2);
  _4 = (int) _3;

Which then should just optimize later on to:
  if (b_13 < h.0_15)
goto ; [51.12%]
  else
goto ; [48.88%]

   [local count: 548896825]:
  _16 = (short unsigned int) f$ab_14;
  goto ; [100.00%]

   [local count: 524845000]:
  k_22 = i ();
  c.1_23 = (short unsigned int) k_22;

   [local count: 1073741824]:
  # c.1_23 = PHI <_16(4), c.1_23(5)>

  _24 = (int) c.1_23;
  _4 = _24;
(If I did this conversion right)

[Bug tree-optimization/101139] Unable to remove double byteswap in fast path

2021-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101139

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2021-06-20
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org
 Depends on||13563
 Status|UNCONFIRMED |ASSIGNED
   Keywords||missed-optimization
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
  if (b_13 < h.0_15)
goto ; [51.12%]
  else
goto ; [48.88%]

   [local count: 548896825]:
  _16 = (short unsigned int) f$ab_14;
  _17 = (int) _16;
  _18 = __builtin_bswap16 (_17);
  goto ; [100.00%]

   [local count: 524845000]:
  k_22 = i ();
  c.1_23 = (short unsigned int) k_22;
  _24 = (int) c.1_23;
  _25 = __builtin_bswap16 (_24);

   [local count: 1073741824]:
  # prephitmp_32 = PHI <_18(4), _25(5)>

Basically the same issue as PR 13563.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=13563
[Bug 13563] if-conversion not agressive enough

[Bug c++/61245] #pragma GCC ivdep is ignored with call inside the test of a for loop

2021-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61245

--- Comment #5 from Andrew Pinski  ---
The loop is still vectorized, though, as it looks like it was versioned.

t6.cc: In function ‘void doT(SoA&) [with int N = 3]’:
t6.cc:34:17: warning: ignoring loop annotation
   34 |   for (auto i=0U; i

[Bug tree-optimization/101139] New: Unable to remove double byteswap in fast path

2021-06-20 Thread steinar+gcc at gunderson dot no via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101139

Bug ID: 101139
   Summary: Unable to remove double byteswap in fast path
   Product: gcc
   Version: 10.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: steinar+gcc at gunderson dot no
  Target Milestone: ---

The following code is reduced from a real interpreter:

extern void (*a[])();
int d, e, h, l;
typedef struct {
  char ab;
} f;
f g;
short i();
short m68ki_read_imm_16() {
  short j, k;
  int b = d;
  f f = g;
  if (b < h)
return __builtin_bswap16(()[0]);
  k = i();
  short c = k;
  j = __builtin_bswap16(c);
  return j;
}
int b() {
  short m;
  do {
m = m68ki_read_imm_16();
short c = m;
l = __builtin_bswap16(c);
a[l]();
  } while (e);
  return e;
}

Compiling with arm-linux-gnueabihf-gcc-10 -O2 yields this interesting sequence
in the function:

b       .L11
.L15:
ldrb    r3, [r5, #8]    @ zero_extendqisi2
rev16   r3, r3
uxth    r3, r3
.L10:
rev16   r3, r3
uxth    r3, r3

The original code intention was to have a reusable function that returned in
big-endian, but that a specific use of it would be able to ignore endianness
into a table lookup, removing the double-swap entirely. GCC can normally do
that, but it seems that the branch in m68ki_read_imm_16() somehow gets in the
way. Just to be clear, I expect zero rev16 instructions altogether in b() when
m68ki_read_imm_16() is inlined.
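
The expectation rests on the fact that two 16-bit byte swaps cancel. A minimal sketch of the intended usage pattern follows; `to_big_endian16` and `table_index` are hypothetical stand-ins for the reusable reader and its caller, not the interpreter's real functions.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the reusable reader: returns its input byte-swapped.
std::uint16_t to_big_endian16(std::uint16_t host) {
  return __builtin_bswap16(host);
}

// Stand-in for the caller: it swaps again to index a table, so after
// inlining the two swaps should fold away and no rev16/rolw should remain.
std::uint16_t table_index(std::uint16_t host) {
  return __builtin_bswap16(to_big_endian16(host));
}

void check_examples() {
  assert(to_big_endian16(0x1234) == 0x3412);
  assert(table_index(0x1234) == 0x1234);
}
```

In straight-line code like this the pair is expected to fold; the report is about the case where the first swap sits in two different branches and the second one follows the join point.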

The problem is not ARM-specific; x86 shows a similar problematic sequence:

leaq    a(%rip), %rbx
jmp     .L11
.p2align 4,,10
.p2align 3
.L15:
movsbw  g(%rip), %ax
rolw    $8, %ax
.L10:
rolw    $8, %ax
movzwl  %ax, %edx

Also verified with

gcc version 12.0.0 20210527 (experimental) [master revision
262e75d22c3:7bb6b9b2f47:9d3a953ec4d2695e9a6bfa5f22655e2aea47a973] (Debian
20210527-1)

[Bug c++/61245] #pragma GCC ivdep is ignored with call inside the test of a for loop

2021-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61245

Andrew Pinski  changed:

   What|Removed |Added

Summary|ICE at in expand_ANNOTATE,  |#pragma GCC ivdep is
   |at internal-fn.c:127 called |ignored with call inside
   |from cfgexpand.c|the test of a for loop

--- Comment #4 from Andrew Pinski  ---
The ICE to warning was fixed with r5-4959.

[Bug c++/61245] ICE at in expand_ANNOTATE, at internal-fn.c:127 called from cfgexpand.c

2021-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61245

Andrew Pinski  changed:

   What|Removed |Added

   Keywords|ice-on-valid-code   |missed-optimization

--- Comment #3 from Andrew Pinski  ---
We now get a warning instead of an internal compiler error:
t6.cc:34:17: warning: ignoring loop annotation
   34 |   for (auto i=0U; i

[Bug target/59555] bogus error: template with C linkage with preprocessed c++ file

2021-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59555

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
  Component|c++ |target
 Status|UNCONFIRMED |RESOLVED
   Target Milestone|--- |9.0

--- Comment #1 from Andrew Pinski  ---
Fixed for GCC 9 by r9-1648 which changed NO_IMPLICIT_EXTERN_C to
SYSTEM_IMPLICIT_EXTERN_C and made it less fragile.

[Bug target/56066] g++ generates strong symbols conflicting with C99 extern inline code on Windows

2021-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56066

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|REOPENED|RESOLVED
  Component|c++ |target

--- Comment #6 from Andrew Pinski  ---
I don't think this is a bug. Rather, the problem is that you are linking two
different linkages and expecting it to work. In the C++ case it is vague
linkage, while in C there is still extern linkage.
The correct thing is to use gnu_inline so that you get the linkage you expect in
both languages.

[Bug c++/43064] improve location and text of diagnostics in constructor initializer lists

2021-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43064

Andrew Pinski  changed:

   What|Removed |Added

 CC||bero at arklinux dot org

--- Comment #13 from Andrew Pinski  ---
*** Bug 43933 has been marked as a duplicate of this bug. ***

[Bug c++/43933] Suboptimal error message when supplying a bad default value in initialization

2021-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43933

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED

--- Comment #4 from Andrew Pinski  ---
In the end this is a dup of bug 43064, which is fixed in GCC 9 as I showed.

*** This bug has been marked as a duplicate of bug 43064 ***

[Bug c++/43933] Suboptimal error message when supplying a bad default value in initialization

2021-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43933

Andrew Pinski  changed:

   What|Removed |Added

  Known to fail||8.3.1

--- Comment #3 from Andrew Pinski  ---
Looks like it has been fixed on the trunk:
t7.cc: In constructor ‘A::A()’:
t7.cc:13:10: error: call of overloaded ‘QString(int)’ is ambiguous
   13 | A::A() : a(0), b(0) { }
  |  ^~~~
t7.cc:4:3: note: candidate: ‘QString::QString(char)’
4 |   QString(char);
  |   ^~~
t7.cc:3:3: note: candidate: ‘QString::QString(const QString&)’
3 |   QString(const QString&);
  |   ^~~
t7.cc:2:3: note: candidate: ‘QString::QString(const char*)’
2 |   QString(const char*);
  |   ^~~
t7.cc:13:16: error: call of overloaded ‘QString(int)’ is ambiguous
   13 | A::A() : a(0), b(0) { }
  |^~~~
t7.cc:4:3: note: candidate: ‘QString::QString(char)’
4 |   QString(char);
  |   ^~~
t7.cc:3:3: note: candidate: ‘QString::QString(const QString&)’
3 |   QString(const QString&);
  |   ^~~
t7.cc:2:3: note: candidate: ‘QString::QString(const char*)’
2 |   QString(const char*);
  |   ^~~


It was broken in GCC 8.3 though:
t7.cc: In constructor ‘A::A()’:
t7.cc:13:19: error: call of overloaded ‘QString(int)’ is ambiguous
 A::A() : a(0), b(0) { }
   ^

[Bug c++/43881] warning attached to a function is emitted even though the function is not being called

2021-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43881

--- Comment #6 from Andrew Pinski  ---
The best way to do this is to use asm instead:
extern "C" int close(int);
extern __typeof__ (close) close __attribute__ ((__warning__ ("The symbol close
refers to the system function. Use safe_close instead.")));
extern __typeof__ (close) safe_close asm("close");

[Bug c++/100134] [modules] ICE when using a vector in a module

2021-06-20 Thread ensadc at mailnesia dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100134

--- Comment #1 from ensadc at mailnesia dot com ---
Reduced:


vector

namespace std {
template  struct __replace_first_arg;
template  class _Template, typename _Up,
  typename _Tp, typename... _Types>
struct __replace_first_arg<_Template<_Tp, _Types...>, _Up> {
  using type = _Template<_Up, _Types...>;
};
template  struct allocator {
  typedef _Tp value_type;
  friend constexpr bool operator==(const allocator &,
   const allocator &) noexcept {
return true;
  }
};
} // namespace std
namespace __gnu_cxx {
template 
struct __alloc_traits {
  template  struct rebind {
typedef std::__replace_first_arg<_Alloc, _Tp> other;
  };
};
} // namespace __gnu_cxx
namespace std {
template > class vector {
  typedef
  typename __gnu_cxx::__alloc_traits<_Alloc>::template rebind<_Tp>::other
  _Tp_alloc_type;
};
} // namespace std

foo.cpp

export module foo;
import <vector>;
export struct Foo {
  std::vector v;
};

build commands:

g++ -I . -x c++-system-header -fmodules-ts -std=c++20 vector
g++ -I . -std=c++20 -fmodules-ts foo.cpp

[Bug c++/43149] Partial optimization

2021-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43149

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #9 from Andrew Pinski  ---
https://gcc.gnu.org/onlinedocs/gcc/Function-Specific-Option-Pragmas.html

Works across headers as there is nothing special about headers really.
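
As an illustration of the pragmas documented on that page, the sketch below changes the optimization level for one function only. The function names are hypothetical, and the region between `push_options` and `pop_options` could equally sit in a header, since the pragmas apply to whatever function definitions follow them in the translation unit.

```cpp
#include <cassert>

// Everything defined between push_options and pop_options is compiled
// with the options named in the optimize pragma (GCC only; other
// compilers may ignore the pragma).
#pragma GCC push_options
#pragma GCC optimize ("O0")
int sum_without_optimization(const int* values, int count) {
  int total = 0;
  for (int i = 0; i < count; ++i) total += values[i];
  return total;
}
#pragma GCC pop_options

// Defined after pop_options: compiled with the command-line options again.
int sum_with_command_line_options(const int* values, int count) {
  int total = 0;
  for (int i = 0; i < count; ++i) total += values[i];
  return total;
}

// Both variants must compute the same value; only code generation differs.
bool sums_agree() {
  const int values[] = {1, 2, 3, 4};
  assert(sizeof values / sizeof values[0] == 4);
  return sum_without_optimization(values, 4) == 10 &&
         sum_with_command_line_options(values, 4) == 10;
}
```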

[Bug c++/67252] Demangler fails on template conversion operator

2021-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67252

Andrew Pinski  changed:

   What|Removed |Added

 Depends on||41233

--- Comment #2 from Andrew Pinski  ---
Might be a dup of bug 41233.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41233
[Bug 41233] Templated conversion operator produces symbol name that won't
demangle

[Bug c++/101138] Ambiguous code (with operator==) compiled without error

2021-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101138

--- Comment #1 from Andrew Pinski  ---
ICC also accepts it but Microsoft rejects it.
But ICC rejects the following which GCC still accepts:
//#include <iostream>
//using namespace std;
#define printf __builtin_printf

//template
struct D {
template bool operator==(Y a) const { 
//cout << "f" < 
bool operator==(T a, D b) { 
printf("f2\n");
return false;
}
struct ok;

int main()
{
D a, b;
if (a == b)
return 0;
return 1;
}

[Bug c++/101138] New: Ambiguous code (with operator==) compiled without error

2021-06-20 Thread hiraditya at msn dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101138

Bug ID: 101138
   Summary: Ambiguous code (with operator==) compiled without
error
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hiraditya at msn dot com
  Target Milestone: ---

$ cat test.cpp

#include <iostream>
using namespace std;

template
struct D {
template bool operator==(Y a) const { 
cout << "f" < 
bool operator==(T a, D b) { 
cout << "fD" < a, b;
if (a == b)
return 0;
return 1;
}

gcc compiles this code fine, but clang errors out.

https://godbolt.org/z/c13EExxeY