[Bug tree-optimization/93078] New: Missing fma and round functions auto-vectorization with x86-64 (sse2)

2019-12-26 Thread diegoandres91b at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93078

            Bug ID: 93078
           Summary: Missing fma and round functions auto-vectorization
                    with x86-64 (sse2)
           Product: gcc
           Version: 9.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: diegoandres91b at hotmail dot com
  Target Milestone: ---

The following code (compiled with -Ofast):

#include <cmath>

using namespace std;

float a[4], b[4], c[4];

void vec_fma() {
    for (int i = 0; i < 4; ++i) c[i] = fma(a[i], b[i], c[i]);
}

void vec_round() {
    for (int i = 0; i < 4; ++i) c[i] = round(a[i]);
}

void vec_floor() {
    for (int i = 0; i < 4; ++i) c[i] = floor(a[i]);
}

void vec_ceil() {
    for (int i = 0; i < 4; ++i) c[i] = ceil(a[i]);
}

void vec_trunc() {
    for (int i = 0; i < 4; ++i) c[i] = trunc(a[i]);
}

void vec_rint() {
    for (int i = 0; i < 4; ++i) c[i] = rint(a[i]);
}

void vec_nearbyint() {
    for (int i = 0; i < 4; ++i) c[i] = nearbyint(a[i]);
}

Compiles without auto-vectorization:

vec_fma():
sub rsp, 8
movss   xmm2, DWORD PTR c[rip]
movss   xmm1, DWORD PTR b[rip]
movss   xmm0, DWORD PTR a[rip]
call    fmaf
movss   xmm2, DWORD PTR c[rip+4]
movss   xmm1, DWORD PTR b[rip+4]
movss   DWORD PTR c[rip], xmm0
movss   xmm0, DWORD PTR a[rip+4]
call    fmaf
movss   xmm2, DWORD PTR c[rip+8]
movss   xmm1, DWORD PTR b[rip+8]
movss   DWORD PTR c[rip+4], xmm0
movss   xmm0, DWORD PTR a[rip+8]
call    fmaf
movss   xmm2, DWORD PTR c[rip+12]
movss   xmm1, DWORD PTR b[rip+12]
movss   DWORD PTR c[rip+8], xmm0
movss   xmm0, DWORD PTR a[rip+12]
call    fmaf
movss   DWORD PTR c[rip+12], xmm0
add rsp, 8
ret
vec_round():
movss   xmm3, DWORD PTR a[rip]
movss   xmm0, DWORD PTR .LC1[rip]
movss   xmm2, DWORD PTR .LC0[rip]
movaps  xmm4, xmm0
movaps  xmm1, xmm3
andps   xmm1, xmm0
comiss  xmm2, xmm1
jbe .L5
addss   xmm1, DWORD PTR .LC2[rip]
andnps  xmm4, xmm3
movaps  xmm3, xmm4
cvttss2si   eax, xmm1
pxor    xmm1, xmm1
cvtsi2ss        xmm1, eax
orps    xmm3, xmm1
.L5:
movss   DWORD PTR c[rip], xmm3
movss   xmm3, DWORD PTR a[rip+4]
movaps  xmm4, xmm0
movaps  xmm1, xmm3
andps   xmm1, xmm0
comiss  xmm2, xmm1
jbe .L6
addss   xmm1, DWORD PTR .LC2[rip]
andnps  xmm4, xmm3
movaps  xmm3, xmm4
cvttss2si   eax, xmm1
pxor    xmm1, xmm1
cvtsi2ss        xmm1, eax
orps    xmm3, xmm1
.L6:
movss   DWORD PTR c[rip+4], xmm3
movss   xmm3, DWORD PTR a[rip+8]
movaps  xmm4, xmm0
movaps  xmm1, xmm3
andps   xmm1, xmm0
comiss  xmm2, xmm1
jbe .L7
addss   xmm1, DWORD PTR .LC2[rip]
andnps  xmm4, xmm3
movaps  xmm3, xmm4
cvttss2si   eax, xmm1
pxor    xmm1, xmm1
cvtsi2ss        xmm1, eax
orps    xmm3, xmm1
.L7:
movss   DWORD PTR c[rip+8], xmm3
movss   xmm3, DWORD PTR a[rip+12]
movaps  xmm1, xmm3
andps   xmm1, xmm0
comiss  xmm2, xmm1
jbe .L8
addss   xmm1, DWORD PTR .LC2[rip]
andnps  xmm0, xmm3
cvttss2si   eax, xmm1
pxor    xmm1, xmm1
cvtsi2ss        xmm1, eax
movaps  xmm3, xmm1
orps    xmm3, xmm0
.L8:
movss   DWORD PTR c[rip+12], xmm3
ret

...

vec_nearbyint():
sub rsp, 8
movss   xmm0, DWORD PTR a[rip]
call    nearbyintf
movss   DWORD PTR c[rip], xmm0
movss   xmm0, DWORD PTR a[rip+4]
call    nearbyintf
movss   DWORD PTR c[rip+4], xmm0
movss   xmm0, DWORD PTR a[rip+8]
call    nearbyintf
movss   DWORD PTR c[rip+8], xmm0
movss   xmm0, DWORD PTR a[rip+12]
call    nearbyintf
movss   DWORD PTR c[rip+12], xmm0
add rsp, 8
ret
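
One likely reason the fma loop cannot simply become a mulps followed by an addps under plain SSE2 is that fma rounds only once, while a separate multiply and add round twice. The sketch below is an illustration of that difference, not something taken from the compiler output above; the helper names mul_then_add and fused are hypothetical.

```cpp
#include <cmath>

// Hypothetical helper: a*b + c computed with two roundings.
// The volatile temporary forces the product to be rounded to float
// and keeps the compiler from contracting the expression into an fma.
float mul_then_add(float a, float b, float c) {
    volatile float product = a * b;  // first rounding
    return product + c;              // second rounding
}

// Hypothetical helper: the fused form, rounded only once.
float fused(float a, float b, float c) {
    return std::fma(a, b, c);
}
```

With a = b = 1 + 2^-12 and c = -(1 + 2^-11), the exact product is 1 + 2^-11 + 2^-24. The separate multiply rounds the 2^-24 term away, so mul_then_add returns 0, while fused keeps it and returns 2^-24.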

In comparison, the icc compiler also fails to auto-vectorize fma in SSE2 mode
(where the native vfmadd132ps FMA instruction is unavailable), but it does have
vectorized versions of the rounding functions (in SSE2 mode, without the native
roundps instruction of SSE4.1):

vec_round():
push  rsi
movups    xmm0, XMMWORD PTR a[rip]
call  QWORD PTR [__svml_roundf4@GOTPCREL+rip]
movups    XMMWORD PTR c[rip], xmm0
pop   rcx
ret

...

vec_nearbyint():
push  rsi
movups    xmm0, XMMWORD PTR a[rip]
call  QWORD PTR [__svml_nearbyintf4@GOTPCREL+rip]
movups    XMMWORD PTR c[rip], xmm0
pop   rcx
ret
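
For illustration, the scalar sequence GCC emits for vec_round (andps, comiss, addss, cvttss2si, cvtsi2ss, orps) could in principle be applied to four lanes at once using only SSE2. The following is a minimal hand-written sketch under that assumption, not what GCC or icc actually generate; the helper name round4_sse2 is hypothetical, and the sketch adds the largest float below 0.5 rather than 0.5 itself so that 0.49999997f does not round up to 1.

```cpp
#include <cmath>
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper: rounds four floats to the nearest integer,
// with halfway cases away from zero, using only SSE/SSE2 instructions.
void round4_sse2(const float* in, float* out) {
    const __m128 abs_mask = _mm_castsi128_ps(_mm_set1_epi32(0x7fffffff));
    const __m128 limit = _mm_set1_ps(8388608.0f);  // 2^23: larger floats are already integers
    const __m128 almost_half = _mm_set1_ps(std::nextafter(0.5f, 0.0f));

    __m128 x = _mm_loadu_ps(in);
    __m128 ax = _mm_and_ps(x, abs_mask);         // |x|
    __m128 sign = _mm_andnot_ps(abs_mask, x);    // sign bit of x
    __m128 is_small = _mm_cmplt_ps(ax, limit);   // false for NaN and for |x| >= 2^23

    // Truncate |x| + (almost) 0.5 through an integer conversion, then restore the sign.
    __m128i truncated = _mm_cvttps_epi32(_mm_add_ps(ax, almost_half));
    __m128 rounded = _mm_or_ps(_mm_cvtepi32_ps(truncated), sign);

    // Lane-wise select: rounded where is_small is set, the original x otherwise.
    __m128 result = _mm_or_ps(_mm_and_ps(is_small, rounded),
                              _mm_andnot_ps(is_small, x));
    _mm_storeu_ps(out, result);
}
```

Calling round4_sse2(a, c) on the arrays from the test case would replace the four scalar branches with one branch-free sequence. The select at the end plays the role of the jbe in the scalar code.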

Compiler Explorer Code: 

[Bug c++/93077] New: internal compiler error: in hash_operand during GIMPLE pass: fre

2019-12-26 Thread raj.khem at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93077

            Bug ID: 93077
           Summary: internal compiler error: in hash_operand during GIMPLE
                    pass: fre
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: raj.khem at gmail dot com
  Target Milestone: ---

Created attachment 47552
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47552&action=edit
testcase

The attached test case crashes GCC 10 (it works with GCC 9).

arm-yoe-linux-musleabi-g++  -march=armv7ve -mthumb -mfpu=neon -mfloat-abi=hard
test.cpp -c -O1


during GIMPLE pass: fre
test.cpp: In function 'void __tcf_0(void*)':
test.cpp:50411:5: internal compiler error: in hash_operand, at
fold-const.c:3768
50411 | };
  | ^
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.

[Bug middle-end/93076] New: internal compiler error: Segmentation fault during GIMPLE pass: cddce

2019-12-26 Thread raj.khem at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93076

            Bug ID: 93076
           Summary: internal compiler error: Segmentation fault during
                    GIMPLE pass: cddce
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: raj.khem at gmail dot com
  Target Milestone: ---

Created attachment 47551
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47551&action=edit
testcase

The attached test fails with GCC 10 but works with GCC 9. It fails on arm, mips,
and x86, so it is not necessarily ARM-specific.

arm-yoe-linux-musleabi-g++ -march=armv7ve -mthumb -mfpu=neon -mfloat-abi=hard
-c test.cpp -O1



test.cpp:49: warning: "__cpp_constexpr" redefined
   49 | #define __cpp_constexpr 200704L
  |
: note: this is the location of the previous definition
test.cpp:10279: warning: "NULL" redefined
10279 | #define NULL 0L
  |
test.cpp:4697: note: this is the location of the previous definition
 4697 | #define NULL __null
  |
during GIMPLE pass: cddce
test.cpp: In function
'testing::internal::ParamGenerator
spvtools::utils::{anonymous}::gtest_F32ToF16HexFloatFP32To16Tests_EvalGenerator_()':
test.cpp:66196:1: internal compiler error: Segmentation fault
66196 | }
  | ^
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.

[Bug c++/93075] New: Incorrect line number in DW_MACRO_start_file entry of file included via "-include" option

2019-12-26 Thread tatyana at synopsys dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93075

            Bug ID: 93075
           Summary: Incorrect line number in DW_MACRO_start_file entry of
                    file included via "-include" option
           Product: gcc
           Version: 8.3.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tatyana at synopsys dot com
  Target Milestone: ---

The DW_MACRO_start_file entry for a file included via the "-include" option
contains a wrong line number operand in the .debug_macro section.

Also, the source file line numbers are not incremented (as they should be, since
a line "#include ..." is effectively added at the beginning).

GCC 8.3.1

[Bug c++/92438] [8/9 Regression] Function declaration parsed incorrectly with `-std=c++1z`

2019-12-26 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92438

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[8/9/10 Regression]         |[8/9 Regression] Function
                   |Function declaration parsed |declaration parsed
                   |incorrectly with            |incorrectly with
                   |`-std=c++1z`                |`-std=c++1z`

--- Comment #6 from Jakub Jelinek  ---
Fixed on the trunk so far.

[Bug c++/92438] [8/9/10 Regression] Function declaration parsed incorrectly with `-std=c++1z`

2019-12-26 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92438

--- Comment #5 from Jakub Jelinek  ---
Author: jakub
Date: Thu Dec 26 10:16:01 2019
New Revision: 279736

URL: https://gcc.gnu.org/viewcvs?rev=279736&root=gcc&view=rev
Log:
PR c++/92438
* parser.c (cp_parser_constructor_declarator_p): If open paren
is followed by RID_ATTRIBUTE, skip over the attribute tokens and
try to parse type specifier.

* g++.dg/ext/attrib61.C: New test.

Added:
trunk/gcc/testsuite/g++.dg/ext/attrib61.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/parser.c
trunk/gcc/testsuite/ChangeLog

[Bug bootstrap/93074] [10 regression] build FAIL with --enable-offload-targets=nvptx-none

2019-12-26 Thread dimhen at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93074

--- Comment #2 from Dmitry G. Dyachenko  ---
(In reply to Andrew Pinski from comment #1)
> According to
> https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__DEVICE.html
> 
> cuDeviceGetName exists.
> Maybe F31 has an older version of Cuda installed.

I have no CUDA installed.
Can I check something else?

r279710 FAIL
r279709 PASS