[Bug target/95783] New: Inefficient use of the stack when a function takes the address of its argument

2020-06-19 Thread josephcsible at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95783

Bug ID: 95783
   Summary: Inefficient use of the stack when a function takes the
address of its argument
   Product: gcc
   Version: 10.1.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: josephcsible at gmail dot com
  Target Milestone: ---
Target: x86_64-linux-gnu

Consider this C code:

void g(long *);
long f(long x) {
    g(&x);
    return x;
}

At either "-O3" or "-Os", it results in this assembly:

f:
subq    $24, %rsp
movq    %rdi, 8(%rsp)
leaq    8(%rsp), %rdi
call    g
movq    8(%rsp), %rax
addq    $24, %rsp
ret

There are two problems with this: it's unnecessarily complicated with extra
instructions, and it wastes 16 bytes of stack space. I'd rather see this
assembly instead:

f:
pushq   %rdi
movq    %rsp, %rdi
call    g
popq    %rax
ret

https://godbolt.org/z/PuNB6Y

[Bug preprocessor/95782] New: [ppc64le] ICE in _cpp_pop_context

2020-06-19 Thread e...@coeus-group.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95782

Bug ID: 95782
   Summary: [ppc64le] ICE in _cpp_pop_context
   Product: gcc
   Version: 10.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: preprocessor
  Assignee: unassigned at gcc dot gnu.org
  Reporter: e...@coeus-group.com
  Target Milestone: ---

I'm running into an ICE on ppc64le:

  internal compiler error: in _cpp_pop_context, at libcpp/macro.c:2644

Here is a reproducer:

  #define a
  #define b(d) d
  #if defined(a)  
  b(vector double)
  #endif

Just running `gcc -E test.c` when targeting ppc64le triggers the issue.  It
happens with at least GCC 9 and 10.

[Bug testsuite/95110] new test case in r11-345 error: gcc.dg/tree-ssa/pr94969.c: dump file does not exist

2020-06-19 Thread cvs-commit at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95110

--- Comment #4 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Bin Cheng :

https://gcc.gnu.org/g:63c00a0c6543ce6d55e0ebc97ddbc1d36fb1289c

commit r10-8332-g63c00a0c6543ce6d55e0ebc97ddbc1d36fb1289c
Author: Bin Cheng 
Date:   Sat Jun 20 14:24:31 2020 +0800

Add missing unit dependence vector in data dependence analysis

Current data dependence analysis misses unit distant vector if DRs in
DDR have the same invariant access functions.  This adds the vector as
the constant access function case.

Also fix typo in testcase.

Backport from master.

2020-06-20  Bin Cheng  

gcc/
PR tree-optimization/94969
* tree-data-ref.c (constant_access_functions): Rename to...
(invariant_access_functions): ...this.  Add parameter.  Check for
invariant access function, rather than constant.
(build_classic_dist_vector): Call above function.
* tree-loop-distribution.c (pg_add_dependence_edges): Add comment.

gcc/testsuite/
PR tree-optimization/94969
* gcc.dg/tree-ssa/pr94969.c: New test.

2020-06-20  Jakub Jelinek  

gcc/testsuite/
PR tree-optimization/95110
* gcc.dg/tree-ssa/pr94969.c: Swap scan-tree-dump-not arguments.

[Bug tree-optimization/94969] [8/10 Regression] Invalid loop distribution since r8-2390-gdfbddbeb1ca912c9

2020-06-19 Thread cvs-commit at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94969

--- Comment #17 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Bin Cheng :

https://gcc.gnu.org/g:63c00a0c6543ce6d55e0ebc97ddbc1d36fb1289c

commit r10-8332-g63c00a0c6543ce6d55e0ebc97ddbc1d36fb1289c
Author: Bin Cheng 
Date:   Sat Jun 20 14:24:31 2020 +0800

Add missing unit dependence vector in data dependence analysis

Current data dependence analysis misses unit distant vector if DRs in
DDR have the same invariant access functions.  This adds the vector as
the constant access function case.

Also fix typo in testcase.

Backport from master.

2020-06-20  Bin Cheng  

gcc/
PR tree-optimization/94969
* tree-data-ref.c (constant_access_functions): Rename to...
(invariant_access_functions): ...this.  Add parameter.  Check for
invariant access function, rather than constant.
(build_classic_dist_vector): Call above function.
* tree-loop-distribution.c (pg_add_dependence_edges): Add comment.

gcc/testsuite/
PR tree-optimization/94969
* gcc.dg/tree-ssa/pr94969.c: New test.

2020-06-20  Jakub Jelinek  

gcc/testsuite/
PR tree-optimization/95110
* gcc.dg/tree-ssa/pr94969.c: Swap scan-tree-dump-not arguments.

[Bug other/95781] New: Missing dead code elimination when a recursive function is inlined.

2020-06-19 Thread yyc1992 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95781

Bug ID: 95781
   Summary: Missing dead code elimination when a recursive
function is inlined.
   Product: gcc
   Version: 10.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yyc1992 at gmail dot com
  Target Milestone: ---

Code,

```
static int f2(int *p, int k)
{
    int res = 0;
    if (k > 0)
        res += f2(p, k - 1);
    return *p + res;
}

int g2(int *p)
{
    return f2(p, 3);
}
```

Compiling with -O3 the code produced for `g2` is

```
g2:
movl    (%rdi), %eax
sall    $2, %eax
ret
```

i.e. `*p * 4`, which doesn't need to call `f2`. However, the code for `f2`
is still generated even though it is never used.

It seems that this only happens when the recursive function is sufficiently
complex. Replacing `*p` with a constant or making the `k > 0` branch return
directly produces code that does not have `f2` in it. It seems that there's
some smart late optimization pass that doesn't have a global DCE pass
afterwards?

Looks similar to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80680 but I'm not
sure if they have the same root cause.
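
As a sanity check on the `*p * 4` claim, here is a minimal standalone sketch
that is not taken from the report. The names `rec_sum` and `rec_sum_3` are
hypothetical; the body mirrors the shape of the recursive helper above. Each
of the k + 1 recursion levels adds `*p` once, so a depth of 3 should yield
four times `*p`, which is what the shift by 2 in the generated code computes.

```c
#include <assert.h>

/* Hypothetical name; same shape as the recursive helper in the report. */
int rec_sum(const int *p, int k)
{
    int res = 0;
    if (k > 0)
        res += rec_sum(p, k - 1);
    return *p + res;
}

/* Depth 3 means four levels in total, each contributing *p once. */
int rec_sum_3(const int *p)
{
    int r = rec_sum(p, 3);
    assert(r == 4 * *p);   /* holds as long as 4 * *p does not overflow */
    return r;
}
```

Calling `rec_sum_3` on a pointer to 7 would return 28 under these assumptions.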

[Bug other/95780] New: target_clones treats internal visibility different from static functions

2020-06-19 Thread yyc1992 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95780

Bug ID: 95780
   Summary: target_clones treats internal visibility different
from static functions
   Product: gcc
   Version: 10.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yyc1992 at gmail dot com
  Target Milestone: ---

Again using the code in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95778. If
the static function `f2` is changed to `visibility("internal")`, i.e.

```

__attribute__((visibility("internal"),noinline,target_clones("default,avx2")))
int f2(int *p)
{
    asm volatile ("" :: "r"(p) : "memory");
    return *p;
}

__attribute__((noinline,target_clones("default,avx2"))) int g2(int *p)
{
    return f2(p);
}
```

the call to `f2` will then use the PLT again. Without `target_clones` the two
have similar effects and both produce a direct call.

[Bug other/95779] New: Unnecessary dispatch function for static target_clones function.

2020-06-19 Thread yyc1992 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95779

Bug ID: 95779
   Summary: Unnecessary dispatch function for static target_clones
function.
   Product: gcc
   Version: 10.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yyc1992 at gmail dot com
  Target Milestone: ---

Using the code in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95778 the full
assembly generated (the version with both noinline) is (disabled unwind info),

```
.file   "b.c"
.text
.p2align 4
.type   f2.default.1, @function
f2.default.1:
movl    (%rdi), %eax
ret
.size   f2.default.1, .-f2.default.1
.p2align 4
.type   g2.default.1, @function
g2.default.1:
jmp     f2.default.1
.size   g2.default.1, .-g2.default.1
.p2align 4
.type   f2.avx2.0, @function
f2.avx2.0:
movl    (%rdi), %eax
ret
.size   f2.avx2.0, .-f2.avx2.0
.p2align 4
.type   g2.avx2.0, @function
g2.avx2.0:
jmp     f2.avx2.0
.size   g2.avx2.0, .-g2.avx2.0
.section        .text.g2.resolver,"axG",@progbits,g2.resolver,comdat
.p2align 4
.weak   g2.resolver
.type   g2.resolver, @function
g2.resolver:
subq    $8, %rsp
call    __cpu_indicator_init@PLT
movq    __cpu_model@GOTPCREL(%rip), %rax
leaq    g2.avx2.0(%rip), %rdx
testb   $4, 13(%rax)
leaq    g2.default.1(%rip), %rax
cmovne  %rdx, %rax
addq    $8, %rsp
ret
.size   g2.resolver, .-g2.resolver
.globl  g2
.type   g2, @gnu_indirect_function
.set    g2,g2.resolver
.text
.p2align 4
.type   f2.resolver, @function
f2.resolver:
subq    $8, %rsp
call    __cpu_indicator_init@PLT
movq    __cpu_model@GOTPCREL(%rip), %rax
leaq    f2.avx2.0(%rip), %rdx
testb   $4, 13(%rax)
leaq    f2.default.1(%rip), %rax
cmovne  %rdx, %rax
addq    $8, %rsp
ret
.size   f2.resolver, .-f2.resolver
.ident  "GCC: (GNU) 10.1.0"
.section        .note.GNU-stack,"",@progbits
```

AFAICT `f2.resolver` is never used anywhere and can be omitted (all callers
of `f2` are statically dispatched).

[Bug other/95778] New: target_clones indirection eliminates requires noinline

2020-06-19 Thread yyc1992 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95778

Bug ID: 95778
   Summary: target_clones indirection eliminates requires noinline
   Product: gcc
   Version: 10.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yyc1992 at gmail dot com
  Target Milestone: ---

Compiling

```
static __attribute__((noinline,target_clones("default,avx2"))) int f2(int *p)
{
    asm volatile ("" :: "r"(p) : "memory");
    return *p;
}

__attribute__((target_clones("default,avx2"))) int g2(int *p)
{
    return f2(p);
}
```

with `-fPIC -O3` generates


```
g2.avx2.0:
jmp f2.avx2.0
```

However, if either of the two `noinline` attributes is removed, the generated
code becomes

```
g2.avx2.0:
jmp f2@PLT
```

which cannot be eliminated later (see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95776).

I think this should be possible to do, and should be possible without LTO
(hence a slightly different bug than
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95776, even though, if that one
is fixed, turning on LTO can partially fix this).

Also, in this case, the `f2` should be inlinable to `g2`. However, no
combination of `inline`, `always_inline`, `flatten` I've tested can do that,
even though when both functions are marked with `noinline` gcc clearly knows
which function is calling what so it should have no problem inlining.

[Bug c/95777] New: Allow specifying more than one target options at the same time in target and target_clones attribute

2020-06-19 Thread yyc1992 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95777

Bug ID: 95777
   Summary: Allow specifying more than one target options at the
same time in target and target_clones attribute
   Product: gcc
   Version: 10.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yyc1992 at gmail dot com
  Target Milestone: ---

Currently it seems (from the documentation and my own tests) that only a
single option is allowed for each version of the function using `target` and
`target_clones`. This can be a problem for options that are not strict subsets
of each other (e.g. the AVX512 ones IIUC). Of course specifying `cpu=haswell`
and `cpu=skylake` for the same target doesn't make much sense, so some checking
should be in place, but I believe specifying multiple directly testable
features at the same time should be allowed.

A related issue is that, while one can indeed do some of this by specifying an
`arch=`, even if the runtime CPU supports all the features it will still not
get selected if the name doesn't exactly match (tested with `arch=haswell` on
my Kaby Lake laptop). If a fallback could be implemented to make this work,
that would also be good enough for me at least...

[Bug lto/95776] New: Reduce indirection with target_clones at link time (with LTO)

2020-06-19 Thread yyc1992 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95776

Bug ID: 95776
   Summary: Reduce indirection with target_clones at link time
(with LTO)
   Product: gcc
   Version: 10.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yyc1992 at gmail dot com
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Currently, if a function is not visible outside the final library (static,
or internal or hidden visibility), the call through the PLT is replaced with
a direct call to the function.

With target_clones, this is also possible within the same compilation unit for
static functions as callees. The caller that has the same cloning attribute
will simply call the cloned function without indirection.

However, this stops working when the two are combined. Even with the maximum
options and attributes to help it (hidden visibility, same compilation unit,
-Wl,-Bsymbolic, LTO), the call to the cloned function from a caller with a
matching cloning attribute still goes through the PLT.

Test code

```
__attribute__((noinline,visibility("hidden"))) int f1(int *p)
{
    asm volatile ("" :: "r"(p) : "memory");
    return *p;
}

__attribute__((noinline,visibility("hidden"),target_clones("default,avx2")))
int f2(int *p)
{
    asm volatile ("" :: "r"(p) : "memory");
    return *p;
}

__attribute__((noinline)) int g1(int *p)
{
    return f1(p);
}

__attribute__((noinline,target_clones("default,avx2"))) int g2(int *p)
{
    return f2(p);
}
```

Compiled with `-fPIC -flto -O3 -Wl,-Bsymbolic -shared`. The `f1` call calls
`f1` directly whereas the two cloned `f2` calls both call `f2@plt`.

The same also applies to inlining, target_clones kills inlining even with lto
on.

I assume this happens because this can only be done at link time which either
didn't get passed enough info to determine this or simply didn't get
implemented? I assume this should be possible since it can be done within a
single compilation unit.

[Bug target/95774] __builtin_cpu_is can't detect cooperlake

2020-06-19 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95774

H.J. Lu  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org|hjl.tools at gmail dot com
   Last reconfirmed|                             |2020-06-20
             Status|UNCONFIRMED                  |NEW
     Ever confirmed|0                            |1

--- Comment #1 from H.J. Lu  ---
Created attachment 48759
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48759&action=edit
A patch

This patch depends on

https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546522.html

[Bug target/95775] New: Command line argument for target_clones?

2020-06-19 Thread yyc1992 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95775

Bug ID: 95775
   Summary: Command line argument for target_clones?
   Product: gcc
   Version: 10.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yyc1992 at gmail dot com
  Target Milestone: ---

Would it make sense to add a command line argument that is roughly equivalent
to adding `target_clones` to all functions?

In terms of usefulness, I believe it would be a very cheap way for many
libraries to turn on the support with minimal code change. It certainly won't
be as optimized as possible, but neither is the target_clones attribute itself
compared to hand-written implementations using compiler
intrinsics/assembly...

In terms of implementation, I believe most of the issues I've hit when adding
such an attribute to functions have been fixed, so I have little issue using
it now. It would also be a new feature, so it shouldn't really break any
existing code.

And for further improvement, the compiler should have fair knowledge of which
instructions can be or have been used, and can omit some of the cloning in
order to reduce code size. I don't think this needs to be included in the
first version though...

And IIUC this is something that icc does automatically? (If that can serve as
an argument for this feature...)

[Bug target/95237] LOCAL_DECL_ALIGNMENT shrinks alignment, FAIL gcc.target/i386/pr69454-2.c

2020-06-19 Thread skpgkp2 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95237

--- Comment #18 from Sunil Pandey  ---
Another test, triggered with the -Os option.

$ cat foo.i
int a;
long long b() {}
int c() {
  if (b())
    a = 1;
}


$ gcc -m32 -mpreferred-stack-boundary=2 -Os -c foo.i
during GIMPLE pass: adjust_alignment
foo.i: In function ‘c’:
foo.i:3:5: internal compiler error: in execute, at adjust-alignment.c:74
3 | int c() {
  | ^
0x2091411 execute
        /local/skpandey/gccwork/pr95237/gitlab/gcc.orig/gcc/adjust-alignment.c:74
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug c++/95772] warning desired when default operator= cannot be constructued

2020-06-19 Thread marcpawl at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95772

--- Comment #2 from Marc Pawlowsky  ---
I understand that it is deleted, but if somebody says it should be defaulted
when it cannot be defaulted, that is most likely an error, and it would be
nice if a warning were generated.

[Bug middle-end/95673] missing -Wstring-compare for an impossible strncmp test

2020-06-19 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95673

Martin Sebor  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
             Status|NEW                          |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org|msebor at gcc dot gnu.org

[Bug middle-end/95673] missing -Wstring-compare for an impossible strncmp test

2020-06-19 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95673

Martin Sebor  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
           Keywords|                             |diagnostic
          Component|regression                   |middle-end
         Resolution|INVALID                      |---
     Ever confirmed|0                            |1
   Last reconfirmed|                             |2020-06-19
            Summary|Inconsistent optimization    |missing -Wstring-compare
                   |behavior when there is a     |for an impossible strncmp
                   |buffer overflow              |test
             Status|RESOLVED                     |NEW

--- Comment #4 from Martin Sebor  ---
Let me repurpose this bug to track the missing warning then.

[Bug libstdc++/95765] std::vector should be built without warnings with -Wconversion and/or -Wsystem-headers

2020-06-19 Thread redboltz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95765

--- Comment #4 from Takatoshi Kondo  ---
Thank you for fixing the warnings.

> Users should not be routinely using -Wsystem-headers to find problems with 
> their own code (that defeats the entire purpose of suppressing warnings in 
> system headers).
>
> I've fixed some warnings in libstdc++ code, but the real problem here is Bug 
> 43167 (and Bug 43167 comment 17 has almost exactly the same example).

I agree. Ideally, warnings related to templates would be reported without
`-Wsystem-headers`.

[Bug libstdc++/90436] Redundant size checking in vector

2020-06-19 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90436

--- Comment #2 from Marc Glisse  ---
(writing down some notes)

Calling

  size_type
  _M_check_len_one(const char* __s) const
  {
    if (max_size() - size() < 1)
      __throw_length_error(__N(__s));

    const size_type __len = size() + (std::max)(size(), (size_t)1);
    return (__len > max_size()) ? max_size() : __len;
  }

instead of _M_check_len reduces the running time of this micro-benchmark

#include <vector>
int main(){
  volatile int a=0;
  for(int i=0;i<100;++i){
    std::vector<int> v;
    for(int j=0;j<1000;++j){
      v.push_back(j);
    }
    a=v[a];
  }
}

from .88s to .66s at -O3. Two key elements (the perf gain only comes if we do
both) are removing the overflow check, and having the comparison between size
and max_size optimized to be done on byte length (not divided by the element
size).

I think the overflow check could be removed from the normal _M_check_len: we
have already checked that max_size() - size() >= __n so size() + __n cannot
overflow, and size() must be smaller than max_size(), which should be at most
SIZE_MAX/2, at least if ptrdiff_t and size_t have the same size, so size() +
size() cannot overflow either.
I should check if the compiler could help more. It is supposed to know how to
optimize .ADD_OVERFLOW based on the range of the operands.

I suspect that a single_use restriction explains why max_size() == size()
compares values without division while max_size() - size() < __n (for __n = 1)
doesn't.
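
The no-overflow argument above can be illustrated with a small C sketch. It
is not libstdc++ code: `check_len_one` is a hypothetical stand-in that takes
the size and maximum size as plain `size_t` values, returns 0 where the real
function would throw, and assumes `size <= max_size <= SIZE_MAX / 2`. It uses
the GCC/Clang builtin `__builtin_add_overflow` to show that, under those
assumptions, the addition should never report an overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the growth computation.
   Assumes size <= max_size and max_size <= SIZE_MAX / 2. */
size_t check_len_one(size_t size, size_t max_size)
{
    if (max_size - size < 1)
        return 0;                        /* the real code would throw here */

    size_t grow = size > 1 ? size : 1;   /* max(size, 1) */
    size_t len;
    /* size < max_size <= SIZE_MAX / 2, so size + grow fits in size_t. */
    int overflowed = __builtin_add_overflow(size, grow, &len);
    assert(!overflowed);
    (void)overflowed;                    /* silence unused warning with NDEBUG */

    return len > max_size ? max_size : len;
}
```

With these assumptions, even the largest admissible input
(`size == SIZE_MAX / 2 - 1`) stays below the wraparound point, so the result
is simply clamped to `max_size`.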

[Bug target/95774] New: __builtin_cpu_is can't detect cooperlake

2020-06-19 Thread craig.topper at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95774

Bug ID: 95774
   Summary: __builtin_cpu_is can't detect cooperlake
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: craig.topper at gmail dot com
  Target Milestone: ---

Cooperlake appears to be defined in the enum in libgcc for __builtin_cpu_is,
but there is no code to use that enum value when identifying the CPU in libgcc.

[Bug c++/95773] New: [[nodiscard]] attribute is ignored for calls to overridden functions

2020-06-19 Thread vladimir.krivopalov at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95773

Bug ID: 95773
   Summary: [[nodiscard]] attribute is ignored for calls to
overridden functions
   Product: gcc
   Version: 10.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vladimir.krivopalov at gmail dot com
  Target Milestone: ---

The following code compiles fine with GCC 10.1.0 but fails with Clang 10.0.0 :


struct Base {
    [[nodiscard]] virtual int f() = 0;
};

struct Derived : Base {
    [[nodiscard]] int f() override {
        return 1135;
    }
};

int main()
{
    Derived d;
    Base& b = d;
    b.f();
    return 0;
}


# g++ -Wall -Werror -Wextra -std=c++17 nodiscard.cpp -o nodiscard
# clang++ -Wall -Werror -Wextra -std=c++17 nodiscard.cpp -o nodiscard
nodiscard.cpp:15:5: error: ignoring return value of function declared with
'nodiscard' attribute [-Werror,-Wunused-result]
b.f();
^~~
1 error generated.

[Bug c++/95772] warning desired when default operator= cannot be constructued

2020-06-19 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95772

--- Comment #1 from Jonathan Wakely  ---
(In reply to Marc Pawlowsky from comment #0)
> I expected a diagnostic saying that operator= cannot be defaulted which is
> seen if the ASSIGN code is enabled.  The code compiles cleanly.

As expected. The C++ standard is clear about what this code means. The
explicitly defaulted assignment operator is defined as deleted, because it
would be ill-formed.

Defaulting it doesn't mean it will be provided by the compiler.


> When I wrote this bug in a large code base with -O3 and -flto, the code
> failed with illegal instructions and other memory corruption errors.  There
> was no problem without optimization.

What bug? The type is not assignable, how can that fail at runtime?

[Bug c++/95772] New: warning desired when default operator= cannot be constructued

2020-06-19 Thread marcpawl at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95772

Bug ID: 95772
   Summary: warning desired when default operator= cannot be
constructued
   Product: gcc
   Version: 10.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: marcpawl at gmail dot com
  Target Milestone: ---

https://godbolt.org/z/eXckPe

#include <type_traits>

class Const {
 public:
  int const i_;

  Const(int i) : i_(i) {}

  Const& operator=(Const& rhs) = default;
};

int main(int argc, char**) {
  Const c{argc};
  static_assert(!std::is_assignable::value,
"should not be able to assign i_");
#ifdef ASSIGN
  Const d{argc + 1};
  d = c;
#endif
  return c.i_;
}

I expected a diagnostic saying that operator= cannot be defaulted which is seen
if the ASSIGN code is enabled.  The code compiles cleanly.

When I wrote this bug in a large code base with -O3 and -flto, the code
failed with illegal instructions and other memory corruption errors.  There
was no problem without optimization.

[Bug c++/95768] -march=sandybridge -O2 -Wall crashes as 'during GIMPLE pass: uninit ... Segmentation fault'

2020-06-19 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95768

H.J. Lu  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
   Target Milestone|---                          |11.0
     Ever confirmed|0                            |1
             Status|UNCONFIRMED                  |NEW
                 CC|                             |msebor at gcc dot gnu.org
   Last reconfirmed|                             |2020-06-19

--- Comment #2 from H.J. Lu  ---
It is caused by r11-959

[Bug libstdc++/87614] User related warnings are hidden in system headers

2020-06-19 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87614

Jonathan Wakely  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
                 CC|                             |redboltz at gmail dot com

--- Comment #4 from Jonathan Wakely  ---
*** Bug 95765 has been marked as a duplicate of this bug. ***

[Bug libstdc++/95765] std::vector should be built without warnings with -Wconversion and/or -Wsystem-headers

2020-06-19 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95765

--- Comment #3 from Jonathan Wakely  ---
Actually Bug 87614 is the source of that other example, and a more suitable one
to mark this as a duplicate of.

*** This bug has been marked as a duplicate of bug 87614 ***

[Bug c++/43167] Warnings should not be disabled when instantiating templates defined in system headers

2020-06-19 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43167

Jonathan Wakely  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
                 CC|                             |redboltz at gmail dot com

--- Comment #21 from Jonathan Wakely  ---
*** Bug 95765 has been marked as a duplicate of this bug. ***

[Bug libstdc++/95765] std::vector should be built without warnings with -Wconversion and/or -Wsystem-headers

2020-06-19 Thread cvs-commit at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95765

--- Comment #1 from CVS Commits  ---
The master branch has been updated by Jonathan Wakely :

https://gcc.gnu.org/g:5b6215083bd6a3e10dd142e1c5d4fab011d6f074

commit r11-1562-g5b6215083bd6a3e10dd142e1c5d4fab011d6f074
Author: Jonathan Wakely 
Date:   Fri Jun 19 18:15:15 2020 +0100

libstdc++: Fix some -Wsystem-headers warnings (PR 95765)

PR libstdc++/95765
* include/bits/stl_algobase.h (__size_to_integer(float))
(__size_to_integer(double), __size_to_integer(long double))
(__size_to_integer(__float128)): Cast return type explicitly.
* include/bits/stl_uninitialized.h
(__uninitialized_default_1):
Remove unused typedef.

[Bug libstdc++/95765] std::vector should be built without warnings with -Wconversion and/or -Wsystem-headers

2020-06-19 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95765

Jonathan Wakely  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                  |RESOLVED
         Resolution|---                          |DUPLICATE

--- Comment #2 from Jonathan Wakely  ---
Users should not be routinely using -Wsystem-headers to find problems with
their own code (that defeats the entire purpose of suppressing warnings in
system headers).

I've fixed some warnings in libstdc++ code, but the real problem here is Bug
43167 (and Bug 43167 comment 17 has almost exactly the same example).

*** This bug has been marked as a duplicate of bug 43167 ***

[Bug regression/95673] Inconsistent optimization behavior when there is a buffer overflow

2020-06-19 Thread dn2sp-dev at yahoo dot fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95673

--- Comment #3 from dn2sp-dev at yahoo dot fr ---
(In reply to Martin Sebor from comment #2)
> When the result of strncmp is only used to test for equality to zero that it
> determines must evaluate to either true or false GCC 10 issues the
> -Wstring-compare warning and folds those comparisons to the respective
> constants (see the adjusted test case below).
> 
> But GCC doesn't issue the warning when the result is also used for other
> things like in the test case.  I'm thinking it probably should warn
> regardless.

Thank you a lot for this clear explanation.
Issuing -Wstring-compare even when the result is used could help because a
useless (always false) string comparison is most likely the result of faulty
code.

[Bug tree-optimization/95770] [11 Regression] ice during GIMPLE pass: slp

2020-06-19 Thread dcb314 at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95770

--- Comment #2 from David Binderman  ---
Probably a duplicate, but this C++ code also causes a crash:

float *a;
void b() {
  float c, d;
  a[0] = a[1] = 0.5f * (c - 2 + d);
  a[2] = a[3] = 0.5f * (c + 2 + d);
}

$ /home/dcb/gcc/results/bin/gcc -c -w -O3 -march=native bug623.cc
during GIMPLE pass: slp
bug623.cc: In function ‘void b()’:
bug623.cc:2:6: internal compiler error: Segmentation fault
2 | void b() {
  |  ^
0x1193667 crash_signal
../../trunk.git/gcc/toplev.c:328
0x1471370 vect_schedule_slp_instance
../../trunk.git/gcc/tree-vect-slp.c:4220
0x14712c7 vect_schedule_slp_instance
../../trunk.git/gcc/tree-vect-slp.c:4182
0x14712c7 vect_schedule_slp_instance
../../trunk.git/gcc/tree-vect-slp.c:4182

Native is AMD FX(tm)-8350 Eight-Core Processor.

[Bug target/95737] PPC: Unnecessary extsw after negative less than

2020-06-19 Thread jens.seifert at de dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95737

Jens Seifert  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
             Status|RESOLVED                     |UNCONFIRMED
         Resolution|DUPLICATE                    |---

--- Comment #3 from Jens Seifert  ---
This is different, as the extsw also happens if the result gets used, e.g.
followed by an andc, which is my case. I obviously oversimplified the sample.
It has nothing to do with the function result and ABI requirements. GCC assumes
that the result of -(a < b) implemented by subfc, subfe is signed 32-bit. But
the result is already 64-bit.

unsigned long long branchlesconditional(unsigned long long a,
                                        unsigned long long b,
                                        unsigned long long c)
{
   unsigned long long mask = -(a < b);
   return c & ~mask;
}

results in

_Z20branchlesconditionalyyy:
.LFB1:
.cfi_startproc
subfc 4,4,3
subfe 3,3,3
not 3,3
extsw 3,3
and 3,3,5
blr

expected
subfc
subfe
andc
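
To make the point about the mask width concrete, here is a minimal C sketch
that is not from the report. The function name `select_unless_less` is
hypothetical, and the explicit cast is added only for clarity; the original
`-(a < b)` converts to the same 64-bit value. It shows that the mask is
either 0 or all ones across the full 64 bits, so no sign extension should be
needed before the and-with-complement.

```c
#include <assert.h>

/* Returns c when a >= b and 0 when a < b, without a branch. */
unsigned long long select_unless_less(unsigned long long a,
                                      unsigned long long b,
                                      unsigned long long c)
{
    /* (a < b) is 0 or 1; negating it in unsigned 64-bit arithmetic
       gives 0 or 0xffffffffffffffff. */
    unsigned long long mask = -(unsigned long long)(a < b);
    assert(mask == 0 || mask == ~0ULL);
    return c & ~mask;
}
```

Because the mask already covers all 64 bits, values of `c` with bits set
above bit 31 pass through unchanged when `a >= b`.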

[Bug target/95237] LOCAL_DECL_ALIGNMENT shrinks alignment, FAIL gcc.target/i386/pr69454-2.c

2020-06-19 Thread skpgkp2 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95237

--- Comment #17 from Sunil Pandey  ---
$ cat foo.c
long long c(long long x) {}
int a() { long long b = c(b); }

$ gcc -m32 -mpreferred-stack-boundary=2 -c foo.c
during GIMPLE pass: adjust_alignment
foo.c: In function ‘a’:
foo.c:2:5: internal compiler error: in execute, at adjust-alignment.c:74
2 | int a() { long long b = c(b); }
  | ^
0x79d34f execute
        /local/skpandey/gccwork/pr95237/gitlab/gcc.orig/gcc/adjust-alignment.c:74
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug tree-optimization/95771] New: Failure to optimize popcount idiom when argument is unsigned char

2020-06-19 Thread gabravier at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95771

Bug ID: 95771
   Summary: Failure to optimize popcount idiom when argument is
unsigned char
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gabravier at gmail dot com
  Target Milestone: ---

int f(unsigned char x)
{
    int i = 0;
    while (x)
    {
        x &= x - 1;
        ++i;
    }
    return i;
}

This can be optimized to __builtin_popcount(x). LLVM does this transformation,
but GCC does not.

PS: GCC does this optimization if x is int and for a few other types. I've
also seen that GCC does not do this optimization for __int128 (which it could
do by adding the popcounts of the low and high parts of x).
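
A short sketch, not part of the report, of the equivalence being asked for.
`popcount_loop` repeats the loop from above under a different name;
`popcount_halves` and `popcount_self_check` are hypothetical helpers. The
first illustrates the suggested split of a 128-bit value into two 64-bit
halves, and the second compares the loop against `__builtin_popcount` for
every `unsigned char` value.

```c
#include <assert.h>

/* The idiom from the report: each iteration clears the lowest set bit. */
int popcount_loop(unsigned char x)
{
    int i = 0;
    while (x) {
        x &= x - 1;
        ++i;
    }
    return i;
}

/* Sketch of the 128-bit case: sum the popcounts of the two halves. */
int popcount_halves(unsigned long long lo, unsigned long long hi)
{
    return __builtin_popcountll(lo) + __builtin_popcountll(hi);
}

/* Exhaustively compares the loop with the builtin over 0..255. */
void popcount_self_check(void)
{
    for (unsigned v = 0; v <= 255; ++v)
        assert(popcount_loop((unsigned char)v) == __builtin_popcount(v));
}
```

The exhaustive check is cheap here because `unsigned char` has only 256
values (assuming an 8-bit char).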

[Bug tree-optimization/95761] [11 regression] ICE during GIMPLE pass: slp verify_ssa failed

2020-06-19 Thread dimhen at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95761

--- Comment #5 from Dmitry G. Dyachenko  ---
r11-1553 passes the original test case for me,
but a similar test fails with the same stack.

$ cat x_2.i
typedef int a[10];
typedef struct {
  a b;
  a c;
  a d;
} e;
e j;
void k() {
  int *h = j.c, *f = j.d, *g = j.b;
  int i;
  for (i = 0; i < 10; i++)
h[i] = f[0] + g[0];
  {
h = j.d;
for (i = 0; i < 10; i++)
  h[i] = f[1] - g[0];
h = 0;
h[0] = 0;
  }
  k();
}

gcc -O3 -fpreprocessed -c x_2.i 
x_2.i: In function 'k':
x_2.i:8:6: error: definition in block 2 follows the use
8 | void k() {
  |  ^
for SSA_NAME: vect_cst__14 in statement:
vect__31.16_9 = vect_cst__14 - vect__127.14_155;
during GIMPLE pass: slp
x_2.i:8:6: internal compiler error: verify_ssa failed
0x12407cd verify_ssa(bool, bool)
/home/dimhen/src/gcc_current/gcc/tree-ssa.c:1208
0xf385b5 execute_function_todo
/home/dimhen/src/gcc_current/gcc/passes.c:1992
0xf392ec do_per_function
/home/dimhen/src/gcc_current/gcc/passes.c:1640
0xf392ec execute_todo
/home/dimhen/src/gcc_current/gcc/passes.c:2039
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.


The test from PR95770 fails for me too.

[Bug middle-end/95755] GCC 10.1.0 reports bogus sizes in -Werror=format-truncation= error

2020-06-19 Thread jonathan.leffler at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95755

--- Comment #2 from Jonathan Leffler  ---
(In reply to Martin Sebor from comment #1)
> The sizes used to trigger the warning are based on what it can determine
> from the representation of source code it sees.  ...

Thank you for the confirmation, explanation and workarounds.

For the time being, I'm changing the format string to the equivalent of
"%.128s/name/%.160s/abc1234", and including this bug's URL in the comments.

The original source file is about 3200 lines long and directly includes 31
headers (and doesn't directly include either of those used in the repro).  As
you must have guessed, the structure and function names have been anonymized. 
The function the code comes from is about 150 lines long.  Most of the 'real'
structures are bigger than in the reproduction.  There was a lot of reduction
work.

For the actual code, because the macros defining the array sizes are simple
integer values (#define ABC 128), I can use a series of macros to embed the
sizes into the format string; it isn't too awful.
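
A minimal sketch of what such a series of macros could look like. NAME_LEN, PATH_LEN, STR and format_with_macros are hypothetical names, not the ones from the real code, and the sizes are chosen only to match the format string quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical size macros; the real code uses simple integer defines like these.
#define NAME_LEN 128
#define PATH_LEN 160

// Two-level stringification so the macro is expanded before '#' is applied.
#define STR_(x) #x
#define STR(x) STR_(x)

// Adjacent string literals are concatenated by the compiler.
#define URL_FORMAT "%." STR(NAME_LEN) "s/name/%." STR(PATH_LEN) "s/abc1234"

static_assert(sizeof(URL_FORMAT) == sizeof("%.128s/name/%.160s/abc1234"),
              "macro-built format has the expected length");

// Formats into 'out'; the precisions bound how much each %s can contribute.
int format_with_macros(char* out, std::size_t out_size,
                       const char* name, const char* path) {
    assert(out != nullptr && out_size > 0);
    return std::snprintf(out, out_size, URL_FORMAT, name, path);
}
```

Because the precision comes from the same macro as the array size, the two cannot drift apart when a size is changed.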

So, I have an adequate workaround.

But ultimately, I think the false positive should be fixed if at all possible.

Thank you all for your efforts.

[Bug target/65010] ppc backend generates unnecessary signed extension

2020-06-19 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65010

Bill Schmidt  changed:

   What|Removed |Added

 CC||jens.seifert at de dot ibm.com

--- Comment #10 from Bill Schmidt  ---
*** Bug 95737 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/95770] [11 Regression] ice during GIMPLE pass: slp

2020-06-19 Thread dcb314 at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95770

--- Comment #1 from David Binderman  ---
Reduced C code is:

typedef struct {
  float a;
  float b
} c;
c d, f, g;
float e;
h() {
  g.a = d.a * f.a - f.b;
  g.b = d.a * f.b + e;
}

[Bug target/95737] PPC: Unnecessary extsw after negative less than

2020-06-19 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95737

Bill Schmidt  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Bill Schmidt  ---
If you can show this is different from 65010 (not a return value issue), please
reopen.

*** This bug has been marked as a duplicate of bug 65010 ***

[Bug target/95737] PPC: Unnecessary extsw after negative less than

2020-06-19 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95737

--- Comment #1 from Bill Schmidt  ---
Please test this out of context of a return statement.  The problem with
unnecessary extends of return values is widely known and not specific to this
particular case.

[Bug tree-optimization/95770] [11 Regression] ice during GIMPLE pass: slp

2020-06-19 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95770

Andrew Pinski  changed:

   What|Removed |Added

Version|unknown |11.0
  Component|c   |tree-optimization
   Target Milestone|--- |11.0
   Keywords||ice-on-valid-code
Summary|ice during GIMPLE pass: slp |[11 Regression] ice during
   ||GIMPLE pass: slp

[Bug c/95770] New: ice during GIMPLE pass: slp

2020-06-19 Thread dcb314 at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95770

Bug ID: 95770
   Summary: ice during GIMPLE pass: slp
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dcb314 at hotmail dot com
  Target Milestone: ---

Created attachment 48758
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48758&action=edit
C source code

Sometime from 20200618 to 20200619, the attached C code fails
to compile with compiler flag -O3 on x86_64.

/home/dcb/gcc/results.20200618/bin/gcc
/home/dcb/gcc/results.20200619/bin/gcc
../src/filter_tools.c: In function ‘fftx.constprop’:
../src/filter_tools.c:77:13: error: definition in block 11 follows the use
for SSA_NAME: vect_cst__45 in statement:
vect__88.4452_47 = vect__95.4450_43 + vect_cst__45;
during GIMPLE pass: slp
../src/filter_tools.c:77:13: internal compiler error: verify_ssa failed
0x11624ab verify_ssa(bool, bool)
../../trunk.git/gcc/tree-ssa.c:1208
0xded0d5 execute_function_todo
../../trunk.git/gcc/passes.c:1992
0xdee331 do_per_function
../../trunk.git/gcc/passes.c:1640
0xdee331 execute_todo
../../trunk.git/gcc/passes.c:2039

I'll have a go at reducing it.

[Bug c++/95726] ICE with aarch64 __Float32x4_t as template argument

2020-06-19 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95726

--- Comment #7 from Jason Merrill  ---
(In reply to Jakub Jelinek from comment #5)
> Dunno, perhaps for backporting it could be done in template_args_equal
> instead?

For backporting we could treat them as different only if
comparing_specializations is set.

[Bug tree-optimization/95769] Constant expression in inline function not optimized

2020-06-19 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95769

--- Comment #4 from Jakub Jelinek  ---
And, in the compiler the C++ constant evaluation is something done only in the
C++ FE, while you are looking for IPA constant propagation and based on that
performing the C++ FE constant evaluation because that function had constexpr.

[Bug tree-optimization/95769] Constant expression in inline function not optimized

2020-06-19 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95769

--- Comment #3 from Jakub Jelinek  ---
(In reply to John Simon from comment #2)
> (In reply to Jakub Jelinek from comment #1)
> > If you want to ensure a function is evaluated at compile time, it needs to
> > be either C++20 consteval, or you need to evaluate it in constant expression
> > context, e.g. constexpr auto x = expensive_function(1);
> 
> in this case, doing that will make the `other_function` consteval too, so I
> can't call it with non-constant values.

So you can't make it consteval then, but you can still assign it into const
variable (or even constinit in C++20 to abort compilation if not evaluated into
constant).  Without that, the constexpr means nothing, the compiler might or
might not evaluate it at compile time.
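
A sketch of that suggestion, assuming the constant argument is known at the call site. other_function_fixed is a hypothetical helper, not something from the report; expensive_function and other_function are copied from it.

```cpp
#include <cassert>

// Collatz step counter from the report.
constexpr int expensive_function(int x) {
    int result{};
    while (x != 1) {
        x = x % 2 != 0 ? x * 3 + 1 : x / 2;
        result++;
    }
    return result;
}

// Stays plain constexpr, so it can still be called with non-constant values.
constexpr int other_function(int a, int b) {
    return a * expensive_function(b);
}

// Hypothetical variant: the constant argument becomes a template parameter,
// and the expensive part is bound to a constexpr variable, which has to be
// evaluated at compile time.
template <int B>
int other_function_fixed(int a) {
    constexpr int steps = expensive_function(B);
    return a * steps;
}

int f(int x) {
    return other_function_fixed<1000>(x);
}

// A static_assert is another constant-expression context.
static_assert(expensive_function(6) == 8, "6 reaches 1 in 8 Collatz steps");

// Both routes compute the same value; only the evaluation time may differ.
void check_same_result() {
    assert(f(3) == other_function(3, 1000));
}
```

The trade-off is that the constant has to be spelled as a template argument, so this only helps where the caller really does pass a literal.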

[Bug tree-optimization/94880] Failure to recognize andn pattern

2020-06-19 Thread cvs-commit at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94880

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Alex Coplan :

https://gcc.gnu.org/g:e0bfe016712ace877dd5b057bc1eb06e3c307623

commit r11-1558-ge0bfe016712ace877dd5b057bc1eb06e3c307623
Author: Przemyslaw Wirkus 
Date:   Fri Jun 19 16:48:55 2020 +0100

Fix PR94880: Failure to recognize andn pattern

Pattern "(x | y) - y" can be optimized to simple "(x & ~y)" andn
pattern.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/ChangeLog:

PR tree-optimization/94880
* match.pd (A | B) - B -> (A & ~B): New simplification.

gcc/testsuite/ChangeLog:

PR tree-optimization/94880
* gcc.dg/tree-ssa/pr94880.c: New Test.
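
For readers who want to convince themselves of the identity behind this simplification, a small sketch follows. It is not the new testcase, and the function names are illustrative. The identity holds because (x | y) contains every set bit of y, so subtracting y clears exactly those bits without any borrow.

```cpp
#include <cassert>

// Left-hand side of the new match.pd rule.
unsigned sub_form(unsigned x, unsigned y) { return (x | y) - y; }

// Right-hand side: the andn pattern.
unsigned andn_form(unsigned x, unsigned y) { return x & ~y; }

// Exhaustive check over all pairs of 8-bit values.
void check_andn_identity() {
    for (unsigned x = 0; x < 256; ++x)
        for (unsigned y = 0; y < 256; ++y)
            assert(sub_form(x, y) == andn_form(x, y));
}
```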

[Bug tree-optimization/95769] Constant expression in inline function not optimized

2020-06-19 Thread gcc at mailinator dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95769

--- Comment #2 from John Simon  ---
(In reply to Jakub Jelinek from comment #1)
> If you want to ensure a function is evaluated at compile time, it needs to
> be either C++20 consteval, or you need to evaluate it in constant expression
> context, e.g. constexpr auto x = expensive_function(1);

in this case, doing that will make the `other_function` consteval too, so I
can't call it with non-constant values.

[Bug middle-end/95755] GCC 10.1.0 reports bogus sizes in -Werror=format-truncation= error

2020-06-19 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95755

Martin Sebor  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2020-06-19
 Blocks||88781
  Component|c   |middle-end
 Status|UNCONFIRMED |NEW

--- Comment #1 from Martin Sebor  ---
The sizes used to trigger the warning are based on what it can determine from
the representation of source code it sees.  In this case it's the IL below
(seen in the output of -fdump-tree-strlen):

mz_format (struct mz_cb * rcb)
{
  struct mz_unity ncred;
  struct mz_unity * ocred;
  char[513] * _1;

   [local count: 1073741824]:
  ocred_4 = rcb_3(D)->mz_creds;
  ncred = *ocred_4;
  _1 = &rcb_3(D)->mz_url;
  snprintf (_1, 513, "%s/name/%s/abc1234", &MEM  [(void *)&ncred +
8B], &MEM  [(void *)&ncred + 288B]);
  ncred ={v} {CLOBBER};
  return;

}

From the argument to the first %s directive (&MEM  [(void *)&ncred +
8B]), rather than using its type which is unfortunately unreliable in other
contexts, the warning conservatively uses the size of the whole object, which
it calculates to be 4552 bytes.  It uses the same conservative heuristic as
optimizations use here, which could be argued is both good and bad.  It's good
in that it reflects what would happen if only the last ncred array member was a
nul-terminated string (GCC assumes that strlen results are bounded by the size
of the complete object the argument points to).  The words "may be truncated"
in the warning are used specifically to reflect that truncation is possible but
not inevitable.  It's bad because it triggers what for correct code seems like
false positives.

In this case, to avoid triggering, the warning could consider the MEM_REF type
instead (which is char[129]).  Unfortunately, because the type is seen as
unreliable and relying on it has led to bugs in the past, there's a lot of
sensitivity to using it for anything.  Alternatively, the warning could use the
offset to find the member and the size of the member (like -Wrestrict already
does).  That in this case is rcred->variant.variant01.am_field0, the first
member of the union, not the second member, az_field11, referenced in the
source.  Because they happen to have the same size the strategy would work here
but not in other cases when the first member were bigger.  Another alternative
(for unions) is to find the smallest member at the offset.  That would lead to
false negatives if the member actually referenced in the source were the bigger
one.  Yet another option is to give up if the member can't be unambiguously
identified.  That would lead to even more false negatives (basically, any
representation involving a MEM_REF would have to be excluded from the
analysis).  Yet another possibility that we have been discussing recently is
capturing the member referenced in the source earlier on, before it has been
lost to MEM_REF, and using it.  That's probably the only viable alternative but
it requires a change to the internal representation GCC uses, so it's not going
to be a simple solution.  I realize as a user you probably don't care about any
of this.  I summarize it mainly for other GCC developers.

Until we have decided on and implemented a strategy for this there are a few
ways to avoid the warning:

1) use precision to limit the amount of output of each %s directive
2) assert before the call that the lengths of the strings passed to the %s
directives aren't too long (e.g., if (strlen
(cred->variant.variant02.az_field11) > 32) __builtin_unreachable ();)
3) test the result of the call and handle the truncation somehow (e.g.,
similarly to the precondition above except with an abort or trap replacing
__builtin_unreachable if the snprintf call isn't followed by another statement)
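
Workarounds 1 and 3 can be sketched roughly as follows. The creds structure is a hypothetical stand-in for the anonymized one in the report (its field names and sizes are illustrative only), and whether the warning is actually silenced depends on the GCC version and on what the optimizers can see.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for the anonymized structure in the report.
struct creds {
    char user[129];
    char realm[161];
};

// Workaround 1: a precision on each %s bounds the output of that directive.
// Workaround 3: the result of snprintf is tested, and truncation is reported
// to the caller instead of being ignored.
bool format_url(char* out, std::size_t out_size, const creds& c) {
    assert(out != nullptr && out_size > 0);
    int n = std::snprintf(out, out_size, "%.128s/name/%.160s/abc1234",
                          c.user, c.realm);
    // snprintf returns the length it wanted to write, or a negative value on error.
    return n >= 0 && static_cast<std::size_t>(n) < out_size;
}
```

With these precisions the longest possible output is 128 + 6 + 160 + 8 = 302 characters, which fits comfortably in the 513-byte destination from the report.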


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88781
[Bug 88781] [meta-bug] bogus/missing -Wstringop-truncation warnings

[Bug tree-optimization/95769] Constant expression in inline function not optimized

2020-06-19 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95769

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
If you want to ensure a function is evaluated at compile time, it needs to be
either C++20 consteval, or you need to evaluate it in constant expression
context, e.g. constexpr auto x = expensive_function(1);

[Bug libstdc++/35968] nth_element fails to meet its complexity requirements

2020-06-19 Thread sjhowe at dial dot pipex.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=35968

--- Comment #14 from Stephen Howe  ---
(In reply to Anders Kaseorg from comment #13)
> (In reply to Patrick J. LoPresti from comment #12)
> > I am familiar with the usual algorithmic complexity definitions.
> > 
> > So, just to be clear... Your assertion is that the C++ standards committee
> > adopted a specification that rules out a deterministic implementation?
> 
> I should have been clearer: I’m saying it rules out quickselect with a
> deterministic pivot selection rule that doesn’t inspect Θ(n) elements. 
> Quickselect with randomized Θ(1) pivot selection would satisfy the
> specification, as would quickselect with deterministic Θ(n) pivot selection
> by median-of-medians or similar, but not quickselect with deterministic Θ(1)
> pivot selection by taking the first element or similar.

I am the original bug reporter.

The C++ standard is not good enough for this algorithm.
In David Musser's original paper
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.5196&rep=rep1&type=pdf
he describes Introsort, and in the ISO C++ 2011 standard, the complexity
requirements tightened for std::sort() from O(n * log n) on average, to O(n *
log n) in the worst case. Introsort guarantees that. It uses quicksort unless
it detects that pivot selection is going bad for portions of the array, in
which case it switches to heapsort, whose worst case is still O(n * log n). So
worst-case performance is guaranteed.

Now David Musser also wrote about introselect, which analogously uses
quickselect, but his paper did not say which algorithm should be swapped to if
quickselect went bad. The median-of-medians (in groups of at least 5) is such
an algorithm with O(n) performance. Heapselect is not O(n).
So the C++ standard could use introselect with guaranteed O(n) performance, not
just on average. The big O complexity could be tightened up in the same way
that std::sort() was tightened up in ISO C++ 2011.
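
To illustrate the structure being discussed, here is a rough sketch of quickselect with a depth budget and a fallback. It is not libstdc++'s implementation, and introselect_sketch is a made-up name. The fallback uses std::sort purely as a stand-in, so this sketch is O(n * log n) in the worst case. A median-of-medians fallback would be needed to reach the guaranteed O(n) argued for above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Rearranges v so that v[k] holds the element that would be at index k
// if v were sorted (0-based).
void introselect_sketch(std::vector<int>& v, std::size_t k) {
    assert(k < v.size());
    std::size_t lo = 0, hi = v.size();   // active range is [lo, hi)

    // Depth budget of roughly 2 * log2(n) partitioning rounds.
    int budget = 0;
    for (std::size_t n = v.size(); n > 1; n >>= 1) budget += 2;

    while (hi - lo > 1) {
        auto first = v.begin() + static_cast<std::ptrdiff_t>(lo);
        auto last = v.begin() + static_cast<std::ptrdiff_t>(hi);
        if (budget-- == 0) {
            // Quickselect is going badly: switch to the fallback.
            // Stand-in only; this is where median-of-medians would go.
            std::sort(first, last);
            return;
        }
        // Deterministic constant-time pivot choice: the middle element.
        const int pivot = v[lo + (hi - lo) / 2];
        // Three-way partition: [< pivot][== pivot][> pivot].
        auto mid1 = std::partition(first, last,
                                   [pivot](int x) { return x < pivot; });
        auto mid2 = std::partition(mid1, last,
                                   [pivot](int x) { return x == pivot; });
        const std::size_t a = static_cast<std::size_t>(mid1 - v.begin());
        const std::size_t b = static_cast<std::size_t>(mid2 - v.begin());
        if (k < a) {
            hi = a;            // answer lies among the smaller elements
        } else if (k < b) {
            return;            // v[k] equals the pivot and is in place
        } else {
            lo = b;            // answer lies among the larger elements
        }
    }
}
```

The budget is what separates this from plain quickselect: an adversarial input can waste only a logarithmic number of rounds before the fallback takes over.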

> Quickselect with randomized Θ(1) pivot selection
No. It can be beaten. Even randomized Θ(1) pivot selection is not good enough.
Antiqsort by Doug McIlroy can beat it. See
https://www.cs.dartmouth.edu/~doug/aqsort.c

Cheers

Stephen Howe

[Bug tree-optimization/95769] New: Constant expression in inline function not optimized

2020-06-19 Thread gcc at mailinator dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95769

Bug ID: 95769
   Summary: Constant expression in inline function not optimized
   Product: gcc
   Version: 10.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gcc at mailinator dot com
  Target Milestone: ---

Godbolt link: https://godbolt.org/z/Kimna8

Code:

```
int constexpr expensive_function(int x){
int result{};
while(x!=1){
x=x%2!=0 ? x*3+1 : x/2;
result++;
}
return result;
}

int constexpr other_function(int a, int b){
return a*expensive_function(b);
}

int f_(int x){
return other_function(x, 1000);
}

int g(int x){
return x*expensive_function(1000);
}
```

The function `g` is optimized so `expensive_function(1000)` is evaluated at
compile time. The function `f_` isn't.

[Bug c++/95768] -march=sandybridge -O2 -Wall crashes as 'during GIMPLE pass: uninit ... Segmentation fault'

2020-06-19 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95768

--- Comment #1 from Sergei Trofimovich  ---
On today's gcc master configured as:

$ ~/dev/git/gcc-native-quick/gcc/xg++
-B/home/slyfox/dev/git/gcc-native-quick/gcc/ -v # -march=sandybridge -O2 -Wall
-c bug.cc -o bug.o -
Reading specs from /home/slyfox/dev/git/gcc-native-quick/gcc/specs
COLLECT_GCC=/home/slyfox/dev/git/gcc-native-quick/gcc/xg++
COLLECT_LTO_WRAPPER=/home/slyfox/dev/git/gcc-native-quick/gcc/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --build=x86_64-pc-linux-gnu
--host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu
--enable-languages=c,c++ --disable-bootstrap --with-multilib-list=m64
--prefix=/home/slyfox/dev/git/gcc-native-quick/../gcc-native-quick-installed
--disable-nls --without-isl --disable-libsanitizer --disable-libvtv
--disable-libgomp --disable-libstdcxx-pch --disable-libunwind-exceptions
CFLAGS='-O1 ' CXXFLAGS='-O1 ' --with-sysroot=/usr/x86_64-HEAD-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.0.0 20200619 (experimental) (GCC)

crash is reproducible:

$ ~/dev/git/gcc-native-quick/gcc/xg++
-B/home/slyfox/dev/git/gcc-native-quick/gcc/ -march=sandybridge -O2 -Wall -c
bug.cc -o bug.o
bug.cc: In constructor 'p::p(a::c)':
bug.cc:32:26: warning: '*.p::alloc' is used uninitialized
[-Wuninitialized]
   32 | header = (n *)malloc(alloc);
  |  ^
'
during GIMPLE pass: uninit
In function 'void s()':
Segmentation fault
   37 | void s() { p(a::q); }
  |  ^
0x7f7b2395dc3f ???
   
/usr/src/debug/sys-libs/glibc-2.31-r5/glibc-2.31/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
0x7f7b23948d49 __libc_start_main
../csu/libc-start.c:308
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

[Bug c++/95768] New: -march=sandybridge -O2 -Wall crashes as 'during GIMPLE pass: uninit ... Segmentation fault'

2020-06-19 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95768

Bug ID: 95768
   Summary: -march=sandybridge -O2 -Wall crashes as 'during GIMPLE
pass: uninit ... Segmentation fault'
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: slyfox at inbox dot ru
  Target Milestone: ---

SIGSEGV initially observed on qtcore-5.14.2 package.

Here is the minimal(ish) reproducer:

// $ cat bug.cc
extern "C" void *malloc(unsigned long);
class a {
public:
  enum c { Array };
};
class d {
public:
  static int e(int);
};
class f {
public:
  int g;
  void operator=(int) { d::e(g); }
};
template < typename, int, int > using h = f;
template < int i, int j > using k = h< int, i, j >;
template < int i, int j > using l = h< int, i, j >;
class m {
public:
  k< 0, 1 > is_object;
  k< 1, 1 > length;
};
class n {
public:
  m *o() { return (m *)this; }
};
class p {
public:
  enum {} alloc;
  n *header;
  p(a::c) {
header = (n *)malloc(alloc);
m b = *header->o();
b.length = 0;
  }
};
void detach2() { p(a::Array); }

LANG=C /usr/bin/x86_64-pc-linux-gnu-g++ -march=sandybridge -O2 -Wall -c bug.cc
-o bug.o
bug.cc: In constructor 'p::p(a::c)':
bug.cc:32:26: warning: '*.p::alloc' is used uninitialized
[-Wuninitialized]
   32 | header = (n *)malloc(alloc);
  |  ^
'
during GIMPLE pass: uninit
In function 'void detach2()':
Segmentation fault
   37 | void detach2() { p(a::Array); }
  |  ^~~
0xa9c91f crash_signal
../../gcc-11.0.0_pre/gcc/toplev.c:328
0x7f2dc77d2c3f ???
   
/usr/src/debug/sys-libs/glibc-2.31-r5/glibc-2.31/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
0xce7494 location_wrapper_p(tree_node const*)
../../gcc-11.0.0_pre/gcc/tree.h:3999
0xce7494 tree_strip_any_location_wrapper(tree_node*)
../../gcc-11.0.0_pre/gcc/tree.h:4011
0xce7494 integer_onep(tree_node const*)
../../gcc-11.0.0_pre/gcc/tree.c:2573
0x4e0ee3 dump_expr
../../gcc-11.0.0_pre/gcc/cp/error.c:2386
0x4e3640 expr_to_string(tree_node*)
../../gcc-11.0.0_pre/gcc/cp/error.c:3109
0x4e3cfc cp_printer
../../gcc-11.0.0_pre/gcc/cp/error.c:4264
0x13e0646 pp_format(pretty_printer*, text_info*)
../../gcc-11.0.0_pre/gcc/pretty-print.c:1475
0x13d48e2 diagnostic_report_diagnostic(diagnostic_context*, diagnostic_info*)
../../gcc-11.0.0_pre/gcc/diagnostic.c:1159
0x13d683a diagnostic_impl
../../gcc-11.0.0_pre/gcc/diagnostic.c:1309
0x13d683a warning_at(unsigned int, int, char const*, ...)
../../gcc-11.0.0_pre/gcc/diagnostic.c:1446
0xc5e7ed maybe_warn_operand
../../gcc-11.0.0_pre/gcc/tree-ssa-uninit.c:418
0xc619e9 warn_uninitialized_vars
../../gcc-11.0.0_pre/gcc/tree-ssa-uninit.c:640
0xc66016 execute
../../gcc-11.0.0_pre/gcc/tree-ssa-uninit.c:2936
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

[Bug c++/95763] Feature request: compiler warning if line width exceeds N symbols

2020-06-19 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95763

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek  ---
The fortran FE also has diagnostics about source line lengths.
The question is e.g. if comments should be an exception or not.

[Bug c++/95767] No warning when branching on two identical conditions

2020-06-19 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95767

Marek Polacek  changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org

--- Comment #1 from Marek Polacek  ---
-Wduplicated-cond warns about similar things, but not in this case, because it
thinks that the condition has side-effects.
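
A sketch of the distinction, under the assumption that -Wduplicated-cond is enabled explicitly (it is not part of -Wall). classify_int and classify_string are illustrative names. The first chain compares built-in ints and is the kind of code the warning is meant to catch. The second goes through std::string's operator==, which per the comment above is treated as possibly having side effects.

```cpp
#include <cassert>
#include <string>

// Built-in comparison with no side effects: the duplicated condition is
// the kind -Wduplicated-cond is designed to flag.
int classify_int(int v) {
    if (v == 1) return 1;
    else if (v == 2) return 2;
    else if (v == 2) return 3;   // unreachable duplicate
    return 0;
}

// Same mistake, but each condition is a call to an overloaded operator==.
int classify_string(const std::string& s) {
    if (s == "option1") return 1;
    else if (s == "option2") return 2;
    else if (s == "option2") return 3;   // unreachable duplicate
    return 0;
}

// In both chains the third branch can never be taken.
void check_third_branch_is_dead() {
    assert(classify_int(2) == 2);
    assert(classify_string("option2") == 2);
}
```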

[Bug c++/95763] Feature request: compiler warning if line width exceeds N symbols

2020-06-19 Thread joseph at codesourcery dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95763

--- Comment #3 from joseph at codesourcery dot com  ---
FWIW, the Ada front end has some style checking support (I once broke 
bootstrap by applying spelling corrections there, where fixing the 
spelling in a comment made a line too long).  It wouldn't be ridiculous to 
have such checks available for C and C++, so far as formatting checks work 
reasonably in the presence of macros.

[Bug target/95748] Long long function parameter should be aligned to 32 bit on x86.

2020-06-19 Thread joseph at codesourcery dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95748

--- Comment #4 from joseph at codesourcery dot com  ---
Note that __alignof__ is preferred alignment, whereas C11 _Alignof (which 
only applies to types, not declarations) is the alignment required in all 
contexts (so they differ for long long on x86).
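
A small sketch of how the two can be compared in code. It uses C++ alignof, which as far as I know follows C11 _Alignof in recent GCC releases; treat that as an assumption. is_aligned is a hypothetical helper mirroring the check function from the testcase in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Alignment required in all contexts versus GCC's preferred alignment.
// On x86 with -m32 these are expected to differ for long long (4 vs 8);
// on x86_64 they are expected to be equal.
constexpr std::size_t required_align = alignof(long long);
constexpr std::size_t preferred_align = __alignof__(long long);

static_assert(required_align <= preferred_align,
              "preferred alignment is never weaker than the required one");

// True if p is a multiple of align, which must be a power of two.
bool is_aligned(const void* p, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (align - 1)) == 0;
}
```

A check against required_align should hold for any long long object, whereas a check against preferred_align is the one that can fail for a stack-passed parameter, as in the testcase.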

[Bug target/95018] [10/11 Regression] Excessive unrolling for Fortran library array handling

2020-06-19 Thread cvs-commit at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95018

--- Comment #39 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Jiu Fu Guo:

https://gcc.gnu.org/g:60bd3f20baebeeddd60f8a2b85927e7da7c6016e

commit r10-8327-g60bd3f20baebeeddd60f8a2b85927e7da7c6016e
Author: guojiufu 
Date:   Thu May 28 13:42:23 2020 +0800

Introduce flag_cunroll_grow_size for cunroll and avoid enable it at -O2

Currently the GIMPLE complete unroller (cunroll) checks
flag_unroll_loops and flag_peel_loops to decide whether to allow size growth.
Besides affecting cunroll, flag_unroll_loops also controls the RTL unroller.
To give more freedom to control cunroll and the RTL unroller, this patch
introduces flag_cunroll_grow_size.  With this patch, cunroll and the RTL
unroller can be controlled independently, and flag_cunroll_grow_size is
enabled only if -funroll-loops, -fpeel-loops or -O3 is specified explicitly.

gcc/ChangeLog
2020-06-19  Jiufu Guo  

PR target/95018
* common.opt (flag_cunroll_grow_size): New flag.
* toplev.c (process_options): Set flag_cunroll_grow_size.
* tree-ssa-loop-ivcanon.c (pass_complete_unroll::execute):
Use flag_cunroll_grow_size.
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Override flag_cunroll_grow_size.

[Bug c++/95767] New: No warning when branching on two identical conditions

2020-06-19 Thread mrbart at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95767

Bug ID: 95767
   Summary: No warning when branching on two identical conditions
   Product: gcc
   Version: 9.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mrbart at gmx dot com
  Target Milestone: ---

Created attachment 48757
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48757&action=edit
Minimal-working example

#include <iostream>
#include <string>

int main() {
std::string input;
std::cin >> input;
std::string output;

if  (input == "option1") output = "option1 chosen";
else if (input == "option2") output = "option2 chosen";
else if (input == "option2") output = "option3 chosen";

return 0;
}


Shouldn't the compiler issue a warning that we check twice for the same
condition? Copy-and-paste can quickly lead to a mistake here.

Compiled with:
g++ -Wall -Wextra Test.cpp

Tested on:
gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008
Ubuntu Linux 19.10 64 bit
Intel® Core™ i5-2500 CPU

[Bug c++/95763] Feature request: compiler warning if line width exceeds N symbols

2020-06-19 Thread hyena at hyena dot net.ee
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95763

--- Comment #2 from Erich Erstu  ---
(In reply to Jonathan Wakely from comment #1)
> This seems like a better fit for something like clang-tidy than being
> hardcoded into the compiler.

I can see the reasoning there, but different source/header files may have
different maximum line width limitations. For example, we may have a project
where most files are limited to 80 symbols per line but a subset of header
files is limited to 240 symbols per line (as an exception). So, this could not
be handled as a project-wide restriction but rather case-by-case. For example,
those header files may contain large multidimensional nested arrays/tables
(basically data) which requires an exception to be made.

For the above reasons, it would make sense to keep the line width limit as part
of the code file that is supposed to respect that limit. Clang-tidy would have
to get these limits from an external configuration file which bloats the
project. Since GCC already implements special comments such as "// fall
through" for switch statements, I could see this implemented similarly, as a
comment in the beginning of the code file.

[Bug target/95748] Long long function parameter should be aligned to 32 bit on x86.

2020-06-19 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95748

--- Comment #3 from H.J. Lu  ---
[hjl@gnu-cfl-2 tmp]$ cat x.c 
typedef __UINTPTR_TYPE__ uintptr_t;

__attribute__ ((noclone, noinline))
void
check (uintptr_t address, uintptr_t align)
{
  if (address & (align - 1))
__builtin_abort();
}

__attribute__ ((noclone, noinline))
void
foo(uintptr_t x, long long p)
{
  uintptr_t align = __alignof__(p);
  uintptr_t address = (uintptr_t) &p;
  check (address, align);
}

__attribute__ ((noclone, noinline))
int
bar(void)
{
  foo (4,5);
  return 0;
}

int *ptr;

int
main()
{
  int x = 1;
  ptr = &x;
  return bar();
}
[hjl@gnu-cfl-2 tmp]$ gcc -m32 x.c -mpreferred-stack-boundary=2 -O2 
[hjl@gnu-cfl-2 tmp]$ ./a.out 
Aborted (core dumped)
[hjl@gnu-cfl-2 tmp]$

[Bug target/95766] New: Failure to directly use vpbroadcastd for _mm_set1_epi32 when passing unsigned short

2020-06-19 Thread gabravier at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95766

Bug ID: 95766
   Summary: Failure to directly use vpbroadcastd for
_mm_set1_epi32 when passing unsigned short
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gabravier at gmail dot com
  Target Milestone: ---

__m128i f(unsigned short a)
{
return _mm_set1_epi32(a);
}

With -O3 -mavx512cd -mavx512vl, LLVM outputs this :

f(unsigned short):
  vpbroadcastd xmm0, edi
  ret

GCC outputs this :

f(unsigned short):
  kmovw k0, edi
  vpbroadcastmw2d xmm0, k0
  ret

[Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)

2020-06-19 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

--- Comment #10 from Uroš Bizjak  ---
Created attachment 48756
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48756&action=edit
Proposed patch

Patch in testing, survives GOMP testcases.

On a related note, the patch uses TARGET_USE_XCHG_FOR_ATOMIC_STORE, which
should probably be renamed to something more appropriate.

[Bug target/95764] Failure to optimize usage of _mm512_set1_epi32 to a single instruction

2020-06-19 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95764

H.J. Lu  changed:

   What|Removed |Added

 CC||crazylht at gmail dot com,
   ||hjl.tools at gmail dot com

--- Comment #2 from H.J. Lu  ---
There are quite a few memory broadcast bugs.  This may be a dup.  PR 87767?

[Bug target/95764] Failure to optimize usage of _mm512_set1_epi32 to a single instruction

2020-06-19 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95764

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2020-06-19
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
In isolation I'd say the LLVM version is better.

[Bug rtl-optimization/95493] [10 Regression] test for vector members apparently reordered with assignment to vector members since r10-7523-gb90061c6ec090c6b

2020-06-19 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95493

--- Comment #11 from Richard Biener  ---
(In reply to Matthias Kretz (Vir) from comment #10)
> (In reply to Richard Biener from comment #7)
> > Fixed on trunk sofar.
> 
> Is there anything I can help to get this backported to 10? I applied your
> patch on my GCC 10 checkout since you committed it to master and have not
> had any issues.

It will certainly make 10.2 but I was afraid of fallout (which eventually
happened, see PR95690), so now waiting some more for the fallout from the
fallout fix ;)

But I hope to get to a round of backporting next week.

[Bug target/95748] Long long function parameter should be aligned to 32 bit on x86.

2020-06-19 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95748

H.J. Lu  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Ever confirmed|0   |1
   Last reconfirmed||2020-06-19
 Resolution|INVALID |---

--- Comment #2 from H.J. Lu  ---
long long in i386 psABI is 4 byte aligned.

[Bug libstdc++/95765] New: std::vector should be built without warnings with -Wconversion and/or -Wsystem-headers

2020-06-19 Thread redboltz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95765

Bug ID: 95765
   Summary: std::vector should be built without warnings with
-Wconversion and/or -Wsystem-headers
   Product: gcc
   Version: 10.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redboltz at gmail dot com
  Target Milestone: ---

This is a similar issue to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50871,
but a different one.

Tested on x86-64 g++ 10.1

The following code converts from std::uint32_t to std::uint16_t, which is
dangerous. Even when compiling with the -Wconversion option, no warning is
reported because the conversion happens inside the standard library.

#include <cstdint>
#include <vector>

struct info {
explicit info(std::uint16_t v): v { v } {}
std::uint16_t v;
};

int main() {
std::uint32_t a = 0x1;
std::vector<info> i;

// convert from std::uint32_t to std::uint16_t internally
// Warning is reported only if -Wsystem-headers
i.emplace_back(a); 
}

Compile result: https://godbolt.org/z/_LJPAT


To get the warning reported, the -Wsystem-headers option is required.

Compile result: https://godbolt.org/z/jmkc5W

/opt/compiler-explorer/gcc-10.1.0/include/c++/10.1.0/ext/new_allocator.h:150:4:
warning: conversion from 'unsigned int' to 'uint16_t' {aka 'short unsigned
int'} may change value [-Wconversion]

  150 |  { ::new((void *)__p) _Up(std::forward<_Args>(__args)...); }

  |^

is the expected warning.

However, many other warnings are reported.
It is difficult to find the expected warning.


By the way, users can apply the following workaround to get the expected
warning reported without -Wsystem-headers.

struct info {
// Possible workaround without -Wsystem-headers
// Move conversion point to user code
template <typename T>
explicit info(T v): v { v } {}
std::uint16_t v;
};

Compile result: https://godbolt.org/z/f4YKgh

But I think that the std::vector implementation should compile without
warnings under -Wconversion and -Wsystem-headers.

I'm not sure what the policy is on which warning checks the standard library
should satisfy.
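
To make the danger concrete, here is a minimal sketch of the silent truncation. The helper name `stored_after_emplace` is made up for illustration; the `info` struct follows the report. The narrowing happens inside the library's construct call, so the caller gets no diagnostic from -Wconversion alone.

```cpp
#include <cstdint>
#include <vector>

struct info {
    explicit info(std::uint16_t v) : v{v} {}
    std::uint16_t v;
};

// Hypothetical helper: emplace a 32-bit value and return what was stored.
// The uint32_t -> uint16_t conversion happens while the library forwards
// the argument to info's constructor, not in this function.
std::uint16_t stored_after_emplace(std::uint32_t a) {
    std::vector<info> i;
    i.emplace_back(a);
    return i.back().v;
}
```

Calling `stored_after_emplace(0x12345)` stores only the low 16 bits, which is the kind of value change the expected warning is meant to flag.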

[Bug c++/95763] Feature request: compiler warning if line width exceeds N symbols

2020-06-19 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95763

--- Comment #1 from Jonathan Wakely  ---
This seems like a better fit for something like clang-tidy than being hardcoded
into the compiler.

[Bug rtl-optimization/95493] [10 Regression] test for vector members apparently reordered with assignment to vector members since r10-7523-gb90061c6ec090c6b

2020-06-19 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95493

--- Comment #10 from Matthias Kretz (Vir)  ---
(In reply to Richard Biener from comment #7)
> Fixed on trunk sofar.

Is there anything I can do to help get this backported to 10? I applied your
patch on my GCC 10 checkout since you committed it to master and have not had
any issues.

[Bug tree-optimization/95761] [11 regression] ICE during GIMPLE pass: slp verify_ssa failed

2020-06-19 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95761

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from Richard Biener  ---
Fixed.

[Bug tree-optimization/95761] [11 regression] ICE during GIMPLE pass: slp verify_ssa failed

2020-06-19 Thread cvs-commit at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95761

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:f8f5715606a4a455327874847ccc91f4617bb4de

commit r11-1553-gf8f5715606a4a455327874847ccc91f4617bb4de
Author: Richard Biener 
Date:   Fri Jun 19 10:03:46 2020 +0200

tree-optimization/95761 - fix vector insertion place compute

I missed that indeed SLP permutation code generation can end up
referring to a non-last vectorized stmt in the last SLP_TREE_VEC_STMTS
element as optimization.  So walk them all.

2020-06-19  Richard Biener  

PR tree-optimization/95761
* tree-vect-slp.c (vect_schedule_slp_instance): Walk all
vectorized stmts for finding the last one.

* gcc.dg/torture/pr95761.c: New testcase.

[Bug target/95762] Failure to optimize __builtin_convertvector from vector of 16 chars to vector of 16 shorts in a single instruction on AVX2

2020-06-19 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95762

--- Comment #5 from Jakub Jelinek  ---
I'd say anything that depends on optabs if possible.

[Bug target/95764] New: Failure to optimize usage of _mm512_set1_epi32 to a single instruction

2020-06-19 Thread gabravier at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95764

Bug ID: 95764
   Summary: Failure to optimize usage of _mm512_set1_epi32 to a
single instruction
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gabravier at gmail dot com
  Target Milestone: ---

__m512i f(__m512i a)
{
return (_mm512_set1_epi32(0x7FFFFFFF) & a);
}

With -O3 -mavx512f, LLVM outputs this :

.LCPI0_0:
  .quad 9223372034707292159
f(long long __vector(8)):
  vpandq zmm0, zmm0, qword ptr [rip + .LCPI0_0]{1to8}
  ret

GCC outputs this :

f(long long __vector(8)):
  mov eax, 2147483647
  vpbroadcastd zmm1, eax
  vpandq zmm0, zmm0, zmm1
  ret

I'm not completely sure the LLVM version is better, but I'd rather file a bug
report (and be able to file one back to LLVM if I learn that GCC's code is
better) than just do nothing.
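
As a sanity check on the two listings, the constants 9223372034707292159 (LLVM's .quad) and 2147483647 (GCC's mov) can be related with a little arithmetic. This is only a sketch; the helper name `pair_of_dwords` is made up for illustration.

```cpp
#include <cstdint>

// Replicate one 32-bit lane value into both halves of a 64-bit word. A qword
// broadcast of such a word yields the same bits as broadcasting the dword.
constexpr std::uint64_t pair_of_dwords(std::uint32_t lane) {
    return (static_cast<std::uint64_t>(lane) << 32) | lane;
}

// 2147483647 is 0x7FFFFFFF; two copies side by side give LLVM's constant.
static_assert(pair_of_dwords(2147483647u) == 9223372034707292159ULL,
              "both listings encode the same per-lane mask");
```

So both sequences AND every 32-bit lane with 0x7FFFFFFF. They differ only in whether the mask comes from memory via an embedded broadcast or from a register broadcast.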

[Bug target/95762] Failure to optimize __builtin_convertvector from vector of 16 chars to vector of 16 shorts in a single instruction on AVX2

2020-06-19 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95762

--- Comment #4 from rguenther at suse dot de  ---
On Fri, 19 Jun 2020, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95762
> 
> --- Comment #3 from Jakub Jelinek  ---
> And I should note, because of offloading, it would be better to do that kind 
> of
> folding only after_inlining.

Hmm, does that also hold for all the vector permute/ctor "optimizations"
forwprop does?

[Bug target/95762] Failure to optimize __builtin_convertvector from vector of 16 chars to vector of 16 shorts in a single instruction on AVX2

2020-06-19 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95762

--- Comment #3 from Jakub Jelinek  ---
And I should note, because of offloading, it would be better to do that kind of
folding only after_inlining.

[Bug target/95762] Failure to optimize __builtin_convertvector from vector of 16 chars to vector of 16 shorts in a single instruction on AVX2

2020-06-19 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95762

--- Comment #2 from Jakub Jelinek  ---
(In reply to Richard Biener from comment #1)
> We're currently representing this as a .VEC_CONVERT IFN lowered at veclower
> time to
> 
>   _4 = [vec_unpack_lo_expr] a_1(D);
>   _5 = [vec_unpack_hi_expr] a_1(D);
>   _2 = {_4, _5};
> 
> rather than using a NOP_EXPR as would be possible now.  I suppose we should
> remove .VEC_CONVERT again for vector integer conversions and directly
> use NOP_EXPRs plus make sure to lower those when not supported.  Not
> sure if __builtin_convertvector also supports integer<->float conversions.

__builtin_convertvector does support integer<->float conversions too.
I'd say we should just fold .VEC_CONVERT to something more appropriate if the
conditions are right (e.g. if an optab says it is possible to do it in a
different way that will also survive veclower) and otherwise keep it as is.

[Bug target/95740] Failure to avoid using the stack when interpreting a float as an integer when it is modified afterwards

2020-06-19 Thread crazylht at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95740

Hongtao.liu  changed:

   What|Removed |Added

 CC||crazylht at gmail dot com

--- Comment #2 from Hongtao.liu  ---
Increasing the constraint preference and reducing the SSE->integer move cost
doesn't help.


modified   gcc/config/i386/i386.md  
@@ -2294,9 +2294,9 @@   

 (define_insn "*movsi_internal" 
   [(set (match_operand:SI 0 "nonimmediate_operand" 
-"=r,m ,*y,*y,?*y,?m,?r,?*y,*v,*v,*v,m ,?r,?*v,*k,*k ,*rm,*k")  
+"=r,m ,*y,*y,?*y,?m,?r,?*y,*v,*v,*v,r ,m,?*v,*k,*k ,*rm,*k")   
 (match_operand:SI 1 "general_operand"  
-"g ,re,C ,*y,m  ,*y,*y,r  ,C ,*v,m ,*v,*v,r  ,*r,*km,*k ,CBC"))]   
+"g ,re,C ,*y,m  ,*y,*y,r  ,C ,*v,m ,v,*v,r  ,*r,*km,*k ,CBC"))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))"  
 {  
   switch (get_attr_type (insn))
modified   gcc/config/i386/x86-tune-costs.h 
@@ -1624,7 +1624,7 @@ struct processor_costs skylake_cost = {   
in 32,64,128,256 and 512-bit */ 
   {8, 8, 8, 12, 24},   /* cost of storing SSE registers
in 32,64,128,256 and 512-bit */ 
-  6, 6,/* SSE->integer and
integer->SSE moves */   
+  2, 2,/* SSE->integer and
integer->SSE moves */  

--

It seems to me for insn inserted reloaded before

18: r89:SI=r87:SF#0

(insn 18 16 6 2 (set (reg:SI 89) 
 (subreg:SI (reg:SF 87) 0))

LRA prefers to put reg:SI 89 into memory since it would be used later.
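
For readers without the original test case at hand, the pattern under discussion looks roughly like the sketch below. It is a hypothetical reproducer shape, not the reporter's code, and the function name is made up. It assumes a 32-bit IEEE 754 float.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical shape of the problem: reinterpret a float's bits as an
// integer (the subreg:SI of reg:SF above) and then modify the integer.
std::uint32_t float_bits_plus_one(float x) {
    static_assert(sizeof(std::uint32_t) == sizeof(float),
                  "sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);  // bit-for-bit copy, no conversion
    return bits + 1u;                     // the later modification
}
```

Ideally the bits would move directly from the SSE register to an integer register instead of going through a stack slot.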

[Bug c++/95763] New: Feature request: compiler warning if line width exceeds N symbols

2020-06-19 Thread hyena at hyena dot net.ee
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95763

Bug ID: 95763
   Summary: Feature request: compiler warning if line width
exceeds N symbols
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Keywords: diagnostic, easyhack
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hyena at hyena dot net.ee
  Target Milestone: ---

Actually, this is a feature request, not a bug report.

It would be very useful if I could specify in a source file the maximum number
of symbols per line that the source file is allowed to contain. If some line
exceeds that limit, then the compiler should print the respective warning
mentioning the line number where the limit is exceeded.
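
For illustration, the requested check is easy to sketch outside the compiler. The function below is hypothetical (it is not a GCC option or API) and counts bytes per line, which only equals the number of symbols for single-byte encodings.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Return the 1-based numbers of all lines longer than max_width bytes.
std::vector<std::size_t> overlong_lines(const std::string& source,
                                        std::size_t max_width) {
    std::vector<std::size_t> result;
    std::istringstream in(source);
    std::string line;
    std::size_t number = 0;
    while (std::getline(in, line)) {
        ++number;
        if (line.size() > max_width) {
            result.push_back(number);
        }
    }
    return result;
}
```

A wrapper could read each source file into a string and print a warning for every returned line number.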

[Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)

2020-06-19 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

--- Comment #9 from Uroš Bizjak  ---
(In reply to Jakub Jelinek from comment #8)

> The culprit is the %esp here, that adds the 0x67 prefix to the insn and
> will only work if %rsp is below 4GB.

Ah, indeed... I was in a bit of hurry and didn't notice.
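
The effect of the 0x67 prefix can be modelled in a few lines. This is a portable sketch with made-up function names, not code from the patch: with a 32-bit address size in 64-bit mode, the effective address is computed in 32 bits and zero-extended.

```cpp
#include <cstdint>

// Model of addressing through (%esp) in 64-bit mode: only the low 32 bits
// of the stack pointer take part, and the result is zero-extended.
constexpr std::uint64_t address_with_esp_base(std::uint64_t rsp) {
    return rsp & 0xFFFFFFFFull;
}

// The access reaches the real stack slot only if nothing was truncated,
// i.e. only if %rsp is below 4GB.
constexpr bool esp_form_hits_stack(std::uint64_t rsp) {
    return address_with_esp_base(rsp) == rsp;
}
```

For a typical x86-64 Linux stack pointer such as 0x7ffdf0001000 the truncated address points somewhere else entirely, which is consistent with the SIGSEGV in the gdb session.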

[Bug target/95762] Failure to optimize __builtin_convertvector from vector of 16 chars to vector of 16 shorts in a single instruction on AVX2

2020-06-19 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95762

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2020-06-19
 Status|UNCONFIRMED |NEW
 CC||jakub at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
We're currently representing this as a .VEC_CONVERT IFN lowered at veclower
time to

  _4 = [vec_unpack_lo_expr] a_1(D);
  _5 = [vec_unpack_hi_expr] a_1(D);
  _2 = {_4, _5};

rather than using a NOP_EXPR as would be possible now.  I suppose we should
remove .VEC_CONVERT again for vector integer conversions and directly
use NOP_EXPRs plus make sure to lower those when not supported.  Not
sure if __builtin_convertvector also supports integer<->float conversions.

[Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)

2020-06-19 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #8 from Jakub Jelinek  ---
(In reply to Uroš Bizjak from comment #7)
> Actually, x86_64 (at least my Fedora 32) does not like operations on stack:
> 
> Starting program: /sdd/uros/git/gcc/gcc/testsuite/gcc.dg/atomic/a.out 
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x0040110a in main ()
> (gdb) disass
> Dump of assembler code for function main:
>0x00401106 <+0>: push   %rbp
>0x00401107 <+1>: mov%rsp,%rbp
> => 0x0040110a <+4>: lock orq $0x0,(%esp)

The culprit is the %esp here, that adds the 0x67 prefix to the insn and
will only work if %rsp is below 4GB.

>0x0040 <+11>:mov$0x0,%eax
>0x00401116 <+16>:pop%rbp
>0x00401117 <+17>:retq   
> End of assembler dump.
> 
> I didn't investigate further, but 32bit executable works OK.

[Bug target/95762] New: Failure to optimize __builtin_convertvector from vector of 16 chars to vector of 16 shorts in a single instruction on AVX2

2020-06-19 Thread gabravier at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95762

Bug ID: 95762
   Summary: Failure to optimize __builtin_convertvector from
vector of 16 chars to vector of 16 shorts in a single
instruction on AVX2
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gabravier at gmail dot com
  Target Milestone: ---

typedef int8_t v16i8 __attribute__((vector_size(16)));
typedef int16_t v16i16 __attribute__((vector_size(32)));

auto f(v16i8 a)
{
return __builtin_convertvector(a, v16i16);
}

With -O3 -mavx2, LLVM outputs this :

f(signed char __vector(16)):
  vpmovsxbw ymm0, xmm0
  ret

GCC outputs this :

f(signed char __vector(16)):
  vpmovsxbw xmm1, xmm0
  vpsrldq xmm0, xmm0, 8
  vpmovsxbw xmm0, xmm0
  vinserti128 ymm0, ymm1, xmm0, 0x1
  ret
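
For reference, the semantics that both instruction sequences implement can be checked with a small self-contained sketch. The function name `widen_sign_extends` is made up for illustration; it assumes a compiler that provides `__builtin_convertvector` and vector subscripting in C++ (GCC 9 or later, or Clang). The conversion is kept inside one function so that no 32-byte vector crosses a call boundary.

```cpp
#include <cstdint>

typedef std::int8_t v16i8 __attribute__((vector_size(16)));
typedef std::int16_t v16i16 __attribute__((vector_size(32)));

// Convert 16 signed chars to 16 shorts and verify every lane was
// sign-extended, which is what vpmovsxbw does per element.
bool widen_sign_extends() {
    v16i8 in = {};
    for (int i = 0; i < 16; ++i) {
        in[i] = static_cast<std::int8_t>(i - 8);  // values -8 .. 7
    }
    v16i16 out = __builtin_convertvector(in, v16i16);
    for (int i = 0; i < 16; ++i) {
        if (out[i] != i - 8) {
            return false;
        }
    }
    return true;
}
```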

[Bug target/95750] [x86] Use dummy atomic insn instead of mfence in __atomic_thread_fence(seq_cst)

2020-06-19 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95750

--- Comment #7 from Uroš Bizjak  ---
Actually, x86_64 (at least my Fedora 32) does not like operations on stack:

Starting program: /sdd/uros/git/gcc/gcc/testsuite/gcc.dg/atomic/a.out 

Program received signal SIGSEGV, Segmentation fault.
0x0040110a in main ()
(gdb) disass
Dump of assembler code for function main:
   0x00401106 <+0>: push   %rbp
   0x00401107 <+1>: mov%rsp,%rbp
=> 0x0040110a <+4>: lock orq $0x0,(%esp)
   0x00401111 <+11>:mov$0x0,%eax
   0x00401116 <+16>:pop%rbp
   0x00401117 <+17>:retq   
End of assembler dump.

I didn't investigate further, but 32bit executable works OK.

[Bug c++/95759] Sized deallocation function can not be matched

2020-06-19 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95759

Jonathan Wakely  changed:

   What|Removed |Added

  Component|libstdc++   |c++

--- Comment #2 from Jonathan Wakely  ---
If you test with a class type that has a non-trivial destructor you'll see the
sized deallocation function being called.

[Bug libstdc++/95759] Sized deallocation function can not be matched

2020-06-19 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95759

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #1 from Jonathan Wakely  ---
(In reply to hujp from comment #0)
> I guess if the sized deallocation funtion may not replaceable, or there is a
> mis-matched bug?

No, this is behaving as expected (and is not a libstdc++ bug in any case).

See http://wg21.link/cwg1788

[Bug tree-optimization/95745] [11 regression] O3-pr85794.c fails since r11-1445-g502d63b6d6141597bb18fd23c87736a1b384cf8f

2020-06-19 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95745

--- Comment #6 from Christophe Lyon  ---
(In reply to Martin Liška from comment #4)
> Ok, can I test it with a x86_64-linux-gnu cross compiler?

Yes, that's what I am using.

Target: arm-none-linux-gnueabi
Configured with: /configure --target=arm-none-linux-gnueabi
--prefix=/aci-gcc-fsf/builds/gcc-fsf-gccsrc/tools
--with-sysroot=/aci-gcc-fsf/builds/gcc-fsf-gccsrc/sysroot-arm-none-linux-gnueabi
--disable-nls --disable-libgomp --disable-libmudflap --disable-libcilkrts
--enable-checking --enable-languages=c,c++,fortran --with-float=soft
--enable-build-with-cxx --with-mode=arm --with-cpu=cortex-a9


> Can you please provide exact command line for some of the problematic
> test-cases?

/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabi/gcc3/gcc/xgcc
-B/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabi/gcc3/gcc/
/gcc/testsuite/gcc.dg/vect/O3-pr85794.c -fno-diagnostics-show-caret
-fno-diagnostics-show-line-numbers -fdiagnostics-color=never
-fdiagnostics-urls=never -mfloat-abi=softfp -ffast-math -ftree-vectorize
-fno-tree-loop-distribute-patterns -fno-vect-cost-model -fno-common -O2
-fdump-tree-vect-details -O3 -fno-ipa-cp-clone -S -o O3-pr85794.s
during RTL pass: expand
/gcc/testsuite/gcc.dg/vect/O3-pr85794.c: In function 'foo':
/gcc/testsuite/gcc.dg/vect/O3-pr85794.c:7:1: internal compiler error: in
do_store_flag, at expr.c:12247

[Bug tree-optimization/94757] GCC does not optimise unsigned multiplication known not to overflow

2020-06-19 Thread qianjh at cn dot fujitsu.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94757

Qian Jianhua  changed:

   What|Removed |Added

 CC||qianjh at cn dot fujitsu.com

--- Comment #2 from Qian Jianhua  ---
>GCC knows that the multiplication cannot overflow, because replacing the 
>returned expression with __builtin_mul_overflow_p(x, 3, x) is makes it 
>optimise 
>to returning constant 0.
The __builtin_mul_overflow_p function only checks the value range, so it can
be optimised by VRP.

The process is like this:
   x     range [0, UINT_MAX/3]
   x*3   range [0, UINT_MAX]
   x*3/3 range [0, UINT_MAX/3], but the result (= x) is not calculated.

I tested some other cases.
It seems that the optimiser in GCC only checks the value range of the
expression, but does not track/calculate the result of the expression if the
variable in the expression is not constant.


So I think this kind of expression optimisation is not supported in GCC now.

In clang, the expression "(x * 3) / 3" is optimised to "x".
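
A minimal sketch of the situation, assuming the range information comes from an unreachable hint (the original test case may look different, and the function names here are made up). The overflow check uses `__builtin_mul_overflow`, not the `_p` variant mentioned above, because it needs no type-only argument.

```cpp
#include <climits>

// With x known to be at most UINT_MAX / 3, x * 3 cannot wrap, so the
// whole expression is equal to x.
unsigned times3_div3(unsigned x) {
    if (x > UINT_MAX / 3) {
        __builtin_unreachable();  // range hint: caller never passes larger x
    }
    return x * 3 / 3;
}

// Reports whether x * 3 would wrap in unsigned arithmetic.
bool times3_overflows(unsigned x) {
    unsigned product;
    return __builtin_mul_overflow(x, 3u, &product);
}
```

Comparing the assembly for `times3_div3` between compilers shows whether the multiply and divide were folded away.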

[Bug tree-optimization/95745] [11 regression] O3-pr85794.c fails since r11-1445-g502d63b6d6141597bb18fd23c87736a1b384cf8f

2020-06-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95745

--- Comment #5 from Martin Liška  ---
All right, I see something very similar for s390x cross compiler:

./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/ext/vector27.C
-march=z13 -c
during RTL pass: expand
/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/ext/vector27.C: In function
‘void f(veci*, veci*, int)’:
/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/ext/vector27.C:8:12: internal
compiler error: in do_store_flag, at expr.c:12247
8 |   *a = !*a || *b < ++c;
  |^~~
0x6ae823 do_store_flag
/home/marxin/Programming/gcc/gcc/expr.c:12247
0xc298bf expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
expand_modifier)
/home/marxin/Programming/gcc/gcc/expr.c:9608
0xb11de3 expand_gimple_stmt_1
/home/marxin/Programming/gcc/gcc/cfgexpand.c:3786
0xb11de3 expand_gimple_stmt
/home/marxin/Programming/gcc/gcc/cfgexpand.c:3847
0xb1727a expand_gimple_basic_block
/home/marxin/Programming/gcc/gcc/cfgexpand.c:5888
0xb18d26 execute
/home/marxin/Programming/gcc/gcc/cfgexpand.c:6572
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug tree-optimization/95761] [11 regression] ICE during GIMPLE pass: slp verify_ssa failed

2020-06-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95761

Martin Liška  changed:

   What|Removed |Added

 CC||marxin at gcc dot gnu.org

--- Comment #2 from Martin Liška  ---
For the record: started with r11-1501-gda2b7c7f0a136b4d.

[Bug tree-optimization/95761] [11 regression] ICE during GIMPLE pass: slp verify_ssa failed

2020-06-19 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95761

Richard Biener  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
   Last reconfirmed||2020-06-19
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Target Milestone|--- |11.0

--- Comment #1 from Richard Biener  ---
Mine.

[Bug tree-optimization/95745] [11 regression] O3-pr85794.c fails since r11-1445-g502d63b6d6141597bb18fd23c87736a1b384cf8f

2020-06-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95745

Martin Liška  changed:

   What|Removed |Added

 Status|WAITING |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |marxin at gcc dot gnu.org

--- Comment #4 from Martin Liška  ---
Ok, can I test it with a x86_64-linux-gnu cross compiler?
Can you please provide exact command line for some of the problematic
test-cases?

[Bug target/95753] [11 Regression] ICE when building the Linux Kernel for ARM64 (internal compiler error: ‘global_options’ are modified in local context)

2020-06-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95753

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 CC||marxin at gcc dot gnu.org
 Ever confirmed|0   |1
   Last reconfirmed||2020-06-19
   Assignee|unassigned at gcc dot gnu.org  |marxin at gcc dot gnu.org

--- Comment #1 from Martin Liška  ---
I'll take a look.

[Bug middle-end/95757] [11 regression] missing warning in gcc.dg/Wstringop-overflow-25.c since r11-1517

2020-06-19 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95757

Christophe Lyon  changed:

   What|Removed |Added

 Target|powerpc64*-linux-gnu|powerpc64*-linux-gnu arm
 CC||clyon at gcc dot gnu.org

--- Comment #1 from Christophe Lyon  ---
I see the same thing on some arm targets:
arm-none-linux-gnueabihf --with-cpu=cortex-a5
arm-none-eabi -mcpu=cortex-m[034]

but for instance arm-none-linux-gnueabihf --with-cpu=cortex-a9 works.

[Bug tree-optimization/95745] [11 regression] O3-pr85794.c fails since r11-1445-g502d63b6d6141597bb18fd23c87736a1b384cf8f

2020-06-19 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95745

--- Comment #3 from Christophe Lyon  ---
I still see it with r11-1521-gaae80e833d2826fc0afe7ff1704d2ab0f4607c5a

[Bug tree-optimization/95761] New: [11 regression] ICE during GIMPLE pass: slp verify_ssa failed

2020-06-19 Thread dimhen at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95761

Bug ID: 95761
   Summary: [11 regression] ICE during GIMPLE pass: slp verify_ssa
failed
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dimhen at gmail dot com
  Target Milestone: ---

r11-1451 PASS
r11-1512 FAIL

`gcc -O2' PASS

$ gcc -O3 -c x_1.i
x_1.i: In function 'k':
x_1.i:10:6: error: definition in block 3 follows the use
   10 | void k() {
  |  ^
for SSA_NAME: vect__2.17_23 in statement:
vect__124.21_25 = vect_cst__58 + vect__2.17_23;
during GIMPLE pass: slp
x_1.i:10:6: internal compiler error: verify_ssa failed
0x123ff3d verify_ssa(bool, bool)
/home/dimhen/src/gcc_current/gcc/tree-ssa.c:1208
0xf37d25 execute_function_todo
/home/dimhen/src/gcc_current/gcc/passes.c:1992
0xf38a5c do_per_function
/home/dimhen/src/gcc_current/gcc/passes.c:1640
0xf38a5c execute_todo
/home/dimhen/src/gcc_current/gcc/passes.c:2039
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

$ cat x_1.i
typedef int a[10];
typedef struct {
  a b;
  a c;
  a d;
  a e;
} f;
f g;
int *j;
void k() {
  for (;;) {
a l;
j[0] = g.b[0];
int *h = g.d;
int i = 0;
for (; i < 10; i++)
  h[i] = l[0] - g.e[0];
h = g.e;
i = 0;
for (; i < 10; i++)
  h[i] = l[1] + g.e[i];
  }
}

Sorry for hyper-reduction

[Bug middle-end/95757] [11 regression] missing warning in gcc.dg/Wstringop-overflow-25.c since r11-1517

2020-06-19 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95757

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |11.0

[Bug rtl-optimization/95756] Failure to optimize memory operations with _Complex

2020-06-19 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95756

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Keywords||missed-optimization
 Ever confirmed|0   |1
   Last reconfirmed||2020-06-19
 Target||x86_64-*-* i?86-*-*
  Component|target  |rtl-optimization

--- Comment #1 from Richard Biener  ---
RTL expansion issue, we're ending up with

(insn 5 2 6 2 (set (reg:SF 82 [  ])
(mem/u/c:SF (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0  S4 A32]))
"t.c":3:12 -1
 (expr_list:REG_EQUAL (const_double:SF 0.0 [0x0.0p+0])
(nil)))
(insn 6 5 10 2 (set (reg:SF 83 [ +4 ])
(mem/u/c:SF (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0  S4 A32]))
"t.c":3:12 -1
 (expr_list:REG_EQUAL (const_double:SF 0.0 [0x0.0p+0])
(nil)))
(insn 10 6 11 2 (set (reg:SF 84)
(reg:SF 82 [  ])) "t.c":4:1 -1
 (nil))
(insn 11 10 12 2 (set (reg:SF 85)
(reg:SF 83 [ +4 ])) "t.c":4:1 -1
 (nil))
(insn 12 11 13 2 (set (mem/c:SF (plus:DI (reg/f:DI 77 virtual-stack-vars)
(const_int -8 [0xfff8])) [0  S4 A32])
(reg:SF 84)) "t.c":4:1 -1
 (nil))
(insn 13 12 14 2 (set (mem/c:SF (plus:DI (reg/f:DI 77 virtual-stack-vars)
(const_int -4 [0xfffc])) [0  S4 A32])
(reg:SF 85)) "t.c":4:1 -1
 (nil))
(insn 14 13 15 2 (set (reg:DI 20 xmm0)
(mem/c:DI (plus:DI (reg/f:DI 77 virtual-stack-vars)
(const_int -8 [0xfff8])) [0  S8 A32])) "t.c":4:1 -1
 (nil))
(insn 15 14 0 2 (use (reg:DI 20 xmm0)) "t.c":4:1 -1
 (nil))

[Bug target/95753] [11 Regression] ICE when building the Linux Kernel for ARM64 (internal compiler error: ‘global_options’ are modified in local context)

2020-06-19 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95753

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |11.0
Summary|ICE when building the Linux |[11 Regression] ICE when
   |Kernel for ARM64 (internal  |building the Linux Kernel
   |compiler error: |for ARM64 (internal
   |‘global_options’ are|compiler error:
   |modified in local context)  |‘global_options’ are
   ||modified in local context)

[Bug target/95748] Long long function parameter should be aligned to 32 bit on x86.

2020-06-19 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95748

Richard Biener  changed:

   What|Removed |Added

 Target||i?86-*-*
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID
   Keywords||ABI

--- Comment #1 from Richard Biener  ---
x86 aligns 'long long' (the type) to 8 bytes so your test is a bit flawed
because 8 byte alignment is of course also 4 byte alignment.

[Bug tree-optimization/95745] [11 regression] O3-pr85794.c fails since r11-1445-g502d63b6d6141597bb18fd23c87736a1b384cf8f

2020-06-19 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95745

Richard Biener  changed:

   What|Removed |Added

Version|unknown |11.0
   Target Milestone|--- |11.0
