[Bug target/86819] New: Set min_divisions_for_recip_mul to 2

2018-08-01 Thread glisse at gcc dot gnu.org
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: glisse at gcc dot gnu.org Target Milestone: --- Target: x86_64-*-* On a modern x86_64, multiplications are super fast and divisions are much slower, so as soon as we have 2
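A minimal sketch of the kind of rewrite the report title refers to, assuming two divisions that share one divisor. The names `Pair`, `divide_twice` and `divide_via_reciprocal` are hypothetical and are not taken from GCC. The reciprocal form trades two divisions for one division and two multiplications, and its results can differ from the direct form in the last bit, which is why a compiler would normally need a relaxed floating-point flag such as -freciprocal-math before doing this itself.

```cpp
#include <cmath>

// Hypothetical helper type holding the two quotients.
struct Pair { double a, b; };

// Direct form: two divisions by the same divisor.
Pair divide_twice(double x, double y, double d) {
    return { x / d, y / d };
}

// Reciprocal form: one division, then two multiplications.
// Results may differ from divide_twice by a rounding error.
Pair divide_via_reciprocal(double x, double y, double d) {
    const double r = 1.0 / d;
    return { x * r, y * r };
}
```

With a divisor whose reciprocal is exactly representable (for example 4.0) both forms agree exactly; for other divisors they agree only up to rounding.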

[Bug target/86763] [8/9 Regression] Wrong code comparing member of copy of a 237 byte object with nontrivial default constructor on x86-64 arch

2018-07-31 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86763 Marc Glisse changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed|

[Bug target/57112] -march=x86-64 not documented

2018-07-31 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57112 Marc Glisse changed: What|Removed |Added Status|NEW |RESOLVED Known to work|

[Bug tree-optimization/86732] Potential nullptr dereference does not propagate knowledge about the pointer

2018-07-30 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86732 --- Comment #4 from Marc Glisse --- (In reply to Richard Biener from comment #3) > note how it doesn't eliminate the actual load which probably causes the > odd code-generation. The code says: /* We want the NULL pointer dereference to

[Bug tree-optimization/86732] Potential nullptr dereference does not propagate knowledge about the pointer

2018-07-30 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86732 --- Comment #2 from Marc Glisse --- While I would also like to see this optimized better, ISTR that this was done on purpose, you may want to look at the old discussions. Some languages may have things set up to catch null dereferences, but that

[Bug c/86729] address of vector element requested

2018-07-30 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86729 Marc Glisse changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug target/86722] New: ifcvt produces x&0 that is never cleaned up

2018-07-29 Thread glisse at gcc dot gnu.org
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: glisse at gcc dot gnu.org Target Milestone: --- Target: x86_64-*-* (could be rtl-optimization or target) void f(double*d,double*e){ for(;d

[Bug tree-optimization/86710] 3 missing logarithm optimizations

2018-07-28 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86710 --- Comment #1 from Marc Glisse --- This kind of transformation needs to be protected by some unsafe math flag, and by a single_use (aka :s) check on the logs. No :c in the output. The third transformation has nothing to do with logs, you are

[Bug tree-optimization/86701] Optimize strlen called on std::string c_str()

2018-07-27 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86701 --- Comment #1 from Marc Glisse --- Aren't you allowed to have null characters in the middle of a std::string?
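The objection in the comment above can be illustrated with a short sketch. The function names are hypothetical. A std::string may hold a null character in the middle, in which case strlen on c_str() stops early and does not equal size(), so replacing one with the other is not valid in general.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Builds a 5-character string whose third character is '\0'.
std::string make_string_with_embedded_nul() {
    return std::string("ab\0cd", 5);
}

// Length as seen by the C library: stops at the first '\0'.
std::size_t c_strlen(const std::string& s) {
    return std::strlen(s.c_str());
}
```

For the string built here, size() is 5 while c_strlen returns 2.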

[Bug tree-optimization/86628] Missed simplification of division

2018-07-23 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86628 --- Comment #5 from Marc Glisse --- (In reply to Richard Biener from comment #4) > Yeah, generally we can't associate because (x*y)*z may not overflow because > x == 0 but x*(y*z) may because y*z overflows. We can do it - in the wrapping case
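The quoted remark about association can be checked with a small sketch. It assumes the GCC/Clang builtin `__builtin_mul_overflow`; the two function names are hypothetical. With x == 0, the left-associated product never overflows, while the right-associated one can overflow in the inner y*z.

```cpp
#include <climits>

// True if evaluating (x*y)*z in int overflows at any step.
bool left_assoc_overflows(int x, int y, int z) {
    int xy = 0, r = 0;
    if (__builtin_mul_overflow(x, y, &xy)) return true;
    return __builtin_mul_overflow(xy, z, &r);
}

// True if evaluating x*(y*z) in int overflows at any step.
bool right_assoc_overflows(int x, int y, int z) {
    int yz = 0, r = 0;
    if (__builtin_mul_overflow(y, z, &yz)) return true;
    return __builtin_mul_overflow(x, yz, &r);
}
```

For x = 0, y = INT_MAX, z = 2 the first function reports no overflow and the second reports one, so the two groupings are not interchangeable when signed overflow is undefined.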

[Bug tree-optimization/86628] Missed simplification of division

2018-07-22 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86628 --- Comment #3 from Marc Glisse --- We already simplify some simple cases like x*t/t -> x in match.pd. Larger cases are for a pass like reassoc. In this particular case, we could also imagine somehow noticing that (x*y)*z is better reassociated

[Bug ipa/86590] Codegen is poor when passing std::string by value with _GLIBCXX_EXTERN_TEMPLATE undefined

2018-07-20 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86590 --- Comment #5 from Marc Glisse --- -finline-limit=80 or higher (or more precisely --param max-inline-insns-auto=40) lets it optimize.

[Bug tree-optimization/86573] Failure to optimise passing simple values to inlined function

2018-07-19 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86573 --- Comment #7 from Marc Glisse --- The real difference in -std=c++17 is _GLIBCXX_EXTERN_TEMPLATE. With -std=c++14, we have many extern templates which the compiler almost never inlines. This leaves existing inline functions small enough to be

[Bug c++/86573] Failure to optimise passing simple values to inlined function

2018-07-18 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86573 --- Comment #2 from Marc Glisse --- When passing by copy, gcc seems to manage with default flags, but your -std=c++2a -fno-exceptions hinder it somehow.

[Bug c++/86573] Failure to optimise passing simple values to inlined function

2018-07-18 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86573 --- Comment #1 from Marc Glisse --- Try renaming 'main' to any other name and gcc does optimize...

[Bug tree-optimization/86557] missed vectorization with std::vector compared to icc 18

2018-07-18 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86557 Marc Glisse changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed|

[Bug middle-end/86471] GCC/libstdc++ outputs inferior code for std::fill and std::fill_n vs std::memset on c-style arrays

2018-07-13 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86471 --- Comment #9 from Marc Glisse --- (In reply to Andrew Pinski from comment #7) > This is incorrect for floating point types Because of negative 0 I assume. > And it introduces an extra check at runtime if value is not known to compile >
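The negative-zero point can be shown with a short sketch, assuming IEEE 754 doubles. The function names are hypothetical. A value test such as `v == 0.0` accepts -0.0, yet the object representation of -0.0 is not all zero bytes, so a memset to 0 would store a different value.

```cpp
#include <cstring>

// Value comparison: true for both +0.0 and -0.0.
bool compares_equal_to_zero(double v) {
    return v == 0.0;
}

// Representation check: true only if every byte of v is zero.
bool has_all_zero_bytes(double v) {
    unsigned char bytes[sizeof v];
    std::memcpy(bytes, &v, sizeof v);
    for (unsigned char b : bytes) {
        if (b != 0) return false;
    }
    return true;
}
```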

[Bug middle-end/86471] GCC/libstdc++ outputs inferior code for std::fill and std::fill_n vs std::memset on c-style arrays

2018-07-10 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86471 --- Comment #4 from Marc Glisse --- There have been questions before about enabling (parts of) ldist at -O2. (In reply to Matt Bentley from comment #3) > I thought I should note that there is also a missing optimization > opportunity in the

[Bug c++/86477] failure binding reference to vector element

2018-07-10 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86477 --- Comment #2 from Marc Glisse --- We don't have attribute ext_vector_type (we have vector_size). Gcc warns about it. We don't allow constructing a vector from a scalar (broadcasting). What Andrew says. If I fix everything, binding a reference

[Bug c/86420] [9 regression] nextafter(0x1p-1022,0) is constant folded

2018-07-06 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86420 --- Comment #1 from Marc Glisse --- (In reply to nsz from comment #0) > gcc has no flag to say 'floating-point exceptions matter' (like > -frounding-math for non-default rounding mode) There is -ftrapping-math (on by default), although its

[Bug c++/86347] Incorrect call order of allocation function in new expression

2018-06-28 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86347 Marc Glisse changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill

[Bug tree-optimization/86259] [8/9 Regression] min(4, strlen(s)) optimized to strlen(s) with -flto

2018-06-24 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86259 --- Comment #15 from Marc Glisse --- (In reply to Martin Sebor from comment #14) > > You say that > > > > struct { int a; int b; } s, s2; > > memcpy (, , sizeof (s)); > > > > is invalid, aka not copying the whole structure since you pass in

[Bug middle-end/86284] Insert trap instruction in place of missing return statement on dodgy code

2018-06-23 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86284 --- Comment #1 from Marc Glisse --- -fsanitize=return ?

[Bug tree-optimization/86270] Simple loop needs an extra register and an extra instruction

2018-06-21 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86270 --- Comment #2 from Marc Glisse --- :-( So many transforms seem to have this kind of drawback... We could always add a pair of single_use checks, but we are going to miss some optimizations if we do that. Maybe it is slightly relevant that one

[Bug c++/86173] Default construction of a union (in std::optional)

2018-06-20 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86173 --- Comment #4 from Marc Glisse --- Recent related commits: r261758 r261735 (they don't fix the issue).

[Bug c++/86187] Subscript operator applied to an temporary array results in an lvalue

2018-06-16 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86187 Marc Glisse changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug c++/85867] Subscript operator applied to an temporary array results in an lvalue

2018-06-16 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85867 Marc Glisse changed: What|Removed |Added CC||zhonghao at pku dot org.cn --- Comment #3

[Bug c++/86173] Default construction of a union (in std::optional)

2018-06-16 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86173 --- Comment #1 from Marc Glisse --- Note that constructing optional from std::nullopt does avoid the memset.

[Bug libstdc++/80335] perf of copying std::optional

2018-06-16 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80335 Marc Glisse changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug c++/86173] New: Default construction of a union (in std::optional)

2018-06-16 Thread glisse at gcc dot gnu.org
Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: glisse at gcc dot gnu.org Target Milestone: --- Default construction of std::optional always starts with a memset of the whole optional to 0, while it doesn't with clang using

[Bug middle-end/86122] [8/9 Regression] ICE in useless_type_conversion_p, at gimple-expr.c:87

2018-06-14 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86122 --- Comment #5 from Marc Glisse --- (In reply to Jakub Jelinek from comment #2) > if we want unsigned_type_for to support complex integer types or not. I think we do (seems super easy). Testing utype can't hurt indeed. (In reply to Jakub

[Bug c/86093] New: volatile ignored on pointer in C

2018-06-08 Thread glisse at gcc dot gnu.org
Component: c Assignee: unassigned at gcc dot gnu.org Reporter: glisse at gcc dot gnu.org Target Milestone: --- extern char*volatile i; int f(){return i-i;} gets simplified in C as 1 load and return 0. In C++ or if i has a non-pointer type (say int), we do have 2 loads

[Bug c/86092] global constant pointer optimization

2018-06-08 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86092 --- Comment #4 from Marc Glisse --- (In reply to Srinivas Achary from comment #2) > Is there any possibility to make this code work, Remove the 'const', or add 'volatile'. > without changing the variable attribute. -O0 > GCC-4 has no issue

[Bug tree-optimization/86024] Missed memcpy loop distribution with elementwise copy

2018-06-08 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86024 --- Comment #2 from Marc Glisse --- (In reply to Richard Biener from comment #1) > Or we may want to un-"SRA" such patterns, generating aggregate copies. I notice that store-merging does not merge these stores, I didn't check why. SLP can do it

[Bug tree-optimization/86062] Missed redundancy elimination with struct and array

2018-06-06 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86062 --- Comment #4 from Marc Glisse --- Thanks!

[Bug tree-optimization/86062] New: Missed redundancy elimination with struct and array

2018-06-05 Thread glisse at gcc dot gnu.org
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: glisse at gcc dot gnu.org Target Milestone: --- #include struct I { double i,s; I(double d):i(d),s(d){} }; typedef std::array P; typedef std::array AP; static AP c

[Bug tree-optimization/86050] Inline break tail-call optimization

2018-06-04 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86050 Marc Glisse changed: What|Removed |Added Keywords||missed-optimization

[Bug tree-optimization/86024] New: Missed memcpy loop distribution with elementwise copy

2018-06-01 Thread glisse at gcc dot gnu.org
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: glisse at gcc dot gnu.org Target Milestone: --- typedef struct A { int a, b; } A; void*f(A*restrict p){ A*q=__builtin_malloc(1024*sizeof(A)); for(int i=0;i<1

[Bug libstdc++/86023] New: Fake triviality test for internal purposes

2018-06-01 Thread glisse at gcc dot gnu.org
Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: glisse at gcc dot gnu.org Target Milestone: --- While we cannot make std::pair or std::tuple trivial for now for ABI reasons, it should still be safe to use memcpy-type

[Bug libstdc++/86013] std::vector::shrink_to_fit() could sometimes use realloc()

2018-06-01 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86013 --- Comment #5 from Marc Glisse --- (In reply to Jan Kratochvil from comment #0) > Maybe it could even always call realloc() for size reduction of any type of > objects and just assert the returned pointer did not change. I can't find anywhere

[Bug middle-end/85992] Invalid optimization with atanf

2018-05-30 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85992 --- Comment #3 from Marc Glisse --- (In reply to Matt Peddie from comment #2) > Is there a way to disable this behavior? -fno-builtin (or a more specific -fno-builtin-atanf) tells gcc to handle atanf as a regular function call, not as a

[Bug c/85974] Failure to optimize difference of two pointers into a compile time constant

2018-05-29 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85974 --- Comment #1 from Marc Glisse --- In match.pd (simplify - (pointer_diff (convert?@2 @0) (convert?@3 ADDR_EXPR@1)) + (pointer_diff (convert?@2 @0) (convert1?@3 ADDR_EXPR@1)) (that is, we can have only one cast, not just 0 or 2) and

[Bug tree-optimization/85929] _GLIBCXX_ASSERTIONS, subscript type mismatch, and std::vector bounds check elimination

2018-05-28 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85929 --- Comment #4 from Marc Glisse --- (In reply to Richard Biener from comment #2) > So somehow we need to enhance the code in VRP that registers additional > asserts to also handle symbolic ranges and thus register not only > i_4 < count_8 but

[Bug middle-end/85929] _GLIBCXX_ASSERTIONS, subscript type mismatch, and std::vector bounds check elimination

2018-05-25 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85929 --- Comment #1 from Marc Glisse --- With size_type = unsigned long, the bounds check turns out to be exactly the same test as the loop exit check, and FRE3 gets rid of it. With size_type = unsigned int, it is harder. We have roughly long int

[Bug c/85850] [9.0 Regression] gcc 9.0 doesn't compile with Xcode 9.3.1

2018-05-20 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85850 --- Comment #1 from Marc Glisse --- In libcpp/system.h, is included too late, after messing with macros, it should move earlier with the other includes. We could probably also avoid #defining true/false in C++ (just a warning).

[Bug c++/82899] *this in constructors could not alias with reference input parameters of the same type

2018-05-18 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82899 Marc Glisse changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug c++/82899] *this in constructors could not alias with reference input parameters of the same type

2018-05-18 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82899 --- Comment #24 from Marc Glisse --- Author: glisse Date: Fri May 18 22:21:20 2018 New Revision: 260383 URL: https://gcc.gnu.org/viewcvs?rev=260383&root=gcc&view=rev Log: Aliasing 'this' in a C++ constructor 2018-05-18 Marc Glisse

[Bug c++/85827] false positive for -Wunused-but-set-variable because of constexpr-if

2018-05-18 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85827 --- Comment #2 from Marc Glisse --- I think that's going to be hard. The same issue always existed with macros. The whole point of "if constexpr" is not to look at the other branches, as they may not even compile. Sure, some minimal "safe"

[Bug tree-optimization/63185] Improve DSE with branches

2018-05-18 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63185 --- Comment #12 from Marc Glisse --- (In reply to Richard Biener from comment #11) > I guess you meant (notice the bogus memset size above): True. And while it shouldn't make a difference in checking if the stores to c are dead, it could (but

[Bug tree-optimization/85822] [8/9 Regression] Maybe wrong code in VRP since r249150

2018-05-17 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85822 --- Comment #1 from Marc Glisse --- _2 = x.0_1 & -281474976710656; if (_2 == -281474976710656) goto ; [20.24%] [...] x.0_11 = ASSERT_EXPR ; x.0_12 = ASSERT_EXPR

[Bug tree-optimization/63185] Improve DSE with branches

2018-05-17 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63185 --- Comment #7 from Marc Glisse --- This PR is messy. To sum up, comment #0 was recently fixed, comment #5 is not (not noticing that the writes in the loop are dead), and comment #6 asks for increasing the alignment of VLAs the same way we

[Bug rtl-optimization/85811] Invalid optimization with fmax, fabs and nan

2018-05-16 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85811 --- Comment #6 from Marc Glisse --- What does tree_expr_nonnegative_p call non-negative? A natural definition would exclude NaN, but for REAL_CST we just return ! REAL_VALUE_NEGATIVE.

[Bug rtl-optimization/85811] Invalid optimization with fmax, fabs and nan

2018-05-16 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85811 --- Comment #1 from Marc Glisse --- tree_binary_nonnegative_warnv_p for RDIV_EXPR does RECURSE (op0) && RECURSE (op1), but that doesn't work so well when the denominator can be 0. I guess it is still ok when finite-math-only (or no-nans and
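A small sketch of why the zero-denominator case matters, assuming IEEE 754 arithmetic. The function name `divide` is hypothetical. Both operands below are non-negative, yet the quotient 0.0/0.0 is NaN, which does not compare greater than or equal to zero.

```cpp
#include <cmath>

// Plain floating-point division, kept in a function so the
// operands are ordinary run-time values.
double divide(double num, double den) {
    return num / den;
}

// True if q is a number and q >= 0; false for NaN.
bool is_nonnegative_number(double q) {
    return !std::isnan(q) && q >= 0.0;
}
```

Calling `divide(0.0, 0.0)` yields NaN, so `is_nonnegative_number` returns false even though neither operand was negative.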

[Bug target/85791] multiply overflow (128 bit)

2018-05-15 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85791 --- Comment #4 from Marc Glisse --- (In reply to Ruslan Nikolaev from comment #0) > 2. unsigned long long func(unsigned long long a, unsigned long long b) > { > __uint128_t c = (__uint128_t) a * b; > if (c > (unsigned long long)

[Bug c++/82899] *this in constructors could not alias with reference input parameters of the same type

2018-05-14 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82899 --- Comment #22 from Marc Glisse --- (In reply to rguent...@suse.de from comment #21) > Note that in the strict C semantic thing __restrict on > this isn't valid as the following is valid C++: > > Foo() __restrict > { > Foo *x = this; >

[Bug tree-optimization/85758] questionable bitwise folding (missing single use check?)

2018-05-12 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85758 --- Comment #1 from Marc Glisse --- Direct translation would be (from clang): andl %ecx, %edx addl %edx, %edi xorl %ecx, %edx addl %edx, %esi With -mbmi, I get andn %ecx, %edx, %eax

[Bug c++/82899] *this in constructors could not alias with reference input parameters of the same type

2018-05-12 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82899 --- Comment #20 from Marc Glisse --- Created attachment 44122 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44122&action=edit untested middle-end patch This works on the testcase, I need to add a comment and run it through the testsuite.

[Bug c++/82899] *this in constructors could not alias with reference input parameters of the same type

2018-05-12 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82899 --- Comment #19 from Marc Glisse --- (In reply to rguent...@suse.de from comment #18) > I suppose this changes debug information? Yes. Probably not so bad, but indeed better if we can avoid it. > I think adjusting the only user in

[Bug tree-optimization/85757] tree optimizers fail to fully clean up fixed-size memcpy

2018-05-12 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85757 Marc Glisse changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed|

[Bug c++/82899] *this in constructors could not alias with reference input parameters of the same type

2018-05-11 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82899 Marc Glisse changed: What|Removed |Added Attachment #44112|0 |1 is obsolete|

[Bug c++/82899] *this in constructors could not alias with reference input parameters of the same type

2018-05-11 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82899 --- Comment #16 from Marc Glisse --- (patch should use 'fn && DECL_CONSTRUCTOR_P (fn)' since fn can be NULL) As I was half expecting, messing with the types that directly doesn't work. It means 'this' has type T*restrict, and if I try for

[Bug c++/85746] Premature evaluation of __builtin_constant_p?

2018-05-11 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85746 --- Comment #3 from Marc Glisse --- (In reply to Jakub Jelinek from comment #2) > For different versions there is the > http://gcc.gnu.org/ml/gcc-patches/2018-03/msg00355.html > patch. Time to ping that one? ;-) (I don't have a particular

[Bug c++/85747] suboptimal code without constexpr

2018-05-11 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85747 --- Comment #5 from Marc Glisse --- (In reply to Antony Polukhin from comment #4) > Does providing some kind of -Oon-the-fly switch solves the issue with JIT > compile times while still allows more optimizations for the traditional non > JIT

[Bug c++/85747] suboptimal code without constexpr

2018-05-11 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85747 --- Comment #2 from Marc Glisse --- (In reply to Antony Polukhin from comment #0) > Could the compiler detect that `a[7]` holds values known at compile time and > force the constexpr on `sort(a + 0, a + 7);`? There has to be a limit. If I write

[Bug tree-optimization/80617] [missed optimization] Storing constant in two possibly-aliased locations

2018-05-11 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80617 --- Comment #12 from Marc Glisse --- (In reply to Richard Biener from comment #11) > Dup of PR23094 (and fixed). Richard, comment #9 shows that the original testcase is only half-fixed (though the other half seems hard to fix). Does this mean

[Bug c++/85746] New: Premature evaluation of __builtin_constant_p?

2018-05-11 Thread glisse at gcc dot gnu.org
Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: glisse at gcc dot gnu.org Target Milestone: --- int f(int a,int b){ int c = __builtin_constant_p(a < b); return c; } In C or C++98, __builtin_constant_p is passed to the middle-end for further optimization. In C+

[Bug c++/82899] *this in constructors could not alias with reference input parameters of the same type

2018-05-10 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82899 --- Comment #15 from Marc Glisse --- Created attachment 44112 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44112&action=edit Untested patch Something like this, but I haven't tested, and other calls to build_this_parm need auditing to check if

[Bug tree-optimization/80617] [missed optimization] Storing constant in two possibly-aliased locations

2018-05-10 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80617 --- Comment #9 from Marc Glisse --- The testcases from comment #6 and comment #7 are now (gcc-8) properly optimized. The original has lost one of the 2 calls to free, one remains: __old_val_4 = MEM[(void * &)a_2(D)]; MEM[(void * &)a_2(D)] =

[Bug c++/82899] *this in constructors could not alias with reference input parameters of the same type

2018-05-10 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82899 --- Comment #14 from Marc Glisse --- (In reply to Marc Glisse from comment #13) > I have no idea what was changed in gcc-8 that > helped the original testcase, (optimization happens in FRE1) It could be an optimization that says that either the

[Bug c++/82899] *this in constructors could not alias with reference input parameters of the same type

2018-05-10 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82899 --- Comment #13 from Marc Glisse --- Explicitly marking the constructor with __restrict lets it optimize also the testcase in comment #12. I have no idea what was changed in gcc-8 that helped the original testcase, but it wasn't equivalent to

[Bug c++/82899] *this in constructors could not alias with reference input parameters of the same type

2018-05-10 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82899 --- Comment #10 from Marc Glisse --- This seems fixed in 8.1 (at least we don't generate the extra mov anymore), can you check?

[Bug target/85730] complex code for modifying lowest byte in a 4-byte vector

2018-05-10 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85730 Marc Glisse changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed|

[Bug middle-end/85720] bad codegen for looped assignment of primitives at -O2

2018-05-09 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85720 --- Comment #3 from Marc Glisse --- (In reply to Mathias Stearn from comment #2) > Hmm. Taking the example from the -ftree-loop-distribute-patterns > documentation, it still seems to generate poor code, this time at both -O2 > and -O3:

[Bug libstdc++/85672] [9 Regression] error: redefinition of 'constexpr long double std::abs(long double)'

2018-05-08 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85672 --- Comment #11 from Marc Glisse --- (In reply to Jonathan Wakely from comment #8) > My autotools-fu is too weak to come up with anything better but I'd be very > happy if you can suggest something cleaner. For the general case, the autoconf

[Bug c++/85680] Missed optimization for value-init of variable-sized allocation

2018-05-08 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85680 --- Comment #4 from Marc Glisse --- All memset come from ldist, so already quite late in the pipeline. Maybe clang/intel, who avoid a comparison between new and the first memset, generate memset directly from the front-end? (clang generates the

[Bug c++/85680] Missed optimization for value-init of variable-sized allocation

2018-05-07 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85680 --- Comment #1 from Marc Glisse --- Quite impressive how we do the test in multiple ways, which are not quite equivalent because of the wrapping semantics of unsigned. Maybe if we asserted that the argument of operator new must be less than the

[Bug libstdc++/85672] [9 Regression] error: redefinition of 'constexpr long double std::abs(long double)'

2018-05-07 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85672 --- Comment #7 from Marc Glisse --- (In reply to Jonathan Wakely from comment #5) > > - -Wsystem-headers -Wundef will warn > > That's the status quo. It would take a ton of effort to avoid -Wundef > warnings in libstdc++ and that's not

[Bug libstdc++/85672] [9 Regression] error: redefinition of 'constexpr long double std::abs(long double)'

2018-05-07 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85672 --- Comment #4 from Marc Glisse --- (In reply to Jonathan Wakely from comment #3) > Yes it would have been broken by r259813 and this should fix it: I don't think that's sufficient: - the same code is present in several files - -Wsystem-headers

[Bug libstdc++/85672] error: redefinition of 'constexpr long double std::abs(long double)'

2018-05-06 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85672 --- Comment #1 from Marc Glisse --- I think there is an inconsistency where we #define _GLIBCXX_USE_FLOAT128 0 (can you check your c++config.h?) to say that it shouldn't be supported, but then test with #ifdef and not #if.
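The inconsistency described above comes down to how the preprocessor treats a macro defined as 0. The sketch below uses a hypothetical macro, `DEMO_USE_FEATURE`, rather than the real libstdc++ one: `#ifdef` only asks whether the macro is defined, so it takes the "enabled" branch, while `#if` evaluates the value and takes the "disabled" branch.

```cpp
// Hypothetical configuration macro, defined to 0 to mean "not supported".
#define DEMO_USE_FEATURE 0

// #ifdef tests only for definedness, so this branch is taken.
#ifdef DEMO_USE_FEATURE
constexpr bool seen_by_ifdef = true;
#else
constexpr bool seen_by_ifdef = false;
#endif

// #if evaluates the macro's value (0), so the #else branch is taken.
#if DEMO_USE_FEATURE
constexpr bool seen_by_if = true;
#else
constexpr bool seen_by_if = false;
#endif

static_assert(seen_by_ifdef, "#ifdef sees a macro defined as 0");
static_assert(!seen_by_if, "#if treats a macro defined as 0 as false");
```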

[Bug tree-optimization/85143] Loop limit prevents (auto)vectorization

2018-05-01 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85143 --- Comment #7 from Marc Glisse --- Note that the patch still doesn't handle _1 = n_15(D) <= i_46; _2 = i_46 > 1336; _3 = _1 | _2; because of the mix between strict and large inequalities. (if I write int m = 1337; and replace i < 1337

[Bug tree-optimization/85143] Loop limit prevents (auto)vectorization

2018-05-01 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85143 --- Comment #6 from Marc Glisse --- Author: glisse Date: Tue May 1 21:41:05 2018 New Revision: 259812 URL: https://gcc.gnu.org/viewcvs?rev=259812&root=gcc&view=rev Log: Generalize a

[Bug c++/81420] When a reference is bound to a member in the base of a temporary, lifetime of the temporary is not extended

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81420 --- Comment #3 from Marc Glisse --- Created attachment 44050 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44050&action=edit untested hackish patch This seems to help a bit, but it doesn't feel like the right approach.

[Bug c++/81420] When a reference is bound to a member in the base of a temporary, lifetime of the temporary is not extended

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81420 Marc Glisse changed: What|Removed |Added Last reconfirmed|2018-01-08 00:00:00 |2018-5-1 --- Comment #2 from Marc Glisse

[Bug target/85582] [9 Regression] wrong code at -O1 and above on x86_64-linux-gnu in 32-bit mode

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85582 Marc Glisse changed: What|Removed |Added Target||x86-*-* Status|UNCONFIRMED

[Bug tree-optimization/84362] [7/8/9 Regression] Auto-vectorization regression when accessing member variable through getter/accessor

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84362 Marc Glisse changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed|

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 78151, which changed state. Bug 78151 Summary: Fail to vectorize *min_element https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78151 What|Removed |Added

[Bug tree-optimization/78151] Fail to vectorize *min_element

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78151 Marc Glisse changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug libstdc++/85466] Performance is slow when doing 'branchless' conditional style math operations

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85466 --- Comment #21 from Marc Glisse --- (In reply to Daniel Elliott from comment #20) > still clang is 1.64x faster. had a look at the assembly. My limited > understanding makes me think that the ucomiss is not fully vectorized and > the clang one

[Bug libstdc++/85466] Performance is slow when doing 'branchless' conditional style math operations

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85466 --- Comment #19 from Marc Glisse --- For the "ifno" case, llvm turns (item>.5f)?1.:0. into (cheating on the syntax, we can't do bit_and on float in C) ((item>.5f)?mask:0.) & 1. where mask is all one bits, and this uses the SSE comparison
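
The mask-and-AND trick described in this comment can be written in portable scalar C by moving the float's bits through an integer. This is only a sketch of the idea, not what llvm emits: the function name `select_masked` is hypothetical, it uses `float` constants where the comment writes `1.` and `0.`, and real vectorized code would typically perform the comparison and the AND directly on SSE registers.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: computes (item > .5f) ? 1.f : 0.f without a branch
   on the result value, by ANDing a comparison mask with the bits of 1.0f. */
float select_masked(float item)
{
    /* All-ones when the comparison holds, all-zeros otherwise. */
    uint32_t mask = (item > 0.5f) ? UINT32_MAX : 0u;

    /* Reinterpret 1.0f as raw bits; memcpy avoids aliasing problems. */
    const float one = 1.0f;
    uint32_t one_bits;
    memcpy(&one_bits, &one, sizeof one_bits);

    /* mask & bits(1.0f) is either bits(1.0f) or all zeros, i.e. +0.0f. */
    uint32_t result_bits = mask & one_bits;

    float result;
    memcpy(&result, &result_bits, sizeof result);
    return result;
}
```

The AND works because the all-zero bit pattern is `+0.0f`, so no separate "else" value has to be selected.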

[Bug libstdc++/85466] Performance is slow when doing 'branchless' conditional style math operations

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85466 --- Comment #18 from Marc Glisse --- For the "if" case, llvm turns: if (myVector[n] > 0.5){ result[n] = 0.8f; } else { result[n] = 0.1f; } into const float tab[2] = { .8f, .1f }; result[n] = tab[item > .5f];
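
As a rough illustration of the two forms discussed in this comment, the sketch below puts the branchy loop next to a table-lookup version. The function names, the `float` comparison constant, and the loop wrappers are assumptions made for the example. The table here is ordered so that index 1 (comparison true) yields `0.8f`, which keeps both functions equivalent.

```c
#include <stddef.h>

/* Branchy form: one conditional store per element. */
void select_branchy(const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (in[i] > 0.5f) {
            out[i] = 0.8f;
        } else {
            out[i] = 0.1f;
        }
    }
}

/* Table form: the comparison result (0 or 1) indexes a two-entry table. */
void select_table(const float *in, float *out, size_t n)
{
    /* tab[0] is the "false" value, tab[1] the "true" value. */
    static const float tab[2] = { 0.1f, 0.8f };
    for (size_t i = 0; i < n; ++i) {
        out[i] = tab[in[i] > 0.5f];
    }
}
```

In C a relational expression has type `int` and value 0 or 1, so it can be used directly as the array index.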

[Bug libstdc++/85466] Performance is slow when doing 'branchless' conditional style math operations

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85466 --- Comment #12 from Marc Glisse --- Constant folding for nextafter seems like a useful thing to add, whatever we say about the rest of the testcase.

[Bug c++/85466] Performance is slow when doing 'branchless' conditional style math operations

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85466 --- Comment #1 from Marc Glisse --- Please always include your code in the bug report (this external website doesn't even seem to have a "download the code" option).

[Bug tree-optimization/85459] [8 Regression] Larger code generated from GMP template meta-programming

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85459 Marc Glisse changed: What|Removed |Added Attachment #43982|0 |1 is obsolete|

[Bug tree-optimization/85459] [8 Regression] Larger code generated from GMP template meta-programming

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85459 --- Comment #2 from Marc Glisse --- (In reply to Jakub Jelinek from comment #1) > I think this is a result of many changes. > E.g. r249885 bumps .s size from 3709 to 4599 bytes, r254724 from 4599 to > 5768, r255510 from 5772 to 7713. You are

[Bug tree-optimization/85459] New: [8 Regression] Larger code generated from GMP template meta-programming

Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: glisse at gcc dot gnu.org Target Milestone: --- Created attachment 43982 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43982&action=edit preprocessed testc

[Bug c++/63579] New attribute for empty member optimization

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63579 --- Comment #4 from Marc Glisse --- The following was adopted for C++20 http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0840r2.html ABI description (not merged yet) https://github.com/itanium-cxx-abi/cxx-abi/pull/50

[Bug target/85236] missing _mm256_atan2_ps

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85236 --- Comment #5 from Marc Glisse --- (In reply to bking from comment #4) > I understand that is a part of SVML, but doesn't that mean using the Intel > Compiler? Which means not using GCC. Is there not a plan to add it? Or is > that the intent

[Bug libstdc++/83860] [6/7/8 Regression] valarray replacement type breaks with auto and more than one operation

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83860 --- Comment #6 from Marc Glisse --- GMP's expression templates, which are based on libstdc++ valarray, have the same issue. I tried using values in GMP ( https://gmplib.org/list-archives/gmp-bugs/2014-January/003319.html ). I never committed it

[Bug c++/85236] missing _mm256_atan2_ps

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85236 --- Comment #2 from Marc Glisse --- This is part of SVML, not a basic intrinsic.

[Bug tree-optimization/85162] Vector extensions generating incorrect assembly

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85162 --- Comment #1 from Marc Glisse --- If you believe this is incorrect, you should be able to extend the testcase with an assert somewhere showing that the result is wrong. For vectors, as documented, comparisons return a vector of 0 (false) and
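
A small check of the kind suggested here might look like the sketch below. It relies on the GCC vector extension (`vector_size` attribute, also accepted by clang) and on the documented behaviour that an element-wise comparison yields 0 for false and -1 (all bits set) for true. The type name `v4si` and the function `greater_than` are chosen for the example and do not come from the bug report.

```c
/* Four 32-bit ints packed in a 16-byte vector (GCC vector extension). */
typedef int v4si __attribute__((vector_size(16)));

/* Element-wise comparison: each lane of the result is 0 (false) or -1 (true). */
v4si greater_than(v4si a, v4si b)
{
    return a > b;
}
```

Asserting on individual lanes of the result, for example `assert(m[1] == -1)`, is one way to turn a suspicion about the generated assembly into a testcase that either passes or demonstrates a wrong result.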
