[Bug c++/99429] ICE for bool return from <=>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99429

Mike Sharov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #50315|0                           |1
        is obsolete|                            |

--- Comment #1 from Mike Sharov ---
Created attachment 50316
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50316&action=edit
incorrect duration
[Bug c++/99429] New: ICE for bool return from <=>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99429

            Bug ID: 99429
           Summary: ICE for bool return from <=>
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: msharov at users dot sourceforge.net
  Target Milestone: ---

Created attachment 50315
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50315&action=edit
std::strong_ordering and incorrect duration code

When erroneously declaring <=> to return bool, g++ crashes:

> g++ -c -std=c++20 chrono.cc
chrono.cc: In instantiation of ‘class duration<1>’:
chrono.cc:44:42:   required from here
chrono.cc:38:20: internal compiler error: Segmentation fault
   38 |     constexpr bool operator<=> (const duration& d) const = default;
      |                    ^~~~
[Bug c/98404] New: Compiler emits unexpected function call that may cause security problems
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98404

            Bug ID: 98404
           Summary: Compiler emits unexpected function call that may cause
                    security problems
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: msharov at users dot sourceforge.net
  Target Milestone: ---

    int rotate_argv (const char** argv, int first, int mid, int end)
    {
        const char** p = argv+first;
        int n1 = mid-first;
        int n2 = end-mid;
        int nm = n1+n2-1;
        for (int j = 0; j < n2; ++j) {
            const char* v = p[nm];
            for (int i = 0; i < nm; ++i)
                p[nm-i] = p[nm-i-1];
            p[0] = v;
        }
        return n1;
    }

This bit of code unexpectedly emits a call to memmove to replace the inner
copy loop. Such behavior is highly inappropriate, breaking the
"what-you-see-is-what-you-get" spirit of C. Sure, the loop is equivalent to a
memmove call, but if I wanted to call memmove, I would have called memmove.
Doing it behind my back brings in code paths that may cause problems
impossible to understand by looking at the code.

Worse yet, the compiler only does this in the optimized build (-Os, -O2, and
-O3, but not -O1 or -O0), making debugging of the resulting problem a
beat-your-head-on-the-desk frustrating exercise. The bug in my code was
causing memory corruption in argv to happen in that inner loop, but looking at
the code above will not reveal the problem, no matter how much you scream at
the debugger.

The bug was in my memmove implementation returning the wrong value, which the
compiler then helpfully reloaded into p. Naturally, it's a good thing that I
fixed the bug; having never used the return value of memmove myself, I doubt I
would have discovered it anytime soon. But this illustrates how a malicious
exploit could be introduced into that loop without anybody being able to
figure it out. Let's remember that we still have that LD_PRELOAD abomination.

On a more mundane note, replacing the loop with memmove causes the compiled
code to grow from 107 bytes to 166. This is using the -Os switch, of course. I
have complained many times about how gcc doesn't care about size optimization
and doesn't inline stuff because it can't understand that inserting a function
call into code that currently has none has great costs of register saving and
all that. I have by now resigned myself to having to

    #define inline inline __attribute__((always_inline))

everywhere, but will you perhaps someday reconsider your position that size
optimization does not matter? If 55% code bloat in this example doesn't
convince you, what will?

Finally, calling memmove will make the code slower, not faster, due to its
much higher startup overhead, which is justifiable for copying large blocks,
but not for copying one or two elements, which is what the code above is made
for. The conceit of the compiler, in thinking it knows better, thus results in
a worse outcome all around: in size, speed, and security.
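For readers who hit the same problem, here is a minimal sketch of one way to keep the loop as written. It assumes GCC, where the loop-to-library-call replacement is controlled by -ftree-loop-distribute-patterns, and it assumes the per-function optimize attribute is honored for that option; the macro name NO_LOOP_TO_LIBCALL is made up for this illustration. Passing -fno-tree-loop-distribute-patterns on the command line is the whole-file equivalent.

```cpp
// Hypothetical helper macro (not from the report). On GCC it asks the
// optimizer not to turn copy loops in the marked function into
// memmove/memcpy/memset calls. Other compilers get an empty macro.
#if defined(__GNUC__) && !defined(__clang__)
#define NO_LOOP_TO_LIBCALL \
    __attribute__((optimize("no-tree-loop-distribute-patterns")))
#else
#define NO_LOOP_TO_LIBCALL
#endif

// Same function as in the report, only with the attribute applied.
NO_LOOP_TO_LIBCALL
int rotate_argv (const char** argv, int first, int mid, int end)
{
    const char** p = argv+first;
    int n1 = mid-first;
    int n2 = end-mid;
    int nm = n1+n2-1;
    for (int j = 0; j < n2; ++j) {
        const char* v = p[nm];          // save the last element
        for (int i = 0; i < nm; ++i)    // shift everything right by one
            p[nm-i] = p[nm-i-1];
        p[0] = v;                       // and put the saved one in front
    }
    return n1;
}
```

Whether the attribute actually changes the generated code depends on the GCC version and flags, so it is worth checking the assembly output rather than taking this on faith.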
[Bug tree-optimization/93896] New: Store merging uses SSE only for trivial types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93896

            Bug ID: 93896
           Summary: Store merging uses SSE only for trivial types
           Product: gcc
           Version: 9.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: msharov at users dot sourceforge.net
  Target Milestone: ---

    struct M {
        constexpr M() :p{},sz{},cap{}{}
    public:
        char* p;
        unsigned sz;
        unsigned cap;
    };
    struct A { M a,b,c; A(); };
    A::A() :a{},b{},c{}{}

gcc 9.2.1 with -march=native -Os on Haswell generates:

    _ZN1AC2Ev:
        movq    $0, (%rdi)
        movq    $0, 8(%rdi)
        movq    $0, 16(%rdi)
        movq    $0, 24(%rdi)
        movq    $0, 32(%rdi)
        movq    $0, 40(%rdi)
        ret

Store merging is obviously working here, but does not use SSE movups. If the
constructor is removed or defaulted the output is:

    _ZN1AC2Ev:
        vpxor   %xmm0, %xmm0, %xmm0
        vmovups %xmm0, (%rdi)
        vmovups %xmm0, 16(%rdi)
        vmovups %xmm0, 32(%rdi)
        ret

Whether the type is trivial should not matter by the time store merging
occurs, but for some reason it does.
[Bug rtl-optimization/91482] New: __builtin_assume_aligned should not break write combining
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91482

            Bug ID: 91482
           Summary: __builtin_assume_aligned should not break write
                    combining
           Product: gcc
           Version: 9.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: msharov at users dot sourceforge.net
  Target Milestone: ---

    void write64 (void* p)
    {
        unsigned* p1 = (unsigned*) __builtin_assume_aligned (p, 8);
        *p1++ = 0;
        unsigned* p2 = (unsigned*) __builtin_assume_aligned (p1, 4);
        *p2++ = 1;
    }

When the two stores are written without __builtin_assume_aligned, they are
coalesced into a single movq store. The code above, however, results in two
movl stores, even though the new information provided by
__builtin_assume_aligned does not prevent combination.
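To make the comparison in the report easy to reproduce, the sketch below puts the hinted version next to a plain baseline. The function names write64_hinted and write64_plain are made up for this illustration, and it assumes a GCC-compatible compiler (for __builtin_assume_aligned) with a 4-byte unsigned. Both functions store the same two values, so any difference in the output of g++ -O2 -S comes from the alignment hints alone.

```cpp
// As in the report: each pointer goes through __builtin_assume_aligned.
void write64_hinted (void* p)
{
    unsigned* p1 = (unsigned*) __builtin_assume_aligned (p, 8);
    *p1++ = 0;
    unsigned* p2 = (unsigned*) __builtin_assume_aligned (p1, 4);
    *p2++ = 1;
}

// Baseline without any hints; per the report, these two stores are
// the ones that get coalesced into a single 8-byte store.
void write64_plain (void* p)
{
    unsigned* p1 = (unsigned*) p;
    *p1++ = 0;
    *p1++ = 1;
}
```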
[Bug c++/85875] New: -Weffc++ can't understand auto return values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85875

            Bug ID: 85875
           Summary: -Weffc++ can't understand auto return values
           Product: gcc
           Version: 8.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: msharov at users dot sourceforge.net
  Target Milestone: ---

    struct C {
        struct const_iterator {
            auto& operator++() { return *this; }
        };
    };

Compiling with -Weffc++ gives the warning:

    t.cc:3:24: warning: prefix ‘auto& C::const_iterator::operator++()’ should
    return ‘C::const_iterator&’ [-Weffc++]

even though auto& evaluates to C::const_iterator&.
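The claim that auto& deduces to C::const_iterator& can be checked at compile time. The following is a small sketch (C++14 or later) that repeats the struct from the report and adds a static_assert; only the assertion is new.

```cpp
#include <type_traits>
#include <utility>

struct C {
    struct const_iterator {
        auto& operator++() { return *this; }
    };
};

// The deduced return type of the prefix increment is exactly
// C::const_iterator&, which is what the warning asks for.
static_assert (std::is_same<
                   decltype (++std::declval<C::const_iterator&>()),
                   C::const_iterator&>::value,
               "operator++ returns C::const_iterator&");
```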
[Bug tree-optimization/85697] At -Os nontrivial ctor does not use SSE to zero
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85697 --- Comment #2 from Mike Sharov --- I previously filed bug #49127 about the non-SSE version of the same xor/mov optimization. Perhaps both could be addressed in the same manner with a more general capability of zeroing with a register when doing so is shorter.
[Bug c++/85858] -Weffc++ should not require copy ctor for const pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85858

--- Comment #12 from Mike Sharov ---
(In reply to Jonathan Wakely from comment #10)
> It's simply not how C++ works.

Quite right. I already agreed with you here; we are arguing about whether it
/should/ work this way :)

> An object's lifetime is distinct from it's constness, and a pointer-to-const
> doesn't imply anything about whether the pointed-to object is immutable.

Exactly! I can restate my gripe in these terms: C++ has no way of explicitly
marking the owner of the object or its lifetime. When f() creates object
const A a and passes it as const A* to g(), both f() and g() see the same
const A object, but f() is the owner and should be allowed to delete it, while
g() has only been granted read-only access and should not. If delete required
a non-const pointer, then f() would either keep a non-const pointer to
indicate that it owns a, or have to explicitly const_cast it to delete.

> You seem to be saying that a pointer-to-const implies
> an immortal object that will never be destroyed.

Not at all. Object lifetime is a separate subject, but const correctness
should help enforce it by restricting who gets to set it. Ideally, the object
will have exactly one owner (insert rant on the evils of shared_ptr), and that
owner will determine the lifetime of the object. If const prevented delete,
the compiler could help you catch violations of the one-owner rule that may
compromise defined object lifetime and cause undefined behavior in functions
that hold pointers to that object. A function can only assume that the pointer
it was given remains valid if the object lifetime is explicitly known, and
there is no explicit C++ way of making it known. We can only define the
lifetime in documentation. For example:

> Why should that be true for pointers to the heap
> but not pointers to the stack?

Because the stack frees all owned objects when the scope is exited and the
heap does not. The stack will call destructors to clean up the objects, the
heap will not. Consequently the stack can be said to be the owner of local
objects, but the heap owns nothing because it destroys nothing.
[Bug c++/85858] -Weffc++ should not require copy ctor for const pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85858

--- Comment #8 from Mike Sharov ---
(In reply to Jonathan Wakely from comment #7)
> Your mental model of C++ is simply not how the language works.

My mental model here is actually of const correctness, not C++ specifically.
When I pass around a const object I expect it to stay unmodified. Consider a
function that takes a const T* argument. The signature suggests that the
passed-in object will only be read and will not be modified. If that function
deletes the pointer, "bad things" will likely happen. Suddenly the object
contains garbage and you can't figure out how it happened. All called
functions took it as const, so they shouldn't have messed with it. You assume
memory corruption and onto valgrind we go.

> The const qualifier affects modification of the object, not its destruction.

This is precisely the part I have a problem with. It seems downright loony to
state that destruction is not modification when the object is most definitely
modified by it. It's like saying that if I break into your house and take
stuff, I am a criminal, but if I burn it down, it's all perfectly fine.

> void f() { const int i = 0; }
>
> Do you think this stack variable can't be destroyed, because it's const?

What it boils down to is this: const restricts access, and so prevents
ownership. If you can't destroy a thing, you don't really own it. The standard
appears to have taken the position that ownership beats const correctness. I
instead argue in favor of const correctness, and its guarantee of invariance.

The stack variable in the example illustrates the difference between access
and ownership. f() has read-only access to i, and therefore does not own it.
Who owns i? The stack does. The stack passes i to f() with limited access, and
then, when f() terminates, the stack destroys i. This way ownership is clearly
delineated. If f() were to say delete i (assuming i were a pointer), it should
be prevented from doing so.

> using const_int = const int;
> const int* p = new const_int();
>
> Do you expect to never be able to delete this object,
> instead being forced to leak it?

Consequently, const objects created with new are owned by nobody, and simply
do not make sense. Somebody has to own the allocated object, so creating a
const object should be an invalid operation.

I suppose it doesn't really matter what my opinion is in this matter. Neither
I nor you write the standard, so I'll just leave this as a closing footnote in
a bug correctly resolved INVALID.
[Bug c++/85858] -Weffc++ should not require copy ctor for const pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85858

--- Comment #6 from Mike Sharov ---
(In reply to Jonathan Wakely from comment #5)
> Nope, see the C++ standard:
>
> [ Note: A pointer to a const type can be the operand of a
> delete-expression;

Ok, I guess; you have to follow the standard, after all. But I would like to
see the rationale for this, because it sure looks like a violation of const
correctness. I certainly feel violated.
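The standard's note quoted above can be demonstrated with a short sketch. The names Tracked, destroy, and make_owned are made up for this illustration. It shows that delete through a pointer-to-const compiles and runs the destructor, and that ownership, which this thread argues const should express, can instead be stated in the type with std::unique_ptr<const T>.

```cpp
#include <memory>

// Hypothetical type that counts destructor calls so the effect is visible.
struct Tracked {
    explicit Tracked (int* counter) : _counter (counter) {}
    ~Tracked() { ++*_counter; }
    int* _counter;
};

// Accepted by the language: the operand of delete may point to const.
inline void destroy (const Tracked* p) { delete p; }

// One way to mark the owner explicitly: the unique_ptr owns the object,
// while everyone else only ever sees a const Tracked*.
inline std::unique_ptr<const Tracked> make_owned (int* counter)
{
    return std::unique_ptr<const Tracked> (new Tracked (counter));
}
```

With this arrangement a function that receives a raw const Tracked* is, by convention rather than by compiler enforcement, a non-owner.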
[Bug c++/85858] -Weffc++ should not require copy ctor for const pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85858

--- Comment #4 from Mike Sharov ---
(In reply to Jonathan Wakely from comment #3)
> Nothing stops you deallocating a const pointer.

According to http://en.cppreference.com/w/cpp/memory/new/operator_delete
the delete operator takes a void*, and attempting to delete a const pointer
would require a const_cast. This is logical, since freeing a memory block is a
modification operation that changes the block's contents by marking it
invalid.

To my surprise, I found that g++ actually does currently accept delete of a
const pointer. I believe that should be a bug.
[Bug c++/85858] -Weffc++ should not require copy ctor for const pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85858

--- Comment #2 from Mike Sharov ---
(In reply to Jonathan Wakely from comment #1)
> (In reply to Mike Sharov from comment #0)
> > When the pointer is const, it can not point to owned memory
> Why not?

Because a const pointer can not be freed. By "owned memory" I mean memory that
was explicitly allocated by the object, which I assume was the situation that
Effective C++ rule was referring to, or memory the ownership of which was
passed to the object. In both cases the object has to keep a non-const pointer
in order to be able to free it or to pass on the ability to free it to some
other object. I can't think of any case for an owned const pointer; can you?
[Bug c++/85858] New: -Weffc++ should not require copy ctor for const pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85858

            Bug ID: 85858
           Summary: -Weffc++ should not require copy ctor for const
                    pointers
           Product: gcc
           Version: 8.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: msharov at users dot sourceforge.net
  Target Milestone: ---

-Weffc++ warns about missing operator= and copy ctor in a class containing a
const pointer. The intent of the warning is to detect manually allocated
memory owned by the class, and to ensure that the copying operation was
explicitly considered. When the pointer is const, it can not point to owned
memory and so should not result in a warning.
[Bug c++/85856] New: -Weffc++ can't see implicitly deleted constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85856

            Bug ID: 85856
           Summary: -Weffc++ can't see implicitly deleted constructor
           Product: gcc
           Version: 8.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: msharov at users dot sourceforge.net
  Target Milestone: ---

    #include <stdio.h>

    struct A {
        A (void) {}
        virtual ~A (void) {}
        A (const A&) = delete;
        void operator= (const A&) = delete;
    };

    struct B : public A {
        B (const char* p) :A(),_p(p) {}
        const char* _p;
    };

    int main (void)
    {
        B b ("hello");
        puts (b._p);
        return 0;
    }

When compiled with -Weffc++ enabled, this generates:

    t.cc:10:8: warning: ‘struct B’ has pointer data members [-Weffc++]
     struct B : public A {
            ^
    t.cc:10:8: warning:   but does not override ‘B(const B&)’ [-Weffc++]
    t.cc:10:8: warning:   or ‘operator=(const B&)’ [-Weffc++]

B already has an implicitly deleted copy constructor and operator= because A
declares them deleted. The compiler will correctly give an error about it on
B b2=b, for example.
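That B is already non-copyable can be confirmed without triggering a compile error, using the standard type traits. The sketch below repeats A and B from the report; only the two static_asserts are added.

```cpp
#include <type_traits>

struct A {
    A (void) {}
    virtual ~A (void) {}
    A (const A&) = delete;
    void operator= (const A&) = delete;
};

struct B : public A {
    B (const char* p) :A(),_p(p) {}
    const char* _p;
};

// B declares no copy operations of its own, so the implicit ones are
// defined as deleted because the base class versions are deleted.
static_assert (!std::is_copy_constructible<B>::value,
               "B's copy constructor is implicitly deleted");
static_assert (!std::is_copy_assignable<B>::value,
               "B's operator= is implicitly deleted");
```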
[Bug c/80354] Poor support to silence -Wformat-truncation=1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80354

--- Comment #9 from Mike Sharov ---
(In reply to Martin Sebor from comment #8)
> A simple way to avoid the warning while also avoiding bugs resulting from
> unhandled truncation is to detect it and abort if it happens, e.g.

First of all, you might want to mention this in the error message. The way it
is presently worded gives the impression that the only way to remove the
warning is to increase the buffer size. I guarantee you that most people will
just turn off the warning in this case. And then come here to complain,
because the kind of warning that is wrong in most cases (if only in our
opinion) should not be in -Wall.

Secondly, this is precisely the annoying part about it: you are making the
decision that allowing truncation to happen is always a bug and forcing it to
be handled as one. I do not consider it a problem to pass a truncated filename
to open and have it fail there. There are, naturally, some cases where this
could cause a security problem, but I am the one who should determine whether
each particular snprintf is one of those cases, and consequently I should also
have the option to tell the compiler that it is not. If I were ok with
bloating my program due to an excessive concern with safety, I'd be using
Java, not C.
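The example from comment #8 is elided in the quote above. For context, here is a rough sketch of what detecting truncation looks like; it is not the code from that comment. The function name format_path and the "%s/%s" format are assumptions for this illustration, and the caller decides what to do on truncation (abort, fail, or carry on).

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper: builds "dir/name" in buf and reports whether the
// whole string fit. snprintf returns the length the full output would
// have had, so a value >= bufsz means the result was truncated.
inline bool format_path (char* buf, std::size_t bufsz,
                         const char* dir, const char* name)
{
    const int n = std::snprintf (buf, bufsz, "%s/%s", dir, name);
    return n >= 0 && static_cast<std::size_t>(n) < bufsz;
}
```

Whether a given check like this silences -Wformat-truncation depends on the warning level and on what the optimizer can prove, so treat it as a starting point.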
[Bug c/80354] Poor support to silence -Wformat-truncation=1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80354

Mike Sharov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |msharov at users dot
                   |                            |sourceforge.net

--- Comment #7 from Mike Sharov ---
I really do have to add my complaint about this one. Can't we have another
override option here? Have the compiler parse "truncates" in a comment, for
example, like it does for fallthrough.

Doing format precision is not a good workaround because it hardcodes the size
of the buffer into the format string, creating a maintenance problem in case
the buffer size is increased later. Not to mention unnecessarily creating
multiple format strings where previously a single one could have been shared.
Why make us all create unnecessarily larger executables?

Worse, truncation is always going to be a false positive here. Nobody wants to
choose buffer size based on worst case output. Sometimes it is merely useless,
such as when writing diagnostic messages. 8k of text won't fit in a message
box anyway and will be truncated. Other times it is distinctly wrong. For
example, if building a path from multiple components in PATH_MAX sized
buffers, the result must not be larger than PATH_MAX anyway, and must be
truncated. Another example is when you are trying to get a prefix from a large
string. snprintf is a great way of doing that, but your warning may now lead
people to rewrite the code with strncpy and its insecure behavior, possibly
forgetting that it always requires explicitly terminating the buffer.

Sure, it is just another warning to fix. I've had to fix some new warning with
every gcc release. Not a single one of them was an actual problem with the
code. It's always just "the way we've got to do things from now on", having to
write each code construct in a particular way to avoid a warning. A 100% false
positive rate is annoying, isn't it? Yet, I keep all warnings on, for some
strange reason. Can't we all be friends and always have a polite way of saying
"I know what I am doing here"?
[Bug target/85697] New: At -Os nontrivial ctor does not use SSE to zero
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85697

            Bug ID: 85697
           Summary: At -Os nontrivial ctor does not use SSE to zero
           Product: gcc
           Version: 8.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: msharov at users dot sourceforge.net
  Target Milestone: ---

    struct alignas(16) A {
        A (void) :a(0),b(0),c(0),d(0) {}
        int a,b,c,d;
    };
    __attribute__((noinline)) void UseA (A& a) { a.a=1; }
    int main (void)
    {
        A a {};
        UseA (a);
        return a.a;
    }

-Os -march=native on Haswell generates:

    main:
        subq    $16, %rsp
        movq    %rsp, %rdi
        movq    $0, (%rsp)
        movq    $0, 8(%rsp)
        call    _Z4UseAR1A
        movl    (%rsp), %eax
        addq    $16, %rsp
        ret

using 16 bytes to zero A with 2 movq. With -O3:

    main:
        subq    $24, %rsp
        vpxor   %xmm0, %xmm0, %xmm0
        movq    %rsp, %rdi
        vmovaps %xmm0, (%rsp)
        call    _Z4UseAR1A
        movl    (%rsp), %eax
        addq    $24, %rsp
        ret

using only 9 bytes for pxor/movaps. With -mno-avx it is 7 bytes for
xorps/movaps. With multiple objects of type A, the savings would be even
greater, since only one pxor would be needed for all and only 4 bytes per
object for zeroing.

Removing A's constructor also results in SSE instruction use.
[Bug c++/85695] New: if constexpr misevaluates typedefed type value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85695

            Bug ID: 85695
           Summary: if constexpr misevaluates typedefed type value
           Product: gcc
           Version: 8.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: msharov at users dot sourceforge.net
  Target Milestone: ---

    template <typename T, T v>
    struct integral_constant {
        using value_type = T;
        static constexpr const value_type value = v;
        constexpr operator value_type (void) const { return value; }
    };
    template <typename T>
    struct is_trivial : public integral_constant<bool, __is_trivial(T)> {};

    template <typename T>
    T clone_object (const T& p)
    {
        if constexpr (is_trivial<T>::value)
            return p;
        else
            return p.clone();
    }
    int main (void)
    {
        return clone_object(0);
    }

This fails to compile: "error: request for member ‘clone’ in ‘p’". The strange
part is that changing the type of integral_constant::value to T makes it work,
as does using is_trivial<T>() in the conditional, invoking the cast operator.
For some reason, value_type is evaluated differently if it is a variable or
return value, and differently from T.
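The workaround mentioned in the report, using the conversion operator in the condition, can be sketched as follows (C++17). The namespace sketch and the Widget type are made up for this illustration, and __is_trivial is the compiler builtin used in the report (available in GCC and Clang). The traits are the report's own, only wrapped in a namespace to avoid clashing with the standard library names.

```cpp
namespace sketch {

template <typename T, T v>
struct integral_constant {
    using value_type = T;
    static constexpr const value_type value = v;
    constexpr operator value_type (void) const { return value; }
};

template <typename T>
struct is_trivial : public integral_constant<bool, __is_trivial(T)> {};

template <typename T>
T clone_object (const T& p)
{
    // Condition goes through the constexpr conversion operator, which
    // is the form the report says compiles on the affected version.
    if constexpr (is_trivial<T>())
        return p;
    else
        return p.clone();
}

// Hypothetical non-trivial type: the user-provided copy constructor
// makes __is_trivial(Widget) false, so clone() is used for it.
struct Widget {
    int id;
    bool cloned;
    explicit Widget (int i) : id (i), cloned (false) {}
    Widget (const Widget& o) : id (o.id), cloned (o.cloned) {}
    Widget clone (void) const { Widget w (id); w.cloned = true; return w; }
};

} // namespace sketch
```

Calling sketch::clone_object(0) takes the copy branch, while passing a Widget takes the clone() branch; only the selected branch is instantiated for each type.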
[Bug c++/85689] New: if constexpr compiles false branch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85689

            Bug ID: 85689
           Summary: if constexpr compiles false branch
           Product: gcc
           Version: 8.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: msharov at users dot sourceforge.net
  Target Milestone: ---

    int main (void)
    {
        if constexpr (false)
            static_assert (false, "this should not be compiled");
        return 0;
    }

g++ 8.1 fails compiling the branch with the static_assert even though the
if constexpr condition is false. May be the same as #85149, but still present
in g++ 8.1.0 on Arch.
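For context, C++17 only skips instantiation of the discarded branch of if constexpr inside a template, and only for code that depends on a template parameter; outside a template both branches are fully checked, so a static_assert (false, ...) fires regardless. A common way to get the intended effect is to make the assertion dependent. The sketch below shows that idiom; the names dependent_false and small_size are made up for this illustration.

```cpp
// Hypothetical helper: always false, but dependent on T, so the
// assertion is evaluated only when the branch is actually instantiated.
template <typename T>
struct dependent_false { static constexpr bool value = false; };

// Returns sizeof(T) for types of up to 8 bytes, and fails to compile
// with a readable message for anything larger.
template <typename T>
constexpr unsigned small_size (const T&)
{
    if constexpr (sizeof(T) <= 8)
        return sizeof(T);
    else {
        static_assert (dependent_false<T>::value,
                       "small_size: type is larger than 8 bytes");
        return 0;
    }
}
```

Instantiating small_size with an int compiles cleanly because the else branch is discarded; instantiating it with a 16-byte struct triggers the static_assert.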
[Bug target/59578] New: Overuse of v prefix for SSE instructions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59578

            Bug ID: 59578
           Summary: Overuse of v prefix for SSE instructions
           Product: gcc
           Version: 4.8.2
            Status: UNCONFIRMED
          Severity: minor
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: msharov at users dot sourceforge.net

    typedef float v16sf __attribute__((vector_size(16)));
    v16sf f (v16sf x)
    {
        return (__builtin_ia32_shufps (x, x, 0xff));
    }

Compiled on a Haswell 4770 with -march=native -O, this emits:

    vshufps $255, %xmm0, %xmm0, %xmm0

even though all registers are the same and

    shufps  $255, %xmm0, %xmm0

would have worked just as well without the extra byte for the v prefix. This
happens with other __builtin instructions as well. For example:

    typedef long long v16so __attribute__((vector_size(16)));
    v16so k (v16so x)
    {
        return (__builtin_ia32_aeskeygenassist128 (x, 1));
    }

emits vaeskeygenassist even though no memory accesses are present.
[Bug target/57288] cfi_restore should precede cfi_def_cfa_offset
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57288

--- Comment #2 from Mike Sharov msharov at users dot sourceforge.net ---
(In reply to Andrew Pinski from comment #1)
> Can you attach the preprocessed source which is used to create this assembly
> file?

I'm afraid not. This call has been created by a gigantic collection of
templates, macros, and inline functions, so it is too large to attach.
Furthermore, when compiled with the current gcc 4.8.2, the .cfi directives are
entirely different, with no .cfi_restore instructions emitted. If you really
can't figure out what the cause was, I'd have to wait until I see another
function showing the behavior.
[Bug rtl-optimization/23684] Combine stores for non strict alignment targets
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23684

--- Comment #12 from msharov at users dot sourceforge.net ---
I'd like to add that this is not some corner case; this is a very common
issue. In my own projects, the compiler's inability to combine stores is the
single largest reason for using inline assembly and raw casts. Pretty much
every time I have an object 8 or 16 bytes in size, I end up writing a zeroing
ctor, copy ctor, and operator= that use full-object memory access. That's a
cast to uint64_t for 8 bytes, and movups/movaps for 16 bytes. It also shows up
when writing raw protocol data, such as X calls, where it is very common to
write several constants in succession.

The last time I checked, forcing whole-object moves in these cases resulted in
a projectwide code size reduction of ~10%. Unfortunately, it also causes a
variety of aliasing pessimizations, so I also have to test including or not
including each of the above functions to get the smallest code size. It would
be a very big deal if the optimizer could do this.
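As an illustration of the whole-object zeroing described above, here is a minimal sketch for an 8-byte object. It is not the author's code: the Point type is made up, and std::memcpy is swapped in for the raw cast to uint64_t because it avoids the aliasing questions a cast raises while typically still compiling to a single 8-byte store at -O1 and above.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical 8-byte object with a zeroing constructor.
struct Point {
    std::int32_t x, y;
    Point (void);
};

Point::Point (void)
{
    static_assert (sizeof(Point) == sizeof(std::uint64_t),
                   "Point must be exactly 8 bytes");
    // One 8-byte copy covering both members, instead of two 4-byte
    // stores that rely on the optimizer to merge them.
    const std::uint64_t zero = 0;
    std::memcpy (this, &zero, sizeof(zero));
}
```

This only works for trivially copyable types with no padding surprises, which is why the size is asserted.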
[Bug rtl-optimization/57302] New: Should merge zeroing multiple consecutive memory locations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57302

            Bug ID: 57302
           Summary: Should merge zeroing multiple consecutive memory
                    locations
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: msharov at users dot sourceforge.net

    struct A {
        short a,b;
        A (void);
    };
    A::A (void) : a(0),b(0) {}
    void MoveA (const A* a, A* b) { *b = *a; }

Generates:

    _ZN1AC2Ev:
        movw    $0, (%rdi)
        movw    $0, 2(%rdi)
        ret
    _Z5MoveAPK1APS_:
        movl    (%rdi), %eax
        movl    %eax, (%rsi)
        ret

The optimizer can see that a and b are consecutive in memory and can merge the
memory movs into a single 4-byte mov, but does not do the same for the zeroing
code in the constructor. Merging the zeroing to movl, movq, and mov[au]ps
(when SSE is available) would produce smaller code and fewer memory accesses.
[Bug target/57288] New: cfi_restore should precede cfi_def_cfa_offset
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57288

            Bug ID: 57288
           Summary: cfi_restore should precede cfi_def_cfa_offset
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: trivial
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: msharov at users dot sourceforge.net

Created attachment 30122
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30122&action=edit
The emitted assembly exhibiting the ordering problem

This is on x86_64, compiled with -Os. In the attached assembly, line 89, .L55,
.cfi_restore is emitted for ebx and ebp after .cfi_def_cfa_offset 8 already
invalidated the location where they were stored. cfa_offset should be emitted
after the cfi_restores, as it was in the other codepaths like .LEHE0-.L51.
[Bug rtl-optimization/56598] New: Optimizer can't invert conditional when inlining a bool function
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56598

             Bug #: 56598
           Summary: Optimizer can't invert conditional when inlining a
                    bool function
    Classification: Unclassified
           Product: gcc
           Version: 4.7.2
            Status: UNCONFIRMED
          Severity: minor
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: msha...@users.sourceforge.net

Created attachment 29640
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29640
Simple test case and its asm

When a bool macro like

    #define X(i) (i==3)

is replaced with a static inline bool function, the optimizer sometimes
generates tangled control flow. With code blocks ABCD the jumps go A-C-B-D. If
the macro is used, the compiler can invert the conditional and emit ACBD with
two fewer jmps.

In the attached test case, func1 has a .L3-.L4 block that is jumped into, out
of, and over. Swapping the .L2-.L3 and .L3-.L4 blocks would produce the
simpler control flow in func2.

The test case is compiled with -Os, on x86_64. -O2 and -O3 also produce the
same behavior, but require a larger test case to avoid path unrolling. The C++
compiler must be used. The same test case compiled as C produces identical
func1 and func2.
[Bug c++/56583] New: ICE with constexpr ctor and nested structs and unions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56583

             Bug #: 56583
           Summary: ICE with constexpr ctor and nested structs and unions
    Classification: Unclassified
           Product: gcc
           Version: 4.7.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: msha...@users.sourceforge.net

Created attachment 29632
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29632
The code causing the failure

An ICE occurs in a constexpr constructor that initializes a member of an
anonymous member union containing an anonymous struct. See the attached file.
Compiling with g++ -std=c++11 -c tes.cc yields:

    tes.cc: In function 'int main()':
    tes.cc:23:21:   in constexpr expansion of 'r.CRect::CRect(1, 2, 3, 4)'
    tes.cc:23:21: internal compiler error: in base_field_constructor_elt, at
    cp/semantics.c:7033
    Please submit a full bug report,
    with preprocessed source if appropriate.
[Bug libgcc/56277] New: libgcc.a and libgcc_eh.a should be compiled with function-level linking
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56277

             Bug #: 56277
           Summary: libgcc.a and libgcc_eh.a should be compiled with
                    function-level linking
    Classification: Unclassified
           Product: gcc
           Version: 4.7.2
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: libgcc
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: msha...@users.sourceforge.net

libgcc.a and libgcc_eh.a are not compiled with -ffunction-sections
-fdata-sections. libsupc++.a is compiled with those flags. Because a typical
program will not use much of libgcc, enabling function-level linking should
noticeably reduce the size of statically linked executables.
[Bug libgcc/56277] libgcc.a and libgcc_eh.a should be compiled with function-level linking
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56277

--- Comment #3 from msharov at users dot sourceforge.net 2013-02-11 00:33:03 UTC ---
(In reply to comment #2)
> Do you think you have an example where adding -ffunction-sections can help
> for the compiled libgcc.a ?

No, sorry. Checking this would require manually recompiling libgcc with
-ffunction-sections and trying it out on my projects. Since I haven't compiled
gcc in quite a long time and have forgotten all the tricks for doing it, this
will be a lot of work. So, if you say you already do all you can to minimize
what is linked in from there, I'll take your word for it.
[Bug c++/53380] .ehframe could be smaller
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53380

--- Comment #3 from msharov at users dot sourceforge.net 2012-05-22 11:21:41 UTC ---
> Did -fno-asynchronous-unwind-tables do what you wanted it to do? In that
> disable the unwinding tables when not using exceptions?

No, it did not. For example:

    #include <stdio.h>
    int calculate (int x, int y) { return (x * y); }
    void print (void) { printf ("%d", calculate(1,2)); }
    int main (void) { print(); return (0); }

    g++ -Os -c -fno-asynchronous-unwind-tables -o tes.o tes.cc
    readelf --debug-dump=frames tes.o

    Contents of the .eh_frame section:

    0014 CIE
      Version: 1
      Augmentation: zR
      Code alignment factor: 1
      Data alignment factor: -8
      Return address column: 16
      Augmentation data: 1b

      DW_CFA_def_cfa: r7 (rsp) ofs 8
      DW_CFA_offset: r16 (rip) at cfa-8
      DW_CFA_nop
      DW_CFA_nop

    0018 0010 001c FDE cie= pc=..0006
      DW_CFA_nop
      DW_CFA_nop
      DW_CFA_nop

    002c 0010 0030 FDE cie= pc=0006..0017
      DW_CFA_nop
      DW_CFA_nop
      DW_CFA_nop

    0040 0014 0044 FDE cie= pc=..000a
      DW_CFA_advance_loc: 1 to 0001
      DW_CFA_def_cfa_offset: 16
      DW_CFA_advance_loc: 8 to 0009
      DW_CFA_def_cfa_offset: 8
      DW_CFA_nop

As you can see, all three functions still have unwind entries emitted.
According to documentation I saw on the web, -fno-asynchronous-unwind-tables
increases unwind information granularity to function-level, meaning that it
supposedly avoids stepping cfa unless there is a function call there. While I
don't regularly read the unwind tables, I was under the impression that this
was happening by default.
[Bug c++/53380] .ehframe could be smaller
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53380

--- Comment #5 from msharov at users dot sourceforge.net 2012-05-22 18:53:32 UTC ---

(In reply to comment #4)
> Adding default handling if there is no FDE is an ABI change, so can't be done
> on existing architectures (except those that have that in their ABI already).

I was not suggesting doing it by default. A switch like -fno-asynchronous-unwind-tables would be perfectly acceptable.

> Why do you care about the .eh_frame size so much?

I don't care so much. I was merely suggesting a way of making it smaller, for the use of the same people who prefer to use -Os instead of -O3. Yes, I probably could write a post-link tool to do this, but it would be much more work since I would have to implement a disassembler, etc. to find out which functions do not need unwind info. In the compiler you already have all that info in the parse tree, so it is just a matter of adding a couple of if statements.

> If you don't use the unwind info, it often won't be even paged in, or can be
> discarded from RAM if needed.

Removing unnecessary entries would make lookup faster. Making .eh_frame smaller would also help the exception path by having less to page in.

> And if you need it, it better be accurate.

It would still be accurate. Making the common case default does not remove any useful information.
[Bug c++/53380] New: .ehframe could be smaller
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53380

Bug #: 53380
Summary: .ehframe could be smaller
Classification: Unclassified
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: msha...@users.sourceforge.net

The unwinding information can take up a significant (10%+) chunk of a C++ executable, and reducing its size can help make executables smaller. On x86_64 in particular, the need to include .eh_frame erases the code size reductions gained when porting 32-bit code, and gives the incorrect impression that 64-bit code is larger.

Fortunately, it is possible to make the unwinding table smaller. The entries for functions without local variables are basically empty; the information they contain is largely the offset and size of the function body. If this type of entry were made the default action, all of them could be removed. Any function not found in the table would then be assumed to have the return address on top of the stack. Validity checking can still be performed by verifying that the return address is in .text.

Another optimization is to remove entries for functions that do not throw; i.e. those declared as noexcept, or those that call no functions other than noexcept ones.

Just doing these two things should remove more than half the table in a typical executable. Of course, unwinding information is also used by the debugger and backtrace, so the table should be left alone by default. It would be nice to have the option to turn these optimizations on explicitly.
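The fallback proposed in this report can be sketched as a table lookup with a default. This is only an illustration of the idea, not how libgcc's unwinder works: `UnwindEntry`, `TextRange`, `default_cfa_offset` and `cfa_offset_for` are hypothetical names, a real FDE carries a CFA program instead of a single offset, and the default of 8 assumes x86_64, where the return address sits on top of the stack at function entry. The sketch needs C++17 for `std::optional`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-in for an FDE: it covers [begin, end)
// and records the CFA offset from the stack pointer as one number.
struct UnwindEntry {
    std::uintptr_t begin;
    std::uintptr_t end;
    std::intptr_t  cfa_offset;
};

// Hypothetical bounds of .text, used for the validity check.
struct TextRange {
    std::uintptr_t begin;
    std::uintptr_t end;
};

// Assumed default rule: return address on top of the stack,
// i.e. CFA = rsp + 8 on x86_64.
constexpr std::intptr_t default_cfa_offset = 8;

// Returns the CFA offset to use for pc. The table must be sorted by begin.
// A pc outside .text is rejected; a pc inside .text with no table entry
// gets the default rule instead of an error.
std::optional<std::intptr_t> cfa_offset_for (const std::vector<UnwindEntry>& table,
                                             TextRange text, std::uintptr_t pc)
{
    if (pc < text.begin || pc >= text.end)
        return std::nullopt;    // validity check: not code
    // First entry that starts after pc; the candidate is the one before it.
    auto it = std::upper_bound (table.begin(), table.end(), pc,
        [](std::uintptr_t addr, const UnwindEntry& e) { return addr < e.begin; });
    if (it != table.begin() && pc < std::prev(it)->end)
        return std::prev(it)->cfa_offset;
    return default_cfa_offset;  // no entry: assume the common case
}
```

With a table holding entries only for functions that adjust the stack, every other address in .text resolves to the default, which is what would let the empty entries be dropped.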
[Bug rtl-optimization/52888] New: Unable to inline function pointer call with inexact signature match
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52888

Bug #: 52888
Summary: Unable to inline function pointer call with inexact signature match
Classification: Unclassified
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: minor
Priority: P3
Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: msha...@users.sourceforge.net

    #include <stdio.h>

    struct A {
        template <typename T>
        static inline __attribute__((always_inline)) void Caller (T* pn, void (T::*pm)(void))
            { (pn->*pm)(); }
        void Call (int i)
        {
            if (i == 1) Caller(this, &A::Func1);
            else if (i == 2) Caller(this, &A::Func2);
        }
        inline void Func1 (void) { puts ("Func1"); }
        inline void Func2 (void) noexcept { puts ("Func2"); }
    };

    int main (void)
    {
        A a;
        a.Call(1);
        a.Call(2);
        return (0);
    }

Compiling with: g++ (GCC) 4.7.0 20120324 (prerelease), x86_64

    g++ -O -std=c++11 a.cc
    a.cc: In function 'int main()':
    a.cc:5:55: error: inlining failed in call to always_inline 'static void A::Caller(T*, void (T::*)()) [with T = A]': mismatched arguments
    a.cc:10:42: error: called from here
    a.cc:5:55: error: inlining failed in call to always_inline 'static void A::Caller(T*, void (T::*)()) [with T = A]': mismatched arguments
    a.cc:10:42: error: called from here

I'm using always_inline to force the error; without it, Caller is silently not inlined.

The problem occurs when the function pointer has an inexact signature match to the pointed-to function. In the above example, Func2 has a noexcept tacked on to it. In more complex examples involving pointers to functions with arguments, using a typedef of an object in the argument list results in this error, while using the actual object works (typedef A a_t; void good(A); void bad(a_t)).

So the compiler is able to convert an inexact match like A::Func2 to void(A::*)(void) when instantiating the template, but the inliner is not able to make the same match even though it should have the same information.
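The typedef case mentioned in the report can be illustrated at the language level. The sketch below is not a reproduction of the inliner failure; it only shows that, as far as the type system is concerned, `good` and `bad` have the same signature, which is why a mismatch reported by the inliner is surprising. The function bodies and the helper `call_through` are hypothetical and exist only to make the example testable.

```cpp
#include <cassert>
#include <type_traits>

struct A { int value; };
typedef A a_t;

// Same parameter type, spelled two ways (bodies are illustrative only).
int good (A a)   { return a.value; }
int bad  (a_t a) { return a.value + 1; }

// A typedef introduces an alias, not a new type, so both functions
// have exactly the same function type.
static_assert (std::is_same<decltype(good), decltype(bad)>::value,
               "typedef does not change the function signature");

// Hypothetical helper in the spirit of A::Caller from the report:
// T is deduced from the function pointer and must agree with the argument.
template <typename T>
inline int call_through (int (*pf)(T), T arg)
{
    return pf (arg);
}
```

Both `call_through (good, A{1})` and `call_through (bad, A{1})` deduce `T = A`, so the front end sees an exact match in either spelling.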
[Bug rtl-optimization/49127] New: -Os generates constant mov instead of instruction xor and mov when zeroing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49127

Summary: -Os generates constant mov instead of instruction xor and mov when zeroing
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: msha...@users.sourceforge.net

    void zero (void* p) { *reinterpret_cast<ulong*>(p) = 0; }

Generates:

    48 c7 07 00 00 00 00    movq $0x0,(%rdi)

This is shorter by 2 bytes:

    31 c0                   xor %eax,%eax
    48 89 07                mov %rax,(%rdi)

And can be reused in further assignments of zero for more savings:

    31 c0                   xor %eax,%eax
    48 89 07                mov %rax,(%rdi)
    48 89 47 04             mov %rax,0x4(%rdi)
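A source-level case where the zeroed register could be reused might look like the sketch below. The function name `zero_two` is hypothetical, and no particular instruction sequence is claimed here: what the compiler emits depends on its version and flags.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: two adjacent 64-bit zero stores. With the
// xor-then-mov form from the report, the second store could reuse the
// already zeroed register instead of encoding another 4-byte immediate.
void zero_two (void* p)
{
    std::uint64_t* q = static_cast<std::uint64_t*>(p);
    q[0] = 0;
    q[1] = 0;
}
```

To compare encodings, the function can be compiled with `g++ -Os -c` and the object file inspected with `objdump -d`.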