[Bug tree-optimization/78071] -Os -ffast-math generates pow() for 1/(x*x)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78071 --- Comment #2 from Chris King --- Created attachment 39867 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39867&action=edit suboptimal output with -Os -ffast-math
[Bug tree-optimization/78071] -Os -ffast-math generates pow() for 1/(x*x)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78071 --- Comment #1 from Chris King --- Created attachment 39866 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39866&action=edit Expected output
[Bug tree-optimization/78071] New: -Os -ffast-math generates pow() for 1/(x*x)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78071

Bug ID: 78071
Summary: -Os -ffast-math generates pow() for 1/(x*x)
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: colanderman at gmail dot com
Target Milestone: ---

Created attachment 39865 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39865&action=edit
source code

The expression 1/(x*x) (where x is a double or float), when compiled with "-Os -ffast-math" on Intel (e.g. -march=core2), is compiled as a call to pow() or powf(). Both "-Os" alone and "-O3 -ffast-math" work as expected (generate a div + mul).
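The attached source is not reproduced in this message, so the following is only a minimal sketch of the kind of input the report describes. The function names inv_square and inv_squaref are hypothetical and chosen for illustration.

```c
/* Hypothetical reproducer sketch; not the attachment from the report. */

/* Double-precision case: the report expects a multiply and a divide here. */
double inv_square(double x)
{
    return 1 / (x * x);
}

/* Single-precision case: the report says this can become a powf() call. */
float inv_squaref(float x)
{
    return 1 / (x * x);
}
```

Compiling a file like this with "gcc -Os -ffast-math -march=core2 -S" (the options named in the report, plus -S to stop at assembly) and searching the output for a call to pow or powf should show whether a given GCC version is affected.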
[Bug rtl-optimization/57970] segfault in sched-deps.c
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57970 --- Comment #5 from Chris King --- Would a unit test case be acceptable? That should be an easy way to evince this bug and I'd be glad to write one.
[Bug rtl-optimization/57970] segfault in sched-deps.c
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57970 --- Comment #4 from Chris King --- Sorry, not interested: like I said above, it's very difficult to trigger, and the only code I've been able to trigger it with is proprietary. You can either read sched-deps.c and understand the code path which fails (which I outlined) and how the patch fixes it, or close the bug and ignore the patch. I keep my own branch, so I don't really care.
[Bug rtl-optimization/57970] segfault in sched-deps.c
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57970 --- Comment #2 from Chris King --- If you don't want proposed patches attached to bug reports, then I suggest you remove the text "proposed patch" which is next to the "Add an attachment" link.
[Bug other/57970] New: segfault in sched-deps.c
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57970

Bug ID: 57970
Summary: segfault in sched-deps.c
Product: gcc
Version: 4.7.3
Status: UNCONFIRMED
Severity: major
Priority: P3
Component: other
Assignee: unassigned at gcc dot gnu.org
Reporter: colanderman at gmail dot com

Created attachment 30546 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30546&action=edit
Patch

Symptom: Segfault in sched-deps.c when compiling a large auto-generated C file:

==3363== Invalid read of size 8
==3363==    at 0x95A41D: sched_analyze_1 (sched-deps.c:2479)
==3363==    by 0x95D182: sched_analyze_insn (sched-deps.c:2859)
==3363==    by 0x95E636: deps_analyze_insn (sched-deps.c:3505)
==3363==    by 0x95E7F1: sched_analyze (sched-deps.c:3653)
==3363==    by 0x6EC4F8: sched_rgn_compute_dependencies (sched-rgn.c:2702)
==3363==    by 0x6EF582: schedule_insns (sched-rgn.c:2915)
==3363==    by 0x89E237: tilegx_reorg (tilegx.c:4710)
==3363==    by 0x6E0699: rest_of_handle_machine_reorg (reorg.c:4183)
==3363==    by 0x69F5BF: execute_one_pass (passes.c:2084)
==3363==    by 0x69FA30: execute_pass_list (passes.c:2139)
==3363==    by 0x69FA44: execute_pass_list (passes.c:2140)
==3363==    by 0x69FA44: execute_pass_list (passes.c:2140)
==3363==  Address 0x8 is not stack'd, malloc'd or (recently) free'd

Cause: deps->pending_read_insns and deps->pending_read_mems are getting out of sync. (Hence the NULL pointer access at sched-deps.c:2479.)

Fix: The condition "!deps->readonly" under which deps->pending_read_mems is freed in flush_pending_lists() should be changed to "!deps->readonly && !DEBUG_INSN_P (insn)" to match the condition "deps->readonly || DEBUG_INSN_P (insn)" under which deps->pending_read_insns is not freed in add_dependence_list_and_free(). Patch attached.

Unfortunately I cannot provide a test case, as I have only been able to reproduce the crash with a very large (auto-generated) proprietary C file. The bug seems to exist in the source code of at least 4.6.3 as well, though I have not been able to trigger it therein.
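To make the mismatch easier to see, here is a toy model of the two conditions. It is not GCC code: the struct, the counters standing in for the two lists, and every toy_ name are hypothetical, and only the boolean conditions are taken from the description above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the scheduler state; counters replace the real lists. */
struct toy_deps {
    size_t n_insns;   /* models the length of pending_read_insns */
    size_t n_mems;    /* models the length of pending_read_mems */
    bool readonly;
};

/* Insn side: entries are kept when "readonly || debug insn" holds. */
void toy_free_insns(struct toy_deps *d, bool debug_insn)
{
    if (!(d->readonly || debug_insn))
        d->n_insns = 0;
}

/* Mem side: guard_debug selects the proposed condition over the original. */
void toy_free_mems(struct toy_deps *d, bool debug_insn, bool guard_debug)
{
    bool do_free = guard_debug ? (!d->readonly && !debug_insn)
                               : !d->readonly;
    if (do_free)
        d->n_mems = 0;
}

/* Returns true if both lists have the same length after one flush. */
bool toy_flush_in_sync(bool readonly, bool debug_insn, bool guard_debug)
{
    struct toy_deps d = { 3, 3, readonly };
    toy_free_insns(&d, debug_insn);
    toy_free_mems(&d, debug_insn, guard_debug);
    return d.n_insns == d.n_mems;
}
```

In this model the only combination that leaves the lengths different is a debug insn on a non-readonly context with the original condition, which is the case the proposed change is meant to cover.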
[Bug middle-end/14192] Restrict pointers don't help
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14192

Chris King changed:

What    |Removed |Added
CC      |        |colanderman at gmail dot com

--- Comment #14 from Chris King ---
> Only two_restrict_pointers is valid.
> Sorry, but I still don't get it

I agree. None of the above responses clearly explain why one_restrict_pointer does not represent a valid bug. The missed optimization still exists in 4.8.0.
[Bug c/55830] inline and __attribute__((always_inline)) treated differently for unused-function warning
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55830

Chris King changed:

What    |Removed |Added
CC      |        |colanderman at gmail dot com

--- Comment #5 from Chris King 2013-02-28 18:18:37 UTC ---
This is good. I like that I can specify __attribute__ ((always_inline)) on local static functions and still be warned if they are unused. IMHO the real bug is that such usage triggers a "warning: always_inline function might not be inlinable".

If this bug and/or the above warning behavior is valid, then what's the "correct" way to say "I want my local static function to be always inlined, but don't hide warnings if it's unused"?
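For readers who have not hit this, the two spellings under discussion look roughly like the sketch below. The function names are hypothetical, and the comments describe the behavior as reported in this thread rather than a guarantee for every GCC version.

```c
/* Form A: attribute only. As described in the comment above, GCC still
   reports this function when it is unused, but it also prints
   "always_inline function might not be inlinable". */
static __attribute__((always_inline)) int twice_attr_only(int v)
{
    return 2 * v;
}

/* Form B: inline keyword plus attribute. This avoids the "might not be
   inlinable" warning, but static inline functions are not reported
   when they are unused. */
static inline __attribute__((always_inline)) int twice_inline_attr(int v)
{
    return 2 * v;
}

/* Both helpers are used here so the sketch compiles without unused warnings. */
int call_both(int v)
{
    return twice_attr_only(v) + twice_inline_attr(v);
}
```

Neither form gives both properties at once, which is the gap the question is asking about.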
[Bug rtl-optimization/55360] [TileGX] Passing structure by value on stack needlessly writes to and reads from memory
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55360 --- Comment #2 from Chris King 2012-11-19 18:47:39 UTC --- Possibly, though I doubt it. PR 28831 has more to do with eliding copies of the struct in its entirety; the problem I'm having centers on accessing individual elements. If PR 28831 were the cause, I would expect both my test cases (with and without bit-fields) to behave identically; however, they do not. It's possible that fixing PR 28831 may hide this bug in my particular use case (by avoiding the stack allocation in the first place), but I believe the difference in handling of normal fields vs. bit-fields to be a distinct bug.
[Bug rtl-optimization/55360] New: [TileGX] Passing structure by value on stack needlessly writes to and reads from memory
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55360

Bug #: 55360
Summary: [TileGX] Passing structure by value on stack needlessly writes to and reads from memory
Classification: Unclassified
Product: gcc
Version: 4.7.2
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: colander...@gmail.com

    #include <stdint.h>

    struct bar { uint8_t a, b, c, d; };
    struct bla { unsigned long a:8, b:8, c:8, d:8; };

    uint64_t bar(struct bar);
    uint64_t bla(struct bla);

    uint64_t foo(uint8_t a, uint8_t b, uint8_t c, uint8_t d) {
        return bar((struct bar) { a, b, c, d });
    }

    uint64_t baz(uint8_t a, uint8_t b, uint8_t c, uint8_t d) {
        return bla((struct bla) { a, b, c, d });
    }

when compiled with "gcc -Wall -std=gnu99 -O3 -S pbv.c" produces:

        .file   "pbv.c"
        .text
        .align 8
        .global foo
        .type   foo, @function
    foo:
    .LFB0:
        .cfi_startproc
        {
        st      sp, lr
        .cfi_offset 55, 0
        move    r29, sp
        addi    r28, sp, -16
        }
        addi    sp, sp, -24
        .cfi_def_cfa_offset 24
        {
        st      r28, r29
        addi    r11, sp, 21
        addi    r10, sp, 20
        }
        {
        st1     r11, r1
        addi    r11, sp, 22
        }
        {
        st1     r11, r2
        addi    r11, sp, 23
        }
        {
        st1     r10, r0
        movei   r0, 0
        }
        st1     r11, r3
        ld4u    r10, r10
        {
        bfins   r0, r10, 0, 0+32-1
        jal     bar
        }
        addi    r29, sp, 24
        ld      lr, r29
        {
        addi    sp, sp, 24
        .cfi_restore 54
        .cfi_restore 55
        .cfi_def_cfa_offset 0
        jrp     lr
        }
        .cfi_endproc
    .LFE0:
        .size   foo, .-foo
        .align 8
        .global baz
        .type   baz, @function
    baz:
    .LFB1:
        .cfi_startproc
        {
        movei   r10, 0
        st      sp, lr
        .cfi_offset 55, 0
        move    r29, sp
        }
        {
        bfins   r10, r0, 0, 7
        addi    r28, sp, -8
        }
        {
        bfins   r10, r1, 8, 8+8-1
        addi    sp, sp, -16
        }
        .cfi_def_cfa_offset 16
        {
        bfins   r10, r2, 16, 16+8-1
        st      r28, r29
        }
        bfins   r10, r3, 24, 24+8-1
        {
        move    r0, r10
        jal     bla
        }
        addi    r29, sp, 16
        ld      lr, r29
        {
        addi    sp, sp, 16
        .cfi_restore 54
        .cfi_restore 55
        .cfi_def_cfa_offset 0
        jrp     lr
        }
        .cfi_endproc
    .LFE1:
        .size   baz, .-baz
        .ident  "GCC: (GNU) 4.7.2"
        .section        .note.GNU-stack,"",@progbits

My expectation is that foo and baz should compile identically, and that both should use the bfins bit-field insert instructions as baz does, rather than the redundant stores to and loads from the stack that foo does.

This is with a vanilla GCC 4.7.2 build on a Tilempower system (roughly CentOS 5.7). The problem does not occur on Debian x86-64 with either GCC 4.4.6 or GCC 4.7.2.

Possibly related to http://gcc.gnu.org/bugzilla/show_bug.cgi?id=7061 (however that case seems to be fixed in 4.7.2).
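As a rough illustration of what the bfins sequence in baz computes, the same packing can be written with shifts and ORs. This is only a sketch: pack_bytes is a hypothetical name, and the byte positions (a in bits 0-7, b in bits 8-15, and so on) are an assumption read off the bfins operands in the listing above.

```c
#include <stdint.h>

/* Hypothetical helper: packs four bytes into the low 32 bits of a word
   entirely in registers, mirroring the bit ranges used by bfins in baz. */
uint64_t pack_bytes(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return (uint64_t)a
         | ((uint64_t)b << 8)
         | ((uint64_t)c << 16)
         | ((uint64_t)d << 24);
}
```

The expectation stated in the report amounts to foo building its argument this way too, without the four st1 stores and the ld4u reload.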