[Bug tree-optimization/78071] -Os -ffast-math generates pow() for 1/(x*x)

2016-10-21 Thread colanderman at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78071

--- Comment #2 from Chris King  ---
Created attachment 39867
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39867&action=edit
suboptimal output with -Os -ffast-math

[Bug tree-optimization/78071] -Os -ffast-math generates pow() for 1/(x*x)

2016-10-21 Thread colanderman at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78071

--- Comment #1 from Chris King  ---
Created attachment 39866
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39866&action=edit
Expected output

[Bug tree-optimization/78071] New: -Os -ffast-math generates pow() for 1/(x*x)

2016-10-21 Thread colanderman at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78071

Bug ID: 78071
   Summary: -Os -ffast-math generates pow() for 1/(x*x)
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: colanderman at gmail dot com
  Target Milestone: ---

Created attachment 39865
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39865&action=edit
source code

The expression 1/(x*x) (where x is a double or float), when compiled with "-Os
-ffast-math" on Intel (e.g. -march=core2), is compiled as a call to pow() or
powf().  "-Os" and "-O3 -ffast-math" work as expected (generate a div + mul).
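
A minimal reproduction sketch follows. The attached source is not reproduced in this message, so the function names below (inv_sq, inv_sqf) are hypothetical and only assume the attachment has roughly this shape:

```c
/* Hypothetical reproduction; the names are not taken from the attachment.
   Compare the assembly produced by, for example:
     gcc -Os -ffast-math -march=core2 -S inv.c
     gcc -O3 -ffast-math -march=core2 -S inv.c */
double inv_sq(double x)
{
    return 1 / (x * x);   /* reported to become a pow() call at -Os -ffast-math */
}

float inv_sqf(float x)
{
    return 1 / (x * x);   /* reported to become a powf() call at -Os -ffast-math */
}
```

Inspecting the generated .s file for a call to pow or powf should be enough to tell which form was emitted.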

[Bug rtl-optimization/57970] segfault in sched-deps.c

2013-11-10 Thread colanderman at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57970

--- Comment #5 from Chris King  ---
Would a unit test case be acceptable?  That should be an easy way to evince
this bug and I'd be glad to write one.


[Bug rtl-optimization/57970] segfault in sched-deps.c

2013-11-10 Thread colanderman at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57970

--- Comment #4 from Chris King  ---
Sorry, not interested: like I said above, it's very difficult to trigger, and
the only code I've been able to trigger it with is proprietary.

You can either read sched-deps.c and understand the code path which fails
(which I outlined) and how the patch fixes it, or close the bug and ignore the
patch.  I keep my own branch, so I don't really care.


[Bug rtl-optimization/57970] segfault in sched-deps.c

2013-11-10 Thread colanderman at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57970

--- Comment #2 from Chris King  ---
If you don't want proposed patches attached to bug reports, then I suggest you
remove the text "proposed patch" which is next to the "Add an attachment" link.


[Bug other/57970] New: segfault in sched-deps.c

2013-07-24 Thread colanderman at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57970

Bug ID: 57970
   Summary: segfault in sched-deps.c
   Product: gcc
   Version: 4.7.3
Status: UNCONFIRMED
  Severity: major
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: colanderman at gmail dot com

Created attachment 30546
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30546&action=edit
Patch

Symptom: Segfault in sched-deps.c when compiling a large auto-generated C file:

==3363== Invalid read of size 8
==3363==    at 0x95A41D: sched_analyze_1 (sched-deps.c:2479)
==3363==    by 0x95D182: sched_analyze_insn (sched-deps.c:2859)
==3363==    by 0x95E636: deps_analyze_insn (sched-deps.c:3505)
==3363==    by 0x95E7F1: sched_analyze (sched-deps.c:3653)
==3363==    by 0x6EC4F8: sched_rgn_compute_dependencies (sched-rgn.c:2702)
==3363==    by 0x6EF582: schedule_insns (sched-rgn.c:2915)
==3363==    by 0x89E237: tilegx_reorg (tilegx.c:4710)
==3363==    by 0x6E0699: rest_of_handle_machine_reorg (reorg.c:4183)
==3363==    by 0x69F5BF: execute_one_pass (passes.c:2084)
==3363==    by 0x69FA30: execute_pass_list (passes.c:2139)
==3363==    by 0x69FA44: execute_pass_list (passes.c:2140)
==3363==    by 0x69FA44: execute_pass_list (passes.c:2140)
==3363==  Address 0x8 is not stack'd, malloc'd or (recently) free'd

Cause: deps->pending_read_insns and deps->pending_read_mems are getting out of
sync.  (Hence the NULL pointer access at sched-deps.c:2479.)

Fix: The condition "!deps->readonly" under which deps->pending_read_mems is
freed in flush_pending_lists() should be changed to "!deps->readonly &&
!DEBUG_INSN_P (insn)", so that it is the exact negation of the condition
"deps->readonly || DEBUG_INSN_P (insn)" under which deps->pending_read_insns
is not freed in add_dependence_list_and_free().
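
The mismatch can be illustrated with a toy model. This is not GCC code: the struct, the counters, and the function names (toy_deps, toy_flush, toy_in_sync) are all hypothetical stand-ins, and the sketch only assumes that the two lists are meant to be released under the same condition:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only, NOT the sched-deps.c data structures. */
struct toy_deps {
    size_t n_insns;   /* stands in for pending_read_insns */
    size_t n_mems;    /* stands in for pending_read_mems */
    bool readonly;
};

/* Releases both lists; apply_fix selects the proposed condition. */
void toy_flush(struct toy_deps *d, bool insn_is_debug, bool apply_fix)
{
    /* Insn list: kept when readonly or when the insn is a debug insn. */
    if (!(d->readonly || insn_is_debug))
        d->n_insns = 0;

    /* Mem list: the reported condition ignores debug insns;
       the proposed fix adds that check. */
    bool free_mems = apply_fix ? (!d->readonly && !insn_is_debug)
                               : !d->readonly;
    if (free_mems)
        d->n_mems = 0;
}

/* The two lists are expected to have the same length at all times. */
bool toy_in_sync(const struct toy_deps *d)
{
    return d->n_insns == d->n_mems;
}
```

In this model, flushing on a debug insn with the unfixed condition leaves n_insns untouched while n_mems drops to zero, which is the kind of desynchronization described above.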

Patch attached.  Unfortunately I cannot provide a test case, as I have only
been able to reproduce the crash with a very large (auto-generated) proprietary
C file.

The bug seems to exist in the source code of at least 4.6.3 as well, though I
have not been able to trigger it therein.


[Bug middle-end/14192] Restrict pointers don't help

2013-06-26 Thread colanderman at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14192

Chris King  changed:

   What|Removed |Added

 CC||colanderman at gmail dot com

--- Comment #14 from Chris King  ---
> Only two_restrict_pointers is valid.

> Sorry, but I still don't get it

I agree.  None of the above responses clearly explain why one_restrict_pointer
does not represent a valid bug.  The missed optimization still exists in 4.8.0.


[Bug c/55830] inline and __attribute__((always_inline)) treated differently for unused-function warning

2013-02-28 Thread colanderman at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55830

Chris King  changed:

   What|Removed |Added

 CC||colanderman at gmail dot com

--- Comment #5 from Chris King  2013-02-28 18:18:37 UTC ---
This is good.  I like that I can specify __attribute__ ((always_inline)) on
local static functions and still be warned if they are unused.  IMHO the real
bug is that such usage triggers a "warning: always_inline function might not be
inlinable".

If this bug and/or the above warning behavior is valid, then what's the
"correct" way to say "I want my local static function to be always inlined, but
don't hide warnings if it's unused"?
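
For reference, the two spellings being compared look roughly like this. The function names are hypothetical, and the behaviour noted in the comments is only what this thread and the bug title describe, not something verified across GCC versions:

```c
/* Hypothetical helpers. Compile with, for example: gcc -Wall -O2 -c demo.c */

/* Attribute only: per the comment above, an unused copy is still reported,
   but this spelling can trigger
   "warning: always_inline function might not be inlinable". */
static __attribute__((always_inline)) int twice_attr_only(int x)
{
    return 2 * x;
}

/* Attribute plus the inline keyword: per the bug title, this spelling is
   treated differently for the unused-function warning. */
static inline __attribute__((always_inline)) int twice_inline(int x)
{
    return 2 * x;
}

/* Both helpers are used here so the example itself compiles quietly
   with respect to unused-function warnings. */
int use_both(int x)
{
    return twice_attr_only(x) + twice_inline(x);
}
```

Removing the calls in use_both and recompiling with -Wall is one way to see which of the two declarations is reported as unused.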


[Bug rtl-optimization/55360] [TileGX] Passing structure by value on stack needlessly writes to and reads from memory

2012-11-19 Thread colanderman at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55360

--- Comment #2 from Chris King  2012-11-19 18:47:39 UTC ---
Possibly, though I doubt it.  PR 28831 has more to do with eliding copies of
the struct in its entirety; the problem I'm having centers around accessing
individual elements.  If PR 28831 were the cause, I would expect both my test
cases (with and without bit-fields) to behave identically, however they do not.

It's possible that fixing PR 28831 may hide this bug in my particular use case
(by avoiding the stack allocation in the first place), but I believe the
difference in handling of normal fields vs. bit fields to be a distinct bug.


[Bug rtl-optimization/55360] New: [TileGX] Passing structure by value on stack needlessly writes to and reads from memory

2012-11-16 Thread colanderman at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55360

 Bug #: 55360
   Summary: [TileGX] Passing structure by value on stack
needlessly writes to and reads from memory
Classification: Unclassified
   Product: gcc
   Version: 4.7.2
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: colander...@gmail.com

#include <stdint.h>

struct bar { uint8_t a, b, c, d; };
struct bla { unsigned long a:8, b:8, c:8, d:8; };

uint64_t bar(struct bar);
uint64_t bla(struct bla);

uint64_t foo(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{ return bar((struct bar) { a, b, c, d }); }

uint64_t baz(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{ return bla((struct bla) { a, b, c, d }); }

when compiled with "gcc -Wall -std=gnu99 -O3 -S pbv.c" produces:

        .file   "pbv.c"
        .text
        .align 8
        .global foo
        .type   foo, @function
foo:
.LFB0:
        .cfi_startproc
        {
        st      sp, lr
        .cfi_offset 55, 0
        move    r29, sp
        addi    r28, sp, -16
        }
        addi    sp, sp, -24
        .cfi_def_cfa_offset 24
        {
        st      r28, r29
        addi    r11, sp, 21
        addi    r10, sp, 20
        }
        {
        st1     r11, r1
        addi    r11, sp, 22
        }
        {
        st1     r11, r2
        addi    r11, sp, 23
        }
        {
        st1     r10, r0
        movei   r0, 0
        }
        st1     r11, r3
        ld4u    r10, r10
        {
        bfins   r0, r10, 0, 0+32-1
        jal     bar
        }
        addi    r29, sp, 24
        ld      lr, r29
        {
        addi    sp, sp, 24
        .cfi_restore 54
        .cfi_restore 55
        .cfi_def_cfa_offset 0
        jrp     lr
        }
        .cfi_endproc
.LFE0:
        .size   foo, .-foo
        .align 8
        .global baz
        .type   baz, @function
baz:
.LFB1:
        .cfi_startproc
        {
        movei   r10, 0
        st      sp, lr
        .cfi_offset 55, 0
        move    r29, sp
        }
        {
        bfins   r10, r0, 0, 7
        addi    r28, sp, -8
        }
        {
        bfins   r10, r1, 8, 8+8-1
        addi    sp, sp, -16
        }
        .cfi_def_cfa_offset 16
        {
        bfins   r10, r2, 16, 16+8-1
        st      r28, r29
        }
        bfins   r10, r3, 24, 24+8-1
        {
        move    r0, r10
        jal     bla
        }
        addi    r29, sp, 16
        ld      lr, r29
        {
        addi    sp, sp, 16
        .cfi_restore 54
        .cfi_restore 55
        .cfi_def_cfa_offset 0
        jrp     lr
        }
        .cfi_endproc
.LFE1:
        .size   baz, .-baz
        .ident  "GCC: (GNU) 4.7.2"
        .section        .note.GNU-stack,"",@progbits

My expectation is that foo and baz should compile identically, and should
use the bfins bit-arithmetic instructions as baz does, rather than the
redundant stores to and loads from the stack that foo performs.
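
As a sketch of the register-only packing that baz achieves, the four bfins instructions correspond roughly to the shifts and ORs below. The helper name pack4 is hypothetical, and the layout (a in bits 0-7 through d in bits 24-31) is an assumption read off the bfins bit offsets 0, 8, 16 and 24 in the listing:

```c
#include <stdint.h>

/* Hypothetical helper: builds the low 32 bits of the argument word
   without touching memory, mirroring the bfins sequence in baz. */
uint64_t pack4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return (uint64_t)a            /* bits  0-7  */
         | ((uint64_t)b << 8)     /* bits  8-15 */
         | ((uint64_t)c << 16)    /* bits 16-23 */
         | ((uint64_t)d << 24);   /* bits 24-31 */
}
```

The point of the report is that foo could be lowered to the same kind of sequence instead of four byte stores followed by a 4-byte load.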



This is with a vanilla GCC 4.7.2 build on a Tilempower system (roughly CentOS

5.7).



The problem does not occur on Debian x86-64 with either GCC 4.4.6 or GCC 4.7.2.



Possibly related to http://gcc.gnu.org/bugzilla/show_bug.cgi?id=7061 (however

that case seems to be fixed in 4.7.2).