http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45967
Summary: gcc-4.5.x optimizes code with side-effects away
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: nicolai.sta...@zmaw.de

Created attachment 22016
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22016
testcase containing failer and workaround

Hi everybody,

while debugging a numpy testsuite failure related to refcounts, I came across a strange optimization issue. Since I have no clue where the problem is located, I chose "rtl-optimization" as "Component". Please correct me if I'm wrong. I'm not even sure whether this really is a bug (although I believe it is), but the people in #gcc told me to post it here.

I've reduced the problem to a simple testcase (see attached testcase.c). Compile with

  gcc -c -Wall -O1 testcase.c

and have a look at the produced assembler output with

  objdump -S testcase.o

There are two functions in my testcase: one that ends up empty (PyArray_Item_XDECREF) and one that uses a workaround that works even with -O3 (PyArray_Item_XDECREF_workaround). The workaround seems to introduce some data dependency, though I don't know exactly what it does; I found it by trial and error.

To help you locate the issue: it only appears with -O1 itself. Everything works fine when the options that 'man gcc' documents for -O1, that is

  -fauto-inc-dec -fcprop-registers -fdce -fdefer-pop -fdelayed-branch
  -fdse -fguess-branch-probability -fif-conversion2 -fif-conversion
  -fipa-pure-const -fipa-reference -fmerge-constants -fsplit-wide-types
  -ftree-builtin-call-dce -ftree-ccp -ftree-ch -ftree-copyrename
  -ftree-dce -ftree-dominator-opts -ftree-dse -ftree-forwprop -ftree-fre
  -ftree-phiprop -ftree-sra -ftree-pta -ftree-ter -funit-at-a-time

are given explicitly (and without any -O*) on gcc's command line.

I've tested testcase.c with different gcc versions on different platforms. The workaround function always contains correct assembler code.
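The comparison between -O1 and the explicit flag list described above can be sketched as a small shell script. This is only an illustration, not part of the attachment: it assumes testcase.c is in the current directory, and the output names o1.o, flags.o, o1.dis and flags.dis are placeholders that do not come from the report.

```shell
#!/bin/sh
# Sketch only: assumes testcase.c (the attachment) is in the current directory.
# The output file names below are placeholders, not taken from the report.

# The flag list quoted in the report from 'man gcc'.
O1_FLAGS="-fauto-inc-dec -fcprop-registers -fdce -fdefer-pop \
-fdelayed-branch -fdse -fguess-branch-probability -fif-conversion2 \
-fif-conversion -fipa-pure-const -fipa-reference -fmerge-constants \
-fsplit-wide-types -ftree-builtin-call-dce -ftree-ccp -ftree-ch \
-ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse \
-ftree-forwprop -ftree-fre -ftree-phiprop -ftree-sra -ftree-pta \
-ftree-ter -funit-at-a-time"

compare_builds() {
    # Build once with -O1 and once with the explicit flags and no -O*.
    gcc -c -Wall -O1 testcase.c -o o1.o || return 1
    # $O1_FLAGS is left unquoted on purpose so it splits into options.
    gcc -c -Wall $O1_FLAGS testcase.c -o flags.o || return 1

    # Disassemble both objects and show where the listings differ.
    objdump -S o1.o > o1.dis
    objdump -S flags.o > flags.dis
    # diff exits non-zero when the listings differ, which is expected here.
    diff -u o1.dis flags.dis || true
}

# Only run the comparison when the inputs are actually available.
if [ -f testcase.c ] && command -v gcc >/dev/null 2>&1; then
    compare_builds
fi
```

The GCC documentation notes that most optimizations stay disabled when no -O level is given, even if individual flags are specified, so a correct result from the second build does not by itself rule out any of the listed passes.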
The table below lists only the results for PyArray_Item_XDECREF. Its compiled body either contains correct code or is empty (except for entering and leaving a stack frame).

+----------+------------------------+------------+-------+
|Version   |Platform                |Optimization|Result |
+----------+------------------------+------------+-------+
|4.1.2     |i486-linux-gnu          |-O3         |works  |
|(Debian   |                        |            |       |
|4.1.1-21) |                        |            |       |
+----------+------------------------+------------+-------+
|4.2.0     |sparcv9-sun-solaris2.10 |-O3         |works  |
|(self     |                        |            |       |
|compiled) |                        |            |       |
+----------+------------------------+------------+-------+
|4.3.2     |x86_64-linux-gnu        |-O3         |works  |
|(Debian   |                        |            |       |
|4.3.2-1.1)|                        |            |       |
+----------+------------------------+------------+-------+
|4.4.0     |i686-pc-linux-gnu       |-O3         |works  |
|(self     |                        |            |       |
|compiled) |                        |            |       |
+----------+------------------------+------------+-------+
|4.4.3     |x86_64-unknown-linux-gnu|-O3         |works  |
|(self     |                        |            |       |
|compiled) |                        |            |       |
+----------+------------------------+------------+-------+
|4.4.3     |sparc-sun-solaris2.10   |-O3         |works  |
|(self     |                        |            |       |
|compiled) |                        |            |       |
+----------+------------------------+------------+-------+
|4.5.0     |x86_64-unknown-linux-gnu|-O1         |fail   |
|(self     |                        |            |       |
|compiled) |                        |            |       |
+----------+------------------------+------------+-------+
|4.5.1     |i686-pc-linux-gnu       |-O1         |fail   |
|(self     |                        |            |       |
|compiled) |                        |            |       |
+----------+------------------------+------------+-------+
|4.5.1     |sparc-sun-solaris2.10   |-O1         |fail   |
|(self     |                        |            |       |
|compiled) |                        |            |       |
+----------+------------------------+------------+-------+

As you can see, the issue does not depend on the target architecture but on gcc's version. It seems to have been introduced after 4.4.3 (unfortunately I have no 4.4.4/4.4.5 here).

Thank you very much
Nicolai