http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58095

            Bug ID: 58095
           Summary: SIMD code requiring auxiliary array for best
                    optimization
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: siavashserver at gmail dot com

Created attachment 30621
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30621&action=edit
Source code and its generated asm code.

Hello. I have noticed strange behavior when writing SIMD code with the
provided SSE intrinsics. GCC does not seem able to generate the same optimized
code for function (foo) as it does for function (bar).


How can I get the same generated code for function (foo) without going to the
trouble of defining and using an auxiliary array, as function (bar) does?
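
For readers without the attachment, here is a minimal sketch of the two shapes
being compared. It is hypothetical: the real bodies of (foo) and (bar) are in
attachment 30621 and may differ. The names foo_sketch and bar_sketch, the
16-float size, the doubling arithmetic and the 16-byte alignment are all
assumptions made for illustration.

```cpp
#include <xmmintrin.h>  // SSE intrinsics

// Hypothetical stand-in for (foo): every step loads from and stores to
// memory through the pointer.
// Assumes data points to 16 floats, 16-byte aligned.
void foo_sketch(float* data)
{
    for (int pass = 0; pass < 2; ++pass) {
        for (int i = 0; i < 4; ++i) {
            __m128 v = _mm_load_ps(data + 4 * i);
            _mm_store_ps(data + 4 * i, _mm_add_ps(v, v));
        }
    }
}

// Hypothetical stand-in for (bar): same arithmetic, but staged through a
// local auxiliary array so memory is read once and written once.
void bar_sketch(float* data)
{
    __m128 tmp[4];
    for (int i = 0; i < 4; ++i)
        tmp[i] = _mm_load_ps(data + 4 * i);

    for (int pass = 0; pass < 2; ++pass)
        for (int i = 0; i < 4; ++i)
            tmp[i] = _mm_add_ps(tmp[i], tmp[i]);

    for (int i = 0; i < 4; ++i)
        _mm_store_ps(data + 4 * i, tmp[i]);
}
```

Both functions compute the same result (each element multiplied by 4), so any
difference in the output of "g++ -O2 -S" comes from how the compiler handles
the intermediate loads and stores, not from the arithmetic.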


I've tried using the __restrict__ keyword on the input data (foo2), but GCC
still generates the same code as for function (foo). ICC and Clang also
generate the same code and fail to optimize it.
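
As an illustration of what is meant by the __restrict__ variant, a minimal
sketch follows. It is hypothetical: the real (foo2) is in attachment 30621 and
may differ. The name foo2_sketch, the 16-float size, the squaring arithmetic
and the 16-byte alignment are assumptions.

```cpp
#include <xmmintrin.h>  // SSE intrinsics

// Hypothetical stand-in for (foo2). __restrict__ is a GCC/Clang extension
// that promises no other pointer in this function aliases data.
// Assumes data points to 16 floats, 16-byte aligned.
void foo2_sketch(float* __restrict__ data)
{
    for (int i = 0; i < 4; ++i) {
        __m128 v = _mm_load_ps(data + 4 * i);
        _mm_store_ps(data + 4 * i, _mm_mul_ps(v, v));  // square in place
    }
}
```

The qualifier only changes what the compiler is allowed to assume about
aliasing; it does not change the result of the function.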

Something else strange I've noticed is that GCC 4.4.7 generates the desired
code for function (foo), but fails to do so for functions (foo2) and (bar).
Newer versions generate exactly the same code for functions (foo) and (foo2),
and the desired code for function (bar).

The attached output was generated by GCC 4.8.1 at the -O2 optimization level.
I used the online GCC compiler at http://gcc.godbolt.org/
