http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57890

Evgeniy Dushistov <dushistov at mail dot ru> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |tree-optimization

--- Comment #2 from Evgeniy Dushistov <dushistov at mail dot ru> ---
>That would mean the expansion for memset is not optimal for the target which 
>means this is a target issue rather than a C++ front-end or a middle-end 
>issue.

I disagree. Bisection shows that the faulty commit is this one (can anybody add
its author to CC?):
  2012-06-05  Richard Guenther  <rguent...@suse.de>

        PR tree-optimization/53081
        * tree-loop-distribution.c (generate_memset_builtin): Handle all
        kinds of byte-sized stores.
        (classify_partition): Likewise.
        (tree_loop_distribution): Adjust seed statements used for
        !flag_tree_loop_distribution.

        * gcc.dg/tree-ssa/ldist-19.c: New testcase.
        * gcc.c-torture/execute/builtins/builtins.exp: Always pass
        -fno-tree-loop-distribute-patterns.

Yes, gcc generates bad code for builtin memset, and that is a target issue.

But the issue was already known for gcc 4.7
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55953): builtin memset is bad on at
least arm, x86 and amd64 (I suppose the major CPUs that gcc supports).

Why introduce new code in the tree optimizers in gcc 4.8 that produces more
builtin memset calls? Why not wait until builtin memset is fixed?

If you look at gcc as a whole, this is a regression: "+15% CPU time for
simple loop".
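
For illustration, here is a minimal sketch of the kind of loop I mean. It is a
hypothetical example, not the testcase attached to this PR, and the function
names clear_loop and clear_memset are made up. A byte-store loop of this shape
is a candidate for the loop distribution pass to replace with a call to memset,
so its speed then depends on how good the target's memset expansion is:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example, not the PR testcase.  With
   -ftree-loop-distribute-patterns the compiler may replace this
   byte-store loop with a call to (builtin) memset.  */
void clear_loop(unsigned char *buf, size_t n)
{
    size_t i;

    for (i = 0; i < n; ++i)
        buf[i] = 0;
}

/* The explicit call, for comparison; both functions have the same
   observable effect on buf.  */
void clear_memset(unsigned char *buf, size_t n)
{
    memset(buf, 0, n);
}
```

Assuming a setup similar to mine, compiling this with "gcc -O3 -S" and again
with -fno-tree-loop-distribute-patterns added (the flag from the ChangeLog
above) should show whether the loop was turned into a memset call.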
