https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98335

--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sa...@gcc.gnu.org>:

https://gcc.gnu.org/g:c5288df751f9ecd11898dec5f2a7b6b03267f79e

commit r12-7615-gc5288df751f9ecd11898dec5f2a7b6b03267f79e
Author: Roger Sayle <ro...@nextmovesoftware.com>
Date:   Fri Mar 11 17:46:50 2022 +0000

    PR tree-optimization/98335: Improvements to DSE's compute_trims.

    This patch is the main middle-end piece of a fix for PR tree-opt/98335,
    which is a code-quality regression affecting mainline.  The issue occurs
    in DSE's (dead store elimination's) compute_trims function that determines
    where a store to memory can be trimmed.  In the testcase given in the
    PR, this function notices that the first byte of a DImode store is dead,
    and replaces the 8-byte store at (aligned) offset zero with a 7-byte store
    at (unaligned) offset one.  Most architectures can store a power-of-two
    number of bytes (up to a maximum) in a single instruction, so writing 7
    bytes requires more instructions than writing 8 bytes.  This patch follows
    Jakub Jelinek's suggestion in comment 5, that compute_trims needs improved
    heuristics.
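
    The cost argument above can be illustrated with a small standalone
    sketch.  This is not the code in tree-ssa-dse.cc; count_aligned_stores
    is a hypothetical helper, and it assumes a simplified machine model in
    which every store writes a power-of-two number of bytes, no more than
    max_width, at a naturally aligned offset.

```c
#include <assert.h>

/* Hypothetical helper, not GCC code: count the stores needed to write
   SIZE bytes starting at byte OFFSET, assuming each store writes a
   power-of-two number of bytes (at most MAX_WIDTH) and must be
   naturally aligned.  */
static unsigned
count_aligned_stores (unsigned offset, unsigned size, unsigned max_width)
{
  /* MAX_WIDTH must itself be a nonzero power of two.  */
  assert (max_width != 0 && (max_width & (max_width - 1)) == 0);

  unsigned count = 0;
  while (size > 0)
    {
      unsigned width = max_width;
      /* Shrink the store until it is aligned at OFFSET and fits in SIZE.  */
      while (width > 1 && (offset % width != 0 || width > size))
        width /= 2;
      offset += width;
      size -= width;
      count++;
    }
  return count;
}
```

    Under this model, count_aligned_stores (0, 8, 8) is 1, while
    count_aligned_stores (1, 7, 8) is 3 (one byte, then two, then four),
    which is why leaving the dead first byte in the store can be cheaper
    than trimming it.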

    On x86_64-pc-linux-gnu with -O2 the new test case in the PR goes from:

            movl    $0, -24(%rsp)
            movabsq $72057594037927935, %rdx
            movl    $0, -21(%rsp)
            andq    -24(%rsp), %rdx
            movq    %rdx, %rax
            salq    $8, %rax
            movb    c(%rip), %al
            ret

    to

            xorl    %eax, %eax
            movb    c(%rip), %al
            ret

    2022-03-11  Roger Sayle  <ro...@nextmovesoftware.com>
                Richard Biener  <rguent...@suse.de>

    gcc/ChangeLog
            PR tree-optimization/98335
            * builtins.cc (get_object_alignment_2): Export.
            * builtins.h (get_object_alignment_2): Likewise.
            * tree-ssa-alias.cc (ao_ref_alignment): New.
            * tree-ssa-alias.h (ao_ref_alignment): Declare.

            * tree-ssa-dse.cc (compute_trims): Improve logic deciding whether
            to align head/tail, writing more bytes but using fewer store insns.

    gcc/testsuite/ChangeLog
            PR tree-optimization/98335
            * g++.dg/pr98335.C: New test case.
            * gcc.dg/pr86010.c: New test case.
            * gcc.dg/pr86010-2.c: New test case.
