https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109747

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rgue...@gcc.gnu.org>:

https://gcc.gnu.org/g:b6b8870ec585947a03a797f9037d02380316e235

commit r14-1139-gb6b8870ec585947a03a797f9037d02380316e235
Author: Richard Biener <rguent...@suse.de>
Date:   Tue May 23 15:03:00 2023 +0200

    tree-optimization/109747 - SLP cost of CTORs

    The x86 backend looks at the SLP node passed to the add_stmt_cost
    hook when costing vec_construct, looking for elements that require
    a move from a GPR to a vector register and cost that.  But since
    vect_prologue_cost_for_slp decomposes the cost for an external
    SLP node into individual pieces this cost gets applied N times
    without a chance for the backend to know it's just dealing with
    a part of the SLP node.  Just looking at a part is also not perfect
    since the GPR to XMM move cost applies only once per distinct
    element so handling the whole SLP node once more correctly reflects
    cost (albeit without considering other external SLP nodes).

    The following addresses the issue by passing down the SLP node
    only for one piece and nullptr for the rest.  The x86 backend
    is currently the only one looking at it.
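
A minimal sketch of the double counting and of the fix, not the actual GCC code. `SlpNode`, `add_stmt_cost` and `prologue_cost` below are hypothetical stand-ins with made-up unit costs. For simplicity the toy hook charges one move per GPR element instead of one per distinct element.

```cpp
#include <vector>

// Hypothetical stand-in for an external SLP node: each flag says whether
// that scalar element lives in a GPR and needs a move to a vector register.
struct SlpNode {
    std::vector<bool> elt_in_gpr;
};

// Toy backend hook for costing one vec_construct piece (assumed costs):
// a base cost of 1, plus 1 per GPR element of the WHOLE node when a node
// is passed. The hook cannot tell that it is costing only a part.
int add_stmt_cost(const SlpNode *node) {
    int cost = 1;
    if (node) {
        for (bool in_gpr : node->elt_in_gpr) {
            if (in_gpr) {
                cost += 1;
            }
        }
    }
    return cost;
}

// Toy prologue costing: the node is decomposed into `pieces` parts and the
// hook is called once per part. With pass_node_once set, only the first
// call sees the node and the remaining calls get nullptr.
int prologue_cost(const SlpNode &node, int pieces, bool pass_node_once) {
    int total = 0;
    for (int i = 0; i < pieces; ++i) {
        const SlpNode *arg = (!pass_node_once || i == 0) ? &node : nullptr;
        total += add_stmt_cost(arg);
    }
    return total;
}
```

With four GPR elements split into two pieces, this model charges the GPR moves twice when the node is passed on every call and once when it is passed only for the first piece.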

    In the future the cost of external elements is something to deal
    with globally but that would require the full SLP tree be available
    to costing.

    It's difficult to write a testcase: at the tipping point not
    vectorizing is better, so I'll follow up with x86-specific adjustments
    and will look into adding a testcase later.

            PR tree-optimization/109747
            * tree-vect-slp.cc (vect_prologue_cost_for_slp): Pass down
            the SLP node only once to the cost hook.
