https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84986
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
So we now have

  Scalar iteration cost: 4

computed from

  _3 = (long int) i_15;
  a_10 = a_16 - _3;

and ix86_add_stmt_cost has

50464           case NOP_EXPR:
50465             stmt_cost = 0;
50466             break;

for the HImode -> DImode sign-extension.  The scalar codegen generates
movswl so that's definitely not zero cost.  Likewise it would cost
float <-> double conversions as zero.  I think the only zero-cost
conversions are sign-conversions.

With that fixed we get

t.c:6:9: note: Cost model analysis:
  Vector inside of loop cost: 44
  Vector prologue cost: 64
  Vector epilogue cost: 44
  Scalar iteration cost: 8
  Scalar outside cost: 8
  Vector outside cost: 108
  prologue iterations: 0
  epilogue iterations: 4
  Calculated minimum iters for profitability: 32
t.c:6:9: note:   Runtime profitability threshold = 32
t.c:6:9: note:   Static estimate profitability threshold = 37

which is a worse estimate than the one from GCC7, which had

t.c:6:9: note: Cost model analysis:
  Vector inside of loop cost: 14
  Vector prologue cost: 14
  Vector epilogue cost: 23
  Scalar iteration cost: 5
  Scalar outside cost: 1
  Vector outside cost: 37
  prologue iterations: 0
  epilogue iterations: 4
  Calculated minimum iters for profitability: 9
t.c:6:9: note:   Runtime profitability threshold = 8
t.c:6:9: note:   Static estimate profitability threshold = 8

That's likely because of the relative increase of the prologue costs of
the vector constant loads, and because we do not account any cost for the
constants in the scalar code (x86 can handle constant operands in the
instructions for scalar code).
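For reference, the two "Calculated minimum iters for profitability" figures above can be reproduced with a small sketch of the break-even computation. This is a simplified reconstruction, not the vectorizer's actual code: the function name is made up for illustration, and the vectorization factor of 8 (eight HImode lanes in a 128-bit vector) is an assumption, since the dumps do not print it. Later clamping of the result is left out.

```c
/* Illustrative sketch only; min_profitable_iters is a hypothetical name.
   Returns the smallest iteration count at which the vector loop is
   estimated to be cheaper than the scalar loop, or -1 if it never is.
   VF (vectorization factor) is assumed by the caller, not taken from
   the dump.  */
int
min_profitable_iters (int scalar_iter_cost, int scalar_outside_cost,
                      int vec_inside_cost, int vec_outside_cost,
                      int prologue_iters, int epilogue_iters, int vf)
{
  /* One vector iteration replaces VF scalar iterations.  If it is not
     cheaper than those, vectorizing never pays off.  */
  if (scalar_iter_cost * vf <= vec_inside_cost)
    return -1;

  /* No extra setup cost for the vector loop: profitable immediately.  */
  if (vec_outside_cost <= 0)
    return 0;

  /* Extra one-time cost of the vector version, scaled by VF so the
     division below works on whole numbers.  */
  int outside_diff = (vec_outside_cost - scalar_outside_cost) * vf;

  int n = (outside_diff
           - vec_inside_cost * prologue_iters
           - vec_inside_cost * epilogue_iters)
          / (scalar_iter_cost * vf - vec_inside_cost);

  /* Integer division rounds down; bump N if the scalar loop is still
     no more expensive at that count.  */
  if (scalar_iter_cost * vf * n <= vec_inside_cost * n + outside_diff)
    n++;

  return n;
}
```

With vf = 8, the GCC7 costs (5, 1, 14, 37, 0 prologue and 4 epilogue iterations) give 9, and the costs after the fix (8, 8, 44, 108, 0, 4) give 32, matching the dumps. The larger "Vector outside cost" dominates the numerator, which is consistent with the prologue-cost explanation above.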

I am testing the following patch:

Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c      (revision 258674)
+++ gcc/config/i386/i386.c      (working copy)
@@ -50462,7 +50462,11 @@ ix86_add_stmt_cost (void *data, int coun
            }
          break;
        case NOP_EXPR:
-         stmt_cost = 0;
+         /* Only sign-conversions are free.  */
+         if (tree_nop_conversion_p
+               (TREE_TYPE (gimple_assign_lhs (stmt_info->stmt)),
+                TREE_TYPE (gimple_assign_rhs1 (stmt_info->stmt))))
+           stmt_cost = 0;
          break;
        case BIT_IOR_EXPR:
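
To illustrate the distinction the patch draws, here is a toy model of which conversions count as free. It is only a sketch under stated assumptions: struct toy_type and toy_nop_conversion_p are hypothetical stand-ins, not GCC's tree types or the real tree_nop_conversion_p, and the rule is reduced to "same kind of type and same precision".

```c
#include <stdbool.h>

/* Hypothetical stand-in for a type; not GCC's tree representation.  */
struct toy_type
{
  bool is_float;     /* floating-point rather than integer */
  bool is_unsigned;  /* signedness, meaningful for integers only */
  int precision;     /* width in bits */
};

/* Toy rule: a conversion needs no instruction when both types are of
   the same kind and have the same precision.  Signedness alone does
   not matter, so int <-> unsigned int is free, while a widening such
   as short -> long or float -> double is not.  */
bool
toy_nop_conversion_p (struct toy_type outer, struct toy_type inner)
{
  if (outer.is_float != inner.is_float)
    return false;
  return outer.precision == inner.precision;
}

/* Cost of a conversion statement in the toy model: zero for free
   conversions, otherwise the caller-supplied BASE_COST.  */
int
toy_conversion_cost (struct toy_type outer, struct toy_type inner,
                     int base_cost)
{
  return toy_nop_conversion_p (outer, inner) ? 0 : base_cost;
}
```

In this model the HImode -> DImode sign-extension from the testcase (16 bits to 64 bits) keeps its base cost, as does float -> double, while a pure signedness change costs nothing.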