https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68234
Bug ID: 68234 Summary: tree-vrp pass need to be improved when handling ASSERT/PLUS/MINUS/_EXPR and Phi node Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: jiwang at gcc dot gnu.org Target Milestone: --- Created attachment 36661 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36661&action=edit experimental-patch r226850 will cause gcc generate more ASSERT_EXPR thus trigger some latent issues in gcc tree-vrp pass. For example, we are generating sub-optimal code for gcc.c-torture/compile/20121027-1.c since r226850. For ARM target: ./cc1 -O3 -nostdinc 20121027-1.c -march=armv8-a -mthumb -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard -nostdinc before r226850: "bl/64" turned into "bl >> 6" == .L3: asrs r3, r4, #6 add r2, sp, #1024 adds r4, r4, #1 add r3, r2, r3, lsl #3 subw r3, r3, #1019 vld1.64 {d16}, [r3] vmov r0, r1, d16 @ int bl ff ldr r3, [r5] cmp r3, r4 bgt .L3 after ("bl/64" is not turned into "bl >> 6") === .L3: add r3, r4, #63 add r2, sp, #1024 ands r3, r3, r4, asr #32 it cc movcc r3, r4 adds r4, r4, #1 asrs r3, r3, #6 add r3, r2, r3, lsl #3 subw r3, r3, #1019 vld1.64 {d16}, [r3] vmov r0, r1, d16 @ int bl ff ldr r3, [r5] cmp r3, r4 bgt .L3 For mips target: ./cc1 -O2 -march=mips32r2 -mabi=32 20121027-1.c -nostdinc before === $L8: addiu $16,$16,1 sll $2,$2,3 addu $2,$17,$2 lwl $5,9($2) lwl $4,5($2) lwr $5,12($2) lwr $4,8($2) jal ff lw $2,0($18) slt $2,$16,$2 bne $2,$0,$L8 sra $2,$16,6 after === $L9: slt $2,$16,0 movz $3,$16,$2 addiu $16,$16,1 sra $2,$3,6 sll $2,$2,3 addu $2,$17,$2 lwl $5,9($2) lwl $4,5($2) lwr $5,12($2) lwr $4,8($2) jal ff lw $2,0($18) slt $2,$16,$2 bne $2,$0,$L9 addiu $3,$16,63 This is because previously gcc can deduct the value range for the variable "bl", and conclude it will be positive, then later optimization can turn the signed division to right shift thus we can avoid runtime overhead for signed division. After r226850, gcc can't deduct this. The initial cause if r226850 will introduce more ASSERT_EXPR which is fine but it caused problem for tree-vrp. From .vrp1 dump, tree-vrp pass is too conservative at three places: 1. When handling ASSERT_EXPR Visiting statement: c_13 = ASSERT_EXPR <c_1, c_1 < nc.0_4>; Intersecting [-INF, nc.0_4 + -1] and [0, 0] to [-INF, nc.0_4 + -1] the range info should be [0, nc.0_4 + -1]. my understanding is for ASSERT_EXPR <var, var < limit>, if var is SSA_NAME and have valid range info, then it's minimum should be honored. 2. When handling PLUS_EXPR with symbolic range After fixed issue 1, the range of c_13 will be [0, nc.0_4 + -1], but the following PLUS_EXPR range b1_11 to be VARYING which is wrong. Visiting statement: bl_11 = c_13 + 1; Found new range for bl_11: VARYING The range of bl_11 should be [1, nc.0_4]. looks to me the following code at the bottom of extract_range_from_binary_expr_1 is overkilling. At least for PLUS_EXPR/MINUS_EXPR. "cmp == -2" which means there is symbolic range, should be allowed for both, otherwise why there are lots of code in PLUS_EXPR/MINUS_EXPR hunk deliberately handling symbolic range? cmp = compare_values (min, max); if (cmp == -2 || cmp == 1) set_value_range_to_varying (vr); So I think we should relax the condition to not included PLUS_EXPR/MINUS_EXPR when cmp == -2. 3. When handling phi node Even after both 1 and 2 fixed, there still be another issue for phi node. Given, bl_11 now with the range [1, nc.0_4], I found it's range info is not honored when visiting PHI node, looks to me, the following code in vrp_visit_phi_node is overkilling, and I don't know how to relax it properly, if we simply remove this block of code, then gcc can finally get correct range info for the testcase 20121027-1.c and generate optimized instruction sequences. /* Do not allow equivalences or symbolic ranges to leak in from backedges. That creates invalid equivalencies. See PR53465 and PR54767. */ if (e->flags & EDGE_DFS_BACK) { if (vr_arg.type == VR_RANGE || vr_arg.type == VR_ANTI_RANGE) { vr_arg.equiv = NULL; if (symbolic_range_p (&vr_arg)) { vr_arg.type = VR_VARYING; vr_arg.min = NULL_TREE; vr_arg.max = NULL_TREE; } Visiting PHI node: c_1 = PHI <0(2), bl_11(3)> Argument #0 (2 -> 4 executable) 0: [0, 0] Argument #1 (3 -> 4 executable) bl_11: VARYING <--- bl_11 is with VR_RANGE, but forced to VARYING because of it's symbolic range Meeting [0, 0] and VARYING to VARYING attachment is experiment patch to show the bug places from my understanding. Toughts? -- Regards, Jiong