On Thu, May 25, 2017 at 5:15 PM, Bin.Cheng <amker.ch...@gmail.com> wrote: > On Wed, May 24, 2017 at 11:54 AM, Richard Sandiford > <richard.sandif...@linaro.org> wrote: >> "Bin.Cheng" <amker.ch...@gmail.com> writes: >>> On Tue, May 23, 2017 at 6:53 PM, Richard Sandiford >>> <richard.sandif...@linaro.org> wrote: >>>> AIUI, the reason the old code mishandled negative steps was that the >>>> associated segment lengths were stored as sizetype and so looked like >>>> big unsigned values. Those values therefore satisfied tree_fits_uhwi_p >>>> even though they were semantically negative. Is that right? >>> Yes, and the undesired wrapping behavior when such large unsigned hwi >>> is involved in computation. But I think there are possible leaks in >>> the code even after this patch, as embedded below. >>>> >>>> Assuming yes, and quoting the change as a context diff... >>>> >>>>> diff --git a/gcc/tree-data-ref.c b/gcc/tree-data-ref.c >>>>> index a5f8c1c..f0799d9 100644 >>>>> *** a/gcc/tree-data-ref.c >>>>> --- b/gcc/tree-data-ref.c >>>>> *************** >>>>> *** 1259,1264 **** >>>>> --- 1259,1273 ---- >>>>> != tree_int_cst_compare (DR_STEP (dr_a2->dr), size_zero_node)) >>>>> continue; >>>>> >>>>> + bool neg_step >>>>> + = (tree_int_cst_compare (DR_STEP (dr_a1->dr), size_zero_node) < >>>>> 0); >>>>> + >>>>> + /* DR_A1 and DR_A2 must have the same step if it's negative. */ >>>>> + if (neg_step >>>>> + && tree_int_cst_compare (DR_STEP (dr_a1->dr), >>>>> + DR_STEP (dr_a2->dr)) != 0) >>>>> + continue; >>>>> + >>>> >>>> [Why do they need to be the same step?] >>> There are two reasons. First is to simplify diff computation between >>> dr_a1 and dr_a2, otherwise we need to adjust diff for negative steps. >> >> What kind of adjustment would be needed? Could you give an example? > I handled negative step in updated patch by adjusting diff according > to access size of references.
It's quite hard to follow. Isn't it more correct to always extend seg-len by the access size? And only consider neg_step when deciding which pair to keep/extend (which DR_BASE_ADDRESS is eventually used)? >> >>> And wrapping behavior needs to be handled when adjusting diff with >>> steps. The second reason is not fully handled in this patch. We now >>> only set merged segment length to MAX only when both dr_a1->seg_len >>> and dr_a2->seg_len are constant, as below: >>> + if (tree_fits_uhwi_p (dr_a1->seg_len) >>> + && tree_fits_uhwi_p (dr_a2->seg_len)) >>> + new_seg_len >>> + = size_int (MAX (tree_to_uhwi (dr_a1->seg_len), >>> + diff + tree_to_uhwi (dr_a2->seg_len))); >>> + else >>> + new_seg_len >>> + = size_binop (PLUS_EXPR, dr_a2->seg_len, size_int (diff)); >>> In fact, we should do this for else branch too. with different steps, >>> it is still possible that dr_a1-seg_len > dr_a2->seg_len + diff. Here >>> I only restrict it to negative DR_STEP. Patch updated with >>> explanation in comment. >>>> >>>>> /* Make sure dr_a1 starts left of dr_a2. */ >>>>> if (tree_int_cst_lt (DR_INIT (dr_a2->dr), DR_INIT (dr_a1->dr))) >>>>> std::swap (*dr_a1, *dr_a2); >>>>> *************** >>>>> *** 1266,1325 **** >>>>> bool do_remove = false; >>>>> unsigned HOST_WIDE_INT diff >>>>> = (tree_to_shwi (DR_INIT (dr_a2->dr)) >>>>> ! - tree_to_shwi (DR_INIT (dr_a1->dr))); >>>>> >>>>> ! /* If the left segment does not extend beyond the start of the >>>>> ! right segment the new segment length is that of the right >>>>> ! plus the segment distance. */ >>>>> ! if (tree_fits_uhwi_p (dr_a1->seg_len) >>>>> ! && compare_tree_int (dr_a1->seg_len, diff) <= 0) >>>>> { >>>>> ! dr_a1->seg_len = size_binop (PLUS_EXPR, dr_a2->seg_len, >>>>> ! size_int (diff)); >>>>> ! do_remove = true; >>>>> } >>>>> ! /* Generally the new segment length is the maximum of the >>>>> ! left segment size and the right segment size plus the distance. >>>>> ! ??? We can also build tree MAX_EXPR here but it's not clear >>>>> this >>>>> ! is profitable. */ >>>>> ! else if (tree_fits_uhwi_p (dr_a1->seg_len) >>>>> ! && tree_fits_uhwi_p (dr_a2->seg_len)) >>>>> ! { >>>>> ! unsigned HOST_WIDE_INT seg_len_a1 = tree_to_uhwi >>>>> (dr_a1->seg_len); >>>>> ! unsigned HOST_WIDE_INT seg_len_a2 = tree_to_uhwi >>>>> (dr_a2->seg_len); >>>>> ! dr_a1->seg_len = size_int (MAX (seg_len_a1, diff + >>>>> seg_len_a2)); >>>>> ! do_remove = true; >>>>> ! } >>>>> ! /* Now we check if the following condition is satisfied: >>>>> >>>>> ! DIFF - SEGMENT_LENGTH_A < SEGMENT_LENGTH_B >>>>> >>>>> ! where DIFF = DR_A2_INIT - DR_A1_INIT. However, >>>>> ! SEGMENT_LENGTH_A or SEGMENT_LENGTH_B may not be constant so we >>>>> ! have to make a best estimation. We can get the minimum value >>>>> ! of SEGMENT_LENGTH_B as a constant, represented by >>>>> MIN_SEG_LEN_B, >>>>> ! then either of the following two conditions can guarantee the >>>>> ! one above: >>>>> >>>>> ! 1: DIFF <= MIN_SEG_LEN_B >>>>> ! 2: DIFF - SEGMENT_LENGTH_A < MIN_SEG_LEN_B */ >>>>> ! else >>>>> { >>>>> ! unsigned HOST_WIDE_INT min_seg_len_b >>>>> ! = (tree_fits_uhwi_p (dr_b1->seg_len) >>>>> ! ? tree_to_uhwi (dr_b1->seg_len) >>>>> ! : factor); >>>>> >>>>> if (diff <= min_seg_len_b >>>>> || (tree_fits_uhwi_p (dr_a1->seg_len) >>>>> ! && diff - tree_to_uhwi (dr_a1->seg_len) < >>>>> min_seg_len_b)) >>>>> { >>>>> ! dr_a1->seg_len = size_binop (PLUS_EXPR, >>>>> ! dr_a2->seg_len, size_int >>>>> (diff)); >>>>> do_remove = true; >>>>> } >>>>> } >>>>> >>>>> if (do_remove) >>>>> { >>>>> if (dump_enabled_p ()) >>>>> --- 1275,1375 ---- >>>>> bool do_remove = false; >>>>> unsigned HOST_WIDE_INT diff >>>>> = (tree_to_shwi (DR_INIT (dr_a2->dr)) >>>>> ! - tree_to_shwi (DR_INIT (dr_a1->dr))); >>>>> ! tree new_seg_len; >>>>> ! unsigned HOST_WIDE_INT min_seg_len_b; >>>>> >>>>> ! if (tree_fits_uhwi_p (dr_b1->seg_len)) >>>>> { >>>>> ! min_seg_len_b = tree_to_uhwi (dr_b1->seg_len); >>>>> ! if (tree_int_cst_sign_bit (dr_b1->seg_len)) >>>>> ! min_seg_len_b = 0 - min_seg_len_b; >>>>> } >>>>> ! else >>>>> ! min_seg_len_b = factor; >>>> >>>> ...I'm not sure how safe this or the later neg_step handling is >>>> for 16-bit and 32-bit sizetypes. It might be better to use wide_int >>> I think it could be a problem in case sizetype is smaller than >>> unsigned_type_for(ptr). >> >> But I think it would a problem even for "normal" 32-bit and 16-bit >> targets, because you're doing uhwi (i.e. 64-bit) negation on things that >> come from 32-bit and 16-bit unsigned values. E.g. a segment length of >> -32 on a 32-bit target would be 0xffffffe0. If you read that as a uhwi >> and negate it, you get 0xffffffff00000020 rather than 32. >> >> Using wide_ints would avoid that. I don't think the existing code >> needed it (because the existing code didn't handle negative steps >> properly at all). > Right, patch updated using wide_int to compare diff and compute merged > segment length. > Bootstrap and test on x86_64 and AArch64. > > Thanks, > bin >> >>>> instead, so that all arithmetic and comparisons happen in the precision >>>> of sizetype. >>> I was trying to make minimal refactoring for fixing the negative step >>> issue. Also I guess your SVE patches will rewrite this part entirely? >> >> Not sure TBH :-) I'll have to see how it works out when I merge it in. >> >> Thanks, >> Richard