"Bin.Cheng" <amker.ch...@gmail.com> writes:
> On Thu, May 4, 2017 at 11:06 AM, Richard Sandiford
> <richard.sandif...@linaro.org> wrote:
>> "Bin.Cheng" <amker.ch...@gmail.com> writes:
>>> On Wed, May 3, 2017 at 9:00 AM, Richard Sandiford
>>> <richard.sandif...@linaro.org> wrote:
>>>> Index: gcc/tree-data-ref.c
>>>> ===================================================================
>>>> --- gcc/tree-data-ref.c 2017-02-23 19:54:15.000000000 +0000
>>>> +++ gcc/tree-data-ref.c 2017-05-03 08:48:48.737038502 +0100
>>>> @@ -123,8 +123,7 @@ Software Foundation; either version 3, o
>>>>  } dependence_stats;
>>>>
>>>> static bool subscript_dependence_tester_1 (struct
> data_dependence_relation *,
>>>> -                                          struct data_reference *,
>>>> -                                          struct data_reference *,
>>>> +                                          unsigned int, unsigned int,
>>>>                                            struct loop *);
>>> As mentioned, how about passing access_fn directly, rather than less
>>> meaningful 0/1 values?
>>
>> The problem is that access_fn is a property of the individual
>> subscripts, whereas this is operating on a full data_reference.
>>
>> One alternative would be to use conditions like:
>>
>>   first_is_a ? SUB_ACCESS_FN_A (sub) : SUB_ACCESS_FN_B (sub)
>>
>> but IMO that's less readable than the existing:
>>
>>   SUB_ACCESS_FN (sub, index)
>>
>> Or we could have individual access_fn arrays for A and B, separate
>> from the main subscript array, but that would mean allocating three
>> arrays instead of one.
> Thanks for explanation, I see the problem now.  Even the latter
> sequence could be different for A and B, there should have the same
> number index?  If that's the case, is it possible just recording the
> starting position (or length) in DR_ACCESS_FN and use that information
> to access to access_fn vector.  This can save the copy in subscript.
> Anyway, this is not am important problem.

I think that would mean trading fields in the subscript for fields
in the main ddr structure.  One advantage of doing it in the subscript
is that those are freed after the analysis is complete, whereas the
ddr stays around until the caller has finished with it.

>>>> +     latter sequence.  */
>>>> +  unsigned int start_a = 0;
>>>> +  unsigned int start_b = 0;
>>>> +  unsigned int num_dimensions = 0;
>>>> +  unsigned int struct_start_a = 0;
>>>> +  unsigned int struct_start_b = 0;
>>>> +  unsigned int struct_num_dimensions = 0;
>>>> +  unsigned int index_a = 0;
>>>> +  unsigned int index_b = 0;
>>>> +  tree next_ref_a = DR_REF (a);
>>>> +  tree next_ref_b = DR_REF (b);
>>>> +  tree struct_ref_a = NULL_TREE;
>>>> +  tree struct_ref_b = NULL_TREE;
>>>> +  while (index_a < num_dimensions_a && index_b < num_dimensions_b)
>>>> +    {
>>>> +      gcc_checking_assert (handled_component_p (next_ref_a));
>>>> +      gcc_checking_assert (handled_component_p (next_ref_b));
>>>> +      tree outer_ref_a = TREE_OPERAND (next_ref_a, 0);
>>>> +      tree outer_ref_b = TREE_OPERAND (next_ref_b, 0);
>>>> +      tree type_a = TREE_TYPE (outer_ref_a);
>>>> +      tree type_b = TREE_TYPE (outer_ref_b);
>>>> +      if (types_compatible_p (type_a, type_b))
>>>> +       {
>>>> +         /* This pair of accesses belong to a suitable sequence.  */
>>>> +         if (start_a + num_dimensions != index_a
>>>> +             || start_b + num_dimensions != index_b)
>>>> +           {
>>>> +             /* Start a new sequence here.  */
>>>> +             start_a = index_a;
>>>> +             start_b = index_b;
>>>> +             num_dimensions = 0;
>>>> +           }
>>>> +         num_dimensions += 1;
>>>> +         if (TREE_CODE (type_a) == RECORD_TYPE)
>>>> +           {
>>>> +             struct_start_a = start_a;
>>>> +             struct_start_b = start_b;
>>>> +             struct_num_dimensions = num_dimensions;
>>>> +             struct_ref_a = outer_ref_a;
>>>> +             struct_ref_b = outer_ref_b;
>>>> +           }
>>>> +         next_ref_a = outer_ref_a;
>>>> +         next_ref_b = outer_ref_b;
>>>> +         index_a += 1;
>>>> +         index_b += 1;
>>>> +         continue;
>>>> +       }
>>>> +      /* Try to approach equal type sizes.  */
>>>> +      if (!COMPLETE_TYPE_P (type_a)
>>>> +         || !COMPLETE_TYPE_P (type_b)
>>>> +         || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a))
>>>> +         || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b)))
>>>> +       break;
>>>> + unsigned HOST_WIDE_INT size_a = tree_to_uhwi (TYPE_SIZE_UNIT
> (type_a));
>>>> + unsigned HOST_WIDE_INT size_b = tree_to_uhwi (TYPE_SIZE_UNIT
> (type_b));
>>>> +      if (size_a <= size_b)
>>>> +       {
>>>> +         index_a += 1;
>>>> +         next_ref_a = outer_ref_a;
>>>> +       }
>>>> +      if (size_b <= size_a)
>>>> +       {
>>>> +         index_b += 1;
>>>> +         next_ref_b = outer_ref_b;
>>>> +       }
>>>>      }
>>>>
>>>> -  /* If the references do not access the same object, we do not know
>>>> -     whether they alias or not.  We do not care about TBAA or alignment
>>>> -     info so we can use OEP_ADDRESS_OF to avoid false negatives.
>>>> -     But the accesses have to use compatible types as otherwise the
>>>> -     built indices would not match.  */
>>>> - if (!operand_equal_p (DR_BASE_OBJECT (a), DR_BASE_OBJECT (b),
>>> OEP_ADDRESS_OF)
>>>> -      || !types_compatible_p (TREE_TYPE (DR_BASE_OBJECT (a)),
>>>> -                             TREE_TYPE (DR_BASE_OBJECT (b))))
>>>> +  /* See whether the sequence ends at the base and whether the two bases
>>>> +     are equal.  We do not care about TBAA or alignment info so we can use
>>>> +     OEP_ADDRESS_OF to avoid false negatives.  */
>>>> +  tree base_a = DR_BASE_OBJECT (a);
>>>> +  tree base_b = DR_BASE_OBJECT (b);
>>>> +  bool same_base_p = (start_a + num_dimensions == num_dimensions_a
>>>> +                     && start_b + num_dimensions == num_dimensions_b
>>>> + && DR_UNCONSTRAINED_BASE (a) == DR_UNCONSTRAINED_BASE (b)
>>>> +                     && operand_equal_p (base_a, base_b, OEP_ADDRESS_OF)
>>>> +                     && types_compatible_p (TREE_TYPE (base_a),
>>>> +                                            TREE_TYPE (base_b))
>>>> +                     && (!loop_nest.exists ()
>>>> +                         || (object_address_invariant_in_loop_p
>>>> +                             (loop_nest[0], base_a))));
>>> Major change is in function initialize_data_dependence_relation in
>>> order to detect partial alias opportunity.  The original equality
>>> check on DR_BASE_OBJECT is bypassed now.  IMHO better to introduce a
>>> new parameter to compute_data_reference_for_loop etc., indicating
>>> whether we want to handle partial alias opportunity or not.  After
>>> all, such computation is unnecessary for predcom/prefetch/parloop.
>>> It's only a waste of time computing it.
>>
>> Well, it also means that we can now prove the accesses are independent
>> in more cases.  E.g. previously we would assume the a and b accesses in:
> Predcom cares about dependent refs with constant distance, so
> independent (neither possible dependent) information based on partial
> alias is not interested.

Yeah, but it checks for that itself, based on the individual data
references.  E.g.:

  if (!operand_equal_p (DR_STEP (a), DR_STEP (b), 0)
      || !operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0))
    return false;

This condition is independent of the ddr.

My point is that if we wanted to avoid doing redundant work in this kind
of situation, we should reorganise things so that we filter uninteresting
pairs of references out *before* creating the ddr.  Even if we passed
a flag down to initialize_data_dependence_relation, we'd still end up
with a ddr for the uninteresting pairs, just like we have now.  And if
you're using compute_all_dependences, those pairs still count towards
PARAM_LOOP_MAX_DATAREFS_FOR_DATADEPS.

(I think the only thing predcom uses the ddr for is to check for relations
that are known to be independent.  So creating specific distance and
direction vectors is itself redundant work.)

>>
>>   struct s { int x[16]; } *a, *b;
>>   for (int i = 0; i < 8; ++i)
>>     a->x[i] = b->x[i + 8];
>>
>> could conflict.
>>
>> If callers don't need to know what the relationship between a and b is,
>> I think they should check for that before going through the process of
>> initialising and analysing the ddr.
> This I don't think so.  Users don't have the information to
> pre-check/analyze reference pair.  Even it can do that by repeating
> most work as in data-ref-analyzer, it sounds not a good practice.
> That's exactly analyzer's job and the reason why interfaces like
> compute_data_dependence_for_loop are introduced.  It doesn't make much
> sense requiring users to do additional analysis before looking for
> help from data-ref-analyzer.

(The reply above was for this too.)

Thanks,
Richard

Reply via email to