will schmidt <will_schm...@vnet.ibm.com> writes:

Thanks!
> On Mon, 2020-07-06 at 15:13 +0800, guojiufu via Gcc-patches wrote:
>
> Hi,
>
> Assorted comments below.   thanks :-)
>
>> For very small loops (< 6 insns), it would be fine to unroll 4
>> times to use cache line better.  Like below loops:
>>  `while (i) a[--i] = NULL;   while (p < e)  *d++ = *p++;`
>> 
>> And for very complex loops which may cause negative impacts:
>> branch-miss or cache-miss. Like below loop: there are calls,
>> early exits and branches in loop.
>> ```
>>   for (int i = 0; i < n; i++) {
>>               int e = a[I];
>>              ....
>>               if (function_call(e))  break;
>>              ....
>>   }
>> ```
>> 
>> This patch enhances RTL unroll for small loops and prevent to
>> unroll complex loops.
>
> ok.
>
>> 
>> gcc/ChangeLog
>> 2020-07-03  Jiufu Guo  <guoji...@linux.ibm.com>
>> 
>>         * config/rs6000/rs6000.c (rs6000_loop_unroll_adjust): Refine hook.
>>         (rs6000_complex_loop_p): New function.
>>         (num_loop_calls): New function.
>
> Tabs versus spaces.
oh, thanks!

>
> (num_loop_calls): New function.
>
>
>> ---
>>  gcc/config/rs6000/rs6000.c | 46 +++++++++++++++++++++++++++++++++---
>> --
>>  1 file changed, 40 insertions(+), 6 deletions(-)
>> 
>> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
>> index 58f5d780603..a4874fa0efc 100644
>> --- a/gcc/config/rs6000/rs6000.c
>> +++ b/gcc/config/rs6000/rs6000.c
>> @@ -5130,22 +5130,56 @@ rs6000_destroy_cost_data (void *data)
>>    free (data);
>>  }
>> 
>> +/* Count the number of call insns in LOOP.  */
>> +static unsigned int
>> +num_loop_calls (struct loop *loop)
>> +{
>> +  basic_block *bbs;
>> +  rtx_insn *insn;
>> +  unsigned int i;
>> +  unsigned int call_ins_num = 0;
>> +
>> +  bbs = get_loop_body (loop);
>> +  for (i = 0; i < loop->num_nodes; i++)
>> +    FOR_BB_INSNS (bbs[i], insn)
>> +      if (CALL_P (insn))
>> +    call_ins_num++;
>> +
>> +  free (bbs);
>> +
>> +  return call_ins_num;
>> +}
>
> ok.
>
>
>> +
>> +/* Return true if LOOP is too complex to be unrolled.  */
>> +static bool
>> +rs6000_complex_loop_p (struct loop *loop)
>> +{
>> +  unsigned call_num;
>> +
>> +  return loop->ninsns > 10
>> +    && (call_num = num_loop_calls (loop)) > 0
>> +    && (call_num + num_loop_branches (loop)) * 5 > loop->ninsns
>> +    && !single_exit (loop);
>> +}
>> +
>
>
> The assignment to call_num within the logic there concerns me.  I'd
> break that out.
>
> The 5 value is not explicitly mentioned elsewhere.  Contextually this
> appears to be evaluating the ratio of branches versus instructions
> within the loop.  Could use some clarity.
Yes, it is 20%. Would make it clarity. Thanks, 
>
>
>
>>  /* Implement targetm.loop_unroll_adjust.  */
>> 
>>  static unsigned
>>  rs6000_loop_unroll_adjust (unsigned nunroll, struct loop *loop)
>>  {
>> -   if (unroll_only_small_loops)
>> +  if (unroll_only_small_loops)
>
> indentation fix looks ok.
>
>>      {
>> -      /* TODO: This is hardcoded to 10 right now.  It can be refined, for
>> -     example we may want to unroll very small loops more times (4 perhaps).
>> -     We also should use a PARAM for this.  */
>
> Still hardcoded values, and may still wish to eventually have this as a
> tunable param.   Probably OK to drop the 2nd sentence, but first and
> last sentences should probably stay.
>
>
>> +      if (loop->ninsns <= 6)
>> +    return MIN (4, nunroll);
>>        if (loop->ninsns <= 10)
>>      return MIN (2, nunroll);
>> -      else
>> -    return 0;
>> +
>> +      return 0;
>>      }
>
>
> ok
>
>> 
>> +  if (rs6000_complex_loop_p (loop))
>> +    return 0;
>> +
>>    return nunroll;
>>  }
>> 
>
> ok

Reply via email to