On Wed, Oct 24, 2012 at 3:42 AM, Joern Rennecke
<joern.renne...@embecosm.com> wrote:
> Quoting Richard Biener <richard.guent...@gmail.com>:
>
>> On Tue, Oct 16, 2012 at 9:35 PM, Joern Rennecke
>> <joern.renne...@embecosm.com> wrote:
>
> ..
>>>
>>> Well, we could split it anyway, and give ports without the need for
>>> multiple length attributes the benefit of the optimistic algorithm.
>>>
>>> I have attached a patch that implements this.
>>
>>
>> Looks reasonable to me, though I'm not familiar enough with the code
>> to approve it.
>
>
> Now that Richard Sandiford has reviewed that split-off part and it's in
> the source tree, we can return to the remaining functionality needed
> by for the ARC port.
>
>> I'd strongly suggest to try harder to make things work for you without
>> the new attribute even though I wasn't really able to follow your
>> reasoning
>> on why that wouldn't work.  It may be easier to motivate this change
>> once the port is in without that attribute so one can actually look at
>> the machine description and port details.
>
>
> Well, it doesn't simply drop in with the existing branch shortening -
> libgcc won't compile because of out-of-range branches.
> I tried to lump the length and lock_length atribute together, and that
> just gives genattrtab indigestion.  It sits there looping forever.
> I could start debugging this, but that would take an unknown amount of
> time, and then the review of the fix would take an unknown amount of time,
> and then the ARC port probably needs fixing up again because it just
> doesn't work right with these fudged lengths.  And even if we could get
> everything required in before the close of phase 1, the branch shortening
> would be substandard.
> It seems more productive to get the branch shortening working now.
> The lock_length atrtibute is completely optional, so no port maintainer
> would be forced to use the functionality if it's not desired.
>
> The issue is that the some instructions need to be aligned or unaligned
> for performance or in a few cases even for correctness.  Just inserting
> random nops would be a waste of code space and cycles, since most of the
> time, the desired (mis)alignment can be archived by selectively making
> a short instruction long.  If an instruction that is long once was forced
> to stay long henceforth, that would actually defeat the purpose of getting
> the desired alignment.  Then another short instruction - if it can be found
> -
> would need to be upsized.  So a size increase could ripple through as
> alignments are distorted.  The natural thing to do is really when the
> alignemnt changes is really to let the upsized instruction be short again.
> Only length-determined branch sizes have to be locked to avoid cycles.

Just to add some extra information, can you quote your ports
ADJUST_INSN_LENGTH and one example instruction with length/lock_length
attribute the above applies to?

> This is the documentation for the new role of the lock_length attribute
> (reduced from my previous attempt):
>
> @cindex lock_length
> Usually, when doing optimizing branch shortening, the instruction length
> is calculated by avaluating the @code{length} attribute, applying
> @code{ADJUST_INSN_LENGTH}, and taking the maximum of the resultant
> value and the length of the instruction in the previous iteration.

Which sounds straight-forward.  The docs of ADJUST_INSN_LENGTH
are not entirely clear, but it sounds like it may only increase length, correct?
I see that ADJUST_INSN_LENGTH is really not needed as everything
should be expressable in the length attribute of an instruction?

> If you define the @code{lock_length} attribute, the @code{lock_length}
> attribute will be evaluated, and then the maximum with of @code{lock_length}

with of?  I read it as 'of the'

> value from the previous iteration will be formed and saved.

So lock_length will only increase during iteration.

> Then the maximum of that value with the @code{length} attribute will
> be formed, and @code{ADJUST_INSN_LENGTH} will be applied.

ADJUST_INSN_LENGTH will be applied to the maximum?  What will
be the 'instruction length' equivalent to the simple non-lock_length case?
Is it the above, max (ADJUST_INSN_LENGTH (lock-length-max, length))?

> Thus, you can allow the length to vary downwards as well as upwards
> across iterations with suitable definitions of the @code{length} attribute
> and/or @code{ADJUST_INSN_LENGTH}.  Care has to be taken that this does not
> lead to infinite loops.

I don't see that you can shrink length with just suitable lock_length and
length attributes.  What seems to be the cruical difference is that you
apply ADJUST_INSN_LENGTH after combining lock-length-max and length.
But then you _decrease_ length with ADJUST_INSN_LENGHT ...

Maybe what you really want is ADJUST_INSN_LENGTH_AFTER which is
applied afterwards?  Thus,

> Usually, when doing optimizing branch shortening, the instruction length
> is calculated by avaluating the @code{length} attribute, applying
> @code{ADJUST_INSN_LENGTH}, and taking the maximum of the resultant
> value and the length of the instruction in the previous iteration.
> If @code{ADJUST_INSN_LENGTH_AFTER} is defined it is applied to the
> resulting instruction length.

?

Note that

> Care has to be taken that this does not
> lead to infinite loops.

doesn't convince me that is properly designed ;)

Thanks,
Richard.

>
> The new patch builds on this patch:
> http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01890.html
> as a prerequisite.
>
> build tested with libraries in revision 192654 for i686-pc-linux-gnu X
> mipsel-elf .
> bootstrapped in revision 192703 on i686-pc-linux-gnu;
> I've also successfully run config-list.mk with the patch applied to this
> revision.  The following ports had pre-existing failures, which are
> documented in the sub-PRs or PR47093/PR44756:
>
> alpha64-dec-vms alpha-dec-vms am33_2.0-linux arm-netbsdelf arm-wrs-vxworks
> avr-elf avr-rtems c6x-elf c6x-uclinux cr16-elf fr30-elf
> i686-interix3 --enable-obsolete i686-openbsd3.0 i686-pc-msdosdjgpp
> ia64-hp-vms iq2000-elf lm32-elf lm32-rtems lm32-uclinux mep-elf
> microblaze-elf microblaze-linux mn10300-elf moxie-elf moxie-rtems
> moxie-uclinux pdp11-aout picochip-elf --enable-obsolete rl78-elf
> rx-elf score-elf --enable-obsolete sh5el-netbsd sh64-elf --with-newlib
> sh64-linux sh64-netbsd sh-elf shle-linux sh-netbsdelf sh-rtems sh-superh-elf
> sh-wrs-vxworks tilegx-linux-gnu tilepro-linux-gnu vax-linux-gnu
> vax-netbsdelf
> vax-openbsd x86_64-knetbsd-gnu
>
>
> I'll be posting the ARC port shortly; it does not fit into a single 100 KB
> posting, so I'm thinking of splitting it in a configury patch and zx
> compressed files/tarballs for arc.c, arc.md, libgcc, and the rest of the
> port.
>
> 2012-10-22  Joern Rennecke  <joern.renne...@embecosm.com>
>
>         * doc/md.texi (node Defining Attributes): Add lock_length to
>         table of special attributes.
>         (node Insn Lengths): Document lock_length attribute.
>         * final.c (uid_lock_length): New variable.
>         (insn_max_length, get_attr_lock_length): New functions.
>         (get_attr_length): Use insn_max_length.
>         (get_attr_length_1): Use direct recursion rather than
>         calling get_attr_length.
>         (INSN_VARIABLE_LENGTH_P): Define.
>         (shorten_branches): Implement lock_length attribute functionality.
>         * genattrtab.c (lock_length_str): New variable.
>         (make_length_attrs): New parameter base.
>         (main): Initialize lock_length_str.
>         Generate lock_lengths attributes.
>         * genattr.c (gen_attr): Emit declarations for lock_length attribute
>         related functions.
>         (main): Emit stub defines for lock_length attribute related
> functions.
>
> Index: doc/md.texi
> ===================================================================
> --- doc/md.texi (revision 2660)
> +++ doc/md.texi (working copy)
> @@ -7515,6 +7515,9 @@ extern enum attr_type get_attr_type ();
>  code chunks.  This is especially important when verifying branch
>  distances. @xref{Insn Lengths}.
>
> +@item lock_length
> +The @code{lock_length} attribute.
> +
>  @item enabled
>  The @code{enabled} attribute can be defined to prevent certain
>  alternatives of an insn definition from being used during code
> @@ -8038,6 +8041,22 @@ (define_insn "jump"
>                        (const_int 6)))])
>  @end smallexample
>
> +@cindex lock_length
> +Usually, when doing optimizing branch shortening, the instruction length
> +is calculated by avaluating the @code{length} attribute, applying
> +@code{ADJUST_INSN_LENGTH}, and taking the maximum of the resultant
> +value and the length of the instruction in the previous iteration.
> +
> +If you define the @code{lock_length} attribute, the @code{lock_length}
> +attribute will be evaluated, and then the maximum with of
> @code{lock_length}
> +value from the previous iteration will be formed and saved.
> +Then the maximum of that value with the @code{length} attribute will
> +be formed, and @code{ADJUST_INSN_LENGTH} will be applied.
> +Thus, you can allow the length to vary downwards as well as upwards
> +across iterations with suitable definitions of the @code{length} attribute
> +and/or @code{ADJUST_INSN_LENGTH}.  Care has to be taken that this does not
> +lead to infinite loops.
> +
>  @end ifset
>  @ifset INTERNALS
>  @node Constant Attributes
> Index: final.c
> ===================================================================
> --- final.c     (revision 2660)
> +++ final.c     (working copy)
> @@ -308,6 +308,7 @@ dbr_sequence_length (void)
>     `insn_current_length'.  */
>
>  static int *insn_lengths;
> +static int *uid_lock_length;
>
>  VEC(int,heap) *insn_addresses_;
>
> @@ -430,14 +431,41 @@ get_attr_length_1 (rtx insn, int (*fallb
>    return length;
>  }
>
> +/* Calculate the maximum length of INSN.  */
> +static int
> +insn_max_length (rtx insn)
> +{
> +  int length, lock_length;
> +
> +  length = insn_default_length (insn);
> +  if (HAVE_ATTR_lock_length)
> +    {
> +      lock_length = insn_default_lock_length (insn);
> +      if (length < lock_length)
> +       length = lock_length;
> +    }
> +  return length;
> +}
> +
>  /* Obtain the current length of an insn.  If branch shortening has been
> done,
>     get its actual length.  Otherwise, get its maximum length.  */
>  int
>  get_attr_length (rtx insn)
>  {
> -  return get_attr_length_1 (insn, insn_default_length);
> +  return get_attr_length_1 (insn, insn_max_length);
>  }
>
> +int
> +get_attr_lock_length (rtx insn)
> +{
> +  if (uid_lock_length && insn_lengths_max_uid > INSN_UID (insn))
> +    return uid_lock_length[INSN_UID (insn)];
> +  return get_attr_length_1 (insn, insn_min_lock_length);
> +}
> +
> +#define INSN_VARIABLE_LENGTH_P(INSN) \
> +  (insn_variable_length_p (INSN) || insn_variable_lock_length_p (INSN))
> +
>  /* Obtain the current length of an insn.  If branch shortening has been
> done,
>     get its actual length.  Otherwise, get its minimum length.  */
>  int
> @@ -966,6 +994,11 @@ shorten_branches (rtx first)
>    /* Allocate the rest of the arrays.  */
>    insn_lengths = XNEWVEC (int, max_uid);
>    insn_lengths_max_uid = max_uid;
> +  if (HAVE_ATTR_lock_length)
> +    uid_lock_length = XCNEWVEC (int, max_uid);
> +  else
> +    uid_lock_length = insn_lengths;
> +
>    /* Syntax errors can lead to labels being outside of the main insn
> stream.
>       Initialize insn_addresses, so that we get reproducible results.  */
>    INSN_ADDRESSES_ALLOC (max_uid);
> @@ -1064,7 +1097,7 @@ shorten_branches (rtx first)
>  #endif /* CASE_VECTOR_SHORTEN_MODE */
>
>    /* Compute initial lengths, addresses, and varying flags for each insn.
> */
> -  int (*length_fun) (rtx) = increasing ? insn_min_length :
> insn_default_length;
> +  int (*length_fun) (rtx) = increasing ? insn_min_length : insn_max_length;
>
>    for (insn_current_address = 0, insn = first;
>         insn != 0;
> @@ -1106,7 +1139,7 @@ shorten_branches (rtx first)
>           /* Alignment is handled by ADDR_VEC_ALIGN.  */
>         }
>        else if (GET_CODE (body) == ASM_INPUT || asm_noperands (body) >= 0)
> -       insn_lengths[uid] = asm_insn_count (body) * insn_default_length
> (insn);
> +       insn_lengths[uid] = asm_insn_count (body) * insn_max_length (insn);
>        else if (GET_CODE (body) == SEQUENCE)
>         {
>           int i;
> @@ -1117,7 +1150,7 @@ shorten_branches (rtx first)
>           const_delay_slots = 0;
>  #endif
>           int (*inner_length_fun) (rtx)
> -           = const_delay_slots ? length_fun : insn_default_length;
> +           = const_delay_slots ? length_fun : insn_max_length;
>           /* Inside a delay slot sequence, we do not do any branch
> shortening
>              if the shortening could change the number of delay slots
>              of the branch.  */
> @@ -1130,7 +1163,7 @@ shorten_branches (rtx first)
>               if (GET_CODE (body) == ASM_INPUT
>                   || asm_noperands (PATTERN (XVECEXP (body, 0, i))) >= 0)
>                 inner_length = (asm_insn_count (PATTERN (inner_insn))
> -                               * insn_default_length (inner_insn));
> +                               * insn_max_length (inner_insn));
>               else
>                 inner_length = inner_length_fun (inner_insn);
>
> @@ -1138,7 +1171,7 @@ shorten_branches (rtx first)
>               if (const_delay_slots)
>                 {
>                   if ((varying_length[inner_uid]
> -                      = insn_variable_length_p (inner_insn)) != 0)
> +                      = INSN_VARIABLE_LENGTH_P (inner_insn)) != 0)
>                     varying_length[uid] = 1;
>                   INSN_ADDRESSES (inner_uid) = (insn_current_address
>                                                 + insn_lengths[uid]);
> @@ -1151,7 +1184,7 @@ shorten_branches (rtx first)
>        else if (GET_CODE (body) != USE && GET_CODE (body) != CLOBBER)
>         {
>           insn_lengths[uid] = length_fun (insn);
> -         varying_length[uid] = insn_variable_length_p (insn);
> +         varying_length[uid] = INSN_VARIABLE_LENGTH_P (insn);
>         }
>
>        /* If needed, do any adjustment.  */
> @@ -1354,7 +1387,7 @@ shorten_branches (rtx first)
>                 {
>                   rtx inner_insn = XVECEXP (body, 0, i);
>                   int inner_uid = INSN_UID (inner_insn);
> -                 int inner_length;
> +                 int inner_length, lock_length;
>
>                   INSN_ADDRESSES (inner_uid) = insn_current_address;
>
> @@ -1365,16 +1398,23 @@ shorten_branches (rtx first)
>                   else
>                     inner_length = insn_current_length (inner_insn);
>
> -                 if (inner_length != insn_lengths[inner_uid])
> +                 lock_length
> +                   = (HAVE_ATTR_lock_length
> +                      ? insn_current_lock_length (inner_insn) :
> inner_length);
> +                 if (lock_length != uid_lock_length[inner_uid])
>                     {
> -                     if (!increasing || inner_length >
> insn_lengths[inner_uid])
> +                     if (!increasing
> +                         || lock_length > uid_lock_length[inner_uid])
>                         {
> -                         insn_lengths[inner_uid] = inner_length;
> +                         uid_lock_length[inner_uid] = lock_length;
>                           something_changed = 1;
>                         }
>                       else
> -                       inner_length = insn_lengths[inner_uid];
> +                       lock_length = uid_lock_length[inner_uid];
>                     }
> +                 if (inner_length < lock_length)
> +                   inner_length = lock_length;
> +                 insn_lengths[inner_uid] = inner_length;
>                   insn_current_address += inner_length;
>                   new_length += inner_length;
>                 }
> @@ -1382,6 +1422,17 @@ shorten_branches (rtx first)
>           else
>             {
>               new_length = insn_current_length (insn);
> +             if (HAVE_ATTR_lock_length)
> +               {
> +                 int lock_length = insn_current_lock_length (insn);
> +
> +                 if (!increasing || lock_length > uid_lock_length[uid])
> +                   uid_lock_length[uid] = lock_length;
> +                 else
> +                   lock_length = uid_lock_length[uid];
> +                 if (new_length < lock_length)
> +                   new_length = lock_length;
> +               }
>               insn_current_address += new_length;
>             }
>
> @@ -1393,7 +1444,8 @@ shorten_branches (rtx first)
>  #endif
>
>           if (new_length != insn_lengths[uid]
> -             && (!increasing || new_length > insn_lengths[uid]))
> +             && (!increasing || HAVE_ATTR_lock_length
> +                 || new_length > insn_lengths[uid]))
>             {
>               insn_lengths[uid] = new_length;
>               something_changed = 1;
> @@ -1407,6 +1459,11 @@ shorten_branches (rtx first)
>      }
>
>    free (varying_length);
> +  if (HAVE_ATTR_lock_length)
> +    {
> +      free (uid_lock_length);
> +      uid_lock_length = 0;
> +    }
>  }
>
>  /* Given the body of an INSN known to be generated by an ASM statement,
> return
> Index: genattr.c
> ===================================================================
> --- genattr.c   (revision 2660)
> +++ genattr.c   (working copy)
> @@ -71,6 +71,10 @@ extern int insn_default_length (rtx);\n\
>  extern int insn_min_length (rtx);\n\
>  extern int insn_variable_length_p (rtx);\n\
>  extern int insn_current_length (rtx);\n\n\
> +extern int insn_default_lock_length (rtx);\n\
> +extern int insn_min_lock_length (rtx);\n\
> +extern int insn_variable_lock_length_p (rtx);\n\
> +extern int insn_current_lock_length (rtx);\n\n\
>  #include \"insn-addr.h\"\n");
>      }
>  }
> @@ -337,7 +341,8 @@ main (int argc, char **argv)
>      }
>
>    /* Special-purpose atributes should be tested with if, not #ifdef.  */
> -  const char * const special_attrs[] = { "length", "enabled", 0 };
> +  const char * const special_attrs[]
> +    = { "length", "lock_length", "enabled", 0 };
>    for (const char * const *p = special_attrs; *p; p++)
>      {
>        printf ("#ifndef HAVE_ATTR_%s\n"
> @@ -346,13 +351,19 @@ main (int argc, char **argv)
>      }
>    /* We make an exception here to provide stub definitions for
>       insn_*_length* functions.  */
> -  puts ("#if !HAVE_ATTR_length\n"
> -       "extern int hook_int_rtx_0 (rtx);\n"
> +  puts ("extern int hook_int_rtx_0 (rtx);\n"
> +       "#if !HAVE_ATTR_length\n"
>         "#define insn_default_length hook_int_rtx_0\n"
>         "#define insn_min_length hook_int_rtx_0\n"
>         "#define insn_variable_length_p hook_int_rtx_0\n"
>         "#define insn_current_length hook_int_rtx_0\n"
>         "#include \"insn-addr.h\"\n"
> +       "#endif\n"
> +       "#if !HAVE_ATTR_lock_length\n"
> +       "#define insn_default_lock_length hook_int_rtx_0\n"
> +       "#define insn_min_lock_length hook_int_rtx_0\n"
> +       "#define insn_variable_lock_length_p hook_int_rtx_0\n"
> +       "#define insn_current_lock_length hook_int_rtx_0\n"
>         "#endif\n");
>
>    /* Output flag masks for use by reorg.
> Index: genattrtab.c
> ===================================================================
> --- genattrtab.c        (revision 2660)
> +++ genattrtab.c        (working copy)
> @@ -242,6 +242,7 @@ struct attr_value_list **insn_code_value
>
>  static const char *alternative_name;
>  static const char *length_str;
> +static const char *lock_length_str;
>  static const char *delay_type_str;
>  static const char *delay_1_0_str;
>  static const char *num_delay_slots_str;
> @@ -1541,14 +1542,14 @@ substitute_address (rtx exp, rtx (*no_ad
>    */
>
>  static void
> -make_length_attrs (void)
> +make_length_attrs (const char **base)
>  {
>    static const char *new_names[] =
>      {
> -      "*insn_default_length",
> -      "*insn_min_length",
> -      "*insn_variable_length_p",
> -      "*insn_current_length"
> +      "*insn_default_%s",
> +      "*insn_min_%s",
> +      "*insn_variable_%s_p",
> +      "*insn_current_%s"
>      };
>    static rtx (*const no_address_fn[]) (rtx)
>      = {identity_fn,identity_fn, zero_fn, zero_fn};
> @@ -1561,7 +1562,7 @@ make_length_attrs (void)
>
>    /* See if length attribute is defined.  If so, it must be numeric.  Make
>       it special so we don't output anything for it.  */
> -  length_attr = find_attr (&length_str, 0);
> +  length_attr = find_attr (base, 0);
>    if (length_attr == 0)
>      return;
>
> @@ -1574,11 +1575,14 @@ make_length_attrs (void)
>    /* Make each new attribute, in turn.  */
>    for (i = 0; i < ARRAY_SIZE (new_names); i++)
>      {
> -      make_internal_attr (new_names[i],
> +      const char *p = attr_printf (strlen (new_names[i]) - 2 + strlen
> (*base),
> +                                  new_names[i], *base);
> +
> +      make_internal_attr (p,
>                           substitute_address
> (length_attr->default_val->value,
>                                               no_address_fn[i],
> address_fn[i]),
>                           ATTR_NONE);
> -      new_attr = find_attr (&new_names[i], 0);
> +      new_attr = find_attr (&p, 0);
>        for (av = length_attr->first_value; av; av = av->next)
>         for (ie = av->first_insn; ie; ie = ie->next)
>           {
> @@ -5179,6 +5183,7 @@ main (int argc, char **argv)
>
>    alternative_name = DEF_ATTR_STRING ("alternative");
>    length_str = DEF_ATTR_STRING ("length");
> +  lock_length_str = DEF_ATTR_STRING ("lock_length");
>    delay_type_str = DEF_ATTR_STRING ("*delay_type");
>    delay_1_0_str = DEF_ATTR_STRING ("*delay_1_0");
>    num_delay_slots_str = DEF_ATTR_STRING ("*num_delay_slots");
> @@ -5275,7 +5280,8 @@ main (int argc, char **argv)
>        fill_attr (attr);
>
>    /* Construct extra attributes for `length'.  */
> -  make_length_attrs ();
> +  make_length_attrs (&length_str);
> +  make_length_attrs (&lock_length_str);
>
>    /* Perform any possible optimizations to speed up compilation.  */
>    optimize_attrs ();
>

Reply via email to