On Wed, Oct 24, 2012 at 3:42 AM, Joern Rennecke <joern.renne...@embecosm.com> wrote: > Quoting Richard Biener <richard.guent...@gmail.com>: > >> On Tue, Oct 16, 2012 at 9:35 PM, Joern Rennecke >> <joern.renne...@embecosm.com> wrote: > > .. >>> >>> Well, we could split it anyway, and give ports without the need for >>> multiple length attributes the benefit of the optimistic algorithm. >>> >>> I have attached a patch that implements this. >> >> >> Looks reasonable to me, though I'm not familiar enough with the code >> to approve it. > > > Now that Richard Sandiford has reviewed that split-off part and it's in > the source tree, we can return to the remaining functionality needed > by for the ARC port. > >> I'd strongly suggest to try harder to make things work for you without >> the new attribute even though I wasn't really able to follow your >> reasoning >> on why that wouldn't work. It may be easier to motivate this change >> once the port is in without that attribute so one can actually look at >> the machine description and port details. > > > Well, it doesn't simply drop in with the existing branch shortening - > libgcc won't compile because of out-of-range branches. > I tried to lump the length and lock_length atribute together, and that > just gives genattrtab indigestion. It sits there looping forever. > I could start debugging this, but that would take an unknown amount of > time, and then the review of the fix would take an unknown amount of time, > and then the ARC port probably needs fixing up again because it just > doesn't work right with these fudged lengths. And even if we could get > everything required in before the close of phase 1, the branch shortening > would be substandard. > It seems more productive to get the branch shortening working now. > The lock_length atrtibute is completely optional, so no port maintainer > would be forced to use the functionality if it's not desired. > > The issue is that the some instructions need to be aligned or unaligned > for performance or in a few cases even for correctness. Just inserting > random nops would be a waste of code space and cycles, since most of the > time, the desired (mis)alignment can be archived by selectively making > a short instruction long. If an instruction that is long once was forced > to stay long henceforth, that would actually defeat the purpose of getting > the desired alignment. Then another short instruction - if it can be found > - > would need to be upsized. So a size increase could ripple through as > alignments are distorted. The natural thing to do is really when the > alignemnt changes is really to let the upsized instruction be short again. > Only length-determined branch sizes have to be locked to avoid cycles.
Just to add some extra information, can you quote your ports ADJUST_INSN_LENGTH and one example instruction with length/lock_length attribute the above applies to? > This is the documentation for the new role of the lock_length attribute > (reduced from my previous attempt): > > @cindex lock_length > Usually, when doing optimizing branch shortening, the instruction length > is calculated by avaluating the @code{length} attribute, applying > @code{ADJUST_INSN_LENGTH}, and taking the maximum of the resultant > value and the length of the instruction in the previous iteration. Which sounds straight-forward. The docs of ADJUST_INSN_LENGTH are not entirely clear, but it sounds like it may only increase length, correct? I see that ADJUST_INSN_LENGTH is really not needed as everything should be expressable in the length attribute of an instruction? > If you define the @code{lock_length} attribute, the @code{lock_length} > attribute will be evaluated, and then the maximum with of @code{lock_length} with of? I read it as 'of the' > value from the previous iteration will be formed and saved. So lock_length will only increase during iteration. > Then the maximum of that value with the @code{length} attribute will > be formed, and @code{ADJUST_INSN_LENGTH} will be applied. ADJUST_INSN_LENGTH will be applied to the maximum? What will be the 'instruction length' equivalent to the simple non-lock_length case? Is it the above, max (ADJUST_INSN_LENGTH (lock-length-max, length))? > Thus, you can allow the length to vary downwards as well as upwards > across iterations with suitable definitions of the @code{length} attribute > and/or @code{ADJUST_INSN_LENGTH}. Care has to be taken that this does not > lead to infinite loops. I don't see that you can shrink length with just suitable lock_length and length attributes. What seems to be the cruical difference is that you apply ADJUST_INSN_LENGTH after combining lock-length-max and length. But then you _decrease_ length with ADJUST_INSN_LENGHT ... Maybe what you really want is ADJUST_INSN_LENGTH_AFTER which is applied afterwards? Thus, > Usually, when doing optimizing branch shortening, the instruction length > is calculated by avaluating the @code{length} attribute, applying > @code{ADJUST_INSN_LENGTH}, and taking the maximum of the resultant > value and the length of the instruction in the previous iteration. > If @code{ADJUST_INSN_LENGTH_AFTER} is defined it is applied to the > resulting instruction length. ? Note that > Care has to be taken that this does not > lead to infinite loops. doesn't convince me that is properly designed ;) Thanks, Richard. > > The new patch builds on this patch: > http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01890.html > as a prerequisite. > > build tested with libraries in revision 192654 for i686-pc-linux-gnu X > mipsel-elf . > bootstrapped in revision 192703 on i686-pc-linux-gnu; > I've also successfully run config-list.mk with the patch applied to this > revision. The following ports had pre-existing failures, which are > documented in the sub-PRs or PR47093/PR44756: > > alpha64-dec-vms alpha-dec-vms am33_2.0-linux arm-netbsdelf arm-wrs-vxworks > avr-elf avr-rtems c6x-elf c6x-uclinux cr16-elf fr30-elf > i686-interix3 --enable-obsolete i686-openbsd3.0 i686-pc-msdosdjgpp > ia64-hp-vms iq2000-elf lm32-elf lm32-rtems lm32-uclinux mep-elf > microblaze-elf microblaze-linux mn10300-elf moxie-elf moxie-rtems > moxie-uclinux pdp11-aout picochip-elf --enable-obsolete rl78-elf > rx-elf score-elf --enable-obsolete sh5el-netbsd sh64-elf --with-newlib > sh64-linux sh64-netbsd sh-elf shle-linux sh-netbsdelf sh-rtems sh-superh-elf > sh-wrs-vxworks tilegx-linux-gnu tilepro-linux-gnu vax-linux-gnu > vax-netbsdelf > vax-openbsd x86_64-knetbsd-gnu > > > I'll be posting the ARC port shortly; it does not fit into a single 100 KB > posting, so I'm thinking of splitting it in a configury patch and zx > compressed files/tarballs for arc.c, arc.md, libgcc, and the rest of the > port. > > 2012-10-22 Joern Rennecke <joern.renne...@embecosm.com> > > * doc/md.texi (node Defining Attributes): Add lock_length to > table of special attributes. > (node Insn Lengths): Document lock_length attribute. > * final.c (uid_lock_length): New variable. > (insn_max_length, get_attr_lock_length): New functions. > (get_attr_length): Use insn_max_length. > (get_attr_length_1): Use direct recursion rather than > calling get_attr_length. > (INSN_VARIABLE_LENGTH_P): Define. > (shorten_branches): Implement lock_length attribute functionality. > * genattrtab.c (lock_length_str): New variable. > (make_length_attrs): New parameter base. > (main): Initialize lock_length_str. > Generate lock_lengths attributes. > * genattr.c (gen_attr): Emit declarations for lock_length attribute > related functions. > (main): Emit stub defines for lock_length attribute related > functions. > > Index: doc/md.texi > =================================================================== > --- doc/md.texi (revision 2660) > +++ doc/md.texi (working copy) > @@ -7515,6 +7515,9 @@ extern enum attr_type get_attr_type (); > code chunks. This is especially important when verifying branch > distances. @xref{Insn Lengths}. > > +@item lock_length > +The @code{lock_length} attribute. > + > @item enabled > The @code{enabled} attribute can be defined to prevent certain > alternatives of an insn definition from being used during code > @@ -8038,6 +8041,22 @@ (define_insn "jump" > (const_int 6)))]) > @end smallexample > > +@cindex lock_length > +Usually, when doing optimizing branch shortening, the instruction length > +is calculated by avaluating the @code{length} attribute, applying > +@code{ADJUST_INSN_LENGTH}, and taking the maximum of the resultant > +value and the length of the instruction in the previous iteration. > + > +If you define the @code{lock_length} attribute, the @code{lock_length} > +attribute will be evaluated, and then the maximum with of > @code{lock_length} > +value from the previous iteration will be formed and saved. > +Then the maximum of that value with the @code{length} attribute will > +be formed, and @code{ADJUST_INSN_LENGTH} will be applied. > +Thus, you can allow the length to vary downwards as well as upwards > +across iterations with suitable definitions of the @code{length} attribute > +and/or @code{ADJUST_INSN_LENGTH}. Care has to be taken that this does not > +lead to infinite loops. > + > @end ifset > @ifset INTERNALS > @node Constant Attributes > Index: final.c > =================================================================== > --- final.c (revision 2660) > +++ final.c (working copy) > @@ -308,6 +308,7 @@ dbr_sequence_length (void) > `insn_current_length'. */ > > static int *insn_lengths; > +static int *uid_lock_length; > > VEC(int,heap) *insn_addresses_; > > @@ -430,14 +431,41 @@ get_attr_length_1 (rtx insn, int (*fallb > return length; > } > > +/* Calculate the maximum length of INSN. */ > +static int > +insn_max_length (rtx insn) > +{ > + int length, lock_length; > + > + length = insn_default_length (insn); > + if (HAVE_ATTR_lock_length) > + { > + lock_length = insn_default_lock_length (insn); > + if (length < lock_length) > + length = lock_length; > + } > + return length; > +} > + > /* Obtain the current length of an insn. If branch shortening has been > done, > get its actual length. Otherwise, get its maximum length. */ > int > get_attr_length (rtx insn) > { > - return get_attr_length_1 (insn, insn_default_length); > + return get_attr_length_1 (insn, insn_max_length); > } > > +int > +get_attr_lock_length (rtx insn) > +{ > + if (uid_lock_length && insn_lengths_max_uid > INSN_UID (insn)) > + return uid_lock_length[INSN_UID (insn)]; > + return get_attr_length_1 (insn, insn_min_lock_length); > +} > + > +#define INSN_VARIABLE_LENGTH_P(INSN) \ > + (insn_variable_length_p (INSN) || insn_variable_lock_length_p (INSN)) > + > /* Obtain the current length of an insn. If branch shortening has been > done, > get its actual length. Otherwise, get its minimum length. */ > int > @@ -966,6 +994,11 @@ shorten_branches (rtx first) > /* Allocate the rest of the arrays. */ > insn_lengths = XNEWVEC (int, max_uid); > insn_lengths_max_uid = max_uid; > + if (HAVE_ATTR_lock_length) > + uid_lock_length = XCNEWVEC (int, max_uid); > + else > + uid_lock_length = insn_lengths; > + > /* Syntax errors can lead to labels being outside of the main insn > stream. > Initialize insn_addresses, so that we get reproducible results. */ > INSN_ADDRESSES_ALLOC (max_uid); > @@ -1064,7 +1097,7 @@ shorten_branches (rtx first) > #endif /* CASE_VECTOR_SHORTEN_MODE */ > > /* Compute initial lengths, addresses, and varying flags for each insn. > */ > - int (*length_fun) (rtx) = increasing ? insn_min_length : > insn_default_length; > + int (*length_fun) (rtx) = increasing ? insn_min_length : insn_max_length; > > for (insn_current_address = 0, insn = first; > insn != 0; > @@ -1106,7 +1139,7 @@ shorten_branches (rtx first) > /* Alignment is handled by ADDR_VEC_ALIGN. */ > } > else if (GET_CODE (body) == ASM_INPUT || asm_noperands (body) >= 0) > - insn_lengths[uid] = asm_insn_count (body) * insn_default_length > (insn); > + insn_lengths[uid] = asm_insn_count (body) * insn_max_length (insn); > else if (GET_CODE (body) == SEQUENCE) > { > int i; > @@ -1117,7 +1150,7 @@ shorten_branches (rtx first) > const_delay_slots = 0; > #endif > int (*inner_length_fun) (rtx) > - = const_delay_slots ? length_fun : insn_default_length; > + = const_delay_slots ? length_fun : insn_max_length; > /* Inside a delay slot sequence, we do not do any branch > shortening > if the shortening could change the number of delay slots > of the branch. */ > @@ -1130,7 +1163,7 @@ shorten_branches (rtx first) > if (GET_CODE (body) == ASM_INPUT > || asm_noperands (PATTERN (XVECEXP (body, 0, i))) >= 0) > inner_length = (asm_insn_count (PATTERN (inner_insn)) > - * insn_default_length (inner_insn)); > + * insn_max_length (inner_insn)); > else > inner_length = inner_length_fun (inner_insn); > > @@ -1138,7 +1171,7 @@ shorten_branches (rtx first) > if (const_delay_slots) > { > if ((varying_length[inner_uid] > - = insn_variable_length_p (inner_insn)) != 0) > + = INSN_VARIABLE_LENGTH_P (inner_insn)) != 0) > varying_length[uid] = 1; > INSN_ADDRESSES (inner_uid) = (insn_current_address > + insn_lengths[uid]); > @@ -1151,7 +1184,7 @@ shorten_branches (rtx first) > else if (GET_CODE (body) != USE && GET_CODE (body) != CLOBBER) > { > insn_lengths[uid] = length_fun (insn); > - varying_length[uid] = insn_variable_length_p (insn); > + varying_length[uid] = INSN_VARIABLE_LENGTH_P (insn); > } > > /* If needed, do any adjustment. */ > @@ -1354,7 +1387,7 @@ shorten_branches (rtx first) > { > rtx inner_insn = XVECEXP (body, 0, i); > int inner_uid = INSN_UID (inner_insn); > - int inner_length; > + int inner_length, lock_length; > > INSN_ADDRESSES (inner_uid) = insn_current_address; > > @@ -1365,16 +1398,23 @@ shorten_branches (rtx first) > else > inner_length = insn_current_length (inner_insn); > > - if (inner_length != insn_lengths[inner_uid]) > + lock_length > + = (HAVE_ATTR_lock_length > + ? insn_current_lock_length (inner_insn) : > inner_length); > + if (lock_length != uid_lock_length[inner_uid]) > { > - if (!increasing || inner_length > > insn_lengths[inner_uid]) > + if (!increasing > + || lock_length > uid_lock_length[inner_uid]) > { > - insn_lengths[inner_uid] = inner_length; > + uid_lock_length[inner_uid] = lock_length; > something_changed = 1; > } > else > - inner_length = insn_lengths[inner_uid]; > + lock_length = uid_lock_length[inner_uid]; > } > + if (inner_length < lock_length) > + inner_length = lock_length; > + insn_lengths[inner_uid] = inner_length; > insn_current_address += inner_length; > new_length += inner_length; > } > @@ -1382,6 +1422,17 @@ shorten_branches (rtx first) > else > { > new_length = insn_current_length (insn); > + if (HAVE_ATTR_lock_length) > + { > + int lock_length = insn_current_lock_length (insn); > + > + if (!increasing || lock_length > uid_lock_length[uid]) > + uid_lock_length[uid] = lock_length; > + else > + lock_length = uid_lock_length[uid]; > + if (new_length < lock_length) > + new_length = lock_length; > + } > insn_current_address += new_length; > } > > @@ -1393,7 +1444,8 @@ shorten_branches (rtx first) > #endif > > if (new_length != insn_lengths[uid] > - && (!increasing || new_length > insn_lengths[uid])) > + && (!increasing || HAVE_ATTR_lock_length > + || new_length > insn_lengths[uid])) > { > insn_lengths[uid] = new_length; > something_changed = 1; > @@ -1407,6 +1459,11 @@ shorten_branches (rtx first) > } > > free (varying_length); > + if (HAVE_ATTR_lock_length) > + { > + free (uid_lock_length); > + uid_lock_length = 0; > + } > } > > /* Given the body of an INSN known to be generated by an ASM statement, > return > Index: genattr.c > =================================================================== > --- genattr.c (revision 2660) > +++ genattr.c (working copy) > @@ -71,6 +71,10 @@ extern int insn_default_length (rtx);\n\ > extern int insn_min_length (rtx);\n\ > extern int insn_variable_length_p (rtx);\n\ > extern int insn_current_length (rtx);\n\n\ > +extern int insn_default_lock_length (rtx);\n\ > +extern int insn_min_lock_length (rtx);\n\ > +extern int insn_variable_lock_length_p (rtx);\n\ > +extern int insn_current_lock_length (rtx);\n\n\ > #include \"insn-addr.h\"\n"); > } > } > @@ -337,7 +341,8 @@ main (int argc, char **argv) > } > > /* Special-purpose atributes should be tested with if, not #ifdef. */ > - const char * const special_attrs[] = { "length", "enabled", 0 }; > + const char * const special_attrs[] > + = { "length", "lock_length", "enabled", 0 }; > for (const char * const *p = special_attrs; *p; p++) > { > printf ("#ifndef HAVE_ATTR_%s\n" > @@ -346,13 +351,19 @@ main (int argc, char **argv) > } > /* We make an exception here to provide stub definitions for > insn_*_length* functions. */ > - puts ("#if !HAVE_ATTR_length\n" > - "extern int hook_int_rtx_0 (rtx);\n" > + puts ("extern int hook_int_rtx_0 (rtx);\n" > + "#if !HAVE_ATTR_length\n" > "#define insn_default_length hook_int_rtx_0\n" > "#define insn_min_length hook_int_rtx_0\n" > "#define insn_variable_length_p hook_int_rtx_0\n" > "#define insn_current_length hook_int_rtx_0\n" > "#include \"insn-addr.h\"\n" > + "#endif\n" > + "#if !HAVE_ATTR_lock_length\n" > + "#define insn_default_lock_length hook_int_rtx_0\n" > + "#define insn_min_lock_length hook_int_rtx_0\n" > + "#define insn_variable_lock_length_p hook_int_rtx_0\n" > + "#define insn_current_lock_length hook_int_rtx_0\n" > "#endif\n"); > > /* Output flag masks for use by reorg. > Index: genattrtab.c > =================================================================== > --- genattrtab.c (revision 2660) > +++ genattrtab.c (working copy) > @@ -242,6 +242,7 @@ struct attr_value_list **insn_code_value > > static const char *alternative_name; > static const char *length_str; > +static const char *lock_length_str; > static const char *delay_type_str; > static const char *delay_1_0_str; > static const char *num_delay_slots_str; > @@ -1541,14 +1542,14 @@ substitute_address (rtx exp, rtx (*no_ad > */ > > static void > -make_length_attrs (void) > +make_length_attrs (const char **base) > { > static const char *new_names[] = > { > - "*insn_default_length", > - "*insn_min_length", > - "*insn_variable_length_p", > - "*insn_current_length" > + "*insn_default_%s", > + "*insn_min_%s", > + "*insn_variable_%s_p", > + "*insn_current_%s" > }; > static rtx (*const no_address_fn[]) (rtx) > = {identity_fn,identity_fn, zero_fn, zero_fn}; > @@ -1561,7 +1562,7 @@ make_length_attrs (void) > > /* See if length attribute is defined. If so, it must be numeric. Make > it special so we don't output anything for it. */ > - length_attr = find_attr (&length_str, 0); > + length_attr = find_attr (base, 0); > if (length_attr == 0) > return; > > @@ -1574,11 +1575,14 @@ make_length_attrs (void) > /* Make each new attribute, in turn. */ > for (i = 0; i < ARRAY_SIZE (new_names); i++) > { > - make_internal_attr (new_names[i], > + const char *p = attr_printf (strlen (new_names[i]) - 2 + strlen > (*base), > + new_names[i], *base); > + > + make_internal_attr (p, > substitute_address > (length_attr->default_val->value, > no_address_fn[i], > address_fn[i]), > ATTR_NONE); > - new_attr = find_attr (&new_names[i], 0); > + new_attr = find_attr (&p, 0); > for (av = length_attr->first_value; av; av = av->next) > for (ie = av->first_insn; ie; ie = ie->next) > { > @@ -5179,6 +5183,7 @@ main (int argc, char **argv) > > alternative_name = DEF_ATTR_STRING ("alternative"); > length_str = DEF_ATTR_STRING ("length"); > + lock_length_str = DEF_ATTR_STRING ("lock_length"); > delay_type_str = DEF_ATTR_STRING ("*delay_type"); > delay_1_0_str = DEF_ATTR_STRING ("*delay_1_0"); > num_delay_slots_str = DEF_ATTR_STRING ("*num_delay_slots"); > @@ -5275,7 +5280,8 @@ main (int argc, char **argv) > fill_attr (attr); > > /* Construct extra attributes for `length'. */ > - make_length_attrs (); > + make_length_attrs (&length_str); > + make_length_attrs (&lock_length_str); > > /* Perform any possible optimizations to speed up compilation. */ > optimize_attrs (); >