On Wed, Nov 28, 2018 at 07:08:02AM -0600, Philipp Tomsich wrote: > > > On 28.11.2018, at 13:10, Richard Earnshaw (lists) > <richard.earns...@arm.com<mailto:richard.earns...@arm.com>> wrote: > > On 26/11/2018 19:50, Christoph Muellner wrote: > The aarch64 ISA specification allows a left shift amount to be applied > after extension in the range of 0 to 4 (encoded in the imm3 field). > > This is true for at least the following instructions: > > * ADD (extend register) > * ADDS (extended register) > * SUB (extended register) > > The result of this patch can be seen, when compiling the following code: > > uint64_t myadd(uint64_t a, uint64_t b) > { > return a+(((uint8_t)b)<<4); > } > > Without the patch the following sequence will be generated: > > 0000000000000000 <myadd>: > 0: d37c1c21 ubfiz x1, x1, #4, #8 > 4: 8b000020 add x0, x1, x0 > 8: d65f03c0 ret > > With the patch the ubfiz will be merged into the add instruction: > > 0000000000000000 <myadd>: > 0: 8b211000 add x0, x0, w1, uxtb #4 > 4: d65f03c0 ret > > Tested with "make check" and no regressions found. > > *** gcc/ChangeLog *** > > 2018-xx-xx Christoph Muellner > <christoph.muell...@theobroma-systems.com<mailto:christoph.muell...@theobroma-systems.com>> > Philipp Tomsich > <philipp.toms...@theobroma-systems.com<mailto:philipp.toms...@theobroma-systems.com>> > > * config/aarch64/aarch64.c (aarch64_uxt_size): Correct the maximum > shift amount for shifted operands. > > *** gcc/testsuite/ChangeLog *** > > 2018-xx-xx Christoph Muellner > <christoph.muell...@theobroma-systems.com<mailto:christoph.muell...@theobroma-systems.com>> > Philipp Tomsich > <philipp.toms...@theobroma-systems.com<mailto:philipp.toms...@theobroma-systems.com>> > > * gcc.target/aarch64/extend.c: Adjust the testcases to cover > the changed shift amount. > > > This is OK. Thanks. > > R. > > PS, I was sufficiently surprised by this that I went and checked the > original commit (it's not an obvious off-by-one error). But it does > appear that it's been this way since the code was originally added > (prior to the initial publication of the port) and there's no obvious > reason why. > > Since we don’t have any Reported-by: tags in GCC: the credit for initially > finding and > reporting this goes to AppliedMicro's original chip verification team for the > XGene1.
Good spot that team! This also took me down a rabbit hole of Architecture Reference Manuals and old source trees wondering how we got off by one. My best guess is that 0-3 just feels like the more natural range than 0-4... Thanks for the patch and the distraction! James