RE: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass

Nidal Faour Thu, 31 Oct 2019 02:42:10 -0700

A tests case for this patch has been written and pushed to WD github repository 
at the following link:
https://github.com/westerndigitalcorporation/riscv32-Code-density-test-bench/tree/master/common_load_store_base_address_offset


the test case itself has been written based on a real product scenario. 
We got a saving of about 10% in code size of the test itself.

Best Regards,

Nidal Faour
Staff Engineer, R&D Engineering – Firmware & Toolchain, CTO Group

Western Digital®
Migdal Tefen 24959, P.O Box 3
Email: nidal.fa...@wdc.com
Office: +972-4-9078756
Mobile: +972-50-8867756

>-----Original Message-----
>From: Jim Wilson <j...@sifive.com>
>Sent: Thursday, 31 October 2019 1:57
>To: Craig Blackmore <craig.blackm...@embecosm.com>
>Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Ofer Shinaar
><ofer.shin...@wdc.com>; Nidal Faour <nidal.fa...@wdc.com>; Kito Cheng
><kito.ch...@gmail.com>; Jeff Law <l...@redhat.com>
>Subject: Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass
>
>CAUTION: This email originated from outside of Western Digital. Do not click
>on links or open attachments unless you recognize the sender and know that
>the content is safe.
>
>
>On Fri, Oct 25, 2019 at 10:40 AM Craig Blackmore
><craig.blackm...@embecosm.com> wrote:
>> This patch aims to allow more load/store instructions to be compressed
>> by replacing a load/store of 'base register + large offset' with a new
>> load/store of 'new base + small offset'. If the new base gets stored
>> in a compressed register, then the new load/store can be compressed.
>> Since there is an overhead in creating the new base, this change is
>> only attempted when 'base register' is referenced in at least 4 load/stores 
>> in
>a basic block.
>>
>> The optimization is implemented in a new RISC-V specific pass called
>> shorten_memrefs which is enabled for RVC targets. It has been
>> developed for the 32-bit lw/sw instructions but could also be extended to
>64-bit ld/sd in future.
>
>The fact that this needs 4 load/stores in a block with the same base address
>means it won't trigger very often.  But it does seem to work.
>I see about a 0.05% code size reduction for a rv32gc newlib nano libc.a.  There
>might be other codes that benefit more.
>
>I'm concerned about the number of RISC-V specific optimization passes
>people are writing.  I've seen at least 3 so far.  This one is at least small 
>and
>simple enough to understand that it doesn't look like it will cause maintenance
>problems.
>
>The config.gcc change conflicts with the save-restore optimization pass that
>Andrew Burgess added, but that is a trivial fix.
>
>The code only works for 32-bit load/stores.  rv64 has compressed 64-bit
>load/stores, and the F and D extensions have float and double compressed
>loads and stores.  The patch would be more useful if it handled all of these.
>
>The patch doesn't check the other operand, it only looks at the memory
>operand.  This results in some cases where the code rewrites instructions that
>can never be compressed.  For instance, given void store1z (int *array) {
>  array[200] = 0;
>  array[201] = 0;
>  array[202] = 0;
>  array[203] = 0;
>}
>the patch increases code size by 4 bytes because it rewrites the stores, but we
>don't have compressed store 0, and x0 is not a compressed reg, so there is no
>point in rewriting the stores.  I can also create examples that show a size
>increase with loads but it is a little trickier.
>extern int sub1 (int, int, int, int, int, int, int); int load1a (int a0, int 
>a1, int a2, int
>a3, int a4, int *array) {
>  int a = 0;
>  a += array[200];
>  a += array[201];
>  a += array[202];
>  a += array[203];
>  return sub1 (a0, a1, a2, a3, a4, 0, a); } The register allocator will pass 
> a0-a4
>directly through from args to the call, leaving only a5-a7 for the load base
>address and dest, and only one of those regs is a compressed reg, but we
>need two, so these loads can't be compressed.  The code still gets rewritten,
>resulting in a size increase for the extra add.  Not sure if you can do 
>anything
>about that though, since you are doing this before register allocation.
>
>It isn't clear that the change riscv_address_cost is for.  That should be
>commented.
>
>I'd suggest parens in the riscv_compressed_lw_offset_p return statement,
>that makes it easier to align the code correctly (in emacs at least).
>
>The RISCV_MAX_COMPRESSED_LW_OFFSET should be moved down to the
>"ISA constants" section of riscv.h, and maybe renamed similar to the other
>constants.
>
>There is a REG_P check in get_si_mem_reg.  That should probably handle
>SUBREGs too.
>
>A testcase to verify the patch would be nice.  I have one I wrote for testing
>that shows some tests that work, some tests that don't work, and some that
>work but maybe shouldn't.
>
>I vaguely remember Micheal Meissner talking about doing something similar
>for PPC in a GNU Cauldron maybe 2 years ago.  But I don't know if he tried, or
>how far he got if he did try.  It might be useful to ask him about that work.
>
>Otherwise this looks OK to me.
>
>Jim

RE: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass

Reply via email to