A tests case for this patch has been written and pushed to WD github repository at the following link: https://github.com/westerndigitalcorporation/riscv32-Code-density-test-bench/tree/master/common_load_store_base_address_offset
the test case itself has been written based on a real product scenario. We got a saving of about 10% in code size of the test itself. Best Regards, Nidal Faour Staff Engineer, R&D Engineering – Firmware & Toolchain, CTO Group Western Digital® Migdal Tefen 24959, P.O Box 3 Email: nidal.fa...@wdc.com Office: +972-4-9078756 Mobile: +972-50-8867756 >-----Original Message----- >From: Jim Wilson <j...@sifive.com> >Sent: Thursday, 31 October 2019 1:57 >To: Craig Blackmore <craig.blackm...@embecosm.com> >Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Ofer Shinaar ><ofer.shin...@wdc.com>; Nidal Faour <nidal.fa...@wdc.com>; Kito Cheng ><kito.ch...@gmail.com>; Jeff Law <l...@redhat.com> >Subject: Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass > >CAUTION: This email originated from outside of Western Digital. Do not click >on links or open attachments unless you recognize the sender and know that >the content is safe. > > >On Fri, Oct 25, 2019 at 10:40 AM Craig Blackmore ><craig.blackm...@embecosm.com> wrote: >> This patch aims to allow more load/store instructions to be compressed >> by replacing a load/store of 'base register + large offset' with a new >> load/store of 'new base + small offset'. If the new base gets stored >> in a compressed register, then the new load/store can be compressed. >> Since there is an overhead in creating the new base, this change is >> only attempted when 'base register' is referenced in at least 4 load/stores >> in >a basic block. >> >> The optimization is implemented in a new RISC-V specific pass called >> shorten_memrefs which is enabled for RVC targets. It has been >> developed for the 32-bit lw/sw instructions but could also be extended to >64-bit ld/sd in future. > >The fact that this needs 4 load/stores in a block with the same base address >means it won't trigger very often. But it does seem to work. >I see about a 0.05% code size reduction for a rv32gc newlib nano libc.a. There >might be other codes that benefit more. > >I'm concerned about the number of RISC-V specific optimization passes >people are writing. I've seen at least 3 so far. This one is at least small >and >simple enough to understand that it doesn't look like it will cause maintenance >problems. > >The config.gcc change conflicts with the save-restore optimization pass that >Andrew Burgess added, but that is a trivial fix. > >The code only works for 32-bit load/stores. rv64 has compressed 64-bit >load/stores, and the F and D extensions have float and double compressed >loads and stores. The patch would be more useful if it handled all of these. > >The patch doesn't check the other operand, it only looks at the memory >operand. This results in some cases where the code rewrites instructions that >can never be compressed. For instance, given void store1z (int *array) { > array[200] = 0; > array[201] = 0; > array[202] = 0; > array[203] = 0; >} >the patch increases code size by 4 bytes because it rewrites the stores, but we >don't have compressed store 0, and x0 is not a compressed reg, so there is no >point in rewriting the stores. I can also create examples that show a size >increase with loads but it is a little trickier. >extern int sub1 (int, int, int, int, int, int, int); int load1a (int a0, int >a1, int a2, int >a3, int a4, int *array) { > int a = 0; > a += array[200]; > a += array[201]; > a += array[202]; > a += array[203]; > return sub1 (a0, a1, a2, a3, a4, 0, a); } The register allocator will pass > a0-a4 >directly through from args to the call, leaving only a5-a7 for the load base >address and dest, and only one of those regs is a compressed reg, but we >need two, so these loads can't be compressed. The code still gets rewritten, >resulting in a size increase for the extra add. Not sure if you can do >anything >about that though, since you are doing this before register allocation. > >It isn't clear that the change riscv_address_cost is for. That should be >commented. > >I'd suggest parens in the riscv_compressed_lw_offset_p return statement, >that makes it easier to align the code correctly (in emacs at least). > >The RISCV_MAX_COMPRESSED_LW_OFFSET should be moved down to the >"ISA constants" section of riscv.h, and maybe renamed similar to the other >constants. > >There is a REG_P check in get_si_mem_reg. That should probably handle >SUBREGs too. > >A testcase to verify the patch would be nice. I have one I wrote for testing >that shows some tests that work, some tests that don't work, and some that >work but maybe shouldn't. > >I vaguely remember Micheal Meissner talking about doing something similar >for PPC in a GNU Cauldron maybe 2 years ago. But I don't know if he tried, or >how far he got if he did try. It might be useful to ask him about that work. > >Otherwise this looks OK to me. > >Jim