[Bug target/31640] cache block alignment is too aggressive on sh-elf

oleg.e...@t-online.de Sat, 31 Dec 2011 09:25:12 -0800

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31640


--- Comment #3 from Oleg Endo <oleg.e...@t-online.de> 2011-12-31 17:24:47 UTC ---
Created attachment 26208
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26208
Proposed patch

(In reply to comment #0)
> The sh4 port aligns blocks that have no fallthrus and that are either
> frequently executed (JUMP_ALIGN) or preceded by a barrier
> (LABEL_ALIGN_AFTER_BARRIER) on a cache line.
> 
> While in theory this helps to avoid cache misses if the block splits over 2
> cache lines, in practice this reduces cache locality and lengthens the
> distance between blocks.
> The number of issued instructions is also impacted. For example the relative
> indirect address in jump tables needs a byte zero extend instruction if the
> distance occupies 8 bits instead of 7 bits.
> 
> I ran some experiments and benchmarked (eembc) with 2 strategies:
> 1) -falign-jumps=1
> 2) Align the block if the size is bigger than a given threshold (empirically
> set to 16 bytes, half of the cache line size). See the illustrating attached
> patch.
> 
> My conclusion is that at -O3 the performance never degrades (option 2 is a
> little bit better, even improving dhrystone by 3%) when removing this padding.
> And the text size improves by ~15%.
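
To make the quoted strategy 2 concrete, here is a minimal sketch in plain C.
It is not taken from either attached patch. The names padding_for and
block_alignment are hypothetical, and the 32-byte cache line is inferred from
the "16 bytes, half of the cache line size" remark above.

```c
#include <stddef.h>

/* Assumption: 32-byte cache line, inferred from the quoted text
   ("16 bytes, half of the cache line size"). */
#define CACHE_LINE_BYTES 32u
/* Threshold quoted in comment #0 for strategy 2. */
#define SIZE_THRESHOLD   16u
/* SH instructions are 2 bytes wide, so 2 is the minimum code alignment. */
#define MIN_INSN_ALIGN   2u

/* Padding bytes needed to round addr up to the next multiple of align.
   align must be a power of two. */
unsigned padding_for(unsigned addr, unsigned align)
{
    return (align - (addr & (align - 1u))) & (align - 1u);
}

/* Hypothetical model of strategy 2: only blocks larger than the
   threshold are aligned to a cache line; smaller blocks get the
   minimum instruction alignment and therefore no extra padding. */
unsigned block_alignment(unsigned block_size)
{
    return block_size > SIZE_THRESHOLD ? CACHE_LINE_BYTES : MIN_INSN_ALIGN;
}
```

For example, a block that would start at offset 34 costs 30 bytes of padding
when forced onto a 32-byte boundary and none when aligned to 2 bytes, which
is the kind of padding the quoted measurements remove.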

Because of this I would like to propose the following alignment strategies
(unless they are changed by the user with -falign-??? options).

-Os:
  Align everything to 2 bytes to get compact code.

-O2,-O3:
  Align functions to 4 bytes.
  Align labels and jumps to 2 bytes (to avoid potential code bloat).
  Align loops to 4 bytes.
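
As a rough illustration of the defaults listed above, here is a standalone C
sketch. It is not the code from attachment 26208. The struct, the function
names, and the convention that a user value of 0 means "no -falign option
given" are assumptions made for this example only. It models just the two
cases listed (-Os versus -O2/-O3).

```c
#include <stdbool.h>

/* Hypothetical container for the four alignment values, in bytes. */
struct sh_align_defaults {
    unsigned functions;
    unsigned labels;
    unsigned jumps;
    unsigned loops;
};

/* Returns the proposed defaults: 2 bytes everywhere for -Os,
   otherwise 4-byte functions and loops with 2-byte labels and jumps. */
struct sh_align_defaults proposed_alignment(bool optimize_for_size)
{
    struct sh_align_defaults a;
    if (optimize_for_size) {
        a.functions = 2; a.labels = 2; a.jumps = 2; a.loops = 2;
    } else {
        a.functions = 4; a.labels = 2; a.jumps = 2; a.loops = 4;
    }
    return a;
}

/* Assumed convention for this sketch: user_value == 0 means the user
   did not request a specific alignment, so the default applies. */
unsigned effective_alignment(unsigned user_value, unsigned default_value)
{
    return user_value != 0 ? user_value : default_value;
}
```

With these assumptions, effective_alignment(0, proposed_alignment(false).loops)
gives 4, while an explicit user value such as 16 is passed through unchanged.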

The attached patch should do that, although it has not been fully tested yet.
