On 21.03.2015 at 07:27, Emilio G. Cota wrote:
This brings down the size of the struct from 56 to 48 bytes.

Signed-off-by: Emilio G. Cota <c...@braap.org>
---
  tcg/tcg.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index add7f75..3276924 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -429,8 +429,8 @@ typedef struct TCGTemp {
      int val_type;
      int reg;
      tcg_target_long val;
-    int mem_reg;
      intptr_t mem_offset;
+    int mem_reg;
      unsigned int fixed_reg:1;
      unsigned int mem_coherent:1;
      unsigned int mem_allocated:1;

Reviewed-by: Stefan Weil <s...@weilnetz.de>

TCGContext includes an array of TCGTemp, so its size even shrinks by 4 KiB (good for caching),
and tcg.o now takes 55364 instead of 56116 bytes (maybe faster, too).
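To illustrate where the 8 bytes come from, here is a minimal sketch, not the real
TCGTemp. The struct names TempBefore and TempAfter and the helper
saved_bytes_per_temp() are made up for this example, int64_t stands in for
tcg_target_long, and the members not visible in the diff hunk are assumptions.
The sizes depend on the ABI; on a typical LP64 target I would expect the old
layout to pad after mem_reg and the new one not to. A saving of 8 bytes per
entry also fits the 4 KiB figure if the array has 512 entries (4096 / 8).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the old layout: the 4-byte mem_reg sits between
 * two 8-byte members, so an LP64 compiler has to pad after it. */
typedef struct TempBefore {
    int base_type;                  /* assumed member, not in the diff hunk */
    int type;                       /* assumed member, not in the diff hunk */
    int val_type;
    int reg;
    int64_t val;                    /* stands in for tcg_target_long */
    int mem_reg;
    intptr_t mem_offset;
    unsigned int fixed_reg:1;
    unsigned int mem_coherent:1;
    unsigned int mem_allocated:1;
    const char *name;               /* assumed member, not in the diff hunk */
} TempBefore;

/* Same members with mem_reg moved behind mem_offset, as in the patch:
 * mem_reg and the bit-fields can now share one 8-byte slot. */
typedef struct TempAfter {
    int base_type;
    int type;
    int val_type;
    int reg;
    int64_t val;
    intptr_t mem_offset;
    int mem_reg;
    unsigned int fixed_reg:1;
    unsigned int mem_coherent:1;
    unsigned int mem_allocated:1;
    const char *name;
} TempAfter;

/* Reordering members must never make the struct larger. */
static_assert(sizeof(TempAfter) <= sizeof(TempBefore),
              "reordered layout grew unexpectedly");

/* Bytes saved per array entry by the reordering (ABI dependent). */
static inline size_t saved_bytes_per_temp(void)
{
    return sizeof(TempBefore) - sizeof(TempAfter);
}
```

Printing sizeof(TempBefore), sizeof(TempAfter) and saved_bytes_per_temp() on the
target of interest, or running pahole on the object file, shows whether the
padding really disappears there.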

Further optimizations are possible. TCGTemp can be reduced to 32 bytes as the output
of pahole shows:

struct TCGTemp {
        TCGTempVal                 val_type:8; /*     0:24  4 */
        unsigned int               reg:8; /*     0:16  4 */
        unsigned int               mem_reg:8; /*     0: 8  4 */

        /* Bitfield combined with next fields */

        _Bool                      fixed_reg:1; /*     3: 7  1 */
        _Bool                      mem_coherent:1; /*     3: 6  1 */
        _Bool                      mem_allocated:1; /*     3: 5  1 */
        _Bool                      temp_local:1; /*     3: 4  1 */
        _Bool                      temp_allocated:1; /*     3: 3  1 */

        /* XXX 3 bits hole, try to pack */

        TCGType                    base_type:16; /*     4:16  4 */
        TCGType                    type:16; /*     4: 0  4 */
        tcg_target_long            val; /*     8     8 */
        intptr_t                   mem_offset; /*    16     8 */
        const char  *              name; /*    24     8 */

        /* size: 32, cachelines: 1, members: 13 */
        /* bit holes: 1, sum bit holes: 3 bits */
        /* last cacheline: 32 bytes */
};

Here I used a new enum type for val_type and reduced some fields to 8 or 16 bits.
I also put the two most often used fields at the beginning, so they can be
addressed with no offset or only a small one ("often" means often in the code;
no runtime data is available).
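A rough sketch of such a packed layout, under some assumptions: TempPacked and
temp_packed_set_reg() are names invented for this example, int64_t stands in
for tcg_target_long, and plain unsigned int bit-fields replace the enum and
_Bool bit-fields from the pahole output, because bit-fields of other types are
implementation-defined in C. The exact size and bit positions depend on the
compiler and ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packed variant: the narrow fields share the first 8 bytes,
 * followed by the three pointer-sized members. */
typedef struct TempPacked {
    unsigned int val_type:8;
    unsigned int reg:8;
    unsigned int mem_reg:8;
    unsigned int fixed_reg:1;
    unsigned int mem_coherent:1;
    unsigned int mem_allocated:1;
    unsigned int temp_local:1;
    unsigned int temp_allocated:1;
    /* 3 unused bits remain in this storage unit */
    unsigned int base_type:16;
    unsigned int type:16;
    int64_t val;                    /* stands in for tcg_target_long */
    intptr_t mem_offset;
    const char *name;
} TempPacked;

/* Store a register number, checking that it still fits into 8 bits.
 * Narrowing a field silently truncates larger values otherwise. */
static inline void temp_packed_set_reg(TempPacked *t, unsigned int reg)
{
    assert(reg <= 0xff);
    t->reg = reg;
}
```

The price of the smaller struct is that every narrowed field needs a range
check like the one above (or a proof that the values always fit), and that the
compiler has to emit mask and shift instructions for bit-field accesses.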

Are such optimizations useful?

Stefan

