On 03/23/2015 02:42 PM, Stefan Weil wrote:
> Further optimizations are possible. TCGTemp can be reduced to 32 bytes as the
> output of pahole shows:
>
> struct TCGTemp {
>         TCGTempVal       val_type:8;       /*  0:24  4 */

Need only be 2 bits.

>         unsigned int     reg:8;            /*  0:16  4 */
>         unsigned int     mem_reg:8;        /*  0: 8  4 */

Need only be 6 (ia64) bits, but an aligned 8-bit slot probably performs best.

>
>         /* Bitfield combined with next fields */
>
>         _Bool            fixed_reg:1;      /*  3: 7  1 */
>         _Bool            mem_coherent:1;   /*  3: 6  1 */
>         _Bool            mem_allocated:1;  /*  3: 5  1 */
>         _Bool            temp_local:1;     /*  3: 4  1 */
>         _Bool            temp_allocated:1; /*  3: 3  1 */
>
>         /* XXX 3 bits hole, try to pack */
>
>         TCGType          base_type:16;     /*  4:16  4 */
>         TCGType          type:16;          /*  4: 0  4 */

Need only be 1 bit, honestly, but 2 bits might be easier to arrange.

Anyway, you're down to 23 bits from the word, or 16 bytes on a 32-bit host.
It's no better than the 32 bytes you got for a 64-bit host though.

>         tcg_target_long  val;              /*  8     8 */
>         intptr_t         mem_offset;       /* 16     8 */
>         const char *     name;             /* 24     8 */
>
>         /* size: 32, cachelines: 1, members: 13 */
>         /* bit holes: 1, sum bit holes: 3 bits */
>         /* last cacheline: 32 bytes */
> };
>
> Here I used a new enum type for val_type and reduced some values to 8 or 16
> bit.
> I also put the two most often used values at the beginning, so they can be
> addressed without or with a small offset ("often" in the code, no runtime
> data available).
>
> Are such optimizations useful?

Yes, I think so. Especially because of the rather large arrays we build.


r~
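
For illustration only, here is a rough sketch of what the narrower widths suggested above might look like when all the small fields share one 32-bit word. It is not the actual QEMU definition: SketchTemp, sketch_target_long and sketch_temp_size() are made-up names, sketch_target_long is assumed to be pointer-sized (a stand-in for tcg_target_long), and every bitfield is declared unsigned int so that common ABIs pack them into a single storage unit. With 8-bit register slots the sketch uses 27 of the 32 bits.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for tcg_target_long; assumed pointer-sized here. */
typedef intptr_t sketch_target_long;

/*
 * Sketch only. Field names follow the pahole output quoted above; the
 * widths are the narrower ones suggested in the reply. Bitfield layout
 * is implementation-defined, so the size check below is an assumption
 * about common ABIs, not a guarantee.
 */
typedef struct SketchTemp {
    unsigned int val_type:2;        /* 2 bits suffice per the reply */
    unsigned int reg:8;             /* 6 bits needed, 8-bit slot kept */
    unsigned int mem_reg:8;         /* 6 bits needed, 8-bit slot kept */
    unsigned int fixed_reg:1;
    unsigned int mem_coherent:1;
    unsigned int mem_allocated:1;
    unsigned int temp_local:1;
    unsigned int temp_allocated:1;
    unsigned int base_type:2;       /* 1 bit needed, 2 easier to arrange */
    unsigned int type:2;            /* 1 bit needed, 2 easier to arrange */
    sketch_target_long val;
    intptr_t mem_offset;
    const char *name;
} SketchTemp;

/*
 * One 32-bit word of bitfields plus three pointer-sized members:
 * 4 + 4 + 4 + 4 = 16 bytes on a typical 32-bit host, and
 * 4 + 4 (padding) + 8 + 8 + 8 = 32 bytes on a typical 64-bit host.
 */
static_assert(sizeof(SketchTemp) == 4 * sizeof(void *),
              "unexpected SketchTemp layout on this ABI");

/* Helper so the size can also be inspected at run time. */
static inline size_t sketch_temp_size(void)
{
    return sizeof(SketchTemp);
}
```

On a 64-bit host the padding after the bitfield word keeps the structure at 32 bytes, so the narrower fields only pay off on a 32-bit host, which matches the 16 versus 32 byte figures above.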