Manolis Tsamis <manolis.tsa...@vrull.eu> writes:
>
> Assembly like this can appear with bitfields or type punning / unions.
> On stress-ng when running the cpu-union microbenchmark the following speedups
> have been observed.
>
>   Neoverse-N1:      +29.4%
>   Intel Coffeelake: +13.1%
>   AMD 5950X:        +17.5%

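For anyone following the thread without the patch at hand, here is a minimal sketch of the kind of source that I assume produces the pattern being discussed: a narrow store into a union followed directly by a wider load that overlaps it. The union `word` and the function `set_byte1` are hypothetical names made up for illustration; they are not taken from the patch or from stress-ng, and whether a given compiler actually emits a narrow store followed by a wide load here depends on the target and optimization level.

```c
#include <stdint.h>

/* Hypothetical illustration only: not from the patch or from stress-ng. */
union word {
    uint64_t whole;     /* 8-byte view of the storage */
    uint8_t  bytes[8];  /* byte-wise view of the same storage */
};

/* Store one byte through the narrow view, then immediately read the
 * whole word back through the wide view. The 8-byte load overlaps the
 * 1-byte store that was just issued, which is the case where the
 * hardware may be unable to forward the store data cheaply. */
uint64_t set_byte1(union word *w, uint8_t value)
{
    w->bytes[1] = value;  /* 1-byte store */
    return w->whole;      /* 8-byte load covering that byte */
}
```

Reading a union member other than the one last written is the usual type-punning idiom in C; the same store/load shape can presumably also come from bitfield updates.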
It seems this should have some kind of target hook so that the target can
configure which forwards should be avoided. At least in x86 land, there is a
trend toward the hardware handling more and more cases with each generation.

Also, is there any data on what this does to code size? Perhaps it should
only be done on hot blocks?

And did you see speedups on real applications?

-Andi