On Aug 22, 2015 9:45 AM, Artyom Tarasenko <atar4q...@gmail.com> wrote:
> For my test case tcg-indirect brings more performance gain than for Dennis:
>
> git master: 18m31s
> tcg-indirect: 16m50s
> #undef USE_TCG_OPTIMIZATIONS: 14m18s

Thanks. That's useful.

> JIT statistic, before starting the test:
> (qemu) info jit
> Translation buffer state:
> gen code size 31851136/314448896
> TB count 128224/2457592
> TB avg target size 18 max=704 bytes
> TB avg host size 248 bytes (expansion ratio: 13.4)
> cross page TB count 0 (0%)
> direct jump count 83840 (65%) (2 jumps=64730 50%)
>
> Statistics:
> TB flush count 5
> TB invalidate count 317160
> TLB flush count 1180769
> [TCG profiler not compiled]
>
> After
> (qemu) info jit
> Translation buffer state:
> gen code size 282903344/314448896
> TB count 1139744/2457592
> TB avg target size 17 max=704 bytes
> TB avg host size 248 bytes (expansion ratio: 14.0)
> cross page TB count 0 (0%)
> direct jump count 739828 (64%) (2 jumps=569074 49%)
>
> Statistics:
> TB flush count 5
> TB invalidate count 324362
> TLB flush count 2050744
>
> So, TB invalidate count gained only ~ 5000.
> Yet tcg_optimize is ~7% in the perf top, and tcg_liveness_analysis
> ~3%. Why do we translate so much?

I don't know. It must be something SPARC specific, as I don't see
so much for alpha.

I'll try to think of good places to collect data.

r~
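
For reference, the growth between the two `info jit` snapshots quoted above can be computed with a short script. This is only a minimal sketch: the helper names `parse_counters` and `counter_deltas` and the parsing regex are illustrative assumptions, not part of QEMU, and only the plain counter lines from the quoted output are included.

```python
import re

# Counter lines copied from the two "info jit" snapshots quoted above.
BEFORE = """\
gen code size 31851136/314448896
TB count 128224/2457592
TB flush count 5
TB invalidate count 317160
TLB flush count 1180769
"""

AFTER = """\
gen code size 282903344/314448896
TB count 1139744/2457592
TB flush count 5
TB invalidate count 324362
TLB flush count 2050744
"""


def parse_counters(text):
    """Map each counter name to its value (hypothetical helper).

    For "used/total" lines such as "TB count 128224/2457592",
    only the first number (the amount in use) is kept.
    """
    counters = {}
    for line in text.splitlines():
        m = re.match(r"\s*(.+?)\s+(\d+)(?:/\d+)?\s*$", line)
        if m:
            counters[m.group(1)] = int(m.group(2))
    return counters


def counter_deltas(before, after):
    """Return after - before for every counter present in both snapshots."""
    return {name: after[name] - before[name]
            for name in before if name in after}


deltas = counter_deltas(parse_counters(BEFORE), parse_counters(AFTER))

if __name__ == "__main__":
    for name, delta in deltas.items():
        print(f"{name}: +{delta}")
```

With the numbers quoted above, TB count grows by 1011520 while TB invalidate count grows by 7202, and TB flush count does not change.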