On Wed, Feb 13, 2019 at 20:06:48 -0800, Richard Henderson wrote:
> We've talked about this before, caching state to reduce the
> amount of computation that happens looking up each TB.
>
> I know that Peter has been concerned that we would not be able to
> reliably maintain all of the places that need to be updated to
> keep this up-to-date.
>
> Well, modulo dirty tricks within linux-user, it appears as if
> exception delivery and return, plus after every TB-ending write
> to a system register is sufficient.
>
> There seems to be a noticeable improvement, although wall-time
> is harder to come by -- all of my system-level measurements
> include user input, and my user-level measurements seem to be
> too small to matter.

Thanks for this! Some SPEC06int user-mode numbers (before vs. after):

  aarch64-linux-user speedup for SPEC06int (test set)
  Host: Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz

  [ASCII bar chart: per-benchmark speedup, y-axis from 1 to 2, one bar
  per SPEC06int benchmark plus a final geomean bar]

  png: https://imgur.com/RjkYYJ5

That is, a 1.4x average (geomean) speedup.

		Emilio
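
For anyone reproducing the summary bar: a minimal sketch of how a geomean speedup is derived from per-benchmark before/after run times. The benchmark names (`bench_a`, `bench_b`), the timings, and the helper names `speedup` and `geomean` are made up for illustration; they are not the measured SPEC06int data.

```python
import math


def speedup(before_s, after_s):
    """Speedup of the 'after' run relative to 'before' (>1 means faster)."""
    return before_s / after_s


def geomean(values):
    """Geometric mean, computed in log space to stay numerically stable."""
    values = list(values)
    if not values:
        raise ValueError("geomean of an empty sequence is undefined")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical (before, after) wall times in seconds; NOT real measurements.
runs = {
    "bench_a": (10.0, 5.0),  # 2.0x speedup
    "bench_b": (8.0, 8.0),   # 1.0x speedup (no change)
}

speedups = {name: speedup(before, after) for name, (before, after) in runs.items()}
overall = geomean(speedups.values())

print(f"geomean speedup: {overall:.2f}x")  # prints "geomean speedup: 1.41x"
```

The geometric mean is the usual choice for averaging ratios such as speedups, since it weights a 2x gain and a 2x loss symmetrically, which an arithmetic mean does not.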