On Sun, Nov 05, 2006 at 10:38:20AM -0500, Daniel Jacobowitz wrote:
> On Mon, Mar 06, 2006 at 02:59:29PM +0000, Thiemo Seufer wrote:
> > Hello All,
> >
> > this patch vastly improves TLB performance on MIPS, and probably also
> > on other architectures. I measured a Linux boot-shutdown cycle,
> > including userland init.
>
> Quoting the whole message since this is from March...
>
> I don't remember seeing any followup discussion of this patch, but I
> may have missed it. Thiemo's definitely right about "vastly". Is this
> patch appropriate, or would anyone care to suggest a more
> sophisticated data structure to avoid the full cache invalidate?
This patch is an even nicer alternative, I think. I benchmarked four
alternatives (several times each):

Straight qemu with my previously posted MIPS patches takes 6:13 to start
and reboot a MIPS userspace (through init, so lots of fork/exec).

Thiemo's patch, which flushes the whole jump buffer, cuts it to 1:40.

A patch which finds the entries which need to be flushed more
efficiently cuts it to 1:21.

A patch which flushes up to 1/32nd of the jump buffer indiscriminately
cuts it to 1:11-1:13.

Here's that last patch. It changes the hash function so that entries
from a particular page are always grouped together in tb_jmp_cache,
then finds the possibly two affected ranges and memsets them clear.

Thoughts? Is this acceptable, and where else should it be tested besides
MIPS? I haven't fine-tuned the numbers; it currently allows for max 64
cached jump targets per target page, but that could be made higher or
lower.

-- 
Daniel Jacobowitz
CodeSourcery

---
 cpu-defs.h |    5 +++++
 exec-all.h |   12 +++++++++++-
 exec.c     |   15 +++++++--------
 3 files changed, 23 insertions(+), 9 deletions(-)

Index: qemu/cpu-defs.h
===================================================================
--- qemu.orig/cpu-defs.h	2006-11-11 15:12:26.000000000 -0500
+++ qemu/cpu-defs.h	2006-11-11 15:12:33.000000000 -0500
@@ -80,6 +80,11 @@ typedef unsigned long ram_addr_t;
 #define TB_JMP_CACHE_BITS 12
 #define TB_JMP_CACHE_SIZE (1 << TB_JMP_CACHE_BITS)
 
+#define TB_JMP_PAGE_BITS (TB_JMP_CACHE_BITS / 2)
+#define TB_JMP_PAGE_SIZE (1 << TB_JMP_PAGE_BITS)
+#define TB_JMP_ADDR_MASK (TB_JMP_PAGE_SIZE - 1)
+#define TB_JMP_PAGE_MASK (TB_JMP_ADDR_MASK << TB_JMP_PAGE_BITS)
+
 #define CPU_TLB_BITS 8
 #define CPU_TLB_SIZE (1 << CPU_TLB_BITS)
 
Index: qemu/exec-all.h
===================================================================
--- qemu.orig/exec-all.h	2006-11-11 15:12:26.000000000 -0500
+++ qemu/exec-all.h	2006-11-11 19:56:36.000000000 -0500
@@ -196,9 +196,19 @@ typedef struct TranslationBlock {
     struct TranslationBlock *jmp_first;
 } TranslationBlock;
 
+static inline unsigned int tb_jmp_cache_hash_page(target_ulong pc)
+{
+    target_ulong tmp;
+    tmp = pc ^ (pc >> (TARGET_PAGE_BITS - TB_JMP_PAGE_BITS));
+    return (tmp >> TB_JMP_PAGE_BITS) & TB_JMP_PAGE_MASK;
+}
+
 static inline unsigned int tb_jmp_cache_hash_func(target_ulong pc)
 {
-    return (pc ^ (pc >> TB_JMP_CACHE_BITS)) & (TB_JMP_CACHE_SIZE - 1);
+    target_ulong tmp;
+    tmp = pc ^ (pc >> (TARGET_PAGE_BITS - TB_JMP_PAGE_BITS));
+    return (((tmp >> TB_JMP_PAGE_BITS) & TB_JMP_PAGE_MASK) |
+            (tmp & TB_JMP_ADDR_MASK));
 }
 
 static inline unsigned int tb_phys_hash_func(unsigned long pc)
Index: qemu/exec.c
===================================================================
--- qemu.orig/exec.c	2006-11-11 15:12:26.000000000 -0500
+++ qemu/exec.c	2006-11-11 19:39:45.000000000 -0500
@@ -1299,14 +1299,13 @@ void tlb_flush_page(CPUState *env, targe
     tlb_flush_entry(&env->tlb_table[0][i], addr);
     tlb_flush_entry(&env->tlb_table[1][i], addr);
 
-    for(i = 0; i < TB_JMP_CACHE_SIZE; i++) {
-        tb = env->tb_jmp_cache[i];
-        if (tb &&
-            ((tb->pc & TARGET_PAGE_MASK) == addr ||
-             ((tb->pc + tb->size - 1) & TARGET_PAGE_MASK) == addr)) {
-            env->tb_jmp_cache[i] = NULL;
-        }
-    }
+    /* Discard jump cache entries for any tb which might potentially
+       overlap the flushed page.  */
+    i = tb_jmp_cache_hash_page(addr - TARGET_PAGE_SIZE);
+    memset (&env->tb_jmp_cache[i], 0, TB_JMP_PAGE_SIZE * sizeof(tb));
+
+    i = tb_jmp_cache_hash_page(addr);
+    memset (&env->tb_jmp_cache[i], 0, TB_JMP_PAGE_SIZE * sizeof(tb));
 
 #if !defined(CONFIG_SOFTMMU)
     if (addr < MMAP_AREA_END)

_______________________________________________
Qemu-devel mailing list
Qemu-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/qemu-devel
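For anyone who wants to poke at the grouping property outside of qemu, here is a minimal standalone sketch of the two hash functions and the two-range flush from the patch. It is not the patch itself: it assumes 4 KiB target pages (TARGET_PAGE_BITS = 12) and a 32-bit target_ulong, and the names hash_page, hash_func, flush_page and jmp_cache are stand-ins made up for this illustration, with void * in place of TranslationBlock *.

```c
#include <stdint.h>
#include <string.h>

/* Assumptions for this sketch only: 4 KiB pages, 32-bit target_ulong. */
#define TARGET_PAGE_BITS 12
#define TARGET_PAGE_SIZE (1u << TARGET_PAGE_BITS)
typedef uint32_t target_ulong;

/* Constants as defined in the patch. */
#define TB_JMP_CACHE_BITS 12
#define TB_JMP_CACHE_SIZE (1 << TB_JMP_CACHE_BITS)
#define TB_JMP_PAGE_BITS (TB_JMP_CACHE_BITS / 2)
#define TB_JMP_PAGE_SIZE (1 << TB_JMP_PAGE_BITS)
#define TB_JMP_ADDR_MASK (TB_JMP_PAGE_SIZE - 1)
#define TB_JMP_PAGE_MASK (TB_JMP_ADDR_MASK << TB_JMP_PAGE_BITS)

/* Hypothetical stand-in for env->tb_jmp_cache. */
static void *jmp_cache[TB_JMP_CACHE_SIZE];

/* First slot of the 64-entry group shared by every pc in the same page.
   Under the assumptions above only page-number bits reach the result. */
static unsigned int hash_page(target_ulong pc)
{
    target_ulong tmp = pc ^ (pc >> (TARGET_PAGE_BITS - TB_JMP_PAGE_BITS));
    return (tmp >> TB_JMP_PAGE_BITS) & TB_JMP_PAGE_MASK;
}

/* Full slot index: page group in the high bits, a hash of the
   in-page offset in the low TB_JMP_PAGE_BITS bits. */
static unsigned int hash_func(target_ulong pc)
{
    target_ulong tmp = pc ^ (pc >> (TARGET_PAGE_BITS - TB_JMP_PAGE_BITS));
    return ((tmp >> TB_JMP_PAGE_BITS) & TB_JMP_PAGE_MASK)
           | (tmp & TB_JMP_ADDR_MASK);
}

/* Clear the group for the flushed page and the group for the page
   before it, since a block starting there may run into the flushed page. */
static void flush_page(target_ulong addr)
{
    unsigned int i;

    i = hash_page(addr - TARGET_PAGE_SIZE);
    memset(&jmp_cache[i], 0, TB_JMP_PAGE_SIZE * sizeof(jmp_cache[0]));

    i = hash_page(addr);
    memset(&jmp_cache[i], 0, TB_JMP_PAGE_SIZE * sizeof(jmp_cache[0]));
}
```

With these constants each flush clears at most two groups of 64 slots out of 4096, which is where the "up to 1/32nd of the jump buffer" figure comes from. Entries from unrelated pages that happen to hash into the same group are discarded as well; that is the "indiscriminately" part.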