On Mon, Mar 06, 2006 at 02:59:29PM +0000, Thiemo Seufer wrote:
> Hello All,
>
> this patch vastly improves TLB performance on MIPS, and probably also
> on other architectures. I measured a Linux boot-shutdown cycle,
> including userland init.
Quoting the whole message since this is from March... I don't remember
seeing any followup discussion of this patch, but I may have missed it.
Thiemo's definitely right about "vastly". Is this patch appropriate, or
would anyone care to suggest a more sophisticated data structure to
avoid the full cache invalidate?

> With minimal jump cache invalidation:
>
> real 11m43.429s
> user 9m51.975s
> sys 0m1.375s
>
>  64.19   1476.81  1476.81 20551904     0.00     0.00  tlb_flush_page
>   6.72   1631.36   154.55   184346     0.00     0.00  cpu_mips_exec
>   4.35   1731.46   100.10  3550500     0.00     0.00  dyngen_code
>   3.66   1815.77    84.31 90897893     0.00     0.00  decode_opc
>   2.89   1882.21    66.44 11170487     0.00     0.00  gen_intermediate_code_internal
>   1.72   1921.80    39.59 29919267     0.00     0.00  map_address
>   1.52   1956.66    34.86  7619987     0.00     0.00  tb_find_pc
>   0.96   1978.85    22.19 26361969     0.00     0.00  tlb_set_page_exec
>   0.96   2000.84    21.99                             __ldl_mmu
>   0.90   2021.59    20.75 27279747     0.00     0.00  gen_arith_imm
>
>
> With global jump cache kill:
>
> real 6m19.811s
> user 4m23.650s
> sys 0m0.617s
>
>  21.67    188.78   188.78   146571     0.00     0.00  cpu_mips_exec
>  11.37    287.88    99.10  3393051     0.00     0.00  dyngen_code
>   9.59    371.45    83.57 89839869     0.00     0.00  decode_opc
>   7.68    438.33    66.88 10989930     0.00     0.00  gen_intermediate_code_internal
>   4.24    475.26    36.93 30124659     0.00     0.00  map_address
>   3.80    508.33    33.07  7596879     0.00     0.00  tb_find_pc
>   2.74    532.22    23.89 27781692     0.00     0.00  tlb_set_page_exec
>   2.62    555.02    22.80 39891573     0.00     0.00  cpu_mips_handle_mmu_fault
>   2.55    577.25    22.23                             __ldl_mmu
>   2.30    597.26    20.01 26968709     0.00     0.00  gen_arith_imm
>
>
> Thiemo
>
>
> Index: qemu-work/exec.c
> ===================================================================
> --- qemu-work.orig/exec.c 2006-03-06 01:30:09.000000000 +0000
> +++ qemu-work/exec.c 2006-03-06 01:30:28.000000000 +0000
> @@ -1247,7 +1247,6 @@
>  void tlb_flush_page(CPUState *env, target_ulong addr)
>  {
>      int i;
> -    TranslationBlock *tb;
>
>  #if defined(DEBUG_TLB)
>      printf("tlb_flush_page: " TARGET_FMT_lx "\n", addr);
> @@ -1261,14 +1260,10 @@
>      tlb_flush_entry(&env->tlb_table[0][i], addr);
>      tlb_flush_entry(&env->tlb_table[1][i], addr);
>
> -    for(i = 0; i < TB_JMP_CACHE_SIZE; i++) {
> -        tb = env->tb_jmp_cache[i];
> -        if (tb &&
> -            ((tb->pc & TARGET_PAGE_MASK) == addr ||
> -             ((tb->pc + tb->size - 1) & TARGET_PAGE_MASK) == addr)) {
> -            env->tb_jmp_cache[i] = NULL;
> -        }
> -    }
> +    /* We throw away the jump cache altogether. This is cheaper than
> +       trying to be smart by invalidating only the entries in the
> +       affected address range. */
> +    memset (env->tb_jmp_cache, 0, TB_JMP_CACHE_SIZE * sizeof (void *));
>
>  #if !defined(CONFIG_SOFTMMU)
>      if (addr < MMAP_AREA_END)
>
>
> _______________________________________________
> Qemu-devel mailing list
> Qemu-devel@nongnu.org
> http://lists.nongnu.org/mailman/listinfo/qemu-devel
>

-- 
Daniel Jacobowitz
CodeSourcery
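
For anyone comparing the two approaches in the quoted patch, here is a
minimal standalone sketch of both invalidation strategies. It is an
illustration only, not QEMU's code: `Cpu`, `Block`, `PAGE_BITS`,
`JMP_CACHE_SIZE`, `flush_page_selective` and `flush_all` are hypothetical
stand-ins, the page size and cache size are assumed values, and the hash
that picks a cache slot from a PC is left out entirely.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; the sizes are assumptions, not QEMU's values. */
#define PAGE_BITS      12
#define PAGE_MASK      (~(((uint32_t)1 << PAGE_BITS) - 1))
#define JMP_CACHE_SIZE 4096

typedef struct Block {
    uint32_t pc;    /* guest address of the block's first instruction */
    uint32_t size;  /* length of the guest code covered, in bytes */
} Block;

typedef struct Cpu {
    Block *jmp_cache[JMP_CACHE_SIZE];
} Cpu;

/* Selective variant (the code the patch removes): walk every slot and
   clear only blocks whose first or last byte lies on the flushed page.
   Returns the number of slots cleared. */
size_t flush_page_selective(Cpu *cpu, uint32_t addr)
{
    size_t cleared = 0;

    addr &= PAGE_MASK;
    for (size_t i = 0; i < JMP_CACHE_SIZE; i++) {
        Block *b = cpu->jmp_cache[i];
        if (b && ((b->pc & PAGE_MASK) == addr ||
                  ((b->pc + b->size - 1) & PAGE_MASK) == addr)) {
            cpu->jmp_cache[i] = NULL;
            cleared++;
        }
    }
    return cleared;
}

/* Global variant (what the patch does): drop every slot in one memset.
   Like the patch, this relies on all-bits-zero being a null pointer,
   which holds on mainstream platforms. */
void flush_all(Cpu *cpu)
{
    memset(cpu->jmp_cache, 0, sizeof(cpu->jmp_cache));
}
```

The selective variant reads every slot and dereferences every cached
block on each page flush, which is consistent with tlb_flush_page taking
64.19% of the time in the first profile. The global variant is a single
memset per flush, and the blocks it drops unnecessarily have to be
looked up again later.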