On 20/02/2016 03:35, Gonglei wrote:
> Perf top tells me qemu_get_ram_ptr consumes too many CPU cycles.
>>  22.56%  qemu-kvm                 [.] address_space_translate
>>  13.29%  qemu-kvm                 [.] qemu_get_ram_ptr
>>   4.71%  qemu-kvm                 [.] phys_page_find
>>   4.43%  qemu-kvm                 [.] address_space_translate_internal
>>   3.47%  libpthread-2.19.so       [.] __pthread_mutex_unlock_usercnt
>>   3.08%  qemu-kvm                 [.] qemu_ram_addr_from_host
>>   2.62%  qemu-kvm                 [.] address_space_map
>>   2.61%  libc-2.19.so             [.] _int_malloc
>>   2.58%  libc-2.19.so             [.] _int_free
>>   2.38%  libc-2.19.so             [.] malloc
>>   2.06%  libpthread-2.19.so       [.] pthread_mutex_lock
>>   1.68%  libc-2.19.so             [.] malloc_consolidate
>>   1.35%  libc-2.19.so             [.] __memcpy_sse2_unaligned
>>   1.23%  qemu-kvm                 [.] lduw_le_phys
>>   1.18%  qemu-kvm                 [.] find_next_zero_bit
>>   1.02%  qemu-kvm                 [.] object_unref
> 
> And Paolo suggested that we can get rid of qemu_get_ram_ptr
> by storing the RAMBlock pointer into the memory region,
> instead of the ram_addr_t value. And after applying this change,
> I got much better performance indeed.

What's the gain like?

I've not reviewed the patch in depth, but what I can say is that I like
it a lot.  It only does the bare minimum needed to provide the
optimization, but this also makes it very simple to understand.  More
cleanups and further optimizations are possible (including removing
mr->ram_addr completely), but your patch really does one thing and
does it well.  Good job!
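
For readers skimming the thread, here is a minimal sketch of the idea,
assuming a simplified model rather than QEMU's actual code.  All Toy*
and toy_* names are hypothetical and exist only for illustration.  It
contrasts resolving a host pointer from a ram_addr_t-style offset, which
needs a search over the blocks, with following a block pointer stored in
the region, which does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for illustration; these are NOT QEMU's real definitions. */
typedef uint64_t toy_ram_addr_t;

typedef struct ToyRAMBlock {
    uint8_t *host;            /* host virtual address of the block */
    toy_ram_addr_t offset;    /* start of the block in the ram_addr_t space */
    toy_ram_addr_t length;    /* size of the block in bytes */
    struct ToyRAMBlock *next; /* next block in the list */
} ToyRAMBlock;

typedef struct ToyMemoryRegion {
    toy_ram_addr_t ram_addr;  /* old scheme: only the offset is stored */
    ToyRAMBlock *ram_block;   /* new scheme: direct pointer to the block */
} ToyMemoryRegion;

/* Old scheme: search the block list on every access to find the owner. */
static uint8_t *toy_ptr_by_addr(ToyRAMBlock *blocks, toy_ram_addr_t addr)
{
    for (ToyRAMBlock *b = blocks; b != NULL; b = b->next) {
        /* Unsigned wrap-around makes this false when addr < b->offset. */
        if (addr - b->offset < b->length) {
            return b->host + (addr - b->offset);
        }
    }
    return NULL; /* no block covers this address */
}

/* New scheme: the region already knows its block, so no search is needed. */
static uint8_t *toy_ptr_by_block(const ToyMemoryRegion *mr,
                                 toy_ram_addr_t offset_in_region)
{
    assert(mr->ram_block != NULL);
    assert(offset_in_region < mr->ram_block->length);
    return mr->ram_block->host + offset_in_region;
}
```

Both helpers return the same host pointer for the same location; the
second one simply skips the per-access lookup.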

Paolo
