On 01/24/14 02:52, Qiao Nuohan wrote:
> On 01/23/2014 01:04 AM, Laszlo Ersek wrote:
>>> @@ -864,6 +884,16 @@ static int dump_init(DumpState *s, int fd, bool
>>> paging, bool has_filter,
>>> >           
>>> qemu_get_guest_simple_memory_mapping(&s->list,&s->guest_phys_blocks);
>>> >        }
>>> >
>>> >  +    s->nr_cpus = nr_cpus;
>>> >  +    s->page_size = TARGET_PAGE_SIZE;
>>> >  +    s->page_shift = ffs(s->page_size) - 1;
>>> >  +
>>> >  +    get_max_mapnr(s);
>> Again from v6 10/11, good. The flag_flatten assignment has been dropped.
>> Initialization seems to happen in a good spot this time too.
>>
>>> >  +
>>> >  +    uint64_t tmp;
>>> >  +    tmp = DIV_ROUND_UP(DIV_ROUND_UP(s->max_mapnr, CHAR_BIT),
>>> s->page_size);
>>> >  +    s->len_dump_bitmap = tmp * s->page_size;
>>> >  +
>>> >        if (s->has_filter) {
>>> >            memory_mapping_filter(&s->list, s->begin, s->length);
>>> >        }
>> Again from v6 10/11.
>>
>> These assignments now all occur without depending on a user request for
>> a compressed dump (kept this way in v7 12/13 too), but they are not
>> costly. The loop in get_max_mapnr() iterates over less than 10 mappings
>> in the non-paging dump case, and in the paging dump case it also
>> shouldn't be more than a hundred or so (as I recall from earlier
>> testing). This might be worth some regression-testing (perf-wise), but
>> it looks OK to me.
>>
> 
> I see, moving them into "if(format...) {...}" block would be better. But, I
> have no idea of "regression-testing (perf-wise)", would you mind give
> some hint?

I meant comparing how long a paging-mode dump takes before this
patchset vs. after it, in order to see the difference that
get_max_mapnr() introduces when paging is enabled.

However, please ignore this point. First, the loop is most probably
negligible even for a paging dump. Second, you could make it conditional
on compressed dumps (which force non-paging + non-filtering), where the
number of mappings is very low. And third, as I wrote later, the loop
should be replaced anyway with an O(1) QTAILQ_LAST() access.

So please just ignore this "performance" remark.

Ultimately, what I suggest for get_max_mapnr() is:
- rebase it to guest_phys_blocks, just like the other two places (which
are now calling get_next_page()),
- use QTAILQ_LAST() in it,
- don't bother making it conditional (i.e. its current call site is
fine), because:
  - It'll be fast with QTAILQ_LAST(),
  - guest_phys_blocks is available in any case, so you can always
    access it.
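
To illustrate the suggestion, here is a rough standalone sketch, not the
actual patch. It uses TAILQ_LAST() from <sys/queue.h> as a stand-in for
QEMU's QTAILQ_LAST(). The DumpSketch type, the trimmed-down GuestPhysBlock
and the name get_max_mapnr_sketch() are made up for this example. It also
assumes the block list is sorted by ascending guest-physical address and
that target_end is exclusive, so the last block determines the highest
page frame number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

/* Trimmed-down stand-in for QEMU's GuestPhysBlock (assumption). */
typedef struct GuestPhysBlock {
    uint64_t target_start;              /* guest-physical start, inclusive */
    uint64_t target_end;                /* guest-physical end, exclusive */
    TAILQ_ENTRY(GuestPhysBlock) next;
} GuestPhysBlock;

TAILQ_HEAD(GuestPhysBlockHead, GuestPhysBlock);

/* Hypothetical subset of the dump state, for illustration only. */
typedef struct DumpSketch {
    struct GuestPhysBlockHead blocks;   /* assumed sorted by address */
    unsigned page_shift;                /* e.g. 12 for 4 KiB pages */
    uint64_t max_mapnr;
} DumpSketch;

/*
 * O(1): look only at the last block instead of looping over all mappings.
 * An empty list yields max_mapnr == 0.
 */
static void get_max_mapnr_sketch(DumpSketch *s)
{
    GuestPhysBlock *last;

    assert(s != NULL);
    last = TAILQ_LAST(&s->blocks, GuestPhysBlockHead);
    s->max_mapnr = last ? last->target_end >> s->page_shift : 0;
}
```

Since this is a single tail access, there is nothing to gain from making
the call conditional on the dump format.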

Thanks
Laszlo

