Hi Peter,

On 2021/3/23 3:45, Peter Xu wrote:
> On Mon, Mar 22, 2021 at 10:02:38PM +0800, Keqian Zhu wrote:
>> Hi Peter,
> 
> Hi, Keqian,
> 
> [...]
> 
>> You emphasize that dirty ring is a "Thread-local buffers", but dirty bitmap
>> is global, but I don't see it has optimization about "locking" compared to
>> dirty bitmap.
>>
>> The thread-local means that vCPU can flush hardware buffer into dirty ring
>> without locking, but for bitmap, vCPU can also use atomic set to mark dirty
>> without locking.
>> Maybe I miss something?
> 
> Yes, the atomic ops guaranteed locking as you said, but afaiu atomics are
> expensive already, since at least on x86 I think it needs to lock the memory
> bus.  IIUC that'll become even slower as cores grow, as long as the cores
> share the memory bus.
> 
> KVM dirty ring is per-vcpu, it means its metadata can be modified locally
> without atomicity at all (but still, we'll need READ_ONCE/WRITE_ONCE to
> guarantee ordering of memory accesses).  It should scale better especially
> with hosts who have lots of cores.
That makes sense to me.
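
To check my understanding, here is a minimal userspace sketch of the two update
paths. It is not KVM code: the names (bitmap_mark_dirty, struct dirty_ring,
ring_push, RING_SIZE) and the ring layout are hypothetical, and C11 atomics
stand in for the kernel's READ_ONCE/WRITE_ONCE and barriers. The bitmap path
needs an atomic read-modify-write because any vCPU may touch the same word; the
ring path has one producer per ring, so plain ordered loads and stores are
enough.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define BITS_PER_WORD 64
#define RING_SIZE 4096 /* hypothetical size, must be a power of two */

static_assert((RING_SIZE & (RING_SIZE - 1)) == 0, "RING_SIZE must be 2^n");

/* Shared bitmap: any vCPU may hit the same word, so use an atomic RMW. */
static inline void bitmap_mark_dirty(_Atomic uint64_t *bitmap, uint64_t gfn)
{
    atomic_fetch_or_explicit(&bitmap[gfn / BITS_PER_WORD],
                             UINT64_C(1) << (gfn % BITS_PER_WORD),
                             memory_order_relaxed);
}

/* Per-vCPU ring: one producer (the owning vCPU), one consumer (the reaper). */
struct dirty_ring {
    uint64_t gfns[RING_SIZE];
    _Atomic uint32_t head; /* advanced only by the consumer */
    _Atomic uint32_t tail; /* advanced only by the producer */
};

/* Returns false when the ring is full, the case that would force an exit. */
static inline bool ring_push(struct dirty_ring *r, uint64_t gfn)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (tail - head == RING_SIZE)
        return false;

    r->gfns[tail & (RING_SIZE - 1)] = gfn;
    /* Publish the entry with an ordered store, no locked RMW needed. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

In this sketch the "ring full" return value is where a full-exit would happen,
so the producer never overwrites entries the reaper has not collected yet.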

> 
>>
>> The second question is that you observed longer migration time (55s->73s)
>> when guest has 24G ram and dirty rate is 800M/s. I am not clear about the
>> reason. As with dirty ring enabled, Qemu can get dirty info faster which
>> means it handles dirty page more quick, and guest can be throttled which
>> means dirty page is generated slower. What's the rationale for the longer
>> migration time?
> 
> Because dirty ring is more sensitive to dirty rate, while dirty bitmap is more
Emm... Sorry, I'm not very clear about this. I think a higher dirty rate
doesn't make dirty_log_sync slower than it is in legacy bitmap mode. Besides, a
higher dirty rate means we may have more full-exits, which can properly limit
the dirty rate. So it seems that dirty ring "prefers" a higher dirty rate.

> sensitive to memory footprint.  In above 24G mem + 800MB/s dirty rate
> condition, dirty bitmap seems to be more efficient, say, collecting dirty
> bitmap of 24G mem (24G/4K/8=0.75MB) for each migration cycle is fast enough.
> 
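Just to spell out the 24G/4K/8 arithmetic above, a small sketch assuming one
dirty bit per 4K guest page. The helper name dirty_bitmap_bytes is made up for
illustration and is not a QEMU or KVM function.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for a dirty bitmap with one bit per guest page. */
static inline uint64_t dirty_bitmap_bytes(uint64_t mem_bytes,
                                          uint64_t page_bytes)
{
    assert(page_bytes != 0);

    /* Round up to whole pages, then to whole bytes of bitmap. */
    uint64_t pages = (mem_bytes + page_bytes - 1) / page_bytes;
    return (pages + 7) / 8;
}
```

For 24G of guest memory and 4K pages this gives 6291456 pages and 786432 bytes
of bitmap, i.e. the 0.75MB mentioned above.
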
> Not to mention that current implementation of dirty ring in QEMU is not
> complete - we still have two more layers of dirty bitmap, so it's actually a
> mixture of dirty bitmap and dirty ring.  This series is more like a POC on
> dirty ring interface, so as to let QEMU be able to run on KVM dirty ring.
> E.g., we won't have hang issue when getting dirty pages since it's totally
> async, however we'll still have some legacy dirty bitmap issues e.g. memory
> consumption of userspace dirty bitmaps are still linear to memory footprint.
The plan looks good and coordinated, but I have a concern. Our dirty ring
actually depends on the structure of the hardware logging buffer (the PML
buffer). We can't be sure it can be adapted properly to all kinds of hardware
designs in the future.

> 
> Moreover, IMHO another important feature that dirty ring provided is actually
> the full-exit, where we can pause a vcpu when it dirties too fast, while other
I think a proper pause time is hard to decide. A short pause may have little
throttling effect, but a long pause may heavily affect the guest. Do you have a
good algorithm?


> vcpus won't be affected.  That's something I really wanted to POC too but I
> don't have enough time.  I think it's a worth project in the future to really
> make the full-exit throttle vcpus, then ideally we'll remove all the dirty
> bitmaps in QEMU as long as dirty ring is on.
> 
> So I'd say the number I got at that time is not really helping a lot - as you
> can see for small VMs it won't make things faster.  Maybe a bit more efficient?
> I can't tell.  From design-wise it looks actually still better.  However dirty
> logging still has the reasoning to be the default interface we use for small
> vms, imho.
I see.

> 
>>
>> PS: As the dirty ring is still converted into dirty_bitmap of kvm_slot, so
>> the "get dirty info faster" maybe not true. :-(
> 
> We can get dirty info faster even now, I think, because previously we only do
> KVM_GET_DIRTY_LOG once per migration iteration, which could be tens of seconds
> for a VM mentioned above with 24G and 800MB/s dirty rate.  Dirty ring is fully
> async, we'll get that after the reaper thread timeout.  However I must also
> confess "get dirty info faster" doesn't help us a lot on anything yet, afaict,
> comparing to a full-featured dirty logging where clear dirty log and so on.
OK.

> 
> Hope above helps.
Sure, thanks. :)


Keqian
