Hi Marc, Catalin,

I'd like to propose a new dirty tracking solution that combines hardware based
dirty tracking (DBM) and software based dirty tracking (memory abort). Hope to
get your opinions about it.

The core idea is that we do not enable hardware dirty at start (do not add DBM 
bit).
When a arbitrary PT occurs fault, we execute soft tracking for this PT and 
enable
hardware tracking for its *nearby* PTs (e.g. Add DBM bit for nearby 16PTs). 
Then when
sync dirty log, we have known all PTs with hardware dirty enabled, so we do not 
need
to scan all PTs.

We may worry that when dirty rate is over-high we still need to scan too much 
PTs.
We mainly concern the VM stop time. With Qemu dirty rate throttling, the dirty 
memory
is closing to the VM stop threshold, so there is a little PTs to scan after VM 
stop.

It has the advantages of hardware tracking that minimizes side effect on vCPU,
and also has the advantages of software tracking that controls vCPU dirty rate.
Moreover, software tracking helps us to scan PTs at some fixed points, which
greatly reduces scanning time. And the biggest benefit is that we can apply this
solution for dma dirty tracking.

Thanks,
Keqian


On 2020/7/28 15:52, Marc Zyngier wrote:
> On 2020-07-28 03:11, zhukeqian wrote:
>> Hi Marc,
> 
> [...]
> 
>>> But you are still reading the leaf entries of the PTs, hence defeating
>>> any sort of prefetch that the CPU could do for you. And my claim is
>>> that reading the bitmap is much faster than parsing the PTs. Are you
>>> saying that this isn't the case?
>> I am confused here. MMU DBM just updates the S2AP[1] of PTs, so dirty
>> information
>> is not continuous. The smallest granularity of read instruction is one
>> byte, we must
>> read one byte of each PTE to determine whether it is dirty. So I think
>> the smallest
>> reading amount is 512 bytes per 2MB.
> 
> Which is why using DBM as a way to implement dirty-logging doesn't work.
> Forcing the vcpu to take faults in order to update the dirty bitmap
> has the benefit of (a) telling you exactly what page has been written to,
> (b) *slowing the vcpu down*.
> 
> See? no additional read, better convergence ratio because you are not
> trying to catch up with a vcpu running wild. You are in control of the
> dirtying rate, not the vcpu, and the information you get requires no
> extra work (just set the corresponding bit in the dirty bitmap).
> 
> Honestly, I think you are looking at the problem the wrong way.
> DBM at S2 is not a solution to anything, because the information is
> way too sparse, and  it doesn't solve the real problem, which is
> the tracking of dirty pages caused by devices.
> 
> As I said twice before, I will not consider these patches without
> a solution to the DMA problem, and I don't think DBM is part of
> that solution.
> 
> Thanks,
> 
>         M.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

Reply via email to