On Mon, Sep 30, 2024 at 01:14:28AM +0800, yong.hu...@smartx.com wrote:
> From: Hyman Huang <yong.hu...@smartx.com>
> 
> Currently, the convergence algorithm determines that the migration
> cannot converge according to the following principle:
> The dirty pages generated in current iteration exceed a specific
> percentage (throttle-trigger-threshold, 50 by default) of the number
> of transmissions. Let's refer to this criterion as the "dirty rate".
> If this criterion is met at least twice
> (dirty_rate_high_cnt >= 2), the throttle percentage is increased.
> 
> In most cases, the above implementation is appropriate. However, for
> a VM under high memory load, each iteration is time-consuming.
> The VM's computing performance may be throttled at a high percentage
> for a long time due to the repeated confirmation behavior, which may
> be intolerable for some computationally sensitive software in the VM.
> 
> As the comment in the migration_trigger_throttle function mentions,
> the original algorithm confirms the criterion repeatedly in order to
> avoid erroneous detection. Put differently, the criterion does not
> need to be validated again once the detection is more reliable.
> 
> In the refinement, in order to make the detection more accurate, we
> introduce another criterion, called the "dirty ratio", to determine
> the migration convergence. The "dirty ratio" is the ratio of
> bytes_xfer_period and bytes_dirty_period. When the algorithm
> repeatedly detects that the "dirty ratio" of the current sync is lower
> than the previous one, the algorithm determines that the migration
> cannot converge. For the "dirty rate" and "dirty ratio", if either
> criterion is met, the penalty percentage is increased. This makes
> the CPU throttle more responsive, which saves time over the entire
> iteration and therefore reduces the duration of VM performance
> degradation.
> 
> In conclusion, this refinement significantly reduces the time
> required for the throttle percentage to step up to its maximum while
> the VM is under a high memory load.

I'm a bit lost on why patches 2-3 are still needed if patch 1 works.
Wouldn't that already greatly increase the chance of the throttle code
being invoked?  Why do we still need this?

Thanks,

-- 
Peter Xu

