On Tue, Oct 1, 2024 at 4:47 AM Peter Xu <pet...@redhat.com> wrote:

> On Mon, Sep 30, 2024 at 01:14:28AM +0800, yong.hu...@smartx.com wrote:
> > From: Hyman Huang <yong.hu...@smartx.com>
> >
> > Currently, the convergence algorithm determines that the migration
> > cannot converge according to the following principle:
> > The dirty pages generated in the current iteration exceed a specific
> > percentage (throttle-trigger-threshold, 50 by default) of the data
> > transferred in that iteration. Let's refer to this criterion as the
> > "dirty rate". If this criterion is met at least twice
> > (dirty_rate_high_cnt >= 2), the throttle percentage is increased.
> >
> > In most cases, the above implementation is appropriate. However, for
> > a VM under high memory load, each iteration is time-consuming.
> > The VM's computing performance may be throttled at a high percentage
> > for a long time due to the repeated confirmation behavior, which
> > may be intolerable for some computationally sensitive software
> > in the VM.
> >
> > As the comment in the migration_trigger_throttle function notes,
> > the original algorithm confirms the criterion repeatedly in order to
> > avoid erroneous detection. Put differently, the criterion does not
> > need to be validated again once the detection is more reliable.
> >
> > In the refinement, in order to make the detection more accurate, we
> > introduce another criterion, called the "dirty ratio", to determine
> > the migration convergence. The "dirty ratio" is the ratio of
> > bytes_xfer_period to bytes_dirty_period. When the algorithm
> > repeatedly detects that the "dirty ratio" of the current sync is
> > lower than the previous one, the algorithm determines that the
> > migration cannot converge. If either of the two criteria, "dirty
> > rate" or "dirty ratio", is met, the penalty percentage is increased.
> > This makes CPU throttling more responsive, which shortens the overall
> > iteration time and therefore reduces the duration of VM performance
> > degradation.
> >
> > In conclusion, this refinement significantly reduces the time
> > required for the throttle percentage to step up to its maximum while
> > the VM is under a high memory load.
>
> I'm a bit lost on why patches 2-3 are still needed if patch 1 works.
> Wouldn't that greatly increase the chance of throttle code being invoked
> already?  Why do we still need this?
>

Indeed, if we are considering how to increase the chance of throttling,
patch 1 is sufficient, and I'm not insisting.

If we are talking about how to detect migration convergence, this
patch, IMHO, is still helpful. Anyway, it depends on your judgment. :)
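
As a rough illustration of the two criteria described in the quoted
commit message, here is a minimal Python sketch. It is not the QEMU code
(which is C) and not the patch itself. The names ThrottleState,
should_increase_throttle, dirty_ratio_low_cnt, prev_dirty_ratio and
ratio_repeat are hypothetical. The commit message only says "repeatedly"
for the dirty ratio check, so the count of 2 and the reset of the
counter when the ratio stops falling are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThrottleState:
    """Hypothetical per-migration counters (illustration only)."""
    dirty_rate_high_cnt: int = 0              # name taken from the thread
    dirty_ratio_low_cnt: int = 0              # hypothetical counter
    prev_dirty_ratio: Optional[float] = None  # ratio seen at the previous sync


def should_increase_throttle(state, bytes_xfer_period, bytes_dirty_period,
                             threshold_pct=50, ratio_repeat=2):
    """Return True if either criterion says the migration is not converging."""
    # Criterion 1, "dirty rate": dirtied bytes exceed threshold_pct percent
    # of transferred bytes, confirmed at least twice.
    rate_hit = False
    if bytes_dirty_period > bytes_xfer_period * threshold_pct / 100:
        state.dirty_rate_high_cnt += 1
        if state.dirty_rate_high_cnt >= 2:
            state.dirty_rate_high_cnt = 0
            rate_hit = True

    # Criterion 2, "dirty ratio": transferred / dirtied keeps dropping
    # from one sync to the next (ratio_repeat is an assumed value).
    ratio_hit = False
    if bytes_dirty_period > 0:
        ratio = bytes_xfer_period / bytes_dirty_period
        if state.prev_dirty_ratio is not None and ratio < state.prev_dirty_ratio:
            state.dirty_ratio_low_cnt += 1
            if state.dirty_ratio_low_cnt >= ratio_repeat:
                state.dirty_ratio_low_cnt = 0
                ratio_hit = True
        else:
            # Assumption: a non-decreasing ratio restarts the confirmation.
            state.dirty_ratio_low_cnt = 0
        state.prev_dirty_ratio = ratio

    return rate_hit or ratio_hit
```

In this sketch the second criterion can fire even when the dirtied bytes
stay below the 50% threshold, as long as the ratio keeps falling, which
is the sense in which it detects non-convergence separately from the
"dirty rate" check.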


>
> Thanks,
>
> --
> Peter Xu
>
>
Yong

-- 
Best regards
