On Tue, Oct 1, 2024 at 4:47 AM Peter Xu <pet...@redhat.com> wrote:

> On Mon, Sep 30, 2024 at 01:14:28AM +0800, yong.hu...@smartx.com wrote:
> > From: Hyman Huang <yong.hu...@smartx.com>
> >
> > Currently, the convergence algorithm determines that the migration
> > cannot converge according to the following principle:
> > The dirty pages generated in the current iteration exceed a specific
> > percentage (throttle-trigger-threshold, 50 by default) of the number
> > of transmissions. Let's refer to this criterion as the "dirty rate".
> > If this criterion is met two or more times
> > (dirty_rate_high_cnt >= 2), the throttle percentage is increased.
> >
> > In most cases, the above implementation is appropriate. However, for a
> > VM with high memory overload, each iteration is time-consuming.
> > The VM's computing performance may be throttled at a high percentage
> > for a long time due to the repeated confirmation behavior, which
> > may be intolerable for some computationally sensitive software
> > in the VM.
> >
> > As the comment in the migration_trigger_throttle function mentions,
> > in order to avoid erroneous detection, the original algorithm confirms
> > the criterion repeatedly. Put differently, the criterion does not need
> > to be validated again once the detection is more reliable.
> >
> > In the refinement, in order to make the detection more accurate, we
> > introduce another criterion, called the "dirty ratio", to determine
> > the migration convergence. The "dirty ratio" is the ratio of
> > bytes_xfer_period and bytes_dirty_period. When the algorithm
> > repeatedly detects that the "dirty ratio" of the current sync is lower
> > than the previous one, the algorithm determines that the migration
> > cannot converge. For the "dirty rate" and "dirty ratio", if one of the
> > two criteria is met, the penalty percentage is increased. This
> > makes the CPU throttle more responsive, which shortens the entire
> > iteration and therefore reduces the duration of VM performance
> > degradation.
> >
> > In conclusion, this refinement significantly reduces the processing
> > time required for the throttle percentage to step to its maximum while
> > the VM is under a high memory load.
>
> I'm a bit lost on why this patch 2-3 is still needed if patch 1 works.
> Wouldn't that greatly increase the chance of throttle code being invoked
> already?  Why do we still need this?
>
Indeed, if we are considering how to increase the chance of throttling,
patch 1 is sufficient, and I'm not insisting.

If we are talking about how to detect the migration convergence, this
patch, IMHO, is still helpful.

Anyway, it depends on your judgment. :)

> Thanks,
>
> --
> Peter Xu
>

Yong

--
Best regards
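
For readers following the thread, the two criteria described in the quoted
commit message can be sketched roughly as follows. This is a toy Python
model, not QEMU code. The class name, the ratio_limit parameter, and the
counter-reset behavior are assumptions made for illustration; the thread
only says the "dirty ratio" must drop "repeatedly" and does not quantify it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThrottleDetector:
    """Toy model of the two convergence checks (hypothetical, not QEMU code)."""
    threshold_pct: int = 50   # throttle-trigger-threshold, default per the patch text
    rate_limit: int = 2       # dirty_rate_high_cnt >= 2, per the patch text
    ratio_limit: int = 2      # ASSUMPTION: "repeatedly" is not quantified in the thread
    rate_high_cnt: int = 0
    ratio_drop_cnt: int = 0
    prev_ratio: Optional[float] = None

    def sync(self, bytes_xfer_period: int, bytes_dirty_period: int) -> bool:
        """Return True if this sync should raise the throttle percentage."""
        # "Dirty rate": dirtied bytes exceed threshold_pct of transferred bytes.
        if bytes_dirty_period * 100 > bytes_xfer_period * self.threshold_pct:
            self.rate_high_cnt += 1

        # "Dirty ratio": transferred / dirtied, compared with the previous sync.
        if bytes_dirty_period:
            ratio = bytes_xfer_period / bytes_dirty_period
        else:
            ratio = float("inf")
        if self.prev_ratio is not None and ratio < self.prev_ratio:
            self.ratio_drop_cnt += 1
        else:
            self.ratio_drop_cnt = 0  # ASSUMPTION: reset when the ratio stops falling
        self.prev_ratio = ratio

        # Either criterion is enough to increase the penalty.
        if (self.rate_high_cnt >= self.rate_limit
                or self.ratio_drop_cnt >= self.ratio_limit):
            self.rate_high_cnt = 0
            self.ratio_drop_cnt = 0
            return True
        return False
```

In this model, a guest whose dirtied bytes grow from sync to sync while
staying under the 50% threshold never trips the "dirty rate" check, but it
does trip the "dirty ratio" check once the ratio has fallen ratio_limit
times in a row, which is the earlier detection the patch is arguing for.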