* Igor Redko (red...@virtuozzo.com) wrote:
> On 28.09.2015 22:22, Dr. David Alan Gilbert wrote:
> >* Denis V. Lunev (d...@openvz.org) wrote:
> >>From: Igor Redko <red...@virtuozzo.com>
> >>
> >>To get this estimation we must divide pending_size by bandwidth,
> >>according to the description of expected-downtime ("qmp-commands.hx:3246"):
> >>  "expected-downtime": only present while migration is active
> >>                total amount in ms for downtime that was calculated on
> >>                the last bitmap round (json-int)
> >>
> >>The previous version was just wrong because dirty_bytes_rate and
> >>bandwidth are both measured in Bytes/ms, so dividing the first by the
> >>second gives a dimensionless quantity.
> >>As the description above says, this value is shown during the active
> >>migration phase and recalculated only after transferring all memory,
> >>and only if that process took more than 1 sec. So maybe nobody noticed
> >>the bug.
> >
> >While I agree the existing code looks wrong, I don't see how this is
> >any more correct.
>
> This patch aims to fix the units of expected_downtime. It is reasonable
> that expected_downtime should be measured in milliseconds, while the
> existing implementation lacks any units.

I agree your units are correct where the old one isn't, and I agree it
needs fixing.
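(Spelling those units out:

    dirty_bytes_rate / bandwidth  ->  (Bytes/ms) / (Bytes/ms)  ->  dimensionless
    pending_size / bandwidth      ->  Bytes / (Bytes/ms)       ->  ms

so the patched expression at least has the units that expected-downtime
is documented to report.)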
However I'm worried about whether the value in your fix is correct.

> >I think 'pending_size' is an estimate of the number of bytes left
> >to transfer, the intention being that most of those are transferred
> >prior to pausing the machine; if they are transferred before pausing,
> >then they aren't part of the downtime.
> >
> Yes, 'pending_size' is indeed an estimate of the number of bytes left
> to transfer.
> But the condition:
>
>     if (s->dirty_bytes_rate && transferred_bytes > 10000) {
>
> slightly modifies the meaning of pending_size correspondingly.
> dirty_bytes_rate is set in migration_sync(), which is called when
> pending_size < max_downtime * bandwidth. This estimation is higher
> than max_downtime by design.

I don't think that check really modifies the meaning of pending_size;
it's just a sanity check that means we don't start trying to predict
downtime when we've not transmitted much yet.

> >It feels that:
> > * If the guest wasn't dirtying pages, then you wouldn't have to
> >   pause the guest; if it was just dirtying them a little then you
> >   wouldn't have much to transfer after the pages you'd already
> >   sent; so if the guest dirties pages fast then the estimate should
> >   be larger; so 'dirty_bytes_rate' being on top of the fraction
> >   feels right.
> >
> > * If the bandwidth is higher then the estimate should be smaller;
> >   so 'bandwidth' being on the bottom of the fraction feels right.
> >
> >Dave
> >
> The 'expected_downtime' in the existing code takes two types of values:
>  * positive - dirty_bytes_rate is higher than bandwidth. In this
>    case migration doesn't complete.
>  * zero - bandwidth is higher than dirty_bytes_rate. In this case
>    migration is possible, but we don't have the downtime value.

OK, I don't think that should give zero; my argument being that, given
each pass through RAM takes some time, you're always going to have
some proportion of RAM that's dirty.

> This patch has some imperfections. But if we look back into history,
> it seems that this patch just restores the broken logic.
> The existing code was introduced by commit
> https://github.com/qemu/qemu/commit/90f8ae724a575861f093fbdbfd49a925bcfec327
> which claims that it just restores the estimation mistakenly deleted
> in commit
> https://github.com/qemu/qemu/commit/e4ed1541ac9413eac494a03532e34beaf8a7d1c5
> However, the estimation changed during that restore operation. The
> estimation before the removal (before
> e4ed1541ac9413eac494a03532e34beaf8a7d1c5) was just like the one in my
> patch.

Yes; remember I believe that the current code is wrong - I'm just not
sure your suggested fix is right.

> So maybe we should think about improving this estimation.
> I suggest using something like:
>     expected_downtime = migrate_max_downtime * dirty_bytes_rate / bandwidth
>
> In my opinion this is more correct than the existing approach, since
> the last step of the migration process (before the pause) is
> transferring max_size bytes (max_size = bandwidth *
> migrate_max_downtime() / 1000000). So the bytes that are dirtied during
> this step will be transferred during the downtime.
> The count of those bytes is dirty_bytes_rate * max_size / bandwidth,
> or migrate_max_downtime * dirty_bytes_rate, and dividing by the
> bandwidth gives the formula:
>     expected_downtime = migrate_max_downtime * dirty_bytes_rate / bandwidth
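In code terms (an untested sketch against the existing variable names,
assuming migrate_max_downtime() still returns nanoseconds as implied by
the max_size calculation above), that suggestion would be roughly:

    /* (Bytes/ms) * ms / (Bytes/ms) = ms */
    s->expected_downtime = s->dirty_bytes_rate
                           * (migrate_max_downtime() / 1000000.0) / bandwidth;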
Are you sure? I thought migrate_max_downtime governs the time *after*
the pause, and it doesn't make sense to me for the expectation of the
system to be related to the expectation of the user.

Dave

> Igor
>
> >>Signed-off-by: Igor Redko <red...@virtuozzo.com>
> >>Reviewed-by: Anna Melekhova <an...@virtuozzo.com>
> >>Signed-off-by: Denis V. Lunev <d...@openvz.org>
> >>CC: Juan Quintela <quint...@redhat.com>
> >>CC: Amit Shah <amit.s...@redhat.com>
> >>---
> >> migration/migration.c | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >>diff --git a/migration/migration.c b/migration/migration.c
> >>index 662e77e..d55d545 100644
> >>--- a/migration/migration.c
> >>+++ b/migration/migration.c
> >>@@ -994,7 +994,7 @@ static void *migration_thread(void *opaque)
> >>             /* if we haven't sent anything, we don't want to recalculate
> >>                10000 is a small enough number for our purposes */
> >>             if (s->dirty_bytes_rate && transferred_bytes > 10000) {
> >>-                s->expected_downtime = s->dirty_bytes_rate / bandwidth;
> >>+                s->expected_downtime = pending_size / bandwidth;
> >>             }
> >>
> >>             qemu_file_reset_rate_limit(s->file);
> >>--
> >>2.1.4
> >>
> >>
> >--
> >Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
>
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK