On Fri, Aug 04, 2017 at 10:52:27AM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (pet...@redhat.com) wrote:
> > On Thu, Aug 03, 2017 at 02:54:35PM +0100, Dr. David Alan Gilbert wrote:
[...]

> > > > @@ -2319,6 +2327,7 @@ static void *migration_thread(void *opaque)
> > > >      /* The active state we expect to be in; ACTIVE or POSTCOPY_ACTIVE */
> > > >      enum MigrationStatus current_active_state = MIGRATION_STATUS_ACTIVE;
> > > >      bool enable_colo = migrate_colo_enabled();
> > > > +    MigThrError thr_error;
> > > >
> > > >      rcu_register_thread();
> > > >
> > > > @@ -2395,8 +2404,17 @@ static void *migration_thread(void *opaque)
> > > >           * Try to detect any kind of failures, and see whether we
> > > >           * should stop the migration now.
> > > >           */
> > > > -        if (migration_detect_error(s)) {
> > > > +        thr_error = migration_detect_error(s);
> > > > +        if (thr_error == MIG_THR_ERR_FATAL) {
> > > > +            /* Stop migration */
> > > >              break;
> > > > +        } else if (thr_error == MIG_THR_ERR_RECOVERED) {
> > > > +            /*
> > > > +             * Just recovered from a e.g. network failure, reset all
> > > > +             * the local variables.
> > > > +             */
> > > > +            initial_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> > > > +            initial_bytes = 0;
> > >
> > > They don't seem that important to reset?
> >
> > The problem is that we have this in migration_thread():
> >
> >         if (current_time >= initial_time + BUFFER_DELAY) {
> >             uint64_t transferred_bytes = qemu_ftell(s->to_dst_file) -
> >                                          initial_bytes;
> >             uint64_t time_spent = current_time - initial_time;
> >             double bandwidth = (double)transferred_bytes / time_spent;
> >             threshold_size = bandwidth * s->parameters.downtime_limit;
> >             ...
> >         }
> >
> > Here qemu_ftell() would possibly be very small since we have just
> > resumed... and then transferred_bytes will be extremely huge since
> > "qemu_ftell(s->to_dst_file) - initial_bytes" is actually negative...
> > Then, with luck, we'll get an extremely huge "bandwidth" as well.
>
> Ah yes that's a good reason to reset it then; add a comment like
> 'important to avoid breaking transferred_bytes and bandwidth
> calculation'

Will do.

--
Peter Xu
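
For reference, the wraparound described above can be reproduced outside
QEMU with a small standalone sketch. It is not the migration code:
bandwidth_estimate() is a hypothetical helper that takes the stream
position as a parameter instead of calling qemu_ftell(), and the byte
counts in the note below are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU function): mirrors the quoted
 * calculation, with the stream position passed in as a parameter.
 */
double bandwidth_estimate(uint64_t stream_pos, uint64_t initial_bytes,
                          uint64_t time_spent_ms)
{
    assert(time_spent_ms > 0);

    /*
     * Unsigned subtraction: if stream_pos is smaller than a stale
     * initial_bytes, the result wraps modulo 2^64 instead of going
     * negative.
     */
    uint64_t transferred_bytes = stream_pos - initial_bytes;

    /* Bytes per millisecond, as in the quoted snippet. */
    return (double)transferred_bytes / time_spent_ms;
}
```

With a stale initial_bytes of 1000000000 and a stream position of 4096
right after a resume, the difference wraps to a value close to 2^64, so
the estimate comes out on the order of 1e17 bytes/ms. With initial_bytes
reset to 0, as the patch does, the same position gives a small, sane
figure.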