Peter Xu <[email protected]> writes:

> Commit dd4fe8844b changed the reporting of expected downtime behavior, so
> that the value will be calculated on-demand.  One side effect on the change
> is QEMU will allow the calculation to happen anytime even if there's no
> transfer happening for a short while.
>
> PeterM reported an ubsan report from clang when running migration-test with
> aarch64 binary on x86_64 hosts.  I can also reproduce if I run the test
> concurrently so some of the src QEMU may not get chance to push any data,
> causing mbps to be 0:
>
> ../migration/migration.c:1051:12: runtime error: -nan is outside the range of 
> representable values of type 'long'
>
> Fix it by properly handle both Inf and Nan to return INT64_MAX.
>
> Add a rich comment, per PeterM's suggestion.
>
> Link: 
> https://lore.kernel.org/r/CAFEAcA-MYH6C39xO0OLx4-M5pKurJpurwRsMqZe9q=w-nsh...@mail.gmail.com
> Reported-by: Peter Maydell <[email protected]>
> Fixes: dd4fe8844b ("migration: Calculate expected downtime on demand")
> Signed-off-by: Peter Xu <[email protected]>
> ---
>  migration/migration.c | 20 +++++++++++++++++++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index b6f78eb3ac..05f10e4576 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -63,6 +63,7 @@
>  #include "system/dirtylimit.h"
>  #include "qemu/sockets.h"
>  #include "system/kvm.h"
> +#include "math.h"
>  
>  #define NOTIFIER_ELEM_INIT(array, elem)    \
>      [elem] = NOTIFIER_WITH_RETURN_LIST_INITIALIZER((array)[elem])
> @@ -1044,12 +1045,29 @@ static bool migrate_show_downtime(MigrationState *s)
>  /* Return expected downtime (unit: milliseconds) */
>  int64_t migration_downtime_calc_expected(MigrationState *s)
>  {
> +    double expected_ms;
> +
>      if (mig_stats.dirty_sync_count <= 1) {
>          return migrate_downtime_limit();
>      }
>  
> -    return mig_stats.dirty_bytes_last_sync /
> +    expected_ms = mig_stats.dirty_bytes_last_sync /
>          migration_get_switchover_bw(s) * 1000;
> +
> +    /*
> +     * If we haven't been able to transfer any data, the result here could
> +     * be NaN (for 0 / 0) or infinity (something else / 0).
> +     *
> +     * Return INT64_MAX as our best approximation to "this will take
> +     * forever to complete". If the problem is transient (e.g. we just
> +     * haven't started to transfer yet) we'll recalculate to a more
> +     * accurate figure later.
> +     */
> +    if (isnan(expected_ms) || expected_ms >= (double)INT64_MAX) {
> +        return INT64_MAX;
> +    }
> +
> +    return (int64_t) expected_ms;
>  }
>  
>  static void populate_time_info(MigrationInfo *info, MigrationState *s)

Reviewed-by: Fabiano Rosas <[email protected]>

Reply via email to