Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen
On 1/16/19 10:29 AM, Juergen Gross wrote: > On 16/01/2019 16:07, Boris Ostrovsky wrote: >> On 1/16/19 9:33 AM, Juergen Gross wrote: >>> On 16/01/2019 14:17, Boris Ostrovsky wrote: On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote: > @@ -1650,13 +1650,14 @@ void xen_callback_vector(void) > xen_have_vector_callback = 0; > return; > } > - pr_info("Xen HVM callback vector for event delivery is > enabled\n"); > + if (!silent) > + pr_info("Xen HVM callback vector for event > delivery is enabled\n"); How about replacing pr_info() with pr_info_once()? >>> What a nice and simple idea! >>> >>> Extra patch or V4? >>> >> >> I can add this while committing, I don't think it's worth a whole new patch. >> >> One outstanding question I have is whether anything needs to be added to >> the commit message (Thomas had some questions) > He didn't react to my explanation. I'm interpreting that as him being > fine with my explanation, which I believe is not suitable to be added > to the commit message. OK. Applied to (now properly named) for-linus-5.0 -boris
Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen
On 16/01/2019 16:07, Boris Ostrovsky wrote: > On 1/16/19 9:33 AM, Juergen Gross wrote: >> On 16/01/2019 14:17, Boris Ostrovsky wrote: >>> On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote: >>> @@ -1650,13 +1650,14 @@ void xen_callback_vector(void) xen_have_vector_callback = 0; return; } - pr_info("Xen HVM callback vector for event delivery is enabled\n"); + if (!silent) + pr_info("Xen HVM callback vector for event delivery is enabled\n"); >>> How about replacing pr_info() with pr_info_once()? >> What a nice and simple idea! >> >> Extra patch or V4? >> > > > I can add this while committing, I don't think it's worth a whole new patch. > > One outstanding question I have is whether anything needs to be added to > the commit message (Thomas had some questions) He didn't react to my explanation. I'm interpreting that as him being fine with my explanation, which I believe is not suitable to be added to the commit message. Juergen
Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen
On 1/16/19 9:33 AM, Juergen Gross wrote: > On 16/01/2019 14:17, Boris Ostrovsky wrote: >> On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote: >> >>> @@ -1650,13 +1650,14 @@ void xen_callback_vector(void) >>> xen_have_vector_callback = 0; >>> return; >>> } >>> - pr_info("Xen HVM callback vector for event delivery is >>> enabled\n"); >>> + if (!silent) >>> + pr_info("Xen HVM callback vector for event >>> delivery is enabled\n"); >> How about replacing pr_info() with pr_info_once()? > What a nice and simple idea! > > Extra patch or V4? > I can add this while committing, I don't think it's worth a whole new patch. One outstanding question I have is whether anything needs to be added to the commit message (Thomas had some questions) -boris
Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen
On 16/01/2019 14:17, Boris Ostrovsky wrote: > On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote: > >> @@ -1650,13 +1650,14 @@ void xen_callback_vector(void) >> xen_have_vector_callback = 0; >> return; >> } >> - pr_info("Xen HVM callback vector for event delivery is >> enabled\n"); >> + if (!silent) >> + pr_info("Xen HVM callback vector for event >> delivery is enabled\n"); > > How about replacing pr_info() with pr_info_once()? What a nice and simple idea! Extra patch or V4? Juergen
Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen
On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote: > @@ -1650,13 +1650,14 @@ void xen_callback_vector(void) > xen_have_vector_callback = 0; > return; > } > - pr_info("Xen HVM callback vector for event delivery is > enabled\n"); > + if (!silent) > + pr_info("Xen HVM callback vector for event > delivery is enabled\n"); How about replacing pr_info() with pr_info_once()? -boris
Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen
On 16/01/2019 01:24, Hans van Kranenburg wrote: > Hi, > > On 1/14/19 1:44 PM, Juergen Gross wrote: >> Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' >> sched_clock() interface") broke Xen guest time handling across >> migration: >> >> [ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) >> done. >> [ 187.251137] OOM killer disabled. >> [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 >> seconds) done. >> [ 187.252299] suspending xenstore... >> [ 187.266987] xen:grant_table: Grant tables using version 1 layout >> [18446743811.706476] OOM killer enabled. >> [18446743811.706478] Restarting tasks ... done. >> [18446743811.720505] Setting capacity to 16777216 >> >> Fix that by setting xen_sched_clock_offset at resume time to ensure a >> monotonic clock value. >> >> [...] > > With v3 of the patch, I see the time jump in one log line happen, but > only when using PVH. > > [ 49.486453] Freezing user space processes ... (elapsed 0.002 seconds) > done. > [ 49.488743] OOM killer disabled. > [ 49.488764] Freezing remaining freezable tasks ... (elapsed 0.001 > seconds) done. > [ 49.491117] suspending xenstore... > [2000731.388722] xen:events: Xen HVM callback vector for event delivery > is enabled > [ 49.491750] xen:grant_table: Grant tables using version 1 layout > [ 49.810722] OOM killer enabled. > [ 49.810744] Restarting tasks ... done. > [ 49.856263] Setting capacity to 6291456 > [ 50.006002] Setting capacity to 10485760 > > If I start as PV, it never seems to happen. > > Up to you to decide how important this is. :) We could do something like below. Boris? Juergen --- diff --git a/arch/x86/xen/suspend_hvm.c b/arch/x86/xen/suspend_hvm.c index e666b614cf6d..088f3a6b4be9 100644 --- a/arch/x86/xen/suspend_hvm.c +++ b/arch/x86/xen/suspend_hvm.c @@ -13,6 +13,6 @@ void xen_hvm_post_suspend(int suspend_cancelled) xen_hvm_init_shared_info(); xen_vcpu_restore(); } - xen_callback_vector(); + xen_callback_vector(true); xen_unplug_emulated_devices(); } diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h index 0e60bd918695..ba293fda3265 100644 --- a/arch/x86/xen/xen-ops.h +++ b/arch/x86/xen/xen-ops.h @@ -55,7 +55,7 @@ void xen_enable_sysenter(void); void xen_enable_syscall(void); void xen_vcpu_restore(void); -void xen_callback_vector(void); +void xen_callback_vector(bool silent); void xen_hvm_init_shared_info(void); void xen_unplug_emulated_devices(void); diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index 93194f3e7540..8d8d50bea215 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -1637,7 +1637,7 @@ EXPORT_SYMBOL_GPL(xen_set_callback_via); /* Vector callbacks are better than PCI interrupts to receive event * channel notifications because we can receive vector callbacks on any * vcpu and we don't need PCI support or APIC interactions. */ -void xen_callback_vector(void) +void xen_callback_vector(bool silent) { int rc; uint64_t callback_via; @@ -1650,13 +1650,14 @@ void xen_callback_vector(void) xen_have_vector_callback = 0; return; } - pr_info("Xen HVM callback vector for event delivery is enabled\n"); + if (!silent) + pr_info("Xen HVM callback vector for event delivery is enabled\n"); alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, xen_hvm_callback_vector); } } #else -void xen_callback_vector(void) {} +void xen_callback_vector(bool silent) {} #endif #undef MODULE_PARAM_PREFIX @@ -1692,7 +1693,7 @@ void __init xen_init_IRQ(void) pci_xen_initial_domain(); } if (xen_feature(XENFEAT_hvm_callback_vector)) - xen_callback_vector(); + xen_callback_vector(false); if (xen_hvm_domain()) { native_init_IRQ();
Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen
Hi Boris, On 1/14/19 2:54 PM, Boris Ostrovsky wrote: > On 1/14/19 7:44 AM, Juergen Gross wrote: >> Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' >> sched_clock() interface") broke Xen guest time handling across >> migration: >> >> [ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) >> done. >> [ 187.251137] OOM killer disabled. >> [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 >> seconds) done. >> [ 187.252299] suspending xenstore... >> [ 187.266987] xen:grant_table: Grant tables using version 1 layout >> [18446743811.706476] OOM killer enabled. >> [18446743811.706478] Restarting tasks ... done. >> [18446743811.720505] Setting capacity to 16777216 >> >> Fix that by setting xen_sched_clock_offset at resume time to ensure a >> monotonic clock value. >> >> Fixes: f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' >> sched_clock() interface") >> Cc: # 4.11 >> Reported-by: Hans van Kranenburg >> Signed-off-by: Juergen Gross > > Reviewed-by: Boris Ostrovsky Can you please change the address to my work email? Reported-by: Hans van Kranenburg Also (see other email): Tested-by: Hans van Kranenburg Thanks, Hans
Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen
Hi, On 1/14/19 1:44 PM, Juergen Gross wrote: > Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' > sched_clock() interface") broke Xen guest time handling across > migration: > > [ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done. > [ 187.251137] OOM killer disabled. > [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) > done. > [ 187.252299] suspending xenstore... > [ 187.266987] xen:grant_table: Grant tables using version 1 layout > [18446743811.706476] OOM killer enabled. > [18446743811.706478] Restarting tasks ... done. > [18446743811.720505] Setting capacity to 16777216 > > Fix that by setting xen_sched_clock_offset at resume time to ensure a > monotonic clock value. > > [...] With v3 of the patch, I see the time jump in one log line happen, but only when using PVH. [ 49.486453] Freezing user space processes ... (elapsed 0.002 seconds) done. [ 49.488743] OOM killer disabled. [ 49.488764] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 49.491117] suspending xenstore... [2000731.388722] xen:events: Xen HVM callback vector for event delivery is enabled [ 49.491750] xen:grant_table: Grant tables using version 1 layout [ 49.810722] OOM killer enabled. [ 49.810744] Restarting tasks ... done. [ 49.856263] Setting capacity to 6291456 [ 50.006002] Setting capacity to 10485760 If I start as PV, it never seems to happen. Up to you to decide how important this is. :) FYI this is with v3 on top of the Debian stretch-backports 4.19 kernel, which I'm starting to use now to reboot things with. -# uname -a Linux appnode-kylie 4.19.0-0.bpo.1-cloud-amd64 #1 SMP Debian 4.19.12-1~bpo9+1+mendix1 (2019-01-15) x86_64 GNU/Linux https://salsa.debian.org/knorrie-guest/linux/commits/mendix/stretch-backports Hans
Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen
On 15/01/2019 11:43, Thomas Gleixner wrote: > On Mon, 14 Jan 2019, Juergen Gross wrote: > >> Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' >> sched_clock() interface") broke Xen guest time handling across >> migration: >> >> [ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) >> done. >> [ 187.251137] OOM killer disabled. >> [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 >> seconds) done. >> [ 187.252299] suspending xenstore... >> [ 187.266987] xen:grant_table: Grant tables using version 1 layout >> [18446743811.706476] OOM killer enabled. >> [18446743811.706478] Restarting tasks ... done. >> [18446743811.720505] Setting capacity to 16777216 > > I see that it's broken, but the changelog could do with an explanation WHY > it broke. This seems to be rather complex. I believe the mentioned commit just ignored Xen guests resulting in a "stable" clock where it shouldn't, but maybe I have missed another aspect of this commit which is to blame. I tried to fix that by replacing using_native_sched_clock() with a hypervisor specific pvops function. Unfortunately this didn't work, maybe due to other uses of using_native_sched_clock() added by later patches. In the end it was quite clear that updating the Xen clock offset was the right thing to do, so I ended up with this patch. Juergen
Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen
On Mon, 14 Jan 2019, Juergen Gross wrote: > Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' > sched_clock() interface") broke Xen guest time handling across > migration: > > [ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done. > [ 187.251137] OOM killer disabled. > [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) > done. > [ 187.252299] suspending xenstore... > [ 187.266987] xen:grant_table: Grant tables using version 1 layout > [18446743811.706476] OOM killer enabled. > [18446743811.706478] Restarting tasks ... done. > [18446743811.720505] Setting capacity to 16777216 I see that it's broken, but the changelog could do with an explanation WHY it broke. Other than that this looks about right. Thanks, tglx
Re: [PATCH v3] xen: Fix x86 sched_clock() interface for xen
On 1/14/19 7:44 AM, Juergen Gross wrote: > Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' > sched_clock() interface") broke Xen guest time handling across > migration: > > [ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done. > [ 187.251137] OOM killer disabled. > [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) > done. > [ 187.252299] suspending xenstore... > [ 187.266987] xen:grant_table: Grant tables using version 1 layout > [18446743811.706476] OOM killer enabled. > [18446743811.706478] Restarting tasks ... done. > [18446743811.720505] Setting capacity to 16777216 > > Fix that by setting xen_sched_clock_offset at resume time to ensure a > monotonic clock value. > > Fixes: f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' > sched_clock() interface") > Cc: # 4.11 > Reported-by: Hans van Kranenburg > Signed-off-by: Juergen Gross Reviewed-by: Boris Ostrovsky
[PATCH v3] xen: Fix x86 sched_clock() interface for xen
Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface") broke Xen guest time handling across migration: [ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done. [ 187.251137] OOM killer disabled. [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 187.252299] suspending xenstore... [ 187.266987] xen:grant_table: Grant tables using version 1 layout [18446743811.706476] OOM killer enabled. [18446743811.706478] Restarting tasks ... done. [18446743811.720505] Setting capacity to 16777216 Fix that by setting xen_sched_clock_offset at resume time to ensure a monotonic clock value. Fixes: f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface") Cc: # 4.11 Reported-by: Hans van Kranenburg Signed-off-by: Juergen Gross --- arch/x86/xen/time.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c index 72bf446c3fee..6e29794573b7 100644 --- a/arch/x86/xen/time.c +++ b/arch/x86/xen/time.c @@ -361,8 +361,6 @@ void xen_timer_resume(void) { int cpu; - pvclock_resume(); - if (xen_clockevent != &xen_vcpuop_clockevent) return; @@ -379,12 +377,15 @@ static const struct pv_time_ops xen_time_ops __initconst = { }; static struct pvclock_vsyscall_time_info *xen_clock __read_mostly; +static u64 xen_clock_value_saved; void xen_save_time_memory_area(void) { struct vcpu_register_time_memory_area t; int ret; + xen_clock_value_saved = xen_clocksource_read() - xen_sched_clock_offset; + if (!xen_clock) return; @@ -404,7 +405,7 @@ void xen_restore_time_memory_area(void) int ret; if (!xen_clock) - return; + goto out; t.addr.v = &xen_clock->pvti; @@ -421,6 +422,11 @@ void xen_restore_time_memory_area(void) if (ret != 0) pr_notice("Cannot restore secondary vcpu_time_info (err %d)", ret); + +out: + /* Need pvclock_resume() before using xen_clocksource_read(). */ + pvclock_resume(); + xen_sched_clock_offset = xen_clocksource_read() - xen_clock_value_saved; } static void xen_setup_vsyscall_time_info(void) -- 2.16.4