Re: x86: kvmclock: abstract save/restore sched_clock_state

2012-03-01 Thread Thomas Gleixner
On Tue, 7 Feb 2012, Marcelo Tosatti wrote:
> 
> Upon resume from hibernation, CPU 0's hvclock area contains the old
> values for system_time and tsc_timestamp. It is necessary for the
> hypervisor to update these values with uptodate ones before the CPU uses
> them.
> 
> Abstract TSC's save/restore sched_clock_state functions and use
> restore_state to write to KVM_SYSTEM_TIME MSR, forcing an update.
> 
> Fixes suspend-to-disk with kvmclock.
> 
> Signed-off-by: Marcelo Tosatti 
> 
> diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
> index 15d9915..c91e8b9 100644
> --- a/arch/x86/include/asm/tsc.h
> +++ b/arch/x86/include/asm/tsc.h
> @@ -61,7 +61,7 @@ extern void check_tsc_sync_source(int cpu);
>  extern void check_tsc_sync_target(void);
>  
>  extern int notsc_setup(char *);
> -extern void save_sched_clock_state(void);
> -extern void restore_sched_clock_state(void);
> +extern void tsc_save_sched_clock_state(void);
> +extern void tsc_restore_sched_clock_state(void);
>  
>  #endif /* _ASM_X86_TSC_H */
> diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
> index 5d0afac..baaca8d 100644
> --- a/arch/x86/include/asm/x86_init.h
> +++ b/arch/x86/include/asm/x86_init.h
> @@ -162,6 +162,8 @@ struct x86_cpuinit_ops {
>   * @is_untracked_pat_range   exclude from PAT logic
>   * @nmi_init enable NMI on cpus
>   * @i8042_detect pre-detect if i8042 controller exists
> + * @save_sched_clock_state:  save state for sched_clock() on suspend
> + * @restore_sched_clock_state:   restore state for sched_clock() on 
> resume
>   */
>  struct x86_platform_ops {
>   unsigned long (*calibrate_tsc)(void);
> @@ -173,6 +175,8 @@ struct x86_platform_ops {
>   void (*nmi_init)(void);
>   unsigned char (*get_nmi_reason)(void);
>   int (*i8042_detect)(void);
> + void (*save_sched_clock_state)(void);
> + void (*restore_sched_clock_state)(void);
>  };
>  
>  struct pci_dev;
> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> index ca4e735..57e6b78 100644
> --- a/arch/x86/kernel/kvmclock.c
> +++ b/arch/x86/kernel/kvmclock.c
> @@ -136,6 +136,15 @@ int kvm_register_clock(char *txt)
>   return ret;
>  }
>  
> +void kvm_save_sched_clock_state(void)

static ?

> +{
> +}
> +
> +void kvm_restore_sched_clock_state(void)

Ditto

> +{
> + kvm_register_clock("primary cpu clock, resume");
> +}
> +

Otherwise: Reviewed-by: Thomas Gleixner 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: x86: kvmclock: abstract save/restore sched_clock_state (v2)

2012-02-15 Thread Avi Kivity
On 02/13/2012 05:52 PM, Marcelo Tosatti wrote:
> > >  {
> > >+  x86_platform.restore_sched_clock_state();
> > Isn't it too early? It is scarry to say hypervisor to write to some
> > memory location and than completely replace page-tables and half of
> > cpu state in __restore_processor_state. Wouldn't that have a potential
> > of writing into a place that is not restored hv_clock and restored
> > hv_clock might still be stale?
>
> No, memory is copied in swsusp_arch_resume(), which happens
> before restore_processor_state. restore_processor_state() is only
> setting up registers and MTRR.
>

In addition, kvmclock uses physical addresses, so page table changes
don't matter.

Note we could have done this in
__save_processor_state()/__restore_processor_state() (it's just reading
and writing an MSR, like we do for MSR_IA32_MISC_ENABLE), but I think
your patch is the right way.  I'd like an ack from the x86 maintainers
though.

-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: x86: kvmclock: abstract save/restore sched_clock_state (v2)

2012-02-13 Thread Amit Shah
On (Mon) 13 Feb 2012 [11:07:27], Marcelo Tosatti wrote:
> 
> Upon resume from hibernation, CPU 0's hvclock area contains the old
> values for system_time and tsc_timestamp. It is necessary for the
> hypervisor to update these values with uptodate ones before the CPU uses
> them.
> 
> Abstract TSC's save/restore sched_clock_state functions and use
> restore_state to write to KVM_SYSTEM_TIME MSR, forcing an update.
> 
> Also move restore_sched_clock_state before __restore_processor_state,
> since the later calls CONFIG_LOCK_STAT's lockstat_clock (also for TSC).
> Thanks to Igor Mammedov for tracking it down.
> 
> Fixes suspend-to-disk with kvmclock.
> 
> Signed-off-by: Marcelo Tosatti 

This works fine, thanks.

Tested-by: Amit Shah 

Amit
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: x86: kvmclock: abstract save/restore sched_clock_state (v2)

2012-02-13 Thread Marcelo Tosatti
On Mon, Feb 13, 2012 at 04:20:24PM +0100, Igor Mammedov wrote:
> On 02/13/2012 02:07 PM, Marcelo Tosatti wrote:
> >
> >Upon resume from hibernation, CPU 0's hvclock area contains the old
> >values for system_time and tsc_timestamp. It is necessary for the
> >hypervisor to update these values with uptodate ones before the CPU uses
> >them.
> >
> >Abstract TSC's save/restore sched_clock_state functions and use
> >restore_state to write to KVM_SYSTEM_TIME MSR, forcing an update.
> >
> >Also move restore_sched_clock_state before __restore_processor_state,
> >since the later calls CONFIG_LOCK_STAT's lockstat_clock (also for TSC).
> >Thanks to Igor Mammedov for tracking it down.
> >
> >Fixes suspend-to-disk with kvmclock.
> >
> >Signed-off-by: Marcelo Tosatti
> >
> >diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
> >index 15d9915..c91e8b9 100644
> >--- a/arch/x86/include/asm/tsc.h
> >+++ b/arch/x86/include/asm/tsc.h
> >@@ -61,7 +61,7 @@ extern void check_tsc_sync_source(int cpu);
> >  extern void check_tsc_sync_target(void);
> >
> >  extern int notsc_setup(char *);
> >-extern void save_sched_clock_state(void);
> >-extern void restore_sched_clock_state(void);
> >+extern void tsc_save_sched_clock_state(void);
> >+extern void tsc_restore_sched_clock_state(void);
> >
> >  #endif /* _ASM_X86_TSC_H */
> >diff --git a/arch/x86/include/asm/x86_init.h 
> >b/arch/x86/include/asm/x86_init.h
> >index 5d0afac..baaca8d 100644
> >--- a/arch/x86/include/asm/x86_init.h
> >+++ b/arch/x86/include/asm/x86_init.h
> >@@ -162,6 +162,8 @@ struct x86_cpuinit_ops {
> >   * @is_untracked_pat_range exclude from PAT logic
> >   * @nmi_init   enable NMI on cpus
> >   * @i8042_detect   pre-detect if i8042 controller exists
> >+ * @save_sched_clock_state: save state for sched_clock() on suspend
> >+ * @restore_sched_clock_state:  restore state for sched_clock() on 
> >resume
> >   */
> >  struct x86_platform_ops {
> > unsigned long (*calibrate_tsc)(void);
> >@@ -173,6 +175,8 @@ struct x86_platform_ops {
> > void (*nmi_init)(void);
> > unsigned char (*get_nmi_reason)(void);
> > int (*i8042_detect)(void);
> >+void (*save_sched_clock_state)(void);
> >+void (*restore_sched_clock_state)(void);
> >  };
> >
> >  struct pci_dev;
> >diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> >index ca4e735..57e6b78 100644
> >--- a/arch/x86/kernel/kvmclock.c
> >+++ b/arch/x86/kernel/kvmclock.c
> >@@ -136,6 +136,15 @@ int kvm_register_clock(char *txt)
> > return ret;
> >  }
> >
> >+void kvm_save_sched_clock_state(void)
> >+{
> >+}
> >+
> >+void kvm_restore_sched_clock_state(void)
> >+{
> >+kvm_register_clock("primary cpu clock, resume");
> >+}
> >+
> >  #ifdef CONFIG_X86_LOCAL_APIC
> >  static void __cpuinit kvm_setup_secondary_clock(void)
> >  {
> >@@ -195,6 +204,8 @@ void __init kvmclock_init(void)
> > x86_cpuinit.early_percpu_clock_init =
> > kvm_setup_secondary_clock;
> >  #endif
> >+x86_platform.save_sched_clock_state = kvm_save_sched_clock_state;
> >+x86_platform.restore_sched_clock_state = kvm_restore_sched_clock_state;
> > machine_ops.shutdown  = kvm_shutdown;
> >  #ifdef CONFIG_KEXEC
> > machine_ops.crash_shutdown  = kvm_crash_shutdown;
> >diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> >index a62c201..aed2aa1 100644
> >--- a/arch/x86/kernel/tsc.c
> >+++ b/arch/x86/kernel/tsc.c
> >@@ -629,7 +629,7 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int 
> >cpu)
> >
> >  static unsigned long long cyc2ns_suspend;
> >
> >-void save_sched_clock_state(void)
> >+void tsc_save_sched_clock_state(void)
> >  {
> > if (!sched_clock_stable)
> > return;
> >@@ -645,7 +645,7 @@ void save_sched_clock_state(void)
> >   * that sched_clock() continues from the point where it was left off during
> >   * suspend.
> >   */
> >-void restore_sched_clock_state(void)
> >+void tsc_restore_sched_clock_state(void)
> >  {
> > unsigned long long offset;
> > unsigned long flags;
> >diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
> >index 6f2ec53..e9f265f 100644
> >--- a/arch/x86/kernel/x86_init.c
> >+++ b/arch/x86/kernel/x86_init.c
> >@@ -108,7 +108,9 @@ struct x86_platform_ops x86_platform = {
> > .is_untracked_pat_range = is_ISA_range,
> > .nmi_init   = default_nmi_init,
> > .get_nmi_reason = default_get_nmi_reason,
> >-.i8042_detect   = default_i8042_detect
> >+.i8042_detect   = default_i8042_detect,
> >+.save_sched_clock_state = tsc_save_sched_clock_state,
> >+.restore_sched_clock_state  = tsc_restore_sched_clock_state,
> >  };
> >
> >  EXPORT_SYMBOL_GPL(x86_platform);
> >diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
> >index f10c0af..0e76a28 100644
> >--- a/arch/x86/power/cpu.c
> >+++ b/arch/x86/power/cpu.c
> >@@ -114,7 +114,7 @@ static void __save_processor_sta

Re: x86: kvmclock: abstract save/restore sched_clock_state (v2)

2012-02-13 Thread Igor Mammedov

On 02/13/2012 02:07 PM, Marcelo Tosatti wrote:


Upon resume from hibernation, CPU 0's hvclock area contains the old
values for system_time and tsc_timestamp. It is necessary for the
hypervisor to update these values with uptodate ones before the CPU uses
them.

Abstract TSC's save/restore sched_clock_state functions and use
restore_state to write to KVM_SYSTEM_TIME MSR, forcing an update.

Also move restore_sched_clock_state before __restore_processor_state,
since the later calls CONFIG_LOCK_STAT's lockstat_clock (also for TSC).
Thanks to Igor Mammedov for tracking it down.

Fixes suspend-to-disk with kvmclock.

Signed-off-by: Marcelo Tosatti

diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index 15d9915..c91e8b9 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -61,7 +61,7 @@ extern void check_tsc_sync_source(int cpu);
  extern void check_tsc_sync_target(void);

  extern int notsc_setup(char *);
-extern void save_sched_clock_state(void);
-extern void restore_sched_clock_state(void);
+extern void tsc_save_sched_clock_state(void);
+extern void tsc_restore_sched_clock_state(void);

  #endif /* _ASM_X86_TSC_H */
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 5d0afac..baaca8d 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -162,6 +162,8 @@ struct x86_cpuinit_ops {
   * @is_untracked_pat_rangeexclude from PAT logic
   * @nmi_init  enable NMI on cpus
   * @i8042_detect  pre-detect if i8042 controller exists
+ * @save_sched_clock_state:save state for sched_clock() on suspend
+ * @restore_sched_clock_state: restore state for sched_clock() on resume
   */
  struct x86_platform_ops {
unsigned long (*calibrate_tsc)(void);
@@ -173,6 +175,8 @@ struct x86_platform_ops {
void (*nmi_init)(void);
unsigned char (*get_nmi_reason)(void);
int (*i8042_detect)(void);
+   void (*save_sched_clock_state)(void);
+   void (*restore_sched_clock_state)(void);
  };

  struct pci_dev;
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index ca4e735..57e6b78 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -136,6 +136,15 @@ int kvm_register_clock(char *txt)
return ret;
  }

+void kvm_save_sched_clock_state(void)
+{
+}
+
+void kvm_restore_sched_clock_state(void)
+{
+   kvm_register_clock("primary cpu clock, resume");
+}
+
  #ifdef CONFIG_X86_LOCAL_APIC
  static void __cpuinit kvm_setup_secondary_clock(void)
  {
@@ -195,6 +204,8 @@ void __init kvmclock_init(void)
x86_cpuinit.early_percpu_clock_init =
kvm_setup_secondary_clock;
  #endif
+   x86_platform.save_sched_clock_state = kvm_save_sched_clock_state;
+   x86_platform.restore_sched_clock_state = kvm_restore_sched_clock_state;
machine_ops.shutdown  = kvm_shutdown;
  #ifdef CONFIG_KEXEC
machine_ops.crash_shutdown  = kvm_crash_shutdown;
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index a62c201..aed2aa1 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -629,7 +629,7 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)

  static unsigned long long cyc2ns_suspend;

-void save_sched_clock_state(void)
+void tsc_save_sched_clock_state(void)
  {
if (!sched_clock_stable)
return;
@@ -645,7 +645,7 @@ void save_sched_clock_state(void)
   * that sched_clock() continues from the point where it was left off during
   * suspend.
   */
-void restore_sched_clock_state(void)
+void tsc_restore_sched_clock_state(void)
  {
unsigned long long offset;
unsigned long flags;
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index 6f2ec53..e9f265f 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -108,7 +108,9 @@ struct x86_platform_ops x86_platform = {
.is_untracked_pat_range = is_ISA_range,
.nmi_init   = default_nmi_init,
.get_nmi_reason = default_get_nmi_reason,
-   .i8042_detect   = default_i8042_detect
+   .i8042_detect   = default_i8042_detect,
+   .save_sched_clock_state = tsc_save_sched_clock_state,
+   .restore_sched_clock_state  = tsc_restore_sched_clock_state,
  };

  EXPORT_SYMBOL_GPL(x86_platform);
diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
index f10c0af..0e76a28 100644
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -114,7 +114,7 @@ static void __save_processor_state(struct saved_context 
*ctxt)
  void save_processor_state(void)
  {
__save_processor_state(&saved_context);
-   save_sched_clock_state();
+   x86_platform.save_sched_clock_state();
  }
  #ifdef CONFIG_X86_32
  EXPORT_SYMBOL(save_processor_state);
@@ -230,8 +230,8 @@ static void __restore_processor_state(struct saved_context 
*ctxt)
  /* Needed by apm.c 

Re: x86: kvmclock: abstract save/restore sched_clock_state

2012-02-13 Thread Amit Shah
On (Fri) 10 Feb 2012 [10:33:37], Marcelo Tosatti wrote:
> On Fri, Feb 10, 2012 at 10:32:16AM -0200, Marcelo Tosatti wrote:
> > On Fri, Feb 10, 2012 at 03:32:11PM +0530, Amit Shah wrote:
> > > On (Thu) 09 Feb 2012 [16:13:29], Igor Mammedov wrote:
> > > 
> > > > Stalls are probably caused by uninitialized percpu hv_clock, with
> > > > following patch I don't see stalls. Although I might be just lucky.
> > > > http://git.kernel.org/?p=virt/kvm/kvm.git;a=commit;h=e2971ac7e1d186af059e088d305496c5cb47d487
> > > 
> > > Your commit does make things better, I don't see any stalls on the
> > > first resume.
> > > 
> > > However, a subsequent s4 causes the stall to re-appear on resume, and
> > > this time there are no stall messages; the kernel just sits there
> > > spinning on something.  I've not found the solution to this one yet (I
> > > had a commit similar to Marcelo's in the works, which got me to the
> > > previous works-but-stalls behaviour).
> > 
> > I cannot reproduce it here. Suspend/resume are operating normally after
> > several iterations. Igor do you see anything similar?
> > 
> > Amit, can you please enable CONFIG_PRINTK_TIME=y and post a full dmesg 
> > (both during suspend and also the new kernel during resume).
> 
> Also is it reproducible with UP guest?

Yes, it is.

Amit
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: x86: kvmclock: abstract save/restore sched_clock_state

2012-02-13 Thread Amit Shah
On (Fri) 10 Feb 2012 [13:43:05], Igor Mammedov wrote:
> Another thing is to try smp guest without kvmclock and see if it helps.
> It might be just something else.

Nope, it's related to kvmclock.

Amit
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: x86: kvmclock: abstract save/restore sched_clock_state

2012-02-13 Thread Amit Shah
On (Fri) 10 Feb 2012 [21:58:47], Igor Mammedov wrote:
> BTW Amit,
> your config doesn't have CONFIG_KVM_GUEST set, which causes primary cpu clock 
> to be
> uninitialized too in case of SMP kernel.

Interesting.  I didn't notice that.  However, if I enable that option,
resume fails for me even the first time.

Amit
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: x86: kvmclock: abstract save/restore sched_clock_state

2012-02-10 Thread Igor Mammedov

I've to revoke my ack and say NAK to this patch. Patch itself is in right
direction but clock restore happens too late.

With patch I've used to hunt down overflow for cpu hotplug
(maybe it will be better for it to be in kernel, which will help to detect
 overflow problem without spending a lot of time to debug it?):

diff --git a/arch/x86/kernel/pvclock.c b/arch/x86/kernel/pvclock.c
index 42eb330..0081e10 100644
--- a/arch/x86/kernel/pvclock.c
+++ b/arch/x86/kernel/pvclock.c
@@ -43,7 +43,10 @@ void pvclock_set_flags(u8 flags)

 static u64 pvclock_get_nsec_offset(struct pvclock_shadow_time *shadow)
 {
-   u64 delta = native_read_tsc() - shadow->tsc_timestamp;
+   u64 tsc = native_read_tsc();
+   u64 shadow_timestamp = shadow->tsc_timestamp;
+   u64 delta = tsc - shadow_timestamp;
+   BUG_ON (tsc < shadow_timestamp);
return pvclock_scale_delta(delta, shadow->tsc_to_nsec_mul,
   shadow->tsc_shift);
 }

I've got following back-trace:

#29 0x81004925 in oops_end (flags=70, regs=, signr=11) 
at arch/x86/kernel/dumpstack.c:246
#30 0x81004a77 in die (str=0x8152ca7f "invalid opcode", 
regs=0x8800149859e8, err=0)
at arch/x86/kernel/dumpstack.c:305
#31 0x81001fed in do_trap (trapnr=6, signr=4, str=0x8152ca7f 
"invalid opcode", regs=0x8800149859e8,
error_code=0, info=0x880014985948) at arch/x86/kernel/traps.c:168
#32 0x810021f7 in do_invalid_op (regs=0x8800149859e8, error_code=0) 
at arch/x86/kernel/traps.c:209
#33 
#34 pvclock_get_nsec_offset (shadow=0x880014985a98) at 
arch/x86/kernel/pvclock.c:51
#35 pvclock_clocksource_read (src=0x88001f1d1ac0) at 
arch/x86/kernel/pvclock.c:107
#36 0x8101aa97 in kvm_clock_read () at arch/x86/kernel/kvmclock.c:79
#37 0x810087cc in paravirt_sched_clock () at 
/home/imammedov/fc15/linux-2.6/arch/x86/include/asm/paravirt.h:230
#38 sched_clock () at arch/x86/kernel/tsc.c:71
#39 0x8104f025 in sched_clock_local (scd=0x88001f1d2600) at 
kernel/sched/clock.c:147
#40 0x8104f1b7 in sched_clock_cpu (cpu=0) at kernel/sched/clock.c:232
#41 0x8104f1e5 in local_clock () at kernel/sched/clock.c:316
#42 0x810699d2 in lockstat_clock () at kernel/lockdep.c:158
#43 __lock_acquire (lock=0x816131b8, subclass=, trylock=0, 
read=0, check=2, hardirqs_off=,
nest_lock=0x0, ip=18446744071578924098, references=0) at 
kernel/lockdep.c:3098
#44 0x8106a924 in lock_acquire (lock=0x816131b8, subclass=0, 
trylock=0, read=0, check=2, nest_lock=0x0,
ip=18446744071578924098) at kernel/lockdep.c:3555
#45 0x8136b1c3 in __raw_spin_lock (lock=0x816131a0) at 
include/linux/spinlock_api_smp.h:143
#46 _raw_spin_lock (lock=0x816131a0) at kernel/spinlock.c:137
#47 0x81013442 in prepare_set () at 
arch/x86/kernel/cpu/mtrr/generic.c:676
#48 0x8101368c in generic_set_all () at 
arch/x86/kernel/cpu/mtrr/generic.c:723
#49 0x8101258b in mtrr_bp_restore () at 
arch/x86/kernel/cpu/mtrr/main.c:738
#50 0x812cabfa in __restore_processor_state (ctxt=) at 
arch/x86/power/cpu.c:227
#51 restore_processor_state () at arch/x86/power/cpu.c:233
---Type  to continue, or q  to quit---
#52 0x8105a1b6 in create_image (platform_mode=0) at 
kernel/power/hibernate.c:291
#53 hibernation_snapshot (platform_mode=0) at kernel/power/hibernate.c:363
#54 0x8105a825 in hibernate () at kernel/power/hibernate.c:629
#55 0x81058868 in state_store (kobj=, attr=,
buf=0x88001366e000 , n=5) at 
kernel/power/main.c:284
#56 0x811e6cc3 in kobj_attr_store (kobj=, attr=, buf=,
count=) at lib/kobject.c:699
#57 0x811372c2 in flush_write_buffer (count=, buffer=, dentry=)
at fs/sysfs/file.c:202
#58 sysfs_write_file (file=, buf=0x7f9ce710d000 , count=,
ppos=0x880014985f58) at fs/sysfs/file.c:236
#59 0x810ebfbe in vfs_write (file=0x88001cb48000, buf=0x7f9ce710d000 
,
count=, pos=0x880014985f58) at fs/read_write.c:435
#60 0x810ec175 in sys_write (fd=, buf=0x7f9ce710d000 
,
count=) at fs/read_write.c:487

that shows an access to clock happens right before 
x86_platform.restore_sched_clock_state
is called.

Moving x86_platform.restore_sched_clock_state before mtrr_bp_restore solves 
issue.
It isn't bugged on me for 20 save/restore cycles with this change, without this 
change
it bugs on 2nd-3rd cycle.


BTW Amit,
your config doesn't have CONFIG_KVM_GUEST set, which causes primary cpu clock 
to be
uninitialized too in case of SMP kernel. Commit ca3f10172eea9b95b moved it from
kvmclock_init to kvm_guest_init but forgot to change Kconfig too and 
kvm_guest_init
is called only when KVM_GUEST is selected.  This commit created following 
implicit deps

if SMP && KVM_CLOCK then select KVM_GUEST

But I don't know if it's possible to express it in Kconfig. That rises a 
question:
Do we need KVM_CLOCK option? Maybe better

Re: x86: kvmclock: abstract save/restore sched_clock_state

2012-02-10 Thread Amit Shah
On (Fri) 10 Feb 2012 [10:32:16], Marcelo Tosatti wrote:
> On Fri, Feb 10, 2012 at 03:32:11PM +0530, Amit Shah wrote:
> > On (Thu) 09 Feb 2012 [16:13:29], Igor Mammedov wrote:
> > 
> > > Stalls are probably caused by uninitialized percpu hv_clock, with
> > > following patch I don't see stalls. Although I might be just lucky.
> > > http://git.kernel.org/?p=virt/kvm/kvm.git;a=commit;h=e2971ac7e1d186af059e088d305496c5cb47d487
> > 
> > Your commit does make things better, I don't see any stalls on the
> > first resume.
> > 
> > However, a subsequent s4 causes the stall to re-appear on resume, and
> > this time there are no stall messages; the kernel just sits there
> > spinning on something.  I've not found the solution to this one yet (I
> > had a commit similar to Marcelo's in the works, which got me to the
> > previous works-but-stalls behaviour).
> 
> I cannot reproduce it here. Suspend/resume are operating normally after
> several iterations. Igor do you see anything similar?
> 
> Amit, can you please enable CONFIG_PRINTK_TIME=y and post a full dmesg 
> (both during suspend and also the new kernel during resume).

In my case, I run a ping to the host (10.0.2.2) while the s4
suspend/resume operations are performed.

Complete dmesg, for all 3 invocations of the guest.  First one boots,
starts ping, starts s4.  Second one starts s4 after confirming ping is
working fine.  Third one just stays there, spinning.

$ ./x86_64-softmmu/qemu-system-x86_64 -kernel ~/src/linux/arch/x86/boot/bzImage 
 -append 'root=/dev/vda1 console=tty0 console=ttyS0 no_console_suspend' -drive 
file=/guests/f14-suspend.qcow2,if=none,id=dr0 -device virtio-blk-pci,drive=dr0 
-net nic,model=virtio -net user  -serial stdio  -enable-kvm -m 512  -cpu host 
-smp 4
[0.00] Initializing cgroup subsys cpu
[0.00] Linux version 3.3.0-rc2+ (a...@amit.redhat.com) (gcc version 
4.6.2 20111027 (Red Hat 4.6.2-1) (GCC) ) #295 SMP PREEMPT Fri Feb 10 18:39:48 
IST 2012
[0.00] Command line: root=/dev/vda1 console=tty0 console=ttyS0 
no_console_suspend
[0.00] BIOS-provided physical RAM map:
[0.00]  BIOS-e820:  - 0009dc00 (usable)
[0.00]  BIOS-e820: 0009dc00 - 000a (reserved)
[0.00]  BIOS-e820: 000f - 0010 (reserved)
[0.00]  BIOS-e820: 0010 - 1fffd000 (usable)
[0.00]  BIOS-e820: 1fffd000 - 2000 (reserved)
[0.00]  BIOS-e820: feffc000 - ff00 (reserved)
[0.00]  BIOS-e820: fffc - 0001 (reserved)
[0.00] NX (Execute Disable) protection: active
[0.00] DMI 2.4 present.
[0.00] No AGP bridge found
[0.00] last_pfn = 0x1fffd max_arch_pfn = 0x4
[0.00] x86 PAT enabled: cpu 0, old 0x70406, new 0x7010600070106
[0.00] init_memory_mapping: -1fffd000
[0.00] RAMDISK: 1fa2e000 - 1fff
[0.00] ACPI: RSDP 000fd3f0 00014 (v00 BOCHS )
[0.00] ACPI: RSDT 1fffd660 00034 (v01 BOCHS  BXPCRSDT 0001 
BXPC 0001)
[0.00] ACPI: FACP 1f80 00074 (v01 BOCHS  BXPCFACP 0001 
BXPC 0001)
[0.00] ACPI: DSDT 1fffd9b0 02589 (v01   BXPC   BXDSDT 0001 
INTL 20100528)
[0.00] ACPI: FACS 1f40 00040
[0.00] ACPI: SSDT 1fffd7e0 001C1 (v01 BOCHS  BXPCSSDT 0001 
BXPC 0001)
[0.00] ACPI: APIC 1fffd6e0 0008A (v01 BOCHS  BXPCAPIC 0001 
BXPC 0001)
[0.00] ACPI: HPET 1fffd6a0 00038 (v01 BOCHS  BXPCHPET 0001 
BXPC 0001)
[0.00] kvm-clock: Using msrs 4b564d01 and 4b564d00
[0.00] kvm-clock: cpu 0, msr 0:18509c1, boot clock
[0.00] Zone PFN ranges:
[0.00]   DMA  0x0010 -> 0x1000
[0.00]   DMA320x1000 -> 0x0010
[0.00]   Normal   empty
[0.00] Movable zone start PFN for each node
[0.00] Early memory PFN ranges
[0.00] 0: 0x0010 -> 0x009d
[0.00] 0: 0x0100 -> 0x0001fffd
[0.00] ACPI: PM-Timer IO Port: 0xb008
[0.00] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[0.00] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
[0.00] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
[0.00] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled)
[0.00] ACPI: IOAPIC (id[0x04] address[0xfec0] gsi_base[0])
[0.00] IOAPIC[0]: apic_id 4, version 17, address 0xfec0, GSI 0-23
[0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
[0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
[0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
[0.00] Using ACPI

Re: x86: kvmclock: abstract save/restore sched_clock_state

2012-02-10 Thread Igor Mammedov

On 02/10/2012 01:33 PM, Marcelo Tosatti wrote:

On Fri, Feb 10, 2012 at 10:32:16AM -0200, Marcelo Tosatti wrote:

On Fri, Feb 10, 2012 at 03:32:11PM +0530, Amit Shah wrote:

On (Thu) 09 Feb 2012 [16:13:29], Igor Mammedov wrote:


Stalls are probably caused by uninitialized percpu hv_clock, with
following patch I don't see stalls. Although I might be just lucky.
http://git.kernel.org/?p=virt/kvm/kvm.git;a=commit;h=e2971ac7e1d186af059e088d305496c5cb47d487


Your commit does make things better, I don't see any stalls on the
first resume.

However, a subsequent s4 causes the stall to re-appear on resume, and
this time there are no stall messages; the kernel just sits there
spinning on something.  I've not found the solution to this one yet (I
had a commit similar to Marcelo's in the works, which got me to the
previous works-but-stalls behaviour).


I cannot reproduce it here. Suspend/resume are operating normally after
several iterations. Igor do you see anything similar?


I wasn't able to reproduce it either but I haven't tried with Amit's config
yet.



Amit, can you please enable CONFIG_PRINTK_TIME=y and post a full dmesg
(both during suspend and also the new kernel during resume).


Also is it reproducible with UP guest?


Another thing is to try smp guest without kvmclock and see if it helps.
It might be just something else.


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
Thanks,
 Igor
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: x86: kvmclock: abstract save/restore sched_clock_state

2012-02-10 Thread Marcelo Tosatti
On Fri, Feb 10, 2012 at 10:32:16AM -0200, Marcelo Tosatti wrote:
> On Fri, Feb 10, 2012 at 03:32:11PM +0530, Amit Shah wrote:
> > On (Thu) 09 Feb 2012 [16:13:29], Igor Mammedov wrote:
> > 
> > > Stalls are probably caused by uninitialized percpu hv_clock, with
> > > following patch I don't see stalls. Although I might be just lucky.
> > > http://git.kernel.org/?p=virt/kvm/kvm.git;a=commit;h=e2971ac7e1d186af059e088d305496c5cb47d487
> > 
> > Your commit does make things better, I don't see any stalls on the
> > first resume.
> > 
> > However, a subsequent s4 causes the stall to re-appear on resume, and
> > this time there are no stall messages; the kernel just sits there
> > spinning on something.  I've not found the solution to this one yet (I
> > had a commit similar to Marcelo's in the works, which got me to the
> > previous works-but-stalls behaviour).
> 
> I cannot reproduce it here. Suspend/resume are operating normally after
> several iterations. Igor do you see anything similar?
> 
> Amit, can you please enable CONFIG_PRINTK_TIME=y and post a full dmesg 
> (both during suspend and also the new kernel during resume).

Also is it reproducible with UP guest?

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: x86: kvmclock: abstract save/restore sched_clock_state

2012-02-10 Thread Marcelo Tosatti
On Fri, Feb 10, 2012 at 03:32:11PM +0530, Amit Shah wrote:
> On (Thu) 09 Feb 2012 [16:13:29], Igor Mammedov wrote:
> 
> > Stalls are probably caused by uninitialized percpu hv_clock, with
> > following patch I don't see stalls. Although I might be just lucky.
> > http://git.kernel.org/?p=virt/kvm/kvm.git;a=commit;h=e2971ac7e1d186af059e088d305496c5cb47d487
> 
> Your commit does make things better, I don't see any stalls on the
> first resume.
> 
> However, a subsequent s4 causes the stall to re-appear on resume, and
> this time there are no stall messages; the kernel just sits there
> spinning on something.  I've not found the solution to this one yet (I
> had a commit similar to Marcelo's in the works, which got me to the
> previous works-but-stalls behaviour).

I cannot reproduce it here. Suspend/resume are operating normally after
several iterations. Igor do you see anything similar?

Amit, can you please enable CONFIG_PRINTK_TIME=y and post a full dmesg 
(both during suspend and also the new kernel during resume).

Thanks.

> > However there is/are a warning/s on suspend path and with following patch:
> 
> I didn't see this.

This is unrelated.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: x86: kvmclock: abstract save/restore sched_clock_state

2012-02-10 Thread Amit Shah
On (Fri) 10 Feb 2012 [05:11:00], Igor Mammedov wrote:
> Could you send me your .config and commit id of kernel you are using?

Kernel's based on bd3ce7d57c380af110c86d19e256115d0e7053ca plus your
commit + Marcelo's patch.

config is attached below.

#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 3.3.0-rc2 Kernel Configuration
#
CONFIG_64BIT=y
# CONFIG_X86_32 is not set
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_MMU=y
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_DEFAULT_IDLE=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_X86_64_SMP=y
CONFIG_X86_HT=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx 
-fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 
-fcall-saved-r11"
# CONFIG_KTIME_SCALAR is not set
CONFIG_ARCH_CPU_PROBE_RELEASE=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_HAVE_IRQ_WORK=y
CONFIG_IRQ_WORK=y

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
# CONFIG_KERNEL_GZIP is not set
# CONFIG_KERNEL_BZIP2 is not set
CONFIG_KERNEL_LZMA=y
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
CONFIG_DEFAULT_HOSTNAME="virthost"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
# CONFIG_FHANDLE is not set
# CONFIG_TASKSTATS is not set
CONFIG_AUDIT=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_WATCH=y
CONFIG_AUDIT_TREE=y
# CONFIG_AUDIT_LOGINUID_IMMUTABLE is not set
CONFIG_HAVE_GENERIC_HARDIRQS=y

#
# IRQ subsystem
#
CONFIG_GENERIC_HARDIRQS=y
CONFIG_HAVE_SPARSE_IRQ=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y

#
# RCU Subsystem
#
CONFIG_TREE_PREEMPT_RCU=y
CONFIG_PREEMPT_RCU=y
# CONFIG_RCU_TRACE is not set
CONFIG_RCU_FANOUT=64
# CONFIG_RCU_FANOUT_EXACT is not set
# CONFIG_RCU_FAST_NO_HZ is not set
# CONFIG_TREE_RCU_TRACE is not set
# CONFIG_RCU_BOOST is not set
# CONFIG_IKCONFIG is not set
CONFIG_LOG_BUF_SHIFT=17
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_CGROUPS=y
# CONFIG_CGROUP_DEBUG is not set
# CONFIG_CGROUP_FREEZER is not set
# CONFIG_CGROUP_DEVICE is not set
# CONFIG_CPUSETS is not set
# CONFIG_CGROUP_CPUACCT is not set
# CONFIG_RESOURCE_COUNTERS is not set
# CONFIG_CGROUP_PERF is not set
CONFIG_CGROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
# CONFIG_CFS_BANDWIDTH is not set
# CONFIG_RT_GROUP_SCHED is not set
# CONFIG_BLK_CGROUP is not set
# CONFIG_CHECKPOINT_RESTORE is not set
CONFIG_NAMESPACES=y
CONFIG_UTS_NS=y
CONFIG_IPC_NS=y
CONFIG_USER_NS=y
CONFIG_PID_NS=y
CONFIG_NET_NS=y
CONFIG_SCHED_AUTOGROUP=y
# CONFIG_SYSFS_DEPRECATED is not set
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_XZ=y
CONFIG_RD_LZO=y
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
CONFIG_ANON_INODES=y
# CONFIG_EXPERT is not set
# CONFIG_SYSCTL_SYSCALL is not set
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_AIO=y
# CONFIG_EMBEDDED is not set
CONFIG_HAVE_PERF_EVENTS=y

#
# Kernel Performance Events And Counters
#
CONFIG_PERF_EVENTS=y
# CONFIG_PERF_COUNTERS is not set
# CONFIG_DEBUG_PERF_USE_VMALLOC is not set
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_PCI_QUIRKS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_COMPAT_BRK is not set
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_PROFILING is not set
CONFIG_TRACEPOINTS=y
CONFIG_HAVE_OPROFILE=y
CONFIG_OPROFILE_NMI_TIMER=y
# CONFIG_KPROBES is not set
# CONFIG_JUMP_LABEL is not set
CONFIG_HAVE_

Re: x86: kvmclock: abstract save/restore sched_clock_state

2012-02-10 Thread Igor Mammedov
Could you send me your .config and commit id of kernel you are using?

- Original Message -
> From: "Amit Shah" 
> To: "Igor Mammedov" 
> Cc: "Marcelo Tosatti" , kvm@vger.kernel.org, 
> t...@linutronix.de, mi...@redhat.com,
> h...@zytor.com, x...@kernel.org, johns...@us.ibm.com, r...@redhat.com, 
> a...@redhat.com
> Sent: Friday, February 10, 2012 11:02:11 AM
> Subject: Re: x86: kvmclock: abstract save/restore sched_clock_state
> 
> On (Thu) 09 Feb 2012 [16:13:29], Igor Mammedov wrote:
> 
> > Stalls are probably caused by uninitialized percpu hv_clock, with
> > following patch I don't see stalls. Although I might be just lucky.
> > http://git.kernel.org/?p=virt/kvm/kvm.git;a=commit;h=e2971ac7e1d186af059e088d305496c5cb47d487
> 
> Your commit does make things better, I don't see any stalls on the
> first resume.
> 
> However, a subsequent s4 causes the stall to re-appear on resume, and
> this time there are no stall messages; the kernel just sits there
> spinning on something.  I've not found the solution to this one yet
> (I
> had a commit similar to Marcelo's in the works, which got me to the
> previous works-but-stalls behaviour).
> 
> > However there is/are a warning/s on suspend path and with following
> > patch:
> 
> I didn't see this.
> 
>   Amit
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: x86: kvmclock: abstract save/restore sched_clock_state

2012-02-10 Thread Amit Shah
On (Thu) 09 Feb 2012 [16:13:29], Igor Mammedov wrote:

> Stalls are probably caused by uninitialized percpu hv_clock, with
> following patch I don't see stalls. Although I might be just lucky.
> http://git.kernel.org/?p=virt/kvm/kvm.git;a=commit;h=e2971ac7e1d186af059e088d305496c5cb47d487

Your commit does make things better, I don't see any stalls on the
first resume.

However, a subsequent s4 causes the stall to re-appear on resume, and
this time there are no stall messages; the kernel just sits there
spinning on something.  I've not found the solution to this one yet (I
had a commit similar to Marcelo's in the works, which got me to the
previous works-but-stalls behaviour).

> However there is/are a warning/s on suspend path and with following patch:

I didn't see this.

Amit
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: x86: kvmclock: abstract save/restore sched_clock_state

2012-02-09 Thread Igor Mammedov

On 02/09/2012 01:27 PM, Amit Shah wrote:

On (Tue) 07 Feb 2012 [19:05:42], Marcelo Tosatti wrote:


Upon resume from hibernation, CPU 0's hvclock area contains the old
values for system_time and tsc_timestamp. It is necessary for the
hypervisor to update these values with uptodate ones before the CPU uses
them.

Abstract TSC's save/restore sched_clock_state functions and use
restore_state to write to KVM_SYSTEM_TIME MSR, forcing an update.

Fixes suspend-to-disk with kvmclock.


There are stalls after resume, see trace below.

./x86_64-softmmu/qemu-system-x86_64 -kernel ~/src/linux/arch/x86/boot/bzImage  
-append 'root=/dev/vda1 console=tty0 console=ttyS0 no_console_suspend' -drive 
file=/guests/f14-suspend.qcow2,if=none,id=dr0 -device virtio-blk-pci,drive=dr0 
-net nic,model=virtio -net user  -serial stdio  -enable-kvm -m 512  -cpu host 
-smp 4



Disabling non-boot CPUs ...
CPU 1 is now offline
CPU 2 is now offline
CPU 3 is now offline
lockdep: fixing up alternatives.
kvm-clock: cpu 0, msr 0:1f1d19c1, primary cpu clock, resume
PM: Restoring platform NVS memory
Enabling non-boot CPUs ...
lockdep: fixing up alternatives.
Booting Node 0 Processor 1 APIC 0x1
Calibrating delay loop (skipped) already calibrated this CPU
INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 0, t=314192 
jiffies)
Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
Call Trace:
 [] __rcu_pending+0x268/0x3c5
  [] rcu_check_callbacks+0xab/0x108
  [] update_process_times+0x3f/0x75
  [] tick_sched_timer+0x6d/0x8c
  [] __run_hrtimer+0xc9/0x15c
  [] ? tick_nohz_handler+0xd5/0xd5
  [] hrtimer_interrupt+0xea/0x1b4
  [] smp_apic_timer_interrupt+0x76/0x89
  [] apic_timer_interrupt+0x73/0x80
 [] ? arch_local_irq_restore+0x6/0xd
  [] native_cpu_up+0x108/0x127
  [] _cpu_up+0x92/0xfc
  [] enable_nonboot_cpus+0x4d/0xb2
  [] hibernation_snapshot+0x1f3/0x2a0
  [] ? cleanup_srcu_struct+0x52/0x58
  [] hibernate+0x97/0x196
  [] state_store+0x5c/0x106
  [] kobj_attr_store+0x17/0x19
  [] sysfs_write_file+0x10e/0x14a
  [] vfs_write+0xab/0xd2
  [] ? fget_light+0x3a/0xa1
  [] sys_write+0x4d/0x74
  [] system_call_fastpath+0x16/0x1b
INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=314193 
jiffies)
Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
Call Trace:
 [] __rcu_pending+0x268/0x3c5
  [] rcu_check_callbacks+0xe9/0x108
  [] update_process_times+0x3f/0x75
  [] tick_sched_timer+0x6d/0x8c
  [] __run_hrtimer+0xc9/0x15c
  [] ? tick_nohz_handler+0xd5/0xd5
  [] hrtimer_interrupt+0xea/0x1b4
  [] smp_apic_timer_interrupt+0x76/0x89
  [] apic_timer_interrupt+0x73/0x80
 [] ? arch_local_irq_restore+0x6/0xd
  [] native_cpu_up+0x108/0x127
  [] _cpu_up+0x92/0xfc
  [] enable_nonboot_cpus+0x4d/0xb2
  [] hibernation_snapshot+0x1f3/0x2a0
  [] ? cleanup_srcu_struct+0x52/0x58
  [] hibernate+0x97/0x196
  [] state_store+0x5c/0x106
  [] kobj_attr_store+0x17/0x19
  [] sysfs_write_file+0x10e/0x14a
  [] vfs_write+0xab/0xd2
  [] ? fget_light+0x3a/0xa1
  [] sys_write+0x4d/0x74
  [] system_call_fastpath+0x16/0x1b
INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 0, 
t=1713579840232 jiffies)
Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
Call Trace:
 [] __rcu_pending+0x268/0x3c5
  [] rcu_check_callbacks+0xab/0x108
  [] update_process_times+0x3f/0x75
  [] tick_sched_timer+0x6d/0x8c
  [] __run_hrtimer+0xc9/0x15c
  [] ? tick_nohz_handler+0xd5/0xd5
  [] hrtimer_interrupt+0xea/0x1b4
  [] smp_apic_timer_interrupt+0x76/0x89
  [] apic_timer_interrupt+0x73/0x80
 [] ? arch_local_irq_restore+0x6/0xd
  [] native_cpu_up+0x108/0x127
  [] _cpu_up+0x92/0xfc
  [] enable_nonboot_cpus+0x4d/0xb2
  [] hibernation_snapshot+0x1f3/0x2a0
  [] ? cleanup_srcu_struct+0x52/0x58
  [] hibernate+0x97/0x196
  [] state_store+0x5c/0x106
  [] kobj_attr_store+0x17/0x19
  [] sysfs_write_file+0x10e/0x14a
  [] vfs_write+0xab/0xd2
  [] ? fget_light+0x3a/0xa1
  [] sys_write+0x4d/0x74
  [] system_call_fastpath+0x16/0x1b
INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, 
t=1713579840233 jiffies)
Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
Call Trace:
 [] __rcu_pending+0x268/0x3c5
  [] rcu_check_callbacks+0xe9/0x108
  [] update_process_times+0x3f/0x75
  [] tick_sched_timer+0x6d/0x8c
  [] __run_hrtimer+0xc9/0x15c
  [] ? tick_nohz_handler+0xd5/0xd5
  [] hrtimer_interrupt+0xea/0x1b4
  [] smp_apic_timer_interrupt+0x76/0x89
  [] apic_timer_interrupt+0x73/0x80
 [] ? arch_local_irq_restore+0x6/0xd
  [] native_cpu_up+0x108/0x127
  [] _cpu_up+0x92/0xfc
  [] enable_nonboot_cpus+0x4d/0xb2
  [] hibernation_snapshot+0x1f3/0x2a0
  [] ? cleanup_srcu_struct+0x52/0x58
  [] hibernate+0x97/0x196
  [] state_store+0x5c/0x106
  [] kobj_attr_store+0x17/0x19
  [] sysfs_write_file+0x10e/0x14a
  [] vfs_write+0xab/0xd2
  [] ? fget_light+0x3a/0xa1
  [] sys_write+0x4d/0x74
  [] system_call_fastpath+0x16/0x1b


Amit


Stalls are probably caused by uninitialized percpu hv_clock, with
following patch I don't see stalls. Although I might be j

Re: x86: kvmclock: abstract save/restore sched_clock_state

2012-02-09 Thread Amit Shah
On (Tue) 07 Feb 2012 [19:05:42], Marcelo Tosatti wrote:
> 
> Upon resume from hibernation, CPU 0's hvclock area contains the old
> values for system_time and tsc_timestamp. It is necessary for the
> hypervisor to update these values with uptodate ones before the CPU uses
> them.
> 
> Abstract TSC's save/restore sched_clock_state functions and use
> restore_state to write to KVM_SYSTEM_TIME MSR, forcing an update.
> 
> Fixes suspend-to-disk with kvmclock.

There are stalls after resume, see trace below.

./x86_64-softmmu/qemu-system-x86_64 -kernel ~/src/linux/arch/x86/boot/bzImage  
-append 'root=/dev/vda1 console=tty0 console=ttyS0 no_console_suspend' -drive 
file=/guests/f14-suspend.qcow2,if=none,id=dr0 -device virtio-blk-pci,drive=dr0 
-net nic,model=virtio -net user  -serial stdio  -enable-kvm -m 512  -cpu host 
-smp 4



Disabling non-boot CPUs ...
CPU 1 is now offline
CPU 2 is now offline
CPU 3 is now offline
lockdep: fixing up alternatives.
kvm-clock: cpu 0, msr 0:1f1d19c1, primary cpu clock, resume
PM: Restoring platform NVS memory
Enabling non-boot CPUs ...
lockdep: fixing up alternatives.
Booting Node 0 Processor 1 APIC 0x1
Calibrating delay loop (skipped) already calibrated this CPU
INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 0, t=314192 
jiffies)
Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
Call Trace:
   [] __rcu_pending+0x268/0x3c5
 [] rcu_check_callbacks+0xab/0x108
 [] update_process_times+0x3f/0x75
 [] tick_sched_timer+0x6d/0x8c
 [] __run_hrtimer+0xc9/0x15c
 [] ? tick_nohz_handler+0xd5/0xd5
 [] hrtimer_interrupt+0xea/0x1b4
 [] smp_apic_timer_interrupt+0x76/0x89
 [] apic_timer_interrupt+0x73/0x80
   [] ? arch_local_irq_restore+0x6/0xd
 [] native_cpu_up+0x108/0x127
 [] _cpu_up+0x92/0xfc
 [] enable_nonboot_cpus+0x4d/0xb2
 [] hibernation_snapshot+0x1f3/0x2a0
 [] ? cleanup_srcu_struct+0x52/0x58
 [] hibernate+0x97/0x196
 [] state_store+0x5c/0x106
 [] kobj_attr_store+0x17/0x19
 [] sysfs_write_file+0x10e/0x14a
 [] vfs_write+0xab/0xd2
 [] ? fget_light+0x3a/0xa1
 [] sys_write+0x4d/0x74
 [] system_call_fastpath+0x16/0x1b
INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=314193 
jiffies)
Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
Call Trace:
   [] __rcu_pending+0x268/0x3c5
 [] rcu_check_callbacks+0xe9/0x108
 [] update_process_times+0x3f/0x75
 [] tick_sched_timer+0x6d/0x8c
 [] __run_hrtimer+0xc9/0x15c
 [] ? tick_nohz_handler+0xd5/0xd5
 [] hrtimer_interrupt+0xea/0x1b4
 [] smp_apic_timer_interrupt+0x76/0x89
 [] apic_timer_interrupt+0x73/0x80
   [] ? arch_local_irq_restore+0x6/0xd
 [] native_cpu_up+0x108/0x127
 [] _cpu_up+0x92/0xfc
 [] enable_nonboot_cpus+0x4d/0xb2
 [] hibernation_snapshot+0x1f3/0x2a0
 [] ? cleanup_srcu_struct+0x52/0x58
 [] hibernate+0x97/0x196
 [] state_store+0x5c/0x106
 [] kobj_attr_store+0x17/0x19
 [] sysfs_write_file+0x10e/0x14a
 [] vfs_write+0xab/0xd2
 [] ? fget_light+0x3a/0xa1
 [] sys_write+0x4d/0x74
 [] system_call_fastpath+0x16/0x1b
INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 0, 
t=1713579840232 jiffies)
Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
Call Trace:
   [] __rcu_pending+0x268/0x3c5
 [] rcu_check_callbacks+0xab/0x108
 [] update_process_times+0x3f/0x75
 [] tick_sched_timer+0x6d/0x8c
 [] __run_hrtimer+0xc9/0x15c
 [] ? tick_nohz_handler+0xd5/0xd5
 [] hrtimer_interrupt+0xea/0x1b4
 [] smp_apic_timer_interrupt+0x76/0x89
 [] apic_timer_interrupt+0x73/0x80
   [] ? arch_local_irq_restore+0x6/0xd
 [] native_cpu_up+0x108/0x127
 [] _cpu_up+0x92/0xfc
 [] enable_nonboot_cpus+0x4d/0xb2
 [] hibernation_snapshot+0x1f3/0x2a0
 [] ? cleanup_srcu_struct+0x52/0x58
 [] hibernate+0x97/0x196
 [] state_store+0x5c/0x106
 [] kobj_attr_store+0x17/0x19
 [] sysfs_write_file+0x10e/0x14a
 [] vfs_write+0xab/0xd2
 [] ? fget_light+0x3a/0xa1
 [] sys_write+0x4d/0x74
 [] system_call_fastpath+0x16/0x1b
INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, 
t=1713579840233 jiffies)
Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
Call Trace:
   [] __rcu_pending+0x268/0x3c5
 [] rcu_check_callbacks+0xe9/0x108
 [] update_process_times+0x3f/0x75
 [] tick_sched_timer+0x6d/0x8c
 [] __run_hrtimer+0xc9/0x15c
 [] ? tick_nohz_handler+0xd5/0xd5
 [] hrtimer_interrupt+0xea/0x1b4
 [] smp_apic_timer_interrupt+0x76/0x89
 [] apic_timer_interrupt+0x73/0x80
   [] ? arch_local_irq_restore+0x6/0xd
 [] native_cpu_up+0x108/0x127
 [] _cpu_up+0x92/0xfc
 [] enable_nonboot_cpus+0x4d/0xb2
 [] hibernation_snapshot+0x1f3/0x2a0
 [] ? cleanup_srcu_struct+0x52/0x58
 [] hibernate+0x97/0x196
 [] state_store+0x5c/0x106
 [] kobj_attr_store+0x17/0x19
 [] sysfs_write_file+0x10e/0x14a
 [] vfs_write+0xab/0xd2
 [] ? fget_light+0x3a/0xa1
 [] sys_write+0x4d/0x74
 [] system_call_fastpath+0x16/0x1b


Amit
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: x86: kvmclock: abstract save/restore sched_clock_state

2012-02-08 Thread Igor Mammedov

On 02/07/2012 10:05 PM, Marcelo Tosatti wrote:


Upon resume from hibernation, CPU 0's hvclock area contains the old
values for system_time and tsc_timestamp. It is necessary for the
hypervisor to update these values with uptodate ones before the CPU uses
them.

Abstract TSC's save/restore sched_clock_state functions and use
restore_state to write to KVM_SYSTEM_TIME MSR, forcing an update.

Fixes suspend-to-disk with kvmclock.

Signed-off-by: Marcelo Tosatti



Acked-by: Igor Mammedov 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html