Re: [PATCH 0/2] Quieten softlockup detector on virtualised kernels

2015-01-09 Thread Don Zickus
On Fri, Jan 09, 2015 at 02:15:00PM +1100, Cyril Bur wrote:
> > 
> > I am not too familiar with it, but the kernel/watchdog.c code has calls to
> > kvm_check_and_clear_guest_paused(), which is probably a good place to
> > start.
> > 
> Ah yes that, I did initially have a look at what it does when I
> undertook to solve the problem on power and I suppose the two solutions
> are similar in that they both just use a virtualised time source. The
> similarities stop there though, the paravirtualised clock that x86 uses
> provides (as the name of the function implies) a 'was paused' flag.
> Obviously the flag isn't something the vtb register on power8 can
> provide and since we have a vtb, it's preferable to use that.
> Perhaps x86 can do something with running_clock?

Marcelo?  Drew?

Cheers,
Don

> 
> Regards,
> 
> Cyril
> 
> > Cheers,
> > Don
> > 
> > > 
> > > > Not sure if that is useful or could be incorporated into the power8 code.
> > > > Though to be honest I am curious if the steal_time code could be ported to
> > > > your solution as it seems the watchdog code could remove all the
> > > > steal_time warts.
> > > Happy to help sus out the situation here, again, if you could pass on
> > > what the x86 guys are working on, thanks.
> > > 
> > > 
> > > Thanks,
> > > 
> > > Cyril
> > > > I have cc'd Marcelo into this discussion as he was the last person I
> > > > remember talking with about this problem.
> > > > 
> > > > Cheers,
> > > > Don
> > > 
> > > 
> 
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] Quieten softlockup detector on virtualised kernels

2015-01-08 Thread Cyril Bur

On Tue, 2015-01-06 at 10:01 -0500, Don Zickus wrote:
> On Tue, Jan 06, 2015 at 10:53:35AM +1100, Cyril Bur wrote:
> > On Mon, 2015-01-05 at 11:50 -0500, Don Zickus wrote:
> > > cc'ing Marcelo
> > > 
> > > On Mon, Dec 22, 2014 at 04:06:02PM +1100, Cyril Bur wrote:
> > > > When the hypervisor pauses a virtualised kernel the kernel will observe a jump
> > > > in timebase, this can cause spurious messages from the softlockup detector.
> > > >
> > > > Whilst these messages are harmless, they are accompanied with a stack trace
> > > > which causes undue concern and more problematically the stack trace in the
> > > > guest has nothing to do with the observed problem and can only be misleading.
> > > >
> > > > Furthermore, on POWER8 this is completely avoidable with the introduction of
> > > > the Virtual Time Base (VTB) register.
> > > 
> > > Hi Cyril,
> > > 
> > > Your solution seems simple and doesn't disturb the softlockup code as much
> > > as the x86 solution does.  The only small issue I had was the use of
> > > sched_clock instead of local_clock.  I keep forgetting the difference
> > > (unstable clock is the biggest reason I think).
> > My apologies there it appears I stuffed up, local_clock was used
> > initially in the softlockup code, I'll send a v2.
> 
> Thanks!
> 
> > 
> > > Other than that, I am not the biggest fan of putting multiple virtual
> > > guest solutions for the same problem into the watchdog code.  I would
> > > prefer a common solution/framework to leverage.
> > Agreed.
> > 
> > > I have the x86 folks focusing on the steal_time stuff.  It started with
> > > KVM and I believe VMWare is working on utilizing it too (and maybe Xen).
> > I'm not sure I've ever seen this, could you please point me towards
> > something I can look at?
> 
> I am not too familiar with it, but the kernel/watchdog.c code has calls to
> kvm_check_and_clear_guest_paused(), which is probably a good place to
> start.
> 
Ah yes that, I did initially have a look at what it does when I
undertook to solve the problem on power and I suppose the two solutions
are similar in that they both just use a virtualised time source. The
similarities stop there though, the paravirtualised clock that x86 uses
provides (as the name of the function implies) a 'was paused' flag.
Obviously the flag isn't something the vtb register on power8 can
provide and since we have a vtb, it's preferable to use that.
Perhaps x86 can do something with running_clock?
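
To make the contrast between the two approaches concrete, here is a
minimal userspace sketch in C. It is a model, not kernel code: struct
pv_clock_model, check_and_clear_paused(), lockup_with_flag() and
lockup_with_running_clock() are hypothetical names invented for
illustration, and all times are plain seconds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a flag the hypervisor sets when it pauses the guest. */
struct pv_clock_model {
	bool was_paused;
};

/* Report whether the guest was paused since the last call, then clear the flag. */
static bool check_and_clear_paused(struct pv_clock_model *pv)
{
	bool paused = pv->was_paused;

	pv->was_paused = false;
	return paused;
}

/*
 * Flag-based check: the clock reading includes the paused time, so the
 * reading is discarded whenever the flag says a pause happened.
 */
static bool lockup_with_flag(struct pv_clock_model *pv, uint64_t now,
			     uint64_t last_touch, uint64_t thresh)
{
	if (check_and_clear_paused(pv))
		return false;
	return now - last_touch > thresh;
}

/*
 * Clock-based check: `running_now` is assumed to come from a clock that
 * does not advance while the guest is paused, so no flag is needed.
 */
static bool lockup_with_running_clock(uint64_t running_now,
				      uint64_t last_touch, uint64_t thresh)
{
	return running_now - last_touch > thresh;
}
```

In the flag-based variant the watchdog still sees a clock that jumped
and relies on the flag to ignore that one reading; in the clock-based
variant the jump never shows up in the reading at all.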

Regards,

Cyril

> Cheers,
> Don
> 
> > 
> > > Not sure if that is useful or could be incorporated into the power8 code.
> > > Though to be honest I am curious if the steal_time code could be ported to
> > > your solution as it seems the watchdog code could remove all the
> > > steal_time warts.
> > Happy to help sus out the situation here, again, if you could pass on
> > what the x86 guys are working on, thanks.
> > 
> > 
> > Thanks,
> > 
> > Cyril
> > > I have cc'd Marcelo into this discussion as he was the last person I
> > > remember talking with about this problem.
> > > 
> > > Cheers,
> > > Don
> > 
> > 





Re: [PATCH 0/2] Quieten softlockup detector on virtualised kernels

2015-01-06 Thread Don Zickus
On Tue, Jan 06, 2015 at 10:53:35AM +1100, Cyril Bur wrote:
> On Mon, 2015-01-05 at 11:50 -0500, Don Zickus wrote:
> > cc'ing Marcelo
> > 
> > On Mon, Dec 22, 2014 at 04:06:02PM +1100, Cyril Bur wrote:
> > > When the hypervisor pauses a virtualised kernel the kernel will observe a jump
> > > in timebase, this can cause spurious messages from the softlockup detector.
> > >
> > > Whilst these messages are harmless, they are accompanied with a stack trace
> > > which causes undue concern and more problematically the stack trace in the
> > > guest has nothing to do with the observed problem and can only be misleading.
> > >
> > > Furthermore, on POWER8 this is completely avoidable with the introduction of
> > > the Virtual Time Base (VTB) register.
> > 
> > Hi Cyril,
> > 
> > Your solution seems simple and doesn't disturb the softlockup code as much
> > as the x86 solution does.  The only small issue I had was the use of
> > sched_clock instead of local_clock.  I keep forgetting the difference
> > (unstable clock is the biggest reason I think).
> My apologies there it appears I stuffed up, local_clock was used
> initially in the softlockup code, I'll send a v2.

Thanks!

> 
> > Other than that, I am not the biggest fan of putting multiple virtual
> > guest solutions for the same problem into the watchdog code.  I would
> > prefer a common solution/framework to leverage.
> Agreed.
> 
> > I have the x86 folks focusing on the steal_time stuff.  It started with
> > KVM and I believe VMWare is working on utilizing it too (and maybe Xen).
> I'm not sure I've ever seen this, could you please point me towards
> something I can look at?

I am not too familiar with it, but the kernel/watchdog.c code has calls to
kvm_check_and_clear_guest_paused(), which is probably a good place to
start.

Cheers,
Don

> 
> > Not sure if that is useful or could be incorporated into the power8 code.
> > Though to be honest I am curious if the steal_time code could be ported to
> > your solution as it seems the watchdog code could remove all the
> > steal_time warts.
> Happy to help sus out the situation here, again, if you could pass on
> what the x86 guys are working on, thanks.
> 
> 
> Thanks,
> 
> Cyril
> > I have cc'd Marcelo into this discussion as he was the last person I
> > remember talking with about this problem.
> > 
> > Cheers,
> > Don
> 
> 


Re: [PATCH 0/2] Quieten softlockup detector on virtualised kernels

2015-01-05 Thread Cyril Bur
On Mon, 2015-01-05 at 14:09 -0800, Andrew Morton wrote:
> On Mon, 22 Dec 2014 16:06:02 +1100 Cyril Bur wrote:
> 
> > When the hypervisor pauses a virtualised kernel the kernel will observe a jump
> > in timebase, this can cause spurious messages from the softlockup detector.
> >
> > Whilst these messages are harmless, they are accompanied with a stack trace
> > which causes undue concern and more problematically the stack trace in the
> > guest has nothing to do with the observed problem and can only be misleading.
> >
> > Furthermore, on POWER8 this is completely avoidable with the introduction of
> > the Virtual Time Base (VTB) register.
> 
> Does this problem apply to other KVM implementations and to Xen?  If
> so, what would implementations of running_clock() for those look like? 
> If not, why not?
Yes, the problem should appear on other KVM implementations. I'm not
really sure about Xen, but I don't see why the problem wouldn't crop up
there.

x86 do have a method for dealing with it in the softlockup detector:
they've added a check in the softlockup code using a paravirtualised
clock, through which the guest can discover whether it had been paused.
Xen could be using it too.
It doesn't appear that s390 do anything.

Thanks,

Cyril
> 
> 




Re: [PATCH 0/2] Quieten softlockup detector on virtualised kernels

2015-01-05 Thread Cyril Bur
On Mon, 2015-01-05 at 11:50 -0500, Don Zickus wrote:
> cc'ing Marcelo
> 
> On Mon, Dec 22, 2014 at 04:06:02PM +1100, Cyril Bur wrote:
> > When the hypervisor pauses a virtualised kernel the kernel will observe a jump
> > in timebase, this can cause spurious messages from the softlockup detector.
> >
> > Whilst these messages are harmless, they are accompanied with a stack trace
> > which causes undue concern and more problematically the stack trace in the
> > guest has nothing to do with the observed problem and can only be misleading.
> >
> > Furthermore, on POWER8 this is completely avoidable with the introduction of
> > the Virtual Time Base (VTB) register.
> 
> Hi Cyril,
> 
> Your solution seems simple and doesn't disturb the softlockup code as much
> as the x86 solution does.  The only small issue I had was the use of
> sched_clock instead of local_clock.  I keep forgetting the difference
> (unstable clock is the biggest reason I think).
My apologies there it appears I stuffed up, local_clock was used
initially in the softlockup code, I'll send a v2.

> Other than that, I am not the biggest fan of putting multiple virtual
> guest solutions for the same problem into the watchdog code.  I would
> prefer a common solution/framework to leverage.
Agreed.

> I have the x86 folks focusing on the steal_time stuff.  It started with
> KVM and I believe VMWare is working on utilizing it too (and maybe Xen).
I'm not sure I've ever seen this, could you please point me towards
something I can look at?

> Not sure if that is useful or could be incorporated into the power8 code.
> Though to be honest I am curious if the steal_time code could be ported to
> your solution as it seems the watchdog code could remove all the
> steal_time warts.
Happy to help sus out the situation here, again, if you could pass on
what the x86 guys are working on, thanks.


Thanks,

Cyril
> I have cc'd Marcelo into this discussion as he was the last person I
> remember talking with about this problem.
> 
> Cheers,
> Don




Re: [PATCH 0/2] Quieten softlockup detector on virtualised kernels

2015-01-05 Thread Andrew Morton
On Mon, 22 Dec 2014 16:06:02 +1100 Cyril Bur wrote:

> When the hypervisor pauses a virtualised kernel the kernel will observe a jump
> in timebase, this can cause spurious messages from the softlockup detector.
> 
> Whilst these messages are harmless, they are accompanied with a stack trace
> which causes undue concern and more problematically the stack trace in the
> guest has nothing to do with the observed problem and can only be misleading.
> 
> Furthermore, on POWER8 this is completely avoidable with the introduction of
> the Virtual Time Base (VTB) register.

Does this problem apply to other KVM implementations and to Xen?  If
so, what would implementations of running_clock() for those look like? 
If not, why not?




Re: [PATCH 0/2] Quieten softlockup detector on virtualised kernels

2015-01-05 Thread Don Zickus
cc'ing Marcelo

On Mon, Dec 22, 2014 at 04:06:02PM +1100, Cyril Bur wrote:
> When the hypervisor pauses a virtualised kernel the kernel will observe a jump
> in timebase, this can cause spurious messages from the softlockup detector.
> 
> Whilst these messages are harmless, they are accompanied with a stack trace
> which causes undue concern and more problematically the stack trace in the
> guest has nothing to do with the observed problem and can only be misleading.
> 
> Furthermore, on POWER8 this is completely avoidable with the introduction of
> the Virtual Time Base (VTB) register.

Hi Cyril,

Your solution seems simple and doesn't disturb the softlockup code as much
as the x86 solution does.  The only small issue I had was the use of
sched_clock instead of local_clock.  I keep forgetting the difference
(unstable clock is the biggest reason I think).

Other than that, I am not the biggest fan of putting multiple virtual
guest solutions for the same problem into the watchdog code.  I would
prefer a common solution/framework to leverage.

I have the x86 folks focusing on the steal_time stuff.  It started with
KVM and I believe VMWare is working on utilizing it too (and maybe Xen).

Not sure if that is useful or could be incorporated into the power8 code.
Though to be honest I am curious if the steal_time code could be ported to
your solution as it seems the watchdog code could remove all the
steal_time warts.
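
One possible reading of the steal_time idea, sketched in C: derive the
time the guest actually ran by subtracting accumulated steal time from
elapsed time. This assumes the hypervisor's steal counter also covers
time the guest spent paused, which nothing in this thread confirms.
running_from_steal() and its parameters are hypothetical names for
illustration, not an existing kernel interface.

```c
#include <stdint.h>

/*
 * Hypothetical helper: time the guest actually ran, given the elapsed
 * time and the time the hypervisor reports as stolen (both in
 * nanoseconds, measured from the same starting point).
 */
static uint64_t running_from_steal(uint64_t elapsed_ns, uint64_t stolen_ns)
{
	/* Clamp so inconsistent samples never wrap around to a huge value. */
	if (stolen_ns > elapsed_ns)
		return 0;
	return elapsed_ns - stolen_ns;
}
```

The clamp matters because the two counters would be sampled at slightly
different moments, so the stolen value can briefly exceed the elapsed
one.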

I have cc'd Marcelo into this discussion as he was the last person I
remember talking with about this problem.

Cheers,
Don


[PATCH 0/2] Quieten softlockup detector on virtualised kernels

2014-12-21 Thread Cyril Bur
When the hypervisor pauses a virtualised kernel the kernel will observe a jump
in timebase, this can cause spurious messages from the softlockup detector.

Whilst these messages are harmless, they are accompanied with a stack trace
which causes undue concern and more problematically the stack trace in the
guest has nothing to do with the observed problem and can only be misleading.

Furthermore, on POWER8 this is completely avoidable with the introduction of
the Virtual Time Base (VTB) register.
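
The effect described above can be modelled in a few lines of userspace
C. This is only a sketch, not the code from the patches: struct
guest_clocks, guest_run(), guest_pause() and is_softlockup() are
hypothetical names, and times are plain seconds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two clocks a guest could consult, in seconds. */
struct guest_clocks {
	uint64_t timebase; /* keeps counting while the guest is paused */
	uint64_t running;  /* counts only while the guest executes */
};

/* The guest executes for `secs` seconds: both clocks advance. */
static void guest_run(struct guest_clocks *c, uint64_t secs)
{
	c->timebase += secs;
	c->running += secs;
}

/* The hypervisor pauses the guest for `secs` seconds: only the timebase moves. */
static void guest_pause(struct guest_clocks *c, uint64_t secs)
{
	c->timebase += secs;
}

/* Report a lockup if more than `thresh` seconds passed since `last_touch`. */
static bool is_softlockup(uint64_t now, uint64_t last_touch, uint64_t thresh)
{
	return now - last_touch > thresh;
}
```

Starting from zero, running for 2 seconds, pausing for 60 and running
for 1 more leaves timebase at 63 and running at 3. Against a threshold
of 20 (an arbitrary value chosen for this sketch) only the timebase
reading looks like a lockup, even though the guest did nothing wrong.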

Cyril Bur (2):
  Add another clock for use with the soft lockup watchdog.
  powerpc: add running_clock for powerpc to prevent spurious softlockup
    warnings

 arch/powerpc/kernel/time.c | 24 
 include/linux/sched.h  |  1 +
 kernel/sched/clock.c   | 14 ++
 kernel/watchdog.c  |  2 +-
 4 files changed, 40 insertions(+), 1 deletion(-)

-- 
1.9.1



[PATCH 0/2] Quieten softlockup detector on virtualised kernels

2014-11-30 Thread Cyril Bur
When the hypervisor pauses a virtualised kernel the kernel will observe a jump
in timebase, this can cause spurious messages from the softlockup detector.

Whilst these messages are harmless, they are accompanied with a stack trace
which causes undue concern and more problematically the stack trace in the
guest has nothing to do with the observed problem and can only be misleading.

Furthermore, on POWER8 this is completely avoidable with the introduction of
the Virtual Time Base (VTB) register.

Cyril Bur (2):
  Add another clock for use with the soft lockup watchdog.
  powerpc: add running_clock for powerpc to prevent spurious softlockup
    warnings

 arch/powerpc/kernel/time.c | 24 
 include/linux/sched.h  |  1 +
 kernel/sched/clock.c   | 14 ++
 kernel/watchdog.c  |  2 +-
 4 files changed, 40 insertions(+), 1 deletion(-)

-- 
1.9.1


