Re: [Xen-devel] [patch 14/33] xen: xen time implementation

2007-06-06 Thread Andi Kleen
On Wednesday 06 June 2007 12:05:22 Jeremy Fitzhardinge wrote:
 Jan Beulich wrote:
  Xen itself knows to deal with this (by using an error correction factor to
  slow down the local [TSC-based] clock), but for the kernel such a situation
  may be fatal: If clocksource->cycle_last was most recently set on a CPU
  with shadow->tsc_to_nsec_mul sufficiently different from that where
  getnstimeofday() is being used, timekeeping.c's __get_nsec_offset() will
  calculate a huge nanosecond value (due to cyc2ns() doing unsigned
  operations), worth about 4000s. This value may then be used to set a
  timeout that was intended to be a few milliseconds, effectively yielding
  a hung app (and perhaps system).
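
To make the failure mode above concrete, here is a minimal sketch of how an unsigned cycle delta can turn a slightly "negative" interval into a huge one. The names demo_cyc2ns() and demo_nsec_offset() are hypothetical, and the scaling (mult = 1 << 22, shift = 22) is an assumption not stated in the thread; it is chosen because 2^42 ns is roughly 4398 s, which is consistent with the "about 4000s" figure.

```c
#include <stdint.h>

/* Assumed scaling for illustration only: ns = (cycles * mult) >> shift. */
#define DEMO_SHIFT 22
#define DEMO_MULT  ((uint64_t)1 << DEMO_SHIFT)

/* Unsigned cycles-to-nanoseconds conversion, modelled loosely on cyc2ns(). */
static uint64_t demo_cyc2ns(uint64_t cycles)
{
    return (cycles * DEMO_MULT) >> DEMO_SHIFT;
}

/* Offset since cycle_last, using unsigned arithmetic throughout. */
static uint64_t demo_nsec_offset(uint64_t cycle_now, uint64_t cycle_last)
{
    /* Wraps modulo 2^64 when cycle_now is behind cycle_last. */
    uint64_t cycle_delta = cycle_now - cycle_last;

    return demo_cyc2ns(cycle_delta);
}
```

With these assumed values, demo_nsec_offset(1010, 1000) gives 10 ns, while demo_nsec_offset(990, 1000), i.e. a reading just 10 units behind cycle_last, gives 2^42 - 10 ns, or roughly 4398 seconds.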

 
 Hm.  I had a similar situation in the stolen time code, and I ended up
 using signed values so I could clamp at zero.  Though that might have
 been another bug; either way, the clamp is still there.
 
 I wonder if cyc2ns might not be better using signed operations?  Or
 perhaps better, the time code should endeavour to do things on a
 completely per-cpu basis (haven't really given this any thought).
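
The signed-and-clamp idea mentioned above could look roughly like the following. This is only a sketch: demo_nsec_offset_clamped() is a hypothetical name, the mult/shift values are assumed for illustration, and it is not the actual stolen-time code or a proposed cyc2ns() change.

```c
#include <stdint.h>

/* Assumed scaling for illustration only: ns = (cycles * mult) >> shift. */
#define DEMO_SHIFT 22
#define DEMO_MULT  ((uint64_t)1 << DEMO_SHIFT)

/* Offset since cycle_last, treating a backwards step as "no time passed". */
static int64_t demo_nsec_offset_clamped(uint64_t cycle_now, uint64_t cycle_last)
{
    /*
     * Reinterpret the wrapped unsigned difference as a signed value.
     * This relies on two's complement conversion, which common compilers
     * provide, and on real deltas staying below 2^63 cycles.
     */
    int64_t cycle_delta = (int64_t)(cycle_now - cycle_last);

    if (cycle_delta < 0)
        return 0;   /* clamp at zero instead of wrapping */

    return (int64_t)(((uint64_t)cycle_delta * DEMO_MULT) >> DEMO_SHIFT);
}
```

A reading 10 units behind cycle_last now yields 0 rather than a wrapped value. The cost is that time appears to stand still on the lagging CPU until it catches up.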

This is being worked on.


  Unfortunately so far I haven't been able to think of a reasonable solution
  to this - a simplistic approach like making xen_clocksource_read() check
  the value it is about to return against the last value it returned doesn't
  seem to be a good idea (time might appear to have stopped over some
  period of time otherwise), nor does attempting to adjust the shadowed
  tsc_to_nsec_mul values (because the kernel can't know whether it should
  boost the lagging CPU or throttle the rushing one).
 
 I once had some code in there to do that, implemented in a very boneheaded
 way with a spinlock to protect the last-time-returned variable.  I
 expect there's a better way to implement it.

But any per-CPU setup likely needs this to avoid non-monotonicity.
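
For illustration, clamping to the last value returned can be done without a spinlock by using a compare-and-swap loop. This is a user-space sketch with C11 atomics, not the code referred to above; demo_last_ns and demo_monotonic_clamp() are hypothetical names, and kernel code would use its own atomic primitives.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Last timestamp handed out by any CPU (hypothetical shared variable). */
static _Atomic uint64_t demo_last_ns;

/* Return raw_ns, or the last value returned if raw_ns would go backwards. */
static uint64_t demo_monotonic_clamp(uint64_t raw_ns)
{
    uint64_t last = atomic_load(&demo_last_ns);

    while (raw_ns > last) {
        /* Try to publish raw_ns; on failure, 'last' is refreshed. */
        if (atomic_compare_exchange_weak(&demo_last_ns, &last, raw_ns))
            return raw_ns;
    }

    /* Someone already returned a value at or after raw_ns: repeat it. */
    return last;
}
```

Calling this with 100, then 90, then 150 returns 100, 100 and 150. The trade-offs are the ones raised in the thread: every reader touches one shared variable, and time appears to stop for as long as the local clock lags the latest value returned.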

-Andi
 
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [Xen-devel] [patch 14/33] xen: xen time implementation

2007-06-06 Thread Keir Fraser



On 6/6/07 12:00, Jan Beulich [EMAIL PROTECTED] wrote:

 If the error across CPUs is +/- just a few microseconds at worst then having
 the clocksource clamp to no less than the last timestamp returned seems a
 reasonable fix. Time won't 'stop' for longer than the cross-CPU error, and
 that should always be a tiny value.
 
 Are you sure this is also true when e.g. a CPU gets throttled due to thermal
 conditions? It is my understanding that both the duty cycle adjustment and
 the frequency reduction would yield a reduced rate TSC, which would be
 accounted for only the next time the local clock gets calibrated. Otherwise,
 immediate calibration (and vcpu update) would need to be forced out of the
 thermal interrupt.
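
A rough way to size this concern: if a CPU's TSC runs at a reduced rate while the old scale factor is still in use, the extrapolated time falls behind in proportion to the slowdown and to the time until the next calibration. The helper below is a hypothetical back-of-the-envelope sketch; the one-second interval and the rates used in the example are assumptions, not figures from the thread.

```c
#include <stdint.h>

/*
 * Nanoseconds by which extrapolated time lags real time if the TSC runs at
 * rate_permille/1000 of its calibrated rate for interval_ns before the next
 * calibration. Valid for intervals well below 2^64 / 1000 ns.
 */
static uint64_t demo_throttle_lag_ns(uint64_t interval_ns, uint32_t rate_permille)
{
    if (rate_permille >= 1000)
        return 0;   /* not throttled: no lag */

    return interval_ns * (1000u - rate_permille) / 1000u;
}
```

Assuming one second between calibrations, a TSC throttled to 87.5% of its rate would lag by 125 ms and one throttled to 50% by 500 ms, far more than "a few microseconds".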

Yes, this could be an issue. Is there any way to get an interrupt or MCE
when thermal throttling occurs?

 -- Keir



Re: [Xen-devel] [patch 14/33] xen: xen time implementation

2007-06-06 Thread Jan Beulich
 Andi Kleen [EMAIL PROTECTED] wrote on 06.06.07 14:18:

 
 Yes, this could be an issue. Is there any way to get an interrupt or MCE
 when thermal throttling occurs?

Yes, you can get a thermal interrupt from the local APIC. See the Linux
kernel source. Of course, there would still be a race window.

On the other hand, some timing issues on throttling are probably
the least of the users' problems when it really happens.

Not if this results in your box hanging - I think throttling is exactly intended
to keep the box alive as long as possible (and I've seen throttling in action,
with the box happily recovering from the situation - after having seen it a
few times I checked and found the fan covered with dust).

Standard Linux just ignores it.

Jan




Re: [Xen-devel] [patch 14/33] xen: xen time implementation

2007-06-06 Thread Andi Kleen
On Wednesday 06 June 2007 14:46:59 Jan Beulich wrote:
  Andi Kleen [EMAIL PROTECTED] wrote on 06.06.07 14:18:
 
  
  Yes, this could be an issue. Is there any way to get an interrupt or MCE
  when thermal throttling occurs?
 
 Yes, you can get a thermal interrupt from the local APIC. See the Linux
 kernel source. Of course, there would still be a race window.
 
 On the other hand, some timing issues on throttling are probably
 the least of the users' problems when it really happens.
 
 Not if this results in your box hanging 

Yes, it shouldn't hang. I'm just saying that some non-monotonicity in the
returned values under this abnormal condition is probably not the end of the
world.

-Andi