Re: [PATCH 4/4] Add a timer to allow the separation of consigned from steal time.

2013-03-05 Thread Michael Wolf
Sorry for the delay in the response.  I did not see your question.

On Mon, 2013-02-18 at 20:57 -0300, Marcelo Tosatti wrote:
 On Tue, Feb 05, 2013 at 03:49:41PM -0600, Michael Wolf wrote:
  Add a helper routine to scheduler/core.c to allow the kvm module
  to retrieve the cpu hardlimit settings.  The values will be used
  to set up a timer that is used to separate the consigned from the
  steal time.
 
 1) Can you please describe, in English, how the mechanics of subtracting
 cpu hardlimit values from steal time reported via run_delay are supposed
 to work?
 
 The period and the quota used to separate the consigned time 
 (expected steal) from the steal time are taken
 from the cfs bandwidth control settings. Any other steal time
 accruing during that period will show as the traditional steal time.
 
 There is no expected steal time over a fixed period of real time.
There is expected steal time in the sense that the administrator of the
system sets up guests on the host so that there will be cpu
overcommitment.  The end user who is using the guest does not know this;
they only know they have been guaranteed a certain level of performance.
So if steal time shows up, the end user typically thinks they are not
getting their guaranteed performance.  This patchset is meant to allow
top to show 100% utilization and ONLY show steal time if it is over the
level of steal time that the host administrator set up.  Take a simple
example of a host with 1 cpu and two guests on it.  If each guest is
fully utilized, a user will see 50% utilization and 50% steal in either
of the guests.  In this case the amount of steal time that the host
administrator would expect to see is 50%.  As long as the steal in the
guest does not exceed 50%, the guest is running as expected.  If for some
reason the steal increases to 60%, something is wrong: the steal
time needs to be reported so that the end user can make inquiries.
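
To make the arithmetic in the example concrete, here is a minimal sketch of the split for a single accounting period.  It is not code from the patchset: the struct steal_split type and the split_run_delay() helper are hypothetical names chosen for illustration, and the units are arbitrary (percent of a period in the example).

```c
#include <stdint.h>

/* Hypothetical type for illustration; it does not appear in the patch. */
struct steal_split {
	uint64_t consigned;	/* delay the host admin planned for */
	uint64_t steal;		/* delay beyond the planned amount */
};

/*
 * Split the run delay seen in one accounting period.  The first
 * `expected` units count as consigned; only the excess is steal.
 */
static struct steal_split split_run_delay(uint64_t run_delay, uint64_t expected)
{
	struct steal_split s;

	s.consigned = run_delay < expected ? run_delay : expected;
	s.steal = run_delay - s.consigned;
	return s;
}
```

With an expected delay of 50, a run delay of 50 yields consigned=50 and steal=0, while a run delay of 60 yields consigned=50 and steal=10, so only the 10% above the planned overcommitment would be shown as steal.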

 
 2) From the description of patch 1: In the case of where you have
 a system that is running in a capped or overcommitted environment 
 the user may see steal time being reported in accounting tools 
 such as top or vmstat. 
 
 This is outdated, right? Because overcommitted environment is exactly
 what steal time should report.

I hope I'm not missing your point here.  But again this comes down to
the point of view.  The end user is guaranteed a capability/level of
performance that may not be a whole cpu.  So only show steal time if the
amount of steal time exceeds what the host admin expected when the guest
was set up.
 
 
 Thanks

thanks
Mike Wolf

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] Add a timer to allow the separation of consigned from steal time.

2013-03-05 Thread Marcelo Tosatti
On Tue, Mar 05, 2013 at 02:17:57PM -0600, Michael Wolf wrote:
 Sorry for the delay in the response.  I did not see your question.
 
 On Mon, 2013-02-18 at 20:57 -0300, Marcelo Tosatti wrote:
  On Tue, Feb 05, 2013 at 03:49:41PM -0600, Michael Wolf wrote:
   Add a helper routine to scheduler/core.c to allow the kvm module
   to retrieve the cpu hardlimit settings.  The values will be used
   to set up a timer that is used to separate the consigned from the
   steal time.
  
  1) Can you please describe, in English, how the mechanics of subtracting
  cpu hardlimit values from steal time reported via run_delay are supposed
  to work?
  
  The period and the quota used to separate the consigned time 
  (expected steal) from the steal time are taken
  from the cfs bandwidth control settings. Any other steal time
  accruing during that period will show as the traditional steal time.
  
  There is no expected steal time over a fixed period of real time.
 There is expected steal time in the sense that the administrator of the
 system sets up guests on the host so that there will be cpu
 overcommitment. 

I refer to 

+   /* split the delta into steal and consigned */
+   if (vcpu->arch.current_consigned < vcpu->arch.consigned_quota) {
+       vcpu->arch.current_consigned += delta;
+       if (vcpu->arch.current_consigned > vcpu->arch.consigned_quota) {
+           steal_delta = vcpu->arch.current_consigned
+               - vcpu->arch.consigned_quota;
+           consigned_delta = delta - steal_delta;
+       } else {

You can't expect there to be any amount of stolen time over a fixed
period of time.

  The end user who is using the guest does not know this;
 they only know they have been guaranteed a certain level of performance.
 So if steal time shows up, the end user typically thinks they are not
 getting their guaranteed performance.  This patchset is meant to allow
 top to show 100% utilization and ONLY show steal time if it is over the
 level of steal time that the host administrator set up.  Take a simple
 example of a host with 1 cpu and two guests on it.  If each guest is
 fully utilized, a user will see 50% utilization and 50% steal in either
 of the guests.  In this case the amount of steal time that the host
 administrator would expect to see is 50%.  As long as the steal in the
 guest does not exceed 50%, the guest is running as expected.  If for some
 reason the steal increases to 60%, something is wrong: the steal
 time needs to be reported so that the end user can make inquiries.

This is the purpose of stolen time: to report the amount of time the guest
vcpu was runnable but not running (IOW: starved).

  2) From the description of patch 1: In the case of where you have
  a system that is running in a capped or overcommitted environment 
  the user may see steal time being reported in accounting tools 
  such as top or vmstat. 
  
  This is outdated, right? Because overcommitted environment is exactly
  what steal time should report.
 
 I hope I'm not missing your point here.  But again this comes down to
 the point of view.  The end user is guaranteed a capability/level of
 performance that may not be a whole cpu.  So only show steal time if the
 amount of steal time exceeds what the host admin expected when the guest
 was set up.

The real values must be reported. If the host system suddenly becomes
loaded beyond what the host can provide to the guest, should the system
report an incorrect value to keep users from complaining? That sounds
incorrect.



Re: [PATCH 4/4] Add a timer to allow the separation of consigned from steal time.

2013-02-18 Thread Marcelo Tosatti
On Tue, Feb 05, 2013 at 03:49:41PM -0600, Michael Wolf wrote:
 Add a helper routine to scheduler/core.c to allow the kvm module
 to retrieve the cpu hardlimit settings.  The values will be used
 to set up a timer that is used to separate the consigned from the
 steal time.

1) Can you please describe, in English, how the mechanics of subtracting
cpu hardlimit values from steal time reported via run_delay are supposed
to work?

The period and the quota used to separate the consigned time 
(expected steal) from the steal time are taken
from the cfs bandwidth control settings. Any other steal time
accruing during that period will show as the traditional steal time.

There is no expected steal time over a fixed period of real time.

2) From the description of patch 1: In the case of where you have
a system that is running in a capped or overcommitted environment 
the user may see steal time being reported in accounting tools 
such as top or vmstat. 

This is outdated, right? Because overcommitted environment is exactly
what steal time should report.


Thanks



Re: [PATCH 4/4] Add a timer to allow the separation of consigned from steal time.

2013-02-07 Thread Glauber Costa
On 02/06/2013 10:07 PM, Michael Wolf wrote:
 On 02/06/2013 08:36 AM, Glauber Costa wrote:
 On 02/06/2013 01:49 AM, Michael Wolf wrote:
 Add a helper routine to scheduler/core.c to allow the kvm module
 to retrieve the cpu hardlimit settings.  The values will be used
 to set up a timer that is used to separate the consigned from the
 steal time.
 Sorry: What is the business of a timer in here?
 Whenever we read steal time, we know how much time has passed and with
 that information we can know the entitlement for the period. This breaks
 if we suspend, but we know that we suspended, so this is not a problem.
 I may be missing something, but how do we know how much time has
 passed?  That is why
 I had the timer in there.  I will go look again at the code but I
 thought the data was collected
 as ticks and passed at random times.  The ticks are also accumulating so
 we are looking at the
 difference in the count between reads.

They can be collected at random times, but you can of course record the
time at which each read happened.



Re: [PATCH 4/4] Add a timer to allow the separation of consigned from steal time.

2013-02-07 Thread Michael Wolf

On 02/07/2013 02:46 AM, Glauber Costa wrote:

On 02/06/2013 10:07 PM, Michael Wolf wrote:

On 02/06/2013 08:36 AM, Glauber Costa wrote:

On 02/06/2013 01:49 AM, Michael Wolf wrote:

Add a helper routine to scheduler/core.c to allow the kvm module
to retrieve the cpu hardlimit settings.  The values will be used
to set up a timer that is used to separate the consigned from the
steal time.

Sorry: What is the business of a timer in here?
Whenever we read steal time, we know how much time has passed and with
that information we can know the entitlement for the period. This breaks
if we suspend, but we know that we suspended, so this is not a problem.

I may be missing something, but how do we know how much time has
passed?  That is why
I had the timer in there.  I will go look again at the code but I
thought the data was collected
as ticks and passed at random times.  The ticks are also accumulating so
we are looking at the
difference in the count between reads.

They can be collected at random times, but you can of course record the
time in which it happened.


ok.  Let me add a previous_read field and take out the timer.



Re: [PATCH 4/4] Add a timer to allow the separation of consigned from steal time.

2013-02-06 Thread Glauber Costa
On 02/06/2013 01:49 AM, Michael Wolf wrote:
 Add a helper routine to scheduler/core.c to allow the kvm module
 to retrieve the cpu hardlimit settings.  The values will be used
 to set up a timer that is used to separate the consigned from the
 steal time.

Sorry: What is the business of a timer in here?
Whenever we read steal time, we know how much time has passed and with
that information we can know the entitlement for the period. This breaks
if we suspend, but we know that we suspended, so this is not a problem.

Everything bigger than the entitlement is steal time.
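
A rough sketch of the timer-free approach, under stated assumptions: it is not taken from the patch, struct steal_clock and steal_since_last_read() are hypothetical names, and it reads "entitlement" as the run time the guest is owed over the elapsed interval (elapsed * quota / period, using the cfs bandwidth quota and period).  Delay beyond what that entitlement leaves over is counted as steal.

```c
#include <stdint.h>

/* Hypothetical state; field names are illustrative, not from the patch. */
struct steal_clock {
	uint64_t prev_read_ns;	/* timestamp of the previous read */
	uint64_t prev_delay_ns;	/* cumulative run delay at that read */
};

/*
 * Return the steal time accrued since the previous read, given the
 * current timestamp, the cumulative run delay, and the bandwidth
 * quota/period.  Assumes quota_ns <= period_ns and period_ns != 0.
 */
static uint64_t steal_since_last_read(struct steal_clock *c, uint64_t now_ns,
				      uint64_t delay_ns, uint64_t quota_ns,
				      uint64_t period_ns)
{
	uint64_t elapsed = now_ns - c->prev_read_ns;
	uint64_t delay = delay_ns - c->prev_delay_ns;
	/* Run time the guest is entitled to over the elapsed interval. */
	uint64_t entitlement = elapsed / period_ns * quota_ns
			     + elapsed % period_ns * quota_ns / period_ns;
	/* Delay that is expected even when the entitlement is honored. */
	uint64_t expected = elapsed - entitlement;

	/* Remember this read so the next call measures from here. */
	c->prev_read_ns = now_ns;
	c->prev_delay_ns = delay_ns;
	return delay > expected ? delay - expected : 0;
}
```

Because the elapsed time is derived from the recorded timestamp, reads can arrive at irregular intervals and no periodic timer is needed.  A suspend would still have to be handled separately, for example by resetting the recorded timestamp on resume.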




Re: [PATCH 4/4] Add a timer to allow the separation of consigned from steal time.

2013-02-06 Thread Michael Wolf

On 02/06/2013 08:36 AM, Glauber Costa wrote:

On 02/06/2013 01:49 AM, Michael Wolf wrote:

Add a helper routine to scheduler/core.c to allow the kvm module
to retrieve the cpu hardlimit settings.  The values will be used
to set up a timer that is used to separate the consigned from the
steal time.

Sorry: What is the business of a timer in here?
Whenever we read steal time, we know how much time has passed and with
that information we can know the entitlement for the period. This breaks
if we suspend, but we know that we suspended, so this is not a problem.

I may be missing something, but how do we know how much time has
passed?  That is why I had the timer in there.  I will go look again
at the code, but I thought the data was collected as ticks and passed
at random times.  The ticks are also accumulating, so we are looking
at the difference in the count between reads.



Everything bigger than the entitlement is steal time.

I agree, provided I know the total amount of time over which the steal
time was accumulated.


