Re: Seeing a lot of scale_rt_power messages

2013-10-24 Thread KarlKingston
Linux on 390 Port LINUX-390@VM.MARIST.EDU wrote on 10/23/2013 10:46:35 AM:

> From: Richard Troth ri...@velocitysoftware.com
> To: LINUX-390@VM.MARIST.EDU
> Date: 10/23/2013 10:46 AM
> Subject: Re: Seeing a lot of scale_rt_power messages
> Sent by: Linux on 390 Port LINUX-390@VM.MARIST.EDU
>
> I haven't seen it that I recall, but a Google search suggests that it
> comes from load balancing when you're running tickless.
>
> > scale_rt_power: clock:3806a691fbdb1 age:3806a4e60fe00, avg:5d4b8d2f
>
> The scheduler is trying to tell you something (because the
> scale_rt_power() function is in the scheduler), but the context is
> lost.
>
> Does this guest have multiple CPUs?

Two CPUs are defined.  We have other guests with multiple CPUs that don't
exhibit this behavior.

> Also, have you made any tuning changes over its life?  (Things handled
> by 'sysctl' or /etc/sysctl.conf.)

Not that I am aware of.

> What is the output of ...
>
> sysctl -a | grep sched


kernel.sched_child_runs_first = 0
kernel.sched_min_granularity_ns = 400
kernel.sched_latency_ns = 1200
kernel.sched_wakeup_granularity_ns = 500
kernel.sched_tunable_scaling = 1
kernel.sched_migration_cost = 50
kernel.sched_nr_migrate = 32
kernel.sched_time_avg = 1000
kernel.sched_shares_window = 1000
kernel.sched_rt_period_us = 100
kernel.sched_rt_runtime_us = 95
kernel.sched_compat_yield = 0
kernel.sched_cfs_bandwidth_slice_us = 5000
kernel.sched_domain.cpu0.domain0.min_interval = 1
kernel.sched_domain.cpu0.domain0.max_interval = 4
kernel.sched_domain.cpu0.domain0.busy_idx = 2
kernel.sched_domain.cpu0.domain0.idle_idx = 1
kernel.sched_domain.cpu0.domain0.newidle_idx = 0
kernel.sched_domain.cpu0.domain0.wake_idx = 0
kernel.sched_domain.cpu0.domain0.forkexec_idx = 0
kernel.sched_domain.cpu0.domain0.busy_factor = 64
kernel.sched_domain.cpu0.domain0.imbalance_pct = 125
kernel.sched_domain.cpu0.domain0.cache_nice_tries = 1
kernel.sched_domain.cpu0.domain0.flags = 4143
kernel.sched_domain.cpu0.domain0.name = CPU
kernel.sched_domain.cpu1.domain0.min_interval = 1
kernel.sched_domain.cpu1.domain0.max_interval = 4
kernel.sched_domain.cpu1.domain0.busy_idx = 2
kernel.sched_domain.cpu1.domain0.idle_idx = 1
kernel.sched_domain.cpu1.domain0.newidle_idx = 0
kernel.sched_domain.cpu1.domain0.wake_idx = 0
kernel.sched_domain.cpu1.domain0.forkexec_idx = 0
kernel.sched_domain.cpu1.domain0.busy_factor = 64
kernel.sched_domain.cpu1.domain0.imbalance_pct = 125
kernel.sched_domain.cpu1.domain0.cache_nice_tries = 1
kernel.sched_domain.cpu1.domain0.flags = 4143
kernel.sched_domain.cpu1.domain0.name = CPU
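
To compare these tunables with a guest that does not log the message, the output can be loaded into a dictionary and diffed. This is only a minimal sketch that assumes the "name = value" layout shown above; parse_sysctl and diff_settings are hypothetical helper names, not part of sysctl or any other tool.

```python
def parse_sysctl(text):
    """Parse `sysctl -a`-style lines ("name = value") into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        # Skip blank lines and anything that is not a "name = value" pair.
        if "=" not in line:
            continue
        name, _, value = line.partition("=")
        settings[name.strip()] = value.strip()
    return settings


def diff_settings(a, b):
    """Return {name: (value_in_a, value_in_b)} for every name whose value differs."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }
```

Saving the output of "sysctl -a | grep sched" from each guest to a file, then passing both file contents through parse_sysctl and the results to diff_settings, lists only the settings that differ.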


> Also, what do your boot parms look like?  (Look for HZ timer and other
> scheduler tweaks.)

No boot parameters are set for the scheduler or the HZ timer.
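
One way to double-check this on a running guest is to filter the kernel command line from /proc/cmdline. The sketch below is illustrative only: the keyword list ("nohz", "sched", "isolcpus") is an assumption about what might count as a timer or scheduler tweak, and find_boot_params and read_cmdline are hypothetical helper names.

```python
def find_boot_params(cmdline, keywords=("nohz", "sched", "isolcpus")):
    """Return the kernel command-line tokens that contain any of the keywords.

    The default keywords are an illustrative guess, not an exhaustive list.
    """
    tokens = cmdline.split()
    return [t for t in tokens if any(k in t for k in keywords)]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with (Linux only)."""
    with open(path) as f:
        return f.read().strip()
```

On the guest itself, find_boot_params(read_cmdline()) returning an empty list would agree with the answer above.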



--
For LINUX-390 subscribe / signoff / archive access instructions,
send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
--
For more information on Linux on System z, visit
http://wiki.linuxvm.org/


Seeing a lot of scale_rt_power messages

2013-10-23 Thread KarlKingston
Seeing a lot of these since we upgraded from SLES10SP4 to SLES11SP2:

scale_rt_power: clock:3806a691fbdb1 age:3806a4e60fe00, avg:5d4b8d2f
Oct 23 04:43:19 sandbx3 kernel: scale_rt_power: clock:3806a691fbdb1 age:3806a4e60fe00, avg:5d4b8d2f
scale_rt_power: clock:3806a71c9a763 age:3806a6c2e6300, avg:2ed2b4f8
Oct 23 04:43:19 sandbx3 kernel: scale_rt_power: clock:3806a71c9a763 age:3806a6c2e6300, avg:2ed2b4f8

What's up with this?  Any way to get around this?
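
For anyone who wants to look at the numbers rather than the raw text, the three fields can be pulled out of each message and converted to integers. This is a minimal sketch, assuming the fields are hexadecimal (they contain the digits a-f) and that the message layout is exactly as shown above; parse_scale_rt_power is a hypothetical helper name.

```python
import re

# Matches the message body with or without a syslog prefix; \s+ tolerates
# a line wrap between the clock and age fields.
LINE_RE = re.compile(
    r"scale_rt_power:\s+clock:(?P<clock>[0-9a-f]+)\s+"
    r"age:(?P<age>[0-9a-f]+),\s+avg:(?P<avg>[0-9a-f]+)"
)


def parse_scale_rt_power(line):
    """Return the clock, age and avg fields as integers, or None if absent."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    fields = {name: int(value, 16) for name, value in m.groupdict().items()}
    # The gap between clock and age is usually easier to read than either value.
    fields["clock_minus_age"] = fields["clock"] - fields["age"]
    return fields
```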




Re: Seeing a lot of scale_rt_power messages

2013-10-23 Thread Richard Troth
I haven't seen it that I recall, but a Google search suggests that it
comes from load balancing when you're running tickless.

> scale_rt_power: clock:3806a691fbdb1 age:3806a4e60fe00, avg:5d4b8d2f

The scheduler is trying to tell you something (because the
scale_rt_power() function is in the scheduler), but the context is
lost.

Does this guest have multiple CPUs?
Also, have you made any tuning changes over its life?  (Things handled
by 'sysctl' or /etc/sysctl.conf.)
What is the output of ...

sysctl -a | grep sched

Also, what do your boot parms look like?  (Look for HZ timer and other
scheduler tweaks.)




On Wed, Oct 23, 2013 at 7:32 AM, karlkings...@ongov.net wrote:
> Seeing a lot of these since we upgraded from SLES10SP4 to SLES11SP2:
>
> scale_rt_power: clock:3806a691fbdb1 age:3806a4e60fe00, avg:5d4b8d2f
> Oct 23 04:43:19 sandbx3 kernel: scale_rt_power: clock:3806a691fbdb1 age:3806a4e60fe00, avg:5d4b8d2f
> scale_rt_power: clock:3806a71c9a763 age:3806a6c2e6300, avg:2ed2b4f8
> Oct 23 04:43:19 sandbx3 kernel: scale_rt_power: clock:3806a71c9a763 age:3806a6c2e6300, avg:2ed2b4f8
>
> What's up with this?  Any way to get around this?





--
-- R;
Rick Troth
Velocity Software
http://www.velocitysoftware.com/



Re: Seeing a lot of scale_rt_power messages

2013-10-23 Thread Mark Post
On 10/23/2013 at 07:32 AM, karlkings...@ongov.net wrote:
> Seeing a lot of these since we upgraded from SLES10SP4 to SLES11SP2:
>
> scale_rt_power: clock:3806a691fbdb1 age:3806a4e60fe00, avg:5d4b8d2f
> Oct 23 04:43:19 sandbx3 kernel: scale_rt_power: clock:3806a691fbdb1 age:3806a4e60fe00, avg:5d4b8d2f
> scale_rt_power: clock:3806a71c9a763 age:3806a6c2e6300, avg:2ed2b4f8
> Oct 23 04:43:19 sandbx3 kernel: scale_rt_power: clock:3806a71c9a763 age:3806a6c2e6300, avg:2ed2b4f8
>
> What's up with this?  Any way to get around this?

You'll need to open up a service request with your service provider to get a 
really detailed answer.  At a higher level, that message is coming out of 
kernel/sched_fair.c.  The comment above it says:
/* RT usage tracking looks fishy, report anomaly and restore sanity */

From what I can make out, RT refers to Real Time, but I don't see how that
fits in with the Completely Fair Scheduler that you're apparently running.  If
I had to guess, I would say that you're seeing some sort of problem with the
system clock not being accurate enough, which could be due to some sort of
resource starvation.
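
A rough way to see why the kernel might call these numbers "fishy" is to redo the arithmetic in user space. The sketch below rests on two assumptions that this thread does not confirm for the SLES kernel: that the check resembles mainline scale_rt_power(), which compares rq->rt_avg against sched_avg_period() + (rq->clock - rq->age_stamp), and that the logged clock, age and avg fields are those three values in hexadecimal nanoseconds. The name looks_fishy is hypothetical.

```python
NSEC_PER_MSEC = 1_000_000


def sched_avg_period_ns(sched_time_avg_ms=1000):
    """Half of kernel.sched_time_avg in ns, as in mainline sched_avg_period()."""
    return sched_time_avg_ms * NSEC_PER_MSEC // 2


def looks_fishy(clock, age, avg, sched_time_avg_ms=1000):
    """True if the RT average exceeds the time window it was accumulated over.

    Assumption: this is the condition being reported; the SLES patch may differ.
    """
    total = sched_avg_period_ns(sched_time_avg_ms) + (clock - age)
    return avg > total


# The two samples from the original post.
print(looks_fishy(0x3806a691fbdb1, 0x3806a4e60fe00, 0x5d4b8d2f))
print(looks_fishy(0x3806a71c9a763, 0x3806a6c2e6300, 0x2ed2b4f8))
```

Under these assumptions, and with kernel.sched_time_avg at 1000, both samples return True: the reported avg is larger than the whole window, which would be consistent with a clock or accounting glitch rather than genuine real-time load.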

Switching to the Deadline scheduler would probably make the messages go away 
(and I believe Deadline is the recommended scheduler for System z), but that 
wouldn't address the root cause.


Mark Post
