On 11/09, Hubert Feyrer wrote:
>
> On Sat, 5 Nov 2016, Hubert Feyrer wrote:
> >Is this expected behaviour? Definitely surprised me! :)
>
> FWIW, it seems the same behaviour happens on both netbsd-7/amd64 as
> of today as well as on -current/amd64, also from today.
>
> I've put a screenshot here that shows the issue:
> http://www.feyrer.de/Misc/priv/bad-scheduling-7.0_STABLE+7.99.42.png
> (Right side is 7.0_STABLE, left is -current as can be seen on the top)

I can also reproduce the problem on amd64 "NetBSD 6.1_STABLE
(GENERIC)" (updated today).  I used a VMware Fusion (8.5.1) virtual
machine hosted on a Mac (macOS Sierra (10.12.1)).  The Mac has an
Intel Core i7 processor with four physical (eight logical) CPUs:

--- 8< ---
$ sysctl -n hw.physicalcpu
4
$ sysctl -n hw.logicalcpu
8
--- >8 ---

I gave the VM two CPUs, and ran 'sh loop.sh' (using loop.sh as shown
in the above screenshot) in two separate terminals.  Sure enough,
NetBSD schedules both sh processes on the same CPU (1), leaving the
other CPU (0) idle, as shown by top:

--- 8< ---
load averages: 1.31, 0.38, 0.14; up 0+00:05:45 11:59:07
30 processes: 1 runnable, 27 sleeping, 2 on CPU
CPU0 states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU1 states: 100% user, 0.0% nice, 0.0% system, 0.0% interrupt, 0.0% idle
Memory: 59M Act, 7528K Exec, 40M File, 386M Free
Swap: 256M Total, 256M Free

  PID USERNAME PRI NICE   SIZE    RES STATE      TIME   WCPU    CPU COMMAND
 4266 jlmuir    32    0    13M  1256K CPU/1      0:33 49.37% 47.46% sh
 2727 jlmuir    31    0    13M  1256K RUN/1      0:33 47.99% 46.04% sh
 4749 jlmuir    43    0    17M  1760K CPU/0      0:00  0.00%  0.00% top
    0 root     125    0     0K  8108K aiodon/1   0:00  0.00%  0.00% [system]
 2869 root      85    0    75M  5376K select/1   0:00  0.00%  0.00% sshd
 3781 root      85    0    75M  5376K select/1   0:00  0.00%  0.00% sshd
  455 root      85    0    75M  5308K select/0   0:00  0.00%  0.00% sshd
  631 jlmuir    85    0    75M  3876K select/0   0:00  0.00%  0.00% sshd
 4607 jlmuir    85    0    75M  3876K select/0   0:00  0.00%  0.00% sshd
 5212 jlmuir    85    0    75M  3876K select/1   0:00  0.00%  0.00% sshd
  593 root      85    0    48M  3808K kqueue/1   0:00  0.00%  0.00% master
  539 postfix   85    0    48M  3764K kqueue/0   0:00  0.00%  0.00% qmgr
  597 postfix   85    0    48M  3732K kqueue/1   0:00  0.00%  0.00% pickup
  355 root      85    0    56M  2832K select/1   0:00  0.00%  0.00% sshd
  164 root      85    0    23M  1860K kqueue/1   0:00  0.00%  0.00% syslogd
 2630 jlmuir    85    0  8844K  1396K pause/1    0:00  0.00%  0.00% ksh
--- >8 ---

I also tried giving the VM three CPUs and running three sh loop processes.
NetBSD didn't leave any CPUs idle in this configuration, but it also
didn't seem to fully utilize all of them.  I also noticed that it
didn't seem to leave each sh process scheduled on the CPU it was
running on; rather, it seemed to move them around sometimes.  NetBSD
also didn't appear to always schedule the three sh processes on all
three of the CPUs; it seemed to sometimes schedule the three sh
processes on just two of the CPUs.  This is based only on looking at
top, though, so maybe it's just related to the fact that the scheduler
is moving the three sh processes around to different CPUs, and top
happened to capture the process info at a time when an sh process was
in the middle of being moved to another CPU; I don't know.

Anyway, here's a top capture where the three sh processes appear to be
scheduled on CPUs 0 and 2:

--- 8< ---
load averages: 2.62, 1.02, 0.40; up 0+00:03:31 12:14:48
34 processes: 1 runnable, 30 sleeping, 3 on CPU
CPU0 states: 100% user, 0.0% nice, 0.0% system, 0.0% interrupt, 0.0% idle
CPU1 states: 90.4% user, 0.0% nice, 0.0% system, 0.0% interrupt, 9.6% idle
CPU2 states: 81.6% user, 0.0% nice, 0.0% system, 0.0% interrupt, 18.4% idle
Memory: 41M Act, 7528K Exec, 20M File, 410M Free
Swap: 256M Total, 256M Free

  PID USERNAME PRI NICE   SIZE    RES STATE      TIME   WCPU    CPU COMMAND
  389 jlmuir    26    0    13M  1260K CPU/0      1:45 90.19% 89.99% sh
  260 jlmuir    25    0    13M  1260K RUN/0      1:45 89.79% 89.60% sh
  486 jlmuir    25    0    13M  1260K CPU/2      1:46 84.92% 84.72% sh
   44 jlmuir    43    0    17M  1812K CPU/1      0:00  0.00%  0.00% top
    0 root     125    0     0K  8604K aiodon/0   0:00  0.00%  0.00% [system]
  697 root      85    0    75M  5380K select/2   0:00  0.00%  0.00% sshd
   41 root      85    0    75M  5380K select/2   0:00  0.00%  0.00% sshd
  536 root      85    0    75M  5380K select/2   0:00  0.00%  0.00% sshd
  446 root      85    0    75M  5312K select/2   0:00  0.00%  0.00% sshd
  691 jlmuir    85    0    75M  3880K select/2   0:00  0.00%  0.00% sshd
  658 jlmuir    85    0    75M  3880K select/2   0:00  0.00%  0.00% sshd
   74 jlmuir    85    0    75M  3880K select/2   0:00  0.00%  0.00% sshd
   40 jlmuir    85    0    75M  3880K select/2   0:00  0.00%  0.00% sshd
  592 root      85    0    48M  3812K kqueue/2   0:00  0.00%  0.00% master
  540 postfix   85    0    48M  3768K kqueue/0   0:00  0.00%  0.00% qmgr
--- >8 ---

And here's another top capture, this time where the three sh processes
appear to be scheduled on CPUs 0 and 1:

--- 8< ---
load averages: 2.98, 1.91, 0.89; up 0+00:06:32 12:17:49
34 processes: 2 runnable, 29 sleeping, 3 on CPU
CPU0 states: 100% user, 0.0% nice, 0.0% system, 0.0% interrupt, 0.0% idle
CPU1 states: 74.5% user, 0.0% nice, 0.0% system, 0.0% interrupt, 25.5% idle
CPU2 states: 66.1% user, 0.0% nice, 0.0% system, 0.0% interrupt, 33.9% idle
Memory: 41M Act, 7528K Exec, 20M File, 410M Free
Swap: 256M Total, 256M Free

  PID USERNAME PRI NICE   SIZE    RES STATE      TIME   WCPU    CPU COMMAND
  389 jlmuir    25    0    13M  1260K RUN/0      4:34 84.62% 84.62% sh
  260 jlmuir    25    0    13M  1260K CPU/0      4:34 82.67% 82.67% sh
  486 jlmuir    25    0    13M  1260K RUN/1      4:18 78.13% 78.12% sh
    0 root       0    0     0K  8600K CPU/2       ???  0.00%  0.00% [system]
   44 jlmuir    43    0    17M  1816K CPU/1      0:00  0.00%  0.00% top
  697 root      85    0    75M  5380K select/2   0:00  0.00%  0.00% sshd
  536 root      85    0    75M  5380K select/2   0:00  0.00%  0.00% sshd
   41 root      85    0    75M  5380K select/2   0:00  0.00%  0.00% sshd
  446 root      85    0    75M  5312K select/2   0:00  0.00%  0.00% sshd
  691 jlmuir    85    0    75M  3880K select/2   0:00  0.00%  0.00% sshd
   40 jlmuir    85    0    75M  3880K select/2   0:00  0.00%  0.00% sshd
  658 jlmuir    85    0    75M  3880K select/2   0:00  0.00%  0.00% sshd
   74 jlmuir    85    0    75M  3880K select/2   0:00  0.00%  0.00% sshd
  592 root      85    0    48M  3812K kqueue/2   0:00  0.00%  0.00% master
  540 postfix   85    0    48M  3768K kqueue/2   0:00  0.00%  0.00% qmgr
--- >8 ---

Thanks!

Lewis
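
P.S. Since the observations above rest on eyeballing individual top
snapshots, here is a minimal Python sketch for tallying which CPU each
sh process is attached to across saved top rows.  It assumes that the
number after the slash in the STATE column is the CPU (the reading
used above) and that each process row has the 11 fields seen in these
captures.  The names tally_cpus and STATE_RE are hypothetical helpers
for illustration, not part of any NetBSD tool.

```python
import re
from collections import defaultdict

# Matches a STATE field such as "CPU/1" or "RUN/0".
# Assumption: the number after the slash is the CPU the process is on
# (CPU) or queued for (RUN), as read from the captures above.
STATE_RE = re.compile(r"^(CPU|RUN)/(\d+)$")


def tally_cpus(top_rows, command="sh"):
    """Return {cpu_number: [pids]} for rows whose COMMAND equals `command`."""
    placement = defaultdict(list)
    for row in top_rows:
        fields = row.split()
        # Expected columns:
        # PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
        if len(fields) != 11 or fields[10] != command:
            continue
        match = STATE_RE.match(fields[6])
        if match:
            placement[int(match.group(2))].append(int(fields[0]))
    return dict(placement)


# Rows copied from the two-CPU capture above.
rows = [
    "4266 jlmuir 32 0 13M 1256K CPU/1 0:33 49.37% 47.46% sh",
    "2727 jlmuir 31 0 13M 1256K RUN/1 0:33 47.99% 46.04% sh",
    "4749 jlmuir 43 0 17M 1760K CPU/0 0:00 0.00% 0.00% top",
]
print(tally_cpus(rows))  # {1: [4266, 2727]}
```

Running this over many snapshots taken a second or so apart would give
a rough idea of how often two runnable sh processes share a CPU, though
it still only sees what top happened to sample.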