On Wednesday 25 April 2007 04:26, Mike Mattie wrote:
> Hello,
>
> 0. intro
>
> I am very happy to report that v46 of RSDL is subjectively much better than
> v42. As you (Con Kolivas) might remember from a previous mail, I was
> experimenting with using nice levels effectively. I have refined these
> levels to this layout:
>
>  -2 : clock (ntpd)
>  -1 : syslog, sshd, X
>   0 : command; default for shells
>   1 : audacious (audio), xfce window manager (with compositor on)
>   2 : emacs (SCHED_OTHER), desktop/window manager infrastructure (dbus),
>       ssh-agent, bind (batch scheduled)
>   3 : desktop applications (mail, xchat, openoffice)
>   5 : spamd, batch scheduled compiles/test-suites
>  10 : cron jobs
>
> 1. Some numbers
>
> My machine is a particularly tough case, I think: a uni-processor Athlon XP
> 3000+ (involuntary pre-empt) with a software RAID5 on PATA drives. I load
> it heavily with compiles/test-suites, and I am very sensitive to audio
> glitches.
>
> Here are some stats for idle:
>
> ---load-avg--- ------memory-usage----- ----total-cpu-usage---- ----interrupts--- ---system--
> _1m_ _5m_ 15m_|_used _buff _cach _free|usr sys idl wai hiq siq|__17_ __18_ __20_|_int_ _csw_
>  0.2  0.2  0.2| 170M   15M  309M 6560k|  2   1  94   4   0   0|    1     7   150|  238   208
>  0.2  0.2  0.2| 170M   15M  309M 6568k|  1   0  99   0   0   0|    0     0     0|   76    55
>  0.2  0.2  0.2| 170M   15M  309M 6568k|  0   1  99   0   0   0|    0     0     0|   75    47
>  0.2  0.2  0.2| 170M   15M  309M 6624k|  4   0  96   0   0   0|    0     0     0|   75    37
>  0.2  0.2  0.2| 170M   15M  309M 6624k|  1   0  99   0   0   0|    0     0     0|   75    36
>
> Here are some stats for music playing:
>
> ---load-avg--- ------memory-usage----- ----total-cpu-usage---- ----interrupts--- ---system--
> _1m_ _5m_ 15m_|_used _buff _cach _free|usr sys idl wai hiq siq|__17_ __18_ __20_|_int_ _csw_
>  0.9  0.4  0.2| 175M   15M  305M 5652k|  2   1  94   4   0   0|    1     7   150|  238   210
>  0.9  0.4  0.2| 175M   15M  305M 5652k| 10   1  89   0   0   0|    0     3   989| 1068  1510
>  0.9  0.4  0.2| 175M   15M  305M 5592k| 13   0  87   0   0   0|    0     3  1013| 1093  1565
>  0.9  0.4  0.2| 175M   15M  304M 6300k| 11   1  88   0   0   0|    0     3  1000| 1078  1496
>  0.9  0.4  0.2| 175M   15M  305M 6300k| 13   0  87   0   0   0|    0     3  1006| 1084  1509
>  0.8  0.4  0.2| 175M   15M  305M 6180k| 13   1  86   0   0   0|    0     3  1000| 1078  1524
>  0.8  0.4  0.2| 175M   15M  305M 6060k| 12   1  87   0   0   0|    0     3  1000| 1078  1564
>
> The context switches are high, but so are the interrupts (USB 2.0 Audigy
> NX).
>
> To see how effective these nice levels were, I decided to play with
> rr_interval, on the theory that with priorities strictly enforced and used
> aggressively, a longer time-slice would not cause audio delay. So far
> that theory is holding. All of these numbers are with rr_interval = 20, and
> I have fewer audio problems than with any previous kernel/tuning setup.
>
> That is very impressive.
>
> As far as batch loading goes, I tried a kernel compile. These numbers look
> nice for RSDL but there are some caveats:
>
> kernel compile, CFS v3                    : make 756.83s user 89.37s system 58% cpu 24:08.21 total
> kernel compile, v46 rr_interval = default : make 754.66s user 89.74s system 59% cpu 23:35.38 total
> kernel compile, v46 rr_interval = 20      : make 682.83s user 84.34s system 73% cpu 17:29.57 total
>
> 1. The system was noisy. I did this intentionally. My typical load is a
>    mixture of desktop/compile. All three numbers were generated while
>    listening to music, reading docs/web/news, using emacs etc. With each of
>    the compiles I tried running a visualization plugin (ProjectM inside
>    audacious) for a minute or so.
>
>    This skews the numbers for comparison, but I was looking for an
>    impression that was based on a *real* work-load.
>
>    I would like to add as well that before RSDL the mainline scheduler
>    failed completely at running ProjectM even when it was the only
>    application on the desktop. (It stalled for seconds with a rock steady
>    period.)
>
> 2. All of these ran nice 5 sched: BATCH
>
> 3. I have the xfce compositor turned on, using the transparency.
>
> 4. compiled on software RAID 5 (md) -> dev mapper -> lvm2 -> ext3, 4
>    drives, write-cache disabled, external 512 MB flash drive for an
>    external journal, commit=15, journal=data
>
> From the caveats above, especially the deep stack for the block layer,
> plus meeting audio deadlines while sharing an interrupt with the journal
> drive (arghh), this is very impressive system behavior for me.
>
> Here are the stats for doing a kernel compile with audacious running, plus
> mail, editor etc.
>
> ---load-avg--- ------memory-usage----- ----total-cpu-usage---- ----interrupts--- ---system--
> _1m_ _5m_ 15m_|_used _buff _cach _free|usr sys idl wai hiq siq|__17_ __18_ __20_|_int_ _csw_
>  1.3    1  0.8| 198M   22M  269M   11M|  3   1  92   4   0   0|    1     7   199|  287   348
>  1.3    1  0.8| 204M   22M  269M 6072k| 79  12   0   9   0   0|    0     7  1003| 1087  2160
>  1.3    1  0.8| 195M   22M  268M   16M| 82  18   0   0   0   0|    0     8  1003| 1085  2703
>  1.3    1  0.8| 200M   22M  268M   10M| 82  16   0   2   0   0|    0     8  1009| 1094  2204
>  1.4    1  0.8| 195M   22M  269M   15M| 83  15   0   2   0   0|    0     8  1014| 1099  3007
>  1.4    1  0.8| 200M   22M  269M 9488k| 82  14   0   4   0   0|    0     7  1000| 1082  2361
>  1.4    1  0.8| 200M   22M  267M   12M| 83  15   0   2   0   0|    0     7  1000| 1085  2579
>
> Now for some comments from the peanut gallery.
>
> 2. Window Manager scheduler hinting ?
>
> On reflection my workload may be the easy case. As a developer I run a
> somewhat small number of applications, typically the lightest I can find,
> except emacs :)
>
> A more typical desktop user might not be able to use my sort of setup,
> where I can push a batchy job down in priority and wait for it. I also
> write shell functions, aliases etc. to set this up, which is easy for a
> distro, but not necessarily usable by the average user. For users who
> are running multiple monolithic CPU hog programs, like openoffice, firefox
> etc., this sort of approach won't suit them.
>
> However the strict enforcement of RSDL could be leveraged for the desktop
> user as well.
> The Mac OS X scheduler has layered the concept of foreground and
> background scheduling on top of the typical nice priority levels.
> Basically the Mac window manager can tune the scheduling based on window
> focus.
>
> I think something like this combined with RSDL could be a worthy
> experiment. If the window manager can calculate the "attention" a user
> gives a window then it could nice it up/down within a small range. Mac OS X
> has a nasty behavior of being jerky when switching focus under load. I
> think this is due to a simplistic knee-jerk response to window focus in
> scheduling (or my ibook has too little RAM).
>
> If a linux window manager were to rank the attention of windows, and be
> smart about cycling between groups of apps, I think three priority levels
> could be used like this:
>
>  1 : foreground (frequent attention)
>  2 : background (infrequent attention)
>  3 : batchy (downloaders, other long running infrequently monitored
>      programs)
>
> Think of how easy this is for a window manager to compute, compared to
> trying to re-build the information in-kernel with heuristics.
>
> If this idea is actually pursued there may need to be a new feature in
> RSDL. With this scheme it is very important to ensure that a particular
> nice level does not become overloaded (think foreground). The current
> linux schedulers report a load value for the total system. This scheme
> needs to know the load value for an individual nice level as well; that
> way the foreground nice level could remain responsive by, worst case,
> kicking a program down a level or two if it starts becoming unresponsive.
>
> 3. Better throughput
>
> I think that this mixed developer work-load is actually the worst case for
> a scheduler. It has to meet deadlines and provide decent throughput. Beyond
> pre-empt and clock precise scheduling I am not sure if there is much more
> that can be done for interactive.
>
> I do think that SCHED_BATCH provides a lot of room for interesting ideas
> though, since the guarantees are so loose. As I understand it SCHED_BATCH
> is guaranteed to not starve and that is about it.
>
> Since I am commenting freely, here is an idea to be taken with a huge grain
> of salt. Is it possible that the scheduler could compute and combine the
> deadlines for both audio/video? If the scheduler can compute the longest
> interval between both video/audio refresh then scheduling could be arranged
> like so:
>
>   refresh -> interactive -> batch -> refresh
>
> The interactive processes would run first; that way the risk of missing a
> refresh would be minimized. Once the scheduler has run all the interactive
> stuff, for the case of a small set of programs such as audio player and
> editor, it would be very likely that a lot of time is left.
>
> Next assume that the SCHED_BATCH tasks have been sorted into CPU intensive
> and IO intensive. For the CPU intensive ones it would be nice if the
> scheduler would give them a massive time-slice, why not all the time until
> the next refresh point? Basically reduce the context-switching to mostly
> interrupts/background noise. The SCHED_BATCH programs may take longer to
> run, as they are being interleaved more than balanced, but I think it's
> possible that overall throughput could be increased considerably. If
> something like this could be done while still honoring the nice values
> (though not as strictly as for interactive programs) it would be a big
> win. With huge time-slices other parts of the system such as VM management
> might behave more efficiently as well.
>
> I think linux would be quite special if it was the best in throughput
> efficiency (ignoring completion time, just how much processor etc. is used
> to run the same work-load) for SETI like work-loads while still running a
> fully responsive interactive desktop.
>
> btw, the above concept is articulated from a distant background of
> programming a VGA adapter on a 286. That was the last time I dealt with
> hard deadlines hands on. I haven't had a reason to code at bare-metal since
> I started using linux, so please consider it a vehicle for articulating a
> concept.
>
> 4. Outro
>
> In summary I like the RSDL scheduler quite a bit. It is consistent and
> doesn't do magic, so I can build a priority scheme on top of it with a very
> compact and reliable behavior model. Using the priority levels seems to
> allow me to use larger time-slices without sacrificing interactivity. This
> is unsurprising as I am actually telling the scheduler what I want ......
>
> I think that the window manager can use simple algorithms to calculate what
> the kernel would have to guess at with hairy heuristics. Hacking nice
> throttling into the window manager combined with a very simple but reliable
> scheduler may work pretty well for desktop users. Maybe that will excite
> someone enough to go try it, or dig up some existing implementation (other
> than OSX).
>
> I also think that SCHED_BATCH is where a lot of fun experiments can be
> played, especially in regards to CPU intensive programs. This combination
> is actually quite common, I would think, in audio/video production.
>
> At this point, with how well my system works, the itch has been scratched
> as far as the in-kernel part goes. I am interested though in playing around
> with your idlerun program.
>
> Later on, possibly much later, I will cook up some better
> numbers/comparisons. I really don't trust subjective evaluations of
> scheduling, my own included. I think people really want a new kernel patch
> to work better, which is a horrible way to start an evaluation. I want to
> measure both throughput and interactivity in a double-blind like way.
> (random option for grub ?)
>
> With most of my work-load IO bound I expect the performance improvements to
> come from places like CFQ, ext4, syslet etc.
>
> Thank you to all for a good kernel. Linux user-space is quite comfortable
> these days.
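The window-manager hinting idea from section 2 of the quoted mail can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Window` record, the `rank_windows` function, the attention cutoffs and the CPU budget for the foreground level are all hypothetical, and the rule for relieving an overloaded foreground level (demote the least-attended window first, never the most-attended one) is one possible policy, not something the mail specifies.

```python
from dataclasses import dataclass

# The three nice levels proposed in the mail.
FOREGROUND, BACKGROUND, BATCHY = 1, 2, 3


@dataclass
class Window:
    pid: int
    focus_seconds: float  # time the window held focus during the period
    cpu_share: float      # fraction of one CPU it used, 0.0 to 1.0


def rank_windows(windows, period=60.0, fg_cutoff=0.25, bg_cutoff=0.02,
                 fg_cpu_budget=0.8):
    """Map each pid to a nice level based on how much attention it received.

    All thresholds are made-up illustration values, not tuned numbers.
    """
    levels = {}
    for w in windows:
        attention = w.focus_seconds / period
        if attention >= fg_cutoff:
            levels[w.pid] = FOREGROUND
        elif attention >= bg_cutoff:
            levels[w.pid] = BACKGROUND
        else:
            levels[w.pid] = BATCHY

    # Per-level load check: if the foreground level is overloaded, kick the
    # least-attended foreground windows down one level until it fits.
    fg = sorted((w for w in windows if levels[w.pid] == FOREGROUND),
                key=lambda w: w.focus_seconds)
    load = sum(w.cpu_share for w in fg)
    for w in fg[:-1]:  # the most-attended window always stays in front
        if load <= fg_cpu_budget:
            break
        levels[w.pid] = BACKGROUND
        load -= w.cpu_share
    return levels
```

Applying the result would amount to calling `os.setpriority(os.PRIO_PROCESS, pid, level)` for each entry. An unprivileged process can normally only raise a nice value, so a real window manager would need some privilege (or a suitable RLIMIT_NICE) to move a program back up to the foreground level.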
Thanks for the extensive report. The only thing I can say in the short time I
have at the pc is that SCHED_BATCH, as defined by Ingo in the current mainline
kernel (a definition I have to abide by in SD), means "I want the same total
cpu percentage as other tasks at this nice level but I am latency
insensitive", i.e. it is not the "idle priority" type of SCHED_BATCH. That
sort of thing is implemented in -ck as SCHED_IDLEPRIO. If I have pc time and
health I'll be reimplementing it for -ck when it moves to the SD scheduler.

--
-ck
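As a concrete illustration of the SCHED_BATCH policy described above, here is a minimal Python sketch that starts a command under SCHED_BATCH at a raised nice level, similar to the "nice 5 sched: BATCH" compiles in the quoted mail. The helper name `run_batch` and the default increment of 5 are assumptions made for illustration; `os.sched_setscheduler`, `os.SCHED_BATCH`, `os.sched_param` and `os.nice` are the Linux-only calls from Python's standard `os` module.

```python
import os
import subprocess


def run_batch(argv, nice_increment=5, **kwargs):
    """Run argv as a child process under SCHED_BATCH with a raised nice value.

    Hypothetical helper: extra keyword arguments go to subprocess.run().
    """

    def setup():
        # Runs in the child between fork() and exec().
        # SCHED_BATCH uses a static priority of 0, like SCHED_OTHER.
        os.sched_setscheduler(0, os.SCHED_BATCH, os.sched_param(0))
        # os.nice() adds the increment to the niceness inherited from the parent.
        os.nice(nice_increment)

    return subprocess.run(argv, preexec_fn=setup, check=True, **kwargs)
```

A call such as `run_batch(["make"])` would then run the build as a latency-insensitive task. The scheduling policy is inherited across fork(), so the compiler processes spawned by the command run under SCHED_BATCH as well.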