The following message is a courtesy copy of an article that has been posted to bit.listserv.ibm-main,alt.folklore.computers as well.
[EMAIL PROTECTED] (Tom Schmidt) writes:
> I've been running VM more off than on since PLC 5 and I'm certain that
> the behavior that I referenced WAS in VM... at some point. But if you
> & Lynn Wheeler say it isn't there now, I'll believe you (unless/until
> I can prove you wrong, of course).
>
> But I know back in the VM/HPO or (maybe) early VM/XA days it was true
> that VM put itself into a tiny loop while it waited for work. The
> loop was in a unique-to-VM PSW key so that the hardware monitor (the
> "speedometer") could tell the difference between work and wait.

there were a number of specific environment experiments done in that time frame ... for one reason or another. one of the first was for acp/tpf on 3081. acp/tpf didn't have multiprocessor support ... and there wasn't going to be a non-multiprocessor machine.

normally, to simulate a privileged instruction (one not handled by the microcode) ... interrupt into the vm kernel, do the simulation, and return to the virtual machine. this "overhead" tends to be constant from run to run ... directly part of doing work for the virtual machine. over the years, attempts were made to get this as small as possible and/or have it done directly in the hardware of the machine.

the other overhead is the cache/paging scenario ... fault for a page not in memory, and there is overhead to bring the page into memory. this is analogous to a cache miss ... the program appears to execute slower because of the latency to process the cache miss. this can be variable based on other activity going on in the real machine (for analogous reasons in both cache misses and page faults).

in the acp/tpf scenario ... if essentially just about the only workload was acp/tpf ... the 2nd 3081 processor would be idle. so there was a hack developed for things like SIOF emulation ... interrupt into the kernel, create an asynchronous task for the SIOF emulation, SIGP the other processor, and return to the acp/tpf virtual machine.
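the SIOF hack can be sketched as follows ... a minimal, hypothetical model (not actual VM kernel code; all names are made up), with a worker thread standing in for the SIGP'd second processor: the privileged-op intercept queues the CCW-translation work and returns to the guest immediately, while the otherwise-idle processor completes it asynchronously.

```python
# Sketch of offloading SIOF emulation to an otherwise-idle 2nd processor.
# The worker thread stands in for the SIGP'd 3081 CPU; the queue stands in
# for the asynchronous task handed to it.  Hypothetical, illustrative only.
import queue
import threading

work_queue = queue.Queue()   # "async tasks" signaled to the other processor
completed = []               # stands in for pending simulated I/O interrupts

def second_processor():
    """The otherwise-idle 2nd CPU: performs CCW translation asynchronously."""
    while True:
        ccws = work_queue.get()
        if ccws is None:                           # shutdown sentinel
            break
        translated = [op.upper() for op in ccws]   # placeholder for translation
        completed.append(translated)               # would present an I/O interrupt
        work_queue.task_done()

def simulate_siof(ccws):
    """Privileged-op intercept: queue the work, return to the guest at once."""
    work_queue.put(ccws)       # analogous to create-async-task + SIGP

worker = threading.Thread(target=second_processor, daemon=True)
worker.start()

simulate_siof(["read", "tic", "write"])   # guest issues SIOF; control returns
work_queue.join()                         # wait for the "other processor"
work_queue.put(None)
worker.join()
print(completed)                          # [['READ', 'TIC', 'WRITE']]
```

the point of the trade-off is visible even in the toy: the intercept path itself got longer (queueing, signaling, locking), but the expensive part runs concurrently with the guest when the second processor has nothing better to do.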
Creation of the asynchronous task, signaling the other processor, taking the interrupts, plus misc. multiprocessing locking/unlocking drove up total avg "overhead" by ten to fifteen percent. However, the SIOF and ccw translation offloaded to run asynchronously on the idle 3081 processor resulted in a net thruput benefit for the single acp/tpf scenario. The "problem" was that the implementation drove up the total avg "overhead" by ten to fifteen percent for every customer running VM on a multiprocessor ... even those where the other processors weren't idle.

For pure topic drift ... there is something analogous going on in the current environment with multi-core processors being introduced into the desktop/laptop (personal) computing environment.

Eventually, a 3081 with the 2nd processor removed was announced as the 3083 (for acp/tpf customers). Since the 3081 still had the cross-cache chatter 10 percent cycle slowdown scenario i've described for 370 multiprocessors ... they were able to run the single 3081 (aka 3083) processor nearly 15 percent faster. even later still, acp/tpf eventually supported multiprocessors.

"Active wait" was another such experiment ... where a specific hardware configuration and workload gained a couple percent if the system effectively polled for something to do.

from long ago and far away:

To: wheeler
DATE: 04/19/85 20:58:47

On 4/10/85 xxxxxx presented his latest results to management and others and I thought you might be interested to hear how we stack up against HPO. These are runs of VM/XA SF1 (which is Mig. Aid releases 3 and 4 rolled up into one package now), with about 2K LOC of enhancements to boost the performance. The enhancements include processor-local true runlists and "active wait", with a master-only runlist also. They also include a significant rework of the drum paging code and a rework of the SSKE code (for non-resident pages only?). And other things which I just forget now. All these things collectively saved a whole lot of execution time.
As a result, SF1 now can handle 80% of the number of CMS users that HPO can handle, whereas earlier it was only about 60% as many as HPO.

... snip ...

now, the HPO base they are referring to still had the 10-15 percent multiprocessor "penalty" that had been introduced for the acp/tpf environment. There was also a list of a dozen or so other carefully chosen workload and configuration items to try and weight the comparison in VM/XA SF1's favor (a truly trivial CMS workload, trivial paging activity, a homogeneous well-behaved CMS workload ... but lots of them). I don't remember the exact VM/XA SF1 processor-cycle trade-off for "active wait" vis-a-vis actually being in wait state (and VM/XA was a totally different implementation from VM/HPO).

The "active wait" was along the same lines as the "delayed queue drop" fix from the same era. There was a bug in identifying "idle" activity and dropping "idle" tasks from the active queue (decommitting resources) ... for some virtual machines under some circumstances. Rather than fix the bug ... whenever something was identified as idle, a couple-hundred-millisecond delay was introduced before the actual "queue drop" (for everybody). Under specific configuration & workload conditions, systems showed improved thruput with the "delayed queue drop" fix ... but it could make thruput in other environments worse. The actual solution should have been to fix the underlying bug ... rather than layering fixes on top.
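the "active wait" trade-off can be sketched in miniature ... a hypothetical model (all names invented, nothing to do with the actual VM/XA SF1 code): when the runlist is empty, the dispatcher polls for newly-ready work instead of loading a wait-state PSW. polling burns cycles, but avoids the wake-up latency of taking an interrupt out of wait state ... a win only when those cycles would have been idle anyway.

```python
# Sketch of an "active wait" dispatcher: spin polling for work rather than
# entering a true wait state.  Hypothetical, illustrative only.
import collections

runlist = collections.deque()

def active_wait_dispatch(new_work_check, max_polls=1_000_000):
    """Poll for work; give up after max_polls (i.e. take a real wait state)."""
    polls = 0
    while not runlist:
        runlist.extend(new_work_check())   # poll: any newly-ready tasks?
        polls += 1
        if polls >= max_polls:
            return None, polls             # nothing found: enter true wait state
    return runlist.popleft(), polls

# toy work source: work becomes ready on the third poll
ready_at = {3}
tick = 0
def check():
    global tick
    tick += 1
    return ["task-A"] if tick in ready_at else []

task, polls = active_wait_dispatch(check)
print(task, polls)   # task-A 3
```

as with the SIOF hack, whether this helps depends entirely on configuration and workload ... the polled cycles are pure overhead on a machine that has other work to run.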
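the "delayed queue drop" fix amounts to adding hysteresis in front of the drop decision ... a minimal, hypothetical sketch (names and the class are invented, not actual VM code): instead of decommitting a virtual machine's resources the instant it looks idle, wait a couple hundred milliseconds; if it wakes up within the window, the (possibly mistaken) idle determination never takes effect.

```python
# Sketch of "delayed queue drop": a grace period between detecting idle
# and actually dropping a task from the active queue.  Hypothetical,
# illustrative only; DROP_DELAY stands in for the couple-hundred-ms delay.
DROP_DELAY = 0.2   # seconds of grace before the actual queue drop

class VirtualMachine:
    def __init__(self, name):
        self.name = name
        self.in_queue = True       # on the active queue, resources committed
        self.idle_since = None     # start of the current idle grace period

    def mark_idle(self, now):
        self.idle_since = now      # start the grace period; don't drop yet

    def mark_active(self):
        self.idle_since = None     # woke up in time: cancel the pending drop

    def maybe_drop(self, now):
        """Drop from the active queue only after the delay has elapsed."""
        if self.idle_since is not None and now - self.idle_since >= DROP_DELAY:
            self.in_queue = False  # decommit resources (the queue drop)

vm = VirtualMachine("CMSUSER1")
vm.mark_idle(0.0)
vm.maybe_drop(0.05)    # 50 ms later: still within the grace period
print(vm.in_queue)     # True
vm.maybe_drop(0.25)    # 250 ms later: really idle, drop it
print(vm.in_queue)     # False
```

the sketch also shows why this was a workaround rather than a fix: every virtual machine pays the delay (resources stay committed an extra 200 ms), whether or not it was one of the cases the idle-detection bug misclassified.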

