In <005a01cbff6c$b63dded0$22b99c70$@[email protected]>, on 04/20/2011 at 08:07 AM, Ron Hawkins <[email protected]> said:

>I remember spending some time playing with CPU affinity trying to
>keep the CPU bound jobs away from the AP
re:
http://www.garlic.com/~lynn/2011f.html#49 Dyadic vs AP: Was "CPU utilization/forecasting"
http://www.garlic.com/~lynn/2011f.html#50 Dyadic vs AP: Was "CPU utilization/forecasting"

360 & 370 had two-processor shared-memory multiprocessors, and although each processor had dedicated channels, installations tended to simulate shared channels by configuring the same channel numbers on both processors so they connected to the same controllers (at the same addresses, for controllers that supported multiple channel attachments) ... allowing I/O to be done to the same controller/device by both processors.

370 APs had channels on only one of the processors ... the other processor was purely for dedicated execution. I/O requests that originated on the attached processor (w/o channels ... or, in a multiprocessor, for a device only connected to a channel on the other processor) resulted in an internal kernel operation that handed off the I/O request to the processor with the appropriately connected channel.

one of the issues in cache machines was that high interrupt rates tended to have a very deleterious effect on cache-hit ratios (which translates to effective MIP rate) ... cache entries for the running application got replaced with cache entries for interrupt & device I/O handling ... and then possibly replaced again when switching back to the running application.

a gimmick I used in the early/mid 70s for cache machines ... when the observed I/O interrupt rate exceeded a threshold, start running disabled for I/O interrupts ... but with a periodic timer interrupt. At each timer interrupt, all pending interrupts would be "drained" under software control (using SSM to enable for I/O interrupts). The increased delay in taking an I/O interrupt was more than offset both by the increased cache-hit ratio (aka MIP rate) of the application running w/o interrupts, and by the increased cache-hit ratio (aka MIP rate) of effectively batch-processing multiple I/O interrupts.
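the interrupt-batching gimmick can be sketched as follows -- a minimal Python simulation, purely illustrative (the original was 370 kernel code using SSM to flip the I/O-interrupt mask; the threshold value and all names here are assumptions):

```python
from collections import deque

RATE_THRESHOLD = 100  # interrupts/sec above which to batch (assumed value)

class InterruptBatcher:
    def __init__(self):
        self.pending = deque()     # interrupts left pending while disabled
        self.batched_mode = False  # True = running disabled for I/O interrupts
        self.handled = 0

    def observe_rate(self, interrupts_per_sec):
        # switch modes when the observed I/O interrupt rate crosses the threshold
        self.batched_mode = interrupts_per_sec > RATE_THRESHOLD

    def device_interrupt(self, dev):
        if self.batched_mode:
            # disabled for I/O: the interrupt stays pending; the running
            # application keeps its cache contents intact
            self.pending.append(dev)
        else:
            self.handle(dev)       # take the interrupt immediately

    def timer_tick(self):
        # at the periodic timer interrupt, drain all pending interrupts
        # under software control (the SSM enable in the original); handling
        # them back to back keeps the handler's cache lines hot
        while self.pending:
            self.handle(self.pending.popleft())

    def handle(self, dev):
        self.handled += 1          # stand-in for real I/O interrupt handling
```

the trade being simulated: individual interrupts wait longer, but both the application and the batched interrupt handling run with better cache-hit ratios.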
For AP support, I also had a gimmick that tended to keep CPU-intensive operations on the processor w/o channels (a sort of natural CPU affinity).

two-processor 370 cache machines slowed the processor machine cycle by 10% (for multiprocessor operation, to take cross-cache communication into account) ... resulting in a two-processor machine having base hardware of 1.8 times a single processor ... multiprocessor software overhead then tended to result in a multiprocessor having 1.4-1.5 times the throughput of a uniprocessor. For the HONE 370APs, I could sometimes get two-processor throughput of more than twice single-processor throughput. HONE ran heavily compute-intensive APL applications ... although some would periodically do lots of I/O. The natural processor/cache affinity (improved MIP rate) increased throughput (along with extremely short multiprocessor support pathlengths) ... keeping the compute-intensive (non-I/O) application execution on the AP processor w/o channels.

misc. past posts mentioning (virtual machine based) HONE (US HONE & HONE clones around the world provided world-wide sales & marketing support):
http://www.garlic.com/~lynn/subtopic.html#hone

This got messed up in the early 3081 dyadic time-frame. Since ACP/TPF didn't have multiprocessor support (and it hadn't yet been decided to do the 3083) ... VM was "enhanced" to try to improve ACP/TPF 3081 virtual-machine throughput. A lot of VM pathlength that used to be serialized with virtual machine execution was made asynchronous ... running on the 2nd, presumably idle processor ... with lots of request queuing and processor "shoulder tapping" (the increase in overhead theoretically offset by the reduction in ACP/TPF elapsed time). However, for customers that had been running fully loaded (non-ACP/TPF) multiprocessor operation ... the transition to this new release represented significant degradation (the increased request queuing and shoulder tapping taking 10-15% of both processors).
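two of the mechanisms above -- the natural affinity of compute-bound work to the channel-less attached processor, and the kernel handoff of I/O requests originating there -- can be sketched as follows. This is hypothetical Python, not the original VM dispatcher; the threshold and all names are assumptions for illustration:

```python
from collections import deque

MAIN, AP = "main", "ap"    # MAIN owns the channels; the AP has none

io_handoff_queue = deque() # I/O requests the AP hands to the main processor

def pick_processor(recent_io_rate, io_threshold=5.0):
    # natural affinity: steer tasks doing little recent I/O onto the
    # channel-less AP, so compute-intensive work keeps its cache warm there
    # (threshold in I/O starts/sec is an assumed value)
    return AP if recent_io_rate < io_threshold else MAIN

def start_io(cpu, request):
    # I/O can only be started on the processor that has the channels
    if cpu == MAIN:
        return issue_sio(request)        # start it directly
    io_handoff_queue.append(request)     # queue for the main processor
    signal_main_processor()              # "shoulder tap" the other CPU
    return "queued"

def drain_handoffs():
    # runs on the main processor: start any I/O queued by the AP
    started = 0
    while io_handoff_queue:
        issue_sio(io_handoff_queue.popleft())
        started += 1
    return started

def issue_sio(request):
    return "started"  # stand-in for the real start-I/O path

def signal_main_processor():
    pass              # stand-in for the cross-processor signal
```

the point of the sketch: work that never calls start_io naturally stays on the AP with short pathlengths, while anything I/O-heavy gravitates to (or pays a handoff cost to reach) the processor with channels.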
Then there were a number of onsite visits at various large customers ... attempting other kinds of tuning to mask the enormously increased multiprocessor overhead in the new release (all the shoulder tapping also messed up the natural affinity characteristics ... motivating a large increase in explicitly specified processor affinity). old email mentioning a large gov. TLA ... trying to provide a variety of performance improvements to offset the multiprocessor overhead increase:
http://www.garlic.com/~lynn/2001f.html#email830402

--
virtualization experience starting Jan1968, online at home since Mar1970

