In <005a01cbff6c$b63dded0$22b99c70$@[email protected]>, on
 04/20/2011 at 08:07 AM, Ron Hawkins <[email protected]> said:
>I remember spending some time playing with CPU affinity trying to
>keep the CPU bound jobs away from the AP

re:
http://www.garlic.com/~lynn/2011f.html#49 Dyadic vs AP: Was "CPU 
utilization/forecasting"
http://www.garlic.com/~lynn/2011f.html#50 Dyadic vs AP: Was "CPU 
utilization/forecasting"

360&370 had two-processor shared-memory multiprocessors, and although each
processor had dedicated channels ... installations tended to simulate shared
channels by configuring the same channel numbers on both processors so they
connected to the same controllers (at the same addresses, for controllers
that supported multiple channel attachments) ... allowing I/O to the same
controller/device from either processor.

370 APs had channels on only one of the processors ... the other
processor was purely for dedicated execution. I/O requests that
originated on the attached processor (with no channels ... or, in a
multiprocessor, for a device connected only to a channel on the other
processor) ... resulted in an internal kernel operation that handed the
I/O request off to the processor with the appropriately connected channel.
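The kernel handoff described above can be sketched roughly as follows (a
minimal Python illustration, not the actual VM kernel code; the queue, the
function names, and the signalling are all my assumptions):

```python
from collections import deque

# Hypothetical in-memory request queue: the channel-less AP enqueues I/O
# requests here for the processor that actually owns the channels.
io_queue = deque()
completed = []

def issue_sio(request):
    """Stand-in for actually starting the channel program (SIO/SIOF)."""
    completed.append(request)

def start_io(request, cpu_has_channels):
    """On the processor with channels, issue the I/O directly; on the
    AP (no channels), queue the request for cross-processor handoff."""
    if cpu_has_channels:
        issue_sio(request)
    else:
        io_queue.append(request)  # real kernel would also signal the other CPU

def drain_handoff_queue():
    """Run on the processor with channels (e.g. after an inter-CPU signal):
    issue everything the AP has queued."""
    while io_queue:
        issue_sio(io_queue.popleft())
```

The same queue-and-drain shape covers both the AP case and the
multiprocessor case where a device is attached to only one processor's
channel.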

One of the issues on cache machines ... was that high interrupt rates
tended to have a very deleterious effect on cache-hit ratios (which
translates into effective MIP rate) ... cache entries for the running
application got replaced with cache entries for interrupt & device I/O
handling ... and then possibly replaced again when switching back to the
running application.

A gimmick I used in the early/mid 70s for cache machines ... was, when the
observed I/O interrupt rate exceeded a threshold ... to start running
disabled for I/O interrupts ... but with a periodic timer interrupt. At
the timer interrupt, all pending interrupts would be "drained" under
software control (using SSM to enable for I/O interrupts). The increased
delay in taking an I/O interrupt was more than offset by both the increased
cache-hit ratio (aka MIP rate) of the application running w/o
interrupts, and the increased cache-hit ratio (aka MIP rate) of
effectively batch-processing multiple I/O interrupts.
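The run-disabled/drain-at-timer-tick gimmick can be sketched like this (a
Python toy model purely for illustration; the threshold value, names, and
the pending list standing in for hardware-held interrupts are my
assumptions):

```python
# Toy model of interrupt batching: while the CPU runs disabled for I/O
# interrupts, the hardware holds them pending; we model that with a list.
pending = []
handled = []
THRESHOLD = 100  # hypothetical interrupts/sec rate that triggers the mode

def handle(irq):
    """Stand-in for the real interrupt / device I/O handler."""
    handled.append(irq)

def device_interrupt(irq, disabled):
    """If the CPU is disabled for I/O interrupts, the interrupt stays
    pending; otherwise it is taken immediately."""
    if disabled:
        pending.append(irq)
    else:
        handle(irq)

def timer_tick():
    """Periodic timer interrupt: briefly enable (the SSM in the original)
    and batch-process the whole backlog of pending I/O interrupts."""
    while pending:
        handle(pending.pop(0))
```

Batching the handlers this way keeps the application's cache lines intact
between ticks, and lets the interrupt-handling code itself stay cache-hot
while it runs through the backlog.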

For AP support, I also had a gimmick that tended to keep CPU-intensive
operations on the processor w/o channels (a sort of natural CPU
affinity). Two-processor 370 cache-machine operation slowed the
processor machine cycle by 10% (to allow for cross-cache communication
in multiprocessor operation) ... so a two-processor machine had a
hardware base of 1.8 times a single processor ... multiprocessor
software overhead then tended to leave the multiprocessor at 1.4-1.5
times the uniprocessor.
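The 1.8x and 1.4-1.5x figures follow directly from the 10% cycle slowdown;
a quick worked calculation (the ~20% software-overhead figure is my
assumption, chosen only to land in the stated 1.4-1.5 range):

```python
# Each CPU runs at 0.9x uniprocessor speed (10% machine-cycle slowdown
# for cross-cache communication), so the two-CPU hardware base is:
hardware_base = 2 * 0.9            # 1.8x a single processor

# Multiprocessor kernel overhead (locking, signalling, etc.) then eats
# a chunk of that; assuming roughly 20% overhead:
effective = hardware_base * (1 - 0.2)   # about 1.44x the uniprocessor
```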

For the HONE 370 APs, I could sometimes get two-processor thruput of more
than twice single-processor thruput. HONE was heavily compute-intensive
APL applications ... although some would periodically do lots of I/O. The
natural processor/cache affinity (improved MIP rate) increased thruput
(along with extremely short multiprocessor support pathlengths)
... by keeping compute-intensive (non-I/O) application execution on the
AP processor w/o channels. misc. past posts mentioning (virtual-machine
based) HONE (US HONE & the HONE clones around the world provided
world-wide sales & marketing support)
http://www.garlic.com/~lynn/subtopic.html#hone

This got messed up in the early 3081 dyadic time-frame. Since ACP/TPF
didn't have multiprocessor support (and it hadn't yet been decided to do
the 3083) ... VM was "enhanced" to try and improve ACP/TPF 3081 virtual
machine thruput. A lot of VM pathlength that used to be serialized
with virtual machine execution ... was made asynchronous ... running on
the 2nd, presumably idle processor ... with lots of request queuing and
processor "shoulder tapping" (the increase in overhead theoretically
offset by the reduction in ACP/TPF elapsed time). However, for customers
that had been running fully loaded (non-ACP/TPF) multiprocessor operation
... the transition to this new release represented significant
degradation (the increased request queuing and shoulder tapping took
10-15% of both processors).

There were then a number of onsite visits at various large customers
... attempting to perform other kinds of tuning to mask the
enormously increased multiprocessor overhead in the new release (all the
shoulder tapping also messed up the natural affinity characteristics ...
motivating a large increase in explicitly specified processor affinity).

old email mentioning large gov. TLA ... trying to provide a variety of
performance improvements to offset the multiprocessor overhead increase:
http://www.garlic.com/~lynn/2001f.html#email830402

-- 
virtualization experience starting Jan1968, online at home since Mar1970

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html
