The following message is a courtesy copy of an article
that has been posted to bit.listserv.ibm-main,alt.folklore.computers as well.


rfocht...@ync.net (Rick Fochtman) writes:
> I don't remember all the mods we made at NCSS, but one change that made 
> a BIG difference on the simplex and duplex 360/67's was this: in the CP 
> kernel, ALL SVC instructions were modified to a BAL to a specific 
> address in the first 4K of storage, where a "vector table" rerouted the 
> call to a specific CP "subroutine". All those interrupts and PSW swaps 
> took FOREVER on the 360/67, whereas a BAL to low storage SEEMED to fly 
> almost instantaneously. The change also seemed to be beneficial when we
> switched to 370/168 platforms as well. The CMS kernel used a HVC (in 
> actual fact, a DIAGNOSE) to request services from the CP kernel, 
> including I/O services. We also modified MVT to run in a virtual machine 
> using DIAGNOSE, rather than SIO/TIO/HIO, for I/O services. Made MVT run 
> MUCH FASTER in the virtual machine and freed us from all the related 
> emulation of these I/O instructions. One thing I miss: Grant wrote a 
> program, called IMAGE, that created a complete image of the CP kernel, 
> which would load in record time when bringing up the system. I wish I 
> had a copy of that program now, because of its rather unique processing 
> of the RLD data from the object code. I've never quite understood how 
> RLD data is processed by either the linkage editor or the loader. :-(

re:
http://www.garlic.com/~lynn/2008s.html#51 Computer History Museum
http://www.garlic.com/~lynn/2008s.html#52 Computer History Museum
http://www.garlic.com/~lynn/2008s.html#54 Computer History Museum

as an undergraduate ... before joining the science center ... I first
looked at the standard SVC linkage routine (used for all kernel calls)
and cut the pathlength by about 75%. I then looked at the most
frequently called subroutines ... and changed their calls to BALRs ...
leaving the remaining as SVC ... since SVC linkage no longer
represented a significant portion of CP overhead .... i.e. while
SVC/LPSW was expensive relative to BALR ... the actual time spent in
the original SVC linkage&return logic was much, much larger than the
SVC/LPSW instructions themselves ... so most of the benefit came from
reducing that logic. The BALR change then not only replaced the
SVC/LPSW instructions but also eliminated the rest of the
linkage/return logic for the high-use routines. When that was done,
the remaining SVC/LPSW (and associated linkage/return overhead) was a
trivial percentage of overall time spent in the kernel.
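
to give the flavor of the difference ... an illustrative sketch only
(labels, register conventions, and the routine name are made up, not
actual cp67 code):

* kernel call via SVC ... hardware interrupt, PSW swap, and then the
* original (long-winded) common linkage/return logic
         SVC   8                   invoke service routine number 8
* the same call via the BALR fastpath ... no interrupt, no PSW swap,
* and none of the common linkage logic for the high-use routines
         L     R15,=A(SOMERTN)     address of the service routine
         BALR  R14,R15             branch-and-link, return addr in R14
* the NCSS-style variant from the quoted note ... BAL into a low-core
* vector table that reroutes to the specific CP subroutine
         BAL   R14,VECTAB+8*4      index the vector by function number
VECTAB   B     RTN0                one branch per former SVC number
         B     RTN1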

The remaining big overhead wasn't so much the SIO instruction itself
... but the channel program simulation done in "CCWTRANS". CMS turned
out to issue very stylized disk channel programs. I created a fastpath
channel program emulation operation for CMS disk I/O (that was also
synchronous ... avoiding all the virtual machine gorp for entering
wait state, asynchronous interrupts, etc). This got severely
criticized by the people at the science center (mostly bob adair)
because it violated the 360 principles of operation. However, it did
significantly reduce cp67 kernel overhead for operating CMS virtual
machines. This was then redone using the "DIAGNOSE" instruction ...
since the 360 principles of operation defines "DIAGNOSE" instruction
operation as model-dependent. The facade was that there was a 360
"virtual machine" model which had its own definition of DIAGNOSE
instruction operation.
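
the diagnose flavor looked something like the following sketch ...
DIAGNOSE had no assembler mnemonic, so it got hand-coded as a
constant; the device address and register choices here are purely
illustrative:

* synchronous CMS disk I/O via DIAGNOSE (illustrative sketch)
         LA    R2,X'190'           virtual device address of the disk
         LA    R4,CCWS             -> stylized seek/search/read CCWs
         DC    X'83240018'         DIAGNOSE R2,R4,X'18' ... returns
*                                  with the I/O already complete ...
*                                  no wait state, no asynchronous
*                                  interrupt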

Standard CP67 saved a core image of the loaded kernel to disk (routine
SAVECP), with a very fast loader sequence that brought that image back
into memory at IPL and then transferred to the CP67 startup routine
CPINIT. One of the people at the science center modified CP67 kernel
failure processing to write an image dump to a disk area and then
reload the saved kernel image from disk ... basically automagic
failure/restart. This is mentioned in one of the referenced stories at
the MULTICS website ... one of the people who supported the CP67
system at MIT (and later worked on MULTICS) had made a modification to
TTY/ASCII terminal line processing that would cause the system to
crash ... and one day CP67 crashed and automagically (fast) restarted
27 times in a single day (which helped instigate some MULTICS rewrite,
since MULTICS was taking an hour elapsed time to restart).

The cp67 kernel was undergoing a large amount of evolution, with new
function constantly being added. On a 768k real storage machine ...
every little bit hurt. So I did a little sleight of hand and created a
virtual address space that mapped the cp67 kernel image ... flagged
the standard portion as fixed ... but created an infrastructure that
allowed other portions to be paged in & out. This required enhancing
the SVC linkage infrastructure to recognize calls to portions of the
kernel that could be pageable (and do a page fetch operation before
doing the linkage).
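
the linkage check amounted to something like this (sketch only ...
labels and the PGFETCH routine are made up; the real code invoked the
paging logic directly, since the kernel ran in real mode and couldn't
simply take a page fault):

* svc linkage for a pageable kernel (illustrative sketch)
         L     R15,TARGET          resolved address of called routine
         C     R15,PAGEBNDY        below start of pageable area?
         BL    CALLIT              yes ... resident, call directly
         LR    R1,R15              no ... fetch & lock the kernel page
         BAL   R14,PGFETCH         (hypothetical) page fetch operation
CALLIT   BALR  R14,R15             then do the actual linkage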

The standard CP67 kernel was built up of "card decks" which had the BPS
loader slapped on the front and "IPL'ed" (either on the real machine or
in a virtual machine). Once the BPS loader had all the routines resolved
in real storage ... it would transfer to SAVECP ... which wrote the core
image to disk (for later IPL). It turns out that the BPS loader also
passed (in registers) the pointer to the resolved (RLD) symbol table.  I
then changed SAVECP to move the BPS (RLD) symbol table to the end of the
(pageable) kernel image ... so that it was also saved to disk (as part
of the pageable kernel area).
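
as to the quoted question about RLD processing ... at its core a
loader's RLD pass is just a loop adjusting each flagged address
constant ("adcon") by the relocation factor ... an illustrative sketch
with made-up labels (the real RLD format also carries flag bytes,
lengths, and ESD references):

* applying RLD entries at load time (illustrative sketch)
         L     R2,NUMRLD           number of RLD entries
         L     R3,RLDTAB           -> table of adcon offsets
         L     R4,RELOFAC          relocation (load - link origin)
RELOOP   L     R5,0(,R3)           offset of next adcon in the text
         AL    R5,TEXTORG          -> the adcon itself in storage
         L     R6,0(,R5)           fetch the assembled adcon value
         ALR   R6,R4               relocate it
         ST    R6,0(,R5)           and store it back
         LA    R3,4(,R3)           step to the next RLD entry
         BCT   R2,RELOOP           until all entries processed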

I ran into a major problem ... the BPS loader only supported up to 256
external symbols. As part of reorg'ing parts of the kernel to make it
pageable ... i split modules into 4k-byte "chunks" ... creating a lot of
new external symbols. This initially overflowed the BPS loader's 256
external symbol limit ... and so I had to resort to all sorts of hacks
to keep the number of external symbols within the 256 limit. Much later
at the science center ... I found a source copy of the BPS loader in an
old card cabinet that was in storage ... and could then modify the BPS
loader to extend the external symbol table maximum.

for additional drift ... in the initial work to convert MVT into VS2
... some virtual address tables and page fault processing were hacked
into the side of MVT ... and a copy of CCWTRANS was borrowed from CP67
(i.e. VS2 had the same issue translating application channel programs
passed by EXCP ... as CP67/VM370 had translating virtual machine
channel programs). A small sketch of what that translation involves
follows the references. Past posts with references to CCWTRANS:
http://www.garlic.com/~lynn/2008g.html#45 authoritative IEFBR14 reference
http://www.garlic.com/~lynn/2008i.html#68 EXCP access methos
http://www.garlic.com/~lynn/2008i.html#69 EXCP access methos
http://www.garlic.com/~lynn/2008m.html#7 Future architectures
http://www.garlic.com/~lynn/2008o.html#50 Old XDS Sigma stuff
http://www.garlic.com/~lynn/2008q.html#31 TOPS-10
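
the heart of what CCWTRANS-type code has to do is build a "shadow"
copy of each virtual CCW with real addresses ... a heavily simplified,
370-flavored sketch (the real thing also handled data areas crossing
page boundaries, data chaining, TICs, status modifiers, etc):

* build a shadow CCW with real addresses (illustrative sketch)
         MVC   SHADOW(8),0(R6)     copy the virtual CCW
         L     R5,0(,R6)           first word: command + data address
         LA    R5,0(R5)            isolate the 24-bit data address
         LRA   R5,0(,R5)           translate it (page pinned first)
         BNZ   PAGEIN              not resident ... bring it in, retry
         STCM  R5,B'0111',SHADOW+1 real address into the shadow CCW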

The thing missing from the automagic fast restart ... was the growing
number of "service virtual machines" that had to be brought up manually
... i.e. the DUSETIMR performance monitor machine, the VNET networking
machine, and a growing number of others. These "service virtual
machines" are analogous to the current genre of "virtual appliances"
found in the latest incarnation of virtual machine technology.

As part of the performance work on cp67 and then moving to vm370 ...  I
also did a lot of benchmarking work. One of the things that I wanted to
do was automate the benchmarking process ... lots of past posts with
references
http://www.garlic.com/~lynn/submain.html#benchmark

For this, I created the "AUTOLOG" command ... where a virtual machine
could automagically logon other virtual machines ... including passing
an initial startup command to that virtual machine. Then DMKCPI (the
vm370 rename of CPINIT) was modified to do a special-case execution of
the AUTOLOG command for a specific virtual machine (which would then
handle all the other AUTOLOGs). As mentioned in other places, as part
of the final sequence for the release of my (vm370) resource manager
... i ran a series of 2000 (automated) benchmarks that took 3 months
elapsed time (as part of final calibration and verification).
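
usage was along the following lines (illustrative only ... syntax from
memory, an initial command could be appended, passwords elided):

autolog dusetimr xxxxxxxx           performance monitor machine
autolog vnet xxxxxxxx               networking machine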

However, the AUTOLOG command also got a lot of use as part of
automating the other parts of system bringup (in addition to just
getting the bare-bones kernel operational). A few past posts mentioning
the AUTOLOG command:
http://www.garlic.com/~lynn/2002q.html#28 Origin of XAUTOLOG (x-post)
http://www.garlic.com/~lynn/2005.html#59 8086 memory space
http://www.garlic.com/~lynn/2006g.html#34 The Pankian Metaphor
http://www.garlic.com/~lynn/2007d.html#23 How many 36-bit Unix ports in the old days?
http://www.garlic.com/~lynn/2007n.html#10 The top 10 dead (or dying) computer skills
http://www.garlic.com/~lynn/2007r.html#68 High order bit in 31/24 bit address
http://www.garlic.com/~lynn/2007s.html#41 Age of IBM VM
http://www.garlic.com/~lynn/2008m.html#42 APL

As mentioned in previous references ... one of the things I did after
joining the science center ... was a page-mapped filesystem for CMS.
The diagnose I/O API was specifically oriented towards drastically
reducing the pathlength overhead associated with CMS I/O. However,
there were still a large number of performance issues related to
simulating a "real address I/O" paradigm in a virtual address
environment. The page-map changes retained the high-level CMS
filesystem paradigm while remapping the underlying implementation to a
page-mapped infrastructure (a small conceptual sketch follows the
reference). Misc. past posts mentioning the page-mapped infrastructure
http://www.garlic.com/~lynn/submain.html#mmap
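
in miniature, the difference (a conceptual sketch only ... FWINDOW is
a made-up name for wherever the file image gets mapped):

* traditional path: build the CCW chain, issue the DIAGNOSE, and take
* the (simulated) completion
* page-mapped path: the file's blocks are associated with virtual
* pages ... a storage reference is all it takes
         L     R2,=A(FWINDOW)      window the file image is mapped at
         L     R3,0(,R2)           first touch page-faults ... the CP
*                                  paging subsystem brings the block in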

There were some benchmark comparisons running the same CMS, mixed-mode,
moderately filesystem-intensive workload ... one using the underlying
traditional CMS filesystem ... and the same CMS, workload, and CMS
filesystem semantics ... but with the underlying page-mapped
implementation ... where the page-mapped flavor had three times the
throughput of the traditional non-page-mapped flavor.

-- 
40+yrs virtualization experience (since Jan68), online at home since Mar70
