The following message is a courtesy copy of an article
that has been posted to bit.listserv.ibm-main,alt.folklore.computers as well.

Anne & Lynn Wheeler <[EMAIL PROTECTED]> writes:
> in the early 80s ... "big pages" were implemented for both VM and MVS.
> this didn't change the virtual page size ... but changed the unit of
> moving pages between memory and 3380s ... i.e. "big pages" were
> 10 4k pages (3380) that moved to disk and were fetched back in from
> disk. a page fault for any 4k page in a "big page" ... would result
> in the whole "big page" being fetched from disk.

re:
http://www.garlic.com/~lynn/2006r.html#35 REAL memory column in SDSF

"big pages" support shipped in VM HPO3.4 ... it was referred to as
"swapper" ...  however the traditional definition of swapping has been
to move all storage associated with a task in single unit ... I've
used the term of "big pages" ... since the implementation was more
akin to demand paging ... but in 3380 track sized units (10 4k pages).
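
as a toy illustration of the idea (a small C sketch, not the actual
HPO3.4 implementation ... the page numbers and counts below are made
up, the only real parameter is the 10x4k track grouping described
above): a fault on any 4k page belonging to a "big page" brings the
whole 10-page group in from disk with a single transfer.

  /* toy sketch (not the HPO3.4 code): demand paging in 3380 track-sized
     units -- a fault on any 4k page in a "big page" brings the whole
     10-page group in from disk in one transfer. */

  #include <stdio.h>
  #include <stdbool.h>

  #define PAGE_SIZE     4096          /* 4k virtual page              */
  #define PAGES_PER_BIG 10            /* one 3380 track = 10 x 4k     */
  #define NUM_PAGES     100           /* toy address space            */

  static bool in_storage[NUM_PAGES];  /* is this 4k page resident?    */
  static int  disk_reads;             /* disk i/o operations issued   */

  /* fault on virtual page 'vpn': fetch the containing "big page" */
  static void page_fault(int vpn)
  {
      int first = (vpn / PAGES_PER_BIG) * PAGES_PER_BIG;
      disk_reads++;                   /* one i/o for the full track   */
      for (int i = first; i < first + PAGES_PER_BIG && i < NUM_PAGES; i++)
          in_storage[i] = true;
      printf("fault on page %d: pages %d-%d read in one %d-byte transfer\n",
             vpn, first, first + PAGES_PER_BIG - 1, PAGES_PER_BIG * PAGE_SIZE);
  }

  static void touch(int vpn)
  {
      if (!in_storage[vpn])
          page_fault(vpn);
  }

  int main(void)
  {
      touch(23);                      /* pages 20-29 come in together */
      touch(27);                      /* already resident, no i/o     */
      touch(42);                      /* pages 40-49 come in together */
      printf("disk reads: %d\n", disk_reads);
      return 0;
  }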

from vmshare archive ... discussion of hpo3.4
http://vm.marist.edu/~vmshare/browse?fn=34PERF&ft=MEMO

and mention of hpo3.4 swapper from melinda's vm history
http://vm.marist.edu/~vmshare/browse?fn=VMHIST05&ft=NOTE&args=swapper#hit

vmshare was an online discussion forum provided by tymshare to the
SHARE organization starting in the mid-70s, running on tymshare's
vm370-based commercial timesharing service ... misc. past posts
referencing various vm370-based commercial timesharing services
http://www.garlic.com/~lynn/subtopic.html#timeshare

in the original 370, there was support for both 2k and 4k pages
... and the page size used as the unit of managing real storage for
virtual memory was also the unit of moving virtual memory between real
storage and disk. the smaller page size tended to better optimize
constrained real storage (i.e. compared to 4k pages, an application
might actually only need the first half or the last half of a specific
4k page; with 2k pages the unused half wouldn't have to be resident,
so the application could effectively execute in less total real
storage).
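
a bit of illustrative arithmetic (the application below is made up
... it just touches half of each 4k region of its address space, which
is the scenario described above ... no 370 measurements involved):

  /* illustrative arithmetic only: real storage needed when an application
     touches only the first half of each 4k region, under 4k vs 2k pages. */

  #include <stdio.h>

  /* round the touched bytes in a region up to whole pages of size 'page' */
  static long resident(long bytes_touched, long page)
  {
      long pages = (bytes_touched + page - 1) / page;
      return pages * page;
  }

  int main(void)
  {
      long regions = 64;               /* 64 x 4k = 256k address space   */
      long touched = 2048;             /* only half of each 4k is used   */

      long with_4k = regions * resident(touched, 4096);
      long with_2k = regions * resident(touched, 2048);

      printf("real storage with 4k pages: %ld bytes\n", with_4k); /* 262144 */
      printf("real storage with 2k pages: %ld bytes\n", with_2k); /* 131072 */
      return 0;
  }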

the issue mentioned in this post
http://www.garlic.com/~lynn/2006r.html#36 REAL memory column in SDSF
and 
http://www.garlic.com/~lynn/2001l.html#46 MVS History (all parts)
http://www.garlic.com/~lynn/2006f.html#3 using 3390 mod-9s

was that systems had shifted from having excess disk i/o resources to
disk i/o resources being a major system bottleneck ... an issue also
discussed here in the context of CKD DASD architecture
http://www.garlic.com/~lynn/2006r.html#31 50th Anniversary of invention of disk drives
http://www.garlic.com/~lynn/2006r.html#33 50th Anniversary of invention of disk drives

with the increasing amounts of real storage ... there was more and
more of a tendency to leverage the additional real storage to
compensate for the declining relative system disk i/o efficiency.

this was seen in the mid-70s with the vs1 "hand-shaking" that was done
somewhat in conjunction with the ECPS microcode enhancement for the
370 138/148.
http://www.garlic.com/~lynn/94.html#21 370 ECPS VM microcode assist

VS1 was effectively MFT laid out to run in a single 4mbyte virtual
address space with 2k paging (somewhat akin to os/vs2 svs mapping MVT
into a single 16mbyte virtual address space). In vs1 hand-shaking, vs1
was run in a 4mbyte virtual machine with a one-to-one correspondence
between the 2k virtual pages of the vs1 4mbyte virtual address space
and the 4mbyte virtual machine address space.

VS1 hand-shaking effectively turned over paging to the vm virtual
machine handler (vm would present a special page fault interrupt to
the vs1 supervisor ... and then when vm had finished handling the page
fault, present a page complete interrupt to the vs1 supervisor). Part
of the increase in efficiency was eliminating duplicate paging when
VS1 was running under vm. However, part of the efficiency improvement
was that VM was doing demand paging with 4k transfers rather than
VS1's 2k transfers. In fact, there were situations where VS1 running
on a 1mbyte 370/148 under VM had better thruput than VS1 running
stand-alone w/o VM (the other part of this was that my global LRU
replacement algorithm and my pathlength from handling the page fault
through doing the page i/o to completion were much better than the
equivalent VS1 code).
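
a rough sketch of the hand-shaking flow (just a toy C simulation of
the interrupt sequence described above ... not the actual vs1 or vm
code, and the function names are made up):

  /* toy simulation of vs1 hand-shaking: vm fields the page fault, presents
     a pseudo page-fault interrupt so the vs1 supervisor can dispatch other
     work, then presents a completion interrupt when the page-in finishes. */

  #include <stdio.h>

  enum task_state { RUNNING, WAITING_ON_PAGE };
  static enum task_state state = RUNNING;

  static void vm_reflect_pseudo_page_fault(int vpn)
  {
      printf("vm:  page %d not in real storage, starting 4k page-in\n", vpn);
      printf("vm:  presenting pseudo page-fault interrupt to vs1\n");
      state = WAITING_ON_PAGE;         /* vs1 marks the task waiting      */
      printf("vs1: task waits, dispatcher runs other work meanwhile\n");
  }

  static void vm_reflect_page_complete(int vpn)
  {
      if (state != WAITING_ON_PAGE)
          return;                      /* nothing outstanding             */
      printf("vm:  page %d resident, presenting completion interrupt\n", vpn);
      state = RUNNING;                 /* vs1 makes the task dispatchable */
      printf("vs1: task redispatched\n");
  }

  int main(void)
  {
      vm_reflect_pseudo_page_fault(7); /* paging handled by vm, not vs1   */
      vm_reflect_page_complete(7);
      return 0;
  }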

there were two issues with the 3380. the first was that, over the
years, disk i/o had become increasingly a significant system
bottleneck; more specifically, latency per disk access (arm motion and
avg. rotational delay) was significantly lagging behind improvements
in other system components, so part of compensating for disk i/o
access latency was to significantly increase the amount transferred
per operation. the other was that the 3380 increased the transfer rate
by a factor of ten while its access performance (arm motion and
rotational delay) only improved by a factor of 3-4. significantly
increasing the amount transferred per access also better matched the
changes in disk technology over time (note that later technology
introduced raid, which did large transfers across multiple disk arms
in parallel).
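
some back-of-the-envelope arithmetic (using rough nominal 3380-class
figures ... on the order of 16ms average seek, 8.3ms average
rotational delay, 3mbytes/sec data rate ... the exact numbers aren't
the point) shows why moving a track's worth of pages per arm access
pays off:

  /* rough arithmetic: cost per 4k page moved, one page per access vs ten
     pages (one 3380 track) per access. figures are approximate nominal
     values, used only to show the shape of the trade-off. */

  #include <stdio.h>

  int main(void)
  {
      double seek_ms   = 16.0;   /* average arm motion                   */
      double rot_ms    = 8.3;    /* average rotational delay             */
      double kb_per_ms = 3.0;    /* 3 mbytes/sec data rate = 3 kbytes/ms */

      double one_page  = seek_ms + rot_ms +  4.0 / kb_per_ms;
      double ten_pages = seek_ms + rot_ms + 40.0 / kb_per_ms;

      printf("one 4k page per access:  %.1f ms/operation, %.1f ms/page\n",
             one_page, one_page);
      printf("ten 4k pages per access: %.1f ms/operation, %.1f ms/page\n",
             ten_pages, ten_pages / 10.0);
      return 0;
  }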

full track caching is another approach that attempts to leverage the
relative abundance of electronic memory (in the drive or controller)
to compensate for the relatively high system cost of each disk arm
access. part of this is starting the transfer (to the cache) as soon
as the arm has settled ... even before the requested record has
rotated under the head. disk rotation is part of the bottleneck ... so
full track caching goes ahead and transfers the full track during the
rotation ... on the off chance that the application might have some
need for the rest of the data on the track (the electronic memory in
the cache is relatively free compared to the high system cost of each
arm access and rotational delay).
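
a toy sketch of the full track caching idea (not any particular
controller's implementation ... record counts and numbers are made
up): one arm access stages the whole track into the cache during a
single rotation, starting from whatever record happens to be under the
head, and later requests for other records on that track become cache
hits.

  /* toy sketch: one arm access fills the cache with the whole track,
     starting from the record currently under the head and wrapping
     around; later reads of that track avoid further arm accesses. */

  #include <stdio.h>
  #include <stdbool.h>

  #define RECS_PER_TRACK 10

  static bool cached[RECS_PER_TRACK];
  static int  arm_accesses;

  static void read_record(int rec_under_head, int wanted)
  {
      if (cached[wanted]) {
          printf("record %d: cache hit, no disk access\n", wanted);
          return;
      }
      arm_accesses++;                          /* seek + settle once      */
      /* stage the entire track during one rotation, beginning with the
         record that happens to be passing under the head */
      for (int i = 0; i < RECS_PER_TRACK; i++)
          cached[(rec_under_head + i) % RECS_PER_TRACK] = true;
      printf("record %d: full track staged in one rotation\n", wanted);
  }

  int main(void)
  {
      read_record(4, 7);    /* arm settles at record 4; record 7 wanted */
      read_record(4, 2);    /* now a cache hit                          */
      read_record(4, 9);    /* cache hit                                */
      printf("arm accesses: %d\n", arm_accesses);
      return 0;
  }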

there is a separate system optimization with respect to physical page
size. making the physical page size smaller allowed better optimizing
of relatively scarce real storage. with the shift in system bottleneck
from constrained real storage to constrained i/o ... it was possible
to increase the amount of data paged per operation w/o actually going
to a larger physical page size (by transferring multiple pages at a
time ... as in the "big page" scenario).

there is periodic discussion in comp.arch about the advantages of
going to much bigger (hardware) page sizes ... 64kbytes, 256kbytes,
etc ... as part of increasing TLB (translation lookaside buffer)
performance. the translation of a virtual address to a physical real
storage address relies on entries cached in the TLB. A task switch may
result in the need to change TLB entries ... where hundreds of TLB
entries (one for each application 4k virtual page) may be
involved. For some loads/configurations, the TLB reload latency may
become a significant portion of the task switch elapsed time. Going to
much larger page sizes ... reduces the number of TLB entries ... and
possible TLB entry reloads ... necessary for running an application.
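
some illustrative arithmetic (the 128-entry TLB and 16mbyte working
set below are made-up example values): larger page sizes mean fewer
TLB entries ... and fewer potential reloads after a task switch ... to
cover the same amount of virtual memory.

  /* illustrative arithmetic only: TLB entries needed to map a 16mbyte
     working set, and the "reach" of a 128-entry TLB, at 4k/64k/256k
     page sizes. both sizes are made-up example values. */

  #include <stdio.h>

  int main(void)
  {
      long working_set = 16L * 1024 * 1024;       /* 16mbyte application */
      long tlb_entries = 128;                     /* assumed TLB size    */
      long page_sizes[] = { 4L * 1024, 64L * 1024, 256L * 1024 };

      for (int i = 0; i < 3; i++) {
          long ps      = page_sizes[i];
          long entries = working_set / ps;        /* entries to map it all */
          long reach   = tlb_entries * ps;        /* bytes the TLB covers  */
          printf("%4ldk pages: %5ld entries to map 16mbytes, TLB reach %ldk\n",
                 ps / 1024, entries, reach / 1024);
      }
      return 0;
  }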

