Re: SHAREWARE at Its Finest

Anne & Lynn Wheeler Sun, 28 Feb 2010 21:33:38 -0800

The following message is a courtesy copy of an article
that has been posted to bit.listserv.ibm-main,alt.folklore.computers as well.

t...@harminc.net (Tony Harminc) writes:
> I don't know how similar the 158 and 155 really were (certainly very
> different front panel implementations), but it's interesting that the
> 303x got the microcoded channels, rather than the clunky but rock
> solid 28x0 hardwired ones.

re:
http://www.garlic.com/~lynn/2010e.html#27 SHAREWARE at Its Finest

as i've mentioned before ... happening to mention the 15min MTBF issue
internally at the time brought down the wrath of the mvs organization on
my head.

the 3081 channel was a lot like the 158 also.

some discussion about the 3081 ...  another somewhat quick effort in the
wake of the failure of FS;
http://www.jfsowa.com/computer/memo125.htm

misc. past posts mentioning future system
http://www.garlic.com/~lynn/submain.html#futuresys

misc. past posts getting to play disk engineer in bldgs. 14&15
http://www.garlic.com/~lynn/subtopic.html#disk

i had been doing some timing tests on how short a "dummy" record was
needed to do a track head switch (seek head) between two rotationally
consecutive records on different tracks (involving both channel
processing latency and control unit processing latency). 168, 145, 4341
could successfully do the switch with a shorter block than 158, 303x, and
3081. There were also some number of OEM disk controllers that had lower
latency and required a smaller dummy record ... less rotational delay
to cover the processing latency of the seek head operation.
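
A rough way to see the dummy-record arithmetic: the filler has to occupy
at least as much rotation time as the channel plus control unit need to
process the seek head. The sketch below is only an illustration; the
function names, the latency value, and the track/rpm figures are
assumptions for the example, not measurements from the tests, and
inter-record gaps are ignored.

```python
def min_dummy_bytes(latency_us, bytes_per_track, rpm):
    """Smallest filler (in bytes) whose rotation time covers latency_us.

    All parameters are illustrative assumptions; gaps between records
    are ignored, so a real dummy record could be somewhat smaller.
    """
    # bytes passing under the head per minute = bytes_per_track * rpm
    bytes_times_us = latency_us * bytes_per_track * rpm
    us_per_minute = 60 * 1_000_000
    # integer ceiling division, avoids floating point rounding
    return -(-bytes_times_us // us_per_minute)


def missed_revolution_ms(rpm):
    """Cost of a dummy record that is too short: one full extra rotation."""
    return 60_000 / rpm


if __name__ == "__main__":
    # hypothetical numbers: 100us vs 200us combined channel + controller latency
    for latency in (100, 200):
        print(latency, "us ->", min_dummy_bytes(latency, 13030, 3600), "bytes")
    print("missed revolution costs", round(missed_revolution_ms(3600), 2), "ms")
```

The point of the comparison is that a slower channel or controller
needs a proportionally longer dummy record, and getting it wrong costs
a full revolution rather than a few microseconds.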

The 3830 disk controller was a horizontal microcode engine that was much
faster than the vertical microcode engine (jib-prime) used in the 3880
disk controller. To compensate for the slower processing and also handle
higher data rates ... there was dedicated hardware for data flow
... separate from the control processing done by the jib-prime.
Data-streaming was also introduced (no longer having to do a handshake
for every byte transferred), which helped both with supporting
3mbyte/sec transfers and with allowing max. channel length to be
increased from 200ft to 400ft.

There was a requirement at the time that the 3880 had to be within +/-
5 percent of the 3830 ... they ran some batch operating system
performance tests in STL and it didn't quite make it ... so they tweaked
the 3880 control to present the operation complete interrupt ... before
the 3880 had actually fully completed all the operations (to appear to
be "within" five percent of the 3830). Then if the 3880 discovered
something in error in its cleanup work ... it would present an
asynchronous unit check. I told them that was a violation of the
architecture ... at which time they dragged me into resolution
conference calls with the channel engineers in POK. Finally they decided
that they would save up the unit check error condition ... and present
it as cc=1, csw-stored, unit check on the next sio ("unsolicited" unit
checks were a violation of channel architecture).
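
The "save it up" resolution can be sketched as a tiny state model. This
is only an illustration of the behavior described above, not the 3880
microcode; the class and method names are made up for the example.

```python
class Controller:
    """Toy model of a controller that defers a cleanup error (hypothetical)."""

    def __init__(self):
        self.pending_unit_check = False

    def cleanup_found_error(self):
        # error found after operation complete was already presented:
        # remember it instead of presenting an unsolicited unit check
        self.pending_unit_check = True

    def sio(self):
        """Return (condition code, status) for the next start i/o."""
        if self.pending_unit_check:
            self.pending_unit_check = False
            return (1, "csw-stored, unit check")
        return (0, "started")


if __name__ == "__main__":
    ctl = Controller()
    ctl.cleanup_found_error()
    print(ctl.sio())  # the saved-up error is reported on this sio
    print(ctl.sio())  # a later sio starts normally
```

The saved condition is reported exactly once, and always in response to
an sio, so the channel never sees an unsolicited unit check.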

so everybody seemed to be happy. then one monday morning the bldg. 15
engineers called me up and asked me what i did over the weekend to trash
the performance of the (my) vm370 system they were running.  I claimed
to have done nothing ... they claimed to have done nothing.  Finally it
was determined that they had replaced a 3830 that they were using with
a string of 16 3330 cms drives ... with a 3880. While their batch os
acceptance test didn't have a problem ... i had severely optimized the
pathlength for i/o redrive (of queued operations) after an i/o
completion. My redrive sio managed to hit the 3880 with the next
operation while the 3880 was still busy cleaning up the previous
operation (they had assumed that they could get it done faster than
operating system interrupt processing). Because the controller was still
busy, I would get cc=1, csw-stored, sm+busy (aka controller busy). The
operation then would have to be requeued and the system would go off to
look for something else to do. Then because the controller had signaled
SM+BUSY, it was obligated to do a CUE interrupt. The combination of the
3880's slower processing and all the extra operating system processing
gorp ... was degrading their interactive service by a severe 30 percent
(which was what had prompted the monday morning call).
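
The race between the redrive pathlength and the controller cleanup can
be laid out as an event sequence. This is a minimal sketch under assumed
timings; the function name and the microsecond values are hypothetical,
chosen only to show the two cases.

```python
def redrive_events(os_pathlength_us, cleanup_us):
    """Events the operating system sees for one redrive (toy model).

    os_pathlength_us: time from i/o completion interrupt to the redrive sio
    cleanup_us: time the controller stays busy after presenting completion
    Both values are assumptions for illustration.
    """
    events = ["i/o completion interrupt"]
    if os_pathlength_us >= cleanup_us:
        # slow redrive: controller has finished its cleanup, sio is accepted
        events.append("sio cc=0, started")
        return events
    # fast redrive: sio arrives while the controller is still cleaning up
    events.append("sio cc=1, csw-stored, sm+busy")
    events.append("requeue operation")
    events.append("cue interrupt")
    events.append("sio cc=0, started")
    return events


if __name__ == "__main__":
    print(redrive_events(os_pathlength_us=300, cleanup_us=200))  # batch-style path
    print(redrive_events(os_pathlength_us=50, cleanup_us=200))   # optimized redrive
```

In this model the optimized path pays for a rejected sio, a requeue, and
an extra interrupt on every operation, which is the kind of overhead the
batch acceptance test never exercised.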

While the 3880 batch performance "acceptance" tests had eventually
passed ... somewhat related to the earlier 15min MTBF issue ... much
nearer 3880 product ship, engineers had developed a regression test
suite of 57 expected errors. old email mentions that for all the
errors in the regression test, mvs required reboot ... and in 2/3rds of
the cases, there was no evidence of what required mvs to be rebooted.
Recent post mentioning the issue
http://www.garlic.com/~lynn/2010d.html#59 LPARs: More or Less?

old posted email mentioning the problem:
http://www.garlic.com/~lynn/2007.html#email801015

-- 
42yrs virtualization experience (since Jan68), online at home since Mar1970

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@bama.ua.edu with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html
