Rick Fochtman wrote:

> Consider chewing gum and bailing wire??  :-)

An old war story is always fun, n'est-ce pas?

I've told this one before here, but it is right up 
this little side alley Ed Jaffee started. This one 
involves a missing hammer.

I had standalone time on a 360/75 one weekend, from 
10 PM Saturday to 6 AM Sunday.  Very unusual, but I
needed every minute. I had made several passes (and
taken several standalone dumps -- on a fast 1403-N1
[1,100 lpm] printer) when it started red-lighting,
but only when I ran my new code. Of course a program
could not (or should not) cause a machine to halt
completely, but I'd already been around the block
enough times to know that if you did just the wrong 
thing, you could elicit some unwanted behavior, and
it might be days or even weeks before it could all
get resolved and management would realize that the
finger of blame should have been pointing at IBM 
all along. I knew what I had changed in my code,
so after a couple of false starts, I had branched
around a subroutine that was using some very simple
floating point instructions to accumulate some data
so as not to have to save and restore any of R0-R15; 
I found if I didn't call this code then no red light 
resulted. I figured I could step thru the subroutine 
instruction by instruction, to find the problem that
I had assumed was a bad core way up at the very high
end of LCS where I had hidden my data. That quickly
got tedious so I formatted the restart PSW at X'000'
to start up again after the call to the subroutine,
so I didn't have to re-IPL each time. I'd just hit
RESET and then RESTART. Back then, boys and girls,
there were no special control registers or machine
state that needed preserving: RESET would reset 
virtually everything; RESTART would simply load a 
new PSW from X'0' and off you could go again, with
OS/360 and your program still running. Any pending
I/O was toast, but there was none going 
on at that point, anyway. It took a while, but I
finally determined that the box was red-lighting
on an otherwise ordinary AW instruction, and no
LCS was involved. Even if I moved the AW to some
other location, or even if I EXECUTEd it (via EX),
the box still red-lighted. So it was time to call
the CE.

By then I had used and mostly wasted 4 hours. The
CE that showed up took only 15 minutes to arrive, 
because he lived nearby. I guess that's why they
gave him the call. But he was our CE's 2nd level 
manager and area specialist, so I was surprised.
I started to explain the problem, but he just said
"show me." So I re-IPLed and let it red-light. He
looked at the lights and just said "Oh! Floating."

He then opened the front frame, pulled out the gate, 
opened a card cover and ran his hands -- ever so 
gently -- over the ends of the cards inserted into 
the backplane there. I guessed he was feeling for a
hot card or something sophisticated like that. He 
found what he was looking for, grinned, walked over
to his tool briefcase, extracted a HUGE screwdriver,
took it by the business end (not the handle), and
proceeded to beat the living daylights out of the
end of a couple of cards on which his fingers had 
stopped. I mean he BEAT those cards like he was
chopping wood. I was speechless. He stopped, put
his screwdriver up, closed his case, put the gate
back and closed the cover, put his suit jacket
back on (yes, at this time, if an IBM employee
showed up at a customer site, they were in a dark
suit, white shirt, and appropriate tie, even if it
was 2 AM on a Sunday morning), and said, as he was
walking away, not even facing me, "try it now," and
left before I even had a chance to pick my jaw up
off the floor. Of course, the box (but not my code)
worked fine after that. 

I found him in the CE room later that week and, in
awe, asked him about it. He didn't seem to think he
had done anything slick. He was actually embarrassed
that it took him so long to remember on which cards
the floating logic was located (the 360/75 was not a
microcoded machine -- it was the first machine to
implement the S/360 instruction set completely in
hardware). I asked him how he knew it was a floating
point instruction that was causing the problem, and 
he simply said "I glanced at the lights."

I then asked him about literally beating the cards. 
He said the connectors got corroded sometimes, and
the simplest way to clean them was to reseat them
by just pushing them a little bit further into their
slot on the backplane. He said if you pulled one out, 
you risked something else going wrong, so it was just
easier to take a hammer to the card cage. 

He didn't have a hammer in his tool briefcase, so he 
used the biggest thing he had: the giant screwdriver
normally used for frame shipping screws.

--
WB
