Broken hardware was Re: Broken Brancher (was Re: Best IEFACTRT)

2009-10-02 Thread Clark Morris
3 incidents come to mind.  The first was a 2821 print controller that
blew up error recovery by sending back Device End and Busy.  Despite
MVT being in its last days, we were the site of first discovery.  The
second was on a mod 65 where the CSW was getting stored x'40' or x'48'
from a 256K boundary.  We were finally able to force it by 
On 2 Oct 2009 06:05:52 -0700, in bit.listserv.ibm-main you wrote:

Many years ago the company I worked for had a 3031. We added the AP to it. 
Soon after, we started experiencing random 0Cx abends that made no sense when 
the dump was examined. The abends were in user and IBM code. CEs could find no 
problem so it had to be software. The PSR agreed that the data in the dump was 
valid. There were even samples where registers were wrong (did not match the 
storage that they were loaded from). HW started looking again. The problem did 
not occur with the AP offline. The problem was narrowed down to the TLB; with 
it off, all was good. Replaced TLB. Still failed. An old CE came in with a data 
scope. The problem: the TLB was receiving the "here's data" signal 1.5ms 
ahead of the data, causing the TLB to load with all 1 bits. There was an 
optional EC that reduced a section of tri-lead by 18 inches. The EC fixed the 
problem.

Dennis Roach
GHG Corporation
Lockheed Martin Mission Services
Facilities Design and Operations Contract
NASA/JSC

All opinions expressed by me are mine and may not agree with my employer or 
any person, company, or thing, living or dead, on or near this or any other 
planet, moon, asteroid, or other spatial object, natural or manufactured, 
since the beginning of time.

 -Original Message-
 From: IBM Mainframe Discussion List [mailto:ibm-m...@bama.ua.edu] On
 Behalf Of Bruce Richardson
 Sent: Thursday, October 01, 2009 12:54 PM
 To: IBM-MAIN@bama.ua.edu
 Subject: Broken Brancher (was Re: Best IEFACTRT)
 
 Are you sure your code didn't suffer the same fate as IEFBR14?
 The story (Urban Legend?) I heard, was that IEFBR14 was originally just
 a BR 14, but that code was APAR'd to add a SR 15,15 before the BR 14
 to set the return code to zero. But then along came a problem with the
 loader, it seems that the minimum program length has to be 8 bytes, so
 another APAR was opened to add two NOPRs to the code.
 Your code without the second BR 14 is just 6 bytes!
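
The byte counts in that remark can be checked with a minimal Python sketch. The object-code bytes are hand-assembled for illustration and do not come from the thread: SR is opcode X'1B', and BR 14 is the extended mnemonic for BCR 15,14, i.e. X'07FE'. The length rule is the architected one, where the first two bits of the opcode give the instruction length. The names instruction_length, csect_length and IEFACTRT_OBJ are hypothetical, and the sketch says nothing about whether the loader story itself is true.

```python
# Instruction length from the first two opcode bits (S/360 onward):
# 00 -> 2 bytes, 01 or 10 -> 4 bytes, 11 -> 6 bytes.
def instruction_length(opcode: int) -> int:
    return {0b00: 2, 0b01: 4, 0b10: 4, 0b11: 6}[(opcode >> 6) & 0b11]


# Hand-assembled object code for the quoted exit (an assumption for
# illustration, not copied from any listing in the thread).
IEFACTRT_OBJ = [
    ("SR  R1,R1", bytes.fromhex("1B11")),
    ("SR  R15,R15", bytes.fromhex("1BFF")),
    ("BR  R14", bytes.fromhex("07FE")),
    ("BR  R14", bytes.fromhex("07FE")),
]


def csect_length(instructions) -> int:
    """Sum the instruction lengths, checking each against its opcode."""
    total = 0
    for mnemonic, obj in instructions:
        if len(obj) != instruction_length(obj[0]):
            raise ValueError(f"bad encoding for {mnemonic}")
        total += len(obj)
    return total


print(csect_length(IEFACTRT_OBJ))       # 8: with the "spare" second BR
print(csect_length(IEFACTRT_OBJ[:-1]))  # 6: without it
```

All four instructions are RR format (two bytes each), so the second BR is what takes the routine from 6 bytes to 8.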
 
 
 On Tue, 29 Sep 2009 21:34:13 -0500, William H. Blair
 wmhbl...@comcast.net wrote:
 
 Edward Jaffe asks:
 
  Which is the best IEFACTRT?
 
 I am dying to know what you meant exactly by that question.
 
 But I'll offer my candidate (in case this is a contest):
 
 IEFACTRT CSECT
 IEFACTRT AMODE 31
 IEFACTRT RMODE ANY
 R1       EQU   1
 R14      EQU   14
 R15      EQU   15
          SR    R1,R1          Write SMF termination record
          SR    R15,R15        JOB processing is to continue
          BR    R14            Return to INITiator
          BR    R14            (just in case the brancher's broke
 *                              when it executes that first BR)
          END
 
 And, yes, at one point, I had a machine where the brancher
 was broke. I had to code a Bx immediately after every Bx
 in case the first Bx ended up at a certain offset in a page,
 else the box ignored the Bx as if it were a NOP[R] and went
 on to whatever followed, unless it was an invalid opcode,
 in which case it threw an ABEND S0C4 on the Bx even if the
 branch address was, in fact, good. No, the CE didn't believe
 me.  Nobody believed me for a week or so until some special
 CE diagnostic tape flown in by IBM from POK failed to run,
 red lighting the box.
 
 The hardware guys kept telling everyone it was a software
 problem, but the IBM software guys kept saying what they
 saw in the dumps was impossible, so it had to be a hardware
 problem. (IBM pointing fingers at itself.) Took 2 weeks to
 find it. Meanwhile, everything ran fine except _my_ code,
 which had the BR that elicited the error (an IEFACTRT exit,
 in fact), and the odd application here and there (which the
 operators just recovered and restarted on the other machine).
 
 I remembered the incident because a frequent complaint from
 some of the less experienced application programmers working
 on Assembler programs (when the PSW ended up somewhere they
 didn't think it should ever have gotten to) was that the
 brancher was broke. It always gave us lots of good laughs.
 
 Well, for at least once in this world, it really was broke.
 
 --
 WB
 
 --
 For IBM-MAIN subscribe / signoff / archive access instructions,
 send email to lists...@bama.ua.edu with the message: GET IBM-MAIN INFO
 Search the archives at http://bama.ua.edu/archives/ibm-main.html
 
 

Re: Broken hardware was Re: Broken Brancher (was Re: Best IEFACTRT)

2009-10-02 Thread Pommier, Rex R.
Clark,

Something ate the last half of your post.

Rex



Re: Broken hardware was Re: Broken Brancher (was Re: Best IEFACTRT)

2009-10-02 Thread William Donzelli
 Something ate the last half of your post.

Cookie monster.







(5 points to anyone that understands that reference).

--
Will



Re: Broken hardware was Re: Broken Brancher (was Re: Best IEFACTRT)

2009-10-02 Thread Scott Rowe
Sesame Street?






Broken hardware was Re: Broken Brancher (was Re: Best IEFACTRT)

2009-10-02 Thread Clark Morris
I'll try not to send the message before its time (reference to an old
wine commercial) this time.  And to Scott, Sesame Street should have
called it the Cookie-eating Monster (and I resemble that).

3 incidents come to mind.  The first was a 2821 print controller that
blew up error recovery by sending back Device End and Busy.  Despite
MVT being in its last days, we were the site of first discovery.  The
second was on a mod 65 where the CSW was getting stored x'40' or x'48'
from a 256K boundary.  We were finally able to force it by using an
IEBCOPY unload with IEBCOPY brought back from SVS thanks to the
MICHMODS MVT tape.  We called in the third-party memory CEs, who came
in and proved it wasn't their problem by some process that I forget,
even though I was the person watching this for the company.  I then
called IBM and the CE came in.  After I showed him the symptoms he
checked for the problem, thinking it wasn't an IBM problem, and turned
up a 250-nanosecond delay card in the channel that wasn't delaying
things for 250 nanoseconds.  The last was under MVS when we lost an
indexed VTOC on a 3380.  After rebuilding it, I checked EREP to see
what was happening at the time and found a large number of temporary
write errors to the drive.  The CE checked it out and found a loose
card in the controller.  Reseating the card ended the problem.

Interestingly, both the wacko 2821 and the loose controller card
resulted in PTFs because of the inadequate handling of error
conditions.


Re: Broken hardware was Re: Broken Brancher (was Re: Best IEFACTRT)

2009-10-02 Thread William Donzelli
 Sesame Street?

One of Cookie Monster's early appearances, before Sesame Street, was
in an IBM training film, called Coffee Break Machine. Here is a 1967
performance of the same skit, on the Ed Sullivan show:

http://www.dailymotion.com/video/x1tmko_coffee-break-machine_business

--
Will
