Re: DFSMShsm Abend S878
We are a z/OS 1.13 shop and encountered an S878 as well. There is an APAR OA39358 with fix UA65499. I have applied this fix to my system and have not had a recurrence.

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Ravi Gaur
Sent: Thursday, October 18, 2012 5:25 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: DFSMShsm Abend S878

We have a very busy system as well (for example, 15 volume migration tasks, 13 recycle tasks, and 10 recall tasks running), so a couple of things are being discussed:

- Review when HSM's major functions (recycle, migration, etc.) run, to ensure they are spread out as much as possible.
- Consider using the Multiple Address Space HSM (MASH) function. With MASH you can run multiple instances of HSM on an LPAR and assign some of the workload to the additional HSM(s).
- Use DSS cross-memory mode for HSM processing. This moves much of the storage requirement of DSS into cross-memory address spaces and gets it out of the HSM user region. (FYI: DSS cross-memory mode is available from z/OS 1.9 onward, largely to address the 878-10 issue. It runs a new address space for most HSM functions except recall; it needs a whole new setup, and implementing it is effectively a project in its own right.)
- Increase the region size (check the IEFUSI exit) and review any above-the-line storage definitions in the DFHSM STC proc.

FYI - there are some orphaned tape MVT allocations. These are control blocks that HSM allocates for tape processing but does not free when the task finishes with the tape; this is being addressed by APAR OA40365.

--
For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
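For anyone wondering whether that PTF is already on their system, a quick SMP/E query along these lines will tell you. This is just a sketch: the CSI data set name and the target zone name below are placeholders for your own.

```jcl
//* Query whether PTF UA65499 has been received/applied.
//* SMPE.GLOBAL.CSI and TGT1 are placeholder names - use your own.
//SMPQRY  EXEC PGM=GIMSMP
//SMPCSI  DD DISP=SHR,DSN=SMPE.GLOBAL.CSI
//SMPCNTL DD *
  SET BOUNDARY(TGT1) .
  LIST PTFS(UA65499) .
/*
```

The LIST output shows whether the SYSMOD is received, applied, or accepted in that zone.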
Re: DFSMShsm Abend S878
We have a very busy system as well (for example, 15 volume migration tasks, 13 recycle tasks, and 10 recall tasks running), so a couple of things are being discussed:

- Review when HSM's major functions (recycle, migration, etc.) run, to ensure they are spread out as much as possible.
- Consider using the Multiple Address Space HSM (MASH) function. With MASH you can run multiple instances of HSM on an LPAR and assign some of the workload to the additional HSM(s).
- Use DSS cross-memory mode for HSM processing. This moves much of the storage requirement of DSS into cross-memory address spaces and gets it out of the HSM user region. (FYI: DSS cross-memory mode is available from z/OS 1.9 onward, largely to address the 878-10 issue. It runs a new address space for most HSM functions except recall; it needs a whole new setup, and implementing it is effectively a project in its own right.)
- Increase the region size (check the IEFUSI exit) and review any above-the-line storage definitions in the DFHSM STC proc.

FYI - there are some orphaned tape MVT allocations. These are control blocks that HSM allocates for tape processing but does not free when the task finishes with the tape; this is being addressed by APAR OA40365.
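For what it's worth, the DSS cross-memory option mentioned above is switched on with a SETSYS statement in the ARCCMDxx PARMLIB member, roughly like this (a sketch; check the DFSMShsm books for the exact operands available on your release):

```jcl
/* ARCCMDxx: let DSS data movement run in its own       */
/* cross-memory address space instead of the HSM region */
SETSYS DSSXMMODE(Y)
```

On later releases the books describe per-function control of cross-memory mode as well, so verify which form your level supports before rolling it out.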
Re: DFSMShsm Abend S878
Ah, 878s in DFHSM take me back to about 1988. Fun times - for ME, but probably not for the customer or DFHSM Development. :-) The debacle at a well-known UK customer got the virtual storage estimates in the manual fixed and helped spur Development on to move stuff above the line.

Cheers, Martin

Martin Packer, zChampion, Principal Systems Investigator, Worldwide Banking Center of Excellence, IBM
+44-7802-245-584
email: martin_pac...@uk.ibm.com
Twitter / Facebook IDs: MartinPacker
Blog: https://www.ibm.com/developerworks/mydeveloperworks/blogs/MartinPacker

From: Mike Wood
To: IBM-MAIN@listserv.ua.edu
Date: 10/18/2012 08:28 AM
Subject: Re: DFSMShsm Abend S878
Sent by: IBM Mainframe Discussion List

On Wednesday, 17 October 2012 23:02:49 UTC+1, Munish Sharma wrote:
> We had an issue on one system wherein HSM CDS backups were failing on LPAR1 with RC=12 and REAS=08. The journal backups were going fine.
>
> When browsing through the activity logs, I came across the following:
>
> ADR376E (ttt)-m(yy), UNABLE TO ACQUIRE ADDITIONAL STORAGE FOR THE TASK
>
> Since DFDSS tries a concurrent backup of all 3 CDSs to tape, I think it is facing a memory issue. But since the journal is not included in this concurrency, it always goes fine. Also, the situation is compounded for HSM since the MCDS is multi-cluster, so it requires more memory.
>
> But the backup on LPAR2 was running fine. So I think it comes down to the region size on both tasks. LPAR1 has 8M but LPAR2 has 0M. Therefore I guess LPAR2 has the capability to scale up the region size required by DFDSS.
>
> Shall we try modifying the REGION parm for LPAR1 and test the BACKVOL CDS command?
>
> OR, as earlier suggested: << Use the DSSXMMODE(?) SETSYS parm to run DSS in its own cross-memory address space >> ..clearly in this case as well, it's happening because too many DFDSS tasks are running in HSM to back up the multiple CDSs.
>
> Regards
> Munish Sharma

Munish, Looking at the R12 HSM books
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/dgt2i490/2.3.2.1.1?SHELF=EZ2ZBK0K&DT=20100729205957&CASE=
you can see the recommendations for the HSM region size, for both the primary and secondary address spaces. Many examples in the HSM book show 8M, but that has probably been unchanged since early releases of HSM (which I remember well!). Now the book seems to recommend 0M.

Mike Wood
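In proc terms, the 0M recommendation Mike mentions amounts to something like this in the DFSMShsm started task JCL (a sketch - proc and parameter names vary by shop; ARCCTL is the HSM control program):

```jcl
//DFHSM   PROC CMD=00,EMERG=NO
//* REGION=0M lets MVS give the address space the largest
//* available region instead of capping it at, say, 8M
//DFHSM   EXEC PGM=ARCCTL,REGION=0M,TIME=1440,
//        PARM=('CMD=&CMD','EMERG=&EMERG')
```

That is why the LPAR2 task with 0M "scales up" while the LPAR1 task hits the 8M wall. Note that IEFUSI can still cap a REGION=0M request, so check your exit too.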
Re: DFSMShsm Abend S878
On Wednesday, 17 October 2012 23:02:49 UTC+1, Munish Sharma wrote:
> We had an issue on one system wherein HSM CDS backups were failing on LPAR1 with RC=12 and REAS=08. The journal backups were going fine.
>
> When browsing through the activity logs, I came across the following:
>
> ADR376E (ttt)-m(yy), UNABLE TO ACQUIRE ADDITIONAL STORAGE FOR THE TASK
>
> Since DFDSS tries a concurrent backup of all 3 CDSs to tape, I think it is facing a memory issue. But since the journal is not included in this concurrency, it always goes fine. Also, the situation is compounded for HSM since the MCDS is multi-cluster, so it requires more memory.
>
> But the backup on LPAR2 was running fine. So I think it comes down to the region size on both tasks. LPAR1 has 8M but LPAR2 has 0M. Therefore I guess LPAR2 has the capability to scale up the region size required by DFDSS.
>
> Shall we try modifying the REGION parm for LPAR1 and test the BACKVOL CDS command?
>
> OR, as earlier suggested: << Use the DSSXMMODE(?) SETSYS parm to run DSS in its own cross-memory address space >> ..clearly in this case as well, it's happening because too many DFDSS tasks are running in HSM to back up the multiple CDSs.
>
> Regards
> Munish Sharma

Munish, Looking at the R12 HSM books
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/dgt2i490/2.3.2.1.1?SHELF=EZ2ZBK0K&DT=20100729205957&CASE=
you can see the recommendations for the HSM region size, for both the primary and secondary address spaces. Many examples in the HSM book show 8M, but that has probably been unchanged since early releases of HSM (which I remember well!). Now the book seems to recommend 0M.

Mike Wood
Re: DFSMShsm Abend S878
We had a z/OS 1.13 HSM abend S878-10 at 1am Monday. IBM tracked it down to an open APAR: HSM allocates a small amount of space for each volser during migration, forgets to free the memory, and eventually chews up all private memory below the 24-bit line in SYSHSM*. The APAR was opened in September; the fix should be out by the end of the year. The local fix is to bounce HSM periodically (weekly/monthly). We were up on a busy LPAR for 2 months before it crashed.

On Wed, Oct 17, 2012 at 5:02 PM, Munish Sharma wrote:
> We had an issue on one system wherein HSM CDS backups were failing on LPAR1 with RC=12 and REAS=08. The journal backups were going fine.
>
> When browsing through the activity logs, I came across the following:
>
> ADR376E (ttt)-m(yy), UNABLE TO ACQUIRE ADDITIONAL STORAGE FOR THE TASK
>
> Since DFDSS tries a concurrent backup of all 3 CDSs to tape, I think it is facing a memory issue. But since the journal is not included in this concurrency, it always goes fine. Also, the situation is compounded for HSM since the MCDS is multi-cluster, so it requires more memory.
>
> But the backup on LPAR2 was running fine. So I think it comes down to the region size on both tasks. LPAR1 has 8M but LPAR2 has 0M. Therefore I guess LPAR2 has the capability to scale up the region size required by DFDSS.
>
> Shall we try modifying the REGION parm for LPAR1 and test the BACKVOL CDS command?
>
> OR, as earlier suggested: << Use the DSSXMMODE(?) SETSYS parm to run DSS in its own cross-memory address space >> ..clearly in this case as well, it's happening because too many DFDSS tasks are running in HSM to back up the multiple CDSs.
>
> Regards
> Munish Sharma

--
Mike A Schwab, Springfield IL USA
Where do Forest Rangers go to get away from it all?
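For the record, "bouncing" HSM is just a stop and restart from the console, along these lines (assuming the started task is named DFHSM; substitute your own procname):

```text
F DFHSM,STOP
S DFHSM
```

The MODIFY ...,STOP lets HSM quiesce its active tasks cleanly before the restart, which is preferable to cancelling it.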
Re: DFSMShsm Abend S878
Oops... forgot to mention... it was an 878-1C.
Re: DFSMShsm Abend S878
We had an issue on one system wherein HSM CDS backups were failing on LPAR1 with RC=12 and REAS=08. The journal backups were going fine.

When browsing through the activity logs, I came across the following:

ADR376E (ttt)-m(yy), UNABLE TO ACQUIRE ADDITIONAL STORAGE FOR THE TASK

Since DFDSS tries a concurrent backup of all 3 CDSs to tape, I think it is facing a memory issue. But since the journal is not included in this concurrency, it always goes fine. Also, the situation is compounded for HSM since the MCDS is multi-cluster, so it requires more memory.

But the backup on LPAR2 was running fine. So I think it comes down to the region size on both tasks. LPAR1 has 8M but LPAR2 has 0M. Therefore I guess LPAR2 has the capability to scale up the region size required by DFDSS.

Shall we try modifying the REGION parm for LPAR1 and test the BACKVOL CDS command?

OR, as earlier suggested: << Use the DSSXMMODE(?) SETSYS parm to run DSS in its own cross-memory address space >> ..clearly in this case as well, it's happening because too many DFDSS tasks are running in HSM to back up the multiple CDSs.

Regards
Munish Sharma
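If it helps, the on-demand CDS backup for testing a region change can be driven from an authorized TSO user with the HSM TSO command (syntax from memory - verify against the books):

```text
HSENDCMD BACKVOL CDS
```

Then check the CDS backup activity log for the result rather than waiting for the next scheduled backup window.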
Re: DFSMShsm Abend S878
On 10/4/2012 4:39 PM, Munish Sharma wrote:
> Hello, I faced the same problem about 2 years back... I was told that it was just the CPU making a mess while talking to z/OS, and that DFSMShsm was the victim... later, one by one, all the DB2s kept going down with the same error... I thought IBM had fixed it...
>
> Best of luck with the APAR...

Munish,

It's not the CPU's mess, and HSM is not a victim. It's what happens when too many DFDSS tasks are running in HSM. Use the DSSXMMODE(?) SETSYS parm to run DSS in its own cross-memory address space. This was done for VSCR (virtual storage constraint relief) in the HSM address space, to prevent 878s and other region-related abends.

Regards,
Tom Conley
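A quick way to see whether DSS cross-memory mode is already active is to query the current SETSYS values from an authorized TSO user (the exact message text varies by release):

```text
HSENDCMD QUERY SETSYS
```

Look for the DSSXMMODE setting in the resulting messages before deciding whether a change is needed.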
Re: DFSMShsm Abend S878
Hello, I faced the same problem about 2 years back... I was told that it was just the CPU making a mess while talking to z/OS, and that DFSMShsm was the victim... later, one by one, all the DB2s kept going down with the same error... I thought IBM had fixed it...

Best of luck with the APAR...