Problems caused by Health Checker?

Pew, Curtis G Mon, 15 Aug 2022 15:09:30 -0700

We’ve occasionally had problems with our production z/OS LPAR that seem to be 
caused by HZSPROC, and I was wondering if anyone else has seen anything like 
this.


One problem we’ve seen is that sometimes, when HZSPROC starts during an IPL, it 
seems to try to access a migrated dataset (we use FDRABR for dataset 
migration), and since DFRMM hasn’t started yet, the recall fails. (We don’t 
know which dataset it is trying to access; none of the datasets explicitly 
named in the PROC ever migrate.) Everything seems to freeze until we reply 
“CANCEL” to the EDG4012D message.

This is annoying enough, but last Thursday one of my coworkers stopped HZSPROC 
because it was repeatedly issuing the message that ECSA usage was high. (We’d 
already scheduled an emergency IPL for Sunday to fix that.) Then on Friday 
another coworker tried to restart HZSPROC to see if he could figure out the 
migration issue, and our system started misbehaving. From what was described 
to me, batch jobs appeared to be stuck in allocation or deallocation. (I’m 
semi-retired and don’t work on Fridays.) They decided to go ahead with the 
emergency IPL then and there, but without starting HZSPROC. That IPL went fine 
and we haven’t seen any issues since then, but we’re afraid to try to start 
HZSPROC.

Has anyone had issues like this with Health Checker? Any suggestions for how to 
resolve this?

Thanks.



-- 
Curtis Pew
ITS Campus Solutions
[email protected]




----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN