I posted a question to the list last week about help in debugging a CICS app that's having problems. More than anything else, I was venting my spleen. But since I posted it, I thought I'd post the cause of the problem, now that I've found it.
The top-level program in the problem transaction inserted some data into DB2. Down in the fourth layer of called programs, a subroutine put a message on a queue and waited for a response. The program that got the message off the queue, in some circumstances, needed access to the data inserted by the first program in the transaction. But that data hadn't been committed yet, so the two programs could end up stuck waiting on each other, effectively a deadlock: the getting program waiting on DB2, and the putting program waiting for its reply. Eventually the program waiting for a response would get a 2033 (MQRC_NO_MSG_AVAILABLE) and abort. At that point the getting program would get access to the DB2 data it needed and faithfully send the reply (the reply that no one wanted any more).
The other issue was that while this was going on, more responsible messages were backing up in the queue behind the stuck one, leading to the general feeling of slowness in the application.
So now the developers responsible for this mess are saying "it's a DB2 problem." But I'm not a DBA, so it's not my problem any more. In the meantime, I'm keeping several of those "getting" transactions running so that if one message gets stuck it won't jam up everything.
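For anyone who wants to see the sequence of events in miniature, here is a rough sketch in Python. It is not the actual CICS code, and it touches neither DB2 nor MQ: ToyTable, server and run_transaction are made-up names, the row lock is faked with a threading.Event, and the queues are ordinary in-process queues. The only real-world value in it is the 2033 reason code. The toy also does not model whether the insert survives the abort; it only shows that a reader blocked on uncommitted data cannot reply in time, and that its reply arrives after the requester has given up.

```python
import queue
import threading

MQRC_NO_MSG_AVAILABLE = 2033  # MQ reason code: the timed get expired with no message


class ToyTable:
    """Hypothetical stand-in for a DB2 table whose new row is locked until
    the inserting unit of work ends (commit or abort)."""

    def __init__(self):
        self._lock_released = threading.Event()

    def insert(self):
        self._lock_released.clear()   # row exists but is uncommitted and locked

    def end_unit_of_work(self):
        self._lock_released.set()     # commit or abort: either one frees the lock

    def read(self):
        self._lock_released.wait()    # reader blocks until the lock is freed


def server(table, requests, replies):
    """The 'getting' program: take one request, read the data, send a reply."""
    requests.get()
    table.read()
    replies.put("reply")


def run_transaction(commit_before_put, wait_seconds=0.5):
    """The requesting transaction: insert, optionally commit, put, timed get.

    Returns (outcome, orphaned), where outcome is the reply text or 2033 and
    orphaned is the number of replies left on the queue that nobody read.
    """
    table, requests, replies = ToyTable(), queue.Queue(), queue.Queue()
    worker = threading.Thread(target=server, args=(table, requests, replies))
    worker.start()

    table.insert()
    if commit_before_put:
        table.end_unit_of_work()      # the server can see the data right away
    requests.put("request")

    try:
        outcome = replies.get(timeout=wait_seconds)
    except queue.Empty:
        outcome = MQRC_NO_MSG_AVAILABLE   # gave up waiting, as in the real app

    table.end_unit_of_work()          # transaction ends; lock is released
    worker.join()                     # the server now finishes its work
    return outcome, replies.qsize()


if __name__ == "__main__":
    print(run_transaction(commit_before_put=True))
    print(run_transaction(commit_before_put=False))
```

With the commit done before the put, the requester gets its reply and nothing is left behind. Without it, the requester gets the 2033 and one unwanted reply ends up sitting on the reply queue. Whether the real fix is an earlier syncpoint or a different design is for the developers to decide; the sketch only shows why the ordering matters.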
