Hi, Unfortunately I have a production MQ problem today! Actually, I believe it's not an MQ problem but a network problem. However, when messages cannot be delivered, it's an "MQ problem".
I would like to get some advice from the list before I spend time getting all the different groups to do a trace.

Problem details:
Box 1: Windows NT4 running MQ 5.3
Box 2: z/OS 1.4 running MQ 5.3
Messages are sent from NT to z/OS, one way only. No changes have been made to MQ at either end.

When the problem happens, the channel status at QM1 shows RUNNING. However, the detailed status shows that it stops at 25 messages. For example, when the xmitq holds 100 messages and I start the channel, DISPLAY QLOCAL(xmit queue) shows 75 messages, and DISPLAY CHSTATUS shows MSGS(25) and CURMSGS(26). On the mainframe end, the receiver shows RUNNING as well (with MSGS 0), but after around 5 minutes Adopt MCA kicks in and re-establishes the connection. On the Windows end, the error log shows TCP/IP error 10054 (connection forcibly closed by remote host). Usually that's due to a network problem.

I tried changing BATCHSZ to 1 with no luck, except that the xmitq depth becomes 99 and the message counts in the channel status become 1. All the other error messages are the same.

I then tried PING CHANNEL and it worked. I then dumped the xmitq messages and found that they are around 1300-1500 bytes, so I used PING CHANNEL with DATALEN(1500). This time I got "AMQ9208..error receiving from host..." in runmqsc. I then did a command prompt ping without any problem. I pinged again with lengths 1500 and 10000 and it was still OK (the response was fast too). Then I did a TSO PING from z/OS with lengths 1500 and 10000 as well, again without any problem.

As usual, I contacted the network team about the problem, and as usual they came back with ping statistics saying there is "no problem" on the network at all, regardless of what -l they used in ping. So the problem came back to me.

I tried defining another channel, but it shows the same symptom. As I strongly believe it's a network problem between the two sites, I used another qmgr on a UNIX box as a middleman: I send the messages from the NT box to the middleman UNIX qmgr and then on to the mainframe qmgr. Hurray!
Now everything works fine and the messages have all gone through. So I believe there is nothing wrong with the qmgrs on NT and z/OS. As it's just a temporary solution, I now have to find the root cause.

I tried the MQ ping again from the NT box to some other boxes with a data length of 1500, without any problem. I tried multiple times against z/OS and found that the problem happens when DATALEN is greater than or equal to 1427.

Has anyone got any idea? I know the Ethernet MTU is 1500, so it looks like something to do with the MTU size, but shouldn't that affect the command prompt ping as well? What's the difference between the command prompt ping and the MQ ping, other than that MQ goes through the MQ port and verifies the channel name?

Thanks in advance,
Ian
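
The threshold search described above (repeating the MQ ping with different DATALEN values until the boundary shows up) can be sketched as a bisection. This is only an illustration in Python, not something taken from MQ: `probe` is a hypothetical callback standing in for one manual PING CHANNEL attempt at a given DATALEN, and `fake_probe` merely simulates the behaviour reported in this post (failure at 1427 and above). The sketch assumes that once a size fails, every larger size fails too.

```python
from typing import Callable


def smallest_failing_size(probe: Callable[[int], bool], lo: int, hi: int) -> int:
    """Return the smallest size in (lo, hi] for which probe(size) is False.

    Assumes probe(lo) succeeds, probe(hi) fails, and that failures are
    monotonic (every size above the first failing one also fails).
    """
    if not probe(lo):
        raise ValueError("lower bound already fails")
    if probe(hi):
        raise ValueError("upper bound still succeeds")
    # Invariant: probe(lo) is True and probe(hi) is False.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if probe(mid):
            lo = mid   # mid works, so the boundary is above it
        else:
            hi = mid   # mid fails, so the boundary is at or below it
    return hi


def fake_probe(datalen: int) -> bool:
    # Hypothetical stand-in for one PING CHANNEL attempt; it simulates the
    # observed behaviour instead of talking to a queue manager.
    return datalen < 1427


if __name__ == "__main__":
    print(smallest_failing_size(fake_probe, 16, 1500))  # prints 1427
```

With a range of 16 to 1500 this needs about eleven probes, which is practical even when each probe is a ping typed by hand into runmqsc.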
