I think this is also a case where the AdoptNewMCA feature needs to be used. Unfortunately, the platform and version were not specified. On OS/390 v1.2 and NT we experienced this sort of problem on the OS/390 side. There was a PMR (I am not sure of the number) that allowed the AdoptNewMCA feature to be used, and this solved the problem. All the other suggestions mentioned may improve network connectivity, but they won't solve the problem if the network does experience a failure.

-----Original Message-----
From: Taylor, Neil [mailto:[EMAIL PROTECTED]]
Sent: Friday, May 24, 2002 2:15 AM
To: [EMAIL PROTECTED]
Subject:

Mark

You may also want to look at TCP/IP KeepAlive. If you reach a state where the sender is in Retry state and the receiver is in Running state, KeepAlive allows the receiver to "detect" that it hasn't received any traffic from the sender for a specified period of time, so it drops down to Inactive state automatically. Once that has happened, the sender can connect successfully. I have used this to great effect and include it in all configurations.

Regards
Neil

-----Original Message-----
From: Michael F Murphy/AZ/US/MQSolutions [mailto:[EMAIL PROTECTED]]
Sent: Fri 24/05/2002 02:35
To: [EMAIL PROTECTED]
Cc:
Subject:

Mark,

I have seen this many times. It would be helpful to know both platforms because that can sometimes make a difference. I am not clear on exactly what is happening, but since you say that stopping and starting the receiver side as well corrects the problem, I am pretty sure I know what you are experiencing. This is very common between OS/390 and Unix or NT platforms, but it can occasionally happen between any platforms.

I think what you may notice is that your sender is retrying but the corresponding receiver is still in RUNNING status, so it can't accept a new channel connection. If you look at the logs on the receiving end and see an error message stating something like a channel wants to start but there are not enough resources to start one (sorry, I don't remember the exact wording), you can use AdoptNewMCA to help correct the problem.

AdoptNewMCA is set on the receiving side, where the problem occurs, but I enable it on all queue managers. On NT it is set through the Services GUI, in the queue manager properties on the Channels tab. On Unix you put it in the channels stanza in qm.ini. On OS/390 I can't tell you how, but it is available. If one side is OS/390, make sure the OS and TCP are up to date.
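
For the Unix case, a minimal sketch of the qm.ini entries on the receiving queue manager might look like the fragment below. The attribute names are the qm.ini attributes as documented for the CHANNELS and TCP stanzas; the values shown (ALL, 60, YES) are assumptions chosen for illustration and are not settings taken from this thread, so check them against the documentation for the MQSeries level in use.

```
# qm.ini on the receiving queue manager (Unix): illustrative values only
CHANNELS:
   # Let a new inbound request replace an orphaned instance of the same channel
   AdoptNewMCA=ALL
   # Seconds to wait for the old instance to end before it is terminated (assumed value)
   AdoptNewMCATimeout=60
   # How strictly the new request must match the old instance (assumed value)
   AdoptNewMCACheck=ALL
TCP:
   # Ask TCP/IP to probe idle connections so a dead partner is eventually detected
   KeepAlive=YES
```

Changes to qm.ini are generally picked up only when the queue manager is restarted. KeepAlive=YES only switches the option on for the channel sockets; how long the connection must be idle before it is probed is an operating system TCP/IP setting, not an MQSeries one.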
AdoptNewMCA will cause the receiver that is stuck in running state to be dropped, and the queue manager will "adopt" the new request to start the same channel again. This happens quickly and the drop is not really noticeable.

I have been down the same road with all sorts of network engineers sniffing the network and looking at routers, but they never find a problem. Of course IBM says it is not MQSeries, it must be the network.

I hope this helps. If this is completely wrong, we'll keep trying.

Mike Murphy
Sr. Middleware Consultant
MQ Solutions, LLC
http://www.mqsolutions.com

Mark Lees <[EMAIL PROTECTED]> wrote:
Date Received: 05/23/2002 11:00:09 PM
To: [EMAIL PROTECTED]
cc:
Bcc:
Subject:

Stefan / Ian,

Thanks for getting back to me. The AMQ log on host A, our machine, says that error 10054 occurred (this is an NT machine), which equates to WSAECONNRESET. The exact AMQ log entry on our machine (host A) is as follows:

-------------------------------------------------------------------------------
24/05/02 06:57:35
AMQ9208: Error on receive from host 194.35.94.31.

EXPLANATION:
An error occurred receiving data from 194.35.94.31 over TCP/IP. This may be due to a communications failure.

ACTION:
The return code from the TCP/IP (recv) call was 10054 (X'2746'). Record these values and tell the systems administrator.
-------------------------------------------------------------------------------

I've requested the AMQ error log from their machine, but they're rather MQSeries naive. I can positively rule out a network outage. Our network department has had tracing and monitoring on for two days now, and there have been no network problems for over a month (that in itself is a cause for celebration :-/ ). I'll endeavour to get host B's AMQ log and hopefully this will shed some light on it.

Regards
Mark.

-----Original Message-----
From: Chan, Ian M [mailto:[EMAIL PROTECTED]]
Sent: 24 May 2002 03:05
To: [EMAIL PROTECTED]
Subject:

What error message is displayed in host B's AMQERR01.LOG (TCP/IP error? MQ error? etc.)?
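
As an aside on reading the return code in the AMQ9208 entry quoted earlier in the thread, here is a minimal Python sketch that renders a TCP/IP return code in the decimal and X'....' form the log uses, together with its Winsock name. The function name `describe_return_code` and the small lookup table are hypothetical helpers introduced only for illustration; the numeric values are the standard Winsock constants, and only a few of them are listed.

```python
# A few standard Winsock error codes that typically show up when a
# TCP connection is broken or refused. This table is deliberately partial.
WINSOCK_ERRORS = {
    10053: "WSAECONNABORTED",  # connection aborted by local software
    10054: "WSAECONNRESET",    # connection reset by the remote peer
    10060: "WSAETIMEDOUT",     # connection attempt or send timed out
    10061: "WSAECONNREFUSED",  # nothing listening at the target port
}


def describe_return_code(code: int) -> str:
    """Return the code as decimal, as X'hex' (the log's notation), and by name."""
    name = WINSOCK_ERRORS.get(code, "unknown")
    return f"{code} (X'{code:04X}') {name}"


if __name__ == "__main__":
    # The return code reported in the AMQ9208 entry from host A.
    print(describe_return_code(10054))
```

A reset (10054) means the other end, or something in between, tore the connection down, which is why the log on host B is the next thing to look at.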
You can't start it at host A because the host B receiver is stopped and obviously has to be started at the host B end.

Regards,
Ian

-----Original Message-----
From: Mark Lees [mailto:[EMAIL PROTECTED]]
Sent: Thursday, 23 May 2002 10:29 PM
To: [EMAIL PROTECTED]
Subject:

All,

I need your help. We have a two-way connection to a client's MQSeries using a sender and receiver channel on our machine (host A) and a corresponding receiver and sender channel on their machine (host B).

Every now and then, and it appears to be increasing in frequency, the connection between the two machines drops. That is, host A's sender channel keeps retrying / binding the connection until it finally gives up, and host B's sender channel does the same. If I stop the sender on our host A and restart it, nothing happens; the sender channel remains in retrying / binding. If, however, host B's sender channel and receiver channel are both stopped and restarted, and our host A sender is subsequently restarted, it all appears to work again.

Both companies' network staff have run various network monitoring tools and can detect no untoward network loss or other activity at the connectivity level. I have had a ping session running constantly and have detected no noticeable packet loss.

Has anyone ever come across this, and if so, what is the solution?

Mark Lees
Senior Technologist
BrokerTec Europe

****************************************************************************
This message is confidential to the sender and addressee, and may contain proprietary or legally privileged information. If you are not the intended recipient, please delete it from your system, destroy any copies, and notify the sender immediately. Opinions stated herein are not necessarily those of BrokerTec. BrokerTec reserves the right to monitor messages that pass through its networks. BrokerTec Europe Ltd is regulated by FSA.

Instructions for managing your mailing list subscription are provided in the Listserv General Users Guide available at http://www.lsoft.com
Archive: http://vm.akh-wien.ac.at/MQSeries.archive