Antony,

Here is some info that I put on the BMC support web site at
http://service.bmc.com/bmc_mq/kb/view_article/0,,4960%2B4977%2B4990%2B4991%2B10423,00.html

Overview

FDCs are found in /var/mqm/errors, along with the error logs. AMQERR01.LOG is the current error log. FDC filenames look like AMQ10450.0.FDC, where 10450 is the pid of the process that threw the FDC.

This resolution describes when tuning is needed due to a lack of semaphore undo structures. The FDCs will have these characteristics:

Probe Id: XY324192
Probe Description :- AMQ6119: An internal MQSeries error has occurred ('22 - Invalid argument' from semop.)

This problem is seen most frequently on Solaris. It is not common on HP-UX, and I have never seen it on AIX.

Solution

Recommendation:

1. For Solaris, increase the values of SEMUME and SEMMNU in /etc/system and reboot.
2. Make sure you do not delete (ipcrm) MQSeries semaphores while MQSeries is running. This is not likely to be the problem, but I have seen it.

======

DETAILS:

The following are details on how to come to the conclusion above.

On a Solaris computer, type:

man semop

And see:

EINVAL The semid argument is not a valid semaphore identifier, or the number of individual semaphores for which the calling process requests a SEM_UNDO would exceed the limit.

EINVAL breaks down as E = error and INVAL = invalid. So the FDC is saying that the semop call returned EINVAL. It looks like the number of SEM_UNDOs being requested is too large for the current kernel settings, so you need to increase the number of SEM_UNDOs allowed.

Looking at http://www.sun.com/sun-on-net/itworld/UIR960101perf.html we see that the kernel parm to increase is semsys:seminfo_semume. The kernel parms can be increased by changing /etc/system and then rebooting.

How much should you increase it? I usually double it until I get to 4096 and then increase by smaller amounts. In the past I have seen that increasing just SEMUME helped with the EINVAL problem. But to be safe you may wish to increase both SEMUME and SEMMNU. I usually double the values to start with. It is difficult to come up with exact recommendations.
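
As a minimal sketch of the doubling approach, assuming the starting point is the semume = 256 and semmnu = 2048 shown in the original message below, the two /etc/system entries would become the following. The doubled numbers are only a first step to try, not tested values.

```
* /etc/system: semaphore undo limits, doubled from 256 and 2048 as a first step
set semsys:seminfo_semume = 512
set semsys:seminfo_semmnu = 4096
```

The change takes effect only after a reboot. If the FDCs keep appearing, double again or, once a value has reached 4096, raise it in smaller steps.
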
However, here is some info that will help. See http://www-3.ibm.com/software/ts/mqseries/txppacs/supportpacs/mp00.pdf, page 60 of 71. Here is the quote:

"SEMMNU , semaphore undo's was increased from 256 to 2048. Without this only approx 90 clients could be MQCONNected."

So based on that, I would say that in addition to SEMUME (number of semaphore undos per process), we also have to consider SEMMNU (total number of undos in the system). It sounds like SEMMNU=256 handles approximately 90 MQCONNs. You could extrapolate from there.

ANOTHER POSSIBILITY:

Notice that the error explanation for EINVAL includes two possibilities. This article describes how to fix one of them. The other possibility is that the semaphore identifier being used by MQSeries is invalid. This probably means that something or someone has deleted the semaphore. Please check for this: are there any jobs that use ipcrm to delete (clean up) semaphores? It is OK to delete semaphores that are not MQSeries semaphores, or to clean up MQSeries semaphores while MQSeries is down. But if you delete (ipcrm) semaphores while MQSeries is running, this error will occur.

Dan Egner
IBM WebSphere MQ V5.3 System Administration Certified
Product Support
BMC Software, Inc

-----Original Message-----
From: Antony Boggis [mailto:[EMAIL PROTECTED]
Sent: Monday, September 08, 2003 2:02 PM
To: [EMAIL PROTECTED]
Subject: Large number of QMgrs on a Solaris box...

I am in the midst of troubleshooting an issue where I am getting huge numbers of FDC files generated on one of our development Solaris machines. Some of these FDC files are REALLY large too.

The machine in question is an 8 CPU box with 32GB RAM. There are currently 70+ queue managers *DEFINED* on the box, many in a "local" cluster (i.e. the CLUSTER of queue managers is all on the same box; this is a DEVELOPMENT machine, and in the real world the cluster members would be on different physical machines). At the moment I am showing 27 active queue managers.

The kernel parameters have been updated to the following values:

set shmsys:shminfo_shmmax = 4294967295
set shmsys:shminfo_shmseg = 2048
set shmsys:shminfo_shmmni = 2048
set semsys:seminfo_semaem = 16384
set semsys:seminfo_semmni = 1024
set semsys:seminfo_semmap = 1026
set semsys:seminfo_semmns = 16384
set semsys:seminfo_semmsl = 10000
set semsys:seminfo_semopm = 100
set semsys:seminfo_semmnu = 2048
set semsys:seminfo_semume = 256
set msgsys:msginfo_msgmap = 1026
set msgsys:msginfo_msgmax = 4096

In addition to the FDC files in /var/mqm/errors, the AMQERR0x.LOG files are also filled with:

AMQ6119: An internal MQSeries error has occurred ('22 - Invalid argument' from semop.)

If anyone can shed some more light on the problem, I'd appreciate it. I am in the process of reviewing the Solaris kernel params to see if that (likely) is the root of the problem.

Regards,
tonyB.

Instructions for managing your mailing list subscription are provided in the Listserv General Users Guide available at http://www.lsoft.com
Archive: http://vm.akh-wien.ac.at/MQSeries.archive