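You may be running out of file handles. Based on your numbers, I'd guess there is a per-process limit of 1024 file handles: 726 existing volumes plus the 214 you added comes to 940. If the server holds a handle open for each volume, your storage volumes are using most of the limit, and there aren't enough left over for TSM to do whatever else it needs to (open a message file, accept a TCP/IP connection, etc.).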
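To make that guess easier to check, here is a rough Python sketch of the arithmetic and of how the limit can be read on Linux. It rests on assumptions that are mine, not anything confirmed about TSM: that the server keeps one descriptor open per disk volume, and that the limit is the common default of 1024. The helper names `fd_headroom` and `count_open_fds` are made up for illustration. Note that `resource.getrlimit` reports the limit of the Python process itself, which matches the server's only if both are started from the same environment.

```python
import os
import resource


def fd_headroom(open_volumes, soft_limit, overhead=0):
    """Descriptors left over, assuming one descriptor per open volume.

    `overhead` stands for anything else the process keeps open
    (sockets, log files, message files); it is a guess, not a measured value.
    """
    return soft_limit - open_volumes - overhead


def count_open_fds(pid):
    """Count the descriptors a process has open, via /proc (Linux only).

    Reading another user's process usually requires root.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


if __name__ == "__main__":
    # Limits for *this* process; the server's own limits may differ.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft limit:", soft, "hard limit:", hard)

    # Numbers from the original post: 726 volumes before, 214 added.
    print("headroom before adding volumes:", fd_headroom(726, 1024))
    print("headroom after adding volumes: ", fd_headroom(726 + 214, 1024))
```

If the assumptions hold, the headroom drops from 298 to 84 descriptors before counting sockets and other files. Calling `count_open_fds` with the server's process ID while it is running normally would show how close the real count is to the limit.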
Assuming mainframe Linux is equivalent to other platforms, look into the ulimit command (ulimit -a will show the various resource limits).

-Ken

On Apr 7, 2011, at 9:30, "Thomas Denier" <thomas.den...@jeffersonhospital.org> wrote:

> We have a TSM 5.5.4.0 server running under mainframe Linux. The server has
> two random access disk storage pools with a total of 726 volumes. Most of
> the volumes are 2 GB. A few are smaller to fit in space left over after
> populating file systems with as many 2 GB volumes as possible. Yesterday we
> attempted to add 233 more volumes to one of the pools. We had added 214
> when a volume formatting process failed. Shortly after that we starting
> seeing a wide range of errors. Reclamation processes failed. The server
> refused TCP/IP connection requests from both nodes and administrative
> command line clients. Sessions for scheduled backups (with prompt mode
> scheduling) hung. The activity log reported that message texts were
> unavailable for a variety of message numbers. Many header fields in query
> output contained something like 'HEADER NOT AVAILABLE'. The server was
> unable to write accounting records. I restarted the server, and the
> symptoms came back within minutes after the restart. I removed the new
> volumes and restarted the server again. The server then behaved normally.
>
> Am I correct in suspecting that the problem had to do with the number
> of storage pool volumes, and that I will be able to enlarge the storage
> pool safely if I replace existing 2 GB volumes with volumes in the 10
> to 20 GB range, and use the same size for new volumes?