Was the single tape drive that had an error your library control path? Is Atape up to date?
What OS is the TSM server running? Are there any errors in the errpt / error log / syslog, etc.?

[RC]

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of Schneider, John
Sent: Friday, September 05, 2008 9:14 AM
To: ADSM-L@VM.MARIST.EDU
Subject: [ADSM-L] TSM library master for two tape libraries causes problems

Greetings,

We have a TSM instance that serves as the library master for two tape libraries. One is an IBM3584 tape library with 24 LTO4 drives. The second is a virtual tape library, an EMC EDL configured as an IBM3584 with 128 LTO1 tape drives. The library master has 9 other TSM instances and 4 LAN-free clients that are library clients and appeal to it for tape mounts. Most of the time this works just fine.

All the TSM instances are running TSM 5.4.3.0. A few weeks ago we upgraded them from 5.4.2.0 in order to pick up a patch that provides the LIBSHRTIMEOUT parameter. This may not have anything to do with our problem, but it IS a recent change.

The problem comes when we have any problem with the real IBM3584 tape library. Two weeks ago a tape got stuck in the gripper. Last week the gripper itself actually broke, so no tapes could get mounted. Last night it was a single tape drive that got an error and went Polling. For some reason, in each one of these cases, after an hour or two all tape mounts start hanging, even those belonging to the virtual tape library. When we did a 'q mount', they all showed up in Reserved status. So before long all backups going to the virtual tape library ground to a halt.

Can any of you see a reason why the TSM library master should get into such a problem? Shouldn't all tape mounts be asynchronous? It seems to me that a single tape drive getting into problems should not keep all other mounts from proceeding. And it doesn't happen instantly. It seems to happen gradually.

I probably should also mention that this is a fairly busy environment during the night.
It isn't unusual for us to have over 100 virtual tape mounts simultaneously. That is the reason we needed the LIBSHRTIMEOUT parameter (mentioned above). Before we had that, we would sometimes get timeouts that caused tape mount failures, because the TSM library master's queue of tape mount polls would get overrun. Since we put on that patch and added 'LIBSHRTIMEOUT 60' to the options file, that problem has gone away. But now this problem seems to have taken its place.

Best Regards,

John D. Schneider
Lead Systems Administrator - Storage
Sisters of Mercy Health Systems
3637 South Geyer Road
St. Louis, MO 63127
Phone: 314-364-3150
Cell: 314-750-8721
Email: [EMAIL PROTECTED]