How are those of you who run TSM servers or storage agents on Linux on Intel faring with disruptions to SAN-attached tape devices, or to the SAN fabric itself?
In my current shop, we run TSM servers on AIX (and MVS, but that's another story), and we have storage agents on AIX, Windows, and Red Hat Linux on Intel. The Linux storage agents are relatively new; they were first deployed about two years ago. The AIX and Windows storage agents have been there a bit longer, although I can't say how much longer; I, too, have been there less than two years.

One problem we've never been able to overcome with our Linux storage agents: if a virtual tape library is rebooted, or if the SAN fabric gets massively unzoned (it happened to us about a month ago, sigh), the Linux storage agents don't notice the return of the SAN-attached tape devices until we reboot the Linux server. (We never had the Linux servers zoned to real 3584s and real LTO tape drives; they've only ever been zoned to EMC CLARiiON Disk Libraries, and then to Data Domains with VTL cards in them.) This has persisted across updates to lin_tape, CDL code levels, Data Domain code levels, and TSM storage agent levels. Needless to say, the application teams are rather steamed at us about this. We have at times had cases open simultaneously with EMC, Red Hat, and IBM, to no avail.

If you have Linux TSM servers or storage agents that gracefully recover from disruptions on your tape SAN, can you share with me (and the rest of the list, if you want) your RHEL level, device driver levels, HBA configuration, and whatever else you think might be relevant?

Thanks,
Nick
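For anyone comparing notes: before resorting to a reboot, the stock sysfs interfaces can sometimes coax the HBAs into rediscovering devices after a fabric event. This is a sketch, not something from the post above, and it assumes standard QLogic/Emulex-style FC HBAs exposing the usual /sys/class/fc_host and /sys/class/scsi_host entries (the DRY_RUN knob is my own addition so it can be exercised without root):

```shell
#!/bin/sh
# Hypothetical rescan helper: after a fabric disruption, ask each FC HBA
# to re-login to the fabric (issue_lip), then wildcard-rescan every SCSI
# host so returned tape devices get their device nodes back.
# Run DRY_RUN=1 to print the sysfs writes instead of performing them.
do_write() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "would write '$1' to $2"
    else
        echo "$1" > "$2"
    fi
}

# Force a loop initialization / fabric re-login on each FC host.
for f in /sys/class/fc_host/host*/issue_lip; do
    [ -e "$f" ] && do_write 1 "$f"
done

sleep 2   # give the fabric a moment to settle before rescanning

# Wildcard rescan: "- - -" means all channels, all targets, all LUNs.
for f in /sys/class/scsi_host/host*/scan; do
    [ -e "$f" ] && do_write "- - -" "$f"
done
```

The rescan-scsi-bus.sh script from the sg3_utils package wraps much the same logic with more corner-case handling. Whether any of this helps when lin_tape itself has lost track of the drives is exactly the open question, but it's cheaper to try than a reboot.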