WOW - Deja Vu - We had these same problems on our 3494:
*1) The case of the frayed ribbon cable.*
*2) The case of the mysterious gripper failures.*

So, even though the 3584 is blazing fast, the basic concept/structure/problems haven't changed radically!

We have just hit the 1-year mark with our 3584. All TSM servers that access it are RH Linux, so lin_tape is the driver. We have had a couple of service calls throughout the year, but nothing like with the 3494. Some were drive problems. One was a known firmware problem.

Our 3584 is 7 frames (1 L23 + 6 D23) with 17 TS1130 E06 drives. We did not feel comfortable with the High-Density approach (i.e. stack tapes up to 4-deep horizontally and then play "3-card monte" when you need to get to the 4th deep tape). It is currently 63% full, with over 1,000 JA tapes in the mix. If we needed to reduce the number of tapes in the library, we would replace those with JB tapes and get a 30%+ boost in tape capacity.

On Mon, May 5, 2014 at 9:09 AM, Rhodes, Richard L. <rrho...@firstenergycorp.com> wrote:

> >We have been using a 3584 for about 12 years and have had
> >no issues at all with it. The only time it has been "down"
> >is for firmware upgrades, replacing tape drives (upgrade from
> >LTO2 to LTO4), and when we moved to our new datacenter. Very
> >stable and a great workhorse.
>
> I generally agree with this. We love our 3584s (we have two).
> They have been very good workhorses.
> BUT, we have gone through some very frustrating problems with them!
>
> 1) The case of the frayed ribbon cable.
>
> One of the libraries had the ribbon cable that connects
> the library proper to the robot fray, which caused a short, which took out
> several cards. It took IBM well over 30 hours to resolve.
> I think we had 3 or 4 IBM'ers onsite trying to figure
> this problem out. They wouldn't just order a bunch of parts.
> They insisted on ordering parts one at a time as they
> decided to replace them. The parts are all far away, causing
> many, many hours of waiting.
>
> 2) The case of the mysterious gripper failures.
>
> The robot would get stuck with a tape suspended
> between the robot gripper and the drive mouth. The tape
> cartridge pinned the robot. Both libraries were doing this.
> It got so bad the library would fail several times per day.
> Many grippers were exchanged; it would work well for a while,
> then go back to failing. Long story short: the cartridge slots
> that line the walls of the library, as cartridges were
> inserted/removed, caused a powder (a light dust) to get on
> everything in the library, causing gripper failure.
> IBM had to replace all the plastic slot things in both libraries.
> This finally resolved the problem.
>
> 3) The case of the slow console
>
> Others have said this. There are certain options where it can
> go away for what seems like forever. One thing I do
> once in a while is remove old cleaning cartridges. If I
> get on auto-pilot and start hitting the menu items
> without thinking, I will
> inevitably hit this one item that requests something about all
> cartridges . . . it goes away for what seems like forever
> getting that list.
>
> 4) The case of the Web console weirdness
>
> The web console is simple to use and generally is great, but
> some functions simply don't work well. For example, requesting
> a tape to be moved to a specific element address may or may not
> work. We've never been able to figure out why it works sometimes,
> and not others.
>
> Drive firmware upgrades can do flaky things. We have 50 drives
> in each 3584. When I've performed a drive firmware upgrade
> on all drives, I can count on some number of drives that fail
> the upgrade. Sometimes it's all the drives in a frame that fail.
> Those drives then have to be upgraded one at a time.
> (Drive firmware upgrade options via the web console are
> all drives at once, or one at a time.) Sometimes,
> out of 50 drives, a third will fail the upgrade.
> (This is doing the upgrade live, where you have the firmware
> activated on the next unmount.)
> I talked with the IBM folks about this, and the local CE thinks
> this is caused by some communications timeout in the lib.
> I opened a support case about this and got nowhere.
> Currently we have some old node cards requiring the older firmware.
> With a scheduled upgrade we are getting all Enhanced node cards.
> I'm hoping getting to the latest/greatest code resolves this.
>
> 5) The case of the useless dial home.
>
> Our libraries are set up for dial home when a problem comes up.
> Here we just shake our heads and sigh . . .
> Sometimes it will dial home on something as simple as an I/O
> error writing to a tape, but sometimes won't dial home if the robot hangs.
> It's almost a joke between us and the local CEs as to
> what/why/when it dials home, or not. No one can make sense of it.
>
> 6) The case of the mysterious failing drive in frame 1 slot 12.
>
> One of our libraries has an ongoing problem with one particular drive,
> the drive in frame 1 drive slot 12. This particular drive will fail
> any time it is powered off. It goes into some weird unknown
> state that requires the drive to be replaced. Yes, that drive has
> been replaced many, many, many times over the years. Firmware upgrade
> that requires the drive to be power cycled to activate the code?
> It fails and needs to be replaced. Get a SCSI reservation problem that
> requires the drive to be power cycled? It fails and needs to be replaced.
> If the library has to be powered off/on (IBM doing some upgrade or
> something), the drive fails and gets replaced. You would think that
> after all this time IBM would figure out what is wrong - nope, they
> have no idea!
>
> 7) Atape - the mystery of who within IBM owns it!
>
> We all use atape on our hosts for the tape lib/drive driver.
> If you ever suspect/have a problem with it, you will get nowhere in
> trying to get support from IBM. Open a case on the 3584?
> Nope, we don't support that - it's host software. Open a case with AIX?
> Nope, that's not an AIX piece of software. Open a case with TSM support?
> Nope, they have nothing to do with it.
>
>
> Now . . . as far as a 3494 vs 3584 . . .
>
> The 3584 is a SCSI library. It is designed around the SCSI standard
> for a tape library. This isn't bad or good, it's just different from
> how the 3494 works. Probably the biggest thing to get used to is how
> TSM (or any backup product) keeps an inventory of tape cartridges and the
> slots (element addresses) they are in. You never had to think about this
> for the 3494, since it was in charge of the slot the tapes were in.
> Just spend some time reading up on SCSI libraries to get familiar
> with them.
>
> Rick

--
*Zoltan Forray*
TSM Software & Hardware Administrator
BigBro / Hobbit / Xymon Administrator
Virginia Commonwealth University
UCC/Office of Technology Services
zfor...@vcu.edu - 804-828-4807