Re: Looking for SAN/tape experts assistance
George,

Thanks for responding with your expertise and obvious hard work on addressing these shortcomings. Yes, I would be interested in seeing what you have written. You can contact me off-list.

Zoltan Forray
TSM Software Hardware Administrator
Virginia Commonwealth University
UCC/Office of Technology Services
zfor...@vcu.edu - 804-828-4807
Don't be a phishing victim - VCU and other reputable organizations will never use email to request that you reply with your password, social security number or confidential personal information. For more details visit http://infosecurity.vcu.edu/phishing.html

From: giblackwood tsm-fo...@backupcentral.com
To: ADSM-L@VM.MARIST.EDU
Date: 10/18/2010 04:33 PM
Subject: [ADSM-L] Looking for SAN/tape experts assistance
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU

Mr Forray,

I know a lot about this problem you are dealing with. My name is George Blackwood. I was a Systems Engineer with IBM for 30 years. Among other things, I was a SAN, tape, and TSM specialist. I have been retired for 2 years, 1 month. I have my own consulting business doing what I did when I was an IBMer.

When Linux is rebooted (RedHat, SLES, whatever), it scans and re-discovers its SCSI and FCP (Fibre Channel Protocol) tape resources without regard to what it knew about those same devices before the reboot (this is not the case with some UNIX systems). So, unless you have one changer and one tape drive, you have no guarantee that the Linux device numbers will be the same after a reboot. Chances are IBMtape0 will be IBMtape20 the next time you reboot.

IBM's answer is to set SANDISCOVERY ON. This works sometimes for a small number of drives (under 20), and will sometimes work for more. But after 18 months of being in and out of IBM PMRs and CritSits, I have given up on SANDISCOVERY as a fix. I wrote a BASH script to fix this issue instead.

A current customer of mine has 8 RedHat Linux servers sharing 12 TSM instances (we can move them around as need be). Two instances are Library Managers. All instances have access to 4 EMC EDLs. Each EDL has 80 drives. So that comes to 3840 drive paths (12 instances x 4 EDLs x 80 drives), plus 4 Library paths to maintain.

The script I wrote discovers which TSM instances (Library Servers and Clients) are running on a given Linux server that has just been rebooted. It compensates for any drives that may be mounted, or any Libraries that are in use, and re-defines all the Library and drive paths for every TSM instance on that Linux server. So if one of the 8 servers needs to be rebooted, the script is run on that server after the reboot. There is no need to unmount and quiesce Libraries. The only requirement is that the Library Managers must be up. The script will also find which drives are in a SCSI reserve lockout. And it is safe to run during full production time.

I can give you a few pointers to write a similar script (for free), or for a fee, write it for you. I guarantee my work.

George Blackwood
Blackwood Data Protection Consulting, LLC
785-218-9961
georgeblackw...@sunflower.com

+--
|This was sent by georgeblackw...@sunflower.com via Backup Central.
|Forward SPAM to ab...@backupcentral.com.
+--
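George's script is not shown in the thread, so the following is only a rough sketch of one piece of the idea: turning a fresh discovery of drive serial numbers into TSM UPDATE PATH commands. The two input file formats, the function name gen_update_paths, and the server and library names are all invented for illustration; handling of mounted drives, in-use libraries and SCSI reserves is not attempted here.

```shell
#!/bin/sh
# Hypothetical sketch, NOT George's script.
# Assumed inputs (formats invented for this example):
#   inventory file : "<tsm_drive_name> <serial>" per line
#   discovery file : "<serial> <device_special_file>" per line,
#                    collected after the reboot by whatever means you trust
# Output: one UPDATE PATH command per drive, suitable for review before
# feeding to the TSM administrative client.
gen_update_paths() {
    server=$1
    library=$2
    inventory=$3
    discovered=$4
    while read -r drive serial; do
        # Skip blank lines in the inventory.
        [ -n "$drive" ] || continue
        # Look up the device file that currently carries this serial number.
        device=$(awk -v s="$serial" '$1 == s { print $2; exit }' "$discovered")
        if [ -z "$device" ]; then
            echo "# WARNING: no device found for $drive (serial $serial)"
            continue
        fi
        echo "UPDATE PATH $server $drive SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=$library DEVICE=$device ONLINE=YES"
    done < "$inventory"
}
```

Calling gen_update_paths TSM1 EDL1 inventory.txt discovered.txt prints the commands rather than running them, so the output can be checked by eye before anything touches a production Library Manager.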
Re: Looking for SAN/tape experts assistance
The zoning process simply associates an HBA port on the server with a port on the target device. Persistent binding is a function of the OS and HBA drivers on the server. Within the server configuration, the HBA must be told that a device with a particular ID (e.g. /dev/rmt1) is always to be associated with a physical device with a specific ID (e.g. a WWPN). This is typically done through a configuration file that manages the HBA configuration.

On Solaris with an Emulex HBA, the file /kernel/drv/lpfc.conf lets you manage persistent bindings by associating a specific WWPN or WWNN with a specific SCSI ID, e.g.

    fcp-bind-WWPN="500a098386f7d4f3:lpfc0t0";

You also need to ensure that automatic reconfiguration is NOT set. Automatic reconfiguration can be particularly vexing in a Fibre Channel loop environment, where device contention may cause indeterminate delays with multiple target devices (tape drives) attached to a single initiator (server HBA).

Cheers,
Neil Strand
Storage Engineer - Legg Mason
Baltimore, MD.
(410) 580-7491
Whatever you can do or believe you can, begin it. Boldness has genius, power and magic.

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ads...@vm.marist.edu] On Behalf Of Hart, Charles A
Sent: Monday, October 18, 2010 4:58 PM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Looking for SAN/tape experts assistance

This may be a dumb response, but this behavior is similar in Windows and/or Solaris. I thought that if the person who zoned the device enabled persistent binding, these devices would not re-order on boot as the server scans the FC. Did I completely miss it?
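For anyone with more than a handful of targets, the lpfc.conf binding Neil describes could be generated rather than typed by hand. This is a minimal sketch under assumptions: the function name gen_lpfc_bindings and the "WWPN target-number" input format are made up here, and the multi-entry layout (quoted entries separated by commas, closed with a semicolon) should be checked against the Emulex driver documentation for your release before use.

```shell
#!/bin/sh
# Hypothetical helper: reads "<wwpn> <target_number>" pairs on stdin and
# prints a single fcp-bind-WWPN property for the given lpfc instance.
# The single-entry output matches the example in Neil's message.
gen_lpfc_bindings() {
    instance=$1    # e.g. lpfc0
    awk -v inst="$instance" '
        NF == 2 { entries[n++] = sprintf("\"%s:%st%s\"", $1, inst, $2) }
        END {
            if (n == 0) exit
            printf "fcp-bind-WWPN=%s", entries[0]
            # Continuation entries are indented under the first one.
            for (i = 1; i < n; i++) printf ",\n              %s", entries[i]
            print ";"
        }'
}
```

For example, printf '500a098386f7d4f3 0\n' | gen_lpfc_bindings lpfc0 reproduces the line shown in the message above.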
Re: Looking for SAN/tape experts assistance
Thx for the clarification; it's been a while (we use AIX, so the ODM takes care of it all). To the earlier point: FC devices changing order can be remediated fairly easily.

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ads...@vm.marist.edu] On Behalf Of Strand, Neil B.
Sent: Tuesday, October 19, 2010 8:12 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Looking for SAN/tape experts assistance
Re: Looking for SAN/tape experts assistance
Hi George,

The udev config files used by lin_tape allow for tying specific device names to attributes of the tape drives, such as their serial number or perhaps LUN number.

I was a bit surprised to see that all EMC EDL devices assigned to a given path are presented on the same WWNN, differing only by LUN. Multiple connections to a VTL are common. From what I've seen, only one control path is typically enabled, and this takes LUN 0. LUN 0 is not reserved on any other paths. So moving a control device from one path to another causes all the LUNs to change.

I don't know what incentive Tivoli has to make lin_tape behave better with an EDL pretending (badly) to be a 3584, but it does look like they've at least left a breadcrumb trail to allow us to do it ourselves.

Does your mechanism for resolving the SCSI reserve problem cover the case where the TSM instances are running on different machines?

Thanks,
[RC]
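As a rough illustration of tying names to serial numbers, a small generator could emit one udev rule per drive. This is a sketch under assumptions: it borrows the SYSFS{serial_num} match and the lin_tape/ symlink directory from the rules file Sergio posted in this thread, assumes lin_tape exposes serial_num for tape drives as it does for changers, and uses the older SYSFS{} key (newer udev releases spell it ATTR{}). The function name and the input format are invented.

```shell
#!/bin/sh
# Hypothetical generator: reads "<serial> <symlink_name>" pairs on stdin
# and prints one udev rule per pair. Review the output before installing
# it under /etc/udev/rules.d.
gen_serial_rules() {
    owner=$1    # instance owner, e.g. tsminst1
    while read -r serial name; do
        # Skip blank lines.
        [ -n "$serial" ] || continue
        printf 'KERNEL=="IBMtape*[0-9]", SYSFS{serial_num}=="%s", OWNER="%s", MODE="0600", SYMLINK="lin_tape/%s"\n' \
            "$serial" "$owner" "$name"
    done
}
```

Keying on the serial number rather than the LUN sidesteps the renumbering described above when a control path moves, since the serial stays with the (virtual) drive.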
Re: Looking for SAN/tape experts assistance
This may be a dumb response, but this behavior is similar in Windows and/or Solaris. I thought that if the person who zoned the device enabled persistent binding, these devices would not re-order on boot as the server scans the FC. Did I completely miss it?
Re: Looking for SAN/tape experts assistance
Sergio,

Thanks for the experience/guidance/details. Looks like we may be heading in this direction, although we aren't sure why.

To update our situation: after some digging and pure luck, I discovered one of the fibre connections had gone amber. Not sure why, since when we cable a server to the switch we always make sure it is green. Working with the SAN person (tried cable swapping, etc.), he discovered that the switch port was set to 1Gb instead of auto. He reconfigured it to auto. I rebooted the server 5 times, and after the 2nd reboot the /dev/IBMtape values seemed to stay the same after each reboot.

We thought everything was smooth sailing until last night ( http://www.wtvr.com/news/wtvr-downtown-smoke-power-grid-story,0,3680878.story ). After we got things back up and running today, not only did the /dev/IBMtape mappings change on this server, they also changed on my one production 6.1 server, which in the past had been steady. The four production 5.5 servers did not change.

So, we are sort of back to square one. I am beginning to wonder if it has something to do with the OS. All of the 5.5 servers are running RH4 or older RH5 kernels (-92 vs -128/-194).

Did you finish testing your configuration/accessing the drives?

I am curious - are other folks out there who run Linux TSM servers creating these rules?

Zoltan Forray
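One way to confirm whether mappings really moved across a reboot is to snapshot them before and after and compare. The sketch below assumes a snapshot file with one "<serial> <device>" pair per line; how that file is collected is left open, and the function name changed_mappings is invented for this example.

```shell
#!/bin/sh
# Hypothetical check: compare two "<serial> <device>" snapshots and report
# every serial whose device name differs, plus serials that are new.
changed_mappings() {
    before=$1
    after=$2
    awk '
        # First file: remember the old device for each serial.
        FILENAME == ARGV[1] { old[$1] = $2; next }
        # Second file: report moves and newcomers.
        ($1 in old) && old[$1] != $2 { printf "%s moved: %s -> %s\n", $1, old[$1], $2 }
        !($1 in old)                 { printf "%s is new: %s\n", $1, $2 }
    ' "$before" "$after"
}
```

An empty result means every serial kept its device name; any output is a list of paths that would need attention in TSM.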
Re: Looking for SAN/tape experts assistance
Well, I've gotten a little further along. The drives and paths have been added to TSM, and TSM is reporting all the information correctly (SN, supported RW types, WWNs, etc.). I still haven't tried opening the drives and reading/writing tapes (next step).

The init script I use to load lin_tape is as follows. (This is actually just a basic rc script that we have to use for customized applications. lin_tape has its own /etc/init.d/lin_tape startup script, but I have opted not to use it for now.)

    [ -x /etc/init.d/functions ] && . /etc/init.d/functions
    echo -n "Starting lin_tape: "
    modprobe lin_tape
    RETVAL=$?
    # Start the daemon only if the module loaded, and keep its exit status.
    if [ $RETVAL -eq 0 ] && [ -x /usr/bin/lin_taped ]; then
        /usr/bin/lin_taped start
        RETVAL=$?
    fi
    echo -n "Binding IBMtape and IBMchanger devices: "
    # Re-run the udev rules now that lin_taped is up.
    [ $RETVAL -eq 0 ] && /sbin/udevtrigger
    echo

Notice that udevtrigger is in there. I don't believe that's in the lin_tape init script by default.

We've rebooted the server several times and applied kernel updates regularly, and nothing has been able to break the lin_tape functionality, so that's a plus (knock on wood). We're on a pretty recent kernel though, so that must be helping some:

    $ uname -r
    2.6.18-194.11.4.el5

lin_tape version:

    # modinfo lin_tape
    filename:       /lib/modules/2.6.18-194.11.4.el5/kernel/drivers/scsi/lin_tape.ko
    version:        1.41.1
    license:        GPL
    description:    IBM Linux SCSI Tape Device Driver for IBM Tape Devices
    author:         IBM Corporation
    srcversion:     CA19B1A253F80F44D925789
    depends:        scsi_mod
    vermagic:       2.6.18-194.3.1.el5 SMP mod_unload gcc-4.1

SF

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ads...@vm.marist.edu] On Behalf Of Zoltan Forray/AC/VCU
Sent: Friday, October 08, 2010 12:19 PM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Looking for SAN/tape experts assistance
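After an init sequence like the one above has run, it may be worth checking that every expected persistent name actually resolved. This is a minimal sketch, not part of Sergio's setup: the function name check_links is invented, and the list of names to check is whatever your own rules file is supposed to create.

```shell
#!/bin/sh
# Hypothetical post-boot check: for each expected name under a directory
# (e.g. /dev/lin_tape), report whether it exists and what it points to.
# Returns non-zero if any name is missing or dangling.
check_links() {
    dir=$1
    shift
    rc=0
    for name in "$@"; do
        # -e follows symlinks, so a dangling link counts as missing.
        if [ -e "$dir/$name" ]; then
            echo "ok      $name -> $(readlink "$dir/$name")"
        else
            echo "MISSING $name"
            rc=1
        fi
    done
    return $rc
}
```

Something like check_links /dev/lin_tape IBMtape137 IBMtape138 IBMtape139 could run at the end of the init script, so a failed binding is noticed before TSM tries to use the paths.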
Re: Looking for SAN/tape experts assistance
Thanks for the info and examples. However, I am at a loss to understand why I need this now, especially when two identical (well, I guess something is different ;--) servers are acting differently. I have never had to do this with *ANY* other of my now seven servers. Unless hardware changed, a reboot would not change the order of the tape drives.

Zoltan Forray

From: Sergio O. Fuentes sfuen...@umd.edu
To: ADSM-L@VM.MARIST.EDU
Date: 09/27/2010 01:43 PM
Subject: Re: [ADSM-L] Looking for SAN/tape experts assistance
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU

I'm doing this work right now for a big project here. My first endeavor into Linux. The lin_tape drivers for 6.2 will require a .rules file in /etc/udev/rules.d (or wherever your udev stuff lives), mainly because of the instance owner/group requirements to run 6.2 dsmserv processes. Unless you can alter your default udev rules for EVERYTHING, you'll need the .rules file to assign ownership and mode parameters for the tape devices. Mine, so far, looks like this:

    # cat /etc/udev/rules.d/98-lin_tape.rules
    KERNEL=="IBMchanger*", SYSFS{primary_path}=="Primary", SYSFS{serial_num}=="078150090402", OWNER="tsminst1", MODE="0600", SYMLINK="lin_tape/IBMchanger137B"
    KERNEL=="IBMchanger*", SYSFS{primary_path}=="Alternate", SYSFS{serial_num}=="078150090402", OWNER="tsminst1", MODE="0600", SYMLINK="lin_tape/IBMchanger138A"
    KERNEL=="IBMtape*[0-9]", SYSFS{ww_port_name}=="0x5005076300549127", OWNER="tsminst1", MODE="0600", SYMLINK="lin_tape/IBMtape137"
    KERNEL=="IBMtape*[0-9]", SYSFS{ww_port_name}=="0x5005076300549128", OWNER="tsminst1", MODE="0600", SYMLINK="lin_tape/IBMtape138"
    KERNEL=="IBMtape*[0-9]", SYSFS{ww_port_name}=="0x5005076300549129", OWNER="tsminst1", MODE="0600", SYMLINK="lin_tape/IBMtape139"
    KERNEL=="IBMtape*n", OWNER="tsminst1", MODE="0600"

There are a lot of gotchas with this method that I'm running into. I'm not sure if they are kernel bugs or driver issues, but not much of this is documented anywhere. Bullet list (so far):

- If you have alternate pathing or data path failover, lin_taped needs to be installed and running. The problem is getting persistent binding to work with this. There's a race condition: once modprobe lin_tape is run, the udev device files are created according to the rules. But the SYSFS{primary_path} key isn't defined correctly until lin_taped is run, BUT lin_taped can't run until lin_tape is loaded. So by the time lin_taped is executed and running, the lin_tape rules have already been processed by udev.
  o My workaround will be to create an init script that runs lin_taped and then udevtrigger. Seems to work, but udevtrigger once crashed the system.
- Sometimes when lin_tape is loaded, the mode is incorrect for devices. The fix is again udevtrigger.
- KERNEL=="IBMtape*" doesn't work for renaming, because sometimes a symlink to IBMtape1n is used instead of IBMtape1. That is why I have the character class IBMtape*[0-9].

Here's the output of the ls /dev/ commands for when I believe things are configured correctly. Caveat: I haven't even tested reading/writing to these devices yet, let alone defining the devices to TSM.

    # ls -l /dev/IBMtape*
    crw-r--r-- 1 root     root 250, 3071 Sep 27 11:43 /dev/IBMtape
    crw------- 1 tsminst1 root 250,    0 Sep 27 11:43 /dev/IBMtape0
    crw------- 1 tsminst1 root 250, 1024 Sep 27 11:43 /dev/IBMtape0n
    crw------- 1 tsminst1 root 250,    1 Sep 27 11:43 /dev/IBMtape1
    crw------- 1 tsminst1 root 250, 1025 Sep 27 11:43 /dev/IBMtape1n
    crw------- 1 tsminst1 root 250,    2 Sep 27 11:43 /dev/IBMtape2
    crw------- 1 tsminst1 root 250, 1026 Sep 27 11:43 /dev/IBMtape2n

    # ls -l /dev/lin_tape
    total 0
    lrwxrwxrwx 1 root root 14 Sep 27 11:43 IBMchanger137B -> ../IBMchanger0
    lrwxrwxrwx 1 root root 14 Sep 27 11:43 IBMchanger138A -> ../IBMchanger1
    lrwxrwxrwx 1 root root 11 Sep 27 11:43 IBMtape137 -> ../IBMtape2
    lrwxrwxrwx 1 root root 11 Sep 27 11:43 IBMtape138 -> ../IBMtape0
    lrwxrwxrwx 1 root root 11 Sep 27 11:43 IBMtape139 -> ../IBMtape1

HTH,
Sergio

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ads...@vm.marist.edu] On Behalf Of Zoltan Forray/AC/VCU
Sent: Wednesday, September 22, 2010 2:53 PM
To: ADSM-L@VM.MARIST.EDU
Subject: [ADSM-L] Looking for SAN/tape experts assistance

I have mentioned in previous posts that we are putting up two new RH Linux-based TSM servers. These are the first of my existing five Linux servers to use EMC SAN storage. With every new adventure, we get new problems. This one is driving everyone crazy and hope
Re: Looking for SAN/tape experts assistance
I'm doing this work right now for a big project here. My first endeavor into Linux. The lin_tape drivers for 6.2 will require a .rules file in /etc/udev/rules.d (or wherever your udev stuff lives), mainly because of the instance owner/group requirements to run 6.2 dsmserv processes. Unless you can alter your default udev rules for EVERYTHING, you'll need the .rules file to assign ownership and mode parameters for the tape devices. Mine, so far, looks like this:

# cat /etc/udev/rules.d/98-lin_tape.rules
KERNEL=="IBMchanger*", SYSFS{primary_path}=="Primary", SYSFS{serial_num}=="078150090402", OWNER="tsminst1", MODE="0600", SYMLINK="lin_tape/IBMchanger137B"
KERNEL=="IBMchanger*", SYSFS{primary_path}=="Alternate", SYSFS{serial_num}=="078150090402", OWNER="tsminst1", MODE="0600", SYMLINK="lin_tape/IBMchanger138A"
KERNEL=="IBMtape*[0-9]", SYSFS{ww_port_name}=="0x5005076300549127", OWNER="tsminst1", MODE="0600", SYMLINK="lin_tape/IBMtape137"
KERNEL=="IBMtape*[0-9]", SYSFS{ww_port_name}=="0x5005076300549128", OWNER="tsminst1", MODE="0600", SYMLINK="lin_tape/IBMtape138"
KERNEL=="IBMtape*[0-9]", SYSFS{ww_port_name}=="0x5005076300549129", OWNER="tsminst1", MODE="0600", SYMLINK="lin_tape/IBMtape139"
KERNEL=="IBMtape*n", OWNER="tsminst1", MODE="0600"

There are a lot of gotchas with this method that I'm running into. I'm not sure if they are kernel bugs or driver issues, but not much of this is documented anywhere. Bullet-list (so far):

- If you have alternate pathing or data path failover, lin_taped needs to be installed and running. The problem is getting persistent binding to work with this. There's a race condition: once modprobe lin_tape is run, the udev files are created with the rules. But the SYSFS{primary_path} key isn't defined correctly until lin_taped is run, BUT lin_taped can't run until lin_tape is loaded. So by the time lin_taped is executed and running, the lin_tape rules have already been processed for udev.
  o My workaround will be to create an init script that will run lin_taped and then udevtrigger. Seems to work, but udevtrigger once crashed the system.
- Sometimes when lin_tape is loaded, the mode is incorrect for devices. The fix is again udevtrigger.
- KERNEL=="IBMtape*" doesn't work for renaming, because sometimes a symlink to IBMtape1n is used instead of IBMtape1. Which is why I have the character class IBMtape*[0-9].

Here's the output of the ls /dev/ commands for when I believe things are configured correctly. Caveat: I haven't even tested reading/writing to these devices yet, let alone defining the devices to TSM.

# ls -l /dev/IBMtape*
crw-r--r-- 1 root     root 250, 3071 Sep 27 11:43 /dev/IBMtape
crw------- 1 tsminst1 root 250,    0 Sep 27 11:43 /dev/IBMtape0
crw------- 1 tsminst1 root 250, 1024 Sep 27 11:43 /dev/IBMtape0n
crw------- 1 tsminst1 root 250,    1 Sep 27 11:43 /dev/IBMtape1
crw------- 1 tsminst1 root 250, 1025 Sep 27 11:43 /dev/IBMtape1n
crw------- 1 tsminst1 root 250,    2 Sep 27 11:43 /dev/IBMtape2
crw------- 1 tsminst1 root 250, 1026 Sep 27 11:43 /dev/IBMtape2n

# ls -l /dev/lin_tape
total 0
lrwxrwxrwx 1 root root 14 Sep 27 11:43 IBMchanger137B -> ../IBMchanger0
lrwxrwxrwx 1 root root 14 Sep 27 11:43 IBMchanger138A -> ../IBMchanger1
lrwxrwxrwx 1 root root 11 Sep 27 11:43 IBMtape137 -> ../IBMtape2
lrwxrwxrwx 1 root root 11 Sep 27 11:43 IBMtape138 -> ../IBMtape0
lrwxrwxrwx 1 root root 11 Sep 27 11:43 IBMtape139 -> ../IBMtape1

HTH,
Sergio

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ads...@vm.marist.edu] On Behalf Of Zoltan Forray/AC/VCU
Sent: Wednesday, September 22, 2010 2:53 PM
To: ADSM-L@VM.MARIST.EDU
Subject: [ADSM-L] Looking for SAN/tape experts assistance

I have mentioned in previous posts that we are putting up 2 new RH Linux based TSM servers. These are the first of my existing 5 Linux servers to use EMC SAN storage. With every new adventure, we get new problems. This one is driving everyone crazy and we hope someone out there can point us in the right direction. We have seen posts in ADSM-L that sorta talk about it, but nothing that explains what is going on with us or how to resolve it.

Both new servers have been configured identically when it comes to the OS (RedHat Linux 5.5, kernel 2.6.18-194.11.3.el5) and other hardware supporting software (EMC Powerpath and IBM lin_tape drivers - 1.41.1 for the TS1120/1130 drives).

The problem is this. Every time we reboot one of the new servers, the values in /proc/scsi/IBMtape are different in the assignment of /dev numbers to the drives. It seems to find the tape drives in a different order each time. Neither my 5 production servers nor the other new TSM server have this problem (I have rebooted the 2nd new server 4 times and the /dev/IBMtape? values stay the same).

When looking through the fixlist for lin_tape
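A side note on the rules file in Sergio's message: with more than a handful of drives, typing one rule per WWPN by hand gets error-prone. Here is a minimal Python sketch that generates the per-drive lines. It assumes the RHEL 5-era SYSFS{...} match key used in the message (newer udev releases use ATTR{...} instead) and that udev wants every value in double quotes. The names tape_rule and DRIVES are made up for illustration; the WWPNs, owner, mode and symlink names are copied from the message.

```python
# Sketch only: builds udev rule lines in the style of Sergio's 98-lin_tape.rules.
# tape_rule and DRIVES are hypothetical names, not part of lin_tape or udev.

def tape_rule(wwpn, symlink, owner="tsminst1", mode="0600"):
    """Return one udev rule that binds a drive's WWPN to a stable symlink."""
    return (
        f'KERNEL=="IBMtape*[0-9]", SYSFS{{ww_port_name}}=="{wwpn}", '
        f'OWNER="{owner}", MODE="{mode}", SYMLINK="lin_tape/{symlink}"'
    )

# WWPN -> symlink name, taken from the rules quoted in the thread.
DRIVES = {
    "0x5005076300549127": "IBMtape137",
    "0x5005076300549128": "IBMtape138",
    "0x5005076300549129": "IBMtape139",
}

RULES = [tape_rule(wwpn, name) for wwpn, name in DRIVES.items()]

# Print the generated lines so they can be reviewed before being copied
# into a file under /etc/udev/rules.d by hand.
print("\n".join(RULES))
```

The output is meant to be reviewed and pasted in manually; the sketch deliberately does not write to /etc/udev/rules.d or reload udev.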
Re: Looking for SAN/tape experts assistance
Zoltan,

Look at how udev config and rules are configured. It may be that the PowerPath installation affected the configuration.

Cheers,
Neil Strand
Storage Engineer - Legg Mason
Baltimore, MD.
(410) 580-7491
Whatever you can do or believe you can, begin it. Boldness has genius, power and magic.

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ads...@vm.marist.edu] On Behalf Of Zoltan Forray/AC/VCU
Sent: Wednesday, September 22, 2010 2:53 PM
To: ADSM-L@VM.MARIST.EDU
Subject: [ADSM-L] Looking for SAN/tape experts assistance

I have mentioned in previous posts that we are putting up 2 new RH Linux based TSM servers. These are the first of my existing 5 Linux servers to use EMC SAN storage. With every new adventure, we get new problems. This one is driving everyone crazy and we hope someone out there can point us in the right direction. We have seen posts in ADSM-L that sorta talk about it, but nothing that explains what is going on with us or how to resolve it.

Both new servers have been configured identically when it comes to the OS (RedHat Linux 5.5, kernel 2.6.18-194.11.3.el5) and other hardware supporting software (EMC Powerpath and IBM lin_tape drivers - 1.41.1 for the TS1120/1130 drives).

The problem is this. Every time we reboot one of the new servers, the values in /proc/scsi/IBMtape are different in the assignment of /dev numbers to the drives. It seems to find the tape drives in a different order each time. Neither my 5 production servers nor the other new TSM server have this problem (I have rebooted the 2nd new server 4 times and the /dev/IBMtape? values stay the same).

When looking through the fixlist for lin_tape (usually engineering-speak), we saw this interesting entry at the 1.37 level: "Removed persistent naming script in favor of new method". Questions come to mind, like: what naming script? What new method? Could this possibly be related to what we are experiencing? We have spent all day trying to figure this wrinkle out.
Any suggestions are greatly appreciated.

Zoltan Forray
TSM Software Hardware Administrator
Virginia Commonwealth University
UCC/Office of Technology Services
zfor...@vcu.edu - 804-828-4807
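To pin down the reordering described in the original post, it may help to record which physical drive sits behind each /dev/IBMtapeN before and after a reboot and compare the two. The Python sketch below only does the comparison. How the snapshots are collected (from /proc/scsi/IBMtape, sysfs, or elsewhere) is left out on purpose, because the exact format depends on the lin_tape level. The function name moved_drives and the AFTER mapping are hypothetical; the WWPNs and the BEFORE device names come from Sergio's listing in this thread.

```python
# Sketch only: compares two WWPN -> device-name snapshots.
# moved_drives and the AFTER data are illustrative assumptions.

def moved_drives(before, after):
    """Return {wwpn: (old_dev, new_dev)} for drives whose device name changed.

    new_dev is None when a drive seen before the reboot is missing afterwards.
    """
    moved = {}
    for wwpn, old_dev in before.items():
        new_dev = after.get(wwpn)
        if new_dev != old_dev:
            moved[wwpn] = (old_dev, new_dev)
    return moved

# Snapshot matching the symlink listing posted in the thread.
BEFORE = {
    "0x5005076300549127": "IBMtape2",
    "0x5005076300549128": "IBMtape0",
    "0x5005076300549129": "IBMtape1",
}

# Hypothetical snapshot after a reboot in which two drives swapped numbers.
AFTER = {
    "0x5005076300549127": "IBMtape0",
    "0x5005076300549128": "IBMtape2",
    "0x5005076300549129": "IBMtape1",
}

for wwpn, (old_dev, new_dev) in sorted(moved_drives(BEFORE, AFTER).items()):
    print(f"{wwpn}: {old_dev} -> {new_dev}")
```

Running the same comparison on the server that keeps its numbering should report nothing, which gives a concrete difference to chase between the two "identical" machines.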
Re: Looking for SAN/tape experts assistance
There is a by-id section in the IBM Tape device driver installation and user guide (for LTO). Looks like it hooks into the RHEL udev stuff. I would personally try to find the same guide for TS1120 and send you a link, but I'm fighting an EDL problem at the moment.

Thanks,
[RC]
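Whether the persistent names come from a by-id scheme like the one mentioned above or from custom symlinks, a quick read-only check after each reboot shows which kernel device each stable name currently points at. A small Python sketch follows. The default directory /dev/lin_tape is the symlink directory from Sergio's rules earlier in the thread, not a path the IBM guide is known to use, and link_targets is a name invented here.

```python
import os

# Sketch only: lists symlinks in a directory and the device each one resolves to.
# link_targets is a hypothetical helper; it never modifies anything.

def link_targets(directory="/dev/lin_tape"):
    """Return {symlink_name: target_basename} for every symlink in directory."""
    result = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.islink(path):
            # readlink returns the raw target, e.g. "../IBMtape2"
            result[name] = os.path.basename(os.readlink(path))
    return result

if __name__ == "__main__":
    # Only report when the directory exists on this host.
    if os.path.isdir("/dev/lin_tape"):
        for name, dev in link_targets().items():
            print(f"{name} -> {dev}")
```

Saving this output before and after a reboot gives a simple record of whether the stable names survived even when the IBMtapeN numbers did not.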