Re: NDMP TOC (was Re: TSM 5.3 new goody)
Ben,

I've been doing TSM NDMP backups of a pair of clustered FAS940s since January, using TSM 5.2.4.1 and Data ONTAP 7.0.0.1. So far I've not really had any problems, only some minor things to work round. I have changed the backups from complete-filer jobs to individual volumes, each with its own TOC, which has sped up the backups and restores.

Regards,
Iain Barnetson
IT Systems Administrator
UKN Infrastructure Operations

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of Ben Bullock
Sent: 01 April 2005 00:41
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] NDMP TOC (was Re: TSM 5.3 new goody)

Just a quick update on the NDMP/TSM implementation we have been working on.

RECAP - We were attempting to get NDMP to work with TSM at the qtree level (as opposed to the whole volume) and get a TOC for the files. We had ~some~ qtrees succeed but many failed. We've spent the last month with IBM and NetApp, going through logs and traces, trying to figure out whose fault it was.

Resolution: For those of you using NDMP on NetApp servers, here is the bug we ran into:
__
Bug ID: 152072
Title: Tape backups are larger than expected or appear to loop in Phase V.
Description: If a data set has more than ~4,000 ACLs, then file data may be mistakenly written out in addition to NT ACL data during Phase V (NT ACLs) of dump. Symptoms of this bug include: data written taking up significantly more space than what is being dumped, or dump appearing to loop in Phase V. The behavior may be seen with dumps initiated from either the filer console or NDMP.
Fixed-In Version: Data ONTAP 7.0.0.1P2
__

The additional symptom that we bumped into is the inability to create a Table of Contents for the TSM session. I'm kinda surprised that nobody else has stumbled across this bug...

Ben

-----Original Message-----
From: bbullock
Sent: Tuesday, March 22, 2005 10:03 AM
To: 'ADSM: Dist Stor Manager'
Subject: RE: NDMP TOC (was Re: TSM 5.3 new goody)

Hmm, I haven't seen a good rule of thumb for the TOC. Perhaps others who are using it more can address this. In my limited testing, I sometimes get a TOC much larger than I would expect on qtrees with approximately the same number of files. Might it depend on whether there are ACLs for the NTFS qtrees? I'm not sure whether, in NDMP, that type of information is in the actual image file or in the TOC...

Ben

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of Curtis Stewart
Sent: Tuesday, March 22, 2005 9:47 AM
To: ADSM-L@VM.MARIST.EDU
Subject: NDMP TOC (was Re: TSM 5.3 new goody)

Anyone know how to estimate the size required for a TOC? I'm looking at NDMP for our filer, which has about 3 million files.

[EMAIL PROTECTED]
Re: another storage agent question
Uwe,

Thanks for your reply. I had already tried what you suggested. I was dead close, but I hadn't got my mgmtclass set up properly on my library client... what a silly mistake. It now works.

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of Uwe Schreiber
Sent: Friday, April 01, 2005 12:48 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] another storage agent question

Hi Chris,

I have this setup working in our AIX TSM environment.

Library Manager : AIX 5.2, TSM 5.2.3.5
Library Client  : AIX 5.2, TSM 5.2.3.5
LANfree Client  : Solaris 9, TSM 5.2.3.4 + Storage Agent 5.2.3.5 + TDP for R/3 3.3.12.0

You have to do the following things:

1. Set up the Storage Agent:

   dsmsta setstorageserver myname=storage_agent_node_name mypassword=storage_agent_password myhladdress=storage_agent_hl_address servername=library_client_name serverpassword=password_for_library_client hladdress=library_client_hl_address lladdress=library_client_ll_address

2. Define a server-to-server communication from the Storage Agent to the Library Manager via define server ..

   define server LANfree_Client_Name serverpass=LANfree_Client_Password hladdr=LANfree_Client_hladdress lladdr=LANfree_Client_lladdress

3. Define a server-to-server communication from the Storage Agent to the Library Client via define server ...

   define server LANfree_Client_Name serverpass=LANfree_Client_Password hladdr=LANfree_Client_hladdress lladdr=LANfree_Client_lladdress

4. Define the paths for the drives for the Storage Agent at the Library Manager via define path ...
Now you should be able to start the Storage Agent in the foreground at the LANfree Client and see at least one client session for the LANfree Client at the Library Client TSM server.

Regards
Uwe

[EMAIL PROTECTED]
Sent by: ADSM-L@VM.MARIST.EDU
31.03.2005 17:18
Please respond to ADSM-L@VM.MARIST.EDU
To: ADSM-L@VM.MARIST.EDU
Subject: another storage agent question

Hi All

Env = Win2k TSM Server and Storage Agent 5.2.2.3, 3584 mixed LTO1/LTO2 libraries, 1 TSM server as library manager, 1 TSM server as library client.

The 5.2 and 5.3 manuals on the storage agent seem to indicate that I should be able to do LAN-free backups to a TSM server that is a library client. I don't quite understand this, because surely there would be no tape paths defined on the library client.

Quotes from the manual:

"When the Tivoli Storage Manager server (data manager server) is also the library manager for the devices where data is stored by the storage agent, then the storage agent communicates requests to this Tivoli Storage Manager server. When the Tivoli Storage Manager server (data manager server) is another library client, then the storage agent communicates requests for itself or the metadata server directly to the library manager."

"A library client requests shared library resources, such as drives or media, from the library manager, but uses the resources independently. The library manager coordinates the access to these resources. Data moves over the SAN between the storage device and either the library manager or the library client. Either the library manager or any library client can manage the LAN-free movement of client data as long as the client system includes a storage agent."

Has anyone got this working? I have only been able to get LAN-free backups working in the normal way, to the TSM server which is the library manager. What am I missing here?

Cheers
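The key=value form of the `dsmsta setstorageserver` command in step 1 is easy to get wrong by dropping an `=` or a parameter. The following is a minimal Python sketch, not an IBM tool, that assembles the command line from the seven parameter names listed in the post and refuses to build it if one is missing. The function name and all example values (host names, passwords, port) are hypothetical placeholders.

```python
# Parameter names come from step 1 of the post above. Everything else here
# (function name, example values) is an assumption made for illustration.
SETSTORAGESERVER_KEYS = (
    "myname", "mypassword", "myhladdress",
    "servername", "serverpassword", "hladdress", "lladdress",
)


def build_setstorageserver(params):
    """Return the dsmsta command as an argument list, one key=value per item."""
    missing = [key for key in SETSTORAGESERVER_KEYS if key not in params]
    if missing:
        raise ValueError("missing parameters: " + ", ".join(missing))
    args = ["dsmsta", "setstorageserver"]
    args += [f"{key}={params[key]}" for key in SETSTORAGESERVER_KEYS]
    return args


# Placeholder values only; substitute the real names from your environment.
example = {
    "myname": "storage_agent_node_name",
    "mypassword": "storage_agent_password",
    "myhladdress": "sta.example.org",
    "servername": "library_client_name",
    "serverpassword": "password_for_library_client",
    "hladdress": "libclient.example.org",
    "lladdress": "1500",
}
command = build_setstorageserver(example)
print(" ".join(command))
```

Building the arguments as a list keeps each key=value pair intact if the list is later handed to a process launcher instead of being pasted into a shell.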
Re: Restore performance problem
Thomas,

I suspect the media waits are down to the three streams wanting the same tapes. If you are really restoring 9 million separate objects, then most likely the majority of the elapsed time is spent at the client end, writing out the data. What do the session stats show for the network data transfer rate?

I have not done tracing for a while, but when I did, I always found Perform tracing on the client to be the most useful. The time is usually found to be spent mainly in the following categories:

Transaction: A general category to capture all time not accounted for in other sections. This category includes file open/close time and other miscellaneous processing on the client. File open/close processing can make total Transaction time a large part of elapsed time with smaller files.

File I/O: Requesting data to be read or written on the client file system. Each File I/O usually represents a 32K logical request (or the remaining data if less than 32K). File I/O may be entered one additional time at the end of the file. With compression on some smaller clients, a file I/O can represent a request for less than 32K. A file I/O request may require multiple physical accesses. For small files on systems without read-ahead, average file I/O time for backup is 15 to 40 ms, depending on the platform. For large files on systems doing read-ahead this can be significantly reduced. Slow response times from disks will contribute to the amount of time logged here.

Data Verb: Consists of time spent in the network plus time spent on the host server.

Thomas Denier [EMAIL PROTECTED]
Sent by: ADSM: Dist Stor Manager ADSM-L@vm.marist.edu
31/03/2005 22:32
Please respond to ADSM: Dist Stor Manager ADSM-L@vm.marist.edu
To: ADSM-L@vm.marist.edu
Subject: Restore performance problem

We recently restored a large mail server. We restored about nine million files with a total size of about ninety gigabytes. These were read from nine 3490 K tapes.
The node we were restoring is the only node using the storage pool involved. We ran three parallel streams. The restore took just over 24 hours. The client is Intel Linux with 5.2.3.0 client code. The server is mainframe Linux with 5.2.2.0 server code. 'Query session' commands run during the restore showed the sessions in 'Run' status most of the time. Accounting records reported the sessions in media wait most of the time. We think most of this time was spent waiting for movement of tape within a drive, not waiting for tape mounts. Our analysis has so far turned up only two obvious problems: the movebatchsize and movesizethreshold options were smaller than IBM recommends. On the face of it, these options affect server housekeeping operations rather than restores. Could these options have any sort of indirect impact on restore performance? For example, one of my co-workers speculated that the option values might be forcing migration to write smaller blocks on tape, and that the restore performance might be degraded by reading a larger number of blocks. We are thinking of running a test restore with tracing enabled on the client, the server, or both. Which trace classes are likely to be informative without adding too much overhead? We are particularly interested in information on the server side. The IBM documentation for most of the server trace classes seems to be limited to the names of the trace classes.
Re: Restore performance problem
Hi,

I have done quite similar restores on our mail server. You may also want to look at what happens to the restore process on the client: is the CPU at 100% for the 'dsmc restore ...' process? Another thing is the file system on the client: check the file system, disk activity and service times for any weakness that may result from creating that many inodes.

I have recently done a lot of mail server restores (always 3.5 million files/140 GB) using an old TSM server (v5.1.9.5 with K tapes and the same configuration as yours ... 10 tapes) and observed that this old TSM server in particular was at its limit. The I/O configuration of that old TSM server was very bad: DB, log and disk cache are mixed up. This decreases restore performance, especially when other activity (backups at night) is going on. So we used

   dsmc restore -quiet /mail/ /data2/mail/

(tcpwindowsize 64, tcpbuffsize 32, largecommbuffers no, txnbytelimit 25600, resourceutilization 3) and finally received the 3.5 million files/140 GB in 09:53:34. For me that was OK because I know about the bad condition of the server. The restore time would be much worse if the restore ran at a time when the TSM DB was handling a lot of other transactions, like nightly backups. Restoring the same data with only one drive takes 51 hours.

Running the same mail restore test on new hardware (new DB, TSM 5.3, with 3592 drives), using the same restore client, we finally got 3.5 million files/150 GB restored in 04:52:00, using just 1 drive because the data fits on one 3599 tape.

But here I have experienced a reproducible bug/behaviour (it is 'closed' at the moment because Solaris 10 is not yet supported): when starting the restore, everything runs fine and fast (with a restore performance of about 1 million files/hour). After some time, maybe 40% of the total restore time, the CPU of the client rises to 100% and the restore performance (data/files) slows down. No reason for this was found at the server or at the client. Maybe it happens when a very big directory with a lot of directories in it is being processed.

In the end I found a 'workaround': I cancelled the slowed-down restore process running at 100% CPU ('dsmc restore -quiet /mail/ /data2/mail/') with Control-C and let it shut down, and then I just restarted the restore with 'dsmc restart restore -quiet'. This restarted restore works fast again and finally ends with the 04:52:00 (total time). If I do not stop and restart the client restore session, the restore finishes in 06:49:09. That is reproducible and it is quite a big difference (30% faster with interrupting and restarting), but maybe it's because of our unsupported TSM version ... or has someone else seen this CPU-crunching behaviour?

Greetings
Rainer

--
Rainer Wolf eMail: [EMAIL PROTECTED]
kiz - Abt. Infrastruktur Tel/Fax: ++49 731 50-22482/22471
Universität Ulm wwweb:http://kiz.uni-ulm.de
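To put the restore times quoted in this thread side by side, here is a small Python calculation of files per second for each case and of the saving from the stop-and-restart workaround. The file counts and elapsed times are the ones given in the posts; the helper names are made up for this sketch, "just over 24 hours" is rounded to exactly 24 hours, and a gigabyte is taken as 1024 x 1024 KB.

```python
def hms_to_seconds(hms):
    """Convert an 'HH:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def files_per_second(files, elapsed_seconds):
    """Average restore rate in files per second."""
    return files / elapsed_seconds


# Thomas: about nine million files in "just over 24 hours" (rounded to 24 h).
thomas_rate = files_per_second(9_000_000, 24 * 3600)

# Rainer: 3.5 million files on the old server and on the new hardware.
old_server_rate = files_per_second(3_500_000, hms_to_seconds("09:53:34"))
new_server_rate = files_per_second(3_500_000, hms_to_seconds("04:52:00"))

# Saving from cancelling and restarting the restore (04:52:00 vs 06:49:09).
restart_saving = 1 - hms_to_seconds("04:52:00") / hms_to_seconds("06:49:09")

# Average file size in Thomas's restore: about 90 GB spread over 9 million files.
average_file_kb = 90 * 1024 * 1024 / 9_000_000

print(f"Thomas:        {thomas_rate:6.1f} files/s")
print(f"Rainer (old):  {old_server_rate:6.1f} files/s")
print(f"Rainer (new):  {new_server_rate:6.1f} files/s")
print(f"Restart saves: {restart_saving:.1%} of elapsed time")
print(f"Average file:  {average_file_kb:.1f} KB")
```

With an average file size of roughly 10 KB, the per-file cost (open, create, close) is likely to matter more than raw tape or network bandwidth, which fits the suggestion above to look at the client end.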
Re: help:can tsm server manage multi-library?
Yes, ITSM server can do that. -Original Message- From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of ming li Sent: vrijdag 1 april 2005 6:25 To: ADSM-L@VM.MARIST.EDU Subject: help:can tsm server manage multi-library? Hi all,can tsm server manage more than one library in lan-free backup environment?Thx!
Re: Large Linux clients
An old trick I have used for many years: to investigate a problem filesystem, do a find in that filesystem. If the find dies, TSM definitely will die. I'll bet your find will die, and that's why your backup will die/hang or whatever also. A find will do a filestat on all files/dirs, which is actually the same thing the backup does. So your issue is OS related and not TSM.

Cheers
Henk

On Tuesday 29 March 2005 12:11, you wrote:

On Mar 29, 2005, at 12:37 PM, Zoltan Forray/AC/VCU wrote:

...However, when I try to backup the tree at the third level (e.g. /coyote/dsk3/), the client pretty much seizes immediately and dsmerror.log says "B/A Txn Producer Thread, fatal error, Signal 11". The server shows the session as SendW and nothing else going on.

Zoltan - Signal 11 is a segfault - a software failure. The client programming has a defect, which may be incited by a problem in that area of the file system (so have that investigated). A segfault can be induced by memory constraint, which in this context would most likely be Unix Resource Limits, so also enter the command 'limit' in Linux csh or tcsh and potentially boost the stack size ('unlimit stacksize'). This is to say that the client was probably invoked under artificially limited environmentals.

Richard Sims
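The two checks suggested above, statting every object the way find does and looking at the stack resource limit, can be sketched in Python as follows. This is only an illustration, not part of TSM: the function name and the small demo directory are invented here, and the `resource` module is available on Unix-like systems only.

```python
import os
import resource
import tempfile


def stat_walk(root):
    """lstat every directory and file below root.

    Returns (count, failures), where failures is a list of (path, reason)
    for entries that could not be listed or statted.
    """
    count = 0
    failures = []

    def on_error(err):
        # Called by os.walk when a directory cannot be listed.
        failures.append((err.filename, err.strerror))

    for dirpath, dirnames, filenames in os.walk(root, onerror=on_error):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                os.lstat(path)
                count += 1
            except OSError as err:
                failures.append((path, err.strerror))
    return count, failures


# Demo on a throwaway tree (hypothetical names); point stat_walk at the
# problem file system in real use.
with tempfile.TemporaryDirectory() as demo_root:
    os.makedirs(os.path.join(demo_root, "a", "b"))
    with open(os.path.join(demo_root, "a", "b", "file~"), "w") as handle:
        handle.write("x")
    demo_count, demo_failures = stat_walk(demo_root)

# Soft and hard stack limits of this process; resource.RLIM_INFINITY means
# unlimited. This is roughly the value 'limit' shows as stacksize.
stack_limits = resource.getrlimit(resource.RLIMIT_STACK)

print("objects:", demo_count, "failures:", demo_failures)
print("stack limit (soft, hard):", stack_limits)
```

A clean walk does not prove the backup will succeed; it only rules out entries that the operating system itself cannot list or stat.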
Re: Restore performance problem
On Mar 31, 2005, at 4:32 PM, Thomas Denier wrote: We recently restored a large mail server. We restored about nine million files with a total size of about ninety gigabytes. These were read from nine 3490 K tapes. The node we were restoring is the only node using the storage pool involved. We ran three parallel streams. The restore took just over 24 hours. The client is Intel Linux with 5.2.3.0 client code. The server is mainframe Linux with 5.2.2.0 server code. ... I noticed that you didn't mention the file system type. The effects of file system type and layout of the subject instance is an often overlooked contributor to performance in operations which are mass-populating the file system, as a restoral will. A journaled file system can exhibit a lot of overhead as its journal is written with at least metadata, depending upon type; and an ill-located journal can make for a lot of disk arm diversions during the restoral, aggravating elapsed time. IBM's outstanding documentation store includes a great series on Linux file systems, which one can jump into at http://www-106.ibm.com/developerworks/linux/library/l-fs7.html . Richard Sims
lost volume configuration in my 3583
Hi all,

The IBM tech updated the firmware on my 3583. I was able to reconfigure the library path and drive path, but now I can't see my volumes. How can I make TSM re-learn the volumes?

HELP

Luc Beaudoin
Administrateur Réseau / Network Administrator
Hopital General Juif S.M.B.D.
Tel: (514) 340-8222 ext:4318
export data
Hello All! I have created a new TSM environment, but I changed the management class naming standard. There is data out on the old system that I want moved to the new environment since it's supposed to be retained for 7 years. Is there a way to export the data directly to the new TSM server and associate it with a different management class? How would I get the data from the old server with the old mc naming standard to the new TSM server with the new MC names? Please let me know if anyone has had success in exporting data and how it was accomplished. TIA for any advice! Joni Moyer Highmark Storage Systems Work:(717)302-6603 Fax:(717)302-5974 [EMAIL PROTECTED]
Re: lost volume configuration in my 3583
Luc,

I believe the only way to do this is using the checkin libv command. Something like:

1) checkin libvol libname status=scratch search=yes checklabel=no
2) checkin libvol libname status=private search=yes checklabel=no

Note that you MUST issue those commands in sequence: scratch first, and then private. Not doing it that way will result in all of your tapes being private!

Hope this helped ...

Regards.
Arnaud

**
Panalpina Management Ltd., Basle, Switzerland, CIT Department
Viadukstrasse 42, P.O. Box 4002 Basel/CH
Phone: +41 (61) 226 11 11, FAX: +41 (61) 226 17 01
Direct: +41 (61) 226 19 78
e-mail: [EMAIL PROTECTED]
**

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of Luc Beaudoin
Sent: Friday, 01 April, 2005 15:37
To: ADSM-L@VM.MARIST.EDU
Subject: lost volume configuration in my 3583

Hi all The IBM tech update my firmware on the 3583 .. I was able to reconfigure the library path and drive path ... Now I can't see my volumes How can I make TSM to re-learn the volumes HELP

Luc Beaudoin
Administrateur Réseau / Network Administrator
Hopital General Juif S.M.B.D.
Tel: (514) 340-8222 ext:4318
Re: Restore performance problem
I noticed that you didn't mention the file system type. The effects of file system type and layout of the subject instance is an often overlooked contributor to performance in operations which are mass-populating the file system, as a restoral will. A journaled file system can exhibit a lot of overhead as its journal is written with at least metadata, depending upon type; and an ill-located journal can make for a lot of disk arm diversions during the restoral, aggravating elapsed time. The output file system was Ext2, which is not journalled.
Re: lost volume configuration in my 3583
I've also found that after a firmware update on the library, the default or Extended label option gets reset. That option determines whether the library reports 6 or 8 characters of the barcode. Check to make sure that the library is configured with the correct option for the TSM tapes. Otherwise TSM will think they are all new tapes (different volume names) and try to use them as scratch. But the internal label won't match, so you're safe there.

Bill Boyer
Some days you're the bug, some days you're the windshield - ??

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of PAC Brion Arnaud
Sent: Friday, April 01, 2005 9:30 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: lost volume configuration in my 3583
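The effect of the 6- versus 8-character label setting described above can be illustrated with a few lines of Python. This is a sketch under assumptions: the barcode value is invented, and the idea that the short form is simply the first six characters of the barcode is an assumption about how the library truncates, not something stated in the thread.

```python
def reported_label(barcode, label_length):
    """Return the volume name a library would report for a barcode.

    Assumes the 6-character setting reports the first six characters and
    the 8-character (extended) setting reports the full barcode.
    """
    if label_length not in (6, 8):
        raise ValueError("label_length must be 6 or 8")
    return barcode[:label_length]


# Hypothetical 8-character LTO barcode; TSM knows the volume by this name.
known_to_tsm = "ABC123L2"

extended = reported_label(known_to_tsm, 8)
short = reported_label(known_to_tsm, 6)

print("extended label:", extended, "matches TSM:", extended == known_to_tsm)
print("short label:   ", short, "matches TSM:", short == known_to_tsm)
```

If the setting flips after a firmware update, every cartridge shows up under a name TSM has never seen, which would explain a library that appears to have lost all of its volumes.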
Re: tsm acsls
Joni Moyer wrote:

Hello All! I had an issue where the /tmp directory filled up and ACSLS stopped running. At that time TSM was doing a backup stgpool process. The TSM server is at 5.2.2.5 and is running on AIX 5.2, and we have ACSLS 7.1. I tried to cancel the job, but it will not cancel. The tapes are in the drives, but I think that something got messed up when the connection was lost, and now it doesn't appear to be backing up, but it won't let me cancel it or dismount the tapes. Has anyone ever had this issue before and if so, what did you have to do to fix it? Thanks!

Joni Moyer
Highmark Storage Systems
Work:(717)302-6603 Fax:(717)302-5974
[EMAIL PROTECTED]

There are scripts in /usr/tivoli/tsm/devices/bin to kill and start the ACSLS agent/daemon/hack thing. Kill it, then restart.

--
Met vriendelijke groeten,
Remco Post
SARA - Reken- en Netwerkdiensten http://www.sara.nl
High Performance Computing Tel. +31 20 592 3000 Fax. +31 20 668 3167

I really didn't foresee the Internet. But then, neither did the computer industry. Not that that tells us very much of course - the computer industry didn't even foresee that the century was going to end. -- Douglas Adams
Re: export data
Joni Moyer wrote:

Hello All! I have created a new TSM environment, but I changed the management class naming standard. There is data out on the old system that I want moved to the new environment since it's supposed to be retained for 7 years. Is there a way to export the data directly to the new TSM server and associate it with a different management class? How would I get the data from the old server with the old mc naming standard to the new TSM server with the new MC names? Please let me know if anyone has had success in exporting data and how it was accomplished. TIA for any advice!

Joni Moyer
Highmark Storage Systems
Work:(717)302-6603 Fax:(717)302-5974
[EMAIL PROTECTED]

Having data bound to a new mc might be a problem. I have never done that, so I can't tell if and how that might work. The basic trick is to define both servers on each other (def server), then on the old server run:

   export node whatever tos=newserver filedata=all

--
Met vriendelijke groeten,
Remco Post
SARA - Reken- en Netwerkdiensten http://www.sara.nl
High Performance Computing Tel. +31 20 592 3000 Fax. +31 20 668 3167
Re: Large Linux clients
Thanks for the suggestion. However, this is not true. We already tried this. We did

   find . | wc -l

to get the object count (1.1M) with no problems. But the backup still will not work. It constantly fails, in unpredictable/inconsistent places, with the same Producer Thread error.

I spent 2+ days drilling through the various sub-directories (of the directory that causes the failures), one by one, and was able to back up 38 of the 40 subdirs, totalling over 980K objects, without a problem. When I included the two other directories in the same pile, the backup would fail. When I then went back and individually selected the sub-sub directories of these two sub-directories (one at a time), I was able to back up *ALL* of the sub-sub directories, no problem. Then I went back and selected the upper-level directory and backed it up, no problem.

Let me draw a picture of the structure of these directories. The problem directories are in this directory: /coyote/dsk3/patients/prostateReOpt/Mount_0/ . If I try to back up /Mount_0/ as a whole, it crashes every time. If I point to the sub-dirs below /Mount_0/ (40 of these, all with the same 4 named subsub dirs), two of these cause a crash. I noted that these two both have 72K objects while the other 38 have less than 60K objects. Yet when I manually picked the 4 subsub dirs of the Patient_172 dir, the backup worked (sort of - see below). Same for Patient_173.

To really drive me crazy, on the first attempt at backing up one of the subsub dirs under Patient_172, the backup crashed. Yet I could back up the other 3 with no issue. So we started looking at the problem subdir and noticed a weird file name that ended in a tilde (~). When I excluded it, the backup ran. Then when I went back and picked just the file with the tilde, it backed up fine (my head is getting balder and balder!!). I then went back and re-selected the whole Patient_172 directory and it backed up (or at least scanned it, since everything was already backed up) just fine!!! AGGH!! This is maddening and shows no rhyme or reason.

Henk ten Have [EMAIL PROTECTED]
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
04/01/2005 08:29 AM
Please respond to ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Large Linux clients
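The observation above, that the two failing Patient directories hold about 72K objects while the other 38 hold fewer than 60K, is the kind of comparison that is quicker to script than to do by hand. Below is a minimal Python sketch that counts the objects under each immediate subdirectory of a parent. The function names and the demo tree are invented for illustration (only Patient_172 is a name from the post), and the counting rule (directories plus files, like `find . | wc -l` minus the starting directory) is an assumption.

```python
import os
import tempfile


def count_objects(root):
    """Count every directory and file below root (root itself excluded)."""
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        total += len(dirnames) + len(filenames)
    return total


def objects_per_subdir(parent):
    """Map each immediate subdirectory of parent to its object count."""
    counts = {}
    with os.scandir(parent) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                counts[entry.name] = count_objects(entry.path)
    return counts


# Demo on a throwaway tree; in real use, pass the Mount_0 directory instead.
with tempfile.TemporaryDirectory() as parent:
    for name, file_count in (("Patient_172", 3), ("Patient_001", 1)):
        os.makedirs(os.path.join(parent, name))
        for index in range(file_count):
            with open(os.path.join(parent, name, f"file{index}"), "w") as handle:
                handle.write("x")
    demo_counts = objects_per_subdir(parent)

# Largest directories first, to spot the outliers quickly.
for name, count in sorted(demo_counts.items(), key=lambda item: -item[1]):
    print(f"{count:8d}  {name}")
```

Sorting by count puts candidates such as the two 72K-object directories at the top, where a size-related threshold would show up as a cluster of failures.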
Lan-free backup limitations?
Has anyone run across a limit (documented or otherwise) for the number of LAN-free clients one TSM server can handle? We have one TSM server on which we had 23 Exchange servers running LAN-free backups. We added numbers 24 and 25, and now we are having a strange problem: it appears these new clients are sending polling or query commands to the tape drives, and this is causing tape errors and in some cases taking the drives offline. We have verified that the two new servers are identical in every way to all the others... drives, firmware, BIOS, etc. When we block these two hosts at the port level from seeing the tape drives, the problem goes away.

TSM Server: AIX 5.2, TSM Version 5.2.3.1, 3494 Library, 3590H tape drives
Windows Hosts: B/A Client version 5.2.0.3, ITSM for Mail 5.2.1.0, Storage Agent version 5.2.2.3, Qlogic 2310F HBA - Driver version 9.0.0.13

We have submitted dumps to IBM support, but they are painfully slow in getting back to us. Just thought I would poll the group...

Thanks,
Matt Adams
Information Technology Services
Deloitte Services LP
615-882-6861
www.deloitte.com
5.3.1.0 server available
Hi *SM-ers!

For those of you that haven't noticed it yet: the TSM 5.3.1.0 server code is available for download:

ftp://service.boulder.ibm.com/storage/tivoli-storage-management/maintenance/server/v5r3/

At this moment only for AIX and Linux, I think.

Kindest regards,
Eric van Loon
KLM Royal Dutch Airlines
Re: curious behavior
I just rechecked things with the comments from the list. It seems the current access level of the tape didn't have any bearing on the situation. Right now I have a q mo showing one tape as R/W and the other as R/O. The q vol f=d is telling me that both tapes are Read/Write.

tsm: BACKUP1>q pr
 Process Number: 616
 Process Description: Space Reclamation
 Status: Offsite Volume(s) (storage pool NEWCOPYPOOL), Moved Files: 5868, Moved Bytes: 22,303,511,754, Unreadable Files: 543, Unreadable Bytes: 0. Current Physical File (bytes): 11,347,126,815 Current input volume: 03L2. Current output volume: 93L2.

tsm: BACKUP1>q mo
 ANR8330I LTO volume 03L2 is mounted R/O in drive DRV4 (mt0.5.0.5), status: IN USE.
 ANR8330I LTO volume 93L2 is mounted R/W in drive DRV2 (mt0.3.0.5), status: IN USE.
 ANR8334I 2 matches found.

tsm: BACKUP1>q vol 03l2 f=d
 Volume Name: 03L2
 Storage Pool Name: NEWTAPEPOOL
 Device Class Name: LTO2
 Estimated Capacity (MB): 431,430.2
 Scaled Capacity Applied:
 Pct Util: 93.8
 Volume Status: Full
 Access: Read/Write
 Pct. Reclaimable Space: 6.6
 Scratch Volume?: Yes
 In Error State?: No
 Number of Writable Sides: 1
 Number of Times Mounted: 18
 Write Pass Number: 1
 Approx. Date Last Written: 03/29/2005 17:13:29
 Approx. Date Last Read: 04/01/2005 12:36:43
 Date Became Pending:
 Number of Write Errors: 0
 Number of Read Errors: 0
 Volume Location:
 Volume is MVS Lanfree Capable : No
 Last Update by (administrator):
 Last Update Date/Time: 03/28/2005 21:27:17

tsm: BACKUP1>q vol 93l2 f=d
 Volume Name: 93L2
 Storage Pool Name: NEWCOPYPOOL
 Device Class Name: LTO2
 Estimated Capacity (MB): 204,800.0
 Scaled Capacity Applied:
 Pct Util: 10.3
 Volume Status: Filling
 Access: Read/Write
 Pct. Reclaimable Space: 0.0
 Scratch Volume?: Yes
 In Error State?: No
 Number of Writable Sides: 1
 Number of Times Mounted: 1
 Write Pass Number: 1
 Approx. Date Last Written: 04/01/2005 12:47:32
 Approx. Date Last Read: 04/01/2005 12:02:48
 Date Became Pending:
 Number of Write Errors: 0
 Number of Read Errors: 0
 Volume Location:
 Volume is MVS Lanfree Capable : No
 Last Update by (administrator):
 Last Update Date/Time: 04/01/2005 12:01:20
Re: curious behavior
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of Tyree, David I just rechecked things with the comments from the list. It seems the current access level of the tape didn't have any bearing on the situation. Right now I have a q mo showing one tape as R/W and the other as R/O. The q vol f=d is telling me that both tapes are Read/Write. That is correct behavior. 03L2 is the input tape for your reclamation process; it will be mounted R/O so that nothing can be written to the source tape while it moves data to the target tape. WAD. I *have* seen instances where a source tape, which should be mounted R/O, is instead mounted R/W. I haven't tracked the issue because, quite frankly, it's not much of an issue; I am usually running down more serious problems. -- Mark Stapleton ([EMAIL PROTECTED]) IBM Certified Advanced Deployment Professional Tivoli Storage Management Solutions 2005 Office 262.521.5627
Re: curious behavior
Hi, Not that curious. The input tape for the reclaim process is mounted read-only; the output tape is being written to, so it is mounted read/write. This is normal and has nothing to do with the access state of a volume as seen by q vol. If a volume with read-only access were mounted read/write and were also the output volume of a process, then I would begin to get worried. Regards, Karel Bos Technical Expert level 5 Server Management - Operations Back-up en Restore Customer Unit Nuon Atos Origin Nederland B.V. Arlandaweg 98 1043 HP Amsterdam Office: +31 (0)20 Fax: +31 (0)20 Mobile: +31 (0)6.51.29.88.01 Mail: [EMAIL PROTECTED] -Original Message- From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of Tyree, David Sent: Friday 1 April 2005 20:01 To: ADSM-L@VM.MARIST.EDU Subject: Re: curious behavior [...quoted message and command output omitted...]
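The rule of thumb in this reply can be written down as a small check. The following is only a sketch in Python, not anything TSM provides: it parses pasted q mount text in the ANR8330I format shown in this thread and compares it against the access states you read off q vol. The function names are made up for illustration, and the "Read-Only" access string is an assumption about how q vol spells that state.

```python
import re

# Matches lines like the one in the q mount output quoted in this thread:
# "ANR8330I LTO volume 03L2 is mounted R/O in drive DRV4 (mt0.5.0.5), status: IN USE."
MOUNT_RE = re.compile(r"ANR8330I \S+ volume (\S+) is mounted (R/O|R/W) in drive (\S+)")


def parse_mounts(q_mount_text):
    """Return {volume: mount mode} parsed from pasted 'q mount' output."""
    return {vol: mode for vol, mode, _drive in MOUNT_RE.findall(q_mount_text)}


def check_reclamation(mounts, access, input_vol, output_vol):
    """Return a list of warnings; an empty list means the mounts look normal.

    'access' maps volume name to the Access field copied from 'q vol f=d'
    (hypothetical input; the "Read-Only" spelling is an assumption).
    """
    warnings = []
    # Reported on the list as an oddity, not as a real problem.
    if mounts.get(input_vol) == "R/W":
        warnings.append("input volume %s is mounted R/W" % input_vol)
    # The case described as worrying: read-only access, but being written to.
    if access.get(output_vol) == "Read-Only" and mounts.get(output_vol) == "R/W":
        warnings.append("output volume %s has Read-Only access but is mounted R/W" % output_vol)
    return warnings


Q_MOUNT = (
    "ANR8330I LTO volume 03L2 is mounted R/O in drive DRV4 (mt0.5.0.5), status: IN USE. "
    "ANR8330I LTO volume 93L2 is mounted R/W in drive DRV2 (mt0.3.0.5), status: IN USE."
)
mounts = parse_mounts(Q_MOUNT)
normal = check_reclamation(mounts, {"03L2": "Read/Write", "93L2": "Read/Write"}, "03L2", "93L2")
```

With the values from this thread the warning list comes back empty, which matches the "working as designed" verdict: the mount mode describes the current operation, while the access state describes the volume.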
Monthly TSM FAQ April 2005 part 1 of 2 (no April Fools here!)
This Frequently Asked Question list for the ADSM-L mailing list is posted on the first day of each month. It was created to cut down on the number of questions that are repeated regularly in the ADSM-L mailing list from vm.marist.edu. I would be grateful for any requests to include additional material. (Please send them directly to me, rather than to the list.) updated 4/1/2005 Questions marked with $ are new or improved since the last posting. QUESTIONS Sections 01, 02, and 03 01. About the list itself 01-01. How do I subscribe to ADSM-L? 01-02. How do I unsubscribe to ADSM-L? 01-03. Why don't I see the questions I post to ADSM-L? 01-04. How can I see the questions I post to ADSM-L? 01-05. Who decides what questions go on ADSM-L? 01-06. Is there a digest or archive of ADSM-L? 01-07. How do I get more information about ADSM-L? 01-08. Does IBM/Tivoli participate in ADSM-L? 01-09 How can I get just a digest of ADSM-L, instead of all the postings? 02. Types of questions asked 02-01. What subjects are covered in this list? 02-02. What kinds of questions can be asked? 02-03. What kinds of questions can I expect answers to? 02-04. What levels of netiquette are expected? 02-05. What's the first thing to do when I have a question about TSM? 02-06. What's the second thing to do when I have a question about TSM? 02-07. What's the third thing to do when I still have a question about TSM? 02-08. What's the fourth thing to do when I STILL have a question about TSM? 02-09. What's the fifth thing to do when I *STILL* have a question about TSM? 02-10. What's the last thing to do when I *STILL* have a question about TSM? 02-11. What are those out of office messages I keep seeing in the list? 02-12. What's the single best thing I can do to improve the list? 02-13. Why don't I get answers to my I need comparisons between TSM and brandX backup software questions? 02-14. What kinds of things shouldn't I post on ADSM-L? 02-15. Is there some sort of acronym list? 02-16. 
Whatever happened to Richard Sims? 03. Available TSM resources 03-01. What FAQs are already out there? 03-02. What other sources of help can I find? 03-03. How do I get official TSM support? ANSWERS to section 01, 02, and 03 01-01. How do I subscribe to ADSM-L? Send an email to [EMAIL PROTECTED] with a blank subject line and a message consisting only of the line SUBSCRIBE ADSM-L. 01-02. How do I unsubscribe to ADSM-L? Send an email to [EMAIL PROTECTED] with a blank subject line and a message consisting only of the line UNSUBSCRIBE ADSM-L. Do NOT try to unsubscribe by sending email to [EMAIL PROTECTED] All that does is annoy the list members, and it doesn't get you unsubscribed. 01-03. Why don't I see the questions I post to ADSM-L? That's the normal behavior of ADSM-L. 01-04. How can I see the questions I post to ADSM-L? If you want to see your own questions, send an email to [EMAIL PROTECTED] with a blank subject line and a message that consists only of the line SET ADSM-L REPRO. 01-05. Who decides what questions go on ADSM-L? The list members. There appears to be no active moderation of the list. (That's not a license to abuse the list. Complaints from the list members are heeded by the list administrator.) 01-06. Is there a digest or archive of ADSM-L? Indeed. There are two indexed versions of the mailing list. The first is at http://search.adsm.org; the other one is http://www.mail-archive.com/adsm-l@vm.marist.edu/. (Thanks, Richard!) Personally, I prefer the latter; the former's indexing leaves a lot to be desired, and its message threading is practically non-existent. 01-07. How do I get more information about ADSM-L? Send an email to [EMAIL PROTECTED] with a blank subject line and a message consisting only of the line INFO. This will cause an email to be returned to you with a list of documents available about the listserver and instructions on how to get them. 01-08. Does IBM/Tivoli participate in or post to ADSM-L? 
From Andy Raibeck, of the TSM client development group: This list server is owned and operated by Marist College, and is not in any way affiliated with IBM. While some IBMers do participate on ADSM-L, they do so on an unofficial, voluntary basis, and thus are not *required* to answer your questions. If you require an answer from IBM, or if your situation is of an urgent nature, then you should (also) go through IBM's official support channels for assistance. 01-09 How can I get just a digest of ADSM-L, instead of all the postings? (from Andy Raibeck) Send an email to: [EMAIL PROTECTED] In the body of the email, put *only* the following: info refcard You do not need a subject line. You will get reference information back from the list server that tells you, among other things, how to configure your subscription to receive the
Monthly TSM FAQ April 2005 (part 2 of 2) (no April Fools here!)
This Frequently Asked Question list for the ADSM-L mailing list is posted on the first day of each month. It was created to cut down on the number of questions that are repeated regularly in the ADSM-L mailing list from vm.marist.edu. I would be grateful for any requests to include additional material. (Please send them directly to me, rather than to the list.) updated 4/1/2005 Questions marked with $ are new or improved since the last posting. Questions for sections 04 and 05 04. Frequently-asked questions on ADSM-L 04-01. Is it called ADSM, or TSM, or ITSM? What's the deal here? 04-02. What are backupsets? How can I use them? 04-03. How does TSM do full/incremental/differential backups, just like my old backup software fillintheblank used to? 04-04. How do I unsubscribe to ADSM-L? 04-05. How do I do mailbox-level restores of Exchange using the Tivoli Data Protection Agent for Exchange? 04-06. How do I force TSM to do a full backup of a client? 04-07. Where can I download the latest version of TSM/TDP? 04-08. What's the very first thing I do after TSM is delivered to me? 04-09. I'm getting message ANRX from the TSM server. What does it mean? 04-10. I'm getting message ANSX from the TSM client. What does it mean? 04-11. My large-scale restores are slow. How can I speed them up? 04-12. How do I back up normally open files, like database files? 04-13. What's all this about TSM and SQL select statements? 04-14. My boss wants disaster recovery procedures. What's the best way to do it? 04-15. How do I get TSM to report problems to me? 04-16. Why does version X of TSM have this bad bug in it? 04-17. How come my copy pool tape reclamation runs so slowly? 04-18. I keep getting these server out of license compliance messages. Why? 04-19. My scheduled backups fail (or are incomplete), but my manual ones work fine. Why? 04-20. While backleveling my TSM client from 4.2.1 to 4.1.3, I get a downlevel message. Why? 04-21. Why do I get an ANR1440I All drives in use. 
Process being preempted by higher priority operation message when my storage pool backup fails? 04-22. I've deleted all data from a tape volume, but it hasn't come back as a scratch tape. Why? 04-23. What is this ANRD error message? I don't understand it. 04-24. I'm upgrading my TSM server/client from version X.X to version Y.Y Any pitfalls? 04-25. How do I restore one client's data onto another client? 04-26. Will my new tape library work with TSM? 04-27. My Windows client backs up the same 3,000 files or so everyday. Why? 04-28. I'm moving TSM to a new physical server. What's the best way to do that? 04-29. How do I back up my NetWare NDS license files? 04-30. What's all this fuss about cleanup backupgroups? 04-31. I'm trying to include some files for backups, but it's not working. Why? 04-32. Can I put TSM db and log volumes on raw devices? 04-33. Why is my client backup {taking so long|running so slowly|sluggish}? 04-34. I have a tape volume that Q CONTENT says is empty, but I can't delete the volume. Why? 04-35 I'm upgrading my TSM server from version x.x.x.x to y.y.y.y. What's the best way to do it? 04-36 TSM is asking me to convert my archives? Why? 04-37 What kind/how many/what configuration should I set up for database disks/volumes/RAIDs? 04-38 How do I move/resize my database/recoverylog volumes? 04-39 I'm moving my TSM server from operating system BrandX to operating system BrandY. Can I just move my database volumes from one machine to another? Why not? 04-40 My library is out of space. What's wrong with TSM? 04-41 What's the difference between a TSM database backup and a TSM database snapshot? 04-42 How can I change the retention time for an archive I've already created? 04-43. Boss and/or the political situation is forcing me to move my TSM server from one operating system to another. Help! 04-44. What kind of tape drive technology should I consider for my TSM server? 04-45 What is the Deadly Embrace? 
04-46 What does the message 'Error 2 deleting row from table Expiring.Objects.' mean? Is it bad? 04-47 I've had problems using the TSAFS module on my NetWare 6.x client. How can I make it work? 04-48 How do I back up my SharePoint Portal database? 04-49 How do I perform both full and incremental backups of my database/mail server? 04-50 What kind of copy serialization is best? 04-51 What is the Eternal Triangle? and from IBM, questions about the Tivoli web site. (Thanks for posting these, Andy!) 05-01. I have a Tivoli ID and an IBM.com registered ID. Which one do I use for problem submission? 05-02. The top of
Backup/restore of links in LINUX
I have a user trying to restore a Linux directory with links to itself (I confess to not understanding what is going on, but here is a listing from the directory: lrwxrwxrwx 1 root root 6 2005-03-29 11:00 swapoff -> swapon). When he tries a restore from either the CLI or the GUI, swapoff is restored, but not swapon. I ran a select on the directory from BACKUPS, and it appears that only the link is in TSM. I'm wondering if the file itself was ever backed up by TSM or only the link. The options file has only the bare minimum needed to get the client to run, with no options specified. Fred Johanson ITSM Administrator University of Chicago 773-702-8464
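For readers unsure what the listing means: swapoff is a symbolic link whose stored content is just the 6-character name "swapon", and the link and its target are two separate filesystem objects. The Python sketch below only illustrates that on a scratch directory; it says nothing about how the TSM client itself decides what to back up, and the describe helper is a made-up name.

```python
import os
import tempfile


def describe(path):
    """Return (kind, link_target, resolves) for a path without following links."""
    if os.path.islink(path):
        # os.path.exists follows the link, so it is False for a dangling link.
        return ("symlink", os.readlink(path), os.path.exists(path))
    if os.path.isfile(path):
        return ("file", None, True)
    return ("missing", None, False)


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "swapon")
    link = os.path.join(d, "swapoff")
    with open(target, "w") as f:
        f.write("placeholder for the real binary")
    os.symlink("swapon", link)          # same shape as the listing: swapoff -> swapon

    link_info = describe(link)          # the link itself
    target_info = describe(target)      # the regular file it points to

    os.remove(target)                   # simulate restoring the link but not the file
    dangling_info = describe(link)
```

If only the link object was ever backed up, a restore would recreate swapoff pointing at a swapon that does not exist, which is the dangling case at the end of the sketch and is consistent with what the user is seeing.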
Re: Large Linux clients
Ya, Sorry, I have no answers for you, but you do have my sympathy. I've had to do that kind of detective work before. Sometimes it is an oddly named file or a very, very long-named file, and sometimes it's a file that somehow got a very bizarre date, like Apr 15 1904. In a few cases it has also been hung NFS mounts somewhere in the path. I've had to drill down through each of the subdirs one after another, just like you did, to figure it out, because there was no filename or other hint in the schedule or error logs, just a generic failed message. Luckily I only have to do it about once or twice a year, but it is time consuming. Ben

-Original Message- From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of Zoltan Forray/AC/VCU Sent: Friday, April 01, 2005 9:03 AM To: ADSM-L@VM.MARIST.EDU Subject: Re: Large Linux clients

Thanks for the suggestion. However, this is not true. We already tried this. We did find . | wc -l to get the object count (1.1M) with no problems. But the backup still will not work. It constantly fails, in unpredictable/inconsistent places, with the same Producer Thread error. I spent 2+ days drilling through the various sub-directories (of this directory that causes the failures), one by one, and was able to back up 38 of the 40 subdirs, totalling over 980K objects, without a problem. When I included these two other directories in the same pile, the backup would fail. When I then went back and individually selected the sub-sub directories of these sub-directories (one at a time), I was able to back up *ALL* of the sub-sub directories, no problem. Then I went back and selected the upper-level directory and backed it up, no problem.

Let me draw a picture of the structure of these directories. The problem directories are in this directory: /coyote/dsk3/patients/prostateReOpt/Mount_0/ . If I try to back up /Mount_0/ as a whole, it crashes every time. If I point to the sub-dirs below /Mount_0/ (40 of these, all with the same 4 named subsub dirs), two of these cause a crash. I noted that these two both have 72K objects while the other 38 have fewer than 60K objects. Yet when I manually picked the 4 subsub dirs of the Patient_172 dir, the backup worked (sort of - see below). Same for Patient_173. To really drive me crazy, on the first attempt at backing up one of the subsub dirs under Patient_172, the backup crashed. Yet I could back up the other 3 with no issue. So, we started looking at the problem subdir and noticed a weird file name that ended in a tilde (~). When I excluded it, the backup ran. Then when I went back and picked just the file with the tilde, it backed up fine (my head is getting balder-and-balder !!). I then went back and re-selected the whole Patient_172 directory and it backed up (or at least scanned it, since everything was already backed up) just fine !!! AGGH !! This is maddening and shows no rhyme or reason.

Henk ten Have [EMAIL PROTECTED] Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU 04/01/2005 08:29 AM Please respond to ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU To ADSM-L@VM.MARIST.EDU cc Subject Re: [ADSM-L] Large Linux clients

An old trick I used for many years: to investigate a problem filesystem, do a find in that filesystem. If the find dies, tsm definitely will die. I'll bet your find will die, and that's why your backup will die/hang or whatever also. A find will do a filestat on all files/dirs, which is actually the same thing the backup does. So your issue is OS related and not tsm. Cheers Henk ()

On Tuesday 29 March 2005 12:11, you wrote: On Mar 29, 2005, at 12:37 PM, Zoltan Forray/AC/VCU wrote: ...However, when I try to back up the tree at the third level (e.g. /coyote/dsk3/), the client pretty much seizes immediately and dsmerror.log says B/A Txn Producer Thread, fatal error, Signal 11. The server shows the session as SendW and nothing else going on.

Zoltan - Signal 11 is a segfault - a software failure. The client programming has a defect, which may be incited by a problem in that area of the file system (so have that investigated). A segfault can be induced by memory constraint, which in this context would most likely be Unix Resource Limits, so also enter the command 'limit' in Linux csh or tcsh and potentially boost the stack size ('unlimit stacksize'). This is to say that the client was probably invoked under artificially limited environmentals. Richard Sims
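The detective work described in this thread (stat every object as find does, count objects per subdirectory, and look for odd names and dates) can be partly automated. Below is a rough Python sketch under stated assumptions: scan_tree is a made-up helper, and the cutoffs (mtime before 1980, names over 200 characters) are arbitrary illustration values, not anything the TSM client documents.

```python
import os
from collections import Counter

OLD_MTIME = 315532800   # 1980-01-01 00:00:00 UTC; arbitrary "bizarre date" cutoff
MAX_NAME = 200          # arbitrary "very long name" cutoff


def scan_tree(root):
    """lstat every object under root.

    Returns (counts, suspects): counts maps each top-level subdirectory of
    root to the number of objects beneath it; suspects is a list of
    (path, reason) pairs worth a closer look.
    """
    counts = Counter()
    suspects = []

    def on_error(err):
        # os.walk calls this when a directory cannot be listed.
        suspects.append((err.filename, "cannot list directory: %s" % err.strerror))

    for dirpath, dirnames, filenames in os.walk(root, onerror=on_error):
        rel = os.path.relpath(dirpath, root)
        top = "." if rel == "." else rel.split(os.sep)[0]
        for name in dirnames + filenames:
            counts[top] += 1
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)          # same kind of call find and a backup make
            except OSError as err:
                suspects.append((path, "lstat failed: %s" % err.strerror))
                continue
            if st.st_mtime < OLD_MTIME:
                suspects.append((path, "mtime before 1980"))
            if len(name) > MAX_NAME:
                suspects.append((path, "name longer than %d characters" % MAX_NAME))
            if name.endswith("~"):
                suspects.append((path, "name ends with tilde"))
            if not name.isprintable():
                suspects.append((path, "non-printable characters in name"))
    return counts, suspects
```

Run against something like the Mount_0 directory, the counts would show which Patient_ subdirectories hold the most objects, and the suspects list would surface candidates such as the tilde file. A hung NFS mount would make the lstat call block rather than fail, so the script stalling at a particular path is itself a clue. None of this explains a Signal 11 in the client; it only narrows down where to look.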
Re: curious behavior
On Apr 1, 2005, at 1:11 PM, Stapleton, Mark wrote: ...I *have* seen instances where a source tape, which should be mounted R/O, is instead mounted R/W. I haven't tracked the issue because, quite frankly, it's not much of an issue; I am usually running down more serious problems. I wonder if it's a situation where the tape had been mounted for a preceding R/W operation then, during the MOUNTRetention period, along came the next request for it, as an input volume. Richard Sims
3583 Meltdown
Hi everyone, Here's an interesting little piece of activity log from one of my remote TSM servers. Gotta like TapeAlert. Other than being an amusing way to start my day, everything checks out OK and I can't believe it.

1. The 6AM job is a copy stgpool operation.
2. Despite the TapeAlert warnings, there were no stuck tapes.
3. L20014 wasn't snapped when visually inspected.
4. The library wouldn't mount tapes even though no drives were being used.
5. Stopped TSM, rebooted the library, started TSM and all was well again.
6. Audited each volume reported below without a single failure.

TSM Server: 5.2.4.1
OS: AIX 5.2
Library: 3583 with 3 LTO2 drives (SCSI)

I don't really have a question, just thought it was odd and someone might have insight.

Date/Time            Message
03/31/2005 06:00:50  ANR8949E Device /dev/smc0, volume has issued the following Critical TapeAlert: The library can not operate without the magazine. 1. Insert the magazine into the library. 2. Restart the operation. (SESSION: 21550, PROCESS: 784)
03/31/2005 06:11:36  ANR8950W Device /dev/rmt1, volume L20056 has issued the following Warning TapeAlert: The cartridge is not data-grade. Any data you write to the tape is at risk. Replace the cartridge with a data-grade tape. (SESSION: 21550, PROCESS: 784)
03/31/2005 06:11:36  ANR8950W Device /dev/rmt1, volume L20056 has issued the following Warning TapeAlert: The tape drive may have a hardware fault. Run extended diagnostics to verify and diagnose the problem. Check the tape drive users manual for device specific instruction on running extended diagnostic tests. (SESSION: 21550, PROCESS: 784)
03/31/2005 06:17:33  ANR8950W Device /dev/rmt2, volume L20066 has issued the following Warning TapeAlert: The cartridge is not data-grade. Any data you write to the tape is at risk. Replace the cartridge with a data-grade tape. (SESSION: 21550, PROCESS: 786)
03/31/2005 06:17:33  ANR8950W Device /dev/rmt2, volume L20066 has issued the following Warning TapeAlert: The tape drive may have a hardware fault. Run extended diagnostics to verify and diagnose the problem. Check the tape drive users manual for device specific instruction on running extended diagnostic tests. (SESSION: 21550, PROCESS: 786)
03/31/2005 06:39:38  ANR8950W Device /dev/rmt4, volume L20014 has issued the following Warning TapeAlert: The cartridge is not data-grade. Any data you write to the tape is at risk. Replace the cartridge with a data-grade tape. (SESSION: 21550, PROCESS: 785)
03/31/2005 06:39:39  ANR8948S Device /dev/rmt4, volume L20014 has issued the following Critical TapeAlert: The operation has failed because the tape in the drive has snapped: 1. Do not attempt to extract the tape cartridge. 2. Call the tape drive supplier help line. (SESSION: 21550, PROCESS: 785)
03/31/2005 06:39:39  ANR8949E Device /dev/rmt4, volume L20014 has issued the following Critical TapeAlert: The tape drive needs cleaning: 1. If the operation has stopped, eject the tape and clean the drive. 2. If the operation has not stopped, wait for it to finish and then clean the drive. Check the tape drive users manual for device specific cleaning instructions. (SESSION: 21550, PROCESS: 785)
03/31/2005 06:39:39  ANR8950W Device /dev/rmt4, volume L20014 has issued the following Warning TapeAlert: The tape drive is due for routine cleaning: 1. Wait for the current operation to finish. 2. Then use a cleaning cartridge. Check the tape drive users manual for device specific cleaning instructions. (SESSION: 21550, PROCESS: 785)
03/31/2005 06:39:39  ANR8949E Device /dev/rmt4, volume L20014 has issued the following Critical TapeAlert: The tape drive has a hardware fault: 1. Eject the tape or magazine. 2. Reset
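Since the tell-tale sign here is several contradictory TapeAlerts for one drive within the same minute, a quick way to spot it is to group the alerts by device and minute. The sketch below is Python written against the message layout visible in this post only; the find_flurries name and the threshold of 3 are illustrative assumptions, not a TSM feature.

```python
import re
from collections import defaultdict

# Layout taken from the activity log in this post, e.g.
# "03/31/2005 06:39:39 ANR8948S Device /dev/rmt4, volume L20014 has issued
#  the following Critical TapeAlert: ..."
# The volume field can be empty (see the /dev/smc0 entry).
ALERT_RE = re.compile(
    r"(\d{2}/\d{2}/\d{4} \d{2}:\d{2}):\d{2}\s+"
    r"(ANR\d{4}[A-Z]) Device (\S+), volume\s+(\S*?)\s*has issued the "
    r"following (Critical|Warning) TapeAlert:"
)


def find_flurries(log_text, threshold=3):
    """Group TapeAlerts by (minute, device); keep groups with >= threshold alerts."""
    groups = defaultdict(list)
    for minute, msg_id, device, volume, severity in ALERT_RE.findall(log_text):
        groups[(minute, device)].append((msg_id, volume, severity))
    return {key: alerts for key, alerts in groups.items() if len(alerts) >= threshold}


# Abbreviated entries copied from the log in this post.
SAMPLE = (
    "03/31/2005 06:00:50 ANR8949E Device /dev/smc0, volume has issued the "
    "following Critical TapeAlert: The library can not operate without the magazine. "
    "03/31/2005 06:39:38 ANR8950W Device /dev/rmt4, volume L20014 has issued the "
    "following Warning TapeAlert: The cartridge is not data-grade. "
    "03/31/2005 06:39:39 ANR8948S Device /dev/rmt4, volume L20014 has issued the "
    "following Critical TapeAlert: The operation has failed because the tape in the drive has snapped: "
    "03/31/2005 06:39:39 ANR8949E Device /dev/rmt4, volume L20014 has issued the "
    "following Critical TapeAlert: The tape drive needs cleaning: "
)
flurries = find_flurries(SAMPLE)
```

On the abbreviated sample, only /dev/rmt4 at 06:39 is reported. A burst like that on a single drive, with no matching hardware error entries, looks more like a reporting problem than like a tape that snapped, needed cleaning and had a hardware fault all at once.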
Re: 3583 Meltdown
On Fri, 1 Apr 2005, Curtis Stewart wrote: Here's an interesting little piece of activity log from one of my remote TSM servers Gotta like TapeAlert. Other than being an amusing way to start my day, everything checks out OK and I can't believe it. [...lots of bogus TapeAlerts omitted...] And then there were three. :-) So you're now the third person who has posted to this list in the past month or so with this same problem. At least this *looks* to be the same problem - yours is with a 3583, the other instances I'm aware of are with 3584s. Please see recent postings with the subject 'LTO2 corrupted index question', where Jurjen Oskam and I discussed the 'flurry of silly TapeAlerts' problems we've been seeing on our 3584 libraries. In my case, I sometimes see a dozen of these TapeAlerts all at the same time, telling me the tape just snapped, the drive needs cleaning, the drive was just cleaned but the cleaning cartridge is no good, the data cartridge is no good, the drive has a hardware fault, etc., etc. None of which appear to be true. I get no error log entries or Atape dumps at the time of these messages; I'm current on drive and library firmware and on Atape. These may be harmless, or they may not be; in either case, it makes it harder to track and deal with real problems with the tape drives. Last I heard, both Jurjen and I have problems open with IBM hardware support (personally, I suspect a firmware problem, but obviously that's just an uneducated guess). Maybe if everyone who's seeing these messages contacts IBM support, they'll have a better chance of figuring out what's going on. Regards, Bill Bill Kelly Auburn University OIT 334-844-9917
Re: 3583 Meltdown
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of Curtis Stewart Here's an interesting little piece of activity log from one of my remote TSM servers Gotta like TapeAlert. Other than being an amusing way to start my day, everything checks out OK and I can't believe it. 1. The 6AM job is a copy stgpool operation. 2. Despite the TapeAlert warnings, there were no stuck tapes. 3. L20014 wasn't snapped when visually inspected. 4. The library wouldn't mount tapes even though no drives were being used. 5. Stopped TSM, rebooted the library, started TSM and all was well again. 6. Audited each volume reported below without a single failure. TSM Server: 5.2.4.1 OS AIX 5.2 Library 3583 with 3 LTO2 drives (SCSI) I don't really have a question, just thought it was odd and someone might have insight. My first inclination is to not trust the TapeAlert messages' doom-and-gloom pronouncements. That being said, I actually don't know a lot about TapeAlert, other than the fact that in older versions of TSM (before version 5) TSM and TapeAlert did not work and play well together. It is possible that the relationship has improved, since one of TSM 5.2's selling points was improved TapeAlert support. It sounds like TapeAlert does something to stop TSM from performing its operations until the something is cleared or TSM is restarted. -- Mark Stapleton ([EMAIL PROTECTED]) IBM Certified Advanced Deployment Professional Tivoli Storage Management Solutions 2005 Office 262.521.5627
Re: 3583 Meltdown
We did find Tape Alert messages useful in the past (told us which tapes were affected by the corrupted index problem). But we too were getting too many of these useless errors so we decided to turn tape alert off and only use if we suspect a problem. (Perhaps we should leave tape alert on but not report on any of the errors ...) We are on TSM 5.2.2.4 on Windows with SCSI LTO1 and LTO2 in 3584's. Tim Rushforth City of Winnipeg -Original Message- From: Bill Kelly [mailto:[EMAIL PROTECTED] Sent: Friday, April 01, 2005 1:58 PM To: ADSM-L@VM.MARIST.EDU Subject: Re: 3583 Meltdown [...quoted message omitted...]
Re: How to schedule the backup?
Thanks Andy. Unfortunately I am still sitting on TSM 5.1/5.2. The customer needs the backup for weekdays, not for the weekend. But for Monday I am only allowed to start the backup at 2:00am on Tuesday; likewise for Friday, I do it at 2:00am on Saturday. Maybe there is a workaround, but I haven't gotten a chance to try it: 1. Create a regular incremental backup schedule on weekdays, starting at 23:59. 2. On the client, customize the preschedulecmd to make the schedule sleep for 2 hours. It is a Unix client, so just sleep 7200. I will let you guys know how my test goes. On Mar 31, 2005 4:30 PM, Andrew Raibeck [EMAIL PROTECTED] wrote: As has already been mentioned, TSM 5.3 has an enhanced schedule feature that allows you to do this with one schedule. Otherwise you will need to define 5 schedules, one for each day the event should run. While it might take slightly more effort to set up 5 schedules instead of 1, once they are defined, you're done. While admin schedules can be used to define and delete client schedules, you'll lose prior event information when the schedules are deleted. Regards, Andy Andy Raibeck IBM Software Group Tivoli Storage Manager Client Development Internal Notes e-mail: Andrew Raibeck/Tucson/[EMAIL PROTECTED] Internet e-mail: [EMAIL PROTECTED] The only dumb question is the one that goes unasked. The command line is your friend. Good enough is the enemy of excellence. ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU wrote on 2005-03-31 13:01:11: I want to setup one client schedule which only starts at Tue/Wed/Thu/Fri/Sat. How can I do it? TIA.
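The arithmetic behind the proposed workaround is easy to check. The Python sketch below assumes the scheduler fires the pre-schedule command exactly at 23:59, which a real client may not do if the start is spread over a startup window; effective_start is a made-up helper for illustration.

```python
from datetime import datetime, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def effective_start(day, sched_time="23:59", sleep_seconds=7200):
    """Return when the backup really begins if the schedule fires at
    sched_time on 'day' and the pre-schedule command sleeps first."""
    hour, minute = map(int, sched_time.split(":"))
    fired = datetime(day.year, day.month, day.day, hour, minute)
    return fired + timedelta(seconds=sleep_seconds)


monday = datetime(2005, 4, 4)   # the Monday after this post
for offset in range(5):         # Monday through Friday
    day = monday + timedelta(days=offset)
    run = effective_start(day)
    print(WEEKDAYS[day.weekday()], "->", WEEKDAYS[run.weekday()], run.strftime("%H:%M"))
```

Under that assumption a Monday-to-Friday schedule does land on Tuesday through Saturday, at 01:59 rather than 02:00; a sleep of 7260 seconds would push it to 02:00 exactly if the extra minute matters.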
Licensing again
Hello, If I have an Exchange server, and I'm backing up only exchange database, no file backup; do I need TDP license plus server license, or just TDP license. Regards, Joe Crnjanski Infinity Network Solutions Inc. Phone: 416-235-0931 x26 Fax: 416-235-0265 Web: www.infinitynetwork.com
Re: Licensing again
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of Joe Crnjanski If I have an Exchange server, and I'm backing up only exchange database, no file backup; do I need TDP license plus server license, or just TDP license. Being that you cannot run TSM for Mail (Exchange) unless you install the TSM client as well, I suspect you'll need to license both. Talk to your Tivoli reseller; they should be able to tell you. -- Mark Stapleton ([EMAIL PROTECTED]) IBM Certified Advanced Deployment Professional Tivoli Storage Management Solutions 2005 Office 262.521.5627
Re: Licensing again
Yes, he definitely needs both. One is for TDP for Exchange Server, one is for the TSM Client. Both of them must be registered on the TSM Server. On Apr 1, 2005 9:48 PM, Stapleton, Mark [EMAIL PROTECTED] wrote: [...quoted message omitted...]
Remove from list
Please remove me from your mailing list. Thanks, Dan