All,

My colleague found the following article from the University of Florida, which explains the unload/load concept and, more importantly, provides a SELECT statement that shows how fragmented the TSM DB is, so you can determine whether the unload is really needed. It takes a few seconds to run but can eliminate unnecessary hours of potential pain.

Also, APAR IC47516 is out there for those running Celerra backups: an unload will not update the inventory table, and future Celerra backups will eventually fail.

TSM Database reloading

Summary

The database at the core of a TSM instance is prone to fragmentation, which increases its size. As of March 2005, there are no online utilities available to correct this problem. The increased size and fragmentation are reflected in expiration time and backup speed, eventually presenting an obstacle to normal operations. This document describes a detailed procedure for the current recommendation to solve this problem: unloading and reloading the TSM database.

Evaluating the benefits of a reload

Before you take down the server for an unload and reload, it would be wise to estimate whether the size reduction that follows the procedure is worth the effort. The unload and reload can take a long time, so a small reduction is probably not worth it. There is a query recommended by the TSM listserv which purports to estimate the degree of fragmentation your database is experiencing.

    SELECT CAST((100 - (CAST(MAX_REDUCTION_MB AS FLOAT) * 256 ) / -
    (CAST(USABLE_PAGES AS FLOAT) - CAST(USED_PAGES AS FLOAT) ) * 100) AS -
    DECIMAL(4,2)) AS PERCENT_FRAG FROM DB

should generate a number by which you can estimate the benefit that would accrue from your unload/reload. (The hyphen at the end of each line is the administrative client's line-continuation character, not a minus sign.)

FIXME: In this paragraph I will calibrate the returns from the query and suggest when is a good time.

Performing the unload-reload

1. Prepare your environment for recovery

   You will essentially destroy your TSM database as you perform the unload. You would be well advised to prepare for a smooth disaster recovery before you begin. You should, at least:

   * Identify the device class to which you intend to unload the DB. In this example I am going to call it DBUNLOAD.
   * Ensure that the device class in question has capacity adequate to receive the unload.
     If you have enough space to hold your total DB volume, plus 10-20 percent, you should be fine. You expect, of course, that the unload will be substantially smaller than the live DB.
   * Back up your VOLUMEHISTORY and DEVCONFIG.
   * Perform a database backup, full or incremental.
   * Locate and read the TSM documentation on DSMSERV LOADFORMAT, DSMSERV AUDITDB, DSMSERV UNLOADDB and DSMSERV LOADDB. There is a reference to the IBM and Tivoli documentation on the web at the Administrator Documentation <http://open-systems.ufl.edu/services/NSAM/admin_docs/index.html> page of this site.
   * You might wish to disable schedules in dsmserv.opt with the disablescheds option. This will avoid interference as you bring the server up again, Just In Case.
   * Double-check the characteristics of your database and server. Are you in rollforward mode or normal? Are your volumes mirrored as you expect? Are the volumes in the locations you expect? Do you use any server-to-server communications? You'll need to know these things at the end of your reload if you are to ensure that they are all working properly again.

2. Halt the server

   When you stop the TSM server for this process, do so with the 'quiesce' parameter, which makes it possible to perform the unload and reload without auditing the database afterwards.

3. Perform the unload

   This will probably be the longest of your steps. Some examples of how long it has taken others are available in this list <http://open-systems.ufl.edu/services/NSAM/maint_docs/db_un_reload.html> of real-world experiences. During the unload process, the TSM server takes all of the scattered data blocks and assembles them in order.

   Be sure to carefully read the documentation of the DSMSERV UNLOADDB command in the TSM docs. I use

       DSMSERV UNLOADDB devclass=DBUNLOAD \
           > /var/tmp/unloaddb.log 2>&1 < /dev/null &

   This formulation lets you watch the log (possibly from some location other than the one from which you began the process) and removes some concerns about (say) the machine on which your terminal resides dying in the interim.

   This command ought to result in a consistent database image. No audit ought to be necessary.

   At the end of the log output of the unload process, you will see a recap of the list of volumes used. This list will be necessary at reload time.

4. Format the DB containers

   You must prepare the DB containers to receive the load. This process overwrites the recovery log, but you'd already blown away the database in the unload. You did do an incremental backup in step 1, right?

   Be sure to carefully read the documentation of the DSMSERV LOADFORMAT command in the TSM docs. This command will be different for every installation. One of mine is

       DSMSERV LOADFORMAT 2 /dev/rtwebctlglv01a /dev/rtwebctlglv02a \
           4 /dev/rtwebctdblv01a /dev/rtwebctdblv02a \
           /dev/rtwebctdblv03a /dev/rtwebctdblv04a \
           > /var/tmp/loadformat.log 2>&1 < /dev/null &

   This formulation lets you watch the log (possibly from some location other than the one from which you began the process) and removes some concerns about (say) the machine on which your terminal resides dying in the interim.

   You may wish to use an alternate log volume for this process, one which is very small. The majority of the time taken by the LOADFORMAT is the initialization of the log. Once your server is up and running, you can add the production log volumes back to the log scheme and re-extend the log.

   This format process formats ALL the database volumes supplied as a single database. If your database is mirrored, you should supply only one set of volumes, not both. You'll re-mirror the database once the process is complete.

   The loadformat process is fairly quick. Expect minutes, rather than tens of minutes.

5. Perform the load

   The load process is usually substantially shorter than the unload; less than half the time is quite common. During this process, the TSM server feeds the well-ordered data blocks back onto your server DB volumes.

   Be sure to carefully read the documentation of the DSMSERV LOADDB command in the TSM docs.

       DSMSERV LOADDB devclass=DBUNLOAD \
           VOLumenames=vola,volb[,...] \
           > /var/tmp/loaddb.log 2>&1 < /dev/null &

   This formulation lets you watch the log (possibly from some location other than the one from which you began the process) and removes some concerns about (say) the machine on which your terminal resides dying in the interim.

   This command ought to result in a consistent database image. No audit ought to be necessary.

6. Clean up the detritus

   Now you are ready to restart the server and check that all is well. Some things you should expect, or expect to do:

   * Your DB will have its assigned capacity set to the complete capacity of all available volumes. I prefer to run with somewhat less; according to local conventions, you might want to shrink it some.
   * If your database was mirrored before, re-define the mirror copies. If you accidentally formatted both sets of volumes, blow away the empty ones (there should be plenty of empty ones) and redefine them in a manner that permits the re-mirroring.
   * If you used a temporary log volume to shorten loadformat time, put your production volumes in place.
   * Do a full DB backup. You want to safeguard this new, more organized DB state.
   * For each of the servers with which you have set up server-to-server communications, perform an UPDATE SERVER FORCESYNC=YES so that the server identification token can be updated.
   * Back up your VOLUMEHISTORY and DEVCONFIG.
   * If you set disablescheds in your dsmserv.opt, remove it now, and halt and restart the server.

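The fragmentation query quoted under "Evaluating the benefits of a reload" is plain arithmetic, so it can be checked by hand before any downtime is scheduled. The sketch below is Python, not TSM, and only mirrors the expression in that SELECT. The function name and the sample figures are made up for illustration; the factor 256 is copied from the query and is presumably the number of 4 KB pages per MB.

```python
def percent_frag(max_reduction_mb: float, usable_pages: float, used_pages: float) -> float:
    """Mirror the PERCENT_FRAG expression from the SELECT against the DB table."""
    free_pages = usable_pages - used_pages
    if free_pages <= 0:
        # The query would divide by zero (or a negative number) here.
        raise ValueError("no free pages: the expression is undefined")
    # MAX_REDUCTION_MB converted to pages, using the factor from the query.
    reclaimable_pages = max_reduction_mb * 256
    # Share of free space that cannot be given back without a reload.
    return round(100 - reclaimable_pages / free_pages * 100, 2)


if __name__ == "__main__":
    # Hypothetical figures, not taken from any real server.
    print(percent_frag(max_reduction_mb=1000, usable_pages=2_560_000, used_pages=1_536_000))
```

With these made-up inputs there are 1,024,000 free pages, of which 256,000 could be handed back without a reload, so the expression comes to 75. On this reading, a value near 0 means nearly all free space is already reclaimable, and a high value means most of it is trapped between used pages.
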
Real-world experiences:

Platform  Disk tech  Original size  Final size           Unload time  Load time             Comments
Win2K     [unknown]  100GB          50GB (50% decrease)  22 hours     8 hours (64% faster)  Inventory expiration went from 21 hours to 7
AIX       SSA        43GB           28GB (34% decrease)  11 hours     3 hours (72% faster)
AIX       SSA        16GB           12GB (25% decrease)  4 hours      1 hour (75% faster)

Last updated: 2005-06-06T16:50:21-04:00
Copyright (c) 2004, 2005 Open Systems Group <http://open-systems.ufl.edu/>, Computing and Networking Services <http://www.cns.ufl.edu/>, University of Florida <http://www.ufl.edu/>.

Regards,
Brian

Brian Scott
EDS Global Client Engineering-GM
MS 3234
4594 W Nancy Dr.
Kankakee, IL 60901
Phone: +1-815-939-2684
mailto:[EMAIL PROTECTED]

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED]] On Behalf Of Kelly Lipp
Sent: Wednesday, May 17, 2006 6:08 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: TSM 5.3.3 loaddb and audit problem

Richard,

I could not agree more with your stance regarding Dump/Load. However, I'm in Holland teaching a Level 2 class and have been surprised to learn that a lot of my students perform this action as a matter of course on their servers. The objective is to reduce the size of aged TSM databases. In TSM 5.3 we have new functionality to determine whether a db reorg would reclaim a significant amount of space. Then the Dump/Load is executed to get this space.

Do you suppose this new command is encouraging us to do something that is high risk? Alternatives?

I guess they've decided the risk is worth the potential gain. I personally have not experienced the problem, so I have not attempted this solution.

Kelly J. Lipp
VP Manufacturing & CTO
STORServer, Inc.
485-B Elkton Drive
Colorado Springs, CO 80907
719-266-8777
[EMAIL PROTECTED]

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED]] On Behalf Of Richard Sims
Sent: Tuesday, May 16, 2006 6:46 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] TSM 5.3.3 loaddb and audit problem

Do not take any further actions on your own: call TSM Support and engage them in the problem. You risk doing further damage to your database if you continue tinkering with it, as we and IBM have stressed in the past.

It seems this needs to be stressed again: DO NOT ELECTIVELY RUN UNLOADDB - LOADDB ON YOUR TSM DATABASE!! These are *salvage* utilities. The ADSM-L archives chronicle the horror stories of customers who have followed mis-advice and proceeded to perform "compress" on their TSM database. If you need corroboration on this, review the APARs on these utilities. Such software does not receive a lot of attention from developers, who are pressed to work on new features rather than old, lesser-used utilities like these. And there are no long-term gains in reorganizing your TSM database: it's a lot of risk and no real gain.

We've seen too many customers in pain because of this stuff, and I don't want to see any more.

Richard Sims

On May 16, 2006, at 8:09 AM, Abdulaziz Almuammar wrote:

> Dear All,
> we did unloaddb and loaddb but after the loaddb we faced a problem on
> the backup of the nodes and it was resolved by upgrading TSM server
> from 5.3.2 to 5.3.3.
> However, we are facing a problem on some nodes when we do restore,
> some files could not be restored and we got a message that those files
> are not available on the TSM server :( although all volumes with
> "readwrite" access status
>
> to make sure that the TSM db information is synced we have to run the
> auditdb but the problem with this is it takes a long time to do it and
> it is offline process
>
> Is there another way to make sure that the database information is
> correct?
> Is audit volume command on the storagepool level (All volumes) will
> do the same job as auditdb? although it takes a long time but at least
> the TSM server is up
>
> Regards,
> Abdul