Re: Our TSM system is a mess. Suggestions? Ideas?
Look at that Clariion system. While you may have multiple links to the SAN, the Clariion might not. I ran into this where our Clariion had lots of disks (spindles) but only 2 ports for I/O into the box. It turned out we were running both ports at over 80%! Try to get your DB and log volumes on local disks for better throughput.

Jerry Michalak
jerry_...@yahoo.com

From: Dury, John C. jd...@duqlight.com
To: ADSM-L@VM.MARIST.EDU
Sent: Sun, February 14, 2010 7:12:16 AM
Subject: [ADSM-L] Our TSM system is a mess. Suggestions? Ideas?

We have about 500 nodes and a backup window from 5pm until 7am. I have our backup schedule set up so that about 30 nodes do an incremental per hour, with a few exceptions. We have a 3T disk storage pool and 4 LTO4 drives in our tape library. Our dbbackuptrigger is set at logfull 30% and numincrementals of 4. Our recovery log is filling up almost once per hour while backups are running and not emptying fast enough before it hits 80%, at which point all backups slow to a crawl until it is emptied below 80%. Sometimes the recovery log is pinned at 70% or so and another backup kicks off immediately, which again does not empty fast enough, and the whole system goes into slowdown once the recovery log is past 80%. Expiration, which used to run in about 6 hours, is not completing even after running for 24 hours. Our DB is about 97gig and about 74% full. The recovery log is maxed at 13gig. I don't see anything out of the ordinary in the activity log.

The TSM server is AIX 5.3.10.1 TL10 running on an IBM 9131-52A in a logical partition with 20 CPUs configured and about 32G of RAM. The TSM DB and disk storage pools are attached to a Clariion CX3-80 via 4G HBAs. I have the recovery log and TSM DB set to use different HBAs than the disk or tape storage pools so the HBAs aren't fighting each other. I've read the tuning and performance manual and matched our settings to its suggestions, with some small exceptions.

We have purchased new hardware, a monster of a box, to move the whole system to Linux, since we want to get to TSM v6.x eventually, hopefully sooner rather than later. AIX hardware and support are tremendously expensive compared to an Intel-based box and, like a lot of people, we have a very small budget for anything IT related.

One of the biggest problems we are having is the recovery log filling up too quickly and not emptying fast enough. Even with a log full trigger of 30%, the incremental backup won't finish before the recovery log hits 80%, and with the log full setting so low, we are doing TSM DB backups almost every hour while clients are backing up. This really seems excessive to me. Why would an incremental backup of the TSM DB take an hour or so to run, and is it normal for the recovery log to fill up so fast while backups are running?

We even attempted a reorg of the TSM DB, but unfortunately it was going to run for much longer than our window allowed, so it had to be cancelled. I'm going to try again next weekend and hopefully talk the powers that be into a 24 hour window for the reorg. We did a reorg years ago and the performance improvements were amazing, i.e. expiration ran in less than an hour. I know that is a band-aid, but I have to do something until I can get to version 6, when I can have a bigger recovery log and a new, more powerful server in place.

I guess I'm just not sure what to look at at this point and frankly I'm exhausted. Our help desk is calling me every day at 6am or earlier because TSM is running slow again. Any suggestions on what else to look at?

(Sorry for such a fragmented email. I've had about 3 hours sleep at this point.)
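To put numbers on the question of whether the log is filling "too fast", here is a rough Python sketch using only the figures quoted in the message above (13 GB log, 30% trigger, 80% slowdown point, DB backup taking about an hour). It assumes, as a simplification, that the log keeps filling at a steady rate and is only cleared once the triggered DB backup completes; it ignores log pinning entirely. The function names are made up for illustration.

```python
# Figures quoted in the original message; the model itself is a simplification.
LOG_SIZE_GB = 13.0      # recovery log is maxed at 13 GB
TRIGGER_PCT = 30.0      # dbbackuptrigger logfull setting
SLOWDOWN_PCT = 80.0     # utilization at which backups slow to a crawl
DB_BACKUP_HOURS = 1.0   # reported duration of an incremental DB backup


def headroom_gb(log_size_gb, trigger_pct, slowdown_pct):
    """Log space left between the trigger firing and the slowdown point."""
    return log_size_gb * (slowdown_pct - trigger_pct) / 100.0


def max_sustainable_rate(log_size_gb, trigger_pct, slowdown_pct, backup_hours):
    """Highest steady log write rate (GB/hour) that a DB backup of the given
    duration can absorb before utilization reaches the slowdown point."""
    return headroom_gb(log_size_gb, trigger_pct, slowdown_pct) / backup_hours


room = headroom_gb(LOG_SIZE_GB, TRIGGER_PCT, SLOWDOWN_PCT)
rate = max_sustainable_rate(LOG_SIZE_GB, TRIGGER_PCT, SLOWDOWN_PCT, DB_BACKUP_HOURS)
print(f"headroom: {room:.1f} GB, sustainable log rate: {rate:.1f} GB/hour")
```

Under these assumptions, client activity that writes more than about 6.5 GB of log per hour will reach 80% before a one-hour DB backup can finish, which matches the behaviour described. Halving the DB backup time doubles the rate the log can absorb.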
Re: Our TSM system is a mess. Suggestions? Ideas?
Verify that the CX3-80 is using different physical disks for the DB, log and disk pools. Your AIX server can easily outrun a CX3-80 unless care is taken. Also make sure that the disks are spread between the 2 SPs in the CX3-80.

We are running 500 clients on an AIX LPAR with 0.9 CPU (it can steal up to 3 in an 8-way box) and our disk pools are on an overworked CX-300. DB and logs are on DMX.

To me it sounds like the disks may be configured in a less than optimal way. And, as others have said, find and fix or reschedule the nodes that are pinning the log.

Andy Huebner
Re: Our TSM system is a mess. Suggestions? Ideas?
First, I want to thank all of you for your replies. I definitely got some good ideas and have some things to look at. I'm going to make some changes to where the DB and recovery log are stored. Right now, they are in the same RAID 1 group with 2 133g drives. I created a new LUN of 6 133g drives set up as RAID 1/0. Eventually all of this will be moved to a bigger box with the storage pools, DB and recovery log living on local disks. Thanks everyone!
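As a quick sanity check on that layout change, the following Python sketch compares the two groups mentioned above. It uses nominal drive sizes only (133 GB per drive, ignoring formatting overhead and hot spares), and the helper names are illustrative rather than anything from the array's tooling.

```python
DRIVE_GB = 133  # nominal drive size quoted in the message


def raid1_usable_gb(drive_gb):
    """RAID 1 on two drives: every block is mirrored, so one drive's worth is usable."""
    return drive_gb


def raid10_usable_gb(drive_gb, drives):
    """RAID 1/0: data is striped across mirrored pairs, so half the drives hold copies."""
    if drives < 2 or drives % 2 != 0:
        raise ValueError("RAID 1/0 needs an even number of drives")
    return drive_gb * drives // 2


old_usable = raid1_usable_gb(DRIVE_GB)
new_usable = raid10_usable_gb(DRIVE_GB, 6)
print(f"old RAID 1 group : {old_usable} GB usable on 2 spindles")
print(f"new RAID 1/0 LUN : {new_usable} GB usable on 6 spindles (3 mirrored pairs)")
```

The capacity roughly triples, but the more relevant change for the DB and log is that I/O is now spread over six spindles instead of two.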
Re: Our TSM system is a mess. Suggestions? Ideas?
<plug>
When running AIX (or Linux), run nmon to keep an eye on your system's performance:
http://www.ibm.com/developerworks/aix/library/au-analyze_aix/
And of course put the collected nmon files through nmon analyser:
http://www.ibm.com/developerworks/aix/library/au-nmon_analyser/
We do this once a month or so (if time permits) to identify bottlenecks and hot spots.
</plug>

Good luck,

--
Kind Regards, Groetje,
Marcel Anthonijsz
Re: Our TSM system is a mess. Suggestions? Ideas?
Hi... We have a similar TSM system. Our TSM DB (over 400GB) is only about 1/3 full and we proactively run incrementals. We have fourteen LTO3 tape drives directly connected using 4Gbps FC adapters. The server is an IBM 9155-55A (8-way, 64GB of memory) with an IBM SVC (four nodes)/FAStT system using DS4800s that provides about 6TB of TSM disk cache over four 4Gbps FC adapters. Using fast disk space helps in TSM DB backups and other TSM activities. Our server environment is SAP R/3 landscapes on RISCs, Intel/MS and VMware farms, etc.

I would think increasing the TSM DB size and using a faster disk system would help. Placing an SVC in front of the disk space helps with caching the data.
Re: Our TSM system is a mess. Suggestions? Ideas?
BTW, we have TSM V5.5.2.0 with an IBM 3584-L32 and three 3584-D32s. A TSM system needs I/Os, I/Os, I/Os, and fast I/Os, plus a lot of disk space.

-----Original Message-----
From: Lamb, Charles P.
Sent: Sunday, February 14, 2010 9:35 AM
To: 'ADSM-L@VM.MARIST.EDU'
Subject: RE: Our TSM system is a mess. Suggestions? Ideas?
Re: Our TSM system is a mess. Suggestions? Ideas?
Hello John,

I am sorry, maybe I do not understand your request, but I have very simple advice: just increase the logs at least 3-4 times. You will have enough time for everything. I have 130 nodes and a 16GB database with 8GB logs, and sometimes that is not enough. I think 13GB of logs is not enough for such a big TSM database and such a large number of nodes.

Regards,
Grigori
Re: Our TSM system is a mess. Suggestions? Ideas?
John,

A few things come to mind.

Which nodes are pinning the recovery log? In my experience it is always a few slow nodes (typically with a lot of small files) that pin the log. Try to find out which ones do, and try to improve these nodes so that they back up faster. That is a hell of a job when you have 500 nodes, but try to find those that take longer than 4-5 hours or have a really slow throughput speed. A speed/duplex mismatch on a TSM client has killed my log performance more than once. You can look in TSM reporting for the slowest nodes.

IMHO, TSM 6.1.x will not solve your problem.

Another solution would be to turn the cell phone off every other day ;-)

Good luck,

--
Kind Regards, Groetje,
Marcel Anthonijsz
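The hunt for slow nodes can be roughed out in a few lines of Python once per-node elapsed time and data volume have been exported from whatever reporting is available. Everything below is hypothetical: the node names, the sample figures and the 1 MB/s throughput floor are made up for illustration, and only the "longer than 4-5 hours" cut-off comes from the advice above. Elapsed hours are assumed to be greater than zero.

```python
from dataclasses import dataclass


@dataclass
class BackupRun:
    node: str
    hours: float       # elapsed time of the nightly incremental (assumed > 0)
    gigabytes: float   # data transferred during that run

    @property
    def mb_per_sec(self):
        """Average throughput over the whole session."""
        return self.gigabytes * 1024 / (self.hours * 3600)


def slow_nodes(runs, max_hours=4.0, min_mb_per_sec=1.0):
    """Return runs that either take too long or move data too slowly,
    longest-running first. Both thresholds are illustrative defaults."""
    flagged = [r for r in runs
               if r.hours > max_hours or r.mb_per_sec < min_mb_per_sec]
    return sorted(flagged, key=lambda r: r.hours, reverse=True)


# Hypothetical sample data, not taken from the thread.
runs = [
    BackupRun("FILESRV01", 6.5, 40.0),   # long-running: flagged on elapsed time
    BackupRun("DBSRV02", 1.0, 120.0),    # fast: not flagged
    BackupRun("WEB03", 2.0, 3.0),        # short but very slow: flagged on throughput
]

for run in slow_nodes(runs):
    print(f"{run.node}: {run.hours:.1f} h, {run.mb_per_sec:.2f} MB/s")
```

Nodes flagged on throughput alone are the ones worth checking for a speed/duplex mismatch first.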
Re: Our TSM system is a mess. Suggestions? Ideas?
On Feb 14, 2010, at 11:32 AM, Grigori Solonovitch wrote:

> Hello John, I am sorry, maybe I do not understand your request, but I have very simple advice - just increase logs at least 3-4 times. You will have enough time for everything. I have 130 nodes and 16Gb database with 8GB logs and sometimes it is not enough. I think 13GB logs for so big TSM database and big number of nodes is not enough.

The architectural size limit for the TSM 4 and 5 Recovery Log is 13 GB.

Recovery Log pinning is often due to scheduling congestion or data transmission speed of clients, usually involving very large files.

Richard Sims
Re: Our TSM system is a mess. Suggestions? Ideas?
Hi John,

It looks like you may have a few nodes that are backing up much more slowly than the majority. You could try to reduce the transaction size for those nodes; that could help, as long as these nodes are not backing up just a single huge file. If you really need to, move these nodes off to a separate TSM instance on the same server.

Check out the bufferpool. In 'q db' you'll find the cache hit percentage; if that drops, your database is hitting the disk more often. Below 98% is unacceptable, and above 99% is recommended.

You do mention the type of controller, but not the type of disks. There is a lot to be gained by using LUNs that stripe across a huge number of very fast disks, or by setting up (in your case) about 4 to 6 dedicated RAID-1 LUNs of 15k RPM disks for the database.

It sounds like you're using the log in roll-forward mode. This is of course the recommended setting, but it might be worsening the problem. You might want to think about using normal mode until you upgrade to 6.1.

Btw, it sounds like you have quite a large LPAR for your TSM server, much larger than needed. With a database of this size, I'd guess that 2 to 4 CPUs and 4 GB of RAM should be plenty. Do you run other applications on your TSM LPAR?

--
Met vriendelijke groeten/Kind Regards,
Remco Post
r.p...@plcs.nl
+31 6 248 21 622
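The cache hit thresholds above are easy to turn into a check that can be run against a value read off 'q db' by hand. This is a small Python sketch: the 98% and 99% boundaries are the ones given in the message, while the "marginal" label for the band in between and the function name are assumptions added for illustration.

```python
def rate_cache_hit(pct):
    """Classify a database bufferpool cache hit percentage.

    Below 98% is treated as unacceptable and above 99% as the recommended
    range, per the advice above. The label for the 98-99% band is an
    assumption of this sketch.
    """
    if not 0.0 <= pct <= 100.0:
        raise ValueError("cache hit percentage must be between 0 and 100")
    if pct < 98.0:
        return "unacceptable"
    if pct > 99.0:
        return "recommended"
    return "marginal"


# Example readings, typed in by hand rather than parsed from server output.
for reading in (96.4, 98.5, 99.7):
    print(f"{reading:.1f}% -> {rate_cache_hit(reading)}")
```

A reading that drifts out of the recommended band during the backup window would be a hint that the bufferpool is too small for the workload or that the DB disks are the bottleneck.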