Re: Deduplication number of chunks waiting in queue continues to rise?
Hey, Nick, missed your name the first time around!

Being in higher-ed/research we went the cheap route and actually just use direct-attached 15K SAS drives on Dell servers, divvied up into multiple RAID-10 sets. Even a 1TB database only takes us ~1 hour to back up or restore, which is well within our SLA.

On 12/20/2013 11:42 AM, Marouf, Nick wrote:
> Hi Skylar!
>
> Yes, that would be the easy way to do it. There is also an option to rebalance the I/O after you add the new file systems to the database. I had already set up TSM before the performance tuning guideline was released. Doing it this way will require more storage initially, and running the DB2 rebalancing command-line tools will spread out the DB I/O load.
>
> Since we use IBM XIVs that can handle very large I/O requests, in our specific case there was no need to provide physically separate volumes. I've seen one TSM instance crank upwards of 10,000 IOPS, leaving an entire ESX cluster in the dust.

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
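For a rough sense of what "1TB in ~1 hour" means for the disks, here is a minimal arithmetic sketch. It assumes decimal units (1 TB = 10^12 bytes) and a flat one-hour window; the function name is just for illustration and nothing here was measured on the servers discussed in this thread.

```python
def throughput_mb_per_s(size_bytes, seconds):
    """Average throughput in decimal megabytes per second."""
    return size_bytes / seconds / 1_000_000

one_tb = 10**12      # assumption: decimal terabyte
one_hour = 3600      # assumption: exactly one hour

# Sustained rate needed to move 1 TB in one hour.
print(round(throughput_mb_per_s(one_tb, one_hour)))  # prints 278
```

So the RAID-10 sets would need to sustain somewhere in the neighborhood of 280 MB/s for the duration of a backup or restore, under those assumptions.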
Re: Deduplication number of chunks waiting in queue continues to rise?
Hi Wanda,

I'm using deduplication and have found that TSM life would be much easier if the storage pool were kept small, under 3TB in size. I haven't done enough testing with this, and I know it works slightly against achieving the highest deduplication savings. But it sure does make the administrative side much cleaner and easier to work with. Keeping the storage pool smaller does make offsite copies and reclamation faster. It is divide and conquer.

I do have a storage pool that takes longer to reclaim, and this seems to clear up every Sunday, as we generally have very few incoming client backups on that day.

I'm using client-side deduplication to leverage the client processing, and in order to protect against stale or duplicate chunk collisions I keep the cache databases set very low, so that the clients on average have to reset that cache every few days.

Deduplication is very important for me. Please keep me in the loop when you come closer to a resolution.

tsm: TSMC04P>SHOW DEDUPDELETEINFO
Dedup Deletion General Status
  Number of worker threads : 8
  Number of active worker threads : 1
  Number of chunks waiting in queue : 1967534

tsm: TSMC04P>q db f=d
  Database Name: TSMDB1
  Total Size of File System (MB): 1,148,760
  Space Used by Database(MB): 304,105
  Free Space Available (MB): 6,755,930
  Total Pages: 33,625,117
  Usable Pages: 33,623,517
  Used Pages: 33,620,309
  Free Pages: 3,208
  Buffer Pool Hit Ratio: 98.0
  Total Buffer Requests: 21,032,020,059
  Sort Overflows: 0
  Package Cache Hit Ratio: 99.8
  Last Database Reorganization: 12/16/2013 18:21:53
  Full Device Class Name: 3592DEV
  Number of Database Backup Streams: 1
  Incrementals Since Last Full: 0
  Last Complete Backup Date/Time: 12/19/2013 10:12:53

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Erwann Simon
Sent: Friday, December 20, 2013 12:33 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Deduplication number of chunks waiting in queue continues to rise?

Hi Wanda,

Expire Inventory is queuing chunks for deletion. Look at the Q PR output at the end of the expire inventory process, once the total number of nodes has been reached: no more deletion of objects occurs, but SHOW DEDUPDELETEINFO shows that the deletion threads are still working, queuing and deleting chunks. This activity does not appear externally and consumes most of the expire inventory time.

Let's try with deduplication disabled (dedup=no) for that pool (?).

Regards,
Erwann

Prather, Wanda wanda.prat...@icfi.com wrote:

> TSM 6.3.4.00 on Win2K8
>
> Perhaps some of you that have dealt with the dedup chunking problem can enlighten me.
>
> TSM/VE backs up to a dedup file pool, about 4 TB of changed blocks per day. I currently have more than 2 TB (yep, terabytes) of volumes in that file pool that will not reclaim.
>
> We were told by support that when you do:
>
> SHOW DEDUPDELETEINFO
>
> the number of chunks waiting in queue has to go to zero for those volumes to reclaim. (I know that there is a fix at 6.3.4.200 to improve the chunking process, but that has been APARed, and we are waiting on 6.3.4.300.)
>
> I have shut down IDENTIFY DUPLICATES and reclamation for this pool. There are no clients writing into the pool; we have redirected backups to a non-dedup pool for now to try and get this cleared up. There is no client-side dedup here, only server side. I've also set deduprequiresbackup to NO for now, although I hate doing that, to make sure that doesn't interfere with the reclaim process.
>
> But SHOW DEDUPDELETEINFO shows that the number of chunks waiting in queue is *still* increasing. So, WHAT is putting stuff on that dedup delete queue? And how do I ever gain ground?
>
> W
>
> **Please note new office phone:
> Wanda Prather | Senior Technical Specialist | wanda.prat...@icfi.com | www.icfi.com
> ICF International | 443-718-4900 (o)

--
Erwann SIMON
Sent from my Android phone with K-9 Mail. Please excuse my brevity.
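For anyone who wants to track the queue over time rather than eyeball it, here is a minimal Python sketch that pulls the counters out of SHOW DEDUPDELETEINFO output. It assumes the output has already been captured as text (for example from an administrative client session) and that it keeps the "label : number" layout quoted in this thread; the function name is hypothetical and the format may differ at other server levels.

```python
import re

# Sample text copied from the SHOW DEDUPDELETEINFO output quoted in this thread.
SAMPLE = """\
Dedup Deletion General Status
Number of worker threads : 8
Number of active worker threads : 1
Number of chunks waiting in queue : 1967534
"""

def parse_dedupdeleteinfo(text):
    """Return {label: int} for every 'label : number' line in the output."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(.+?)\s*:\s*([\d,]+)\s*$", line)
        if match:
            # Strip thousands separators in case a server level prints them.
            stats[match.group(1)] = int(match.group(2).replace(",", ""))
    return stats

stats = parse_dedupdeleteinfo(SAMPLE)
print(stats["Number of chunks waiting in queue"])  # prints 1967534
```

Logging that one number with a timestamp every so often should be enough to tell whether the deletion threads are gaining or losing ground.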
Re: Deduplication number of chunks waiting in queue continues to rise?
Wanda,

In trying to troubleshoot an unrelated performance PMR, IBM provided me with an e-fix for the dedupdel bottleneck that it sounds like you're experiencing. They obviously will want to do their due diligence on whether or not this e-fix will help solve your problems, but it has proved very useful in my environment. They even had to compile a Solaris e-fix for me, because it seems like I'm the only one running TSM on Solaris. The e-fix was very simple to install. What you don't want to do is go to 6.3.4.2 unless they tell you to, because the e-fix is for that level (207). Don't run on 6.3.4.2 for even a minute. Only install it to get to the e-fix level.

Dedupdel gets populated by anything that deletes data from the stgpool, i.e. move data, expire inv, delete filespace, move nodedata, etc. We run client-side dedupe (which works pretty well, except when you run into performance issues on the server), and so our identifies don't run very long, if at all. It might save you time to run client-side dedupe.

BTW, when I finally got this e-fix and TSM was able to catch up with the deletes and reclaims as it needed to, I got some serious space back in my TDP Dedup pool. It went from 90% util to 60% util (with about 10TB of total capacity). What finally really got me before the fix was that I had to delete a bunch of old TDP MSSQL filespaces, and it just took forever for TSM to catch up. I have a few deletes to do now, and I'm a bit wary because I don't want to hose my server again.

I would escalate with IBM support and have them supply you the e-fix. I don't think 6.3.4.3 is slated for release any time within the next few days, and you'll just be struggling to deal with the performance issue.

HTH,
Sergio
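Since support reportedly said the queue has to reach zero before those volumes reclaim, two readings of "Number of chunks waiting in queue" taken some time apart give a crude idea of whether the server is catching up and roughly how long it might take. The sketch below is only that: a linear extrapolation with made-up sample numbers, and hypothetical function names. A real queue will not drain at a constant rate, especially while expiration or deletes are still feeding it.

```python
def queue_trend(count_before, count_after, seconds_between):
    """Net change in queued chunks per second (negative means the queue is draining)."""
    if seconds_between <= 0:
        raise ValueError("seconds_between must be positive")
    return (count_after - count_before) / seconds_between

def hours_to_drain(count_now, rate_per_second):
    """Hours until the queue reaches zero, or None if it is not shrinking."""
    if rate_per_second >= 0:
        return None
    return count_now / -rate_per_second / 3600

# Hypothetical readings taken one hour apart (not measurements from this thread).
rate = queue_trend(2_000_000, 1_820_000, 3600)
print(rate)                                        # prints -50.0
print(round(hours_to_drain(1_820_000, rate), 1))   # prints 10.1
```

A return value of None from hours_to_drain is the situation Wanda describes: the queue is flat or still growing, so something is adding chunks at least as fast as the worker threads remove them.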
Re: Deduplication number of chunks waiting in queue continues to rise?
Sergio and Wanda,

Thanks for your posts! I opened PMR 10702,L6Q,000 a couple of weeks ago for slow performance [recently completely fell off the cliff!] with our SRV3 TSM v6.3.4.200 service that *was* successfully doing client+server deduplication for a 72TB BackupDedup STGpool on NetApp FC [soon to be 3par] FC disks.

I did not previously know about this command... SHow DEDUPDelinfo now shows 7M enqueued dedupdel chunks @ SRV3 TSM. I just requested escalation to consider whether the TSM v6.3.4.207 efix will help us.

Thanks again... hoping to re-post with better performance results soon!

jim.o...@yale.edu (w#203.432.6693, c#203.494.9201, h#203.387.3030)
Re: Deduplication number of chunks waiting in queue continues to rise?
Woo hoo! That's great news. Will open a ticket and escalate.

Also looking at client-side dedup, but I have to do some architectural planning, as all the data is coming from one client, the TSM VE data mover, which is a VM.

Re client-side dedup, do you know if there is any cooperation between client-side dedup and deduprequiresbackup on the server end? I have assumed that client-side dedup would not offer that protection.

W
Re: Deduplication number of chunks waiting in queue continues to rise?
Please do post results - expiration just ran for me, queue 30M! 45 TB dedup pool
Re: Deduplication number of chunks waiting in queue continues to rise?
Client-side dedup and simultaneous write to a copy pool are mutually exclusive. You can't do both, and doing both is the only theoretical way to enforce deduprequiresbackup with client-side dedup. I suppose IBM could enhance TSM to do a simultaneous-like operation with client-side dedup, but that's not available now. So, I'm not sure how the TSM server enforces deduprequiresbackup with client-side dedup. Ever since 6.1 I have always set that to NO anyway.

I have dealt with the repercussions of that as well. Backup stgpool on dedup'd stgpools is not pretty. I have made some architectural changes to the underlying stgpools, and the 'backup stgpools' run pretty well, even with 1TB SATA drives. Two things I think helped quite a bit:

1. Use big predefined volumes. My new volumes are 50GB.
2. Use many filesystems for the devclass. I have 5 currently. I would use more if I had the space.

Thanks!
Sergio
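Sergio's two tips (big predefined volumes, spread over several filesystems) can be scripted. The sketch below only builds DEFINE VOLUME command strings for review; it does not talk to a server. The pool name, filesystem paths, and volume naming scheme are all made up for illustration, 50GB is taken as 51200 MB for FORMATSIZE, and the exact command syntax should be checked against the Administrator's Reference for your server level before anything is run.

```python
from itertools import cycle, islice

def define_volume_commands(pool, filesystems, count, size_mb):
    """Build DEFINE VOLUME commands, spreading volumes round-robin over filesystems."""
    commands = []
    for index, fs in enumerate(islice(cycle(filesystems), count), start=1):
        # Hypothetical naming scheme: <filesystem>/vol0001.dsm, vol0002.dsm, ...
        volume = f"{fs}/vol{index:04d}.dsm"
        commands.append(f"DEFINE VOLUME {pool} {volume} FORMATSIZE={size_mb}")
    return commands

# Hypothetical pool name and mount points; 5 filesystems as in the message above.
filesystems = [f"/tsm/fs{n}" for n in range(1, 6)]
for command in define_volume_commands("DEDUPPOOL", filesystems, 10, 50 * 1024):
    print(command)
```

Round-robin placement means consecutive volumes land on different filesystems, which is one simple way to keep any single filesystem from filling first.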
Re: Deduplication number of chunks waiting in queue continues to rise?
I can second that, Sergio. Backup stgpool to copy tapes is not pretty, and it is an intensive process to rehydrate all that data.

The one extra thing I did was split the database across multiple folders for parallel I/O to the database. That has worked out very well, and I currently have it set up to span 8 folders, with an XIV backend that can take a beating.
Re: Deduplication number of chunks waiting in queue continues to rise?
While we don't do deduplication (tests show we gain less than 25% from it), we also split our DB2 instances across multiple, physically separate volumes. The one thing to note is that if you add directories post-installation, you have to dump and restore the database to spread existing data across them.

On Fri, Dec 20, 2013 at 02:23:34PM -0500, Marouf, Nick wrote:

I can second that, Sergio. Backing up stgpools to copy tapes is not pretty, and rehydrating all that data is an intensive process. The one extra thing I did was split the database across multiple folders for parallel I/O to the database. That has worked out very well, and I currently have it set up to span 8 folders, with an XIV backend that can take a beating.

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Sergio O. Fuentes
Sent: Friday, December 20, 2013 12:04 PM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Deduplication number of chunks waiting in queue continues to rise?

Client-side dedup and simultaneous-write to a copy pool are mutually exclusive. You can't do both, and simultaneous write is the only theoretical way to enforce deduprequiresbackup with client-side dedup. I suppose IBM could enhance TSM to do a simultaneous-like operation with client-side dedup, but that's not available now. So, I'm not sure how the TSM server enforces deduprequiresbackup with client-side dedup. Ever since 6.1 I have always set that to NO anyway. I have dealt with the repercussions of that as well.

Backup stgpool on dedup'd stgpools is not pretty. I have made some architectural changes to the underlying stgpools, and the 'backup stgpools' run pretty well, even with 1TB SATA drives. Two things I think helped quite a bit:

1. Use big predefined volumes. My new volumes are 50GB.
2. Use many filesystems for the devclass. I have 5 currently. I would use more if I had the space.

Thanks!
Sergio

On 12/20/13 11:35 AM, Prather, Wanda wanda.prat...@icfi.com wrote:

Woo hoo! That's great news. Will open a ticket and escalate. Also looking at client-side dedup, but I have to do some architectural planning, as all the data is coming from one client, the TSM VE data mover, which is a VM.

Re client-side dedup, do you know if there is any cooperation between the client-side dedup and deduprequiresbackup on the server end? I have assumed that the client-side dedup would not offer that protection.

W

-----Original Message-----
From: Sergio O. Fuentes [mailto:sfuen...@umd.edu]
Sent: Friday, December 20, 2013 10:39 AM
To: ADSM: Dist Stor Manager
Cc: Prather, Wanda
Subject: Re: [ADSM-L] Deduplication number of chunks waiting in queue continues to rise?

Wanda,

In trying to troubleshoot an unrelated performance PMR, IBM provided me with an e-fix for the dedupdel bottleneck that it sounds like you're experiencing. They obviously will want to do their due diligence on whether or not this e-fix will help solve your problems, but it has proved very useful in my environment. They even had to compile a Solaris e-fix for me, because it seems like I'm the only one running TSM on Solaris. The e-fix was very simple to install. What you don't want to do is go to 6.3.4.2, unless they tell you to because the e-fix is for that level (207). Don't run on 6.3.4.2 for even a minute. Only install it to get to the e-fix level.

Dedupdel gets populated by anything that deletes data from the stgpool, i.e. move data, expire inv, delete filespace, move nodedata, etc. We run client-side dedupe (which works pretty well, except when you run into performance issues on the server), and so our identifies don't run very long, if at all. It might save you time to run client-side dedupe.

BTW, when I finally got this e-fix and TSM was able to catch up with the deletes and reclaims as it needed to, I got some serious space back in my TDP dedup pool. It went from 90% util to 60% util (with about 10TB of total capacity). What finally really got me before the fix was that I had to delete a bunch of old TDP MSSQL filespaces, and it just took forever for TSM to catch up. I have a few deletes to do now, and I'm a bit wary because I don't want to hose my server again.

I would escalate with IBM support and have them supply you the e-fix. I don't think 6.3.4.3 is slated for release any time within the next few days, and you'll just be struggling to deal with the performance issue.

HTH,
Sergio

On 12/19/13 11:35 PM, Prather, Wanda wanda.prat...@icfi.com wrote:

TSM 6.3.4.00 on Win2K8

Perhaps some of you that have dealt with the dedup chunking problem can enlighten me. TSM/VE backs up to a dedup file pool, about 4 TB of changed blocks per day. I currently have more than 2 TB (yep, terabytes) of volumes in that file pool that will not reclaim. We were told by support that when you do SHOW DEDUPDELETEINFO, the number
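Sergio's two tips quoted above (big predefined volumes, many filesystems for the devclass) could be scripted along the following lines. This is only a sketch: the pool name, filesystem paths, volume prefix and `.bfs` extension are made up for illustration, and the `define volume ... formatsize=` form (size in megabytes) should be checked against the Administrator's Reference for your server level before running any generated macro.

```python
from itertools import cycle


def define_volume_commands(pool, filesystems, count, size_gb=50, prefix="dedupvol"):
    """Build DEFINE VOLUME commands, spreading volumes round-robin over filesystems.

    All names here are hypothetical; only the command strings are produced,
    nothing is sent to a TSM server.
    """
    size_mb = size_gb * 1024  # assumption: FORMATSIZE is expressed in megabytes
    commands = []
    for index, fs in zip(range(1, count + 1), cycle(filesystems)):
        path = f"{fs.rstrip('/')}/{prefix}{index:04d}.bfs"
        commands.append(f"define volume {pool} {path} formatsize={size_mb}")
    return commands


if __name__ == "__main__":
    # Hypothetical pool and five filesystems, mirroring "I have 5 currently".
    mounts = [f"/tsm/fs{n}" for n in range(1, 6)]
    for line in define_volume_commands("DEDUPPOOL", mounts, count=10):
        print(line)
```

The output can be saved as a macro file and reviewed by hand; round-robin placement is just one simple way to keep the volumes evenly spread over the filesystems.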
Re: Deduplication number of chunks waiting in queue continues to rise?
Hi All,

Is anyone using this script for reporting purposes?

http://www-01.ibm.com/support/docview.wss?uid=swg21596944

--
Best regards / Cordialement / مع تحياتي
Erwann SIMON

----- Original Message -----
From: Wanda Prather wanda.prat...@icfi.com
To: ADSM-L@VM.MARIST.EDU
Sent: Friday, December 20, 2013 05:35:38
Subject: [ADSM-L] Deduplication number of chunks waiting in queue continues to rise?

TSM 6.3.4.00 on Win2K8

Perhaps some of you that have dealt with the dedup chunking problem can enlighten me. TSM/VE backs up to a dedup file pool, about 4 TB of changed blocks per day. I currently have more than 2 TB (yep, terabytes) of volumes in that file pool that will not reclaim.

We were told by support that when you do SHOW DEDUPDELETEINFO, the number of chunks waiting in queue has to go to zero for those volumes to reclaim. (I know that there is a fix at 6.3.4.200 to improve the chunking process, but that has been APARed, and waiting on 6.3.4.300.)

I have shut down IDENTIFY DUPLICATES and reclamation for this pool. There are no clients writing into the pool; we have redirected backups to a non-dedup pool for now to try and get this cleared up. There is no client-side dedup here, only server side. I've also set deduprequiresbackup to NO for now, although I hate doing that, to make sure that doesn't interfere with the reclaim process.

But SHOW DEDUPDELETEINFO shows that the number of chunks waiting in queue is *still* increasing. So, WHAT is putting stuff on that dedup delete queue? And how do I ever gain ground?

W

**Please note new office phone:
Wanda Prather | Senior Technical Specialist | wanda.prat...@icfi.com | www.icfi.com
ICF International | 443-718-4900 (o)
Re: Deduplication number of chunks waiting in queue continues to rise?
Is anyone doing stgpool backups to a dedup file copy pool?

At 02:23 PM 12/20/2013, Marouf, Nick wrote:

I can second that, Sergio. Backing up stgpools to copy tapes is not pretty, and rehydrating all that data is an intensive process. The one extra thing I did was split the database across multiple folders for parallel I/O to the database. That has worked out very well, and I currently have it set up to span 8 folders, with an XIV backend that can take a beating.

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Sergio O. Fuentes
Sent: Friday, December 20, 2013 12:04 PM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Deduplication number of chunks waiting in queue continues to rise?

Client-side dedup and simultaneous-write to a copy pool are mutually exclusive. You can't do both, and simultaneous write is the only theoretical way to enforce deduprequiresbackup with client-side dedup. I suppose IBM could enhance TSM to do a simultaneous-like operation with client-side dedup, but that's not available now. So, I'm not sure how the TSM server enforces deduprequiresbackup with client-side dedup. Ever since 6.1 I have always set that to NO anyway. I have dealt with the repercussions of that as well.

Backup stgpool on dedup'd stgpools is not pretty. I have made some architectural changes to the underlying stgpools, and the 'backup stgpools' run pretty well, even with 1TB SATA drives. Two things I think helped quite a bit:

1. Use big predefined volumes. My new volumes are 50GB.
2. Use many filesystems for the devclass. I have 5 currently. I would use more if I had the space.

Thanks!
Sergio

On 12/20/13 11:35 AM, Prather, Wanda wanda.prat...@icfi.com wrote:

Woo hoo! That's great news. Will open a ticket and escalate. Also looking at client-side dedup, but I have to do some architectural planning, as all the data is coming from one client, the TSM VE data mover, which is a VM.
Re: Deduplication number of chunks waiting in queue continues to rise?
Hi Skylar!

Yes, that would be the easy way to do it; there is an option to rebalance the I/O after you add the new file systems to the database. I had already set up TSM before the performance tuning guideline was released. Doing it this way will require more storage initially, and running the db2 rebalancing command-line tools will spread out the DB I/O load.

Since we use IBM XIVs, which can handle very large I/O requests, in our specific case there was no need to provide physically separate volumes. I've seen one TSM instance crank upwards of 10,000 IOPS, leaving an entire ESX cluster in the dust.

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Skylar Thompson
Sent: Friday, December 20, 2013 2:28 PM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Deduplication number of chunks waiting in queue continues to rise?

While we don't do deduplication (tests show we gain less than 25% from it), we also split our DB2 instances across multiple, physically separate volumes. The one thing to note is that if you add directories post-installation, you have to dump and restore the database to spread existing data across them.

On Fri, Dec 20, 2013 at 02:23:34PM -0500, Marouf, Nick wrote:

I can second that, Sergio. Backing up stgpools to copy tapes is not pretty, and rehydrating all that data is an intensive process. The one extra thing I did was split the database across multiple folders for parallel I/O to the database. That has worked out very well, and I currently have it set up to span 8 folders, with an XIV backend that can take a beating.

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Sergio O. Fuentes
Sent: Friday, December 20, 2013 12:04 PM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Deduplication number of chunks waiting in queue continues to rise?

Client-side dedup and simultaneous-write to a copy pool are mutually exclusive.
Re: Deduplication number of chunks waiting in queue continues to rise?
Hi Wanda,

Some quick rambling thoughts about dereferenced chunk cleanup.

Do you know about the 'show banner' command? If IBM sends you an e-fix, this will tell you what it is fixing.

tsm: x>show banner

* EFIX Cumulative level 6.3.4.207 *
* This is a Limited Availability TEMPORARY fix for *
* IC94121 - ANR2033E DEFINE ASSOCIATION: Command failed - lock con *
* when def assoc immediately follows def sched. *
* IC95890 - Allow numeric volser for zOS Media server volumes. *
* IC93279 - Redrive failed outbound replication connect requests. *
* IC93850 - PAM authentication login protocol exchange failure *
* wi3187 - AUDIT LIBVOLUME new command *
* IC96637 - SERVER CAN HANG WHEN USING OPERATION CENTER *
* IC95938 - ANRD_2644193874 BFCHECKENDTOEND DURING RESTORE/RET *
* IC96993 - MOVE NODEDATA OPERATION MIGHT RESULT IN INVALID LINKS *
* IC91138 - Enable audit volume to mark one more kind invalid link *
* THE RESTARTED RESTORE OPERATION MAY BE SINGLE-THREADED *
* Avoid restore stgpool linking to orphaned base chunks *
* WI3236 - Oracle T1D tape drive support *
* 94297 - Add a parameter DELETEALIASES for DELETE BITFILE utili *
* IC96462 - Mount failure retry for zOS Media server tape volumes. *
* IC96993 - SLOW DELETION OF DEREFERENCED DEDUPLICATED CHUNKS *
* This cumulative efix server is based on code level *
* made generally available with FixPack 6.3.4.200 *

I have 2 servers on 6342.006 and 2 on 6342.007. I have the .009 e-fix waiting to be installed on my biggest, oldest, baddest server to fix the chunks-in-queue problem. On 3 servers, the queue is down to 0, and they usually run without a problem. On the big bad one, here are the stats:

tsm: WIN1>show dedupdeleteinfo

Dedup Deletion General Status
Number of worker threads : 15
Number of active worker threads : 1
Number of chunks waiting in queue : 11326513

Dedup Deletion Worker Info
Dedup deletion worker id: 1
Total chunks queued : 0
Total chunks deleted: 0
Deleting AF Entries?: Yes
In error state? : No

Worker thread 2 is not active
Worker thread 3 is not active
Worker thread 4 is not active
Worker thread 5 is not active
Worker thread 6 is not active
Worker thread 7 is not active
Worker thread 8 is not active
Worker thread 9 is not active
Worker thread 10 is not active
Worker thread 11 is not active
Worker thread 12 is not active
Worker thread 13 is not active
Worker thread 14 is not active
Worker thread 15 is not active
--
Total worker chunks queued : 0
Total worker chunks deleted: 0

The cleanup of reclaimed volumes is done by the thread which has 'Deleting AF Entries?: Yes'. The pending e-fix is supposed to get this process to finish. It never finishes on this server, something about a bad access plan.

When I have a lot of volumes which are empty but won't delete, I generate move data commands for them. Move data to the same pool will manually do what the chunk cleanup process is trying to do.

Regards,
Bill Colwell
Draper lab

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Prather, Wanda
Sent: Thursday, December 19, 2013 11:36 PM
To: ADSM-L@VM.MARIST.EDU
Subject: Deduplication number of chunks waiting in queue continues to rise?

TSM 6.3.4.00 on Win2K8

Perhaps some of you that have dealt with the dedup chunking problem can enlighten me. TSM/VE backs up to a dedup file pool, about 4 TB of changed blocks per day. I currently have more than 2 TB (yep, terabytes) of volumes in that file pool that will not reclaim.

We were told by support that when you do SHOW DEDUPDELETEINFO, the number of chunks waiting in queue has to go to zero for those volumes to reclaim. (I know that there is a fix at 6.3.4.200 to improve the chunking process, but that has been APARed, and waiting on 6.3.4.300.)

I have shut down IDENTIFY DUPLICATES and reclamation for this pool. There are no clients writing into the pool; we have redirected backups to a non-dedup pool for now to try and get this cleared up.
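Bill's workaround of generating move data commands for volumes that are empty but won't delete might look roughly like the sketch below. It only builds command strings from a list of volume names that you supply yourself (for example from a query of your choice); the volume names shown are invented, and it assumes that `move data <volume>` without a target pool moves data within the same sequential pool, which is worth confirming in the command reference for your level.

```python
def move_data_commands(volume_names):
    """Turn a list of volume names into MOVE DATA commands.

    Blank entries and repeated names are skipped so each volume gets
    exactly one command. Nothing is executed against a TSM server.
    """
    seen = set()
    commands = []
    for raw_name in volume_names:
        name = raw_name.strip()
        if not name or name in seen:
            continue
        seen.add(name)
        commands.append(f"move data {name}")
    return commands


if __name__ == "__main__":
    # Hypothetical volume names, e.g. pasted from query output.
    stuck = ["/tsm/fs1/00001A2B.BFS", "/tsm/fs2/00001A2C.BFS", ""]
    for line in move_data_commands(stuck):
        print(line)
```

Writing the commands to a macro file first, rather than running them directly, leaves room to throttle how many moves run at once on a server that is already struggling.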
Re: Deduplication number of chunks waiting in queue continues to rise?
Hi Wanda,

Expire Inventory is queuing chunks for deletion. Watch the Q PR (QUERY PROCESS) output: at the end of the expire inventory process, once the total number of nodes has been reached, no more objects are deleted, but SHOW DEDUPDELETEINFO shows that the deletion threads are still working, queuing and deleting chunks. This activity does not appear externally and consumes most of the expire inventory time.

Maybe try with deduplication disabled (dedup=no) for that pool?

Regards,
Erwann

Prather, Wanda wanda.prat...@icfi.com wrote:

TSM 6.3.4.00 on Win2K8

Perhaps some of you that have dealt with the dedup chunking problem can enlighten me. TSM/VE backs up to a dedup file pool, about 4 TB of changed blocks per day. I currently have more than 2 TB (yep, terabytes) of volumes in that file pool that will not reclaim.

We were told by support that when you do SHOW DEDUPDELETEINFO, the number of chunks waiting in queue has to go to zero for those volumes to reclaim. (I know that there is a fix at 6.3.4.200 to improve the chunking process, but that has been APARed, and waiting on 6.3.4.300.)

I have shut down IDENTIFY DUPLICATES and reclamation for this pool. There are no clients writing into the pool; we have redirected backups to a non-dedup pool for now to try and get this cleared up. There is no client-side dedup here, only server side. I've also set deduprequiresbackup to NO for now, although I hate doing that, to make sure that doesn't interfere with the reclaim process.

But SHOW DEDUPDELETEINFO shows that the number of chunks waiting in queue is *still* increasing. So, WHAT is putting stuff on that dedup delete queue? And how do I ever gain ground?

W

**Please note new office phone:
Wanda Prather | Senior Technical Specialist | wanda.prat...@icfi.com | www.icfi.com
ICF International | 443-718-4900 (o)

--
Erwann SIMON
Sent from my Android phone with K-9 Mail. Please excuse my brevity.
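To see whether the server is gaining or losing ground on the queue, one option is to capture the SHOW DEDUPDELETEINFO output at intervals and compare the "Number of chunks waiting in queue" value. The sketch below only parses text you have already captured; the function names are invented for illustration, and the label it matches is taken from the output pasted in this thread, so it may need adjusting at other server levels.

```python
import re

# Matches the queue line as it appears in the output quoted in this thread.
QUEUE_RE = re.compile(r"Number of chunks waiting in queue\s*:\s*([\d,]+)")


def chunks_waiting(show_output):
    """Extract the queued-chunk count from captured SHOW DEDUPDELETEINFO text."""
    match = QUEUE_RE.search(show_output)
    if match is None:
        raise ValueError("queue line not found in SHOW DEDUPDELETEINFO output")
    return int(match.group(1).replace(",", ""))


def drain_rate(earlier_count, later_count, seconds_between):
    """Chunks leaving the queue per second; a negative value means it is growing."""
    return (earlier_count - later_count) / seconds_between


if __name__ == "__main__":
    sample = (
        "Dedup Deletion General Status\n"
        "Number of worker threads : 15\n"
        "Number of active worker threads : 1\n"
        "Number of chunks waiting in queue : 11326513\n"
    )
    print(chunks_waiting(sample))
```

Sampling before and after an expire inventory run would show how much of the growth that process accounts for, in line with Erwann's explanation above.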