Spectrum Protect Plus deduplication ratios
It's a new day, so it must be time for more SPP questions. 😉

For the folks running SPP, what deduplication ratios are you seeing? So far I'm still in the testing phase, with approximately 10 VMs that I'm running backup testing on. Two of the VMs have similar database footprints (one is QA and one is dev, but the data is a close approximation of prod and of each other). However, even after a full backup of each system plus three incrementals, overall deduplication is essentially zero; my ratio is sitting at 1.05. I would think that, because the databases are so similar (not to mention that the VMs run the same Windows version), I'd be seeing much better dedup than that. Compression is doing quite well, hovering at 2.7, but I was hoping for at least 1.5 or 2 to 1 on dedup.

Thanks, and have a good weekend!

__
Matthew McGeary
Service Delivery Manager / Solutions Architect
Data Center & Network Management, Nutrien IT
T: (306) 933-8921
www.nutrien.com

For more information on Nutrien's email policy or to unsubscribe, click here: https://www.nutrien.com/important-notice
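For context on how the two numbers relate: deduplication and compression ratios multiply into the overall data-reduction factor, so even with dedup stuck at 1.05, a 2.7 compression ratio still gives roughly 2.8:1 overall. A minimal sketch of that arithmetic (the helper names and byte counts below are illustrative only, not SPP output or API):

```python
# Illustrative arithmetic only -- these helpers and byte counts are made up,
# not taken from Spectrum Protect Plus.

def dedup_ratio(logical_bytes: int, post_dedup_bytes: int) -> float:
    """Data before vs. after duplicate-chunk removal (1.0 = no savings)."""
    return logical_bytes / post_dedup_bytes

def compression_ratio(post_dedup_bytes: int, stored_bytes: int) -> float:
    """Deduplicated data before vs. after compression."""
    return post_dedup_bytes / stored_bytes

logical = 1_050_000_000      # bytes sent by the clients (hypothetical)
after_dedup = 1_000_000_000  # bytes remaining after dedup
stored = 370_370_370         # bytes actually written after compression

print(round(dedup_ratio(logical, after_dedup), 2))       # 1.05
print(round(compression_ratio(after_dedup, stored), 2))  # 2.7
print(round(logical / stored, 2))                        # ~2.84 overall
```

Two near-identical database VMs should in principle drive the first number well above 1.05, which is what makes the reported ratio surprising.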
Re: disabling compression and/or deduplication for a client backing up against deduped/compressed directory-based storage pool
I didn't want to pull you into that, Arnaud; I was just interested in the performance test results, that's all. I hope it gets worked out and the performance improves. Good luck.
Re: disabling compression and/or deduplication for a client backing up against deduped/compressed directory-based storage pool
Stefan,

I do not want to go into the details of a six-month-long story, but to summarize: such performance tests were successfully conducted against our very first setup, which has since been subject to countless changes (TSM version, O.S. version, endianness from big to little endian, extension of the FS900 capacity, redesign of the storage pool layout, and so on), all under huge time pressure, with the result that the current setup could no longer be benchmarked, as it was already in production ...

Cheers.

Arnaud
Re: disabling compression and/or deduplication for a client backing up against deduped/compressed directory-based storage pool
Hi Chavdar,

Yes, that was one of IBM's first requests when we opened this case; unfortunately, nothing obvious came out of it ...

Cheers.

Arnaud
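For readers who have not used the server instrumentation Chavdar mentions: recent Spectrum Protect server levels let you capture a per-thread timing breakdown from an administrative session. A sketch of a typical capture (the output path is a placeholder; check the server command reference for your level):

```shell
# From a dsmadmc administrative session:
instrumentation begin
#   ... reproduce the slow backup workload while tracing is active ...
instrumentation end file=/tmp/instr_backup.txt
```

The resulting report shows where server threads spend their time (disk write, dedup fingerprinting, compression, DB operations), which is usually the quickest way to tell storage latency apart from CPU-bound data reduction.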
Re: disabling compression and/or deduplication for a client backing up against deduped/compressed directory-based storage pool
I understand, I didn't know you were that deep into the case already, and I wouldn't presume to be able to solve this via a few emails if support is working on it.

What I am interested in is: did you run the blueprint benchmarks? The Perl script that can benchmark your database and your container pool volumes? The blueprints give values that you should be getting in order to expect blueprint performance results; this way you can quantify your performance issue in terms of how many IOP/s or MB/s you're behind where you need to be to run the load you need to run.

Regards,
Stefan
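For reference, the benchmark Stefan is describing ships with the IBM Spectrum Protect Blueprints as a Perl load generator. A sketch of a typical invocation (script name and parameters as found in recent blueprint packages; verify against the blueprint document for your release, and the filesystem paths are placeholders):

```shell
# Large sequential I/O against the container pool filesystems:
perl sp_disk_load_gen.pl workload=stgpool fslist=/tsm/stgpool00,/tsm/stgpool01

# Small random I/O against the database filesystems:
perl sp_disk_load_gen.pl workload=db fslist=/tsm/db00,/tsm/db01,/tsm/db02,/tsm/db03
```

Comparing the reported MB/s and IOP/s against the targets table in the blueprint document quantifies exactly how far a given storage layer is behind the configuration the sizing assumes.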
disabling compression and/or deduplication for a client backing up against deduped/compressed directory-based storage pool
Hi Arnaud,

Did you run TSM server instrumentation? It could help to identify where the issue is.
We have a TSM server connected to an FS900, and a VTL with a V5k, with no performance issues.

Best regards,
Chavdar
Re: disabling compression and/or deduplication for a client backing up against deduped/compressed directory-based storage pool
Dear Stefan,

Thanks a lot for your very kind offer!

Without underrating the power of this list, I doubt that we will be able to find a solution that easily: we opened a case with IBM and involved EMC/Dell as well, so far without much success, even after five months of intensive monitoring and tuning attempts at all levels (Linux kernel, communication layer, TSM DB fixes, access mode change on Isilon, etc.).

I must share as well that some of our partners raised their voices when the decision was made to go with Isilon storage, warning us that the offered solution would not be powerful enough to sustain the intended workload. It might very well be that they were right, and that in this particular case budget considerations ruled over pragmatism, leading to this very uncomfortable situation ...

To finally answer your question: both the active log and the database are stored on a FlashSystem 900 array dedicated to the TSM server only.

Cheers.

Arnaud

Backup and Recovery Systems Administrator
Panalpina Management Ltd., Basle, Switzerland
CIT Department, Viadukstrasse 42, P.O. Box, 4002 Basel/CH
Phone: +41 (61) 226 11 11, FAX: +41 (61) 226 17 01
Direct: +41 (61) 226 19 78
e-mail: arnaud.br...@panalpina.com

This electronic message transmission contains information from Panalpina and is confidential or privileged. This information is intended only for the person(s) named above. If you are not the intended recipient, any disclosure, copying, distribution or use or any other action based on the contents of this information is strictly prohibited. If you receive this electronic transmission in error, please notify the sender by e-mail, telephone or fax at the numbers listed above. Thank you.
Re: disabling compression and/or deduplication for a client backing up against deduped/compressed directory-based storage pool
That's no fun; maybe we can help!
What storage are you using for your active log and database?

Regards,
Stefan
Re: disabling compression and/or deduplication for a client backing up against deduped/compressed directory-based storage pool
Hi Stefan,

Thanks a lot for the appreciated feedback!

>> You can, however, disable compression on the storagepool-level.

This is unfortunately what I intended to avoid: if I disable it, then lots of clients will be impacted, and the server's performance will for sure improve ...

>> Are you using an IBM blueprint configuration for the Spectrum Protect environment?

I wish I could: my life would have been much easier! Unfortunately, management took the (definitely bad) decision to invest in an EMC/Dell Isilon array to be our Spectrum Scale server storage. I have now been fighting for six months to get the whole thing working together, so far without real success: performance is horrible. :-(

Cheers.

Arnaud
Re: disabling compression and/or deduplication for a client backing up against deduped/compressed directory-based storage pool
Hi,

With the directory containerpool you cannot, as far as I know, disable the attempt to deduplicate the data; if the data can be deduplicated, it will be. You can, however, disable compression at the storagepool level. If you disable it on the containerpool, client-side settings for deduplication will have no effect on compression within the pool.

Are you using an IBM blueprint configuration for the Spectrum Protect environment?

Regards,
Stefan
disabling compression and/or deduplication for a client backing up against deduped/compressed directory-based storage pool
Hi All,

Following global client backup performance issues on a new TSM server, which I suspect are related to the workload placed on the TSM instance by deduplication/compression operations, I would like to do some testing with a client, selectively disabling compression or deduplication, possibly both.

However, the TSM server has been configured to use only directory-based storage pools, which have been defined with deduplication and compression enabled.

Thus my question: is there any means to configure a client so that its data will not be compressed or deduplicated?

From my understanding, setting "compression no" in the client option file will be of no use, as the server will still compress the data at the storage pool level. Likewise, setting "deduplication no" in the client option file will prevent the client from performing deduplication, but the server still will. The last remaining possibility I can think of, to disable deduplication, would be to use some "exclude.dedup" statement on the client side that excludes everything subject to backup.

What are your thoughts? Am I condemned to define new storage pools with deduplication and/or compression disabled to do such testing, or is there some other means?

Thanks a lot for any appreciated feedback!

Cheers.

Arnaud

Backup and Recovery Systems Administrator
Panalpina Management Ltd., Basle, Switzerland,
CIT Department Viadukstrasse 42, P.O. Box 4002 Basel/CH
Phone: +41 (61) 226 11 11, FAX: +41 (61) 226 17 01
Direct: +41 (61) 226 19 78
e-mail: arnaud.br...@panalpina.com

This electronic message transmission contains information from Panalpina and is confidential or privileged. This information is intended only for the person(s) named above. If you are not the intended recipient, any disclosure, copying, distribution or use or any other action based on the contents of this information is strictly prohibited. If you receive this electronic transmission in error, please notify the sender by e-mail, telephone or fax at the numbers listed above. Thank you.
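The client-side settings discussed in this post might look like the following sketch of a client options file. This is illustrative only (assuming a Unix client; the exclude pattern and layout are assumptions), and, as the post itself notes, these options only stop client-side processing: a server-side directory-container pool will still deduplicate and compress inline regardless.

```text
* Client options file sketch -- client-side settings only; the server's
* directory-container pool still deduplicates/compresses regardless.
COMPRESSION     NO
DEDUPLICATION   NO
* Exclude all files from client-side deduplication (Unix-style pattern):
EXCLUDE.DEDUP   /.../*
```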
Re: Select statement for deduplication & compression statistics ...
Hi Anders,

Wished it would be that simple ... Unfortunately, there are quite a lot of discrepancies between the data reported by your query and the output from "q stg", as demonstrated here.

Output of "q stg xxx f=d":

DIR_DB2  : Deduplication Savings: 7,018 G (27.69%)   Compression Savings: 10,696 G (58.36%)
DIR_EXCH : Deduplication Savings: 40,039 G (71.34%)  Compression Savings: 6,369 G (39.59%)
DIR_INF  : Deduplication Savings: 0 (0%)             Compression Savings: 1,695 G (71.90%)
DIR_ORA  : Deduplication Savings: 871 G (42.30%)     Compression Savings: 959 G (80.74%)
DIR_SQL  : Deduplication Savings: 2,438 G (55.50%)   Compression Savings: 1,616 G (82.63%)
DIR_UNIX : Deduplication Savings: 2,070 G (8.29%)    Compression Savings: 17,350 G (75.75%)
DIR_VM   : Deduplication Savings: 16,347 G (45.92%)  Compression Savings: 10,787 G (56.04%)
DIR_WIN  : Deduplication Savings: 7,018 G (27.69%)   Compression Savings: 10,697 G (58.35%)

Output of your query:

STGPOOL_NAME   Dedup savings   Compression savings
------------   -------------   -------------------
DIR_DB2        29.3500%        63.3200%
DIR_EXCH       71.8600%        40.6300%
DIR_INF        .%              35.4000%
DIR_ORA        17.7000%        23.6800%
DIR_SQL        34.1200%        34.3200%
DIR_UNIX       8.0800%         73.6800%
DIR_VM         44.8700%        53.7100%
DIR_WIN        3.1800%         2.9900%

Some results are relatively close, but some others (DIR_INF, DIR_ORA, DIR_SQL, DIR_WIN) are totally divergent ... This is exactly my problem! This might be due to the fact that the total capacity of my storage array is quite large (Estimated Capacity: 3,098,067 G), shared by all the storage pools, and that the Pct Util reported by TSM is not precise enough (one decimal only), or it could be something else (reclaimable space, additional data for replication?), no idea ... But the values are not matching. :(

I'm wondering whether IBM might be using the dedupstats table to get its values ... I'm working on this at present.

Cheers.

Arnaud

Backup and Recovery Systems Administrator
Panalpina Management Ltd., Basle, Switzerland
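The Pct Util precision hypothesis raised in this message can be checked with quick arithmetic: at the quoted estimated capacity, a utilization figure reported to only one decimal place leaves the reconstructed used space uncertain by roughly 1,500 G, which is larger than some of the savings figures being compared. A minimal sketch, using only numbers from the message above:

```python
# Uncertainty introduced by a Pct Util value reported to one decimal place.
est_capacity_gb = 3_098_067      # Estimated Capacity from the message above
pct_util_step = 0.1              # Pct Util granularity (one decimal)

# A reported value of, e.g., 0.2 could stand for anything in [0.15, 0.25),
# so the reconstructed used space is uncertain by +/- half a step:
uncertainty_gb = est_capacity_gb * (pct_util_step / 2) / 100
print(f"used-space uncertainty: +/- {uncertainty_gb:,.0f} G")
```

With pools like DIR_ORA showing only 871 G of deduplication savings, an input uncertainty of that size is enough to make percentages computed from PCT_UTILIZED diverge badly from what "q stg" reports.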
Re: Select statement for deduplication & compression statistics ...
Hi,

This is simple math:

select stgpool_name,
       DEDUP_SPACE_SAVED_MB/(DEDUP_SPACE_SAVED_MB+COMP_SPACE_SAVED_MB+(EST_CAPACITY_MB*PCT_UTILIZED/100))*100||'%' as "Dedup savings"
  from stgpools

select stgpool_name,
       COMP_SPACE_SAVED_MB/(COMP_SPACE_SAVED_MB+(EST_CAPACITY_MB*PCT_UTILIZED/100))*100||'%' as "Compression savings"
  from stgpools

Best Regards
Anders Räntilä
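The arithmetic behind these two selects can be sketched in Python. The pool figures below are made-up round numbers, not from any real server: the "original" data size is reconstructed as the used space plus the two savings components.

```python
# Sketch of the savings math in the selects above:
#   used space      = EST_CAPACITY_MB * PCT_UTILIZED / 100
#   dedup savings % = dedup_saved / (dedup_saved + comp_saved + used)
#   comp  savings % = comp_saved  / (comp_saved + used)

def dedup_savings_pct(dedup_saved_mb, comp_saved_mb, est_capacity_mb, pct_utilized):
    used_mb = est_capacity_mb * pct_utilized / 100.0
    return dedup_saved_mb / (dedup_saved_mb + comp_saved_mb + used_mb) * 100.0

def comp_savings_pct(comp_saved_mb, est_capacity_mb, pct_utilized):
    used_mb = est_capacity_mb * pct_utilized / 100.0
    return comp_saved_mb / (comp_saved_mb + used_mb) * 100.0

# Hypothetical pool: 3,000,000,000 MB capacity at 0.2% utilized,
# 7,000,000 MB saved by dedup, 10,000,000 MB saved by compression.
print(round(dedup_savings_pct(7_000_000, 10_000_000, 3_000_000_000, 0.2), 2))  # 30.43
print(round(comp_savings_pct(10_000_000, 3_000_000_000, 0.2), 2))              # 62.5
```

Note that both formulas depend on PCT_UTILIZED, which the server reports to only one decimal place; as discussed elsewhere in the thread, that alone can produce sizeable discrepancies against "q stg f=d".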
Select statement for deduplication & compression statistics ...
Hi All,

Simple question: did any of you succeed in building a query that provides accurate statistics on deduplication and compression factors for the new TSM directory-based pools?

I would simply like to get the following: storage pool name, space that would be used without dedup & compression, dedup savings (GB and %), compression savings (GB and %), and total data reduction (GB and %).

Basically, a "q stg f=d" is able to report such information (with the exception of the space that would be used without dedup & compression), as the following example shows:

Storage Pool Name: DIR_DB2
Storage Pool Type: Primary
Device Class Name:
Storage Type: DIRECTORY
Cloud Type:
Cloud URL:
Cloud Identity:
Cloud Location:
Estimated Capacity: 3,222,437 G
Space Trigger Util:
Pct Util: 0.2
(skipped data)
Deduplication Savings: 7,095 G (31.34%)
Compression Savings: 9,548 G (61.42%)
Total Space Saved: 16,644 G (73.51%)

However, the "stgpools" table provides only the following related fields: TOTAL_SPACE_MB (which is always empty), SPACE_SAVED_MB, COMP_SPACE_SAVED_MB and DEDUP_SPACE_SAVED_MB. What magic does IBM use to be able to display percentages for compression and dedup in the "q stg f=d" output? So far I could not find it ...

Thanks in advance for any hint!

Cheers.

Arnaud

Backup and Recovery Systems Administrator
Panalpina Management Ltd., Basle, Switzerland
Re: Deduplication
Hi Stefan!

I have it set to 0, but like Dave mentioned, you have to wait a few hours. The reason it wasn't working as expected in my case was that another user had backed up the exact same directory too, so I had to delete not only my own backup files but also his. It's a test environment, by the way. :-)

Kind regards,
Eric van Loon
Air France/KLM Storage Engineering
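For reference, the reuse delay discussed in this thread is a storage pool attribute that can be inspected and changed from the administrative command line; a sketch, assuming a hypothetical container pool named CONTPOOL1:

```text
query stgpool CONTPOOL1 f=d
update stgpool CONTPOOL1 reusedelay=0
```

The first command shows the pool's current reuse delay in its detailed output; the second sets it to 0 days (the value is in days, and the default is 1).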
Re: Deduplication
Eric,

The containerpool has a reuse delay setting, in days, that in effect works the same as the reuse delay on traditional storage pools; did you set this to 0? It's in days, not hours, and the default is 1.

Regards,
Stefan
Re: Deduplication
Hi Dave!

Thank you very much for your reply! I deleted some data this morning and waited for 4 hours before making a new backup, but that doesn't seem to be enough. Is there any way to influence this waiting period? A certain table reorg, a stop/start of the server, or is it just hard-coded?

Thanks again for your help!

Kind regards,
Eric van Loon
Air France/KLM Storage Engineering
Re: Deduplication
Hi Steve!

I would hope not, because in that case TSM would have a data integrity issue ...

Kind regards,
Eric van Loon
Air France/KLM Storage Engineering
Re: Deduplication
Hi Eric,

A few things:

- Client-side deduplication provides better overall throughput for Spectrum Protect because the deduplication work is spread across more CPUs. So if you can afford to do the deduplication client-side, that gives the best overall result.
- Client-side deduplication helps reduce network traffic.
- The algorithms used for deduplication are the same between client and server.

The behavior you are seeing has to do with the impact of reusedelay on deduplicated chunks. If the reusedelay is 1 day (the default), Spectrum Protect keeps the deduplicated chunks pinned in storage until that time has passed. If the reusedelay is 0, there is still a small cushion window that might allow the chunks to be linked to. If you waited a couple of hours AFTER the deletion occurred, I would not expect those chunks to be reused.

Del
Re: Deduplication
Perhaps the client-side dedupe is keeping a dedupe hash-bitmap that is not getting fully refreshed when you purge the backup data from the server? -Original Message- From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Loon, Eric van (ITOPT3) - KLM Sent: Monday, April 10, 2017 10:57 AM To: ADSM-L@VM.MARIST.EDU Subject: [ADSM-L] Deduplication Hi guys! We are trying to make a fair comparison between server- and client-side deduplication. I'm running into an 'issue' where I notice that once you created a backup of a certain set of data, it is always deduplicated 100% afterwards when you start a new client-side deduped backup. Even when you delete all previous backups on the server first! So I backed up a directory, retrieved all objectids through a select * from backups and deleted all objects, but still a new backup is deduplicated 100%. I don't understand why. I thought it maybe had something to do with data still being in the container pool, but even with reusedelay=0, everything is deduplicated... Thanks for any help (Andy? :)) in advance. Kind regards, Eric van Loon Air France/KLM Storage Engineering
Deduplication
Hi guys! We are trying to make a fair comparison between server- and client-side deduplication. I'm running into an 'issue' where I notice that once you created a backup of a certain set of data, it is always deduplicated 100% afterwards when you start a new client-side deduped backup. Even when you delete all previous backups on the server first! So I backed up a directory, retrieved all objectids through a select * from backups and deleted all objects, but still a new backup is deduplicated 100%. I don't understand why. I thought it maybe had something to do with data still being in the container pool, but even with reusedelay=0, everything is deduplicated... Thanks for any help (Andy? :)) in advance. Kind regards, Eric van Loon Air France/KLM Storage Engineering
Re: SAP Hana deduplication savings in directory stgpool
Hi Arni, since TSM server 7.1 it seems that the server splits large objects into smaller chunks (default=yes): > help update node ... SPLITLARGEObjects Specifies whether large objects that are stored by this node are automatically split into smaller pieces, by the server, to optimize server processing. Specifying Yes causes the server to split large objects (over 10 GB) into smaller pieces when stored by a client node. Specifying No bypasses this process. Specify No only if your primary concern is maximizing throughput of backups directly to tape. The default value is Yes. On TSM 7.1.6 I'm observing the following savings during a full backup: 09/12/2016 01:33:03 ANR0951I Session 424697 for node SAP_PEP processed 1 files by using inline data deduplication or compression, or both. The number of original bytes was 319,384,083,580. Inline data deduplication reduced the data by 15,943,331,431 bytes and inline compression reduced the data by 126,899,071,930 bytes. (SESSION: 424697) 09/12/2016 01:33:06 ANR0951I Session 424704 for node SAP_PEP processed 1 files by using inline data deduplication or compression, or both. The number of original bytes was 319,393,258,760. Inline data deduplication reduced the data by 15,541,967,567 bytes and inline compression reduced the data by 124,230,367,566 bytes. (SESSION: 424704) 09/12/2016 01:33:54 ANR0951I Session 424706 for node SAP_PEP processed 1 files by using inline data deduplication or compression, or both. The number of original bytes was 319,348,693,600. Inline data deduplication reduced the data by 15,800,496,590 bytes and inline compression reduced the data by 126,268,578,855 bytes. (SESSION: 424706) 09/12/2016 01:34:12 ANR0951I Session 424701 for node SAP_PEP processed 1 files by using inline data deduplication or compression, or both. The number of original bytes was 319,402,433,940. Inline data deduplication reduced the data by 15,290,376,532 bytes and inline compression reduced the data by 126,178,977,588 bytes.
(SESSION: 424701) 09/12/2016 01:35:57 ANR0951I Session 424700 for node SAP_PEP processed 1 files by using inline data deduplication or compression, or both. The number of original bytes was 319,389,326,540. Inline data deduplication reduced the data by 16,444,469,647 bytes and inline compression reduced the data by 125,242,586,296 bytes. (SESSION: 424700) Original data 5*310GB Deduplicated by 5*15GB Compressed by 5*125GB Dedup+compression savings ~45% -> this seems to have gotten better recently, but is still far from SAPORA / ORA / MSSQL / TSM4VE savings Log files also have pretty decent savings: 09/12/2016 02:08:19 ANR0951I Session 427521 for node SAP_PEP processed 1 files by using inline data deduplication or compression, or both. The number of original bytes was 78,644,400. Inline data deduplication reduced the data by 77,979,824 bytes and inline compression reduced the data by 487,099 bytes. (SESSION: 427521) 09/12/2016 02:17:15 ANR0951I Session 427621 for node SAP_PEP processed 1 files by using inline data deduplication or compression, or both. The number of original bytes was 3,932,220. Inline data deduplication reduced the data by 3,782,214 bytes and inline compression reduced the data by 141,029 bytes. (SESSION: 427621) 09/12/2016 03:21:19 ANR0951I Session 428230 for node SAP_PEP processed 1 files by using inline data deduplication or compression, or both. The number of original bytes was 78,644,400. Inline data deduplication reduced the data by 74,887,548 bytes and inline compression reduced the data by 2,984,879 bytes. (SESSION: 428230) 09/12/2016 03:23:16 ANR0951I Session 428235 for node SAP_PEP processed 1 files by using inline data deduplication or compression, or both. The number of original bytes was 319,820,560. Inline data deduplication reduced the data by 12,583,338 bytes and inline compression reduced the data by 199,794,984 bytes.
(SESSION: 428235) 09/12/2016 03:23:35 ANR0951I Session 428239 for node SAP_PEP processed 1 files by using inline data deduplication or compression, or both. The number of original bytes was 78,644,400. Inline data deduplication reduced the data by 74,066,550 bytes and inline compression reduced the data by 3,596,699 bytes. (SESSION: 428239) "ADSM: Dist Stor Manager" wrote on 09/12/2016 01:51:36 PM: > From: Arni Snorri Eggertsson > To: ADSM-L@VM.MARIST.EDU > Date: 09/12/2016 01:54 PM > Subject: Re: [ADSM-L] SAP Hana deduplication savings in directory stgpool > Sent by: "ADSM: Dist Stor Manager" > > Hi Martin, > > From what I have gathered, it looks like the HANA database is backed up in > such big objects that deduplication fails to deduplicate, I have seen the > deduplication does something for the transaction logs but not for the full > backups, > > Snippet from actlog when transaction logs are shipped to TSM: > > ANR0951I Session 708064 for node processed 1 fi
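The per-session savings reported in ANR0951I messages can be turned into percentages with a few lines of scripting. A quick sketch — it assumes the message wording shown in the log excerpts above:

```python
import re

# Pull the three byte counts out of one ANR0951I message
ANR0951I = re.compile(
    r"original bytes was ([\d,]+)\."
    r".*?deduplication reduced the data by ([\d,]+) bytes"
    r".*?compression reduced the data by ([\d,]+) bytes",
    re.DOTALL,
)

def savings(msg: str):
    """Return (dedup_pct, compression_pct, total_pct) for one ANR0951I message."""
    original, dedup, comp = (int(g.replace(",", ""))
                             for g in ANR0951I.search(msg).groups())
    return (100.0 * dedup / original,
            100.0 * comp / original,
            100.0 * (dedup + comp) / original)

msg = ("ANR0951I Session 424697 for node SAP_PEP processed 1 files by using "
       "inline data deduplication or compression, or both. The number of "
       "original bytes was 319,384,083,580. Inline data deduplication reduced "
       "the data by 15,943,331,431 bytes and inline compression reduced the "
       "data by 126,899,071,930 bytes.")
dedup_pct, comp_pct, total_pct = savings(msg)
```

For the first full-backup session above this gives roughly 5% dedup and 40% compression, i.e. the ~45% combined savings quoted in the summary.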
Re: SAP Hana deduplication savings in directory stgpool
Hi Martin, From what I have gathered, it looks like the HANA database is backed up in such big objects that deduplication fails to deduplicate. I have seen that deduplication does something for the transaction logs but not for the full backups. Snippet from actlog when transaction logs are shipped to TSM: ANR0951I Session 708064 for node processed 1 files by using inline data deduplication or compression, or both. The number of original bytes was 18,483,972. Inline data deduplication reduced the data by 126,899 bytes and inline compression reduced the data by 0 bytes. 09/12/2016 00:20:25 ANR0951I Session 695360 for node processed 1 files by using inline data deduplication or compression, or both. The number of original bytes was 797,085,373,292. Inline data deduplication reduced the data by 0 bytes and inline compression reduced the data by 0 bytes. (SESSION: 695360) And even if compression was enabled in the HANA database itself I would expect to see some compression in the dedup process (small but not zero). Dedup stats for said node:

                         Date/Time: 09/09/2016 14:07:06
                 Storage Pool Name: TSP3
                         Node Name:
                    Filespace Name: /tdpmux
                              FSID: 2
                              Type: Arch
         Total Data Protected (MB): 3,490,805
             Total Space Used (MB): 3,405,573
            Total Space Saved (MB): 85,232
           Total Saving Percentage: 2.44
             Deduplication Savings: 89,372,414,238
          Deduplication Percentage: 2.44
     Non-Deduplicated Extent Count: 4,651
Non-Deduplicated Extent Space Used: 1,832,923
               Unique Extent Count: 5,460,237
          Unique Extent Space Used: 3,551,809,145,807
               Shared Extent Count: 205,396
      Shared Extent Data Protected: 108,563,366,200
          Shared Extent Space Used: 18,488,232,525
               Compression Savings: 0
            Compression Percentage: 0.00
           Compressed Extent Count: 0
         Uncompressed Extent count: 5,670,284

I have yet to try to enable compression on the stgpool since we just upgraded to 7.1.6.
*Arni Snorri Eggertsson* +45 40 80 70 31 | ar...@gormur.com | http://gormur.com | Skype: arnieggertsson On Wed, Sep 7, 2016 at 10:20 AM, Martin Janosik wrote: > Hello all, > > is anyone storing SAP HANA backups (using Data Protection for ERP) in > directory storage pools? > What are deduplication savings in your environment? > > In our environment we see only 40% savings (35%-50%), comparing to > predicted dedup savings 1:9 (this ratio is currently valid for backups of > TDP4ERP Oracle, MS SQL, Oracle, ...). > This completely messes up the initial capacity planning :( > > tsm: PRYTSM1>q dedupstats DEDUPPOOL_REPL SAP_PEP f=d > > Date/Time: 09/02/2016 21:01:00 > Storage Pool Name: DEDUPPOOL_REPL > Node Name: SAP_PEP > Filespace Name: /tdpmux > FSID: 2 > Type: Arch > Total Data Protected (MB): 27,045,165 > Total Space Used (MB): 16,228,576 > Total Space Saved (MB): 10,816,588 >Total Saving Percentage: 39.99 > Deduplication Savings: 1,699,912,761,679 > Deduplication Percentage: 5.99 > Non-Deduplicated Extent Count: 8,414 > Non-Deduplicated Extent Space Used: 3,329,503 >Unique Extent Count: 15,846,440 > Unique Extent Space Used: 26,082,116,838,846 >Shared Extent Count: 7,340,821 > Shared Extent Data Protected: 2,276,790,425,670 > Shared Extent Space Used: 574,756,400,378 >Compression Savings: 9,642,102,240,318 > Compression Percentage: 36.17 >Compressed Extent Count: 21,757,839 > Uncompressed Extent count: 1,437,836 > > Thank you in advance. > > Kind regards > Martin Janosik >
Re: SAP Hana deduplication savings in directory stgpool
Hi, Very often SAP HANA admins use the data compression to save memory. If so, deduplication efficiency falls. Efim > On 7 Sep 2016, at 11:20, Martin Janosik > wrote: > > Hello all, > > is anyone storing SAP HANA backups (using Data Protection for ERP) in > directory storage pools? > What are deduplication savings in your environment? > > In our environment we see only 40% savings (35%-50%), comparing to > predicted dedup savings 1:9 (this ratio is currently valid for backups of > TDP4ERP Oracle, MS SQL, Oracle, ...). > This completely messes up the initial capacity planning :( > > tsm: PRYTSM1>q dedupstats DEDUPPOOL_REPL SAP_PEP f=d > > Date/Time: 09/02/2016 21:01:00 > Storage Pool Name: DEDUPPOOL_REPL > Node Name: SAP_PEP >Filespace Name: /tdpmux > FSID: 2 > Type: Arch > Total Data Protected (MB): 27,045,165 > Total Space Used (MB): 16,228,576 >Total Space Saved (MB): 10,816,588 > Total Saving Percentage: 39.99 > Deduplication Savings: 1,699,912,761,679 > Deduplication Percentage: 5.99 > Non-Deduplicated Extent Count: 8,414 > Non-Deduplicated Extent Space Used: 3,329,503 > Unique Extent Count: 15,846,440 > Unique Extent Space Used: 26,082,116,838,846 > Shared Extent Count: 7,340,821 > Shared Extent Data Protected: 2,276,790,425,670 > Shared Extent Space Used: 574,756,400,378 > Compression Savings: 9,642,102,240,318 >Compression Percentage: 36.17 > Compressed Extent Count: 21,757,839 > Uncompressed Extent count: 1,437,836 > > Thank you in advance. > > Kind regards > Martin Janosik
SAP Hana deduplication savings in directory stgpool
Hello all, is anyone storing SAP HANA backups (using Data Protection for ERP) in directory storage pools? What are deduplication savings in your environment? In our environment we see only 40% savings (35%-50%), compared to predicted dedup savings of 1:9 (this ratio is currently valid for backups of TDP4ERP Oracle, MS SQL, Oracle, ...). This completely messes up the initial capacity planning :(

tsm: PRYTSM1> q dedupstats DEDUPPOOL_REPL SAP_PEP f=d

                         Date/Time: 09/02/2016 21:01:00
                 Storage Pool Name: DEDUPPOOL_REPL
                         Node Name: SAP_PEP
                    Filespace Name: /tdpmux
                              FSID: 2
                              Type: Arch
         Total Data Protected (MB): 27,045,165
             Total Space Used (MB): 16,228,576
            Total Space Saved (MB): 10,816,588
           Total Saving Percentage: 39.99
             Deduplication Savings: 1,699,912,761,679
          Deduplication Percentage: 5.99
     Non-Deduplicated Extent Count: 8,414
Non-Deduplicated Extent Space Used: 3,329,503
               Unique Extent Count: 15,846,440
          Unique Extent Space Used: 26,082,116,838,846
               Shared Extent Count: 7,340,821
      Shared Extent Data Protected: 2,276,790,425,670
          Shared Extent Space Used: 574,756,400,378
               Compression Savings: 9,642,102,240,318
            Compression Percentage: 36.17
           Compressed Extent Count: 21,757,839
         Uncompressed Extent count: 1,437,836

Thank you in advance. Kind regards Martin Janosik
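For capacity planning it helps to convert the dedupstats fields into an overall reduction factor. A short sketch using the figures from the `q dedupstats` output above:

```python
# Figures from the q dedupstats output above (MB); minor rounding vs. the
# server's own Total Space Saved is expected.
total_protected_mb = 27_045_165   # logical data protected
total_used_mb = 16_228_576        # physical space actually used
total_saved_mb = total_protected_mb - total_used_mb

saving_pct = 100.0 * total_saved_mb / total_protected_mb   # ~40%, as reported
reduction_factor = total_protected_mb / total_used_mb      # ~1.67:1

# The predicted 1:9 ratio would have required ~89% savings:
predicted_pct = 100.0 * (1 - 1 / 9)
```

So the observed 39.99% saving corresponds to only about a 1.67:1 reduction, against the roughly 89% saving a 1:9 prediction implies — which is why the capacity plan no longer holds.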
Re: Cleversafe onprem + deduplication with TSM
That is because the container pool (cloud or directory) manages deduplication. As the data is ingested, Spectrum Protect determines if the data is to be deduplicated. Inside the storage pool, you will see two types of containers: one that is deduplicated and one that is not. To answer your question, yes, you can create a deduplicated IBM Cloud Object Storage System (ICOSS) (formerly Cleversafe) storage pool. Please be aware of the differences between block storage pools and object storage pools. The object storage pool will give you the ability to manage exabytes of data efficiently, but there is a cost to be able to do that, which is performance. Object storage pools should be used for backups of unstructured data that are rarely restored, and for archives of structured data. Best Regards, Ronald C. Delaware IBM Level 2 - IT Plus Certified Specialist - Expert IBM Certification Exam Developer IBM Certified Solution Advisor - Spectrum Storage Solutions v4 From: TSM ORT To: ADSM-L@VM.MARIST.EDU Date: 04/29/16 00:23 Subject: [ADSM-L] Cleversafe onprem + deduplication with TSM Sent by: "ADSM: Dist Stor Manager" Hi, Can I create a deduplication-enabled storage pool using Cleversafe cloud with TSM 7.1.5? I can find that there are flags to enable / disable encryption for on-premises, however there are no flags to enable or disable deduplication or compression. Thanks
Cleversafe onprem + deduplication with TSM
Hi, Can I create a deduplication-enabled storage pool using Cleversafe cloud with TSM 7.1.5? I can find that there are flags to enable / disable encryption for on-premises, however there are no flags to enable or disable deduplication or compression. Thanks
Re: Deduplication and database backups
If I understand correctly, the large object size will place it in the highest dedup tier, and that will make Spectrum Protect create fewer chunks of a larger size than if the object were smaller. But even then I would still expect to see some dedup results, even if it would be just a few percent. Especially if you create a second full backup. On Fri, Apr 1, 2016 at 11:00 PM, Arni Snorri Eggertsson wrote: > Hi Guys, > > Thanks for the feedback, My feeling is that it must be that the HANA api > does not split the objects into smaller chunks, I am actually seeing the > same issue when doing Sybase ACE backups, again large objects, but still > under 50GB > > I see good deduplication on MSSQL and Domino backups, in directory > container pools, > > Eric, HANA is SAP's own in-memory database, no oracle. > > I have client compression turned off, and even if database compression > would be turned on I would expect some deduplication, 0 is a pretty > definitive no dedup. > > *Arni Snorri Eggertsson* > ar...@gormur.com > > > On Fri, Apr 1, 2016 at 9:01 AM, Loon, EJ van (ITOPT3) - KLM < > eric-van.l...@klm.com> wrote: > > > Hi Arni! > > Just a thought: could it be that Oracle compression is turned on? > > Kind regards, > > Eric van Loon > > Air France/KLM Storage Engineering > > > > -Original Message- > > From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of > > Stefan Folkerts > > Sent: donderdag 31 maart 2016 17:55 > > To: ADSM-L@VM.MARIST.EDU > > Subject: Re: Deduplication and database backups > > > > I've seen plenty of databases go to container pools and get fair to good > > deduplication results even on the first backup. > > It should not matter that it is one large object, it will make the chunks > > larger but normally you should still get some deduplication as long as > it's > > not encrypted. > > It would seem like something strange that might just be HANA specific?
> > > > On Wed, Mar 30, 2016 at 3:34 PM, Arni Snorri Eggertsson < > ar...@gormur.com> > > wrote: > > > > > Hi all, > > > > > > I want to hear what others are doing in regards of deduplication and > > > large files / database backups, > > > > > > on a recent setup we are taking backups of a SAP Hana system to a > > > directory container, I see great dedup stats when the system is doing > > > log backups, but I get no deduplication effects when we are doing full > > > backups, the database is roughly 250 GB in size, and it looks like > > > TSM sees the object as one file. > > > > > > ANR0951I Session 550996 for node x processed 1 files using inline > > > deduplication. 251,754,067,764 bytes were reduced by 0 bytes. (SESSION: > > > 550996) > > > > > > > > > I am not 100% sure how to handle this, are others using > directory > > > containers at all? are you using them for TDP Database backups ? any > > > thoughts ? > > > > > > > > > > > > > > > *Arni Snorri Eggertsson* > > > ar...@gormur.com > > >
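The chunking point can be illustrated with a toy content-defined chunker — a sketch only; Spectrum Protect's real chunking algorithm and tier sizes differ. Because boundaries are chosen by content rather than offset, two nearly identical streams (say, two full backups of a slowly changing database) share most chunks, and a localized edit only dirties the chunks around it; larger average chunks simply mean each edit dirties a bigger share of the data:

```python
import hashlib
import random

def cdc_chunks(data: bytes, boundary_byte: int = 0):
    """Toy content-defined chunking: cut after every occurrence of
    boundary_byte (average chunk ~256 bytes on random data)."""
    chunks, start = [], 0
    for i, b in enumerate(data):
        if b == boundary_byte:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks

def shared_fraction(a: bytes, b: bytes) -> float:
    """Fraction of a's unique chunks that also occur in b (dedup hit rate)."""
    ha = {hashlib.sha256(c).digest() for c in cdc_chunks(a)}
    hb = {hashlib.sha256(c).digest() for c in cdc_chunks(b)}
    return len(ha & hb) / len(ha)

random.seed(42)
base = bytes(random.randrange(256) for _ in range(65536))
# simulate a second full backup with one small in-place change
edited = base[:32768] + bytes(range(32)) + base[32800:]
frac = shared_fraction(base, edited)   # most chunks still dedup
```

With these toy parameters the second "backup" re-links the vast majority of chunks; only the few chunks overlapping the edited region are new.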
Re: Deduplication and database backups
Hi Guys, Thanks for the feedback, My feeling is that it must be that the HANA api does not split the objects into smaller chunks, I am actually seeing the same issue when doing Sybase ACE backups, again large objects, but still under 50GB I see good deduplication on MSSQL and Domino backups, in directory container pools, Eric, HANA is SAP's own in-memory database, no oracle. I have client compression turned off, and even if database compression would be turned on I would expect some deduplication, 0 is a pretty definitive no dedup. *Arni Snorri Eggertsson* ar...@gormur.com On Fri, Apr 1, 2016 at 9:01 AM, Loon, EJ van (ITOPT3) - KLM < eric-van.l...@klm.com> wrote: > Hi Arni! > Just a thought: could it be that Oracle compression is turned on? > Kind regards, > Eric van Loon > Air France/KLM Storage Engineering > > -Original Message- > From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of > Stefan Folkerts > Sent: donderdag 31 maart 2016 17:55 > To: ADSM-L@VM.MARIST.EDU > Subject: Re: Deduplication and database backups > > I've seen plenty of databases go to container pools and get fair to good > deduplication results even on the first backup. > It should not matter that it is one large object, it will make the chunks > larger but normally you should still get some deduplication as long as it's > not encrypted. > It would seem like something strange that might just be HANA specific? > > On Wed, Mar 30, 2016 at 3:34 PM, Arni Snorri Eggertsson > wrote: > > > Hi all, > > > > I want to hear what others are doing in regards of deduplication and > > large files / database backups, > > > > on a recent setup we are taking backups of a SAP Hana system to a > > directory container, I see great dedup stats when the system is doing > > log backups, but I get no deduplication effects when we are doing full > > backups, the database is roughly 250 GB in size, and it looks like > > TSM sees the object as one file.
> > > > ANR0951I Session 550996 for node x processed 1 files using inline > > deduplication. 251,754,067,764 bytes were reduced by 0 bytes. (SESSION: > > 550996) > > > > > > I am not 100% sure how to handle this, are others using directory > > containers at all? are you using them for TDP Database backups ? any > > thoughts ? > > > > > > > > > > *Arni Snorri Eggertsson* > > ar...@gormur.com > >
Re: Deduplication and database backups
Hi Arni! Just a thought: could it be that Oracle compression is turned on? Kind regards, Eric van Loon Air France/KLM Storage Engineering -Original Message- From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Stefan Folkerts Sent: donderdag 31 maart 2016 17:55 To: ADSM-L@VM.MARIST.EDU Subject: Re: Deduplication and database backups I've seen plenty of databases go to container pools and get fair to good deduplication results even on the first backup. It should not matter that it is one large object, it will make the chunks larger but normally you should still get some deduplication as long as it's not encrypted. It would seem like something strange that might just be HANA specific? On Wed, Mar 30, 2016 at 3:34 PM, Arni Snorri Eggertsson wrote: > Hi all, > > I want to hear what others are doing in regards of deduplication and > large files / database backups, > > on a recent setup we are taking backups of a SAP Hana system to a > directory container, I see great dedup stats when the system is doing > log backups, but I get no deduplication effects when we are doing full > backups, the database is roughly 250 GB in size, and it looks like > TSM sees the object as one file. > > ANR0951I Session 550996 for node x processed 1 files using inline > deduplication. 251,754,067,764 bytes were reduced by 0 bytes. (SESSION: > 550996) > > > I am not 100% sure how to handle this, are others using directory > containers at all? are you using them for TDP Database backups ? any > thoughts ? > > > > > *Arni Snorri Eggertsson* > ar...@gormur.com >
Re: Deduplication and database backups
I've seen plenty of databases go to container pools and get fair to good deduplication results even on the first backup. It should not matter that it is one large object; it will make the chunks larger, but normally you should still get some deduplication as long as it's not encrypted. It would seem like something strange that might just be HANA specific? On Wed, Mar 30, 2016 at 3:34 PM, Arni Snorri Eggertsson wrote: > Hi all, > > I want to hear what others are doing in regards of deduplication and large > files / database backups, > > on a recent setup we are taking backups of a SAP Hana system to a directory > container, I see great dedup stats when the system is doing log backups, > but I get no deduplication effects when we are doing full backups, > the database is roughly 250 GB in size, and it looks like TSM sees the > object as one file. > > ANR0951I Session 550996 for node x processed 1 files using inline > deduplication. 251,754,067,764 bytes were reduced by 0 bytes. (SESSION: > 550996) > > > I am not 100% sure how to handle this, are others using directory > containers at all? are you using them for TDP Database backups ? any > thoughts ? > > > > > *Arni Snorri Eggertsson* > ar...@gormur.com >
Deduplication and database backups
Hi all, I want to hear what others are doing with regard to deduplication and large files / database backups. On a recent setup we are taking backups of a SAP Hana system to a directory container. I see great dedup stats when the system is doing log backups, but I get no deduplication effects when we are doing full backups; the database is roughly 250 GB in size, and it looks like TSM sees the object as one file. ANR0951I Session 550996 for node x processed 1 files using inline deduplication. 251,754,067,764 bytes were reduced by 0 bytes. (SESSION: 550996) I am not 100% sure how to handle this. Are others using directory containers at all? Are you using them for TDP Database backups? Any thoughts? *Arni Snorri Eggertsson* ar...@gormur.com
Re: Real world deduplication rates with TSM 7.1 and container pools
Arnaud, If an object is already compressed (client-side compression activated), it will be ignored/excluded by server-side compression. -- Best regards / Cordialement / مع تحياتي Erwann SIMON - Original message - From: "PAC Brion Arnaud" To: ADSM-L@VM.MARIST.EDU Sent: Wednesday, 23 March 2016 10:27:58 Subject: Re: [ADSM-L] Real world deduplication rates with TSM 7.1 and container pools Erwann, Thanks for your input : I had a look at the video which clarified a few points, but not all of them ... One question : if using a deduped and compressed pool, shall the "compression" option on the node definition be left at the default "client" value, or must I update it to "no" ? I'm asking because I updated my container pools yesterday to use compression, and this morning the compression rate on this storage pool is still zero :-( Cheers. Arnaud -Original Message- From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Erwann SIMON Sent: Tuesday, March 22, 2016 11:18 PM To: ADSM-L@VM.MARIST.EDU Subject: Re: Real world deduplication rates with TSM 7.1 and container pools Hi all, TSM deduplication is effective combined with compression. Without compression, I'm not sure that it is worth what it costs (a 1:2.5 ratio or 65% is what I generally see with mixed data and "standard" retention). You all should watch this Tricia Jiang's video on YouTube (from 7:11): https://www.youtube.com/watch?v=ISWRrkY5RS8 -- Best regards / Cordialement / مع تحياتي Erwann SIMON - Original message - From: "PAC Brion Arnaud" To: ADSM-L@VM.MARIST.EDU Sent: Friday, 18 March 2016 15:41:06 Subject: [ADSM-L] Real world deduplication rates with TSM 7.1 and container pools Hi All, We are currently testing the TSM 7.1 deduplication feature, in conjunction with container based storage pools. So far, my test TSM instances, installed with such a setup, are reporting a dedup percentage of 45 %, which means a dedup factor around 1.81, using a sample of clients which are representative of our production environment.
This is unfortunately pretty far from what was promised by IBM (a dedup factor of 4) ... I'm wondering if anybody making use of container based storage pools and deduplication would share their deduplication factor, so that I could have a better appreciation of real world figures. If you would be so kind to share your information (possibly with the kind of backed-up data, i.e. VM, DB, NAS, Exchange, and retention values ...) I would be very grateful ! Thanks in advance for appreciated feedback. Cheers. Arnaud ** Backup and Recovery Systems Administrator Panalpina Management Ltd., Basle, Switzerland, CIT Department Viadukstrasse 42, P.O. Box 4002 Basel/CH Phone: +41 (61) 226 11 11, FAX: +41 (61) 226 17 01 Direct: +41 (61) 226 19 78 e-mail: arnaud.br...@panalpina.com **
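The relationship between a reported saving percentage and the "N:1" dedup factor is just factor = 1 / (1 − saving). A quick check of the figures in this thread:

```python
def factor_from_savings(saving_pct: float) -> float:
    """Convert a space-saving percentage into an 'N:1' reduction factor."""
    return 1.0 / (1.0 - saving_pct / 100.0)

observed = factor_from_savings(45.0)   # the measured 45% saving -> ~1.8:1
promised = factor_from_savings(75.0)   # a 4:1 factor requires 75% savings
```

So the observed 45% saving is indeed a factor of about 1.8, and hitting the promised factor of 4 would require the pool to save 75% of the ingested data.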
Re: Real world deduplication rates with TSM 7.1 and container pools
Erwann,

Thanks for your input: I had a look at the video, which clarified a few points, but not all of them ...

One question: if using a deduplicated and compressed pool, should the "compression" option on the node definition be left at the default value of "client", or must I update it to "no"? I'm asking because I updated my container pools yesterday to use compression, and this morning the compression rate on this storage pool is still zero :-(

Cheers.

Arnaud
Re: Real world deduplication rates with TSM 7.1 and container pools
Hi all,

TSM deduplication is effective when combined with compression. Without compression, I'm not sure it is worth what it costs (a 1:2.5 ratio, or 65%, is what I generally see with mixed data and "standard" retention).

You should all watch this video from Tricia Jiang on YouTube (from 7:11): https://www.youtube.com/watch?v=ISWRrkY5RS8

--
Best regards / Cordialement / مع تحياتي
Erwann SIMON

----- Original message -----
From: "PAC Brion Arnaud"
To: ADSM-L@VM.MARIST.EDU
Sent: Friday, 18 March 2016 15:41:06
Subject: [ADSM-L] Real world deduplication rates with TSM 7.1 and container pools

Hi All,

We are currently testing the TSM 7.1 deduplication feature, in conjunction with container-based storage pools. So far, my test TSM instances installed with such a setup are reporting a dedup percentage of 45%, which means a dedup factor of around 1.81, using a sample of clients representative of our production environment. This is unfortunately pretty far from what was promised by IBM (a dedup factor of 4) ...

I'm wondering if anybody making use of container-based storage pools and deduplication would share their deduplication factor, so that I could have a better appreciation of real-world figures. If you would be so kind as to share your information (possibly with the kind of backed-up data, i.e. VM, DB, NAS, Exchange, and retention values ...), I would be very grateful!

Thanks in advance for the appreciated feedback.

Cheers.

Arnaud

**
Backup and Recovery Systems Administrator
Panalpina Management Ltd., Basle, Switzerland, CIT Department
Viadukstrasse 42, P.O. Box 4002 Basel/CH
Phone: +41 (61) 226 11 11, FAX: +41 (61) 226 17 01
Direct: +41 (61) 226 19 78
e-mail: arnaud.br...@panalpina.com
This electronic message transmission contains information from Panalpina and is confidential or privileged. This information is intended only for the person(s) named above. If you are not the intended recipient, any disclosure, copying, distribution or use or any other action based on the contents of this information is strictly prohibited. If you receive this electronic transmission in error, please notify the sender by e-mail, telephone or fax at the numbers listed above. Thank you.
**
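The quoted post converts a 45% savings percentage into a dedup factor of roughly 1.81. That conversion can be checked in a couple of lines (a sketch; `dedup_factor` is a hypothetical helper, not a Spectrum Protect tool):

```python
def dedup_factor(savings_pct: float) -> float:
    """Convert a space-savings percentage into a reduction factor:
    factor = data protected / data stored = 1 / (1 - savings)."""
    return 1.0 / (1.0 - savings_pct / 100.0)

# 45% savings corresponds to a factor of ~1.82 (quoted as "around 1.81"):
print(round(dedup_factor(45.0), 2))

# IBM's promised factor of 4 would require 75% savings:
print(round(dedup_factor(75.0), 2))
```

The same formula works in reverse: savings = 1 - 1/factor, which is useful when comparing figures quoted in different units across this thread.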
Re: Deduplication questions, again
I'll third the odd percentages ... using 7.1.3.100.

tsm: TSMPRD02>select sum(reporting_mb) from OCCUPANCY where stgpool_name='SASCONT0'

   Unnamed[1]
-------------
 182520798.90

tsm: TSMPRD02>q stg sascont0

Storage     Device       Storage     Estimated    Pct    Pct    High   Low    Next
Pool Name   Class Name   Type        Capacity     Util   Migr   Mig    Mig    Storage
                                                                Pct    Pct    Pool
---------   ----------   ---------   ----------   ----   ----   ----   ----   ---------
SASCONT0                 DIRECTORY   204,729 G    50.5                        VTLPOOL02

Rounded numbers:
- Pool capacity: 204 TB
- Pool used: 50%, i.e. 102 TB
- Reporting capacity: 180 TB
- Savings: 56%

OC shows savings of 0%. Servergraph also shows a pool savings of 0%. That said, some of our TDP backups used to see 93% savings and now show 0%, so something may be going on with our containers.

---
David Nixon
Storage Engineer II
Technology Services Group
Carilion Clinic
451 Kimball Ave.
Roanoke, VA 24015
Phone: 540-224-3903
cdni...@carilionclinic.org

Our mission: Improve the health of the communities we serve.

Notice: The information and attachment(s) contained in this communication are intended for the addressee only, and may be confidential and/or legally privileged. If you have received this communication in error, please contact the sender immediately, and delete this communication from any computer or network system. Any interception, review, printing, copying, re-transmission, dissemination, or other use of, or taking of any action upon this information by persons or entities other than the intended recipient is strictly prohibited by law and may subject them to criminal or civil liability. Carilion Clinic shall not be liable for the improper and/or incomplete transmission of the information contained in this communication or for any delay in its receipt.
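One possible reading of the mismatched percentages above (a guess, not a confirmed explanation of the OC numbers): with ~102 TB stored against ~180 TB of reported occupancy, "savings" can be quoted either as the fraction of data remaining or the fraction eliminated, and the two differ:

```python
protected_tb = 180.0  # sum(reporting_mb) from OCCUPANCY, rounded to TB
stored_tb = 102.0     # pool capacity * pct util, rounded to TB

remaining = stored_tb / protected_tb  # fraction of protected data actually stored
saved = 1.0 - remaining               # fraction eliminated by data reduction

print(f"remaining: {remaining:.1%}")  # remaining: 56.7%
print(f"saved:     {saved:.1%}")      # saved:     43.3%
```

A tool reporting the remaining fraction would show ~56%, while one reporting the saved fraction would show ~43%; neither matches the 0% shown by OC, which looks like a separate problem.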
Re: Real world deduplication rates with TSM 7.1 and container pools
Posting again ... it seems the first one was rejected.

Hello,

Correct, current client-side compression is not affected. Spectrum Protect is targeting client-side compression integrated with dedup and this new compression later this year. As for replication, if the target storage pool is a container pool with compression enabled, the data extents are sent over compressed.

Thank you,

Del

"ADSM: Dist Stor Manager" wrote on 03/22/2016 10:47:24 AM:
> From: PAC Brion Arnaud
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: Real world deduplication rates with TSM 7.1 and container pools
>
> @Del,
>
> An additional question: if replicating compressed data, will the data be transmitted compressed or uncompressed?
>
> Cheers.
>
> Arnaud
Re: Deduplication questions, again
Matthew,

Just a question: how do you know the size of the pre-dedup data? Did you make use of backup reports on each client to get that information, or build some query based on the dedupstats table, or anything else? Here again, I cannot seem to find coherent information in the TSM output ...

Let's take an example:

q occ CH2RS901 /

Node Name   Type   Filespace   FSID   Storage     Number of   Physical   Logical
                   Name               Pool Name   Files       Space      Space
                                                              Occupied   Occupied
                                                              (MB)       (MB)
---------   ----   ---------   ----   ---------   ---------   --------   --------
CH2RS901    Bkup   /              4   CONT_STG        8,148          -     165.78

So, from the "q occ" output we have 165.78 MB of logical space occupied. But:

q dedupstats CONT_STG CH2RS901 / f=d

                         Date/Time: 03/22/16 16:03:30
                 Storage Pool Name: CONT_STG
                         Node Name: CH2RS901
                    Filespace Name: /
                              FSID: 4
                              Type: Bkup
         Total Data Protected (MB): 167
             Total Space Used (MB): 36
            Total Space Saved (MB): 131
           Total Saving Percentage: 78.34
             Deduplication Savings: 137,056,854
          Deduplication Percentage: 78.34
     Non-Deduplicated Extent Count: 8,161
Non-Deduplicated Extent Space Used: 7,903,461
               Unique Extent Count: 6
          Unique Extent Space Used: 100,858
               Shared Extent Count: 3,176
      Shared Extent Data Protected: 166,937,957
          Shared Extent Space Used: 29,783,486
               Compression Savings: 0
            Compression Percentage: 0.00
           Compressed Extent Count: 0
         Uncompressed Extent Count: 11,343

If I trust this output, I have backed up 167 MB, and TSM deduped it down to 36 MB ... Could anyone explain how it comes that "q occ" finds a "logical space occupied" of 165.78 MB? Shouldn't it be 36 MB? The help for the "q occ" command states:

Logical Space Occupied (MB)
    The amount of space that is occupied by logical files in the file space. Logical space is the space that is actually used to store files, excluding empty space within aggregates. For this value, 1 MB = 1048576 bytes.

I'm lost here ...

Cheers.

Arnaud

From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Matthew McGeary
Sent: Tuesday, March 22, 2016 2:23 PM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: Deduplication questions, again

Arnaud,

I too am seeing odd percentages where container pools and dedup are concerned. I have a small remote server pair that protects ~23 TB of pre-dedup data, but my container pools show an occupancy of ~10 TB, which should be a data reduction of over 50%. However, a q stg on the container pool only shows a data reduction ratio of 21%. Of note, I use client-side dedup on all the client nodes at this particular site, and I think that's mucking up the data reduction numbers on the container pool. The 21% figure seems to be the reduction AFTER client-side dedup, not the total data reduction. It's confusing.

On the plus side, I just put in the new 7.1.5 code at this site, and the compression is working well and does not appear to add a noticeable amount of CPU cycles during ingest. Since the install date on the 18th, I've backed up around 1 TB pre-dedup, and the compression savings are rated at ~400 GB, which is pretty impressive. I'm going to do a test restore today and see how it performs, but so far so good.

__
Matthew McGeary
Technical Specialist - Infrastructure
PotashCorp
T: (306) 933-8921
www.potashcorp.com

From: PAC Brion Arnaud
To: ADSM-L@VM.MARIST.EDU
Date: 03/22/2016 03:52 AM
Subject: [ADSM-L] Deduplication questions, again

Hi All,

Another question in regards to TSM container-based deduplicated pools ...

Are you experiencing the same behavior as this: using "q stg f=d" targeting a deduped container-based storage pool, I observe the following output:

q stg f=d

          Storage Pool Name: CONT_STG
          Storage Pool Type: Primary
          Device Class Name:
               Storage Type: DIRECTORY
                 Cloud Type:
                  Cloud URL:
             Cloud Identity:
             Cloud Location:
         Estimated Capacity: 5,087 G
               Space Trigge
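The dedupstats figures quoted in this message are at least internally consistent; a quick check of the arithmetic (rounding differs slightly because the command works in exact byte counts while the summary fields are whole MB):

```python
protected_mb = 167.0  # Total Data Protected (MB)
used_mb = 36.0        # Total Space Used (MB)

saved_mb = protected_mb - used_mb              # matches Total Space Saved (MB): 131
saving_pct = saved_mb / protected_mb * 100.0   # command reports 78.34 from exact bytes

print(saved_mb)
print(round(saving_pct, 1))
```

The open question in the thread, which this arithmetic does not answer, is why "q occ" still reports the pre-reduction logical occupancy (165.78 MB) rather than the post-dedup footprint (36 MB).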
Re: Real world deduplication rates with TSM 7.1 and container pools
If you are using node replication, you could re-replicate the data and get it compressed (both ways). It may be a bit much work; feasible, perhaps, for selected nodes.

Joerg Pohlmann

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Del Hoobler
Sent: March 22, 2016 07:11
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Real world deduplication rates with TSM 7.1 and container pools

Hi David,

No. Only newly stored data will be compressed.

Del
Re: Real world deduplication rates with TSM 7.1 and container pools
Hello,

Correct, current client-side compression is not affected. Spectrum Protect is targeting client-side compression integrated with dedup and this new compression later this year. As for replication, if the target storage pool is a container pool with compression enabled, the data extents are sent over compressed.

Thank you,

Del

"ADSM: Dist Stor Manager" wrote on 03/22/2016 10:18:53 AM:
> From: Alexander Heindl
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: Real world deduplication rates with TSM 7.1 and container pools
>
> Hi,
>
> but already client-side compressed data isn't affected, isn't it? What if data is replicated? Is that data then compressed?
>
> Regards,
> Alex
Re: Real world deduplication rates with TSM 7.1 and container pools
@Del,

An additional question: if replicating compressed data, will the data be transmitted compressed or uncompressed?

Cheers.

Arnaud

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Alexander Heindl
Sent: Tuesday, March 22, 2016 3:19 PM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: Real world deduplication rates with TSM 7.1 and container pools

Hi,

but already client-side compressed data isn't affected, isn't it? What if data is replicated? Is that data then compressed?

Regards,
Alex
Re: Real world deduplication rates with TSM 7.1 and container pools
Hi,

but already client-side compressed data isn't affected, isn't it? What if data is replicated? Is that data then compressed?

Regards,
Alex

From: Del Hoobler
To: ADSM-L@VM.MARIST.EDU
Date: 22.03.2016 15:14
Subject: Re: [ADSM-L] Real world deduplication rates with TSM 7.1 and container pools
Sent by: "ADSM: Dist Stor Manager"

Hi David,

No. Only newly stored data will be compressed.

Del
Re: Real world deduplication rates with TSM 7.1 and container pools
Hi David,

No. Only newly stored data will be compressed.

Del

"ADSM: Dist Stor Manager" wrote on 03/22/2016 09:41:27 AM:
> From: David Ehresman
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: Real world deduplication rates with TSM 7.1 and container pools
>
> Del,
>
> After upgrading to 7.1.5, is there a way to get pre-existing container data compressed?
>
> David
Re: Real world deduplication rates with TSM 7.1 and container pools
Del, After upgrading to 7.1.5 is there a way to get pre-existing container data compressed? David -Original Message- From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Del Hoobler Sent: Monday, March 21, 2016 5:53 PM To: ADSM-L@VM.MARIST.EDU Subject: Re: [ADSM-L] Real world deduplication rates with TSM 7.1 and container pools I think most of you know Spectrum Protect just added in-line compression to the container and cloud deduplicated pools in version 7.1.5: http://www.ibm.com/support/knowledgecenter/SSGSG7_7.1.5/srv.common/r_techchg_srv_compress_715.html Adding incremental forever, the new in-line deduplication - client or server-based (7.1.3), new in-line compression (7.1.5) I think you will find that Protect continues to drive overall data reduction. This is being done in the software, so you can choose what disk you want to use. I encourage folks to try out the new deduplication along with the new compression to see how it helps with the overall data reduction. Thank you, Del "ADSM: Dist Stor Manager" wrote on 03/18/2016 10:41:06 AM: > From: PAC Brion Arnaud > To: ADSM-L@VM.MARIST.EDU > Date: 03/18/2016 10:41 AM > Subject: Real world deduplication rates with TSM 7.1 and container pools > Sent by: "ADSM: Dist Stor Manager" > > Hi All, > > We are currently testing TSM 7.1 deduplication feature, in > conjunction with container based storage pools. > So far, my test TSM instances, installed with such a setup are > reporting dedup percentage of 45 %, means dedup factor around 1.81, > using a sample of clients which are representative of our production > environment. > This is unfortunately pretty far from what was promised by IBM > (dedup factor of 4) ... 
> > I'm wondering if anybody making use of container based storage pools > and deduplication would be sharing his deduplication factor, so that > I could have a better appreciation of real world figures. > If you would be so kind to share your information (possibly with the > kind of backed-up data i.e. VM, DB, NAS, Exchange, and retention > values ...) I would be very grateful ! > > Thanks in advance for appreciated feedback. > > Cheers. > > Arnaud >
Re: Deduplication questions, again
Arnaud, I too am seeing odd percentages where containerpools and dedup is concerned. I have a small remote server pair that protects ~23 TB of pre-dedup data, but my containerpools show an occupancy of ~10 TB, which should be a data reduction of over 50%. However, a q stg on the containerpool only shows a data reduction ratio of 21%. Of note, I use client-side dedup on all the client nodes at this particular site and I think that's mucking up the data reduction numbers on the containerpool. The 21% figure seems to be the reduction AFTER client-side dedup, not the total data reduction. It's confusing. On the plus side, I just put in the new 7.1.5 code at this site and the compression is working well and does not appear to add a noticeable amount of CPU cycles during ingest. Since the install date on the 18th, I've backed up around 1 TB pre-dedup and the compression savings are rated at ~400 GB, which is pretty impressive. I'm going to do a test restore today and see how it performs but so far so good. __ Matthew McGeary Technical Specialist - Infrastructure PotashCorp T: (306) 933-8921 www.potashcorp.com From: PAC Brion Arnaud To: ADSM-L@VM.MARIST.EDU Date: 03/22/2016 03:52 AM Subject: [ADSM-L] Deduplication questions, again Sent by: "ADSM: Dist Stor Manager" Hi All, Another question in regards of TSM container based deduplicated pools ... Are you experiencing the same behavior as this : using "q stg f=d" targeting a deduped container based storage pool, I observe following output : q stg f=d Storage Pool Name: CONT_STG Storage Pool Type: Primary Device Class Name: Storage Type: DIRECTORY Cloud Type: Cloud URL: Cloud Identity: Cloud Location: Estimated Capacity: 5,087 G Space Trigger Util: Pct Util: 55.8 Pct Migr: Pct Logical: 100.0 High Mig Pct: Skipped few lines ... 
Compressed: No Deduplication Savings: 0 (0%) Compression Savings: 0 (0%) Total Space Saved: 0 (0%) Auto-copy Mode: Contains Data Deduplicated by Client?: Maximum Simultaneous Writers: No Limit Protection Storage Pool: CONT_STG Date of Last Protection: 03/22/16 05:00:27 Deduplicate Requires Backup?: Encrypted: Space Utilized(MB): Note the "deduplication savings" output ( 0 %) However, using "q dedupstats" on the same stgpool, I get following output : (just a snippet of it) Date/Time: 03/17/16 16:31:24 Storage Pool Name: CONT_STG Node Name: CH1RS901 Filespace Name: / FSID: 3 Type: Bkup Total Saving Percentage: 78.11 Total Data Protected (MB): 170 Date/Time: 03/17/16 16:31:24 Storage Pool Name: CONT_STG Node Name: CH1RS901 Filespace Name: /usr FSID: 4 Type: Bkup Total Saving Percentage: 62.25 Total Data Protected (MB): 2,260 How does it come that on one side I witness dedup, but not on the other one ? Thanks for enlightenments ! Cheers. Arnaud
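Matthew's hypothesis above — that the 21% the pool reports only counts reduction achieved at the server, after client-side dedup has already thinned the incoming stream — can be sanity-checked with simple arithmetic: independent reduction stages multiply. A sketch under that assumption; the 45% client-side figure below is invented for illustration, not taken from his servers.

```python
def combined_savings(client_side: float, server_side: float) -> float:
    """Overall fraction saved when two reduction stages apply in sequence.

    Each stage removes its fraction of whatever data reaches it, so the
    surviving fractions multiply: overall = 1 - (1 - c) * (1 - s).
    """
    return 1.0 - (1.0 - client_side) * (1.0 - server_side)

# Hypothetical: 45% saved by client-side dedup, then the 21% the pool reports.
overall = combined_savings(0.45, 0.21)  # ~0.565, i.e. ~56.5% total reduction
```

With those assumed numbers the total lands near 56.5%, which is roughly what 23 TB protected versus ~10 TB of occupancy implies (1 - 10/23 ≈ 56.5%), so the "21% is post-client-dedup" reading is at least self-consistent.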
Re: Real world deduplication rates with TSM 7.1 and container pools
Hi Arnaud, I was having a conversation about this the other day. With today's social media blitz, it can become an overload of information and different people like to learn about things in different ways. Some people told me they like something in their InBox. Others said their InBox was too full already. Others told me Twitter. Others LinkedIn. Some said ADSM-L. You indicate below that you would like to see information about new releases in the "IBM my notifications" or the "Technical Support Newsletter". One option is to post it everywhere. This is all good feedback. Thank you, Del "ADSM: Dist Stor Manager" wrote on 03/22/2016 05:36:46 AM: > From: PAC Brion Arnaud > To: ADSM-L@VM.MARIST.EDU > Date: 03/22/2016 05:37 AM > Subject: Re: Real world deduplication rates with TSM 7.1 and container pools > Sent by: "ADSM: Dist Stor Manager" > > Hi Del, > > Thanks for notifying that new feature : I will test it immediately ! > I'm however sorry to inform you that contrarily to your assumption, > and despite the fact that I subscribed to "IBM my notifications" as > well as to " Technical Support Newsletter", both related to Spectrum > protect, that information did not make its way to me ... > I assume that lots of TSM admins, all of them being pretty occupied > by daily business activities, and not mandatorily having enough time > to seek information are in the same case ! > I incidentally learnt as well thru your link that " Repair damaged > data on a replication target" feature had been added in TSM 7.1.5, > which is definitively a good thing, but I find it a shame that such > information is not pushed to customers thru the means of > newsletters, and that we have to dig in the depths of IBM's web site > to get such - important - information. 
> I fully understand that TSM / Spectrum protect has been subject to > numerous bugs fixes and improvements in the latest months ( I never > lived such an upgrade waltz during 15 years of TSM administration), > but to my eyes there's still lot of space for communication > improvement by IBM ! > Of course, only MHO ... > > Cheers. > > Arnaud > > -Original Message- > From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On > Behalf Of Del Hoobler > Sent: Monday, March 21, 2016 10:53 PM > To: ADSM-L@VM.MARIST.EDU > Subject: Re: Real world deduplication rates with TSM 7.1 and container pools > > I think most of you know Spectrum Protect just added in-line compression > to the container and cloud deduplicated pools in version 7.1.5: > > > http://www.ibm.com/support/knowledgecenter/SSGSG7_7.1.5/srv.common/ > r_techchg_srv_compress_715.html > > Adding incremental forever, the new in-line deduplication - client or > server-based (7.1.3), new in-line compression (7.1.5) I think you will > find that Protect continues to drive overall data reduction. This is being > done in the software, so you can choose what disk you want to use. > > I encourage folks to try out the the new deduplication along with the new > compression to see how it helps with the overall data reduction. > > > Thank you, > > Del > > > > > > "ADSM: Dist Stor Manager" wrote on 03/18/2016 > 10:41:06 AM: > > > From: PAC Brion Arnaud > > To: ADSM-L@VM.MARIST.EDU > > Date: 03/18/2016 10:41 AM > > Subject: Real world deduplication rates with TSM 7.1 and container pools > > Sent by: "ADSM: Dist Stor Manager" > > > > Hi All, > > > > We are currently testing TSM 7.1 deduplication feature, in > > conjunction with container based storage pools. > > So far, my test TSM instances, installed with such a setup are > > reporting dedup percentage of 45 %, means dedup factor around 1.81, > > using a sample of clients which are representative of our production > > environment. 
> > This is unfortunately pretty far from what was promised by IBM > > (dedup factor of 4) ... > > > > I'm wondering if anybody making use of container based storage pools > > and deduplication would be sharing his deduplication factor, so that > > I could have a better appreciation of real world figures. > > If you would be so kind to share your information (possibly with the > > kind of backed-up data i.e. VM, DB, NAS, Exchange, and retention > > values ...) I would be very grateful ! > > > > Thanks in advance for appreciated feedback. > > > > Cheers. > > > > Arnaud > > >
Deduplication questions, again
Hi All, Another question in regard to TSM container based deduplicated pools ... Are you experiencing the same behavior as this : using "q stg f=d" targeting a deduped container based storage pool, I observe the following output : q stg f=d Storage Pool Name: CONT_STG Storage Pool Type: Primary Device Class Name: Storage Type: DIRECTORY Cloud Type: Cloud URL: Cloud Identity: Cloud Location: Estimated Capacity: 5,087 G Space Trigger Util: Pct Util: 55.8 Pct Migr: Pct Logical: 100.0 High Mig Pct: Skipped few lines ... Compressed: No Deduplication Savings: 0 (0%) Compression Savings: 0 (0%) Total Space Saved: 0 (0%) Auto-copy Mode: Contains Data Deduplicated by Client?: Maximum Simultaneous Writers: No Limit Protection Storage Pool: CONT_STG Date of Last Protection: 03/22/16 05:00:27 Deduplicate Requires Backup?: Encrypted: Space Utilized(MB): Note the "deduplication savings" output ( 0 %) However, using "q dedupstats" on the same stgpool, I get the following output : (just a snippet of it) Date/Time: 03/17/16 16:31:24 Storage Pool Name: CONT_STG Node Name: CH1RS901 Filespace Name: / FSID: 3 Type: Bkup Total Saving Percentage: 78.11 Total Data Protected (MB): 170 Date/Time: 03/17/16 16:31:24 Storage Pool Name: CONT_STG Node Name: CH1RS901 Filespace Name: /usr FSID: 4 Type: Bkup Total Saving Percentage: 62.25 Total Data Protected (MB): 2,260 How come I see dedup savings on one side but not on the other ? Thanks for any enlightenment ! Cheers. Arnaud
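One way to cross-check the discrepancy above (per-filespace savings of 62-78% in "q dedupstats" versus 0% at the pool level in "q stg") is to roll the per-filespace rows up yourself. A sketch with the two rows transcribed from the output above; weighting each filespace's percentage by the data it protects is my assumption about how a pool-level figure should aggregate, not documented server behaviour.

```python
# Rows transcribed from the 'q dedupstats' snippet above.
dedupstats = [
    {"filespace": "/",    "saving_pct": 78.11, "protected_mb": 170},
    {"filespace": "/usr", "saving_pct": 62.25, "protected_mb": 2260},
]

def weighted_saving_pct(rows) -> float:
    """Estimate a pool-level savings percentage by weighting each
    filespace's saving percentage by the amount of data it protects."""
    protected = sum(r["protected_mb"] for r in rows)
    saved = sum(r["protected_mb"] * r["saving_pct"] / 100.0 for r in rows)
    return 100.0 * saved / protected
```

For these two rows the weighted figure comes out near 63%, so if "q stg" really reported the same statistic it should show something in that range rather than 0% — which supports the suspicion that the two commands are counting different things.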
Re: Real world deduplication rates with TSM 7.1 and container pools
Hi Del, Thanks for notifying that new feature : I will test it immediately ! I'm however sorry to inform you that contrarily to your assumption, and despite the fact that I subscribed to "IBM my notifications" as well as to " Technical Support Newsletter", both related to Spectrum protect, that information did not make its way to me ... I assume that lots of TSM admins, all of them being pretty occupied by daily business activities, and not mandatorily having enough time to seek information are in the same case ! I incidentally learnt as well thru your link that " Repair damaged data on a replication target" feature had been added in TSM 7.1.5, which is definitively a good thing, but I find it a shame that such information is not pushed to customers thru the means of newsletters, and that we have to dig in the depths of IBM's web site to get such - important - information. I fully understand that TSM / Spectrum protect has been subject to numerous bugs fixes and improvements in the latest months ( I never lived such an upgrade waltz during 15 years of TSM administration), but to my eyes there's still lot of space for communication improvement by IBM ! Of course, only MHO ... Cheers. Arnaud -Original Message- From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Del Hoobler Sent: Monday, March 21, 2016 10:53 PM To: ADSM-L@VM.MARIST.EDU Subject: Re: Real world deduplication rates with TSM 7.1 and container pools I think most of you know Spectrum Protect just added in-line compression to the container and cloud deduplicated pools in version 7.1.5: http://www.ibm.com/support/knowledgecenter/SSGSG7_7.1.5/srv.common/r_techchg_srv_compress_715.html Adding incremental forever, the new in-line deduplication - client or server-based (7.1.3), new in-line compression (7.1.5) I think you will find that Protect continues to drive overall data reduction. This is being done in the software, so you can choose what disk you want to use. 
I encourage folks to try out the the new deduplication along with the new compression to see how it helps with the overall data reduction. Thank you, Del "ADSM: Dist Stor Manager" wrote on 03/18/2016 10:41:06 AM: > From: PAC Brion Arnaud > To: ADSM-L@VM.MARIST.EDU > Date: 03/18/2016 10:41 AM > Subject: Real world deduplication rates with TSM 7.1 and container pools > Sent by: "ADSM: Dist Stor Manager" > > Hi All, > > We are currently testing TSM 7.1 deduplication feature, in > conjunction with container based storage pools. > So far, my test TSM instances, installed with such a setup are > reporting dedup percentage of 45 %, means dedup factor around 1.81, > using a sample of clients which are representative of our production > environment. > This is unfortunately pretty far from what was promised by IBM > (dedup factor of 4) ... > > I'm wondering if anybody making use of container based storage pools > and deduplication would be sharing his deduplication factor, so that > I could have a better appreciation of real world figures. > If you would be so kind to share your information (possibly with the > kind of backed-up data i.e. VM, DB, NAS, Exchange, and retention > values ...) I would be very grateful ! > > Thanks in advance for appreciated feedback. > > Cheers. > > Arnaud > > ** > Backup and Recovery Systems Administrator > Panalpina Management Ltd., Basle, Switzerland, > CIT Department Viadukstrasse 42, P.O. Box 4002 Basel/CH > Phone: +41 (61) 226 11 11, FAX: +41 (61) 226 17 01 > Direct: +41 (61) 226 19 78 > e-mail: arnaud.br...@panalpina.com<mailto:arnaud.br...@panalpina.com> > This electronic message transmission contains information from > Panalpina and is confidential or privileged. This information is > intended only for the person (s) named above. If you are not the > intended recipient, any disclosure, copying, distribution or use or > any other action based on the contents of this information is > strictly prohibited. 
Re: Real world deduplication rates with TSM 7.1 and container pools
I think most of you know Spectrum Protect just added in-line compression to the container and cloud deduplicated pools in version 7.1.5: http://www.ibm.com/support/knowledgecenter/SSGSG7_7.1.5/srv.common/r_techchg_srv_compress_715.html Adding incremental forever, the new in-line deduplication - client or server-based (7.1.3), new in-line compression (7.1.5) I think you will find that Protect continues to drive overall data reduction. This is being done in the software, so you can choose what disk you want to use. I encourage folks to try out the new deduplication along with the new compression to see how it helps with the overall data reduction. Thank you, Del "ADSM: Dist Stor Manager" wrote on 03/18/2016 10:41:06 AM: > From: PAC Brion Arnaud > To: ADSM-L@VM.MARIST.EDU > Date: 03/18/2016 10:41 AM > Subject: Real world deduplication rates with TSM 7.1 and container pools > Sent by: "ADSM: Dist Stor Manager" > > Hi All, > > We are currently testing TSM 7.1 deduplication feature, in > conjunction with container based storage pools. > So far, my test TSM instances, installed with such a setup are > reporting dedup percentage of 45 %, means dedup factor around 1.81, > using a sample of clients which are representative of our production > environment. > This is unfortunately pretty far from what was promised by IBM > (dedup factor of 4) ... > > I'm wondering if anybody making use of container based storage pools > and deduplication would be sharing his deduplication factor, so that > I could have a better appreciation of real world figures. > If you would be so kind to share your information (possibly with the > kind of backed-up data i.e. VM, DB, NAS, Exchange, and retention > values ...) I would be very grateful ! > > Thanks in advance for appreciated feedback. > > Cheers. > > Arnaud > > ** > Backup and Recovery Systems Administrator > Panalpina Management Ltd., Basle, Switzerland, > CIT Department Viadukstrasse 42, P.O. 
Box 4002 Basel/CH > Phone: +41 (61) 226 11 11, FAX: +41 (61) 226 17 01 > Direct: +41 (61) 226 19 78 > e-mail: arnaud.br...@panalpina.com<mailto:arnaud.br...@panalpina.com> > ** >
Re: Real world deduplication rates with TSM 7.1 and container pools
Hi Bas, Thanks for feedback. On our side average dedup is 38% for VM's data , with daily incremental and 30 days retention ... Cheers. Arnaud -Original Message- From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Bas van Kampen Sent: Monday, March 21, 2016 12:15 PM To: ADSM-L@VM.MARIST.EDU Subject: Re: Real world deduplication rates with TSM 7.1 and container pools We have around 900 VM's in 2 container pools and we get 35% dedup percentage. Regards, Bas van Kampen On 18-3-2016 15:41, PAC Brion Arnaud wrote: > Hi All, > > We are currently testing TSM 7.1 deduplication feature, in conjunction with > container based storage pools. > So far, my test TSM instances, installed with such a setup are reporting > dedup percentage of 45 %, means dedup factor around 1.81, using a sample of > clients which are representative of our production environment. > This is unfortunately pretty far from what was promised by IBM (dedup factor > of 4) ... > > I'm wondering if anybody making use of container based storage pools and > deduplication would be sharing his deduplication factor, so that I could have > a better appreciation of real world figures. > If you would be so kind to share your information (possibly with the kind of > backed-up data i.e. VM, DB, NAS, Exchange, and retention values ...) I would > be very grateful ! > > Thanks in advance for appreciated feedback. > > Cheers. > > Arnaud > > ** > Backup and Recovery Systems Administrator > Panalpina Management Ltd., Basle, Switzerland, > CIT Department Viadukstrasse 42, P.O. Box 4002 Basel/CH > Phone: +41 (61) 226 11 11, FAX: +41 (61) 226 17 01 > Direct: +41 (61) 226 19 78 > e-mail: arnaud.br...@panalpina.com<mailto:arnaud.br...@panalpina.com> > This electronic message transmission contains information from Panalpina and > is confidential or privileged. This information is intended only for the > person (s) named above. 
Re: Real world deduplication rates with TSM 7.1 and container pools
We have around 900 VM's in 2 container pools and we get 35% dedup percentage. Regards, Bas van Kampen On 18-3-2016 15:41, PAC Brion Arnaud wrote: Hi All, We are currently testing TSM 7.1 deduplication feature, in conjunction with container based storage pools. So far, my test TSM instances, installed with such a setup are reporting dedup percentage of 45 %, means dedup factor around 1.81, using a sample of clients which are representative of our production environment. This is unfortunately pretty far from what was promised by IBM (dedup factor of 4) ... I'm wondering if anybody making use of container based storage pools and deduplication would be sharing his deduplication factor, so that I could have a better appreciation of real world figures. If you would be so kind to share your information (possibly with the kind of backed-up data i.e. VM, DB, NAS, Exchange, and retention values ...) I would be very grateful ! Thanks in advance for appreciated feedback. Cheers. Arnaud ** Backup and Recovery Systems Administrator Panalpina Management Ltd., Basle, Switzerland, CIT Department Viadukstrasse 42, P.O. Box 4002 Basel/CH Phone: +41 (61) 226 11 11, FAX: +41 (61) 226 17 01 Direct: +41 (61) 226 19 78 e-mail: arnaud.br...@panalpina.com<mailto:arnaud.br...@panalpina.com> This electronic message transmission contains information from Panalpina and is confidential or privileged. This information is intended only for the person (s) named above. If you are not the intended recipient, any disclosure, copying, distribution or use or any other action based on the contents of this information is strictly prohibited. If you receive this electronic transmission in error, please notify the sender by e-mail, telephone or fax at the numbers listed above. Thank you. **
Re: Real world deduplication rates with TSM 7.1 and container pools
We are also getting around 60% in dedup savings. When I compare similar data going to tape I find that simple compression saves us about 40%. Hans Chr. On Mon, Mar 21, 2016 at 8:54 AM, PAC Brion Arnaud < arnaud.br...@panalpina.com> wrote: > Michael, Ken and Stephan, > > Thanks a lot for valuable input ! > > Based on your feedback, I believe that IBM is effectively overselling it's > product : best value announced is around 65 %, which means dedup factor > around 2.5 ... > > Reason why I asked is that we are currently making use of Data Domain > VTL's in our shop, which at present time have a dedup factor of 7.7, but > are aging and should soon be retired. > I was wondering if their replacement with a combination of cheap disk > storage and TSM deduped container would be a good idea ... > So far I still need to be convinced : disk (IBM, Hitachi ...) is way > cheaper than a VTL, but TSM dedup rates are seeming to be less than > expected : this will probably force us to buy more disks, thus making such > a solution less attractive. > > Cheers. > > Arnaud > > -Original Message- > From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of > Stefan Folkerts > Sent: Friday, March 18, 2016 5:32 PM > To: ADSM-L@VM.MARIST.EDU > Subject: Re: Real world deduplication rates with TSM 7.1 and container > pools > > We see around 50-65% deduplication savings on the fileclass storagepools, > most common seems to be around 55%. > It requires what I call "deep reclaims" with very low values that need a > lot of time. > We are seeing 60-70% on containerpools but on average it is more like 65% > but that is based on a much smaller install base. > Both in heterogeneous environments. > > > On Fri, Mar 18, 2016 at 5:06 PM, Ken Bury wrote: > > > I have two 7.1.4 servers, one with devclass file with dedupe, and the > other > > is using containers. The two servers are in a node replication pair so > the > > data on each server is exactly the same. 
The workload is almost > exclusively > > vmware backups with datamover dedupe and compression. The data reduction > > for both pools is 89%. I like what I am getting from container pools and > > replication. > > On Fri, Mar 18, 2016 at 11:35 Ryder, Michael S < > michael_s.ry...@roche.com> > > wrote: > > > > > Hi Arnaud > > > > > > If IBM made that commitment in black and white, then you should hold > > their > > > feet to the fire. But I am willing to bet this was a salesman > promising > > > "similar performance." > > > > > > There is no technology I know where any deduplication factor can be > > > guaranteed. Perhaps "UP to 4" for certain kinds of data... And > overall > > > reduction of storage is what you should be comparing, not simply the > > > deduplication percentage. > > > > > > Here, try reading at least the introduction of this document, " > Effective > > > Planning and Use of TSM V6 and V7 Deduplication" > > > > > > > > > > > > http://www.ibm.com/developerworks/community/wikis/form/anonymous/api/wiki/f731037e-c0cf-436e-88b5-862b9a6597c3/page/82e361b4-8e96-42cf-b559-0b77df9aed2c/attachment/5cf980b3-807f-464b-a1c0-b896b0cec7e6/media/TSM%20Dedup%20Best%20Practices%20-%20v2.1.pdf > > > > > > We haven't adopted the directory-container pools yet due to their > lacking > > > of support for important features like migration and copy pools, but I > > have > > > no doubt that IBM will be delivering those abilities soon; otherwise, > > there > > > are very limited use-cases for directory-containers. > > > > > > Best regards, > > > > > > Mike > > > RMD IT Client Services > > > > > > On Fri, Mar 18, 2016 at 10:41 AM, PAC Brion Arnaud < > > > arnaud.br...@panalpina.com> wrote: > > > > > > > Hi All, > > > > > > > > We are currently testing TSM 7.1 deduplication feature, in > conjunction > > > > with container based storage pools. 
> > > > So far, my test TSM instances, installed with such a setup are > > reporting > > > > dedup percentage of 45 %, means dedup factor around 1.81, using a > > sample > > > of > > > > clients which are representative of our production environment. > > > > This is unfortunately pretty far from what was promised by IBM (dedup > > > > factor of 4) ... > > > > > > > > I'm wondering if anybody making use of container bas
Re: Real world deduplication rates with TSM 7.1 and container pools
Michael, Ken and Stefan, Thanks a lot for valuable input ! Based on your feedback, I believe that IBM is effectively overselling its product : best value announced is around 65 %, which means dedup factor around 2.5 ... Reason why I asked is that we are currently making use of Data Domain VTL's in our shop, which at present time have a dedup factor of 7.7, but are aging and should soon be retired. I was wondering if their replacement with a combination of cheap disk storage and TSM deduped container would be a good idea ... So far I still need to be convinced : disk (IBM, Hitachi ...) is way cheaper than a VTL, but TSM dedup rates seem to be less than expected : this will probably force us to buy more disks, thus making such a solution less attractive. Cheers. Arnaud -Original Message- From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Stefan Folkerts Sent: Friday, March 18, 2016 5:32 PM To: ADSM-L@VM.MARIST.EDU Subject: Re: Real world deduplication rates with TSM 7.1 and container pools We see around 50-65% deduplication savings on the fileclass storagepools, most common seems to be around 55%. It requires what I call "deep reclaims" with very low values that need a lot of time. We are seeing 60-70% on containerpools but on average it is more like 65% but that is based on a much smaller install base. Both in heterogeneous environments. On Fri, Mar 18, 2016 at 5:06 PM, Ken Bury wrote: > I have two 7.1.4 servers, one with devclass file with dedupe, and the other > is using containers. The two servers are in a node replication pair so the > data on each server is exactly the same. 
> On Fri, Mar 18, 2016 at 11:35 Ryder, Michael S > wrote: > > > Hi Arnaud > > > > If IBM made that commitment in black and white, then you should hold > their > > feet to the fire. But I am willing to bet this was a salesman promising > > "similar performance." > > > > There is no technology I know where any deduplication factor can be > > guaranteed. Perhaps "UP to 4" for certain kinds of data... And overall > > reduction of storage is what you should be comparing, not simply the > > deduplication percentage. > > > > Here, try reading at least the introduction of this document, " Effective > > Planning and Use of TSM V6 and V7 Deduplication" > > > > > > > http://www.ibm.com/developerworks/community/wikis/form/anonymous/api/wiki/f731037e-c0cf-436e-88b5-862b9a6597c3/page/82e361b4-8e96-42cf-b559-0b77df9aed2c/attachment/5cf980b3-807f-464b-a1c0-b896b0cec7e6/media/TSM%20Dedup%20Best%20Practices%20-%20v2.1.pdf > > > > We haven't adopted the directory-container pools yet due to their lacking > > of support for important features like migration and copy pools, but I > have > > no doubt that IBM will be delivering those abilities soon; otherwise, > there > > are very limited use-cases for directory-containers. > > > > Best regards, > > > > Mike > > RMD IT Client Services > > > > On Fri, Mar 18, 2016 at 10:41 AM, PAC Brion Arnaud < > > arnaud.br...@panalpina.com> wrote: > > > > > Hi All, > > > > > > We are currently testing TSM 7.1 deduplication feature, in conjunction > > > with container based storage pools. > > > So far, my test TSM instances, installed with such a setup are > reporting > > > dedup percentage of 45 %, means dedup factor around 1.81, using a > sample > > of > > > clients which are representative of our production environment. > > > This is unfortunately pretty far from what was promised by IBM (dedup > > > factor of 4) ... 
> > > > > > I'm wondering if anybody making use of container based storage pools > and > > > deduplication would be sharing his deduplication factor, so that I > could > > > have a better appreciation of real world figures. > > > If you would be so kind to share your information (possibly with the > kind > > > of backed-up data i.e. VM, DB, NAS, Exchange, and retention values > ...) > > I > > > would be very grateful ! > > > > > > Thanks in advance for appreciated feedback. > > > > > > Cheers. > > > > > > Arnaud > > > > > > > > > > > > ** > > > Backup and Recovery Systems A
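Arnaud's cost comparison above (Data Domain VTLs at factor 7.7 versus TSM dedup at roughly 2-2.5) ultimately comes down to how much back-end disk each option consumes for the same protected data. A quick sketch of that arithmetic; the 500 TB front-end figure is invented for illustration, not taken from his environment.

```python
def backend_capacity_tb(frontend_tb: float, reduction_factor: float) -> float:
    """Disk actually consumed for a given amount of protected (front-end) data."""
    return frontend_tb / reduction_factor

frontend = 500.0  # hypothetical TB of protected data
vtl_disk = backend_capacity_tb(frontend, 7.7)  # ~65 TB behind the Data Domain
tsm_disk = backend_capacity_tb(frontend, 2.5)  # 200 TB of plain disk
```

With those assumed factors the TSM route needs roughly three times the raw disk, which is exactly the trade-off Arnaud describes: cheaper disk per TB, but considerably more of it.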
Real world deduplication rates with TSM 7.1 and container pools
Hi All, We are currently testing TSM 7.1 deduplication feature, in conjunction with container based storage pools. So far, my test TSM instances, installed with such a setup, are reporting a dedup percentage of 45 %, which means a dedup factor of around 1.81, using a sample of clients which are representative of our production environment. This is unfortunately pretty far from what was promised by IBM (dedup factor of 4) ... I'm wondering if anybody making use of container based storage pools and deduplication would be willing to share his deduplication factor, so that I could have a better appreciation of real world figures. If you would be so kind as to share your information (possibly with the kind of backed-up data i.e. VM, DB, NAS, Exchange, and retention values ...) I would be very grateful ! Thanks in advance for appreciated feedback. Cheers. Arnaud ** Backup and Recovery Systems Administrator Panalpina Management Ltd., Basle, Switzerland, CIT Department Viadukstrasse 42, P.O. Box 4002 Basel/CH Phone: +41 (61) 226 11 11, FAX: +41 (61) 226 17 01 Direct: +41 (61) 226 19 78 e-mail: arnaud.br...@panalpina.com<mailto:arnaud.br...@panalpina.com> **
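The figures traded throughout this thread mix two ways of saying the same thing: a "dedup percentage" (savings) and a "dedup factor" are related by factor = 1 / (1 - savings). A small conversion sketch for comparing the numbers people quote:

```python
def factor_from_savings(savings_pct: float) -> float:
    """45% saved -> factor ~1.81 (you store 55% of what you protect)."""
    return 1.0 / (1.0 - savings_pct / 100.0)

def savings_from_factor(factor: float) -> float:
    """Factor 4 -> 75% saved, the savings a 4:1 claim corresponds to."""
    return 100.0 * (1.0 - 1.0 / factor)
```

So the 45% reported here is a 1.81:1 factor, the 60-65% others report is about 2.5-2.9:1, and the promised factor of 4 would require 75% savings — the gap looks smaller in percentage terms than in factor terms, which is worth keeping in mind when comparing vendor claims.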
Re: Real world deduplication rates with TSM 7.1 and container pools
I have two 7.1.4 servers, one with a FILE device class with dedup, and the other using containers. The two servers are in a node replication pair, so the data on each server is exactly the same. The workload is almost exclusively VMware backups with datamover dedup and compression. The data reduction for both pools is 89%. I like what I am getting from container pools and replication.

On Fri, Mar 18, 2016 at 11:35, Ryder, Michael S wrote:

--
Ken
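Ken's 89% is combined data reduction: compression applies to whatever deduplication leaves behind, so the two savings stack multiplicatively rather than additively. A small sketch of that arithmetic (the 67% / 66.7% split below is purely illustrative; the thread only reports the 89% total):

```python
def combined_reduction(dedup_savings: float, comp_savings: float) -> float:
    """Total data-reduction percentage when compression is applied to
    the data that survives deduplication (multiplicative stacking)."""
    stored_fraction = (1 - dedup_savings / 100) * (1 - comp_savings / 100)
    return (1 - stored_fraction) * 100


# One hypothetical split: 67% dedup savings, then ~2/3 compression of
# the remainder, lands at roughly the 89% total reported in the thread.
print(round(combined_reduction(67, 66.7), 1))  # -> 89.0
```

This is also why Mike's advice below (compare overall storage reduction, not the dedup percentage alone) matters: very different dedup/compression splits can produce the same total.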
Re: Real world deduplication rates with TSM 7.1 and container pools
Hi Arnaud,

If IBM made that commitment in black and white, then you should hold their feet to the fire. But I am willing to bet this was a salesman promising "similar performance."

There is no technology I know of where any deduplication factor can be guaranteed. Perhaps "up to 4" for certain kinds of data ... And the overall reduction of storage is what you should be comparing, not simply the deduplication percentage.

Here, try reading at least the introduction of this document, "Effective Planning and Use of TSM V6 and V7 Deduplication":

http://www.ibm.com/developerworks/community/wikis/form/anonymous/api/wiki/f731037e-c0cf-436e-88b5-862b9a6597c3/page/82e361b4-8e96-42cf-b559-0b77df9aed2c/attachment/5cf980b3-807f-464b-a1c0-b896b0cec7e6/media/TSM%20Dedup%20Best%20Practices%20-%20v2.1.pdf

We haven't adopted the directory-container pools yet due to their lack of support for important features like migration and copy pools, but I have no doubt that IBM will be delivering those abilities soon; otherwise, there are very limited use cases for directory-containers.

Best regards,

Mike
RMD IT Client Services

On Fri, Mar 18, 2016 at 10:41 AM, PAC Brion Arnaud <arnaud.br...@panalpina.com> wrote:
Re: Real world deduplication rates with TSM 7.1 and container pools
We see around 50-65% deduplication savings on the FILE-class storage pools; the most common figure seems to be around 55%. It requires what I call "deep reclaims" (reclamation with very low thresholds) that need a lot of time. We are seeing 60-70% on container pools, more like 65% on average, but that is based on a much smaller install base. Both figures are from heterogeneous environments.

On Fri, Mar 18, 2016 at 5:06 PM, Ken Bury wrote:
Re: Windows 2012R2 Deduplication
Arni,

I was confronted with this issue some time ago, and the answer I received at that time was that TSM was unable to perform "optimized" backups of deduplicated file systems on Windows 2012. The data gets rehydrated during backup and, more importantly, during restore, which may require additional disk space. Support stated then that they were unaware of any plans to address the issue.

Rick Adamson

-Original Message-
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Arni Snorri Eggertsson
Sent: Sunday, May 03, 2015 9:54 AM
To: ADSM-L@VM.MARIST.EDU
Subject: [ADSM-L] Windows 2012R2 Deduplication
Windows 2012R2 Deduplication
Hi all,

I have a question about how backups should be handled when running file-system deduplication in Windows. I have a customer who runs Hyper-V; one guest in this Hyper-V setup is a file server with disks too big to handle with TSM for Hyper-V, so that is not an option (not that I think it should be used for file servers). The customer has now decided to run data deduplication inside Windows to lower the disk storage requirements. As far as I can tell, TSM does not have any API interface with Windows to handle this, which means that files need to be rehydrated during backup. This is very time consuming, and I am worried about what happens when we need to restore (time-wise), as well as about needing enough disk space for the fully rehydrated data during the restore process.

Also, after the customer activated file-system deduplication, the TSM client assumed all the files had changed, so right now we are taking a full backup of the entire file server. Does anyone know what happens the next time we run incremental backups; can I assume that TSM will take full backups every time?

Does IBM have any plans to support this dedup feature in Windows? It has been there since Windows 2012R2 was released, so I would assume there are people out there using it; or does the customer need to find alternative ways to solve this?

Thanks in advance,
---
Arni
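Arni's restore-space worry can be quantified: a non-optimized (rehydrating) restore has to write the logical size of the data, not the deduplicated on-disk size. A rough sketch of the estimate, with purely hypothetical volume numbers:

```python
def rehydrated_size_gb(on_disk_gb: float, dedup_savings_pct: float) -> float:
    """Logical (rehydrated) size a non-optimized restore must write,
    given the deduplicated on-disk size and the Windows deduplication
    savings rate reported for the volume."""
    return on_disk_gb / (1.0 - dedup_savings_pct / 100.0)


# Hypothetical volume: 4 TB on disk at 50% dedup savings needs
# roughly 8 TB of free space to restore in rehydrated form.
print(rehydrated_size_gb(4000, 50))  # -> 8000.0
```

In other words, the better the Windows dedup ratio, the bigger the gap between what the disk holds and what a full restore must be able to write.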
Re: Oracle RMAN and TDP with source-side deduplication
Sergio,

I don't have exact numbers, but from what I recall we were running 150-200 MB/s; this is not the network load but the effective throughput using client-side deduplication.

Use multiple sessions (how many is best is a matter of trial and error) and have fast filepool storage as well. You will see a lot of reads on the filepool, because TSM reads the data back before confirming to the server that the data is stored. If the filepool storage is slow you will see high wait I/O because of this; I have seen the random read speed of the filepool be the bottleneck with client-side dedup due to high wait I/O (30%+).

Regards,
Stefan

On Wed, Jan 21, 2015 at 11:33 PM, Sergio O. Fuentes wrote:
Re: Oracle RMAN and TDP with source-side deduplication
Thanks for the feedback. If you've successfully done source-side dedup with an SSD-backed TSM DB and Oracle TDP, can you provide any indication of the transfer rates you see? For example, would you be able to successfully back up a 100 GB DB in, say, less than an hour? That would be about 27 MB/sec. I'm currently seeing rates of 3 MB/sec. Pretty pitiful.

Our DB is on an auto-tiering VMAX (FAST VP) backed by about 1 TB of Flash. FAST VP moves the hotspots to Flash, normally dictated by some secret-sauce formula of read rates and cache misses or something to that effect. I see very little I/O wait on the TSM server DB during the TDP backup. I'll have to check the client to see if there's a bottleneck there.

Thanks again!
Sergio

On 1/21/15, 4:16 AM, "Stefan Folkerts" wrote:
Re: Oracle RMAN and TDP with source-side deduplication
What type of storage are you using for the TSM database? I've seen client-side dedup running fast for databases with SSDs for the TSM database and active log, but not so much with traditional hard drives; even a bunch of 15K disks just doesn't deliver like 4 SSDs.

From what I have seen, you pretty much require SSDs for the TSM database if you want any serious performance on client-side-deduped structured-data backups that need to stream a lot of data. The same goes for client-side dedup of VMs when you are backing up a lot of them at the same time.

File-level incrementals have a different impact, because the file-system scanning going on somewhat relieves the TSM DB disk IOPS-wise; start a few database backups with client-side dedup and watch the wait I/O increase significantly if you don't use SSDs.

This is just my experience from a few customer sites.

On Tue, Jan 20, 2015 at 10:07 PM, Sergio O. Fuentes wrote:
Oracle RMAN and TDP with source-side deduplication
Hello folks,

I've been using source-side deduplication pretty successfully for most of my clients (Unix, Windows, and TDP for MSSQL) for at least two years now. The backup window with source-side dedup is significantly shorter for Unix clients, minimally shorter for Windows, and somewhat longer for MSSQL nodes on average. The pain I've been having is with Oracle RMAN and TDP. I'm unsure whether our older Oracle servers are just really undersized, or whether it's TSM dedup overhead during source-side dedup that is stretching the backup window way too long. I have tested several versions of TSM against several versions of Oracle (10 and 11) on several different Oracle Solaris hardware tiers. I wanted to see if anyone in the group has had any significant success with source-side dedup and Oracle DBs. Am I being overly optimistic about TSM or TDP and their ability to process over 100 GB of DB data for one node for a level 0? Our databases are not that large, the largest being about 400 GB (and doing level 0s on that thing is a nightmare).

Here is some info on the environment settings I'm currently testing:

Deduplication ON in TSM and client
Compression ON in TSM
Filesperset 1 in Oracle for data files
Filesperset 10 in Oracle for archive logs
Archive and data files are both processed for dedup (I don't like the complexity of managing a non-dedup storage tier just for logs, so I'll try to eat the overhead on that)
TSM API at 7.1.1.0
TDP version at 6.3.0
RMAN version ?
Oracle version 10, moving to 11 but having the same performance issues
TSM catalog on an auto-tiering SAN array with flash

Right now, my fallback is to do post-process deduplication, and that has worked out fine, but I really want to see what kind of ingestion rates we should be able to get with Oracle RMAN and TSM source-side deduplication.

Also, I'm not shelling out money for a VTL right now. The decision was to stick with TSM dedup, and aside from nagging clients like Oracle RMAN and TDP, I've had no issues with TSM dedup. (Running a TSM server on Solaris was awful, however.)

Thanks!
Sergio
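Sergio's 27 MB/s figure is simple backup-window arithmetic (100 decimal GB over one hour; using binary GiB it comes out slightly higher). A sketch of the same estimate, usable in either direction:

```python
def required_throughput_mb_s(db_gb: float, window_hours: float) -> float:
    """Effective throughput (MiB/s) needed to move db_gb (GiB)
    within window_hours."""
    return db_gb * 1024 / (window_hours * 3600)


def backup_hours(db_gb: float, throughput_mb_s: float) -> float:
    """Backup window (hours) for db_gb (GiB) at a given effective
    throughput (MiB/s)."""
    return db_gb * 1024 / throughput_mb_s / 3600


print(round(required_throughput_mb_s(100, 1), 1))  # 100 GiB in 1 h -> 28.4
print(round(backup_hours(100, 3), 1))              # at the observed 3 MB/s -> 9.5
```

At the 3 MB/s Sergio reports, even the modest 100 GB level 0 takes most of a night, and the 400 GB database clearly cannot fit any reasonable window, which is the crux of the complaint.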
Re: TSM level for deduplication
Awesome... thank you Dave!

On Tue, Dec 9, 2014 at 9:52 AM, Dave Canan wrote:

> I submitted this request to IBM TSM development, and am posting it on their behalf:
> TSM data deduplication has been in the product since 6.1 (server-side data deduplication) and 6.2 (client-side data deduplication) and is therefore considered "mature" at the 7.1 version. The data deduplication mechanism itself has remained largely unchanged since the initial release, but improvements have been focused on performance and increasing the size of files that can be deduplicated. We recommend that customers be at either TSM 6.3.5 (or above) or TSM 7.1.1.100 (or above). The TSM 6.3.5 version contains the important performance improvements, though this version did not add improvements in deduplicating large files. TSM 7.1.1.100 contains not only the performance improvements but also the ability to deduplicate larger files, as well as replication performance improvements. All 7.1 deduplication customers should be at 7.1.1.100 due to this problem:
> http://www-01.ibm.com/support/docview.wss?uid=swg21688321
>
> Dave Canan, IBM SRT (TSM Solutions Response Team), ddca...@us.ibm.com, 916-723-2409, Office Hours 9:00 - 5:00 PT
>
> > Date: Mon, 8 Dec 2014 13:00:53 -0500
> > From: yodaw...@gmail.com
> > Subject: Re: [ADSM-L] TSM level for deduplication
> > To: ADSM-L@VM.MARIST.EDU
> >
> > Would anyone from IBM care to comment on this thread? Is dedup a stable, mature feature in 7.1.1?
> >
> > On Mon, Dec 8, 2014 at 12:50 PM, J. Pohlmann wrote:
> >
> > > FYI - 7.1.1.000 is still on the FTP site. 7.1.1.100 is also on the FTP site.
> > > Ref http://www-01.ibm.com/support/docview.wss?uid=swg24035122
> > >
> > > Best regards,
> > >
> > > Joerg Pohlmann
> > >
> > > -Original Message-
> > > From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Thomas Denier
> > > Sent: December 8, 2014 08:34
> > > To: ADSM-L@VM.MARIST.EDU
> > > Subject: Re: [ADSM-L] TSM level for deduplication
> > >
> > > Bent,
> > >
> > > TSM 7.1.1.000 had a bug that sometimes caused restores of large files to fail. IBM considered the bug serious enough to warrant removing 7.1.1.000 from its software distribution servers.
> > >
> > > Thomas Denier
> > > Thomas Jefferson University Hospital
> > >
> > > -Original Message-
> > > From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Bent Christensen
> > > Sent: Saturday, December 06, 2014 6:38 PM
> > > To: ADSM-L@VM.MARIST.EDU
> > > Subject: [ADSM-L] Re: TSM level for deduplication
> > >
> > > Hi Thomas,
> > >
> > > When you are calling 7.1.1 an "utter disaster" when it comes to dedup, what issues are you referring to?
> > >
> > > I have been using 7.1.1 in a production environment dedupping some 500 TB, approx. 400 nodes, without any bigger issues for more than a year now.
> > >
> > > Surely, there are still lots of "not-very-well-documented features" in TSM 7, and I am not at all impressed by IBM support, and especially not DB2 support and their lack of willingness to recognize TSM DB2 as being a production environment, but when it comes to dedupping it has been smooth sailing for us up until now.
> > >
> > > - Bent
> > >
> > > From: ADSM: Dist Stor Manager [ADSM-L@VM.MARIST.EDU] On behalf of Thomas Denier [thomas.den...@jefferson.edu]
> > > Sent: 5 December 2014 20:56
> > > To: ADSM-L@VM.MARIST.EDU
> > > Subject: [ADSM-L] TSM level for deduplication
> > >
> > > My management is very eager to deploy TSM deduplication in our production environment. We have been testing deduplication on a TSM 6.2.5.0 test server, but the list of known bugs makes me very uncomfortable about using that level for production deployment of deduplication. The same is true of later Version 6 levels and TSM 7.1.0. TSM 7.1.1.000 was an utter disaster. Is there any currently available level in which the deduplication code is really fit for production use?
> > >
> > > IBM has historically described patch levels as being less thor
Re: TSM level for deduplication
I submitted this request to IBM TSM development, and am posting it on their behalf:

TSM Data deduplication has been in the product since 6.1 (server side data deduplication) and 6.2 (client side data deduplication) and is therefore considered "mature" at the 7.1 version. The data deduplication mechanism itself has remained largely unchanged since its initial release, but improvements have been focused on performance and on increasing the size of files that can be deduplicated. We recommend that customers be at either TSM 6.3.5 (or above) or TSM 7.1.1.100 (or above). The TSM 6.3.5 version contains the important performance improvements, though this version did not add improvements in deduplicating large files. TSM 7.1.1.100 contains not only the performance improvements but also the ability to deduplicate larger files, as well as replication performance improvements. All 7.1 deduplication customers should be at 7.1.1.100 due to this problem: http://www-01.ibm.com/support/docview.wss?uid=swg21688321

Dave Canan
IBM SRT (TSM Solutions Response Team)
ddca...@us.ibm.com
916-723-2409
Office Hours 9:00 - 5:00 PT

> Date: Mon, 8 Dec 2014 13:00:53 -0500
> From: yodaw...@gmail.com
> Subject: Re: [ADSM-L] TSM level for deduplication
> To: ADSM-L@VM.MARIST.EDU
>
> Would anyone from IBM care to comment on this thread? Is dedup a stable,
> mature feature in 7.1.1?
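For readers new to the mechanism Dave describes, here is a minimal, hypothetical sketch of content-hash deduplication of the general kind in question. Everything in it is a made-up stand-in (fixed-size 4 KiB chunks, SHA-256, toy data); it is not TSM's actual implementation, which among other things uses variable-size chunking:

```python
import hashlib

class DedupPool:
    """Toy chunk store: keeps one copy of each unique chunk and tracks
    logical vs physical bytes so a dedup ratio can be reported.
    Illustrative only; not how the TSM server is implemented."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}        # sha256 digest -> chunk bytes
        self.logical_bytes = 0  # bytes as sent by clients

    def store(self, data: bytes):
        self.logical_bytes += len(data)
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            # Only the first copy of a given chunk is physically kept.
            self.chunks.setdefault(hashlib.sha256(chunk).hexdigest(), chunk)

    @property
    def physical_bytes(self):
        return sum(len(c) for c in self.chunks.values())

    def ratio(self):
        return self.logical_bytes / self.physical_bytes

pool = DedupPool()
backup = b"same OS image block " * 2048  # 40 KiB of repetitive data
pool.store(backup)  # first full backup
pool.store(backup)  # identical backup from a second node
```

Storing the same data a second time adds logical bytes but no physical bytes, which is why similar nodes are expected to push the reported ratio well above 1.0.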
Re: TSM level for deduplication
The web page at https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli+Storage+Manager/page/TSM+Schedule+for+Fix-Packs has the note "Removed from FTP site 12/1 due to IT05283. Replaced by 7.1.1.100." in reference to 7.1.1.000. I just looked at the FTP site, and 7.1.1.000 is indeed still there. I overestimated IBM's ability to keep track of the contents of its own Web sites and FTP servers.

Thomas Denier
Thomas Jefferson University Hospital

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of J. Pohlmann
Sent: Monday, December 08, 2014 12:50 PM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] TSM level for deduplication

FYI - 7.1.1.000 is still on the FTP site. 7.1.1.100 is also on the FTP site.
Ref http://www-01.ibm.com/support/docview.wss?uid=swg24035122

Best regards,

Joerg Pohlmann
Re: TSM level for deduplication
Would anyone from IBM care to comment on this thread? Is dedup a stable, mature feature in 7.1.1?

On Mon, Dec 8, 2014 at 12:50 PM, J. Pohlmann wrote:
> FYI - 7.1.1.000 is still on the FTP site. 7.1.1.100 is also on the FTP
> site.
> Ref http://www-01.ibm.com/support/docview.wss?uid=swg24035122
Re: TSM level for deduplication
FYI - 7.1.1.000 is still on the FTP site. 7.1.1.100 is also on the FTP site.
Ref http://www-01.ibm.com/support/docview.wss?uid=swg24035122

Best regards,

Joerg Pohlmann

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Thomas Denier
Sent: December 8, 2014 08:34
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] TSM level for deduplication

Bent,

TSM 7.1.1.000 had a bug that sometimes caused restores of large files to fail. IBM considered the bug serious enough to warrant removing 7.1.1.000 from its software distribution servers.
Re: TSM level for deduplication
Bent,

TSM 7.1.1.000 had a bug that sometimes caused restores of large files to fail. IBM considered the bug serious enough to warrant removing 7.1.1.000 from its software distribution servers.

Thomas Denier
Thomas Jefferson University Hospital

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Bent Christensen
Sent: Saturday, December 06, 2014 6:38 PM
To: ADSM-L@VM.MARIST.EDU
Subject: [ADSM-L] SV: TSM level for deduplication

Hi Thomas,

when you are calling 7.1.1 an "utter disaster" when it comes to dedup, what issues are you referring to?
Re: TSM level for deduplication
Hi Thomas,

I agree - don't even think about using dedup at 6.2.5. Too many performance and data handling bugs. We're deduping 4.5-5 TB a day (Windows TSM server) and even had issues at 6.3.4. We have seen no issues at all so far at 6.3.5 (and it should be easy for you to get there from 6.2.5).

W

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Thomas Denier
Sent: Friday, December 05, 2014 2:56 PM
To: ADSM-L@VM.MARIST.EDU
Subject: [ADSM-L] TSM level for deduplication

My management is very eager to deploy TSM deduplication in our production environment.
SV: TSM level for deduplication
Hi Thomas,

When you call 7.1.1 an "utter disaster" when it comes to dedup, what issues are you referring to?

I have been using 7.1.1 in a production environment, dedupping some 500 TB across approximately 400 nodes, without any big issues for more than a year now.

Surely there are still lots of "not-very-well-documented features" in TSM 7, and I am not at all impressed by IBM support, especially not DB2 support and their lack of willingness to recognize TSM DB2 as a production environment, but when it comes to dedupping it has been smooth sailing for us up until now.

- Bent

Fra: ADSM: Dist Stor Manager [ADSM-L@VM.MARIST.EDU] På vegne af Thomas Denier [thomas.den...@jefferson.edu]
Sendt: 5. december 2014 20:56
Til: ADSM-L@VM.MARIST.EDU
Emne: [ADSM-L] TSM level for deduplication

My management is very eager to deploy TSM deduplication in our production environment.
TSM level for deduplication
My management is very eager to deploy TSM deduplication in our production environment. We have been testing deduplication on a TSM 6.2.5.0 test server, but the list of known bugs makes me very uncomfortable about using that level for production deployment of deduplication. The same is true of later Version 6 levels and TSM 7.1.0. TSM 7.1.1.000 was an utter disaster. Is there any currently available level in which the deduplication code is really fit for production use?

IBM has historically described patch levels as being less thoroughly tested than maintenance levels. Because of that I have avoided patch levels unless they were the only option for fixing crippling bugs in code we were already using. Is that attitude still warranted? In particular, is that attitude warranted for TSM 7.1.1.100?

Has IBM dropped any hints about the likely availability date for TSM 7.1.2.000?

Thomas Denier
Thomas Jefferson University Hospital

The information contained in this transmission contains privileged and confidential information. It is intended only for the use of the person named above. If you are not the intended recipient, you are hereby notified that any review, dissemination, distribution or duplication of this communication is strictly prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.

CAUTION: Intended recipients should NOT use email communication for emergent or urgent health care matters.
Deduplication anomalies
I am trying to determine the causes of two anomalies in the behavior of a deduplicated storage pool in our TSM test environment. The test environment uses TSM 6.2.5.0 server code running under zSeries Linux. The environment has been using only server side deduplication since early September. Some tests before that time used client side deduplication.

The first anomaly has to do with reclamation of the deduplicated storage pool. For the last several days 'reclaim stgpool' commands have ended immediately with the message:

ANR2111W RECLAIM STGPOOL: There is no data to process for LDH.

This was surprising, given the amount of duplicated data reported by 'identify duplicates' processes. Yesterday I discovered that the storage pool had several volumes that were eligible for reclamation at the threshold that had been specified in the 'reclaim stgpool' commands. There had been a successful storage pool backup after the then most recent client backup sessions. I was able to perform 'move data' commands for each of the eligible volumes.

The second anomaly has to do with filling volumes. The deduplicated storage pool has 187 filling volumes with a reported occupancy of 0.0. Most of these have the percentage of reclaimable space also reported as 0.0, and all have the percentage of reclaimable space below 20. Most of the last write dates are concentrated in three afternoons. I maintain a document in which I log changes in the test environment and observations of the behavior of the environment. This document does not show any change in the environment or any observed anomalies on the days when most of the low occupancy volumes were last written.

The test environment has two collocation groups. I have verified that the deduplicated storage pool is configured for collocation by group and that every node is in one of the collocation groups. All of the volumes in the storage pool have an access setting of 'READWRITE'.

I have tried performing 'move data' commands for a few of the low occupancy volumes. The test environment consistently allocated a new scratch volume for output rather than adding the contents of the input volume to one of the few filling volumes with substantial amounts of data or to one of the many other low occupancy volumes.

Web searches for the ANR2111W message turned up nothing except reminders that a storage pool backup is needed before reclamation. Web searches for various groups of keywords related to the second anomaly have turned up nothing recognizably relevant.

Thomas Denier
Thomas Jefferson University Hospital
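The puzzling part of the first anomaly is that reclamation eligibility is, at its core, a simple threshold test. The toy model below (volume names and percentages are entirely made up, and this is not TSM's actual selection code) shows what 'reclaim stgpool ... threshold=N' would be expected to select; when eligible volumes exist yet ANR2111W is still issued, something other than this test must be filtering them out:

```python
# Illustrative only: a toy model of reclamation eligibility,
# not TSM's internal logic.

def eligible_for_reclaim(volumes, threshold):
    """Return the volumes whose reclaimable-space percentage meets
    the reclamation threshold."""
    return [name for name, pct_reclaimable in volumes.items()
            if pct_reclaimable >= threshold]

# Hypothetical pool state: many low-occupancy filling volumes with
# ~0% reclaimable space, plus a few genuinely reclaimable volumes.
pool_volumes = {
    "/tsmpool/vol001.bfs": 0.0,
    "/tsmpool/vol002.bfs": 0.0,
    "/tsmpool/vol003.bfs": 62.5,
    "/tsmpool/vol004.bfs": 81.0,
}

# With threshold=60, vol003 and vol004 should qualify for reclamation.
candidates = eligible_for_reclaim(pool_volumes, 60)
```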
Re: Encrypted files and TSM Deduplication on TSM 7.1
Thanks for your explanation, Andrew. In trying to architect a new service, we're having to architect for both deduplicable and non-deduplicable data. Encrypted data would generally fall into the non-deduplicable category, especially if it's 3rd-party encryption with no transparent decryption. I just needed to understand in what scenarios we would recommend one over the other.

Thanks!
SF

On 8/29/14 2:26 PM, "Andrew Raibeck" wrote:
> Hi Sergio,
>
> The statement in the Admin Guide refers to data encrypted by the TSM
> client. It is as you surmised, via include.encrypt.
Re: Encrypted files and TSM Deduplication on TSM 7.1
Hi Sergio, The statement in the Admin Guide refers to data encrypted by the TSM client. It is as you surmised, via include.encrypt. The TSM server does not otherwise know the contents of the files that are stored, so if the data is in some encrypted state by a 3rd party, the TSM server is not aware of this, and it could be eligible for deduplication. How effective deduplication will be with such data depends on how well this encrypted data lends itself to being deduped. Thus the statement does not apply to data encrypted by a 3rd-party tool, i.e., if the data has already been encrypted. HOWEVER, there are some other issues to understand and consider! Some encryption software handles encryption and decryption transparently. That is, data will be stored on disk in an encrypted state; and when read back by an authorized user, will be presented in its unencrypted state. This type of software protects the data from theft of the physical asset or from unauthorized users. With one exception, when TSM reads the data, it will be presented to TSM in its unencrypted state and backed up thus (unless you use TSM client encryption). The one exception is files that are encrypted with Microsoft Windows EFS (Encrypted File System). In this case, TSM uses the Microsoft EFS APIs to back up and restore the data in its encrypted form. That is, the data is NOT backed up or restored by TSM in its unencrypted form. Of course, if the files are statically encrypted, such that they appear encrypted to any other application that tries to open them (there is no transparent decryption), then TSM will back them up in that form, and has no awareness of them being encrypted. 
Regards,

- Andy

Andrew Raibeck | Tivoli Storage Manager Level 3 Technical Lead | stor...@us.ibm.com

IBM Tivoli Storage Manager links:
Product support: http://www.ibm.com/support/entry/portal/Overview/Software/Tivoli/Tivoli_Storage_Manager
Online documentation: https://www.ibm.com/developerworks/mydeveloperworks/wikis/home/wiki/Tivoli+Documentation+Central/page/Tivoli+Storage+Manager
Product Wiki: https://www.ibm.com/developerworks/mydeveloperworks/wikis/home/wiki/Tivoli+Storage+Manager/page/Home
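As an aside, the EFS state Andy describes is visible to any application through a file attribute bit. The sketch below is my own illustration of that attribute check (it is not what TSM does internally; TSM goes through the dedicated EFS backup APIs), using only the Python standard library:

```python
import os
import stat

def is_efs_encrypted(path: str) -> bool:
    """Return True if the file carries the Windows FILE_ATTRIBUTE_ENCRYPTED
    bit, i.e. it is EFS-encrypted. Only meaningful on Windows/NTFS; on
    other platforms st_file_attributes is absent and we report False."""
    st = os.stat(path)
    attrs = getattr(st, "st_file_attributes", 0)
    return bool(attrs & stat.FILE_ATTRIBUTE_ENCRYPTED)
```

A tool that sees this bit set knows that normal reads would hand back transparently decrypted data (for an authorized user), which is why EFS-aware backup software reads the raw encrypted stream through the EFS APIs instead.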
Encrypted files and TSM Deduplication on TSM 7.1
Here's an excerpt from official TSM documentation for TSM Server 7.1, listed as a limitation for deduplication:

Encrypted files
The Tivoli Storage Manager server and the backup-archive client cannot deduplicate encrypted files. If an encrypted file is encountered during data deduplication processing, the file is not deduplicated, and a message is logged.

Can we get more information on this statement? How does TSM know that it has encountered an encrypted file? Is it based solely on include.encrypt options from the client? Will it look at the object metadata to see if it's encrypted? Will it try to post-process an encrypted file? In other words, if a file is encrypted by, say, BitLocker or some third-party app, will TSM know not to process those objects for deduplication with the identify processes? What's the overhead of detecting this scenario? Has anyone tested this with TSM server 7.1?

Thanks!
Sergio
Re: deduplication status
39 is actually not a great number; it means you are getting less than 2 for 1 dedup. Unless you have backups running hard 24 hours a day, those dedup processes should finish. When you do Q PROC, if the processes have any work to do, they show as ACTIVE; if not, they show IDLE. I'd think that at some point during the day, you should have at least one of them go idle; then you know you have been as aggressive as you can be. If not, I'd add processes until you can see some idle time on at least one of them. Just my 2 cents.

W
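For reference, the arithmetic behind "less than 2 for 1": if Q STGP F=D reports P% as "Duplicate Data Not Stored", only (100 - P)% was physically written, so the conventional ratio is 100/(100 - P). A small helper of my own (not a TSM command) makes this concrete:

```python
def dedup_ratio(pct_saved: float) -> float:
    """Convert a 'Duplicate Data Not Stored' percentage (as reported by
    Q STGP F=D) into a conventional N:1 deduplication ratio.

    If P% of the logical data was eliminated as duplicate, then only
    (100 - P)% was physically stored, so the ratio is 100 / (100 - P).
    """
    if not 0 <= pct_saved < 100:
        raise ValueError("percentage must be in [0, 100)")
    return 100.0 / (100.0 - pct_saved)

# 39% saved is only about 1.64:1, hence "less than 2 for 1".
print(round(dedup_ratio(39), 2))   # 1.64
print(round(dedup_ratio(50), 2))   # 2.0  (50% saved is exactly 2:1)
```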
deduplication status
I've searched the archives but I can't really find the answer I'm looking for. Running on version 6.3.1.0. I have a primary storage pool running dedup. I run the IDENTIFY DUPLICATES command, the reclamation command, and the expiration command throughout the daily cycle. I run the Q STGP F=D command to check the "Duplicate Data Not Stored" numbers and I'm getting 39% right now, which sounds pretty good, I guess.

My question is how do I tell if I'm running the dedup processes aggressively enough. Can I do something to increase that number? I realize that the dedup processes are never really finished because of the new data that is constantly coming in and old data getting expired. Is there something I can look at to be able to tell if I need to adjust the IDENTIFY DUPLICATES commands I'm running? More processes, fewer processes, or change how long I run them... Something that tells me how much data has not been deduped yet versus how much has been processed. Is that kind of info accessible?

David Tyree
System Administrator
South Georgia Medical Center
229.333.1155
Re: TSM and VTL Deduplication
Because of this, if you want to use TSM dedup, you could access your storage over NFS as a FILE device pool. That is, instead of using a VTL interface over Fibre Channel, you could bring up an NFS server on the Linux box and access it as a TSM FILE device pool over Ethernet. Whether to use TSM dedup or the VTL software's dedup may depend on which provides the better dedup ratio. You might want to perform a test with a large sample of your data to see what ratio you get with each.

Rick
Re: TSM and VTL Deduplication
Just so we're all clear here: you cannot TSM dedup to virtual tape, even though the virtual tape is actually disk. TSM dedup has to go to a TSM-defined FILE storage pool, not a TSM-defined tape storage pool. If you write to a virtual tape storage pool, the data will be written to those virtual tapes un-deduped by TSM. If the virtual tape does dedup, it will do so, but TSM will have no part in that operation and will in fact not know that it has been done.

David
Re: TSM and VTL Deduplication
IBM supplies a perl script to measure the cost of dedup. See http://www-01.ibm.com/support/docview.wss?uid=swg21596944

I just ran it in an instance with an 800 GB db; here are the final summary lines:

Final Dedup and Database Impact Report

Deduplication Database Totals
Total Dedup Chunks in DB : 1171344436
Average Dedup Chunk Size : 447243.5

Deduplication Impact to Database and Storage Pools
Estimated DB Cost of Deduplication : 796.51 GB
Total Storage Pool Savings         : 230466.30 GB

That works out to ~3.5 GB per TB saved. The db is not on SSD. It is on a 6-disk RAID-10 array internal on a Dell server. Overall I am very happy with TSM dedup.

Thanks,
Bill Colwell
Draper lab
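As a sanity check, the "~3.5 GB per TB saved" figure above follows directly from the script's two summary numbers (simple arithmetic, nothing TSM-specific):

```python
db_cost_gb = 796.51        # "Estimated DB Cost of Deduplication" from the report
savings_gb = 230466.30     # "Total Storage Pool Savings" from the report

# Database overhead per TB (1024 GB) of storage pool space saved
gb_per_tb_saved = db_cost_gb / savings_gb * 1024
print(f"{gb_per_tb_saved:.2f} GB of DB per TB saved")   # 3.54 GB of DB per TB saved
```

Note this is well under the 10 GB/TB rule of thumb quoted elsewhere in the thread, because it counts DB cost per TB *saved* rather than per TB of native data stored.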
Re: TSM and VTL Deduplication
Yes, one of the two. If TSM deduplication is enabled and the target is a virtual tape, I doubt the VTL can deduplicate anything from the write data.
Re: TSM and VTL Deduplication
Unless you have a specific requirement, I would suggest you choose either TSM dedup to disk or go straight to virtual tape. There is not usually a need to do both.

David
Re: TSM and VTL Deduplication
Thanks for all the answers. So: SSDs (looking at SSD caching) for the database storage, and 10 GB of DB per TB of total backup data to be on the safe side.
Re: TSM and VTL Deduplication
Hi,

I'd rather say 6 to 10 times, or 10 GB of DB for each 1 TB of data (native, not deduped) stored.

--
Best regards / Cordialement / مع تحياتي
Erwann SIMON
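That rule of thumb can be turned into a quick sizing estimate. The sketch below is illustrative only; the 10 GB-per-TB factor is the guideline quoted above, not an IBM-published constant, so adjust it for your own environment:

```python
def estimated_db_size_gb(native_data_tb: float, gb_per_tb: float = 10.0) -> float:
    """Rough TSM database sizing for deduplication, using the
    rule-of-thumb of ~10 GB of DB per TB of native (pre-dedup) data."""
    return native_data_tb * gb_per_tb

# e.g. 80 TB of native backup data -> roughly an 800 GB database,
# in line with the 800 GB DB instance mentioned earlier in the thread.
print(estimated_db_size_gb(80))   # 800.0
```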
Re: TSM and VTL Deduplication
And if you are deduping more than a couple TB a day, be prepared to move it to SSD or flash!
Re: TSM and VTL Deduplication
Be prepared for your database size to double or triple if you are using TSM deduplication.
Re: TSM and VTL Deduplication
And if you are on the licensing-by-TB model, when it gets un-deduped (reduped, rehydrated, whatever), your costs go up!
Re: TSM and VTL Deduplication
Understood. Thanks!
Re: TSM and VTL Deduplication
If TSM moves data from a (disk) dedup pool to tape, TSM has to un-dedup the data as it reads it and sends it to tape.
TSM and VTL Deduplication
Greetings,

We are trying to evaluate the possibility of introducing deduplication into our backups. Our initial deployment will be based on quadstor vtl (http://www.quadstor.com/virtual-tape-library.html), but at the same time we are trying to understand the TSM deduplication feature. Could anyone tell me what the following means, from the TSM deduplication FAQ (https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Deduplication%20FAQ):

(Q4) When should I consider using TSM deduplication?
Consider using TSM deduplication when the following conditions apply: You plan to use a disk-only backup solution (your primary backup storage pool will remain on disk).

I understand why the backup pool should be a disk pool, but the "disk-only backup solution" confuses me. Does tape migration cause any effects?

TSM Server version v6.3.3
Re: Test Deduplication
You can however use this to get an idea: http://www-01.ibm.com/support/docview.wss?uid=swg21596944
Re: Test Deduplication
Hi Christian,

No, it's not possible. The Storage Pool must be deduplication-enabled to be able to run identify duplicates against it. If you enable your Storage Pool for deduplication and start the identify process, your DB will start to grow at a very fast rate, and you won't be able to reclaim this space after that.

--
Best regards / Cordialement / مع تحياتي
Erwann SIMON
Test Deduplication
Hi *SM-nerds. I just wonder if it is possible to run a deduplication identify process on an existing FILE-class storage pool without deduplicating any data? We just want to know how much data would be deduplicated and whether it is worth it on this storage pool. We are running TSM 6.3.4.300 on Windows Server 2012. Best Regards / Med vänlig hälsning Christian Svensson __ Knarrarnäsgatan 7, Kista Entré SE-164 40 Kista Sweden Cell: +46-70 325 1577 E-mail: christian.svens...@cristie.se "By securing your data you save time, time you can use for more important things."
Re: Deduplication "number of chunks waiting in queue" continues to rise?
Hey, Nick, missed your name the first time around! Being in higher-ed/research we went the cheap route and actually just use direct-attach 15K SAS drives on Dell servers, divvied up into multiple RAID-10 sets. Even a 1TB database only takes us ~1 hour to back up or restore, which is well within our SLA. On 12/20/2013 11:42 AM, Marouf, Nick wrote: > Hi Skylar ! > > Yes, that would be the easy way to do it; there is an option to rebalance > the I/O after you add the new file systems to the database. I had already > set up TSM before the performance tuning guideline was released. Doing it this > way will require more storage initially, and running the db2 rebalancing command-line > tools will spread out the DB I/O load. > > Using IBM XIVs, which can handle very large I/O requests, in our specific > case there was no need to provide physically-separate volumes. I've seen one > TSM instance crank upwards of 10,000 IOPS, leaving an entire ESX cluster in > the dust. -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354
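[Archive note] The directory-spreading step Nick and Skylar describe is driven by the EXTEND DBSPACE server command; later server levels added redistribution of existing data to that command, while on the 6.3 levels discussed here you may still need the manual db2 rebalance (or dump/restore) they mention. A hedged sketch; the admin credentials and directory paths are placeholders, not values from this thread:

```shell
# Add four more database directories so DB2 can stripe I/O across them.
dsmadmc -id=admin -password=secret \
  "extend dbspace /tsmdb/dir5,/tsmdb/dir6,/tsmdb/dir7,/tsmdb/dir8"

# Verify the new layout and per-directory free space afterwards.
dsmadmc -id=admin -password=secret "query dbspace"
```

On servers that do not redistribute automatically, data written before the extension stays concentrated on the original directories, which is exactly why Skylar resorts to dump/restore.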
Re: Deduplication "number of chunks waiting in queue" continues to rise?
Hi Wanda, some quick rambling thoughts about dereferenced chunk cleanup. Do you know about the 'show banner' command? If IBM sends you an e-fix, this will tell you what it is fixing. tsm: x>show banner * EFIX Cumulative level 6.3.4.207 * * This is a Limited Availability TEMPORARY fix for * * IC94121 - ANR2033E DEFINE ASSOCIATION: Command failed - lock con * * when def assoc immediately follows def sched. * * IC95890 - Allow numeric volser for zOS Media server volumes. * * IC93279 - Redrive failed outbound replication connect requests. * * IC93850 - PAM authentication login protocol exchange failure * * wi3187 - AUDIT LIBVOLUME new command * * IC96637 - SERVER CAN HANG WHEN USING OPERATION CENTER * * IC95938 - ANRD_2644193874 BFCHECKENDTOEND DURING RESTORE/RET * * IC96993 - MOVE NODEDATA OPERATION MIGHT RESULT IN INVALID LINKS * * IC91138 - Enable audit volume to mark one more kind invalid link * * THE RESTARTED RESTORE OPERATION MAY BE SINGLE-THREADED * * Avoid restore stgpool linking to orphaned base chunks * * WI3236 - Oracle T1D tape drive support * * 94297 - Add a parameter DELETEALIASES for DELETE BITFILE utili * * IC96462 - Mount failure retry for zOS Media server tape volumes. * * IC96993 - SLOW DELETION OF DEREFERENCED DEDUPLICATED CHUNKS * * This cumulative efix server is based on code level * * made generally available with FixPack 6.3.4.200 * * * I have 2 servers on 6342.006 and 2 on 6342.007. I have the .009 efix waiting to be installed on my biggest, oldest, baddest server to fix the chunks-in-queue problem. On 3 servers, the queue is down to 0, and they usually run without a problem. On the big bad one, here are the stats - tsm: WIN1>show dedupdeleteinfo Dedup Deletion General Status Number of worker threads : 15 Number of active worker threads : 1 Number of chunks waiting in queue : 11326513 Dedup Deletion Worker Info Dedup deletion worker id: 1 Total chunks queued : 0 Total chunks deleted: 0 Deleting AF Entries?: Yes In error state? 
: No Worker thread 2 is not active Worker thread 3 is not active Worker thread 4 is not active Worker thread 5 is not active Worker thread 6 is not active Worker thread 7 is not active Worker thread 8 is not active Worker thread 9 is not active Worker thread 10 is not active Worker thread 11 is not active Worker thread 12 is not active Worker thread 13 is not active Worker thread 14 is not active Worker thread 15 is not active -- Total worker chunks queued : 0 Total worker chunks deleted: 0 The cleanup of reclaimed volumes is done by the thread which has 'Deleting AF Entries?: Yes'. The pending efix is supposed to get this process to finish. It never finishes on this server, something about a bad access plan. When I have a lot of volumes which are empty but won't delete, I generate MOVE DATA commands for them. A MOVE DATA to the same pool will manually do what the chunk cleanup process is trying to do. Regards, Bill Colwell Draper Lab -Original Message- From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Prather, Wanda Sent: Thursday, December 19, 2013 11:36 PM To: ADSM-L@VM.MARIST.EDU Subject: Deduplication "number of chunks waiting in queue" continues to rise? TSM 6.3.4.00 on Win2K8. Perhaps some of you that have dealt with the dedup "chunking" problem can enlighten me. TSM/VE backs up to a dedup file pool, about 4 TB of changed blocks per day. I currently have more than 2 TB (yep, terabytes) of volumes in that file pool that will not reclaim. We were told by support that when you do SHOW DEDUPDELETEINFO, the "number of chunks waiting in queue" has to go to zero for those volumes to reclaim. (I know that there is a fix at 6.3.4.200 to improve the chunking process, but that has been APARed, and we are waiting on 6.3.4.300.) I have shut down IDENTIFY DUPLICATES and reclamation for this pool. There are no clients writing into the pool; we have redirected backups to a non-dedup pool for now to try and get this cleared up. There is no client-side dedup here, only server-side. I've also set deduprequiresbackup to NO for now, although I hate doing that, to make sure that doesn't interfere with the reclaim process. But SHOW DEDUPDELETEINFO shows that the "number of chunks waiting in queue" is *still* increasing.
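[Archive note] Bill's MOVE DATA workaround can be scripted rather than typed volume by volume: a SELECT against the server's VOLUMES table generates the commands, which are then run as a macro. A sketch only; the admin ID, password, and pool name below are placeholders, not values from this thread.

```shell
# Generate "move data" commands for 0%-utilized volumes in the dedup
# pool that refuse to reclaim, writing them to a macro file.
dsmadmc -id=admin -password=secret -dataonly=yes \
  "select 'move data ' || volume_name from volumes \
   where stgpool_name = 'DEDUPPOOL' and pct_utilized = 0" \
  > movedata.macro

# Run the macro, committing after each command so a failure
# partway through does not roll back the earlier moves.
dsmadmc -id=admin -password=secret -itemcommit macro movedata.macro
```

A MOVE DATA back into the same pool rewrites whatever the volume still references, which is what lets the stuck chunk-cleanup thread finally release it.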
Re: Deduplication "number of chunks waiting in queue" continues to rise?
Hi Skylar ! Yes, that would be the easy way to do it; there is an option to rebalance the I/O after you add the new file systems to the database. I had already set up TSM before the performance tuning guideline was released. Doing it this way will require more storage initially, and running the db2 rebalancing command-line tools will spread out the DB I/O load. Using IBM XIVs, which can handle very large I/O requests, in our specific case there was no need to provide physically-separate volumes. I've seen one TSM instance crank upwards of 10,000 IOPS, leaving an entire ESX cluster in the dust. -Original Message- From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Skylar Thompson Sent: Friday, December 20, 2013 2:28 PM To: ADSM-L@VM.MARIST.EDU Subject: Re: [ADSM-L] Deduplication "number of chunks waiting in queue" continues to rise? While we don't do deduplication (tests show we gain less than 25% from it), we also split our DB2 instances across multiple, physically-separate volumes. The one thing to note is that you have to dump and restore the database to spread existing data across those directories if you add them post-installation. On Fri, Dec 20, 2013 at 02:23:34PM -0500, Marouf, Nick wrote: > I can second that Sergio, > Backup stgpools to copy tapes is not pretty, and is an intensive process to > rehydrate all that data. > > The one extra thing I did was split the database across multiple folders for > parallel I/O to the database. That has worked out very well, and I currently > have it set up to span across 8 folders, with an XIV backend that can take a > beating. > > > -Original Message- > From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of > Sergio O. Fuentes > Sent: Friday, December 20, 2013 12:04 PM > To: ADSM-L@VM.MARIST.EDU > Subject: Re: [ADSM-L] Deduplication "number of chunks waiting in queue" > continues to rise? > > Client-side dedup and simultaneous-write to a copy pool are mutually > exclusive. 
You can't do both, which is the only theoretical way to enforce > deduprequiresbackup with client-side dedup. I suppose IBM could enhance TSM > to do a simultaneous-like operation with client-side dedup, but that's not > available now. So, I'm not sure how the TSM server enforces > deduprequiresbackup with client-side dedup. Ever since 6.1 I have always set > that to NO anyway. I have dealt with the repercussions of that as well. > Backup stgpool on dedup'd stgpools is not pretty. > > I have made some architectural changes to the underlying stgpools and the > 'backup stgpools' run pretty well, even with 1TB SATA drives. Two things I > think helped quite a bit: > > 1. Use big predefined volumes. My new volumes are 50GB. > 2. Use many filesystems for the devclass. I have 5 currently. I would use > more if I had the space. > > Thanks! > > Sergio > > > On 12/20/13 11:35 AM, "Prather, Wanda" wrote: > > >Woo hoo! > >That's great news. > >Will open a ticket and escalate. > > > >Also looking at client-side dedup, but I have to do some > >architectural planning, as all the data is coming from one client, > >the TSM VE data mover, which is a vm. > > > >Re client-side dedup, do you know if there is any cooperation between > >the client-side dedup and deduprequiresbackup on the server end? > >I have assumed that the client-side dedup would not offer that protection. > > > >W > > > >-Original Message- > >From: Sergio O. Fuentes [mailto:sfuen...@umd.edu] > >Sent: Friday, December 20, 2013 10:39 AM > >To: ADSM: Dist Stor Manager > >Cc: Prather, Wanda > >Subject: Re: [ADSM-L] Deduplication "number of chunks waiting in queue" > >continues to rise? > > > >Wanda, > > > >In trying to troubleshoot an unrelated performance PMR, IBM provided > >me with an e-fix for the dedupdel bottleneck that it sounds like > >you're experiencing. 
They obviously will want to do their > >due-diligence on whether or not this efix will help solve your > >problems, but it has proved very useful in my environment. They even > >had to compile a solaris e-fix for me, cause it seems like I'm the > >only one running TSM on Solaris. The e-fix was very simple to install. > > > >What you don't want to do is go to 6.3.4.2, unless they tell you to > >because the e-fix is for that level (207). Don't run on 6.3.4.2 for > >even a minute. Only install it to get to the e-fix level. > > > >Dedupdel gets populated by anything that deletes data from the > >stgpool, I.e
Re: Deduplication "number of chunks waiting in queue" continues to rise?
Is anyone doing stgpool backups to a dedup file copy pool? At 02:23 PM 12/20/2013, Marouf, Nick wrote: >I can second that Sergio, > Backup stgpools to copy tapes is not pretty, and is an intensive process to > rehydrate all that data. > >The one extra thing I did was split the database across multiple folder for >parallel I/O to the Database. That has worked out very well, and I currently >have it setup to span across 8 folders, with an XIV backend that can take a >beating. > > >-Original Message- >From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of >Sergio O. Fuentes >Sent: Friday, December 20, 2013 12:04 PM >To: ADSM-L@VM.MARIST.EDU >Subject: Re: [ADSM-L] Deduplication "number of chunks waiting in queue" >continues to rise? > >Client-side dedup and simultaneous-write to a copy pool are mutually >exclusive. You can't do both, which is the only theoretical way to enforce >deduprequiresbackup with client-side dedup. I suppose IBM could enhance TSM >to do a simultaneous-like operation with client-side dedup, but that's not >available now. So, I'm not sure how the TSM server enforces >deduprequiresbackup with client-side dedup. Ever since 6.1 I have always set >that to NO anyway. I have dealt with the repercussions of that as well. >Backup stgpool on dedup'd stgpools is not pretty. > >I have made some architectural changes to the underlying stgpools and the >'backup stgpools' run pretty well, even with 1TB SATA drives. Two things I >think helped quite a bit: > >1. Use big predefined volumes. My new volumes are 50GB. >2. Use many filesystems for the devclass. I have 5 currently. I would use >more if I had the space. > >Thanks! > >Sergio > > >On 12/20/13 11:35 AM, "Prather, Wanda" wrote: > >>Woo hoo! >>That's great news. >>Will open a ticket and escalate. >> >>Also looking at client-side dedup, but I have to do some architectural >>planning, as all the data is coming from one client, the TSM VE data >>mover, which is a vm. 
>> >>Re client-side dedup, do you know if there is any cooperation between >>the client-side dedup and deduprequiresbackup on the server end? >>I have assumed that the client-side dedup would not offer that protection. >> >>W >> >>-Original Message- >>From: Sergio O. Fuentes [mailto:sfuen...@umd.edu] >>Sent: Friday, December 20, 2013 10:39 AM >>To: ADSM: Dist Stor Manager >>Cc: Prather, Wanda >>Subject: Re: [ADSM-L] Deduplication "number of chunks waiting in queue" >>continues to rise? >> >>Wanda, >> >>In trying to troubleshoot an unrelated performance PMR, IBM provided me >>with an e-fix for the dedupdel bottleneck that it sounds like you're >>experiencing. They obviously will want to do their due diligence on >>whether or not this efix will help solve your problems, but it has >>proved very useful in my environment. They even had to compile a >>Solaris e-fix for me, 'cause it seems like I'm the only one running TSM >>on Solaris. The e-fix was very simple to install. >> >>What you don't want to do is go to 6.3.4.2, unless they tell you to >>because the e-fix is for that level (207). Don't run on 6.3.4.2 for >>even a minute. Only install it to get to the e-fix level. >> >>Dedupdel gets populated by anything that deletes data from the stgpool, >>i.e. move data, expire inv, delete filespace, move nodedata, etc. >> >>We run client-side dedupe (which works pretty well, except when you run >>into performance issues on the server) and so our identifies don't run >>very long, if at all. It might save you time to run client-side dedupe. >> >>BTW, when I finally got this efix and TSM was able to catch up with the >>deletes and reclaims as it needed to, I got some serious space >>back in my TDP Dedup pool. It went from 90% util to 60% util (with >>about 10TB of total capacity). What finally really got me before the >>fix was that I had to delete a bunch of old TDP MSSQL filespaces and it >>just took forever for TSM to catch up. 
I have a few deletes to do now, >>and I'm a bit wary because I don't want to hose my server again. >> >>I would escalate with IBM support and have them supply you the e-fix. >>6.3.4.3 I don't think is slated for release any time within the next >>few days, and you'll just be struggling to deal with the performance issue. >> >>HTH, >>Sergio >> >> >> >>On 12/19/13 11:35 PM, "Prather, Wanda" wrote:
Re: Deduplication "number of chunks waiting in queue" continues to rise?
Hi All, Is anyone using this script for reporting purposes? http://www-01.ibm.com/support/docview.wss?uid=swg21596944 -- Best regards / Cordialement / مع تحياتي Erwann SIMON - Original Mail - From: "Wanda Prather" To: ADSM-L@VM.MARIST.EDU Sent: Friday, 20 December 2013 05:35:38 Subject: [ADSM-L] Deduplication "number of chunks waiting in queue" continues to rise? TSM 6.3.4.00 on Win2K8. Perhaps some of you that have dealt with the dedup "chunking" problem can enlighten me. TSM/VE backs up to a dedup file pool, about 4 TB of changed blocks per day. I currently have more than 2 TB (yep, terabytes) of volumes in that file pool that will not reclaim. We were told by support that when you do SHOW DEDUPDELETEINFO, the "number of chunks waiting in queue" has to go to zero for those volumes to reclaim. (I know that there is a fix at 6.3.4.200 to improve the chunking process, but that has been APARed, and we are waiting on 6.3.4.300.) I have shut down IDENTIFY DUPLICATES and reclamation for this pool. There are no clients writing into the pool; we have redirected backups to a non-dedup pool for now to try and get this cleared up. There is no client-side dedup here, only server-side. I've also set deduprequiresbackup to NO for now, although I hate doing that, to make sure that doesn't interfere with the reclaim process. But SHOW DEDUPDELETEINFO shows that the "number of chunks waiting in queue" is *still* increasing. So, WHAT is putting stuff on that dedup delete queue? And how do I ever gain ground? W **Please note new office phone: Wanda Prather | Senior Technical Specialist | wanda.prat...@icfi.com | www.icfi.com ICF International | 443-718-4900 (o)
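[Archive note] Short of the full reporting script Erwann links, a per-pool dedup figure can be checked from the command line: QUERY STGPOOL F=D reports a "Duplicate Data Not Stored" line for dedup-enabled pools. A hedged sketch; the credentials and pool name are placeholders, not values from this thread.

```shell
# Pull the dedup savings line for one pool from the detailed
# stgpool query output.
dsmadmc -id=admin -password=secret "query stgpool DEDUPPOOL f=d" \
  | grep -i "Duplicate Data Not Stored"
```

For trend reporting, the same one-liner can be run from cron and appended to a log, which is roughly what the linked script automates across all pools.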
Re: Deduplication "number of chunks waiting in queue" continues to rise?
While we don't do deduplication (tests show we gain less than 25% from it), we also split our DB2 instances across multiple, physically-separate volumes. The one thing to note is that you have to dump and restore the database to spread existing data across those directories if you add them post-installation. On Fri, Dec 20, 2013 at 02:23:34PM -0500, Marouf, Nick wrote: > I can second that Sergio, > Backup stgpools to copy tapes is not pretty, and is an intensive process to > rehydrate all that data. > > The one extra thing I did was split the database across multiple folders for > parallel I/O to the database. That has worked out very well, and I currently > have it set up to span across 8 folders, with an XIV backend that can take a > beating. > > > -Original Message- > From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of > Sergio O. Fuentes > Sent: Friday, December 20, 2013 12:04 PM > To: ADSM-L@VM.MARIST.EDU > Subject: Re: [ADSM-L] Deduplication "number of chunks waiting in queue" > continues to rise? > > Client-side dedup and simultaneous-write to a copy pool are mutually > exclusive. You can't do both, which is the only theoretical way to enforce > deduprequiresbackup with client-side dedup. I suppose IBM could enhance TSM > to do a simultaneous-like operation with client-side dedup, but that's not > available now. So, I'm not sure how the TSM server enforces > deduprequiresbackup with client-side dedup. Ever since 6.1 I have always set > that to NO anyway. I have dealt with the repercussions of that as well. > Backup stgpool on dedup'd stgpools is not pretty. > > I have made some architectural changes to the underlying stgpools and the > 'backup stgpools' run pretty well, even with 1TB SATA drives. Two things I > think helped quite a bit: > > 1. Use big predefined volumes. My new volumes are 50GB. > 2. Use many filesystems for the devclass. I have 5 currently. I would use > more if I had the space. > > Thanks! 
> > Sergio > > > On 12/20/13 11:35 AM, "Prather, Wanda" wrote: > > >Woo hoo! > >That's great news. > >Will open a ticket and escalate. > > > >Also looking at client-side dedup, but I have to do some architectural > >planning, as all the data is coming from one client, the TSM VE data > >mover, which is a vm. > > > >Re client-side dedup, do you know if there is any cooperation between > >the client-side dedup and deduprequiresbackup on the server end? > >I have assumed that the client-side dedup would not offer that protection. > > > >W > > > >-Original Message- > >From: Sergio O. Fuentes [mailto:sfuen...@umd.edu] > >Sent: Friday, December 20, 2013 10:39 AM > >To: ADSM: Dist Stor Manager > >Cc: Prather, Wanda > >Subject: Re: [ADSM-L] Deduplication "number of chunks waiting in queue" > >continues to rise? > > > >Wanda, > > > >In trying to troubleshoot an unrelated performance PMR, IBM provided me > >with an e-fix for the dedupdel bottleneck that it sounds like you're > >experiencing. They obviously will want to do their due-diligence on > >whether or not this efix will help solve your problems, but it has > >proved very useful in my environment. They even had to compile a > >solaris e-fix for me, cause it seems like I'm the only one running TSM > >on Solaris. The e-fix was very simple to install. > > > >What you don't want to do is go to 6.3.4.2, unless they tell you to > >because the e-fix is for that level (207). Don't run on 6.3.4.2 for > >even a minute. Only install it to get to the e-fix level. > > > >Dedupdel gets populated by anything that deletes data from the stgpool, > >I.e. move data, expire inv, delete filespace, move nodedata, etc. > > > >We run client-side dedupe (which works pretty well, except when you run > >into performance issues on the server) and so our identifies don't run > >very long, if at all. It might save you time to run client-side dedupe. 
> > > >BTW, when I finally got this efix and TSM was able to catch up with the > >deletes and reclaims as it needed to, I got some serious space > >back in my TDP Dedup pool. It went from 90% util to 60% util (with > >about 10TB of total capacity). What finally really got me before the > >fix was that I had to delete a bunch of old TDP MSSQL filespaces and it > >just took forever for TSM to catch up. I have a few deletes to do now, > >and I'm a bit wary because I don't want to hose my server again. > > > >I would esca