Chris, yours is easy to answer: you have a performance problem, not a TSM problem. For VM backup I have a dedicated LPAR in a S842: 3 vCPU, 128GB RAM, SR-IOV 10Gb. Storage is a V7000, SSD/SAS. There has never been a backup or restore problem performance-wise; everything runs at wire speed. The only problem we had was client-side dedup: IT28096 hit us badly, with 5 days of audit for each TSM instance, over and over again, until the cause was identified.
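
To put "wire speed" into numbers, here is a back-of-the-envelope sketch using the figures from Chris's message quoted further down (4.5TB restored in about 17 hours, roughly 12TB replicated in 24 hours). It assumes decimal units and uses a 10 Gbit/s link purely as a yardstick; the thread does not state the link speed at Chris's site, and the function name is only illustrative.

```python
# Rough throughput arithmetic for the figures quoted in this thread.
# Assumptions (not from the thread): decimal units (1 TB = 1,000,000 MB)
# and a 10 Gbit/s link with no protocol overhead as the reference.

def mb_per_second(terabytes: float, hours: float) -> float:
    """Average throughput in MB/s when moving `terabytes` in `hours`."""
    return terabytes * 1_000_000 / (hours * 3600)

WIRE_SPEED_MB_S = 10_000 / 8  # 10 Gbit/s expressed in MB/s

restore = mb_per_second(4.5, 17)      # SQL restore: 4.5 TB in 17 hours
replication = mb_per_second(12, 24)   # replication: 12 TB in 24 hours

for label, rate in (("restore", restore), ("replication", replication)):
    share = 100 * rate / WIRE_SPEED_MB_S
    print(f"{label}: {rate:.1f} MB/s ({share:.1f}% of a 10 Gbit/s link)")
```

Under these assumptions the restore averages about 73.5 MB/s and the replication about 138.9 MB/s, both a small fraction of what a 10 Gbit/s link could carry.
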
For a start: keep the hdisks for the DB small. Better 2 x 50GB than 1 x 100GB. chfs -e x and reorgvg are your friends. The mount option "rbrw,noatime" is of great use for DB, log, archive log and storage pool filesystems. Check your queue_depth and set it to the maximum. As for SQL: use as many streams as the client can handle. The CPU is there to handle load, not to waste energy by idling.

--
Michael Prix

On Fri, 2019-07-19 at 13:02 +0000, Kizzire, Chris wrote:
> Alas... we are not the only ones.... I knew it...
> We went from TSM 6.3 to SP 8.1.4.0. Performance is 5 times slower in the new
> environment overall. We use container pools, dedup, and compression. We use
> SP for VE for most VMs, and BAclient and SQL for physical machines and VMs
> with SQL.
> It took about 17 hours to restore a 4.5TB SQL DB the other day.
>
> Main server:
>   IBM Power 750 running AIX 7.2
>   3.1TB DB is on SSD
>   128GB RAM
> Server at DR site:
>   IBM Power 770 running AIX 7.2
>   3.1TB DB NOT on SSD
>   100ish GB RAM
> Container pool is on a 1.5-year-old IBM V5030, with an identical V5030 at
> the DR site.
>
> IBM says to open a performance PMR for AIX, which we have yet to do.
> Protect Stgpool runs for days and we have to cancel it because we get too
> far behind on replication. If we are lucky we can replicate 12TB in a
> 24-hour period with 100 sessions (maxsessions=50).
>
> Chris Kizzire
> Backup Administrator (Network Engineer II)
> BROOKWOOD BAPTIST HEALTH
> Information Systems
>
> -----Original Message-----
> From: ADSM: Dist Stor Manager <ADSM-L@VM.MARIST.EDU> On Behalf Of Michael Prix
> Sent: Friday, July 19, 2019 5:11 AM
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: [ADSM-L] deletion performance of large deduplicated files
>
> Hello Eric,
>
> welcome to my nightmares. Take a seat, wanna have a drink?
>
> I had the pleasure of performance and data corruption PMRs during the last
> two years with TDP Oracle. Yes, at first the customer got blamed for not
> adhering completely to the blueprints, but after some weeks it boiled down
> to ... silence.
> The data corruption was caused by what ended up as IT28096, now fixed.
> Performance is interesting, but resembles what you have written. We work
> with MAXPIECESIZE settings on RMAN to keep the backup pieces small and got
> some interesting values, pending further observation, but we might be on a
> promising path. I'm talking about database sizes of 50TB here, warehouse
> style.
> In between we moved the big DBs to a dedicated server to prove that the
> performance drop is because of the big DBs, and the remaining "small" DBs
> (500MB up to 5TB in size) didn't put any measurable stress on the TSM DB in
> terms of expiration and protect stgpool. Even the big DBs on their dedicated
> server performed better in terms of expiration and protect stgpool, which
> might have been a coincidence of these DBs holding nearly the same data and
> having the same retention period.
>
> What I can't observe is slowness of the DB. Queries are answered in the
> normal time, depending on the query. A count(*) from backupobjects
> naturally takes some time, considerably longer when you use dedup, but the
> daily queries are answered in the "normal" timeframe.
>
> What helped immediately was some tuning:
> - More LUNs and filesystems for the TSM DB
> - Smaller disks, but more of them, for each filesystem.
>   Changing the disks from 100GB to 2 x 50GB for each DB filesystem got me a
>   performance boost of 200% in expiration and backup db. Unbelievable, but
>   true.
>   Yes, I'm using SSD. And SVC. And multiple storage systems. Performance
>   isn't the problem; we are measuring 2ms response time for write AND read.
> - stripeset for each fileset
>
> --
> Michael Prix
>
> On Fri, 2019-07-19 at 07:29 +0000, Loon, Eric van (ITOP NS) - KLM wrote:
> > Hi TSM/SP-ers,
> >
> > We have been struggling with the performance of our TSM servers for
> > months now. We have been running several servers with hardware (Data
> > Domain) dedup for years without any problems, but on our new servers with
> > directory container pools performance is really, really bad.
> > The servers and storage are designed according to the Blueprints and they
> > work fine as long as you do not add large database (Oracle and SAP)
> > clients to them. As soon as you do, the overall server performance becomes
> > very bad: client and admin session initiation takes 20 to 40 seconds, SQL
> > queries run for minutes where they should take a few seconds, and a query
> > stgpool sometimes takes more than a minute to respond!
> > I have two cases open for this. In one case we focused a lot on the OS and
> > disk performance, but during that process I noticed that the performance
> > problem is most likely caused by the way TSM processes large (few hundred
> > MB) files. I performed a large number of tests and came to the conclusion
> > that it takes TSM a huge amount of time to delete large deduplicated
> > files, both in container pools and in deduplicated file pools. As a test I
> > use a TDP for Oracle client which uses a backup piece size of 900 MB. The
> > client contains about 5000 files. Deleting the files from a container pool
> > takes more than an hour. When you run a delete object for the files
> > individually, I see that most files take more than a second(!) to delete.
> > If I put that same data in a non-deduplicated file pool, a delete
> > filespace takes about 15 seconds...
> > The main issue is that the TDP clients are doing the exact same thing: as
> > soon as a backup file is no longer needed, it's removed from the RMAN
> > catalog and deleted from TSM.
> > Since we have several huge database clients (multiple TBs each), these
> > Oracle delete jobs tend to run for hours. These delete jobs also seem to
> > slow each other down: when several of them run at the same time, they
> > become even slower.
> > At this point I have one server where these jobs are running 24 hours per
> > day! This server is at the moment the worst-performing TSM server I have
> > ever seen. On the other container pool servers I was able to move the
> > Oracle and SAP servers away to the old servers (the ones with the Data
> > Domain), but on this one I can't, for Data Domain capacity reasons.
> > For this file deletion performance I also have a case open, but there is
> > absolutely no progress. I proved to IBM how bad the performance is and I
> > even offered them a copy of our database so they can see for themselves,
> > but only silence from development...
> > One thing I do not understand: I find it very hard to believe that we are
> > the only ones suffering from this issue. There must be dozens of TSM users
> > out there who back up large databases to TSM container pools?
> >
> > Kind regards,
> > Eric van Loon
> > Air France/KLM Storage & Backup
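
As a rough sanity check on the deletion figures in Eric's message (about 5000 files, more than a second per file in a container pool, about 15 seconds for the whole filespace in a non-deduplicated file pool), the per-file arithmetic can be sketched as follows. Treating "more than a second" as exactly one second is an assumption that makes the container-pool numbers lower bounds; the variable names are illustrative only.

```python
# Per-file deletion arithmetic based on the figures in Eric's message.
FILES = 5000  # approximate number of backup pieces held by the test client

# Container pool: "more than a second" per file, so 1.0 s is a lower bound.
CONTAINER_SECONDS_PER_FILE = 1.0
container_total_min = FILES * CONTAINER_SECONDS_PER_FILE / 60

# Non-deduplicated file pool: the whole filespace took about 15 seconds.
plain_per_file_ms = 15 / FILES * 1000

# How much slower a single delete is in the container pool (lower bound).
slowdown = CONTAINER_SECONDS_PER_FILE * 1000 / plain_per_file_ms

print(f"container pool: at least {container_total_min:.0f} minutes in total")
print(f"non-dedup pool: about {plain_per_file_ms:.0f} ms per file")
print(f"per-file slowdown: at least {slowdown:.0f}x")
```

At one second per file, 5000 deletes already take about 83 minutes, which matches the reported "more than an hour", while the non-deduplicated pool works out to roughly 3 ms per file.
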