Hmmm, an issue we all deal with to varying degrees...

Answer #1: Yes, your expire inventory will take longer. TSM scans through all the files looking for candidates, and performance is typically measured as "X thousand files examined in Y seconds": the more files, the more seconds. There is no way around the math.
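To put rough numbers on that, here is a minimal sketch in Python. It assumes expiration run time scales linearly with the number of objects examined, which is a simplification and not something TSM reports. The 18.6 million objects and the 70-minute run time come from the quoted message below; the function name is made up for illustration. Whether each storage pool copy counts as a separate object during expiration is also an assumption, so both cases are shown.

```python
# Figures taken from the quoted message below.
CURRENT_OBJECTS = 18_600_000   # objects currently tracked by the server
CURRENT_EXPIRE_MINUTES = 70    # approximate daily expiration run time

# Planned PACS load: 1.5M folders + 1.0M files.
PACS_ENTRIES = 1_500_000 + 1_000_000
# Upper bound (assumption): every copy in each of 3 pools is examined.
PACS_ENTRIES_ALL_POOLS = PACS_ENTRIES * 3


def estimate_expire_minutes(total_objects: int) -> float:
    """Estimate run time, assuming it grows linearly with object count."""
    objects_per_minute = CURRENT_OBJECTS / CURRENT_EXPIRE_MINUTES
    return total_objects / objects_per_minute


if __name__ == "__main__":
    low = estimate_expire_minutes(CURRENT_OBJECTS + PACS_ENTRIES)
    high = estimate_expire_minutes(CURRENT_OBJECTS + PACS_ENTRIES_ALL_POOLS)
    print(f"Projected expiration run: about {low:.0f} to {high:.0f} minutes")
```

Under those assumptions the daily run would grow from roughly 70 minutes to somewhere between about 79 and 98 minutes, before counting yearly growth.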
Answer #2, 3 & 4: If the data is fairly static once it's written and will be on an NT host, it sounds like the journaling option for backups would be the best solution to improve backup performance. IMHO.

I have a few monster hosts as you do, with millions of files and even more directories. They reside on Unix hosts, so journaling backups are not available. On some, the option of "multiple TSM instances" works well, as the data is on separate mount points. You can create a TSM instance for each mount point and have multiple backups running on the host at the same time. The downside is that each will compete for CPU, memory, and I/O, so performance may suffer.

An additional issue you will eventually encounter: while you are concerned with backing up the data, my concern/problem has always been with restoring the data in a timely manner. The more files in a filesystem, the longer it takes to do restores on it. I won't bore you with the details, but you may want to look in the ADSM archives for 2001 at http://search.adsm.org/ for a discussion called "Performance Large Files vs. Small Files" that went on about the type of data you have (static, small files, long term).

Thanks,
Ben

-----Original Message-----
From: Todd Lundstedt [mailto:[EMAIL PROTECTED]
Sent: Tuesday, March 04, 2003 11:38 AM
To: [EMAIL PROTECTED]
Subject: Backup Strategy and Performance using a PACS Digital Imaging System

Heya *SMers,

This is a long one, grab a beverage.

I am running TSM 4.2.1.7 on AIX 4.3.3 on a dual processor B80, backing up around 112 nodes with 5.2TB capacity (2.3TB used), storing 18.6 million objects (3.2TB) in the primary and copy storage pools. The TSM database is 8.7GB, running around 80% utilization. All but a few of the nodes are server class machines. We back up about 250GB a night, and on weekends we do full TDP SQL database backups to the tune of an additional 450GB (and growing). Expiration processing occurs daily, and completes in approximately 70 minutes.

The four Fuji PACS servers we have are included in the above numbers, but only the application and OS, not the images and clinical notes (text files of less than 1k). FYI: where TSM and disk management are concerned, Fuji is the DEVIL! Each image, and each 1k note file with text referencing an image, is stored in its own directory: image_directory/imagefile and text_directory/textfile, a one-to-one relationship. Backing up the directories/text files now takes over 12 hours to complete, while incrementally backing up very little. The backup has to scan the files to see what needs to be backed up (this is not TSM yet, but some other backup software).

The powers that be are asking what it would take to move all of the data stored on DVDs in the DVD jukebox (images) to disk-based magnetic storage, and then start backing all of that up to TSM. I have some numbers from the PACS administrator. On the four PACS servers, the additional data they would like TSM to back up tallies up to:

1.5+ million folders
1.0+ million files (yes... more folders than files...)
2.2+ TB storage (images and text files)

None of this data will change. Once it backs up, it will very likely never need to be backed up again. Because of that, I am recommending three tape storage pools at a minimum: one primary, one on-site copy, and one off-site copy. I would actually like to have two off-site copy storage pools. Since this data doesn't change, and no additional backups will occur for the files, there will be no need for reclamation. The extra copy storage pools are a safety net in case we have bad media spots/tapes. Without reclamation, we will never know if we have bad media.

So, at a minimum, 3 storage pools containing a total of 7.5+ million objects ((directories+files)*3) will use up 4.3GB of a TSM database (7.5 million * 600 bytes). The growth per year is being estimated at about 4+ GB of TSM database, so approximately another 2.3+ million files/folders each year.
It will very likely be more. (Daily estimates are 6500 additional files/folders.) Keep in mind: this data will NOT be changed or deleted in the foreseeable future. New data is incoming daily. NO data expires. I don't know if Fuji will ever change the way they store their images/text files.

So, here is what I am trying to figure out.

1. Will adding the additional objects from the PACS servers significantly increase my expiration processing run time? Will TSM have to scan all of those database objects during expiration processing?
2. I have heard it is possible to run another instance of TSM server on the same machine. Would that be a good idea here? It makes sense to this novice user. I wouldn't have to run expiration processing daily on the PACS TSM instance.
3. If a second TSM server instance is the recommended course, how difficult is that to set up? Any redbooks or how-tos out there regarding that? What issues do I have with sharing my LTO library between the two TSM server instances? Any redbooks or how-tos on sharing a single 3584 library (five LTO drives, and hoping to get more out of this project) between two TSM server instances on the same machine?
4. Regardless of how many TSM server instances, journaling will have to be set up on each of the NT4.0 PACS servers. What kind of overhead can we expect from running journaling on the NT servers (I haven't set up journaling anywhere, yet)? Three of the servers each have about 400-500K image objects, and the fourth server (the one with all of the <1k text files) has close to 1 million image/text objects (none of that includes the OS or application files/databases, just image/text report files).
5. Since the directory and file objects will likely not change, would there be a pressing need to use a DIRMC (non-migrating)? I would suppose not.
6. Is there a better way to do this? (Probably should have asked that question first.) <:+)

Thanks in advance.

Todd Lundstedt
Technical Specialist
Via Christi Information Management Services
ofc. (316) 261-8385
fax. (316) 660-0036
[EMAIL PROTECTED]
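
The database sizing estimate in Todd's message can be reproduced with a short calculation. This is only a sketch under the message's own assumptions: 600 bytes of database space per object per storage pool, three storage pools, and 6500 new files/folders per day. The 365-day year and the function name are assumptions added for illustration.

```python
BYTES_PER_OBJECT = 600   # per-object database cost used in the message
STORAGE_POOLS = 3        # one primary, one on-site copy, one off-site copy


def db_bytes(objects: int, pools: int = STORAGE_POOLS) -> int:
    """Database space in bytes for `objects` entries kept in `pools` pools."""
    return objects * pools * BYTES_PER_OBJECT


# Initial load: 1.5M folders + 1.0M files.
initial_objects = 1_500_000 + 1_000_000
# Yearly growth: 6500 new files/folders per day, assuming 365 days a year.
yearly_objects = 6_500 * 365

if __name__ == "__main__":
    print(f"Initial load:  {db_bytes(initial_objects) / 1e9:.2f} GB")
    print(f"Yearly growth: {yearly_objects:,} objects, "
          f"{db_bytes(yearly_objects) / 1e9:.2f} GB")
```

With decimal gigabytes this gives 4.50 GB for the initial load, in the same range as the 4.3GB quoted (the exact figure depends on how a gigabyte is counted), and about 2.37 million objects or 4.27 GB per year, in line with the 2.3+ million and 4+ GB estimates.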