Re: Freezing my Email backups and starting fresh
(Sorry for the old reply, I'm still catching up on email...)

On Wed, Jan 09, 2008 at 10:22:03AM -0600, Wanda Prather wrote:
> There are alternatives, including a backupset and an export tape. The
> problem I have with those: the data is on the tape, but the older backups
> will continue to expire out of the TSM DB. So, 6 months from now, you
> don't have any way to figure out what is ON the backupset or EXPORT tape.
> This way you can query the DB to see what is in there, if your lawyers
> need something.

I would use a backupset. TSM 5.4 allows you to generate a table of contents for that backupset that the client can query/load to do per-file browsing and restoring from the backupset. I'm not sure how well a backupset would work with application backup tools (if you're backing up with the Exchange TSM add-on), but in terms of taking an archive-like backup from pre-existing backups without changing current/future backups, backupsets seem like the way to go.

The TOC can be stored either on tape or to a FILE devclass, so it's always browsable even if the backupset media is offline. (Or a TOC can be recreated later from a pre-existing backupset, if the TOC is lost or didn't exist when the backupset was generated.)

Making the node/domain/copygroup changes you described has the benefit of storing all the active/inactive files for the node when you move it over (rather than just the files active at the point in time defined in the backupset). But that's a lot of configuration/policy/domain overhead for one client/one point in time.

There's no right answer; only what works best for your situation and comfort level. I like that TSM provides these alternatives.

Dave
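P.S. A rough sketch of what generating and later browsing such a backupset might look like from the admin command line. Command names are from my memory of the 5.4 manuals, and the node, set, devclass, and pool names are made up; check the HELP syntax before relying on any of it:

```
generate backupset MAILNODE legal_freeze * devclass=LTOCLASS -
  retention=nolimit toc=yes tocdestination=FILEPOOL -
  description="Frozen mail backups, Jan 2008"

query backupset MAILNODE
query backupsetcontents MAILNODE legal_freeze.12345

/* recreate a lost/missing TOC from the backupset itself */
generate backupsettoc MAILNODE legal_freeze.12345 tocdestination=FILEPOOL
```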
Re: Tracking Permission Changes
On Tue, Dec 11, 2007 at 09:43:29AM -0600, Shawn Drew wrote:
> I'm trying to figure out when the permission of certain files changed. I
> ran a few tests, and I'm not sure I like the behavior of TSM. Maybe I'm
> missing something.
> - I backed up a file a month ago.
> - The file hasn't changed for a month.
> - Today I changed the permission from 644 to 444.
> - I ran a dsmc i and it didn't back up the file again, just updated:
>   Total number of objects updated: 1
> - When I browse the files for restore, there was no indication of the
>   permission change.
> - When I restore the file with a point-in-time from before this update,
>   it still restores the file with the current permission!
> It doesn't seem this is how it should work, but I do understand that the
> metadata of the file is probably stored in the TSM DB and there is
> probably no facility to maintain metadata history. Does this seem
> strange to anyone?

Hi Shawn,

AFAIK, TSM doesn't keep a history of Unix file permissions. It's one of those gotcha cases where point-in-time restores from inactive versions aren't actually point-in-time (because they'll use whatever the active file's permissions are). It does seem strange.

It's also not consistent across file types. NTFS permissions are stored with the file, so you'll see an active/inactive increment when permissions are changed. The QuickFacts explains this as warranted because the attributes are vital to the files. I'm not sure why that's true for Windows and not Unix. (Other than the fact that my Unix admins would probably figure it out and be able to move on if they had to do a restore and then adjust permissions. My Windows admins would make a much bigger fuss about recreating that permissions structure.)

I'm not sure how other file/attribute combinations are handled (zfs, ext2/3 extended attributes, ufs, MacOS), but it would be nice if that were consistent and/or documented somewhere.

At first blush, I'd at least want that information backed up (I'm not sure ext2 extended attributes are backed up at all) and, better still, it would be great if they were versioned with the files.

Dave
Re: Scheduling a full archive of data once/month
On Mon, Oct 22, 2007 at 09:49:00AM -0500, Joni Moyer wrote:
> I have just received the task of scheduling an archive of all data from
> several Windows servers the first Saturday of every month for 7 years.
> What is the best method of accomplishing this? I would like to do this
> from the TSM server, which is an AIX 5.3 TSM 5.3.5.2 server. Any
> suggestions are greatly appreciated!

I'd probably do this as backupsets in 5.4. They're easy to generate from the server, work from data already on the server, have their own retention times separate from the incremental values, and are file-level browsable via a ToC that lives in file space (not TSM DB space) -- which means you could offload the ToCs to tape, or leave them on a disk/file stgpool for quick access.

Backupsets have worked well as a point-in-time archive on a few nodes of mine.

Dave
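P.S. For the "first Saturday of every month" part, a sketch of an administrative schedule using the enhanced schedule style added in 5.3 (the command names are real, but the node/devclass names and times are placeholders, and I'm going from memory on the enhanced-style parameters -- verify with HELP DEFINE SCHEDULE):

```
define schedule MONTHLY_ARCHIVE type=administrative active=yes -
  schedstyle=enhanced month=any weekofmonth=first dayofweek=saturday -
  starttime=20:00 -
  cmd="generate backupset WINSRV1 monthly_archive * devclass=LTOCLASS retention=2557"
```

The retention of 2557 days is roughly 7 years; you'd need one schedule (or a server script) per node.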
Re: Data Deduplication
On Thu, Aug 30, 2007 at 03:09:09AM -0400, Curtis Preston wrote:
> Unlike a de-dupe VTL that can be used with TSM, de-dupe backup software
> would replace TSM (or NBU, NW, etc.) where it's used. De-dupe backup
> software takes TSM's progressive incremental much farther, only backing
> up new blocks/fragments/pieces of data that have never been seen by the
> backup server. This makes de-dupe backup software really great at
> backing up remote offices.

We had Avamar out a few years ago pitching their solution, and we liked everything about it except the price. (And now that they're a part of EMC, I don't expect that price to drop much... *smirk*)

But since we're talking about software, there's an aspect of de-dupe that I don't think has been explicitly mentioned yet. Avamar said their software got 10-20% reduction on a backup of a stock Windows XP installation. A single system, say the first one you added to your backup group. That's not two users with the same email attachments saved, or identical files across two systems - that's hashing files within the OS itself (I presume from common headers in DLLs and such.)

So if you back up two identical stock XP installs, you get 10-20% reduction on the first one and nearly 100% on the second and beyond. Scale that up to hundreds of systems, and that's an incredible cost savings. Suddenly backing up entire systems doesn't seem so inefficient anymore.

Dave
Re: TSM vs. Legato Networker Comparison
Hi John,

Here's some perspective from someone who's currently transitioning from Networker to TSM.

My first blush is that I'm a little surprised EMC came in at such a dramatic discount: saving umpteen thousands of dollars is the main reason we switched from them to IBM. (Although I think the state IBM contracts give us an advantage.)

It's probably important to understand how Networker is licensed; the quote may not map to your environment, and that might explain some of the cost. EMC nickel-and-dimes you to death. You need individual client-count licenses for each system. You need a blanket license for each different operating system (Windows, Linux, Solaris, MacOS, etc.) You need the server license. You need a license per jukebox (varying costs depending on size.) You need a license for disk storage (varying costs depending on size.) NDMP? VSS? Clusters? Those are individual licenses too (per system, not blanket for all systems.) Also, investigate their maintenance costs... Ours were on the order of 10x what IBM offered.

This list might complain about the idiocy of CPU licensing (which I agree with, especially for a storage-centric product,) but it's night-and-day better than the a la carte menu Networker requires. It also means that when the next new gotta-have-it feature comes out, it too is probably not included in your software maintenance and will need a new license. We'd been using Networker since it was BudTool, but adding the software licensing for the advanced disk objects (B2D) and MacOS support (another thing we were adding,) on top of our yearly support, was enough to justify investigating other products and deciding to purchase TSM. So, if your costs seem too good to be true, they might just be.

Also, as you surmised, the transition is hard: getting up to speed on new backup software, learning its quirks, documenting it for administration and user-level docs, different reporting needs, etc. Dealing with the hardware juggling to support an old production server and a new server that moves from evaluation to semi-production to production is a challenge. I'm a year into it, and I don't foresee shutting down our Networker server in 2007; probably not entirely until next summer. (Of course, this is a problem that throwing money at can solve, but you're in this situation because you wanted to save money, right?)

In terms of functionality, both software packages will probably meet your objectives, but each introduces its own quirks in how you meet them. Networker's advantage is less per-file management overhead, so you can put more clients and more files on a single server. (The relatively low supportable file count per server is one of my big hovering concerns with TSM that didn't really exist with Networker.) Networker also allows multiplexing sessions to a single tape, so provided your network/disk pipe is big enough, I'd say it's easier to keep the tape drives streaming.

The disadvantage in switching to Networker from TSM is that a lot more media management is required. There's no reclamation, so once data is on a tape, you're locked in, and if you want that tape to recycle appropriately, you need to make sure the dependencies on it also cycle appropriately. That can be a pretty manual task, especially as clients go on/off the network. You'll also find the staging and cloning tools in Networker require much more work than in TSM. (Although I understand that's improving, most admins I know control this with their own home-brew scripts, which is questionable when off-site copies are critical to your backups.)

Other than that, the software's pretty much the same. It backs up and restores your stuff. It runs on almost anything (client and server.) Both companies have new upgrades that force new graphical admin tools on customers who don't like them. Navigating either product's support tools/websites can be menacing at times. Both have listservs with passionate, sharp, seasoned admins willing to help others. Both are exorbitantly expensive because, well, they can be.

I was a little disgusted with EMC when we decided to purchase TSM, for more reasons than I've listed here, so maybe I'm a little biased. (Contact me off-list if you'd like to know more.) I think it's important to test the waters with backup software and hardware every few years to find out what other products are doing, evaluate your costs against new pricing, etc., but I would caution you to really spend some time investigating what the new software will cost you in terms of support, functionality, daily maintenance time, transition time, etc., and decide if those umpteen thousands are worth it. Ask me in a year or two if they were in our environment. ;)

Dave

On Fri, Jul 27, 2007 at 01:06:10PM -0500, Schneider, John wrote:
> Kelly,
> Thank you for your post. There is no reason to say we are unhappy with
> TSM. Since I inherited this environment about a year ago, due to lots of
> hardware and software version upgrades,
backing up just a few directories?
I have a few systems I'd like to add to TSM but only back up a few directory hierarchies. These aren't always mount points (for example, the /etc directory under the / mount point.) Does TSM really not have a way for me to say "just back up X" without worrying about anything else on that mount?

I know I can do an

    exclude /.../*
    include /etc/.../*

but then I get all the directories backed up all over /. I could add exclude.dir statements for the larger hierarchies (/lib, /usr, etc.) but that seems awkward too. I feel like I'm going at this problem the wrong way, but I haven't found a right way. I tried putting a non-mount point in the domain line, but the client didn't like that.

Recommendations?

Thanks, Dave
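P.S. For the archives, the include/exclude-plus-exclude.dir approach I described would look roughly like this in the client options (paths are illustrative; note the client evaluates the list bottom-up, so the narrow include must sit below the broad exclude to win):

```
* dsm.sys / inclexcl fragment -- sketch only
exclude.dir /usr
exclude.dir /lib
exclude.dir /opt
exclude     /.../*
include     /etc/.../*
```

The exclude.dir lines stop traversal of the big trees entirely; the final exclude/include pair drops everything else under / while keeping /etc.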
example cloptset lists?
Hey gurus,

Do you apply server-based cloptsets to your clients, based on OS, to filter files you know are going to be a problem to back up and/or will never need to be restored? Can someone share theirs (or point me to an archive/resource of them? I looked but didn't see anything.)

I know it's typical to exclude OS and application cache and temp directories; I'd like to see some other lists to help me build my own. It doesn't look to me like there's a client-side config way to override an exclude in a server-based cloptset, so I'm interested in seeing what's in a good base exclude list. I'm looking for Windows, MacOS, and Linux, but for the sake of discussion any OS block lists would be interesting.

Thanks, Dave
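P.S. Not a list, but for anyone searching the archives later, the mechanics look roughly like this (the commands are standard; the set name and excluded paths are just placeholders to show the shape, and force=yes is what prevents a client-side override):

```
define cloptset WIN_BASE description="baseline Windows excludes"
define clientopt WIN_BASE inclexcl "exclude.dir '*:\temp'" force=yes
define clientopt WIN_BASE inclexcl "exclude '*:\pagefile.sys'" force=yes
update node SOMEPC cloptset=WIN_BASE
```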
MacOS client failing with rc 12 troubleshooting
Hello gurus,

I have an Intel MacOS client running 5.4.0.0 that's failing its scheduled noon-window backup with an RC 12 message, and I'm not sure why. I'm not sure what's a cause of what. From the dsmerror.log:

06/07/2007 12:44:47 ANS1228E Sending of object '/Library/Logs/tivoli/tsm/dsmsched.log' failed
06/07/2007 12:44:47 ANS4037E Object '/Library/Logs/tivoli/tsm/dsmsched.log' changed during processing. Object skipped.
06/07/2007 12:52:06 ANS1228E Sending of object '/private/var/tmp/folders.501/TemporaryItems/Acr1614669.tmp' failed
06/07/2007 12:52:06 ANS4005E Error processing '/private/var/tmp/folders.501/TemporaryItems/Acr1614669.tmp': file not found
06/07/2007 12:52:07 ANS1228E Sending of object '/private/var/tmp/folders.501/TemporaryItems/Acr1616363.tmp' failed
06/07/2007 12:52:07 ANS4005E Error processing '/private/var/tmp/folders.501/TemporaryItems/Acr1616363.tmp': file not found
06/07/2007 13:27:38 ANS1228E Sending of object '/Users/bob/Documents/Microsoft User Data/Office 2004 Identities/Main Identity/Database' failed
06/07/2007 13:27:38 ANS4037E Object '/Users/bob/Documents/Microsoft User Data/Office 2004 Identities/Main Identity/Database' changed during processing. Object skipped.
06/07/2007 13:39:41 ANS1228E Sending of object '/Users/bob/Library/Application Support/Firefox/Profiles/ap5cuske.default/cookies.txt' failed
06/07/2007 13:39:41 ANS4047E There is a read error on '/Users/bob/Library/Application Support/Firefox/Profiles/ap5cuske.default/cookies.txt'. The file is skipped.
06/07/2007 13:46:14 ANS1802E Incremental backup of '/' finished with 5 failure
06/07/2007 13:46:14 ANS1512E Scheduled event '1215PM' failed. Return code = 12.
06/07/2007 13:46:16 TCP/IP received rc 4 trying to accept connection from server.
06/07/2007 13:46:16 Error -50 accepting inbound connection

Questions: Other than the five files failing, did the backup finish correctly?
Is the ANS1512E error occurring because there were individual file failures, or because of the networking errors that follow the message? I couldn't find those errors on the TSM site or in the listserv archives, and I'm not sure whether they're a cause or an effect of the ANS1512E, or unrelated. The QuickFacts seem to imply it's just the failed files.

The serialization for the class is shared static, so I expect ANS4037E errors -- and other clients in that schedule/domain are probably getting them too, but aren't failing their backups. Is it the ANS4047E error that's tripping the ANS1512E?

Dave
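P.S. For what it's worth, several of the five failures are the client tripping over its own logs and over scratch files in a temp directory. A client option fragment that would at least silence those (paths copied from the error log above; whether that alone clears the RC 12 is exactly what I'm asking):

```
* dsm.sys / dsm.opt fragment -- sketch
exclude     /Library/Logs/tivoli/tsm/dsmsched.log
exclude     /Library/Logs/tivoli/tsm/dsmerror.log
exclude.dir /private/var/tmp
```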
Re: ext2 extended file attributes?
On Tue, May 29, 2007 at 03:51:49PM -0400, Mueller, Ken wrote:
> I ran into a similar situation where the TSM client (v5.2.3 at the time)
> wouldn't back up/restore the ext3 extended ACLs on RHEL3. The problem
> turned out to be that the TSM client is looking for libacl.so but
> couldn't find it. It was available as libacl.so.1 (which symlinked to
> libacl.so.1.1.0). By adding a symlink for /lib/libacl.so pointing to
> libacl.so.1, I was able to back up and restore files w/extended ACLs.

Thanks for the info, Ken, but it doesn't look like that's applicable in my case. (The symlink is already there.) It did point me to a few ICs and a comment in the documentation about RHEL4 needing certain packages, which are installed. I upgraded my client to 5.4.0.0 and I'm still not getting the ext3 ACLs restored.

Time to call IBM?

Dave
ext2 extended file attributes?
One of my users pointed out that ext2 has an extended attribute that tells dump not to back a file up. He was wondering if TSM honored that setting. I gave it a try (chattr +d test_file) and then did an incremental backup of the directory. TSM seemed to ignore the no_dump bit and backed up the file, and upon restore, didn't restore the ext2 attributes (they were all blank.)

Is this expected? I can understand, perhaps, not honoring no_dump (since TSM isn't dump,) but I'm a little concerned that extended file attributes for ext2/3 aren't backed up or restored. Is there a special flag or setting I need to define on the client for that behavior? A bug?

This is on a 5.3.3 client on a RedHat AS4 system.

Dave
lingering copypool copies?
Hello list,

Is there a best practice for moving nodes between domains, when the domains have different storage pools, copy pools, management classes, etc.? I noticed that when I changed nodes to the new domain, the new management classes and rebinding took effect (which is fine,) but any data stored in the previous storage pools still remains, along with its copypool copies.

I didn't really expect pre-existing data to magically migrate, and since I was taking the old domain's primary disk storage pool out of service anyway, I did a 'move data' for those old disk volumes to a different tape pool, and then removed the disk volumes. I expected that the next expire inventory would remove the copies for the files no longer in the stgpool they were copied from (or even because that stgpool no longer existed,) but that didn't happen. (I guess the copies are tied to the file without any regard for the stgpool they are/were in.)

The new domain doesn't have any copypools configured, and I'd like to free those copy tapes up. How do I do that?

Dave
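P.S. Assuming the copies really are never needed again, the bluntest approach I can see would be deleting the copypool volumes outright (standard commands; the pool and volume names are made up, and discarddata=yes destroys data, so query first):

```
query volume stgpool=OLD_COPYPOOL

delete volume CP0042 discarddata=yes   /* repeat per copypool volume */
delete stgpool OLD_COPYPOOL
```

I'd still like to hear whether there's a cleaner, policy-driven way.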
flooded with ANR8210E messages
Hello,

I've noticed that when a client either drops off the network or has a firewall installed that blocks server-prompted scheduled sessions, a message is logged every 4 minutes during the schedule window stating the server could not connect to the client:

Mon Nov 6 06:01:11 2006 ANR8210E Unable to establish TCP/IP session with IP address - connection request timed out.
Mon Nov 6 06:04:50 2006 ANR8210E Unable to establish TCP/IP session with IP address - connection request timed out.
Mon Nov 6 06:08:29 2006 ANR8210E Unable to establish TCP/IP session with IP address - connection request timed out.
Mon Nov 6 06:12:08 2006 ANR8210E Unable to establish TCP/IP session with IP address - connection request timed out.
Mon Nov 6 06:15:47 2006 ANR8210E Unable to establish TCP/IP session with IP address - connection request timed out.
Mon Nov 6 06:19:26 2006 ANR8210E Unable to establish TCP/IP session with IP address - connection request timed out.
Mon Nov 6 06:23:05 2006 ANR8210E Unable to establish TCP/IP session with IP address - connection request timed out.

I anticipate this could happen quite a bit for a number of clients (where I want them in a schedule in case they happen to be available, but if they aren't, I'm not going to worry too much.) Is there a way to adjust this level of logging? Say, 10 clients during a 12-hour schedule window spitting out log messages every 3-4 minutes is a lot of static in the logs and reports.

Ideas?

Dave
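P.S. I don't know of a server option that throttles this one message, so if it stays noisy, I may just collapse the repeats in the reporting scripts instead. A small Python sketch of that post-processing (my own code, not a TSM facility):

```python
import re

ANR8210 = re.compile(r"ANR8210E")

def squash(lines):
    """Collapse runs of ANR8210E messages into one summary line each."""
    out, pending = [], 0
    for line in lines:
        if ANR8210.search(line):
            pending += 1                  # buffer the repeat instead of emitting it
        else:
            if pending:
                out.append(f"ANR8210E repeated {pending}x (suppressed)")
                pending = 0
            out.append(line)
    if pending:                           # flush a trailing run
        out.append(f"ANR8210E repeated {pending}x (suppressed)")
    return out

log = ["ANR0406I Session started",
       "... ANR8210E Unable to establish TCP/IP session ...",
       "... ANR8210E Unable to establish TCP/IP session ...",
       "ANR2017I admin command"]
print("\n".join(squash(log)))
```

Splitting the count per client IP would be a small extension, since the address appears in the real message text.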
Re: Lots of newbie questions
On Fri, Aug 11, 2006 at 09:29:26AM -0400, Allen S. Rout wrote:
> Said from another direction, TSM doesn't worry about selecting only
> backed-up data from a stgpool when it's migrating, it takes whatever is
> there.

Thanks for the clarification.

> > Best practice-wise, do you try to define different private groups and
> > tape labels for onsite, archive and offsite storage? Or do
>
> Your recordkeeping problem is too complex to do by casual inspection,
> ... You will inevitably end up needing more Primary tapes when all
> you've got on hand will be labeled for Reverse thudpucker use or some
> such; Then you'll cross your own lines, and after the first time it gets
> easier and easier

I've been running purposely labeled tapes/pools for quite a while in the leveled-backup world of Networker. Jumping to a 'just let the system pick it' mentality is one of the hurdles (and appeals) of TSM. I fully understand your point on the media labels, and will learn to live with it. :)

> Having an extra drive outside a library isn't exactly a _bad_ idea, but
> unless you're trying to write a different format I'd expect you to get
> more value out of adding it to the library.

I don't think I can convert an external drive to run in the library. You make good arguments for having internal drives, and I don't disagree. But given that I have an external tape drive, it shouldn't be an either/or for me to use it or the library.

> If you really, really want to do this, I'd suggest:
> - Define all of the drives to be in the library. Set the one which is
>   physically outside to usually be offline.
> - When you want to use the external drive, set the interior drives
>   offline, the exterior online. Run the job, mount, dismount, etc.
> - When you're done, re-set to normal operations.

I suspect that by itself wouldn't work, because the SCSI library would want to do something to check in/mount the drive, and I wouldn't want to keep reconfiguring the library between automatic and manual. I'm thinking more along the lines of defining both a SCSI library and a manual library, and switching the devclass between them as needed. Even that is just a stopgap - it doesn't really make the drive usable unless I define an entirely new devclass and use it for relatively few things (like backupsets or DB backups to tape.) No one else has dealt with transitioning between manual and automatic libraries?

> Personally, I'd much prefer checkin and checkout of desired volumes to
> this. And get a quote on how much the next bigger step of library is,
> and count the amount of time you spend screwing around with inadequate
> library space. That way you can demonstrate to the folks who are
> hoarding the beans when they start losing money because they didn't
> cough up a library at the outset. TSM is tremendously economical with
> tape and drive resources compared to other backup products. Feed it
> well; feed it hardware instead of your time.

That point is muddied when a drive configuration that worked just fine with other backup products is either unusable or inadequate for TSM. I'm sticking with my current hardware/specs for this first year and have already warned the bean counters that odds are I'll need something near the start of year two -- be it more drives, more library space, more staging disk, beefier or multiple servers, etc.

> Now, if you were me, you'd try to develop a theory of copystg
> utilization workflow, and solve it for a minimum. But I suggest you just
> twirl the knob to the other end, and see if you like that tradeoff
> better.

Good point. I'll try some variations and see how it goes.

> You should consider data which has expired to be Gone, except for heroic
> measures. If you Really Need data which expired out of the database in
> the last week or so (a common period for retention of DB backups) then
> yes, you can do a complete DR scenario, and consult the as-yet unreused
> tape volumes for the data. Icky squicky.

Thanks, that's the definition I was looking for. As for Really Need, it depends who asks... :) It's usually an answer like: 'You messed this up last month, overwrote it every week and didn't notice until the first of this month, and have waited until NOW to ask me to get it back? No, it's gone.'

My backups will probably be augmented with regular archives (or backupsets) for the important systems, so the odds of 'losing' anything (at least for the people who Really Need it) should be pretty low. Any opinion on archives vs. backupsets for a monthly snapshot kept for 6 months?

Thanks for humoring me,

Dave
Lots of newbie questions
Hello,

Forgive the laundry list of questions, but here are a few things that, as a newbie, I don't quite understand. Each paragraph is a different question/topic, so feel free to chime in on just a few or any that you're comfortable answering. Thanks!

I'm using Operational Reporting with automatic notification turned on for failed or missed schedules. I have a node associated with a schedule that no longer exists (not powered on,) just to test failures and notifications. However, I never get notifications about failed or missed schedules from it (neither the email nor a mention in the daily report.) In the client schedules part of the report, it's always in a Pending status. At what point does Pending turn into Failed or Missed? How can I configure things so I get notifications about systems that missed their scheduled backup?

I'm using an administrative schedule to back up my DB to a FILE class twice a day, and then I do full backups of the DB to tape right before my offsite rotation. I read somewhere that since I'm using DRM, I shouldn't use 'del volhist' to remove old DB backups. However, I don't think the DRMDBBACKUPEXPIREDAYS setting is applying to my FILE backups. Is that normal? Should I be running both DRM expiration and 'del volhist'?

I do my backups to a DISK pool that has an 85/40 migration threshold and a tape next storage pool. If everything (my disk and tape pool) is synced up to my copypool before a backup runs, and the backup only goes to disk, the 'backup stg' for the tape pool has nothing to do. I understand that. If I back up the disk pool, manually migrate the data from disk to tape, and then back up the tape pool, it has nothing to do (since that data was already in the copypool.) I understand that too. But if, during a client backup, the disk pool starts an automatic migration, the next time I do a 'backup stg' for the tape pool, it has data to copy to the copypool. So, what's happening? Since the migration is going on, does TSM route data from the node directly to tape? (My maxsize parameter for the disk pool is No Limit, so I would guess not.) Or is TSM migrating from the disk pool the newest data, which wasn't yet copied to the copypool? In that case, why doesn't it migrate older data that's already copied? What's the selection criteria for what gets migrated? Or would best practice say to manually migrate the disk pool daily to minimize the chance of this condition?

Best practice-wise, do you try to define different private groups and tape labels for onsite, archive and offsite storage? Or do people really just make one large 'pool' of tapes and not care if tape 0004 has archive data on it and stays for years, 0005 goes offsite, 0006 has a two-week DB backup on it, and 0007 is a normal tape-pool tape? Since there's not one (standard) unified view for volumes (DB backups, offsite backups, checked-in scratch volumes, not-checked-in scratch volumes,) I worry a little about keeping track of things and 'losing something' if they're all in one group. How do sites handle that issue?

I have an LTO3 tape library and an external LTO3 drive. In our Networker environment, we found it a pretty good practice to have a drive outside of the jukebox for one-off operations (old restores, etc.) as well as some fault tolerance if the jukebox or that SCSI bus went south. How do I set up that environment in TSM? It looks like I cannot use the same device class across two libraries. Doesn't that hinder me if I want to use the external drive in the same way as the jukebox drives, sharing storage pools, etc.? My jukebox isn't very large and I anticipate having to use overflow storage pools, which is where being able to mix the manual library (external drive) and SCSI library would be nice.

Consolidating copypool tapes for offsite use: I had the reclamation threshold for my copypool at 40%. I used 'backup stg' with maxpr=[# of drives] to minimize the amount of time the backup took. However, it left me with many under-utilized offsite tapes, which, as soon as I moved them offsite, were then reclaimed (and then sat until the reusedelay expired.) That seems inefficient - I move a larger-than-necessary number of tapes each time I do an offsite rotation (right now, weekly) and as soon as they're offsite, they're reclaimed. To fix this, I put the reclamation threshold back to 100%, and set it down just before my offsite rotation. I've also taken a look at the to-be-offsited tapes and done some 'move data's as required to try to minimize the number of offsite tapes. Is that standard practice? I feel like I'm fighting the natural way TSM works, given that it makes so many other decisions just fine without my direct intervention. (And that's a compliment - I can't say that for Networker.) Is there something I'm missing to make offsite tape usage more streamlined?

So my offsite rotation procedure is starting to look like:
- expire inventory
- backup all the local storage
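An aside on the reclamation toggling described above: the two threshold changes can at least be put on administrative schedules instead of done by hand. A sketch (the commands are standard; the pool name and times are placeholders for a weekly Friday rotation):

```
define schedule DROP_RECLAIM type=administrative active=yes -
  dayofweek=friday starttime=04:00 -
  cmd="update stgpool COPYPOOL reclaim=40"

define schedule RAISE_RECLAIM type=administrative active=yes -
  dayofweek=friday starttime=09:00 -
  cmd="update stgpool COPYPOOL reclaim=100"
```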
Re: multiple instance recommendations
On Fri, May 19, 2006 at 10:05:38PM -0400, Allen S. Rout wrote:
> > I have questions about server sizing and scaling. I'm planning to
> > transition from Networker to TSM a client pool of around 300 clients,
> > with the last full backup being about 7TB and almost 200M files. The
> > TSM server platform will be RHEL Linux. I realize putting all of that
> > into one TSM database is going to make it large and unwieldy.
>
> You may be underestimating TSM; the size there is well in bounds for one
> server. The file count is a little high, but I'm not convinced it's
> excessive. The biggest server in my cluster has 70M files, in a 70% full
> DB of 67GB assigned capacity. So if that means 67GB ~= 100M files, you
> might be talking 130-150GB of database. I wouldn't do that on my SSA,
> but there's lots of folks in that size range on decent recent disk.

I don't have a lot of a priori knowledge of TSM, so any sizing recommendations I've seen have come from the Performance Guides, interviews with a few admins, and general consensus from web searches I've done. I agree that if the general concern is getting database operations done in a manageable timeframe, then as long as I can architect the DB disk well enough, it should scale up fairly well. Thoughts on that?

> Do you feel your file size profile is anomalous? My 70M file DB is
> ~23TB; that ratio is an order of magnitude off mine, and my big server
> is a pretty pedestrian mix of file and mail servers.

My numbers come from the Networker reports on the last full. Our domain is largely desktop systems with everything being backed up. Those large numbers of small files put our average system at 722k files and 29GB of storage (averaging 39k a file?) I'm trying to float towards just protecting user data (and not backing up C:\Windows 250 times, especially on systems where we would never do a bare-metal restore.) A compromise with management, who don't want to risk data loss from people putting things in inappropriate (and not backed up) places, would be to use different MCs to provide almost no versioning outside of defined user data spaces -- but that doesn't get around my high file count problem.

> I have heard of some dual-attach SCSI setups, but never actually seen
> one in the wild. If I were going to point at one upgrade to improve your
> life and future upgrade path, getting onto a shareable tape tech would
> be it. I have drunk the FC kool-ade. It's tasty, have some. :)

I don't disagree. Maybe I'll start looking at storage routers to share my SCSI drives over FC.

> > Tape access-wise, is there a hardware advantage putting multiple
> > instances on the same system?
>
> Yes, it solves your drive sharing problem. All the TSM instances would
> be looking at /dev/rmtXX. Your LM instance can do traffic direction to
> figure out who's got the drive, and they are all using the same bus,
> same attach.

Ah, that helps. I guess I could design one beefy system as a sole-TSM-instance box, and if things get too bogged down, split it into two (or three) TSM instances on that same hardware and not have to worry (as much) about device sharing because it's all local. If it got to the point where I'd want to split that across different hardware, I'd look at moving the drives to FC and sharing them that way. That makes sense to me.

> I like the 'beefy box' solution for all purposes except test instance.
> Make sure it's got plenty of memory. 6G? 8?

Can you clarify the memory needs? I was thinking 2G of RAM per instance; would I need more?

Dave
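P.S. Allen's DB extrapolation, written out as arithmetic (the assumption being that database size scales linearly with file count, which ignores versioning and per-node overhead):

```python
# Numbers quoted in this thread: 70M files in a 70%-full DB of 67GB capacity.
db_capacity_gb = 67
pct_full = 0.70
files_now = 70e6

bytes_per_file = db_capacity_gb * pct_full * 1e9 / files_now  # DB cost per file
files_planned = 200e6                                         # my environment
db_needed_gb = files_planned * bytes_per_file / 1e9

print(round(bytes_per_file), round(db_needed_gb))  # 670 bytes/file, 134 GB
```

That 134GB lands inside Allen's "130-150GB" estimate; version retention settings would push it up from there.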
multiple instance recommendations
Hello, I have questions about server sizing and scaling. I'm planning to transition from Networker to TSM a client pool of around 300 clients, with the last full backup being about 7TB and almost 200M files. The TSM server platform will be RHEL Linux.

I realize putting all of that into one TSM database is going to make it large and unwieldy. I'm just not sure how best to partition it in my environment and use the resources available to me. (Or what resources to ask for, if the addition of X will make a much easier TSM configuration.) For database and storage pools, I will have a multiple-TB SAN allocation I can divide between instances.

I have one 60-slot HP MSL6060 library (SCSI), with two LTO-3 SCSI drives. There is also an external SCSI LTO-3 drive. My understanding of a shared SCSI library is that the library itself is SCSI-attached to a server, but drive allocation is done via SAN connections or via SCSI drives that are directly attached to the different instances. (Meaning the directly attached SCSI drives are not sharable.) Is that true, at least as far as shared libraries go? The data doesn't actually go through the library master to a directly connected drive, does it? If not, and I still wanted to use sharing, I could give each instance a dedicated drive - but since two drives seems like the minimum for TSM tape operations, I don't really think it's wise to split them. (However, if the 'best' solution would be to add two more drives to max out the library, I can look into doing that.)

If the drives need to be defined on just one server, it looks like server-to-server device classes and virtual volumes are the only solution. I don't really like the complexity of one instance storing another's copy pools inside of an archive pool just to use tape, but it looks like things are heading that way. Other than the obvious hardware cost savings, I don't really see the advantage of multiple instances on the same hardware.
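For reference, the server-to-server / virtual-volume route mentioned above looks roughly like this. This is a sketch, not a recommendation: server names, the password, addresses, and pool names are made up, and only the command shapes (REGISTER NODE TYPE=SERVER, DEFINE SERVER, DEVTYPE=SERVER device class) are from the standard admin commands:

```
/* On the target (library-owning) server: the source server shows up as a node */
REGISTER NODE tsm2 secretpw TYPE=SERVER

/* On the source server: define the target, then a SERVER-type devclass on it */
DEFINE SERVER tsm1 SERVERPASSWORD=secretpw HLADDRESS=tsm1.example.com LLADDRESS=1500
DEFINE DEVCLASS remote DEVTYPE=SERVER SERVERNAME=tsm1 MAXCAPACITY=50G
DEFINE STGPOOL offsitecopy remote POOLTYPE=COPY MAXSCRATCH=100
```

The copy pool's volumes then exist as archive objects on the target server, which is exactly the "copy pools inside of an archive pool" complexity described above.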
(I haven't decided yet if we would use one beefy server or two medium servers.) If you load up multiple instances on the same server, do you give them different IP interfaces to make distinguishing between them in client configs and administration tools easier? Tape access-wise, is there a hardware advantage putting multiple instances on the same system? Any recommendations on any of this? Your help is appreciated. Dave
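One common alternative to separate IP interfaces is giving each co-hosted instance its own TCP port. A sketch, with arbitrary ports and stanza names (TCPPORT in dsmserv.opt and the dsm.sys server-stanza options are standard, the rest is illustrative):

```
* dsmserv.opt for instance 1:        * dsmserv.opt for instance 2:
TCPPORT 1500                          TCPPORT 1502

* client dsm.sys: one stanza per instance, selected with -servername=
SErvername        tsm_inst1
  TCPServeraddress  tsmhost.example.com
  TCPPort           1500

SErvername        tsm_inst2
  TCPServeraddress  tsmhost.example.com
  TCPPort           1502
```

Separate IP aliases work too and keep client configs port-free; ports are just less to ask of the network folks.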
Re: Multiple storage pools?
On Thu, May 11, 2006 at 09:47:08PM +0200, Kurt Beyers wrote: The two copy pools behave completely independently. Each command backup stgpool primary copy will compare the files in both storage pools and copy the data that are in the primary pool and not yet in the copy pool.

I presume that same 1-to-1 primary/copy mapping is true for a file that gets migrated between primary pools that both back up to the same copy pool? For example, a disk and a tape primary pool and a tape copy pool. The file is created on the disk pool, and copied to the tapecopy pool. Later, that file is migrated from the disk pool to the tape pool. When the tape pool is next backed up to the tapecopy pool, does the file transfer again? Or does TSM know there's already a copy of that file, migrated from another primary pool?

I believe from my evaluation that the copying is happening twice, but I'm not sure. My assumption, in that case, is that the disk pool copy is expired at that point, so even if the copying happens twice, the actual post-reclamation copy pool storage is singular. I'm new and still getting used to the concepts, so a confirmation/correction would be appreciated. Thanks!

Dave
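The scenario in question, as an admin command sequence (pool names are hypothetical; BACKUP STGPOOL and MIGRATE STGPOOL are standard commands, and MIGRATE assumes diskpool's NEXTSTGPOOL points at tapepool):

```
BACKUP STGPOOL diskpool tapecopy    /* file lands in the copy pool once */
MIGRATE STGPOOL diskpool LOWMIG=0   /* file moves to diskpool's NEXTSTGPOOL (tapepool) */
BACKUP STGPOOL tapepool tapecopy    /* the question: is the migrated file sent again? */
```

One way to check empirically rather than guess: run the sequence against a test file and compare QUERY CONTENT output (or a SELECT against the CONTENTS table) on the tapecopy volumes before and after the second BACKUP STGPOOL.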
GENERATE BACKUPSET fails
Hello all, I'm having problems generating backupsets, and I'm new to TSM so I'm not sure how to troubleshoot this. I'm running a Linux server, TSM version 5.3.2. Here's what I see:

Mon May 1 14:02:25 2006 ANR2017I Administrator MUSSULMA issued command: GENERATE BACKUPSET endeavour.cs.uiuc.edu 2006-05 /var devc=dltclass
Mon May 1 14:02:25 2006 ANR0984I Process 92 for GENERATE BACKUPSET started in the BACKGROUND at 02:02:25 PM.
Mon May 1 14:02:25 2006 ANR3500I Backup set for node ENDEAVOUR.CS.UIUC.EDU as 2006-05.1099170 being generated.
Mon May 1 14:03:40 2006 ANR8337I DLT volume TSM010 mounted in drive DRIVE2 (/dev/tsmscsi/mt2).
Mon May 1 14:03:40 2006 ANR0513I Process 92 opened output volume TSM010.
Mon May 1 14:04:58 2006 ANR1360I Output volume TSM010 opened (sequence number 1).
Mon May 1 14:05:16 2006 ANRD bfgenset.c(3614): ThreadId38 Unknown result code (17) from bfRtrv.
Mon May 1 14:05:16 2006 ANR3520E GENERATE BACKUPSET: Internal error encountered in accessing data storage.
Mon May 1 14:05:16 2006 ANR3503E Generation of backup set for ENDEAVOUR.CS.UIUC.EDU as 2006-05.1099170 failed.
Mon May 1 14:05:18 2006 ANR1361I Output volume TSM010 closed.
Mon May 1 14:05:18 2006 ANR0515I Process 92 closed volume TSM010.
Mon May 1 14:05:22 2006 ANR0987I Process 92 for GENERATE BACKUPSET running in the BACKGROUND processed 349 items with a completion state of FAILURE at 02:05:22 PM.

It fails the same way if I try a different filespace on the client, or a different client, or write to a different devclass. Restores from that filespace seem to work okay. I tried some web searches but came up cold. Any advice?

Dave
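One possible first step, not from the thread but from standard TSM troubleshooting commands: since ANR3520E points at the server failing to read stored data, verify the volumes holding that node's data before suspecting the backupset code itself. A sketch (the volume name is a placeholder; check the syntax against your 5.3 Administrator's Reference):

```
/* Which primary volumes hold this node's data? */
SELECT DISTINCT VOLUME_NAME FROM VOLUMEUSAGE WHERE NODE_NAME='ENDEAVOUR.CS.UIUC.EDU'

/* Read-only consistency check of each such volume */
AUDIT VOLUME <volume_name> FIX=NO

/* Re-examine everything the failing process logged */
QUERY ACTLOG BEGINTIME=14:02 ENDTIME=14:06 SEARCH=92
```

If AUDIT reports damaged files, that would explain "Unknown result code (17) from bfRtrv" even though ordinary restores of other files in the filespace still work.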