Re: Freezing my Email backups and starting fresh

2008-01-16 Thread Dave Mussulman
(Sorry for the old reply, I'm still catching up on email...)

On Wed, Jan 09, 2008 at 10:22:03AM -0600, Wanda Prather wrote:
 There are alternatives, including a backupset and an export tape.  The
 problem I have with those:  the data is on the tape, but the older backups
 will continue to expire out of the TSM DB.  So, 6 months from now, you don't
 have any way to figure out what is ON the backupset or EXPORT tape.  This way
 you can query the DB to see what is in there, if your lawyers need
 something.

I would use a backupset.  TSM 5.4 lets you generate a table of
contents (TOC) for the backupset that the client can query and load for
per-file browsing and restores from the set.  I'm not sure
how well the backupset would work with application backup tools (if
you're backing up with the Exchange TSM addon), but in terms of taking
an archive-like backup from pre-existing backups without changing
current/future backups, backupsets seem like the way to go.  The TOC can
be stored either on tape or in a FILE devclass so it's always browsable
even if the backupset media is offline.  (Or a TOC can be recreated
later from a pre-existing backupset, if the TOC is lost or didn't exist
when the backupset was taken.)
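For the email-freeze case, the commands would look something like this
(syntax from memory against 5.4, and the node/devclass/pool names are
placeholders -- check the Administrator's Reference before running):

  generate backupset MAILNODE legal_freeze devclass=ltoclass -
    retention=nolimit toc=yes tocdestination=diskpool -
    description="frozen email backups, Jan 2008"
  query backupset MAILNODE            (note the generated suffix)
  query backupsetcontents MAILNODE legal_freeze.12345

and if the TOC is lost, I believe 'generate backupsettoc' can rebuild
it later from the existing set.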

Making the node/domain/copygroup changes you described has the benefit
of storing all the active/inactive files for the node when you move it
over (rather than just the files active at the point-in-time defined in
the backupset.)  But that's a lot of configuration/policy/domain
overhead for one client/one point in time.

There's no right answer; only what works best for your situation and
comfort level.  I like that TSM provides these alternatives.

Dave


Re: Tracking Permission Changes

2007-12-12 Thread Dave Mussulman
On Tue, Dec 11, 2007 at 09:43:29AM -0600, Shawn Drew wrote:
 I'm trying to figure out when the permission of certain files changed.   I
 ran a few tests, and I'm not sure I like the behavior of TSM.  Maybe I'm
 missing something.

 - I backed up a file a month ago.
 - The file hasn't changed for a month
 - Today I changed the permission from 644 to 444.
 - I ran a dsmc i and it didn't back up the file again, just updated:
 Total number of objects updated:   1
 - When I browse the files for restore, there was no indication of the
 permission change
 - When I restore the file with a point-in-time from before this update, it
 still restores the file with the current permission!

 It doesn't seem this is how it should work, but I do understand that the
 metadata of the file is probably stored in the TSM DB and there is
 probably no facility to maintain metadata history.  Does this seem
 strange to anyone?

Hi Shawn,

AFAIK, TSM doesn't keep a version history of Unix file permissions.  It's
one of those gotcha cases where point-in-time restores from inactive
versions aren't actually point-in-time (because they'll use whatever the
active file's permissions are.)  It does seem strange.
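Easy to see for yourself (paths hypothetical):

  touch /tmp/permtest; chmod 644 /tmp/permtest
  dsmc incremental /tmp/permtest
  chmod 444 /tmp/permtest
  dsmc incremental /tmp/permtest
      (reports 'updated', not 'backed up')
  dsmc query backup /tmp/permtest -inactive
      (still shows only the one version)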

It's also not consistent with different file types.  NTFS permissions
are stored with the file, and so you'll see an active/inactive increment
when permissions are changed.  The QuickFacts explains that the extra
version is warranted because the attributes are vital to the file.  I'm
not sure why
that's true for Windows and not Unix.  (Other than the fact my Unix
admins would probably figure it out and be able to move on if they had
to do a restore and then adjust permissions.  My Windows admins would
make a much bigger fuss about recreating that permissions structure.)

I'm not sure how other file/attribute combinations are handled (zfs,
ext2/3 extended attributes, ufs, macos), but it would be nice if that
were consistent and/or documented somewhere.  At first blush, I'd at
least want that information backed up (I'm not sure ext2 extended
attributes are backed up at all), and it would be great if they were
versioned with the files.

Dave


Re: Scheduling a full archive of data once/month

2007-10-25 Thread Dave Mussulman
On Mon, Oct 22, 2007 at 09:49:00AM -0500, Joni Moyer wrote:
 I have just received the task of scheduling an archive of all data from
 several windows servers the first Saturday of every month for 7 years.
 What is the best method of accomplishing this?  I would like to do this
 from the TSM server which is an AIX 5.3 TSM 5.3.5.2 server.  Any
 suggestions are greatly appreciated!

I'd probably do this as backupsets in 5.4.  They're easy to generate
from the server, work from data already on the server, have their own
retention times independent of the incremental copy-group values, and
are file-level browsable via a ToC that lives in file space (not TSM
DB space) -- which means you could offload the sets to tape and leave
the ToC on a disk/file stgpool for quick access.  Backupsets have
worked well as a point-in-time archive on a few nodes of mine.
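A sketch of how I'd schedule it, using a 5.3+ enhanced administrative
schedule (names made up, syntax from memory -- check the Administrator's
Reference):

  define schedule first_sat_set type=administrative active=yes -
    cmd="generate backupset NODE1 monthly devclass=ltoclass retention=2555" -
    schedstyle=enhanced weekofmonth=first dayofweek=saturday starttime=20:00

Retention is in days; 2555 is roughly your 7 years.  You'd need one
generate backupset per node (5.4 may let you use a node group; I
haven't tried).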

Dave


Re: Data Deduplication

2007-08-31 Thread Dave Mussulman
On Thu, Aug 30, 2007 at 03:09:09AM -0400, Curtis Preston wrote:
 Unlike a de-dupe VTL that can be used with TSM, de-dupe backup software
 would replace TSM (or NBU, NW, etc) where it's used.  De-dupe backup
 software takes TSM's progressive incremental much farther, only backing
 up new blocks/fragments/pieces of data that have never been seen by the
 backup server.  This makes de-dupe backup software really great at
 backing up remote offices.

We had Avamar out a few years ago pitching their solution, and we liked
everything about it except the price.  (And now that they're a part of
EMC, I don't expect that price to drop much... *smirk*)  But since we're
talking about software, there's an aspect of de-dupe that I don't think
has been explicitly mentioned yet.  Avamar said their software got
10-20% reduction on a backup of a stock Windows XP installation.  A
single system, say it's the first one you added to your backup group.
That's not two users with the same email attachments saved, or identical
files across two systems - that's hashing files in the OS (I presume
from headers in DLLs and such.)  So if you back up two identical stock XP
installs, you get 20% reduction on the first one and 100% on the second
and beyond.  Scale that up to hundreds of systems, and that's an
incredible cost savings.  Suddenly backing up entire systems doesn't
seem so inefficient anymore.
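(Back-of-envelope with made-up sizes: if a stock XP install is ~5GB,
200 of them stored naively is about 1TB; at 20% reduction on the first
and ~100% on the rest, you'd store something like 4GB total.)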

Dave


Re: TSM vs. Legato Networker Comparison

2007-07-30 Thread Dave Mussulman
Hi John,

Here's some perspective from someone who's currently transitioning from
Networker to TSM.  At first blush, I'm a little surprised EMC came in
at such a dramatic discount: saving umpteen thousands of dollars is the
main reason we switched from them to IBM.  (Although I think the state
IBM contracts give us an advantage.)

It's probably important to understand how Networker is licensed.  The
quote may not map to your environment, and might explain some of the
cost.  EMC nickel and dimes you to death.  You need individual client
count licenses for each system.  You need a blanket license for each
different operating system (Windows, Linux, Solaris, MacOS, etc.)  You
need the server license.  You need a license per jukebox (varying costs
depending on size.)  You need a license for disk storage (varying costs
depending on size.)  NDMP?  VSS?  Clusters?  Those are individual
licenses too (per system, not blanket for all systems.)  Also,
investigate their maintenance costs...  Ours were on the order of 10x
what IBM offered.  This list might complain about the idiocy of CPU
licensing (which I agree with, especially for a storage-centric
product), but it's night-and-day better than the à la carte menu
Networker requires.

It also means that when that next new gotta-have-it feature comes out,
it too is probably not included in your software maintenance and will
need a new license.  We'd been using Networker since it was BudTool, but
to add the software licensing for the advanced disk objects (B2D) and
MacOS support (another thing we were adding) on top of our yearly
support was enough to justify investigating other products and deciding
to purchase TSM.  So, if your costs are too good to be true, they might
just be.

Also, as you surmised, the transition is hard.  Getting up to speed on
new backup software, learning its quirks, documenting it for
administration and user-level docs, handling different reporting needs, etc.
Dealing with the hardware juggling to support an old production server
and a new server that moves from evaluation to semi-production to
production is a challenge.  I'm a year into it, and I don't foresee
shutting down our Networker server in 2007; probably not entirely until
next summer.  (Of course, this is a problem you can solve by throwing
money at it, but you're in this situation because you wanted to save money,
right?)

In terms of functionality, both software packages will probably meet
your objectives, but introduce unique quirks on how to do them.
Networker's advantage is less per-file management overhead, so you can
put more clients and more files on a single server.  (The relatively low
number of files supportable per server is one of my big hovering
concerns with TSM that didn't really exist with Networker.)  Networker
also allows multiplexing sessions to a single tape, so provided your
network/disk pipe is big enough, I'd say it's easier to keep the tape
drives streaming.  The disadvantage to switching to Networker from TSM
is that a lot more media management is required.  There's no
reclamation, so once data is on a tape you're locked in, and if you want
that tape to recycle appropriately, you need to make sure the
dependencies on it also cycle appropriately.  That can be a pretty
manual task, especially as clients go on/off the network.  You'll find
the staging and cloning tools in Networker require much more work than
in TSM.  (Although I understand that's improving, most admins I know
control this with their own home-brew scripts, which is questionable
when off-site copies are critical to your backups.)

Other than that, the software's pretty much the same.  It backs up and
restores your stuff.  It runs on almost anything (client and server.)
Both companies have new upgrades that force new graphical admin tools on
customers who don't like them.  Navigating either product's support
tools/websites can be menacing at times.  Both have listservs with
passionate, sharp, seasoned admins willing to help others.  Both are
exorbitantly expensive because, well, they can be.

I was a little disgusted with EMC when we decided to purchase TSM, for
more reasons than I've listed here, so maybe I'm a little biased.
(Contact me off-list if you'd like to know more.)  I think it's
important to test the waters with backup software and hardware every few
years to find out what other products are doing, evaluate your costs
against new pricing, etc., but I would caution you to really spend some time
investigating what the new software will cost you in terms of support,
functionality, daily maintenance times, transition times, etc. and
decide if those umpteen thousands are worth it.  Ask me in a year or two
if they did in our environment.  ;)

Dave


On Fri, Jul 27, 2007 at 01:06:10PM -0500, Schneider, John wrote:
 Kelly,
   Thank you for your post.  There is no reason to say we are
 unhappy with TSM.  Since I inherited this environment about a year ago,
 due to lots of hardware and software version upgrades, 

backing up just a few directories?

2007-07-18 Thread Dave Mussulman
I have a few systems I'd like to add to TSM but only back up a few
directory hierarchies.  These aren't always mount points (for example,
the /etc directory under the / mount point.)  Does TSM really not have a
way for me to say 'just back up X' without worrying about anything
else on that mount?  I know I can do a

exclude /.../*
include /etc/.../*

but then I get all the directories backed up all over /.  I could append
exclude.dirs for the larger hierarchies (/lib, /usr, etc.) but that
seems awkward too.  I feel like I'm going at this problem the wrong way,
but I haven't found a right way.
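Spelled out, the workaround I'm describing looks like:

  domain /
  exclude /.../*
  include /etc/.../*
  exclude.dir /usr
  exclude.dir /lib
  exclude.dir /var

and so on for every large top-level tree, which is the part that feels
wrong.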

I tried putting a non-mount point in the domain line, but the client
didn't like that.  Recommendations?

Thanks,
Dave


example cloptset lists?

2007-06-26 Thread Dave Mussulman
Hey gurus,

Do you apply server based cloptsets to your clients, based on OS, to
filter files you know are going to be a problem to backup and/or don't
need to be restored?  Can someone share theirs (or point me to an
archive/resource of them?  I looked but didn't see anything.)  I know
it's typical to exclude OS and application cache and temp directories.
I'd like to see some other lists to help me build my own.  It doesn't
look to me like there's a client-side config way to override an exclude
in a server-based cloptset, so I'm interested in seeing what's in a good
base exclude list.
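For concreteness, the sort of thing I mean (names made up, syntax from
memory):

  define cloptset win_base description="baseline Windows excludes"
  define clientopt win_base inclexcl "exclude c:\...\pagefile.sys" seqnumber=10
  define clientopt win_base inclexcl "exclude.dir c:\temp" seqnumber=20
  update node SOMENODE cloptset=win_base

It's the actual inclexcl statements that I'm hoping to crib from
others' lists.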

I'm looking for Windows, MacOS, and Linux, but for the sake of
discussion any OS block lists would be interesting.

Thanks,
Dave


MacOS client failing with rc 12 troubleshooting

2007-06-07 Thread Dave Mussulman
Hello gurus,

I have an Intel MacOS client running 5.4.0.0 that's failing its
scheduled noon-window backup with an RC 12 message, and I'm not sure
why, or what's cause and what's effect.  From the dsmerror.log:

06/07/2007 12:44:47 ANS1228E Sending of object
'/Library/Logs/tivoli/tsm/dsmsched.log' failed
06/07/2007 12:44:47 ANS4037E Object
'/Library/Logs/tivoli/tsm/dsmsched.log' changed during processing.
Object skipped.
06/07/2007 12:52:06 ANS1228E Sending of object
'/private/var/tmp/folders.501/TemporaryItems/Acr1614669.tmp' failed
06/07/2007 12:52:06 ANS4005E Error processing
'/private/var/tmp/folders.501/TemporaryItems/Acr1614669.tmp': file not
found
06/07/2007 12:52:07 ANS1228E Sending of object
'/private/var/tmp/folders.501/TemporaryItems/Acr1616363.tmp' failed
06/07/2007 12:52:07 ANS4005E Error processing
'/private/var/tmp/folders.501/TemporaryItems/Acr1616363.tmp': file not
found
06/07/2007 13:27:38 ANS1228E Sending of object
'/Users/bob/Documents/Microsoft User Data/Office 2004 Identities/Main
Identity/Database' failed
06/07/2007 13:27:38 ANS4037E Object '/Users/bob/Documents/Microsoft
User Data/Office 2004 Identities/Main Identity/Database' changed during
processing.  Object skipped.
06/07/2007 13:39:41 ANS1228E Sending of object
'/Users/bob/Library/Application
Support/Firefox/Profiles/ap5cuske.default/cookies.txt' failed
06/07/2007 13:39:41 ANS4047E There is a read error on
'/Users/bob/Library/Application
Support/Firefox/Profiles/ap5cuske.default/cookies.txt'. The file is
skipped.

06/07/2007 13:46:14 ANS1802E Incremental backup of '/' finished with 5
failure

06/07/2007 13:46:14 ANS1512E Scheduled event '1215PM' failed.  Return
code = 12.
06/07/2007 13:46:16 TCP/IP received rc 4 trying to accept connection
from server.
06/07/2007 13:46:16 Error -50 accepting inbound connection

Questions:

Other than the five files failing, did the backup finish correctly?

Is the ANS1512E error occurring because there were individual file
failures, or because of the networking errors that follow the message?
I couldn't find those errors on the TSM site or listserv archives.  I'm
not sure if those are a cause or effect of the ANS1512E or unrelated.
The QuickFacts seem to imply it is just the failed files.  The
serialization for the class is shared-static, so I expect ANS4037E
errors -- and other clients in that schedule/domain are probably getting
them too, but aren't failing their backups.  Is it the ANS4047E error
that's tripping the ANS1512E?
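In the meantime, I'm tempted to just exclude the churn-prone files,
something like:

  exclude /Library/Logs/tivoli/tsm/dsmsched.log
  exclude /private/var/tmp/folders.*/TemporaryItems/*
  exclude "/Users/.../Application Support/Firefox/.../cookies.txt"

though that only silences the symptoms and doesn't answer the RC 12
question.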

Dave


Re: ext2 extended file attributes?

2007-05-30 Thread Dave Mussulman
On Tue, May 29, 2007 at 03:51:49PM -0400, Mueller, Ken wrote:
 I ran into a similar situation where the TSM client (v5.2.3 at the time)
 wouldn't backup/restore the ext3 extended acls on RHEL3.  The problem
 turned out to be that the TSM client is looking for libacl.so but
 couldn't find it.  It was available as libacl.so.1 (which symlinked to
 libacl.so.1.1.0).  By adding a symlink for /lib/libacl.so pointing to
 libacl.so.1 I was able to backup and restore files w/extended acls.

Thanks for the info, Ken, but it doesn't look like that's applicable in
my case.  (The symlink is already there.)  But it did point me to a few
ICs and a comment in the documentation about RHEL4 needing certain
packages, which are installed.  I upgraded my client to 5.4.0.0 and
still am not getting the ext3 ACLs restored.  Time to call IBM?
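(For the archives, Ken's workaround amounts to something like this,
with the exact library version depending on your distro:

  ls -l /lib/libacl.so*
  ln -s /lib/libacl.so.1 /lib/libacl.so

In my case that symlink already exists.)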

Dave


ext2 extended file attributes?

2007-05-29 Thread Dave Mussulman
One of my users pointed out that ext2 has an extended attribute to tell
dump not to back it up.  He was wondering if TSM honored that setting.
I gave it a try (chattr +d test_file) and then did an incremental backup
of the directory.  TSM seemed to ignore the no_dump bit and backed up
the file, and upon restore, didn't restore the ext2 attributes (they
were all blank.)  Is this expected?  I can understand, perhaps, not
honoring no_dump (since TSM isn't dump), but I'm a little concerned
that extended file attributes for ext2/3 aren't backed up/restored.  Is
there a special flag or setting I need to define on the client for that
behavior?  A bug?

This is on a 5.3.3 client on a RedHat AS4 system.
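The exact test, in case anyone wants to reproduce it:

  touch test_file
  chattr +d test_file
  lsattr test_file        (shows the 'd' no-dump flag)
  dsmc incremental /path/to/test_file
  rm test_file
  dsmc restore /path/to/test_file
  lsattr test_file        (attributes come back blank)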

Dave


lingering copypool copies?

2007-05-16 Thread Dave Mussulman
Hello list,

Is there a best practice for moving nodes between domains, when the
domains have different storage pools, copy pools, management classes,
etc.?  I noticed when I changed nodes to the new domain, the new
management classes and rebinding took effect (which is fine), but any data
stored in the previous storage pools still remains, along with its
copypool copies.  I didn't really expect pre-existing data to magically
migrate, and since I was taking the old domain's primary disk
storage pool out of service anyway, I did a 'move data' for those old
disk volumes to a different tape pool, and then removed the disk
volumes.
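For reference, the sequence was roughly (names made up):

  update node NODE1 domain=newdomain
  move data /tsm/disk/vol01 stgpool=oldtapepool
  delete volume /tsm/disk/vol01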

I expected that the next expire inventory would remove the copies for
the files no longer in the stgpool they were copied from (or whose
stgpool no longer existed), but that didn't happen.  (I guess the
copies are tied to the file without any regard for the stgpool they
are/were in.)  The new domain doesn't have any copypools configured, and
I'd like to free those copy tapes up.  How do I do that?

Dave


flooded with ANR8210E messages

2006-11-06 Thread Dave Mussulman
Hello,

I've noticed that when a client either drops off the network or has a
firewall installed that blocks server-prompted sessions, a message is
logged every 4 minutes during the schedule window stating it could not
connect to the client:

Mon Nov  6 06:01:11 2006 ANR8210E Unable to establish TCP/IP session with IP 
address - connection request timed out.
Mon Nov  6 06:04:50 2006 ANR8210E Unable to establish TCP/IP session with IP 
address - connection request timed out.
Mon Nov  6 06:08:29 2006 ANR8210E Unable to establish TCP/IP session with IP 
address - connection request timed out.
Mon Nov  6 06:12:08 2006 ANR8210E Unable to establish TCP/IP session with IP 
address - connection request timed out.
Mon Nov  6 06:15:47 2006 ANR8210E Unable to establish TCP/IP session with IP 
address - connection request timed out.
Mon Nov  6 06:19:26 2006 ANR8210E Unable to establish TCP/IP session with IP 
address - connection request timed out.
Mon Nov  6 06:23:05 2006 ANR8210E Unable to establish TCP/IP session with IP 
address - connection request timed out.

I anticipate this could happen quite a bit for a number of clients (where
I want them in a schedule in case they happen to be available, but if they
aren't I'm not going to worry too much.)  Is there a way to adjust this
level of logging?  Say, 10 clients during a 12 hour schedule window
spitting out log messages every 3-4 minutes is a lot of static in the logs
and reports.  Ideas?
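(One thought I've had, since these are server-prompted sessions:
switching the flaky nodes to polling in their dsm.sys, i.e. something
like

  schedmode polling

so the server never initiates the connection.  But I'd still like a
logging answer for the prompted ones.)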

Dave


Re: Lots of newbie questions

2006-08-15 Thread Dave Mussulman
On Fri, Aug 11, 2006 at 09:29:26AM -0400, Allen S. Rout wrote:
 Said from another direction, TSM doesn't worry about selecting only
 backed-up data from a stgpool when it's migrating, it takes whatever
 is there.

Thanks for the clarification.


  Best practice-wise, do you try to define different private groups
  and tape labels for onsite, archive and offsite storage?  Or do

 Your recordkeeping problem is too complex to do by casual inspection,
...
 You will inevitably end up needing more Primary tapes when all
 you've got on hand will be labeled for Reverse thudpucker use or
 some such; Then you'll cross your own lines, and after the first time
 it gets easier and easier

I've been running purposely labeled tapes/pools for quite a while in
the leveled-backup world of Networker.  Jumping to a 'just let the system
pick it' mentality is one of the hurdles (and appeals) of TSM.  I fully
understand your point on the media labels, and will learn to live with
it.  :)


 Having an extra drive outside a library isn't exactly a _bad_ idea,
 but unless you're trying to write a different format I'd expect you to
 get more value out of adding it to the library.

I don't think I can convert an external drive to run in the library.
You make good arguments for having internal drives, and I don't
disagree.  But given that I have an external tape drive, it shouldn't be
an either/or for me to use it or the library.


 If you really, really want to do this, I'd suggest:

 - Define all of the drives to be in the library.  set the one which
   is physically outside to usually be offline.

 - When you want to use the external drive, set the interior drives
   offline, the exterior online.  Run the job, mount, dismount, etc.

 - When you're done, re-set to normal operations.

I suspect that by itself wouldn't work, because the SCSI library would
still want to do something to checkin/mount the drive, and I wouldn't
want to keep switching the library between automated and manual.

I'm thinking more along the lines of defining the SCSI library and
manual library, and switching the devclass between them as needed.
That's just a stopgap - it doesn't really make it usable, unless I
define an entirely new devclass and use it for relatively few things
(like backupsets or db backups to tape.)
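Concretely, something like this (server and device names made up,
syntax from memory):

  define library manlib libtype=manual
  define drive manlib extdrive
  define path SERVER1 extdrive srctype=server desttype=drive -
    library=manlib device=/dev/tsmscsi/mt3
  define devclass ltoext devtype=lto library=manlib

and then point backupsets or db backups at the ltoext devclass.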

No one else has dealt with transitioning between both manual and
automatic libraries?


 Personally, I'd much prefer checkin and checkout of desired volumes to
 this.  And get a quote on how much the next bigger step of library is,
 and count the amount of time you spend screwing around with inadequate
 library space.  That way you can demonstrate to the folks who are
 hoarding the beans when they start losing money because they didn't
 cough up a library at the outset.

 TSM is tremendously economical with tape and drive resources compared
 to other backup products.  Feed it well; feed it hardware instead of
 your time.

That point is muddied when a drive configuration that worked just fine
with other backup products is either unusable or inadequate for TSM.
I'm sticking with my current hardware/specs for this first year and have
already warned the bean counters that odds are I'll need something near
the start of year two -- be it more drives, more library space, more
staging disk, beefier or multiple servers, etc.


 Now, if you were me, you'd try to develop a theory of copystg
 utilization workflow, and solve it for a minimum.  But I suggest you
 just twirl the knob to the other end, and see if you like that
 tradeoff better.

Good point.  I'll try some variations and see how it goes.


 You should consider data which has expired to be Gone, except for
 heroic measures.

 If you Really Need data which expired out of the database in the last
 week or so (a common period for retention of DB backups) then yes, you
 can do a complete DR scenario, and consult the as-yet unreused tape
 volumes for the data.  Icky squicky.

Thanks, that's the definition I was looking for.  As for Really Need,
it depends who asks...  :)


 It's usually an answer like 'You messed this up last month, overwrote
 it every week and didn't notice until the first of this month, and
 have waited until NOW to ask me to get it back?  No, it's gone.'

My backups will probably be augmented with regular archives (or
backupsets) for the important systems, so the odds of 'losing'
anything (at least for the people who Really Need it) should be pretty
low.

Any opinion on archives vs. backupsets for a monthly snapshot kept for 6
months?

Thanks for humoring me,
Dave


Lots of newbie questions

2006-08-10 Thread Dave Mussulman
Hello,

Forgive the laundry list of questions, but here's a few things as a
newbie I don't quite understand.  Each paragraph is a different
question/topic, so feel free to chime in on just a few or any that
you're comfortable answering.  Thanks!

I'm using Operational Reporting with the automatic notification turned
on for failed or missed schedules.  I have a node, associated with a
schedule, that no longer exists (it's not powered on), just to test
failures and notifications.  However, I never get notifications about
failed or missed schedules for it (no email, no mention in the daily
report.)  In the client schedules part of the report, it's always in a
Pending status.  At what point does pending turn into failed or missed?
How can I configure that so I get notifications about systems that
missed their scheduled backup?

I'm using an administrative schedule to back up my DB to a FILE class
twice a day, and then I do full backups of the DB to tape right before
my offsite rotation.  I read somewhere that since I'm using DRM, I
shouldn't use 'del volhist' to remove old db backups.  However, I don't
think the DRMDBBACKUPEXPIREDAYS configuration setting is applying to my
FILE backups.  Is that normal?  Should I be running both drm and 'del
volhist'?

I do my backups to a DISK pool that has a 85/40 migrate percentage and a
tape next storage pool.  If everything (my disk and tape pool) is synced
up to my copypool before a backup runs, and the backup only goes to
disk, the 'backup stg' for the tape has nothing to do.  I understand
that.  If I backup the disk pool and manually migrate the data from disk
to tape, and then backup the tape pool, it has nothing to do.  (Since
that data was already in the copypool.)  I understand that.  But, if
during a backup the tape pool starts an automatic migration, the next
time I do a 'backup stg' for the tape pool, it has data to copy to the
copypool.  So, what's happening?  Since the migration is going on, does
TSM automatically route data from the node directly to the tape?  (My
maxsize parameter for the disk pool is No Limit, so I would guess no.)
Or is TSM migrating from the disk pool the newest data that wasn't
already copied to the copypool?  In that case, why doesn't it migrate
older data that's already copied?  What's the selection criteria for
what gets migrated?   Or would best practice say to manually migrate
the disk pool daily to minimize the chance of this condition?
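That is, something like this ahead of the daily 'backup stg', if 5.3's
MIGRATE STGPOOL is the right tool for it (pool names made up):

  migrate stgpool diskpool lowmig=0 wait=yes
  backup stgpool tapepool copypool maxprocess=2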

Best practice-wise, do you try to define different private groups and
tape labels for onsite, archive and offsite storage?  Or do people
really just make one large 'pool' of tapes and not care if tape 0004 has
archive data on it and stays for years, 0005 goes offsite, 0006 has a
two week DB backup on it, and 0007 is a normal tape pool tape?  Since
there's not one (standard) unified view for volumes (DB backups, offsite
backups, checked in scratch volumes, not checked in scratch volumes) I
worry a little about keeping track of and 'losing something' if they're
all in one group.  How do sites handle that issue?

I have a LTO3 tape library and an external LTO3 drive.  In our Networker
environment, we found it a pretty good practice to have a drive outside
of the jukebox for one-off operations (old restores, etc.) as well as
some sort of fault tolerance if the jukebox or that SCSI bus went south.
How do I set up that environment in TSM?  It looks like I cannot use the
same device class across two libraries.  Doesn't that hinder me if I
want to use the external drive in the same way as the jukebox drives,
sharing storage pools, etc.?  My jukebox isn't very large and I
anticipate having to use overflow storage pools, which is where being
able to mix the manual library (external drive) and SCSI library would
be nice.

Consolidating copypool tapes for offsite use.  I had my reclamation
threshold for my copypool at 40%.  I used 'backup stg' with maxpr=[# of
drives] to minimize the amount of time the backup took.  However, it
left me with many under-utilized offsite tapes that, as soon as I moved
them offsite, were then reclaimed (and then sat until the reusedelay
expired.)  That seems inefficient: I move a larger than necessary
number of tapes each time I do an offsite rotation (right now, weekly)
and as soon as they're offsite, they're reclaimed.  To fix this, I put
the reclamation threshold back to 100%, and set it down just before my
offsite rotation.  I've also taken a look at the to-be-offsited tapes
and do some 'move data's as required to try to minimize the number of
offsite tapes.  Is that standard practice?  I feel like I'm fighting the
natural way TSM works, given that it makes so many other decisions just
fine without my direct intervention.  (And that's a compliment - I can't
say that for Networker.) Is there something I'm missing to make offsite
tape usage more streamlined?  So my offsite rotation procedure is
starting to look like:
- expire inventory
- backup all the local storage 

Re: multiple instance recommendations

2006-05-22 Thread Dave Mussulman
On Fri, May 19, 2006 at 10:05:38PM -0400, Allen S. Rout wrote:
  I have questions about server sizing and scaling.  I'm planning to
  transition from Networker to TSM a client pool of around 300 clients,
  with the last full backup being about 7TB and almost 200M files.  The
  TSM server platform will be RHEL Linux.

  I realize putting all of that into one TSM database is going to make it
  large and unwieldy.

 You may be underestimating TSM; the size there is well in bounds for
 one server.  The file count is a little high, but I'm not convinced
 it's excessive.  The biggest server in my cluster has 70M files, in a
 70% full DB of 67GB assigned capacity. So if that means 67GB ~= 100M
 files, you might be talking 130-150GB of database.  I wouldn't do that
 on my SSA, but there's lots of folks in that size range on decent
 recent disk.

I don't have a lot of a priori knowledge of TSM, so any sizing
recommendations I've seen have come from the Performance Guides,
interviews with a few admins, and general consensus from web searches
I've done.  I agree that if the general concern is getting database
operations done in a managable timeframe, as long as I can architect the
db disk well enough, it should size-scale-up fairly well.  Thoughts on
that?


 Do you feel your file size profile is anomalous?  My 70M file DB is
 ~23TB; that ratio is an order of magnitude off mine, and my big server
 is a pretty pedestrian mix of file and mail servers.

My numbers come from the Networker reports on the last full.  Our
domain is largely desktop systems with everything being backed up.
Those large numbers of small files puts our average system at 722k files
and 29GB of storage (averaging 39k a file?)  I'm trying to float towards
just protecting user data (and not backing up C:\Windows 250 times,
especially on systems where we would never do a baremetal restore.)  A
compromise against management that doesn't want to risk data loss by
people putting things in inappropriate (and not backed up) places would
be to use different MCs to provide almost no versioning outside of
defined user data spaces -- but that doesn't get around my high file
count problem.


 I have heard of some dual-attach SCSI setups, but never actually seen
 one in the wild.  If I were going to point at one upgrade to improve
 your life and future upgrade path, getting onto a shareable tape tech
 would be it.  I have drunk the FC kool-ade.  It's tasty, have some. :)

I don't disagree.  Maybe I'll start looking at storage routers to share
my SCSI drives over FC.


  Tape access-wise, is there a hardware advantage putting multiple
  instances on the same system?

 Yes, it solves your drive sharing problem.  All the TSM instances
 would be looking at /dev/rmtXX.  Your LM instance can do traffic
 direction to figure out who's got the drive, and they are all using
 the same bus, same attach.

Ah, that helps.  I guess I could design one beefy system as a sole-TSM
instance, and if things get too bogged down, split it into two (or
three) TSM instances on that same hardware and not have to worry (as
much) about device sharing because it's all local.  If it got to the
point where I'd want to split that to different hardware, I'd look at
moving the drives to FC and sharing them that way.  That makes sense to
me.


 I like the 'beefy box' solution for all purposes except test instance.
 Make sure it's got plenty of memory. 6G? 8?

Can you clarify the memory needs?  I was thinking 2G of RAM per
instance; would I need more?

Dave


multiple instance recommendations

2006-05-19 Thread Dave Mussulman
Hello,

I have questions about server sizing and scaling.  I'm planning to
transition from Networker to TSM a client pool of around 300 clients,
with the last full backup being about 7TB and almost 200M files.  The
TSM server platform will be RHEL Linux.

I realize putting all of that into one TSM database is going to make it
large and unwieldy.  I'm just not sure how best to partition it in my
environment and use the resources available to me.  (Or what resources
to ask for if the addition of X will make a much easier TSM
configuration.) For database and storage pools, I will have a multiple
TB SAN allocation I can divide between instances.  I have one 60 slot HP
MSL6060 library (SCSI), with two LTO-3 SCSI drives.  There is also an
external SCSI LTO-3 drive.

My understanding of a shared SCSI library is that the library is
SCSI-attached to one server, but drive allocation is done via SAN
connections or via SCSI drives that are directly attached to the
different instances.  (Meaning the directly attached SCSI drives are not
sharable.) Is that true, at least as far as shared libraries go?  The
data doesn't actually go through the library master to a directly
connected drive, does it?

If not, and I still wanted to use sharing, I could give each instance a
dedicated drive - but since two drives seems like the minimum for TSM
tape operations, I don't really think it's wise to split them.
(However, if the 'best' solution would be to add two more drives to max
out the library, I can look into doing that.)

If the drives need to be defined just on one server, it looks like
server-to-server device classes and virtual volumes are the only
solution.  I don't really like the complexity of one instance storing
another's copy pools inside of an archive pool just to use tape, but it
looks like things are heading that way.

Other than the obvious hardware cost savings, I don't really see the
advantage of multiple instances on the same hardware.  (I haven't
decided yet if we would use one beefy server or two medium servers.)  If
you load up multiple instances on the same server, do you give them
different IP interfaces to make distinguishing between them in client
configs and administration tools easier?  Tape access-wise, is there a
hardware advantage putting multiple instances on the same system?

Any recommendations on any of this?  Your help is appreciated.

Dave


Re: Multiple storage pools?

2006-05-11 Thread Dave Mussulman
On Thu, May 11, 2006 at 09:47:08PM +0200, Kurt Beyers wrote:
 The two copy pools behave completely independently.

 Each command backup stgpool primary copy will compare the files in both 
 storage pools and copy the data that are in the primary pool and not yet in 
 the copy pool.

I presume that same 1-to-1 primary/copy mapping is true for a file that
gets migrated between primary pools that both backup to the same copy
pool?

For example, a disk and a tape primary pool and a tape copy pool.  The
file is created on the disk pool, and copied to the tapecopy pool.
Later, that file is migrated from the disk pool to the tape pool.

When the tape pool is next copied to the tapecopy pool, does the file
transfer again?  Or does TSM know there's already a copy of that file
migrated from another primary pool?

I believe from my evaluation that the copying is happening twice, but
I'm not sure.  My assumption, in that case, is that the disk pool copy
is expired at that point, so even if the copying happens twice, the
actual post-reclamation copy pool storage is singular.
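The setup in question, spelled out with made-up names:

  backup stgpool diskpool tapecopy      (file copied once)
  migrate stgpool diskpool lowmig=0     (file moves to tapepool)
  backup stgpool tapepool tapecopy      (does the file copy again?)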

I'm new and still getting used to the concepts, so a
confirmation/correction would be appreciated.  Thanks!

Dave


GENERATE BACKUPSET fails

2006-05-01 Thread Dave Mussulman
Hello all,

I'm having problems generating backupsets, and I'm new to TSM so I'm not
sure how to troubleshoot this.  I'm running TSM server 5.3.2 on Linux.
Here's what I see:

Mon May  1 14:02:25 2006 ANR2017I Administrator MUSSULMA issued command: 
GENERATE BACKUPSET endeavour.cs.uiuc.edu 2006-05 /var devc=dltclass
Mon May  1 14:02:25 2006 ANR0984I Process 92 for GENERATE BACKUPSET started in 
the BACKGROUND at 02:02:25 PM.
Mon May  1 14:02:25 2006 ANR3500I Backup set for node ENDEAVOUR.CS.UIUC.EDU as 
2006-05.1099170 being generated.
Mon May  1 14:03:40 2006 ANR8337I DLT volume TSM010 mounted in drive DRIVE2 
(/dev/tsmscsi/mt2).
Mon May  1 14:03:40 2006 ANR0513I Process 92 opened output volume TSM010.
Mon May  1 14:04:58 2006 ANR1360I Output volume TSM010 opened (sequence number 
1).
Mon May  1 14:05:16 2006 ANRD bfgenset.c(3614): ThreadId38 Unknown result 
code (17) from bfRtrv.
Mon May  1 14:05:16 2006 ANR3520E GENERATE BACKUPSET: Internal error 
encountered in accessing data storage.
Mon May  1 14:05:16 2006 ANR3503E Generation of backup set for 
ENDEAVOUR.CS.UIUC.EDU as 2006-05.1099170 failed.
Mon May  1 14:05:18 2006 ANR1361I Output volume TSM010 closed.
Mon May  1 14:05:18 2006 ANR0515I Process 92 closed volume TSM010.
Mon May  1 14:05:22 2006 ANR0987I Process 92 for GENERATE BACKUPSET running in 
the BACKGROUND processed 349 items with a completion state of
FAILURE at 02:05:22 PM.

It fails the same way if I try a different filespace on the client, or a
different client, or write to a different devclass.  Restores from that
filespace seem to work okay.  I tried some web searches but came up cold.
Any advice?

Dave