Re: Migration question

2007-10-01 Thread Stuart Lamble

On 02/10/2007, at 2:35 AM, Dollens, Bruce wrote:


I have a question that I feel a little stupid in asking.

What is the difference between migration and backup primary (disk) to
copy (tape)?


Other people have already answered, so I won't bother. :)


I am working on changing my scheduling up and the recommended order of
steps I was given is:

Backup clients
Migration
Backup primary to copy
Backup db
Expiration
Reclamation
Start all over again


I'd change that order slightly to:

Backup clients
Backup disk pool(s)
Migration
Backup tape pool(s)
Backup db
Expiration
Reclamation
Repeat ad nauseam.

It's better to back up the storage pool data while it's still on disk,
rather than flush it out to tape and then back it up - it saves on
the amount of data read from tape. Some data may well still flush
down to tape before the first backup stgpool runs, which is why you
want to back up the tape pool as well once migration is complete.
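
For what it's worth, the way we drive that order is with a chain of
administrative schedules, something like the following (the pool, device
class, and schedule names are made up, and the start times are only
illustrative - the client schedules finish before the first of these
kicks off):

   define schedule stgbackup type=administrative cmd="backup stgpool DISKPOOL COPYPOOL" active=yes starttime=06:00
   define schedule migrate type=administrative cmd="update stgpool DISKPOOL highmig=0 lowmig=0" active=yes starttime=08:00
   define schedule tapebackup type=administrative cmd="backup stgpool TAPEPOOL COPYPOOL" active=yes starttime=10:00
   define schedule dbbackup type=administrative cmd="backup db devclass=LTOCLASS type=full" active=yes starttime=12:00
   define schedule expire type=administrative cmd="expire inventory" active=yes starttime=14:00
   define schedule reclaim type=administrative cmd="update stgpool COPYPOOL reclaim=60" active=yes starttime=16:00

(You also want later schedules to push HIGHMIG/LOWMIG back up once
migration has drained the disk pool, and to set RECLAIM back to 100 when
the reclamation window closes.)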


Re: Possible to delete folders within a filespace?

2007-09-04 Thread Stuart Lamble

On 04/09/2007, at 10:34 PM, Angus Macdonald wrote:


Many thanks Richard. That looks like it'll do what I need.

Can anyone point me to a redbook or other document that covers
operation of the client command-line? I can't seem to find one
anywhere.


I'm looking right now at the IBM Tivoli Storage Manager for Unix:
Backup-Archive Clients Installation and User's Guide. Possibly the
wrong platform, but you should be able to extrapolate from there.
(Available in both HTML and PDF form from IBM's website ... somewhere.)

Cheers,

Stuart.


Re: prescheduled postcheduled commands help

2007-08-22 Thread Stuart Lamble

On 23/08/2007, at 12:29 AM, Głębicki Jakub wrote:
Hint :) use the short version of the path (Progra~1 instead of Program
Files) in the P*SCHEDULECMD options, otherwise the schedule will fail.


A small trap for young players: the short name is defined by the  
order of directory creation. So if the server dies and you recover  
from backup, anything that relies on the short name is likely to fail  
if the recovery doesn't create the directories in exactly the right  
order. Isn't Microsoft *wonderful*?


I don't know this from personal experience, but I have heard it from  
somebody whose experience and knowledge in these matters I trust.  
Verify if it's a concern for you.


Re: Why virtual volumes?

2007-08-22 Thread Stuart Lamble

On 23/08/2007, at 7:29 AM, Nicholas Cassimatis wrote:

And a TSM DB Backup takes (at least) one volume, so with physical
cartridges, that's a whole tape.  With VV's, you're only using the
actual
capacity of the backup, which is more efficient on space.


At the cost of some reliability. What happens if the particular tape
the virtual volumes are on goes bad, and you're in a disaster needing
a DB restore?

I'd rather spend the extra money on tapes and know that if something
goes bad, we'll at least be able to recover some of the data ...


(I'm seeing installations getting over 2TB on the new 3592/TS1120
drives - for a 60GB TSM DB Backup, that's VERY wasteful).


Well, why not do what we'll soon be doing? We currently have some 1200
LTO2 tapes, and are in the process of migrating from LTO2 to LTO4;
(some of) the LTO2 tapes will be kept in the silo for database
backups (along with a single LTO2 drive for writing to those tapes).
There's another silo with LTO3 volumes; some of the LTO2 tapes will
be put into that silo for exactly the same reason (LTO3 drives will
write LTO2 tapes, so there's no issue with needing an LTO2 drive in
that silo, at least for the time being).

Call me a conservative fuddy duddy if you want, but I prefer to keep
the TSM database backups as simple as possible.


Re: TDP for SQL question

2007-08-02 Thread Stuart Lamble

On 03/08/2007, at 2:53 PM, Paul Dudley wrote:


I have been told that if I want to create an archive backup of an
SQL
database via TDP for SQL, then I should create a separate node name in
TSM (such as SQL_Archive) and then backup using that node name
once a
month (for example) and make sure to bind those backups to a
management
class that has the long term settings that meet our requirements.

What else is involved in setting this up? On the client do I have to
create another dsm.opt file with the new node name to match what I set
up on the TSM server?


In the context of the BA client, you can create multiple server
stanzas in dsm.sys to point at different TSM instances and/or node
names - eg, we do this with one particular system so that we can hold
an image backup for ten days normally, but once a week, put one aside
for eight weeks. The dsm.sys for this node looks like:

servername (instance)_8WR
   commmethod         tcpip
   tcpport            (port)
   tcpserveraddress   (server)
   passwordaccess     generate
[...]
   nodename           (client)_8wr

servername (instance)
   commmethod         tcpip
   tcpport            (port)
   tcpserveraddress   (server)
   passwordaccess     generate
[...]
   nodename           (client)

All that is in dsm.opt is a single line, "servername (instance)", so
that backups go to the client's standard node by default; for the
eight-week retention, we add -se=(instance)_8WR, and TSM does the right
thing. (The '[...]' marks where I've cut out a number of TCP/IP
options, along with other client-side options.)
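
So, for example, the weekly longer-retention run is just the usual
command pointed at the other stanza (the filesystem name here is
invented):

   dsmc backup image /oracle/data -se=(instance)_8WR

while the nightly schedule carries on using the default stanza,
untouched.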

How to do this for a TDP client is left as an exercise for the
reader. :-) Hopefully, it should give you some ideas. Good luck.
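
(As a starting hint only - the option and command names here are from
memory, and the file, node and database names are invented, so check
everything against the Data Protection for SQL manual. The usual shape
is a second options file alongside the normal one, e.g.:

   * dsmarch.opt
   nodename           SQL_ARCHIVE
   commmethod         tcpip
   tcpserveraddress   (server)
   tcpport            (port)
   passwordaccess     generate
   * plus whatever include statement binds these backups to your
   * long-term management class

with the monthly run pointed at it by something like

   tdpsqlc backup (dbname) full /tsmoptfile=dsmarch.opt

leaving the regular TDP schedule on the standard options file and node
name.)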


Re: TSM performance very poor, Recovery log is being pinned

2007-07-31 Thread Stuart Lamble

On 29/07/2007, at 10:03 PM, Stapleton, Mark wrote:


From: ADSM: Dist Stor Manager on behalf of Craig Ross

TSM is installed on Solaris 10


This is something that popped right out for me. Do you have your
storage pools located on raw logical volumes or mounted
filesystems? If the latter, that might be your problem. Solaris has
traditionally had incredibly poor throughput performance on mounted
filesystems.

You might give thought to rebuilding those storage pools on raw
logical volumes. Of course, that will require that you completely
flush all data from your disk storage pools to tape storage pools
first, so as not to lose client data.


A small trap for young players: TSM has constraints in place to stop
it writing to cylinder 0 of a raw volume on Solaris. If you direct
TSM at slice 2, or some other slice that includes cylinder 0, it will
barf, and the error message is rather cryptic (sorry for the
vagueness; it's been a year or so since I bumped into this. At the
time, I was working on the storage pool volume level, but I would
expect to see similar behaviour for a DB or log volume.)

The workaround is simple: make slice 0 start at cylinder 1 and run to
the end of the disk, and use slice 0 as the raw volume.
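
To spell it out (the device and pool names here are made up): in
format's partition menu, start slice 0 at cylinder 1 and run it to the
last cylinder, leave the standard slice 2 overlap alone, and then hand
the raw slice to TSM:

   define volume DISKPOOL /dev/rdsk/c2t1d0s0

The same shape applies if you're carving the disk up for DB or log
volumes instead.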

I am not going to enter into a debate about the relative merits of
raw volumes versus files on filesystems, as I have insufficient
direct knowledge to judge either way (I'm trusting a more senior
colleague to make the right call there. :)


Re: Sharing a 3583 library

2007-07-26 Thread Stuart Lamble

On 27/07/2007, at 1:43 AM, Zoltan Forray/AC/VCU wrote:


Thanks for the reply. I already have things configured like this.

What I was hoping for was zOS/MVS sysplex sharing smarts (for you
mainframe folks).  With a properly configured sysplex, the drives are
configured and online to all systems at the same time, and the OS
figures out which drive is in use by which system and which drives are
available, and chooses accordingly, not stepping on any other system.  I was
hoping the
SAN type libraries were smarter and that  multiple TSM library
managers
could just figure out what drives are available and use them
accordingly.

I wanted to avoid an all-or-nothing reconfiguration since I need to
move
library management/ownership to my new TSM server (phasing out the
old AIX
server currently owning the 3583's) and having to reconfigure all TSM
servers that currently share the libraries.


So if I understand you correctly, you will eventually be moving the
library management completely to the new server, but you want to
avoid reconfiguring all the existing TSM instances?

I've just completed moving two library managers from one TSM instance
to another (the new manager is dedicated solely to library and
configuration management, whereas the old managers also served backup
duties). It turned out to be remarkably easy:

  * Delete the old paths, drive definitions and library definition
on the old library manager (and the new library manager if it's
currently a library client).
  * Define the library, drives, and paths on the new library manager
(setting the drives to offline, so no tapes are accessed until you're
finished.)
  * Checkin the libvols on the new library manager (CHECKIN LIBVOL X
search=yes stat=scratch checkl=barcode)
  * Update the old library clients (UPDATE LIBRARY X
PRIM=new_lib_manager on all instances)
  * Create a library definition on the old library manager (of type
shared, pointing at the new library manager).
  * Run an AUDIT LIBRARY on the library clients (including the old
library manager).
  * Set the drives to online, and you're away.
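
In raw command terms, it looks something like the following (the
library, drive, device, and server names are all placeholders, and
you'll want to check the syntax against the admin reference before
running any of it):

   /* on the new library manager */
   define library LIB1 libtype=scsi shared=yes
   define path NEWMGR LIB1 srctype=server desttype=library device=(library-device)
   define drive LIB1 DRV1 online=no
   define path NEWMGR DRV1 srctype=server desttype=drive library=LIB1 device=(drive-device)
   checkin libvolume LIB1 search=yes status=scratch checklabel=barcode

   /* on each existing library client */
   update library LIB1 primarylibmanager=NEWMGR

   /* on the old library manager, after deleting its old definitions */
   define library LIB1 libtype=shared primarylibmanager=NEWMGR

   /* on every library client, including the old manager */
   audit library LIB1 checklabel=barcode

   /* back on the new manager, once the audits finish */
   update drive LIB1 DRV1 online=yes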

The audit on each client will set tapes holding that client's data to
private status, owned by the client. If you're paranoid about this,
check the volumes in as private, owned by the new library manager, and
make a note of the scratch volumes known to the old library manager
before deleting its library definition (query libvol). The audit will
then update ownership to the correct client; you can manually set the
remaining noted volumes back to scratch, and treat anything still owned
by the new library manager (assuming it owned no volumes beforehand) as
potentially orphaned. We found some 30 tapes that should have been
returned to scratch by the clients, but the library manager had never
updated them for some reason, so they were still marked as private.

As for the 6/8 character volume label: I can't speak for a 3583, but
we're using a pair of 3584 libraries. In the web-based management
system, there's a section under "Library" for "Logical Libraries"; have
a look at the "modify volser reporting" option - it might help. Note
that this will likely affect *all* volumes, so if you have data
stored on volumes with a mix of 8- and 6-character volume serial
numbers, you're in trouble ...

Hope this helps.


Re: TSM vs. Legato Networker Comparison

2007-07-25 Thread Stuart Lamble

On 26/07/2007, at 2:54 AM, Schneider, John wrote:


Greetings,
We have been a TSM shop for many years, but EMC came to our
management with a proposal to replace our TSM licenses with Legato
Networker, at a better price than what we are paying for TSM today.
This came right on the heels of paying our large TSM license bill, and
so it got management's attention.
We have an infrastructure of 15 TSM servers and about 1000
clients, so this would be a large and painful migration.  It would
also
require a great deal of new hardware and consultant costs during the
migration, which would detract from the cost savings.
So instead of jumping from one backup product to another based
on price alone, we have been asked to do an evaluation between the two
products.  Do any of you have any feature comparisons between the two
products that would give me a head start?


Funny you say this. Monash was a Legato (now owned by EMC) shop until
the TSM migration around 3-4 years ago; I suspect that a large number
of the problems that were perceived to be Networker's fault were
actually the fault of the aging DLT silos and drives that underlay
Networker. I still have fond memories of those silos; they gave me a
great deal of callout pay every time they had a stuck cartridge or
similar. :-)

I also suspect that the greater reliability we've had since putting
in TSM is more because we also got in new tape silos (LTO2, now half
LTO2 and half LTO3, and soon to be half LTO3 and half LTO4) - if we'd
stuck with the DLT silos, we'd still be in a world of pain,
regardless of the software.

There are plusses and minuses to both products. Some points to consider:

  * Networker uses the traditional full plus incrementals, or dump
levels system. Monash used a pattern of full once a month;
incremental every other day; and a dump level interwoven - so, for
example, it might go full, incremental, level 8, incremental, level
7, incremental, level 9, level 2, incremental, level 8, incremental,
etc. - the idea being to minimise the number of backups needed to
restore a system.
  * Networker indexes are somewhat analogous to the TSM database. In
theory, you can scan each tape to rebuild the indexes if they're
lost; in practice, if you lose the indexes, you're pretty much dead -
there's just too much data to scan if the system is more than
moderately sized. Yes, Networker backs up the indexes each day. :)
  * At least the versions of Networker we used (up to 7.x) don't
support the idea of staging to disk - everything goes directly to
tape. However, data streams from multiple clients are multiplexed
onto tape to get the write speeds up. This is good for backups, but
does make recovery slower (since the data read back will include a lot
of data belonging to other clients).
  * No more reclamation or copy pools to deal with (because of the
traditional full/incremental/dump level system). So the burden placed
on the tape drives is probably going to be significantly lower
(although you will be backing up more data each night than you would
with TSM.)
  * I don't think Networker has anything analogous to TSM's scratch
pool: volumes belong to a particular pool of tapes, and there's no
shuffling between pools. So if the standard pool has a hundred tapes
available for use, but the database pool is out of tapes and needs
one more, you need to intervene manually. This *may* have been
because of the way we configured Networker, though, and it may also
have changed in the interim. Note that you *have* to have a separate
pool of tapes for index backups.

My honest assessment mirrors that of the other people who have
replied: use this as an opportunity to negotiate better pricing from
IBM, and point out to the powers that be that there are risks
involved with moving to a different backup product. There's nothing
wrong with Networker, it's a good system, but you aren't familiar
with it; it takes time with any new product to learn the tricks of
the trade. It's only in the past year or two that we've started to
feel more competent with TSM, as we've found and dealt with problems
in the production system which never showed up (and would never show
up) in the smaller scale proof of concept.

You also should note that it took Monash a couple of years to finish
the migration from Networker to TSM; I would expect a migration in
the other direction would take at least a year. I definitely would
not advise a dramatic cut-over - do a small number of servers at a
time to make sure you're not pushing the server too hard (and
besides, you want to stagger the full backups so they don't all take
place on the same day ...)

Oh, one other point that comes directly from Monash's experience with
Networker (assuming you do go down that path): we had a number of
large servers (mail in particular) that would take a very long time
to do a complete full backup. We ended up setting Networker up to
stagger the full backups on their filesystems: system filesystems on
day 1; mailbox 

Dealing with defunct filespaces.

2007-07-13 Thread Stuart Lamble

Hi all.

Whilst investigating something else, we discovered a number of nodes
that have old filespaces still stored within TSM - eg:

  Node Name: (node name)
 Filespace Name: /data
 Hexadecimal Filespace Name:
   FSID: 4
   Platform: SUN SOLARIS
 Filespace Type: UFS
  Is Filespace Unicode?: No
  Capacity (MB): 129,733.3
   Pct Util: 92.1
Last Backup Start Date/Time: 06/09/05   20:03:56
 Days Since Last Backup Started: 764
   Last Backup Completion Date/Time: 06/09/05   20:05:16
   Days Since Last Backup Completed: 764
Last Full NAS Image Backup Completion Date/Time:
Days Since Last Full NAS Image Backup Completed:

  Node Name: (node name)
 Filespace Name: /Z/oracle
 Hexadecimal Filespace Name:
   FSID: 12
   Platform: SUN SOLARIS
 Filespace Type: UFS
  Is Filespace Unicode?: No
  Capacity (MB): 119,642.2
   Pct Util: 31.5
Last Backup Start Date/Time: 08/26/05   01:03:08
 Days Since Last Backup Started: 686
   Last Backup Completion Date/Time: 08/26/05   01:14:01
   Days Since Last Backup Completed: 686
Last Full NAS Image Backup Completion Date/Time:
Days Since Last Full NAS Image Backup Completed:

  Node Name: (node name)
 Filespace Name: /mnt
 Hexadecimal Filespace Name:
   FSID: 15
   Platform: SUN SOLARIS
 Filespace Type: UFS
  Is Filespace Unicode?: No
  Capacity (MB): 120,992.9
   Pct Util: 55.8
Last Backup Start Date/Time: 01/26/06   20:05:15
 Days Since Last Backup Started: 533
   Last Backup Completion Date/Time: 01/26/06   20:06:34
   Days Since Last Backup Completed: 533
Last Full NAS Image Backup Completion Date/Time:
Days Since Last Full NAS Image Backup Completed:


These are all filesystems which existed at some time in the past, but
which were removed as part of an application upgrade (or system
rebuild, or ...), and hence no longer exist. It seems that TSM takes
the attitude of "if I can't see the filesystem, I'll not do anything
about marking files in that filesystem inactive", so the data never
expires. I can understand the reasoning behind this approach, but it
does mean that there's a large amount of data floating around that is
no longer needed (a quick and dirty estimate says around 83 TB across
primary and copy pools, although some of that needs to stay).

A delete filespace will clear them up quickly, obviously, but there's
a twist: how can we identify filesystems like this, short of going
around to each client node and doing a df or equivalent? Searching
the filespaces table gives us some 600 filespaces all up; I *know*
that several of these have to stay - eg, image backups don't update
the backup_end timestamp, and there are some filespaces that are
backed up exclusively with image backups.

At the moment, the best I can come up with is to:
  * use a SELECT statement on the filespaces table to get a first
cut (select node_name, filespace_name, filespace_id from filespaces
where backup_end < current_timestamp - N days);
  * use QUERY OCCUPANCY on each of the filespaces mentioned in the
first cut; if the total occupied space is below some threshold,
ignore it as not being worth the effort;
  * use a SELECT statement on the backups table to confirm that no
backups have come through in the past N days (select 1 from db where
exists (select object_id from backups where node_name='whatever' and
filespace_id=whatever and state='ACTIVE_VERSION' and current_timestamp
< backup_date + 90 days) -- I use exists to try to minimise the effort
TSM needs to put into the query; I also have the active_version check
in there for the same reason (if there are only inactive versions,
they'll drop off the radar anyway in due course). Hopefully TSM's SQL
execution is optimised to stop in this case when it finds one match
rather than trying to find all matches ...) A rough script tying the
first and third steps together is sketched below.
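
Something like this, say - the admin ID, password and the 90-day cutoff
are placeholders, the dsmadmc flags (-dataonly=yes, -tab) may vary with
the client level, and it's entirely untested:

   #!/bin/sh
   # First cut: filespaces with no backup_end in the last ~90 days.
   dsmadmc -id=(admin) -password=(secret) -dataonly=yes -tab \
     "select node_name, filespace_name, filespace_id from filespaces \
      where backup_end < current_timestamp - 90 days" |
   while read node fs fsid; do
     # Any active backup object newer than the cutoff? If not, flag it.
     dsmadmc -id=(admin) -password=(secret) -dataonly=yes \
       "select count(*) from backups where node_name='$node' \
        and filespace_id=$fsid and state='ACTIVE_VERSION' \
        and backup_date > current_timestamp - 90 days" \
       | grep -q '[1-9]' || echo "candidate: $node $fs"
   done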

Does anybody have any better ideas? Unfortunately, because of the
nature of Monash's organisation, simply having central policies
saying "you must do X when shuffling filesystems around" won't cut it
(and let's be honest

Re: Mixing drive types in logical library 3584

2007-05-30 Thread Stuart Lamble

On 31/05/2007, at 3:38 AM, Chris McKay wrote:


Hi all,

I have been told by IBM that fibre LTO2 drives are no longer
available. We
wish to expand the number of drives in our 3584 library from 2 to
5. The
current 2 drives are fibre LTO2 drives, would it pose a problem by
adding
an additional 3 LTO3 drives to that logical library? Is it possible
under
Windows??  I realize I will need to continue to use LTO2 media, as the
existing LTO2 drives would not be compatible with LTO3 media. Our TSM
server is running on Windows 2003.


Compatibility with LTO media is straightforward: one generation back
for write; two generations back for read. So an LTO3 drive is a
perfectly reasonable way to extend your LTO2 capabilities, as well as
giving you the ability to migrate to LTO3 down the road.

At Monash, we're running one silo on LTO3, one on LTO2. We're looking
at upgrading the LTO2 silo to LTO4, and have checked the
compatibility; our information is that TSM is smart enough to mount
LTO media only in drives appropriate to the operation (so LTO2 media
will only mount in an LTO4 drive if it's being read, not written;
LTO4 media will never be mounted in an LTO2 drive; etc.) So if you
were to add LTO3 media to that library down the road, TSM would
access it only through the LTO3 drives (and if all the LTO3 drives
were being used with LTO2 media while an LTO2 drive sits idle, too
bad.) You may want to double check this, though, if it's a concern
for you down the road.

I can't speak for how far this advice extends into Windows - we're
running TSM on Solaris - but it's a starting point. My expectation
would be that there would be no problem as long as the LTO3 drive is
compatible with the drivers on the Windows system.


Re: Continuity of Data Backup

2007-05-17 Thread Stuart Lamble

On 18/05/2007, at 5:01 AM, Avy Wong wrote:


Hello,
  We have two instances TSM_A and TSM_B, Currently we want to
move some
nodes  from TSM_A to TSM_B.  Once the nodes are moved to TSM_B, the
filespaces will be backed up over TSM_B. Is there a way to keep
continuity
of the backed up data ? Do I need to move the backed up data of
those nodes
from TSM_A to TSM_B? How do I go about it? Thank you.


The easiest way to go about this would be an EXPORT NODE between the
instances, assuming a high speed link between them. If the link is
slow (slower than the streaming read rate of the tape drives), or if
the importing server is unable to write as fast as the tape drives can
read, you'd probably be better off doing the export to a FILE storage
pool, copying the exported files over to the other server, and
importing them (tedious in the extreme, and it needs a large chunk of
temporary disk space, but ... well ...)
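
For the direct case, the command boils down to something like this
(node, server, address and port are placeholders, and the two instances
need server-to-server definitions in place first - a matching define
server on each side):

   define server TSM_B serverpassword=(password) hladdress=(address) lladdress=(port)
   export node SOMENODE filedata=all toserver=TSM_B

FILEDATA=ALL drags both backup and archive data across; there are
narrower values (e.g. BACKUPACTIVE) if you only want the active backup
versions.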

Alternatively, if you don't have any important archived data for the
nodes in question, you can just cut them over to the new instance,
take the hit of the one-time full incremental backup, and remove them from the
original instance at some appropriate time, according to the
retention policies of the company. This assumes you have enough tape
storage space to take the short-term hit.

Cheers.


Moving the TSM configuration manager

2007-05-14 Thread Stuart Lamble

A bit of background: to keep the database size down, we have several
TSM instances. One of these instances acts as a configuration
manager; the others are all configuration clients (so to speak).

For various reasons, we intend to create a new TSM instance dedicated
solely to library management and configuration management - no TSM
clients will backup directly to that instance. The major stumbling
block I'm having right now is with policy domains; such a change means:

  * Copying policy domains to temporary PDs.
  * Moving clients from their current PDs to the temporary PDs.
  * Re-directing the configuration subscription to the new TSM
instance.
  * Moving clients from the temporary PDs back to the permanent PDs.
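
In command terms, I'm picturing something like this on each managed
instance (domain, policy set, profile and server names are all
invented, and I'd want to confirm that COPY DOMAIN is happy being
pointed at a managed domain before trusting any of it):

   copy domain PROD_PD TEMP_PD
   activate policyset TEMP_PD (policy-set-name)
   update node (nodename) domain=TEMP_PD      (repeat per node, or script it)
   delete subscription (profile-name)
   define subscription (profile-name) server=NEWCONFIGMGR
   update node (nodename) domain=PROD_PD      (once the managed copy of PROD_PD arrives)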

I'm aware of the implications this has for scheduling, and am not too
concerned about it (just recreate the schedules, and all will be
happy). I'm more concerned about data retention - given that the
policy domains will have the same management classes, etc., after the
cutover, will it affect retention in any way? I'm especially
concerned about archived data; backup data will rebind as desired
with the next backup if there's any difference, so I doubt that will
be an issue.

The documentation is not as clear on this matter as I would like. Any
and all assistance would be gratefully received.

Thanks,

Stuart.


Re: More on fssnap on Solaris

2005-08-10 Thread Stuart Lamble

On 11/08/2005, at 6:21 AM, Mark D. Rodriguez wrote:


Stuart,

I am not a Solaris expert, so please bear with me on my suggestions.

First of all, you might try backing the filesystem up by using a
virtualmountpoint; this option has worked for me in the past on
unsupported file systems in Linux.

Now, in regards to your image backups.  Again, I am more of a Linux guy
than a Solaris one, but in Linux there is a mknod command that lets you
create block or character special files, i.e. devices.  Does Solaris
have a similar command?  If so why not just create unique device names
that have the same device major and minor numbers as the fssnap
devices.  They would be effectively the same device!  Then you could
backup using those unique names and everything would then be kept
straight.  It would require some scripting but it could be done.


I do hear where you're coming from, and yes, in theory it should be
doable ... the main hassle, though, is that this would mean a fair
amount of fiddling. I'd prefer to keep things as simple as possible,
on the basis that that way, there's less to go wrong.

Virtual mount points aren't really what I'm after -- yes, fssnap will
probably let me do an incremental backup of a point in time, but I
want to do an image backup, not a file-level backup. I do apologise
if my comments weren't clear on that point. So whilst your comments
are good, they don't quite get to where I want to go.

Thanks for the suggestions nonetheless; they might well be of use in
other applications. Just not in this one, alas.


More on fssnap on Solaris

2005-08-08 Thread Stuart Lamble

I've done some experimentation. So far, I've found that TSM will not
backup a filesystem image for a fs mounted from /dev/fssnap/N, but it
will backup the image as a raw image. This then raises the question
of consistency -- making sure that a given filesystem is backed up
with the same name every time.
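
For the record, the sequence I've been testing looks roughly like this
(the mount point and backing-store path are just examples):

   fssnap -F ufs -o bs=/var/tmp/snap_bs /data
       (prints the snapshot device, say /dev/fssnap/0)
   dsmc backup image /dev/rfssnap/0
   fssnap -d /data

i.e. back up the raw snapshot device, then tear the snapshot down
again.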

As an example, suppose I have three filesystems (A, B, and C) that I
want to backup as an image using fssnap. If I create a snapshot for
A, it will be given the snapshot device /dev/fssnap/0. If I then back
it up and delete the snapshot, and then create another snapshot for
filesystem B, filesystem B will also get the fssnap device /dev/
fssnap/0.

Conversely, if I create a snapshot for filesystems A and B
concurrently, they will be given different numbers. (0 and 1).
Subsequent creations of snapshots will give them the same numbers as
they were given in the first instance (so if filesystem A has no
snapshot, and I ask for B to have a snapshot created, it will be
given /dev/fssnap/1 -- not 0.) I have not experimented beyond two
simultaneous filesystems as yet.

Based upon my experimentation and reading of the manuals, I get the
feeling that there is no way to tell TSM, "Back up this device, but
name it as /foo/bar rather than /baz/bam." Can anybody shed any light
keeping it clear enough that we can easily determine which device is
associated with a given filesystem, and that this mapping is kept
consistent? Am I wrong? If not, I fear that this technique will be
rendered useless to us.

There are other options that I can pursue, as I've indicated before,
and I'm about to start on them; I was hoping that the collected TSM
gurus might be able to point me in another direction that I haven't
yet noticed.

The other option, of course, is to ask IBM to support fssnap in a
future release, but since that's not likely to happen before 5.4 (at
a guess) ...

Thanks for all advice that you can give,

Stuart.


Using fssnap for image backups on Solaris?

2005-08-04 Thread Stuart Lamble

Just wondering: there are a couple of instances within the university
where a system has large filesystems with a large number of small
files. (eg: our mail spools have a single file for each email
message ...) This is, obviously, a worst-case scenario for both
backups and restores. I notice in the documentation for the 5.1 TSM
client that snapshots are supported on Linux-x86, with the Linux
Logical Volume Manager; unfortunately, these systems are running
Solaris, not Linux.

In short: fssnap creates a new device entry, which it prints to
stdout, named /dev/fssnap/N (N being a number) for the block device
(raw device is /dev/rfssnap/N). Reads from this device are redirected
to a combination of the original filesystem and a backing store
(which is where the old data is captured before being overwritten; it
is defined during the execution of fssnap). Backups should then
proceed from this device image, rather than the original volume. I'm
guessing that this image is a consistent filesystem as at the time
of the fssnap command (even if not, it would still be preferable to
backing up a raw, mounted filesystem).

The ideal solution would be to use TSM's image backup facility in
conjunction with fssnap. If this is not possible, an alternative
would be to forcibly backup that device (with all the headaches that
likely would then ensue should the device name change between
invocations of fssnap). I'm not sure at this stage whether we would
be using image backups in conjunction with standard incrementals, but
I suspect that we will in most cases; there's at least one case where
this will not be useful, though.

So the questions:
 * Does anybody have any experience with this type of backup, and
have any suggestions regarding pitfalls, etc.?
 * Is there any likelihood of TSM supporting fssnap for image
backups in the near future?
 * Any general advice from those who haven't dealt with this
specific scenario before, but who do have some idea of the problems
we're likely to encounter with image backups?
 * Comments relating to the possibility that the raw device fssnap
provides may differ from invocation to invocation?
 * Suggestions for alternative backup methods? We're also looking at
using the FastT900's snapshot capability, which I imagine would have
broadly similar problems to using fssnap.

Many thanks for any and all tips and pointers. A quick Google doesn't
bring up much, and none of it seems to be useful. :(


Archives missing from archive list.

2005-05-08 Thread Stuart Lamble
Greetings all. This one has me baffled, and whilst I'd normally spend
more time investigating, I'm going on leave tomorrow for three days,
and other parties view this as sufficiently urgent that I want to
resolve it if possible before then. Basically, we have a number of
applications based around Oracle databases (eg: SAP); these are
backed up using the archive facility of TSM (yes, I'm aware of the
inherent contradiction in that statement.)

As an example of the problem, I have logs from the client that claim
that the command:

dsmc archive -description="CDUT_WEEKLY_07/05/05 01:00" -compression=no -archmc=DDB_WEEKLY_8WR_1 /space/CDUT/data/sis_d4.dbf

completed successfully. Investigating the activity log for the
appropriate time also indicates that the archive completed
successfully. However, if I then run, on the archiving client, the
command:

dsmc q archive / -subdir=yes | grep CDUT_WEEKLY_07

I get back nothing -- there are entries for March or so, but
definitely not for May. In comparison, if I run the command

select description from archive where node_name='(source host)'

and look through the output, I see "CDUT_WEEKLY_07/05/05 01:00" in the
list.

All of this combined suggests to me that the data is there; it's just
that the query archive command, for some unknown reason, isn't
reporting it.

Anybody have any suggestions on what to do next?

Many thanks in advance,

Stuart.


Re: Archives missing from archive list.

2005-05-08 Thread Stuart Lamble
On 09/05/2005, at 1:34 PM, Ian Hobbs wrote:

Make sure you are doing the q archive as the same userid (oracle
perhaps) that performed the archive in the first place.

Nice try, but I get back the exact same results: absolutely no
mention of the (for instance) "CDUT_WEEKLY_07/05/05 01:00" archive,
even when the command is run as oracle. In fact, to the best of my
knowledge, I get the same results as root as when I'm logged in (via
su) as oracle. Unless using su to access the account in question
makes a difference, but I would be *extremely* surprised if that were
so.

If that was all it was, I wouldn't have been prompted to look into
this in the first place. :) My thanks nonetheless.
[snippage]


Re: Archives missing from archive list.

2005-05-08 Thread Stuart Lamble
On 09/05/2005, at 2:35 PM, Ian Hobbs wrote:

This could be a really long shot. Try

dsmc q archive "{insert the source filespace name here}/*" -subdir=yes | grep CDUT_WEEKLY_07

It appears that the request is for any archive off the root
filespace. Possible. I'm away from a unix system at the moment, and
was only able to check using my home windows server. I've been getting
varying results that don't make much sense. For example, dsmc q ar *
-su=yes shows me all archives from my C drive, but not my E drive, but
if I do a q ar e:\* -su=yes then I see the E drive archives. H

Bingo. Thank you *very* much for that. I'll pass that on to the DBA;
hopefully that will calm her down. :-)
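
For the list archives, the equivalent on the Unix client would
presumably be along these lines (adjust the filespace to whatever
actually holds the data):

   dsmc q archive "/space/*" -subdir=yes | grep CDUT_WEEKLY_07

i.e. name the filespace explicitly instead of just querying from /.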
It would be nice to know why this is happening, but in the absence of
that knowledge, I'll take any solution I can grab. Especially since
it seems to give some of the older archives from the same
filesystem... *shrugs*
Again, many, many thanks.
Stuart.


Re: Export feasibility

2005-03-10 Thread Stuart Lamble
On 11/03/2005, at 6:25 AM, Ragnar Sundblad wrote:
--On den 10 mars 2005 16:18 +1100 Stuart Lamble
[EMAIL PROTECTED] wrote:
On 10/03/2005, at 8:15 AM, Ragnar Sundblad wrote:
[snipping]

I think that the most logical way to accomplish this with TSM
is to do a complete export like
export server filedata=allactive ...

What's wrong with using backup sets? The end result is the same -- all
currently active data is stored on tape and can be moved offsite,
without needing any maintenance from the main TSM server.

Backup sets may be fine too. There are two issues with them that
made me think that they were not as suitable for this as an export:
- There seems to be no way to generate a backup set that contains
multiple nodes, so I take it that if I don't want one tape
per node (hundreds), I will have to generate the backup sets to
disk and find some other way to pack them on tapes.
Am I wrong here?

Not as far as I can tell, alas. The other is probably not as important
-- I agree that symmetry is nice to have, but for DR purposes, you can
probably live without it.

[more snipping]

I was just thinking that there might be some other resource
that that server could get short of, like database rollback or
something, if it was both doing an export for days and at the
same time was backing up. I hope that is not a problem, and
especially it shouldn't be if I use backup sets instead.

On that, I'll have to defer to the more knowledgeable crowd on the
mailing list; I honestly don't know. One option you do have is to
export in batches -- say, a quarter of the nodes in one batch; another
quarter in the next batch; and so on, giving the server time to do any
maintenance it needs but can't perform whilst an export is occurring.
I'd be surprised, though, if there was anything of significance. Not
saying there won't be any, just that it would surprise me.

If backup sets (see the
generate backupset command in the admin reference manual) fill your
needs, my advice would be to make use of them, rather than using the
rather fragile option of server data exports.

I very much agree, simple is good! :-)

It is interesting, but sad, to hear that you too find exports
fragile. May I ask what the problems typically are?

I've found, for example, that if the server is unable to read the data
it wants off the primary tape, it doesn't appear to revert to the
backup copy to complete the export; rather, the export fails. I haven't
chased this up, as I had far too many other (rather urgent) matters on
my hands when it occurred, so there may be other factors at play -- but
that's one example.

There are a couple of other things that may or may not be of concern;
again, I haven't spent as much time as I should verifying all the ins
and outs, so without knowing, I'd prefer to keep my mouth shut on them
lest I be seen as a fool. :-)

If I can do what I want with backup sets instead, that is
probably a better way.

One suggestion I do have would be to test. If you can afford to have
the backup system in a potentially not-working state for a period of
time, run some tests and see what happens. Try to make it break as much
as you can; take tapes out of the silo (telling TSM about it, of
course) to see how it copes; things like that. You'll end up being much
more secure in what you're doing if you know how it can break, and how
to recover if it does break.

All this, of course, is with the understanding that, more likely than
not, you'll not be in a position to do any of this. :-(

Hope this is of help in some small way.


Re: Export feasibility

2005-03-09 Thread Stuart Lamble
On 10/03/2005, at 8:15 AM, Ragnar Sundblad wrote:
We don't want to give up storing away a complete snapshot of
our systems off site every few months, over time maybe reusing
the off site tapes so that we finally save a snapshot a year.
I think that the most logical way to accomplish this with TSM
is to do a complete export like
export server filedata=allactive ...

What's wrong with using backup sets? The end result is the same -- all
currently active data is stored on tape and can be moved offsite,
without needing any maintenance from the main TSM server.

Given LTO 3 with a maximum data speed of 50 MB/s uncompressed,
a moderate guess (I hope) for the data transfer rate would be
30 MB/s, which would give 93 hours to write it down. Given that
a tape has a capacity of 400 GB uncompressed, it would take
at the most 25 tapes.
We would obviously want to be able to use the backup server
as usual while doing the export.

Then you would need to make sure that your server has enough capacity,
in terms of tape drives (and connectivity to the tape drives), network
bandwidth (to cover both the export and the backups), and such like, to
cope with both the export (or backup set generation) and regular
backups simultaneously.

From my point of view, being in the middle of fiddling around with
exports (server to server in my case) for various last ditch DR
systems, I'd suggest keeping it simple. If backup sets (see the
generate backupset command in the admin reference manual) fill your
needs, my advice would be to make use of them, rather than using the
rather fragile option of server data exports.
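
For reference, the basic shape of the backup set command is (node, set
name and device class are placeholders):

   generate backupset SOMENODE yearly_set * devclass=LTOCLASS retention=365 scratch=yes

with the caveat from earlier in the thread that it's one node per set,
so a few hundred nodes means a few hundred sets (or staging them to a
FILE device class and packing the results onto tape yourself).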


Export node direct to managed server problems.

2005-01-24 Thread Stuart Lamble
Hey guys.

We have four TSM servers (long story, don't ask) across two sites,
set up with one as a config manager, the other three all subscribing to
the config pushed out by the manager. For various DR reasons, we're
wanting to do data exports from those four servers to a fifth server,
located at a completely different site to any of the four main servers.
These exports are for a limited subset of the servers being backed up,
and there's no easy way to separate them from the not-to-be-exported
hosts in terms of policy domains, storage pools, etc. At the moment,
the intent is to use the export node command to push the data across
directly to the fifth TSM server.

I've defined the server links and the requisite administrators. I've
also -- by way of experimentation as much as anything else -- set up
the fifth server to subscribe to the same config pushed to the other
servers. I'm now finding that, whenever I try to export a server
directly to this fifth server, I get a bunch of error messages in the
logs. Some of them are relatively spurious[1]; others, however, are of
concern[2].

I've had similar problems previously, which were rectified by changing
the version of the TSM server software running on each host to match
(they're all running version 5, release 2, level 3.2 now). I'm pretty
sure these problems only started after I made the fifth server a
managed server, rather than being mostly independent.

So. Questions.
  * Can a server to which data is being exported directly over the
network be managed, or must it be completely independent?
  * If it can be managed, how can the spurious errors about managed
objects be suppressed?
  * If it can only be managed for certain config items, what are the
restrictions on what can and cannot be managed?
  * If it _can't_ be managed, I'm curious to know why not. (This one
doesn't really matter beyond satisfying my curiosity :)
  * Presumably, there is no problem with making the server link
information for the importing server managed -- that the issue is
purely with making the importing server managed by one of the exporting
servers?
Having managed profiles would make a number of things a lot easier, but
if I have to live without them, so be it. Also, if there are any trips
or traps that would be useful to know, I'd greatly appreciate knowing
them. :)

If there are any details that need to be disclosed for this one to be
tracked down, let me know; I'm well aware that the above is rather
vague, but it's hard to know when you're just setting out what might be
relevant and what might not be.

Thanks for any and all advice you can give,

Stuart.
[1] For example:

01/25/2005 11:14:45 AM  ANR0646I IMPORT (from Server TSM1): DEFINE CLIENTOPT: Command cannot be executed - option set NW is a managed object. (PROCESS: 27)
01/25/2005 11:14:45 AM  ANR0730E IMPORT (from Server TSM1): Internal error from command 'DEFINE CLIENTOPT NW INCLEXCL exclude sys:\vol$log.err FORCE=Yes SEQNUMBER=11'. (PROCESS: 27)
01/25/2005 11:14:45 AM  ANR3229E DEFINE CLIENTOPT: Command cannot be executed - option set NW is a managed object. (SESSION: 4560, PROCESS: 27)

[2] Prime case in point:

01/25/2005 11:14:45 AM  ANR0730E IMPORT (from Server TSM1): Internal error from command 'REGISTER NODE SOMENODENAME DOMAIN=NW_PD CONTACT= COMPRESSION=CLIENT ARCHDELETE=YES BACKDELETE=NO URL=SOMENODENAME.its.monash.edu.au:2002 CLOPTSET=NW FILEAGGR=YES AUTOFSRENAME=NO VALIDATEPROTOCOL=NO DATAWRITEPATH=ANY DATAREADPATH=ANY SESSIONINIT=CLIENTORSERVER TYPE=CLIENT'. (PROCESS: 27)
01/25/2005 11:14:45 AM  ANR0728E IMPORT (from Server TSM1): Processing terminated abnormally - internal error. (PROCESS: 27)
01/25/2005 11:14:45 AM  ANR0620I IMPORT (from Server TSM1): Copied 0 domain(s). (PROCESS: 27)
01/25/2005 11:14:45 AM  ANR0621I IMPORT (from Server TSM1): Copied 0 policy sets. (PROCESS: 27)
01/25/2005 11:14:45 AM  ANR0622I IMPORT (from Server TSM1): Copied 0 management classes. (PROCESS: 27)
01/25/2005 11:14:45 AM  ANR0623I IMPORT (from Server TSM1): Copied 0 copy groups. (PROCESS: 27)
01/25/2005 11:14:45 AM  ANR0624I IMPORT (from Server TSM1): Copied 0 schedules. (PROCESS: 27)
01/25/2005 11:14:45 AM  ANR0625I IMPORT (from Server TSM1): Copied 0 administrators. (PROCESS: 27)
01/25/2005 11:14:45 AM  ANR0891I IMPORT (from Server TSM1): Copied 0 optionset definitions. (PROCESS: 27)
01/25/2005 11:14:45 AM  ANR0626I IMPORT (from Server TSM1): Copied 0 node definitions. (PROCESS: 27)
01/25/2005 11:14:45 AM  ANR0627I IMPORT (from Server TSM1): Copied 0 file spaces 0 [...]

Re: Export node direct to managed server problems.

2005-01-24 Thread Stuart Lamble
On 25/01/2005, at 11:31 AM, I wrote:
I've defined the server links and the requisite administrators. I've
also -- by way of experimentation as much as anything else -- set up
the fifth server to subscribe to the same config pushed to the other
servers. I'm now finding that, whenever I try to export a server
directly to this fifth server, I get a bunch of error messages in the
logs. Some of them are relatively spurious[1]; others, however, are of
concern[2].

Does anybody know where I can find a supplier of soft brick walls?
*sighs* Looks like the problem's solved: I forgot to activate the
policy sets on the fifth server. Talk about your trivial fixes ...
Please excuse me; I have some banging of heads (mostly my own) to do.


Selectively duplicating client data across servers

2004-08-25 Thread Stuart Lamble
Hey ho. Here's the skinny. We will, eventually, have a number of
clients backing up to a TSM server on a regular basis (we're still
setting up the SAN and other ancillary things that are needed to
support the TSM server). Some of them will be filesystem backups;
others will be database backups (which, if I understand correctly, are
most likely to be seen as archives rather than backups as such). The
setup involves two sites, A and B, which have multiple gigabit fibre
connections between them (so they're effectively on the same LAN; the
only real difference is a very small amount of additional latency.)
Systems at site A will backup to a server at site B, and vice versa.

The project I'm working on involves a third site; call it site C. The
connectivity from sites A and B to site C is significantly poorer than
that between A and B, largely because site C is significantly remote to
A and B (whilst A and B are within a few km of each other), precluding
running of fibre to it. (Not sure what the current connections are;
that's not my area.) The idea is that if things go boom in a major way,
and we lose everything at _both_ site A and site B, we want to have
copies of the university's critical data (student records, SAP, that
sort of thing) available for restore. Maybe the data will be a little
old, but better than nothing.

So the idea is this. The servers holding the critical data backup to
their backup server as normal. Once every so often (once a week, once a
fortnight, once a month...), we want to take the data off the _backup
server_, and copy it to another TSM server at site C. We want to do
this only for those critical servers, not for all servers. The data may
be in the form of either archives or filesystem backups, or some
combination of the two. We _don't_ want to be running extra backups for
the servers in question to provide this redundancy.

The two solutions I've come up with involve data export, and copy pools
(via virtual volumes). The problem is, both of those operate at the
storage pool level; there's no way that I can see to say "copy/export
only the data for _this_ client, and no others". It's preferable
that we not have to create separate storage pools (all the way down to
the tape level) for these systems just so we can do this -- we'd prefer
to have one disk pool and one tape pool for the whole shebang if
possible. Backup sets may be doable, but I'm a little uncertain about
how they'd go with virtual volumes, and also whether they'd cover
archive data sets.

So the question is: is there any way we can say to TSM, "Copy this
client's data (backup and archive) to that server over there, ignoring
all other client data in that storage pool"? Or am I smoking crack
(again)?

Any pointers or suggestions on where to look would be very gratefully
received. For me, this isn't so much a learning curve as a learning
cliff. :) If it's relevant, the clients in question are mostly Solaris,
and we'll be running Tivoli Storage Manager 5.2 on Solaris servers.

Many, many thanks for any tips. I'm coming from a background with
Legato Networker, and I'm still trying to get my head around some of
the more interesting aspects of TSM, learning as I go. :)
Cheers,
Stuart.


Re: Selectively duplicating client data across servers

2004-08-25 Thread Stuart Lamble
(Once more, this time with the _right_ From address. Sigh.)
On 26/08/2004, at 12:40 PM, Steven Pemberton wrote:
On Thursday 26 August 2004 09:19, Stuart Lamble wrote:
Hey ho. Here's the skinny. We will, eventually, have a number of
clients backing up to a TSM server on a regular basis (we're still
setting up the SAN and other ancillary things that are needed to
support the TSM server). Some of them will be filesystem backups;
others will be database backups (which, if I understand correctly, are
most likely to be seen as archives rather than backups as such). The
setup involves two sites, A and B, which have multiple gigabit fibre
connections between them (so they're effectively on the same LAN; the
only real difference is a very small amount of additional latency.)
Systems at site A will backup to a server at site B, and vice versa.
So, you have the following?
1/ Clients at A backup to TSM server at B.
2/ Clients at B backup to TSM server at A.

Yup, pretty much.

Where are you producing the copypool versions? Eg:
1/ Client A -> TSM B (primary) -> TSM A (copy) ?
2/ Client A -> TSM B (primary) -> TSM B (copy) ?
3/ What copy pools? :(

Oh, option three of course -- we want to save money on media! *ducks*
Seriously, though, it'll be option 2, is my understanding -- if site A
goes down, and some of the media at site B is dead, we'd like to still
be able to recover the data. If we lost two sets of media at the same
time, well, we're obviously not meant to have that data any more. (Cue
the story I heard whilst doing Legato Networker training: a site with
several copies of key data. Key data goes poof. First copy on tape is
bad. Second copy on tape is bad. They send for the offsite copy.
Courier manages to have an accident, and the tapes are ruined...)
(The scary thing is, there were a couple of people advocating no copy
pools for some of the clients. Thank God _that_ got shot down in short
order.)
[verbose description of the basic plan snipped in favour of Steven's
summary]

Something like this?
1/ Client A -> TSM B (primary) -> TSM A (copy) (all)
                               -> TSM C (copy/export) (critical only)

A healthy paranoia. :)

That's pretty much it. A more accurate picture would be:

Client A -> TSM B (primary/copy) (all) -> TSM C (copy/export) (critical)
Client B -> TSM A (primary/copy) (all) -> TSM C (copy/export) (critical)

And remember: just because I'm paranoid, it doesn't mean they're _not_
out to get me... ;)
[copying data from the original backup server to the remote site]

The two solutions I've come up with involve data export, and copy pools
(via virtual volumes). The problem is, both of those operate at the
storage pool level; there's no way to specify "copy/export only the
data for _this_ client, and no others" that I can see.

Actually, you can export node for individual hosts, but I'm not sure
if it's the best way to do what you're planning. However, export node
can specify a client, time range, backup and/or archive data, active
files only, and export/import directly via a server to server
connection.

Hm. Going to have to re-read the manual on that; I must have missed
that point. *flick flick flick* ... ok, I missed that point. Excuse me
whilst I carefully extract my foot from my mouth. :)
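
Something like this, then, once a fortnight or month (the node and
server names and the date are placeholders):

   export node CRITICALNODE filedata=all fromdate=(date) toserver=TSM_C

run for each of the critical nodes, with FROMDATE keeping the repeat
runs from dragging the whole history across every time.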

It's preferable
that we not have to create separate storage pools (all the way down to
the tape level) for these systems just so we can do this -- we'd prefer
to have one disk pool and one tape pool for the whole shebang if
possible.

I'd normally recommend that you DO create multiple storage pools, so
that you can better control the physical location of the backup data.
This can improve recovery performance by separating critical clients to
their own storage pools and tapes. With only one huge disk/tape storage
pool hierarchy each client's data will tend to fragment across a large
number of tapes (unless you use collocation, which may greatly reduce
tape efficiency instead).

Interesting point. Everybody here is an utter newbie when it comes to
TSM; we've done the initial training course (you should remember; IIRC,
you were the one taking the course I was on :) which is all fine and
dandy, but it doesn't really expose you to the little tricks of the
trade which come up when you're actually _using_ the product. :) (And
besides -- after too many months of not using the product because of
wrangling that's out of the hands of the techies, you tend to forget
the finer points that were covered on the course.) Still, I have a fair
amount of faith that TSM will do the job we need; it's more or less a
matter of what problems we run into along the way (and don't tell me we
won't run into problems -- we will; it's just a question of how severe
they are and how difficult to fix. With luck, they'll be less than what
we have with our current backup system.) We've already ruled out
collocation for the most part; I seem to recall an upcoming version of
TSM has a weaker form of collocation (along the lines of group