Re: Tape throughput troubleshooting?

2008-11-10 Thread Dan Foster
Hot Diggety! Richard Sims was rumored to have written:

 I would start with the netstat inspection outlined in the Backup
 performance topic of ADSM QuickFacts, which will quickly assess how
 network flow is doing in the actual event.  TCP window size may be
 involved in what you are seeing, known to be a particular factor over
 long distance communication.

I regularly push up to 980 Mbps sustained over a nearly 3,600 mile
(~5800 km) WAN, but usually about 175-200 Mbps for a typical workload.

Richard is correct: TCP window size tuning is *vital* for a high
bandwidth WAN. Don't leave home without it correctly tuned. :-)

For instance, according to Sun support working one of our cases, Solaris
is tuned for LAN performance out of the box.

It doesn't take a lot more to tune it for a WAN, but it does require some
careful notes, theory, measurement, and trial-and-error.

We also got a major boost in performance merely by enabling jumbo frames
on each end, and a smaller boost from using certain tunables to push the
NICs to their absolute maximum on Solaris.

I haven't done AIX perf tuning in so long that I'm a little rusty now, alas.

But for AIX, one tunes sliding windows by adjusting these 'no' options:

- tcp_sendspace
- tcp_recvspace

There are a few more 'no' options... rfc1323 is another key one to enable
(it turns on TCP window scaling, which windows above 64 KB depend on).

http://www.performancewiki.com/aix-tuning.html

Regardless of the sliding windows values chosen, strongly suggest
testing and retesting with various (power-of-2) values until the optimal
value is seen. E.g. 1024, 2048, ..., 65536, etc.
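
As a starting point for those trials, the usual yardstick is the
bandwidth-delay product (BDP): the window has to cover at least
bandwidth x round-trip time to keep the pipe full. Below is a rough Python
sketch of that arithmetic. The 1 Gbps link speed and the 60 ms round-trip
time are assumed figures for illustration only (not measurements from the
WAN described above), and the function names are made up for this example.

```python
def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes, rounded up (integer arithmetic)."""
    return -(-bandwidth_bps * rtt_ms // 8000)


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (n must be positive)."""
    return 1 << (n - 1).bit_length()


def candidate_windows(bandwidth_bps: int, rtt_ms: int, spread: int = 2):
    """Power-of-2 window sizes bracketing the BDP, for test/retest runs."""
    center = next_power_of_two(bdp_bytes(bandwidth_bps, rtt_ms))
    exp = center.bit_length() - 1
    return [1 << e for e in range(max(exp - spread, 0), exp + spread + 1)]


if __name__ == "__main__":
    # Assumed figures: 1 Gbps path, 60 ms round-trip time.
    bandwidth, rtt = 1_000_000_000, 60
    print("BDP:", bdp_bytes(bandwidth, rtt), "bytes")
    for window in candidate_windows(bandwidth, rtt):
        print("window size to try:", window)
```

The candidates are only a bracket for measurement, not a recommendation;
the value that actually wins still has to come out of testing.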

Also, the original poster may want to check with the network folks
(and/or upstream provider if appropriate) to make sure there's no
particular segment along the end-to-end network path that is of small
bandwidth. In my case, it's all 'in-house' so that makes it easier.

E.g. you're not going to see gigabit performance end-to-end if you have,
say, a DS3 (45 Mbps) circuit somewhere in between.
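
Put as arithmetic, the end-to-end ceiling is simply the slowest segment on
the path. A tiny Python sketch; the path below is hypothetical, and only the
DS3 figure (45 Mbps) comes from the text:

```python
def path_ceiling_mbps(segments):
    """Return (name, speed) of the slowest segment; throughput cannot exceed it."""
    return min(segments, key=lambda segment: segment[1])


if __name__ == "__main__":
    # Hypothetical path: gigabit LANs on both ends, a DS3 in the middle.
    path = [("site A LAN", 1000), ("DS3 circuit", 45), ("site B LAN", 1000)]
    name, speed = path_ceiling_mbps(path)
    print(f"bottleneck: {name} at {speed} Mbps")
```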

Also, before making changes, record before/after values as well as
results of measurements. Nothing worse than getting muddled in the perf
tuning process due to lack of careful record-keeping.

Some food for thought.

-Dan


Re: How do I get TDPO to send controlfiles to SBT_TAPE instead of DISK?

2008-04-23 Thread Dan Foster
Hot Diggety! Dan Foster was rumored to have written:
 My DBAs and I are completely baffled. How do I get TDPO to send
 controlfiles to SBT_TAPE instead of DISK?

Eventually resolved: needed to get *every* single config file on BOTH
hosts to match for a restore to the DR host, AND to disable
PASSWordaccess GENERATE.

After that, went swimmingly, and the DBAs not familiar with RMAN/TDPO
were impressed. We then set up a RMAN catalog db, which worked great, too.

-Dan


Re: How do I get TDPO to send controlfiles to SBT_TAPE instead of DISK?

2008-04-23 Thread Dan Foster
Hot Diggety! Steve Stackwick was rumored to have written:
 On Thu, Apr 10, 2008 at 3:26 PM, Dan Foster wrote:
   My setup:
 
   TSM server (Enterprise) 5.5 on AIX 5.2.
   TDPO 5.4.1 on Solaris 10/SPARC and Oracle 10gR2.
 

 Do you really have TSM 5.5 running on AIX 5.2? I wondered if that
 would work, and kind of thought it would, but Tivoli sez u need AIX
 5.3. Just wondering.

There's two kinds of 'work':

- 'probably technically works but you're on your own'
vs.
- '20 foot wall of flames will erupt if you attempt it'

IBM's statement in this particular case is more of the former,
presumably partly to reduce costs in regression testing (amongst many
other valid reasons). And, of course, if you hit a bug with such a
setup, they're unlikely to fix it or even give you the time of day.

The $65,536 question: Why did I put 5.5 on AIX 5.2?

Simple: I was a little too hasty in reading the min requirements, a rare
departure for my character. :) I always, with that one glaring exception,
stick to supported revs for everything.

Why not back out? A lengthy maint moratorium amongst other things in my
environment made it more practical to just stay put.

With that said:

# lslpp -l tivoli.tsm.server.aix5.rte64
  Fileset                      Level  State      Description
Path: /usr/lib/objrepos
  tivoli.tsm.server.aix5.rte64
                             5.5.0.0  COMMITTED  IBM Tivoli Storage Manager 64
                                                 bit Server Runtime

Path: /etc/objrepos
  tivoli.tsm.server.aix5.rte64
                             5.5.0.0  COMMITTED  IBM Tivoli Storage Manager 64
                                                 bit Server Runtime

# oslevel
5.2.0.0

(It's actually at 5.2 ML9 + some patches)

I haven't seen any issues so far, though I would not recommend doing
this to anyone except the truly desperate. It's by far better to stay
supported for the server side. You can usually get away with downrev
clients past the officially supported revs, but server [and support] is
too important to risk for most folks.

There's also the point that TSM server revs may depend on the device
support of the underlying AIX OS, particularly for newer devices.
In my case, I hadn't changed anything else, so got by with that.

I'm looking to bring the server up to a newer AIX version soon to
resolve this uncomfortable position. Last but not least, I've also
been duly thwacked by colleagues with multiple copies of the old ADSM
3.1 printed manuals to serve as a reminder to be more careful and
diligent in my reading. :) Sure deserved that.

-Dan


How do I get TDPO to send controlfiles to SBT_TAPE instead of DISK?

2008-04-10 Thread Dan Foster
My DBAs and I are completely baffled. How do I get TDPO to send
controlfiles to SBT_TAPE instead of DISK?

My setup:

TSM server (Enterprise) 5.5 on AIX 5.2.
TDPO 5.4.1 on Solaris 10/SPARC and Oracle 10gR2.

RMAN settings:

RMAN> show all;

RMAN configuration parameters are:
CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 7 DAYS;
CONFIGURE BACKUP OPTIMIZATION OFF;
CONFIGURE DEFAULT DEVICE TYPE TO 'SBT_TAPE';
CONFIGURE CONTROLFILE AUTOBACKUP ON;
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE SBT_TAPE TO '%F'; # default
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default
CONFIGURE DEVICE TYPE 'SBT_TAPE' PARALLELISM 3 BACKUP TYPE TO BACKUPSET;
CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE SBT_TAPE TO 1; # default
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE SBT_TAPE TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE CHANNEL DEVICE TYPE 'SBT_TAPE' PARMS  '';
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/opt/oracle10/product/10.2.0/db_1/dbs/snapcf_mydb.f'; # default

How we backup -- RMAN commands:

allocate channel t1 type 'SBT_TAPE' parms 
'ENV=(TDPO_OPTFILE=/opt/tivoli/tsm/client/oracle/bin64/tdpo.opt)';

allocate channel t2 type 'SBT_TAPE' parms 
'ENV=(TDPO_OPTFILE=/opt/tivoli/tsm/client/oracle/bin64/tdpo.opt)';

allocate channel t3 type 'SBT_TAPE' parms 
'ENV=(TDPO_OPTFILE=/opt/tivoli/tsm/client/oracle/bin64/tdpo.opt)';

sql 'alter system archive log current';

backup incremental level = ${LEVEL}
filesperset 5
format df_%d_%s_%p_%t.lv${LEVEL}
(database include current controlfile);

backup
filesperset 20
format ar_%d_%s_%p_%t.lv${LEVEL}
(archivelog all delete all input);

release channel t1;
release channel t2;
release channel t3;


tdpo.conf for the restore on a test box:

DSMI_ORC_CONFIG     /opt/tivoli/tsm/client/oracle/bin64/dsm.opt
DSMI_LOG            /var/log/tsm

TDPO_FS prod-oracle

TDPO_MGMT_CLASS_2   tsmoracle-mgmt2
TDPO_MGMT_CLASS_3   tsmoracle-mgmt3
TDPO_MGMT_CLASS_4   tsmoracle-mgmt4

Not a fancy setup; pretty much 'stock' (standard). Thoughts/ideas?

-Dan


Re: Is there a simple way to test a tape drive?

2007-11-30 Thread Dan Foster
Hot Diggety! Richard Rhodes was rumored to have written:
 Sometimes I want to have TSM test out a tape drive.
 Something like . . .
   - pick a drive
   - pick a tape
   - have tsm mount the tape and read the label (or something)
   - tell me this succeeded or not

 Like yesterday . . . IBM repaired a tape drive.  After fixing it, IBM
 tells us it's fixed.  We put the drive online . . .and wait for TSM
 to use it.  For this particular drive, it was several hours before
 TSM used it.

 We also get this after an upgrade, or adding a new drive.  To get
 TSM to use any drive, let alone a particular drive, requires
 starting a migration, update stgpool, backup to tape, etc.  All
 are large processes.

 Is there a simple way to get TSM to exercise a particular tape drive? Or,
 what do you do in these situations?

Well, we run a r/w test with a known good scratch tape before letting
the IBM CE leave the site. :) It's not perfect but it's a quick test to
test the major functionality.

You didn't mention what OS and version the server runs nor the TSM
server version. So this will vary a bit, but for AIX and 3584/LTO:

# tapeutil -f /dev/rmtX rwtest -b 262144 -c 10 -r 2

is good with our LTO-1 drives in our 3584 library. Block size: 256KB, 10
blocks (of 256K each), 2 runs, read and write test.
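
For a sense of how little data that test actually moves, here is the
arithmetic as a short Python sketch. It assumes each run writes the full
block count once (and then reads it back); the function name is made up.

```python
def rwtest_bytes_written(block_size: int, blocks: int, runs: int) -> int:
    """Total bytes written, assuming every run writes all blocks once."""
    return block_size * blocks * runs


if __name__ == "__main__":
    # Values from the command above: -b 262144 -c 10 -r 2
    total = rwtest_bytes_written(262144, 10, 2)
    print(f"{total} bytes written ({total / (1024 * 1024):.1f} MiB)")
```

At these settings that is only a few MiB, which fits with the test
finishing quickly.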

Obviously, make sure it's a scratch tape, since this test will scribble
some data over it. You may want to have this tape temporarily checked
out of TSM so that TSM won't try to use it for any operation.

Some other platforms may have tapeutil; I'm not sure which platforms have
it, or whether it's named differently on some of them.

Other things you can do:

# tapeutil -f /dev/smc0 inventory | grep -p "^Drive Address <num>"

...where <num> corresponds to the element number of the drive.

(AIX supports grep -p; on other platforms, just use 'more' instead.)

You can also test mount/dismount of tape by moving the scratch tape from
either a library slot or from a I/O slot.

# tapeutil -f /dev/rmtX move <elem ID #1> <elem ID #2>

...where <elem ID #1> is the element number of your tape source location
...and   <elem ID #2> is the element number of your tape drive

And, of course, rmtX is the drive in question (rmt0, rmt1, etc).

When done, can do 'tapeutil -f /dev/rmtX unload' and then 'tapeutil -f
/dev/rmtX move <elem ID #2> <elem ID #1>'.

The tapeutil rwtest is best done with the drive disabled in TSM *FIRST*,
so that TSM doesn't try to use the drive at the same time. Once you're
finished with the work, bring the drive back online in TSM ('upd drive ...').

Then you can do 'sh libr' and look at the output; should look sane.

Doing all this only takes a minute or two, so I don't feel bad about
asking the CE to wait.

I believe it's mtlib for 3494 libraries.

tapeutil AND mtlib usage can be found here:

ftp://ftp.software.ibm.com/storage/devdrvr/Doc/IBM_Tape_Driver_IUG.pdf

-Dan


Re: Nervous Nellies and TDPO -- feasible?

2007-11-27 Thread Dan Foster
Hi ladies and gentlemen,

I *really* appreciate the feedback including one by private
e-mail! It's helped a lot in resolving our DBAs' concerns.

Yesterday, we configured RMAN/TDPO (our first TDPO setup) and I
must say the performance flies with use of multiple parallelized
streams! I'd only previously set up TDP-MSSQL (which is pretty good, too.)

The point regarding continuing to use some form of additional
backups outside of TDPO for 'insurance' purposes is a good one, and we
will continue to do so.

The Redbooks tip was a good one. I've worked with IBM gear for
over 10 years but for some reason, hadn't occurred to me to look there.
I guess I'm too used to knowing the stuff that I normally need to know. :-)

This database isn't on storage that allows for BCV/FlashCopy
type snapshots, unfortunately, or we'd leverage that. A pretty good
idea, though. (We have various EMC, IBM, and Hitachi enterprise disk
storage but not for this particular Oracle database -- historical
reasons partly tied in to fairly messy politics.)

I think we're on the right track now, and thanks again for your
assistance!

Cheers,

-Dan


Re: HELP!! TIPS on freeing up library slots

2007-11-27 Thread Dan Foster
Hot Diggety! Jim Young was rumored to have written:
 This will give you a list of tapes under twenty percent utilized from
 all pools

 select volume_name, stgpool_name, pct_utilized, status from volumes -
 where pct_utilized < 20 and status='FILLING' -
 order by pct_utilized, stgpool_name, volume_name

 After that, do a  MOVE DATA VOLUMENAME  to help it along a bit.

 Does anyone know why TSM has this issue with keeping lots of
 'filling' volumes?

Could be a collocated setup? That's one way you get lots of individual
tapes, each filling (relatively) slowly?
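
To see what that select picks out before running it against a live server,
here is a sketch that replays the same query (with the comparison written as
pct_utilized < 20, matching the "under twenty percent" description) against
a throwaway SQLite table. The table contents and volume names are invented,
and TSM's real VOLUMES table has more columns than this.

```python
import sqlite3

QUERY = """
select volume_name, stgpool_name, pct_utilized, status from volumes
where pct_utilized < 20 and status = 'FILLING'
order by pct_utilized, stgpool_name, volume_name
"""

# Invented sample data, for illustration only.
SAMPLE_ROWS = [
    ("A00001", "TAPEPOOL", 3.5, "FILLING"),
    ("A00002", "TAPEPOOL", 87.0, "FILLING"),
    ("A00003", "COPYPOOL", 12.0, "FILLING"),
    ("A00004", "TAPEPOOL", 5.0, "FULL"),
]


def low_utilization_volumes(rows):
    """Load rows into an in-memory table and run the select against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table volumes (volume_name text, stgpool_name text, "
        "pct_utilized real, status text)"
    )
    conn.executemany("insert into volumes values (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in low_utilization_volumes(SAMPLE_ROWS):
        print(row)
```

Only FILLING volumes under 20% come back, least-utilized first, which is the
order you would want to feed to MOVE DATA.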

-Dan


Nervous Nellies and TDPO -- feasible?

2007-11-20 Thread Dan Foster
Hi -

My DBAs have some serious concerns about running TDPO and I don't
know enough about it to be able to answer their concerns
authoritatively. Any information would be much appreciated.

The subject line was _NOT_ meant to denigrate them -- not at all
(they're clued in DBAs) but meant to get your attention, as I'm
sufficiently desperate. Sorry. :)

Setup:

- TDPO 5.4 (Solaris 10 Update 3 + patches)
- Oracle 10gR2 + patches
- TSM server 5.4 (AIX 5.2 TL 10 + patches)
- Plans to go to TDPO 5.5 + TSM server 5.5 shortly

Their concerns:

1) RMAN doesn't deal with non-transactional DB data loads well.

   (i.e. data loaded not through the usual INSERT/UPDATE
   methods, but done via sqlldr. This doesn't generate redo logs.)

   True or false?

2) RMAN requires a recovery database be created to do recoveries to
   (instead of recovering to original database/tablespaces in-place).

   True or false? I thought RMAN didn't need an intermediate temp
   database and could restore directly 'in-place'.

3) Will RMAN guarantee a good point-in-time view of DB data?

4) Same as #3 above, but what if the DB is currently processing
   a large series of INSERTs or UPDATEs or sqlldr run, will RMAN
   only send data that was present at the moment of backup start?

5) Does RMAN/TDPO scale to really large DB setups? Say, tens
   or hundreds of terabytes? Or even sub-10 TB?

   Anybody using TDPO well with terabytes of Oracle data? (I
   think they're hoping for assurances that it really does work
   in the real world at that scale.)

6) How does Oracle RMAN know what rows to send for an
   incremental backup? Does Oracle maintain a bitmap of some kind
   or time-based logging of changed blocks or something? Any
   pointers to whitepapers or information on how it works behind
   the scenes?

They would prefer to utilize the Veritas DB Edition (DBED)
checkpoint feature, where a script puts all tablespaces
in backup mode, brings DB-based filesystems to a consistent state,
generates a checkpoint (snapshot-like) of the filesystem, then brings all
tablespaces back to online mode. The TSM BA client then mounts the most
recent checkpoint R/O and backs it up at the file level.

The problem with the above approach is this essentially results
in what amounts to a full backup every single time it kicks off. That's
just not practical in terms of tape cost as well as the length of
backups.

I see TDPO as the only practical choice for much faster and more
frequent backups (and restores), and one that guarantees consistent
point-in-time views of DB data. The BA client was also not meant to back
up databases (though it *could* do so if they are properly quiesced).

My issue? I don't have enough meaty information yet to put
concerns to rest, and I'd really like to make use of TDPO after paying
out $16,000. Any info would be much welcome.

Thanks,

-Dan


How to register TDP Oracle 5.4.1 licenses?

2007-09-15 Thread Dan Foster
Setup:

- TSM 5.4.1.1 server on IBM 7026-6H1 running AIX 5.2 TL10 SP3
- TDPO 5.4.1 client on Sun T2000/SPARC running Solaris 10 Update 3
- Oracle 10gR2 database - version 10.2.0.3

I've only previously set up TDP MS SQL 5.2? on a TSM 5.1.9.6 server.

We recently went to TSM and TDP 5.4 and had not previously registered an
Oracle TDP client.

I'm now finding out that you can't use REGISTER LICENSE FILE=oracle.lic
with TSM/TDP 5.4. It's not completely clear how I get this registered.

Do I just simply go through a normal TDPO installation on the client,
and let tdpoconf or REGISTER NODE register the client on the TSM server?

Is there something special I need to do to make TDPO licenses show up on
the TSM server? Do I need to fetch license enrollment certificate files
from somewhere?

Any insight leading to enlightenment would be most welcome. :-)


Yours in bafflement; may your tapes never oxidize,

-Dan


Re: Minimum TSM server version for Solaris/x86 client?

2007-03-12 Thread Dan Foster
Hot Diggety! David McClelland was rumored to have written:

 Do you mean client or server here? TSM *client* on x86 Solaris has been
 available since 5.3:

Ah, right; duly corrected, thanks!

 Given that TSM 5.2 Server goes out of support next month, you'll only
 really want to be connecting it to a 5.3 or 5.4 server anyway for which
 I wouldn't see there being any problems.

I eventually managed to stay connected with the FTP server long enough
to fetch the 5.3 and 5.4 clients and found out they do appear to work
with the older TSM server (5.1.9.6).

Needs are very basic; nothing fancy so I'm not too concerned for
deploying this on a small subset of our kits on an as-needed basis.

Buys me some more time while I continue to see if we can either get an
upgraded TSM server or if we can continue to run TSM instead of the
preferred organisational standard of EMC [Legato] Networker.

 I don't disagree about the Apple II performance of the
 ftp.software.ibm.com site, although I prefer to liken it to my Acorn
 BBC Micro Model B (I don't think they ever made it over to the US)
 reading at 1200 baud from cassette tape :o)

:-)

Heard much about the Acorn BBC kits over the years. :)

Cheers,

-Dan


Minimum TSM server version for Solaris/x86 client?

2007-03-08 Thread Dan Foster
TSM 5.4 added support for Solaris/x86... but what is the minimum server
version it will connect to? 5.1.5? 5.2? 5.3? 5.4?

I looked in the documentation and couldn't find that info. The FTP
server keeps disconnecting me while emulating the performance of an
Apple II running off floppies so I haven't been able to get the client
to test with just yet, either.

Any pointers or information would be MUCH appreciated. Thanks!!

-Dan


Re: TSM Client on GENTOO

2006-08-01 Thread Dan Foster
Hot Diggety! Christian Svensson was rumored to have written:

 Has anyone tried to install ITSM BA/API on Gentoo?

Yes. The 5.1.5 client works well for me. I have not tried 5.2 or later.

 I can not even install ITSM on it because I don't have any RPM software.

# emerge rpm
# rpm -ivh --force <Linux/x86 TSM client file>.rpm

or something like that.

Then after that, just do normal dsmc client setup. (dsm.opt, dsm.sys,
add to inittab, pre-seed the password, etc.)

-Dan


Re: 3584 Library

2006-06-08 Thread Dan Foster
Hot Diggety! Gill, Geoffrey L. was rumored to have written:
 I'm bringing up a remote AIX system with a 3584. Seems as though IBM
 left it in an unusable state and I'm trying to figure out why. What I
 originally saw was the below output minus the 3584 info. I ran cfgmgr
 and now have the below output. What is odd to me is that there are 3
 3584's showing up.

 Can anyone shed some light on the library or tape drive issue?

 # lsdev -Cc tape

 rmt0 Available 40-60-00-5,0 SCSI 4mm Tape Drive

 rmt1 Available 14-08-02 Other FC SCSI Tape Drive
 rmt2 Available 17-08-01 Other FC SCSI Tape Drive
 rmt3 Available 54-08-01 Other FC SCSI Tape Drive
 rmt4 Defined   1A-08-02 Other FC SCSI Tape Drive
 rmt5 Defined   21-08-01 Other FC SCSI Tape Drive
 rmt6 Defined   5A-08-01 Other FC SCSI Tape Drive
 smc0 Available 14-08-02 IBM 3584 Library Medium Changer (FCP)
 smc1 Available 17-08-01 IBM 3584 Library Medium Changer (FCP)
 smc2 Available 54-08-01 IBM 3584 Library Medium Changer (FCP)

A possibility is that there were multiple control paths defined in the
3584 library.

This can be adjusted via the 3584 front panel LCD, or I think, also the
web interface (though I never used it much due to the infamous 3584
ethernet issues).

The 'Other FC SCSI...' looks odd, though. Does the system have Atape
installed? And the drives are IBM LTO drives?

Could be that rmt4-6 were seen before additional control paths were
defined, because once you have additional CPs, your view of the drives
contained in them can be blocked.

Ultimately, I think you want to determine if the 3 CPs are what you want.
If not, it needs to be set back to the default of one CP for the entire library.

You'll also want to figure out why the LTO tape drives aren't showing up
right if they're IBM LTO tape drives.

Another thing to triple check with the 3584, in my experience, is to
ensure the drives, library, and FC adapter microcode are all at the
latest available and stable revision.

If you're still stuck, even after adjusting all these as desired, then
please describe the 3584 setup (1 L32 and how many D32s, how many
drives, what type of drives and their mfr, microcode revs for library,
drives, and FC adapter, what FC adapter, what host type/model, what host
OS and version is running, level of Atape installed, etc.).

-Dan


Re: adm login is locked!

2006-06-06 Thread Dan Foster
Hot Diggety! Alexandr was rumored to have written:
 I've installed the TSM server and client (full pack) on the same
 server (AIX).
 I registered a client node on the TSM server (and entered the correct
 password for the registered client node). It all worked well for a long time.
 But after a long idle period, I tried to connect to the Tivoli server with
 the admin login and got this message:

 ..ANS8056E Your administrator ID is locked.
 ANS8023E Unable to establish session with server.

 ANS8002I Highest return code was 61.

 How do I unlock the administrator ID?
 Is there any way to restore access to the server without reinstalling
 the server?

You do not need to reinstall the TSM server.

What TSM server version are you using?

Do you have any other administrator IDs?

If yes:  login to another admin ID then do in dsmadmc:

tsm> unlock admin <your locked admin ID>

If no:  login to TSM server, and use UNIX shell to do:

# kill -HUP <dsmserv pid> && sleep 5 && \
 kill -KILL <dsmserv pid>
# export DSMSERV_DIR=/usr/tivoli/tsm/server/bin
# export DSMSERV_CONFIG=/usr/tivoli/tsm/server/bin/dsmserv.opt
# cd /usr/tivoli/tsm/server/bin
# ./dsmserv
tsm> disable sessions
tsm> unlock admin <your locked admin ID>
tsm> halt
# nohup /etc/rc.adsmserv 

Then wait for the TSM server to come back up. You will be able to login
in dsmadmc when it is back up.

Then investigate to see why admin ID was locked. Did it expire?

tsm> query admin <your admin ID> f=d

Example:

tsm> query admin alex f=d

Your account was locked due to either:

1. Exceeding maximum number of invalid admin sign-ons
or
2. Your password expired

If your password was expired, then either set a longer password expiry
period or change your password more often.

-Dan


Re: Trying to use TSM API

2006-06-04 Thread Dan Foster
Hot Diggety! Mike was rumored to have written:
 I have a program that compiles properly, and connects, but when
 querying (I'm testing management classes first) it gets an error:

 ANS0245E (RC2065) The caller's structure version is different than the
 TSM library version.

 The version of the tivoli.tsm.samples is the same as the
 tivoli.tsm.client. Is there a way to check what is wrong? Is there a
 way around the problem?

What versions are the API and client, and what platform?

-Dan


Re: Trying to use TSM API

2006-06-04 Thread Dan Foster
Hot Diggety! Mike was rumored to have written:

   tivoli.tsm.client.api.32bit
  5.3.2.2  COMMITTED  TSM Client - Application
  Programming Interface 32bit

That looks correct.

Do you have VisualAge C/C++ installed? It apparently needs the C++
compiler, and specifically, VisualAge's.

Did you copy /usr/tivoli/tsm/client/api/bin/libApiDS.a to /usr/lib?

Do you have a copy of /usr/tivoli/tsm/client/api/bin/sample/*.h in the
same directory as your app?

Or have you copied and adjusted
/usr/tivoli/tsm/client/api/bin/sample/Makefile to directory of your app
and adjusted CFLAGS's -I. to -I/usr/tivoli/tsm/client/api/bin/sample ?

-Dan


Re: Trying to use TSM API

2006-06-04 Thread Dan Foster
Hot Diggety! Mike was rumored to have written:

 The file (tsm.so) compiles great and connects to the tsm server just fine.
 Only when I issue the management class query does it complain about the
 structure version number.

 I am using vac/cc. No need to copy the library to /usr/lib since the extension
 compiles and connects without error.

Well, the reason why I mention it is because this version error
indicates a mismatch between the library and header files. Meaning, it's
possible to compile fine but fail at the run-time check.

Besides, I think the makefile is specifically looking for libApiDS.a in
/usr/lib; take a look at the makefile. Makefile is also looking for
header files in current directory by default, too, I think.

Next step would be to verify that /usr/lib/libApiDS.a and
/usr/tivoli/tsm/client/api/bin/libApiDS.a are identical, possibly by
running md5, md5sum, or some such utility.

Then compare time/datestamps, and so forth, for both libApiDS.a *and*
the header files. You could be picking up an older version of either
from somewhere unexpected or forgotten about.
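
If no md5 or md5sum utility is handy, the same comparison can be done with a
few lines of Python. This is only a sketch: the two paths are the ones named
in this thread, and the function names are made up for the example.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """MD5 hex digest of a file, read in chunks so large archives are fine."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """True when both files hash to the same MD5 digest."""
    return file_md5(path_a) == file_md5(path_b)


if __name__ == "__main__":
    # Paths from the thread; adjust for your install.
    installed = "/usr/tivoli/tsm/client/api/bin/libApiDS.a"
    copied = "/usr/lib/libApiDS.a"
    if os.path.exists(installed) and os.path.exists(copied):
        print("identical" if same_contents(installed, copied) else "DIFFERENT")
    else:
        print("one or both files not found")
```

The same function can be pointed at each header file in turn to check them
against the copies under the sample directory.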

You've got the right package installed, but the error is indicating
something is out of sync between the library and header files.

So you need to identify and find where the older/incorrect stuff is,
then correct it, then recompile.

-Dan


Re: ./domdsmc: Error while loading shared libraries

2006-06-02 Thread Dan Foster
Hot Diggety! TSM User was rumored to have written:

 ./domdsmc: Error while loading shared libraries: libnotes.so: cannot
 open shared object file: No such file or directory.

 I made sure that libnotes.so exists in the correct directory that I
 specified while running dominstall.

 I am having this same problem with SUSE 9.

May I suggest a much shorter subject line capped to 60 characters or
less? I've shortened it for you.

Shared libraries in Linux are not loaded from the current directory. They
are loaded from the standard system library directories and from any
directory listed in /etc/ld.so.conf.

You will need to research to find out how your Linux distribution
handles updating of library paths because it differs amongst each
distribution. I'm not familiar with SuSE 9 or RHEL 3, but it should be
Googleable, in distro vendor's docs, or someone here likely knows.

In Gentoo Linux, updating ld.so.conf is done by adding a file in
/etc/env.d/ with a path to add, then doing 'env-update'. This procedure
is different with various Linux distributions.

In a hurry, you could always directly edit /etc/ld.so.conf to add the
directory that holds these shared libraries for Domino, then do
'ldconfig' as root. Then retry using the utility.
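
Before editing anything, it can help to check whether the directory is
already listed. A minimal Python sketch, assuming the simple
one-directory-per-line layout of ld.so.conf; it does not follow "include"
lines, and the Domino path in the sample is hypothetical.

```python
def listed_dirs(conf_text):
    """Directories named directly in ld.so.conf-style text.

    Comments and 'include' lines are skipped; included files are not read.
    """
    dirs = []
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or line.split()[0] == "include":
            continue
        dirs.append(line.rstrip("/") or "/")
    return dirs


def is_listed(directory, conf_text):
    """True if the directory (trailing slash ignored) appears in the text."""
    return (directory.rstrip("/") or "/") in listed_dirs(conf_text)


# Hypothetical file contents, for illustration only.
SAMPLE_CONF = """# local additions
include /etc/ld.so.conf.d/*.conf
/usr/local/lib
/opt/domino/lib
"""

if __name__ == "__main__":
    print(is_listed("/opt/domino/lib/", SAMPLE_CONF))
```

On a real system, read the text with open("/etc/ld.so.conf").read() and pass
in the directory that holds libnotes.so.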

-Dan


Re: SQL Select commands

2006-05-26 Thread Dan Foster
Hot Diggety! Richard Sims was rumored to have written:

 Go to
  http://www.rz.uni-karlsruhe.de/rz/docs/TSM/WORKSHOPS/3rd/handouts/

 Handouts I of Andrew Raibeck is the online version.
 Handouts II of Andrew Raibeck is the PostScript version,

Very nice stuff, that's for sure (as are the others' presentations).

 which is trivially converted to PDF by opening on a Mac,
 where the Preview application takes care of it.

Or on a UNIX machine, can be converted via the ps2pdf utility. E.g.:

$ ps2pdf raibeck.ps  raibeck.pdf
$ acroread raibeck.pdf

(or xpdf raibeck.pdf, or transported to a Windows/Mac machine to view.)

A nice side effect of this conversion is that it shrinks the file size
for PDF by about 5.5 times without any apparent visual degradation. :-)

ps2pdf comes with Ghostscript, which can be fetched from:

http://www.cups.org/espgs/software.php

I got my copy from my Linux distribution's package repository, so it was
a mindlessly quick fetch and install.

-Dan


Re: backup performance

2006-05-19 Thread Dan Foster
Hot Diggety! Mario Behring was rumored to have written:
  
 I've started a backup operation at a Linux client using CENT OS
 (similar to RHat). The operation took 1 hour and 59 minutes to finish,
 and backed up 5.45GB of data. I think this is kind of
 slow... considering that the LTO3 tape unit is supposed to be very
 fast.

Can you post the end of job statistics? For example, in my dsmsched.log,
I have:

05/18/06   22:32:33 --- SCHEDULEREC STATUS BEGIN
05/18/06   22:32:33 Total number of objects inspected:  845,061
05/18/06   22:32:33 Total number of objects backed up:   26,665
05/18/06   22:32:33 Total number of objects updated:  0
05/18/06   22:32:33 Total number of objects rebound:  0
05/18/06   22:32:33 Total number of objects deleted:  0
05/18/06   22:32:33 Total number of objects expired:  1,260
05/18/06   22:32:33 Total number of objects failed:   0
05/18/06   22:32:33 Total number of bytes transferred: 3.11 GB
05/18/06   22:32:33 Data transfer time:  464.32 sec
05/18/06   22:32:33 Network data transfer rate:7,040.43 KB/sec
05/18/06   22:32:33 Aggregate data transfer rate:  2,600.08 KB/sec
05/18/06   22:32:33 Objects compressed by:0%
05/18/06   22:32:33 Elapsed processing time:   00:20:57
05/18/06   22:32:33 --- SCHEDULEREC STATUS END

If you can post yours for that backup run, it would help give some
insight.
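
As a sanity check on such numbers, the rates can be recomputed from the byte
total and the times. The Python sketch below does that for the log excerpt
above and for the 5.45 GB in 1 hour 59 minutes reported in the question. It
assumes 1 GB = 1024 * 1024 KB; the function names are made up.

```python
def hms_to_seconds(hms):
    """Convert 'HH:MM:SS' to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def rate_kb_per_sec(gigabytes, seconds):
    """Average rate in KB/sec, taking 1 GB = 1024 * 1024 KB."""
    return gigabytes * 1024 * 1024 / seconds


if __name__ == "__main__":
    # From the dsmsched.log excerpt: 3.11 GB, 464.32 sec transfer, 00:20:57 elapsed.
    print("network rate:   %.0f KB/sec" % rate_kb_per_sec(3.11, 464.32))
    print("aggregate rate: %.0f KB/sec" % rate_kb_per_sec(3.11, hms_to_seconds("00:20:57")))
    # From the question: 5.45 GB in 1 hour 59 minutes.
    print("reported job:   %.0f KB/sec" % rate_kb_per_sec(5.45, hms_to_seconds("01:59:00")))
```

The recomputed figures land slightly below the logged 7,040.43 and 2,600.08
KB/sec, presumably because the log rounds the byte total to 3.11 GB. The
reported job works out to roughly 800 KB/sec aggregate, which is why the
breakdown between network transfer time and elapsed time matters.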

Also, do you use a disk pool on the TSM server? For example:

TSM client -> TSM server (diskpool) -> TSM server (tapes)

If you do not use a diskpool, you will likely not be able to feed data to
the LTO-3 tape drives fast enough. If that happens, the drive will keep
starting and stopping, which dramatically slows down performance.

-Dan


TDP oddity with an extra registered client?

2006-05-13 Thread Dan Foster
I've got:

TSM 5.1.9.6 server on AIX 5.2-ML7
TDP 5.2 for MSSQL (Windows 2003 Server running MS SQL Server 2000)

The strange thing is:

tsm: SERVER>q lic
[...]
 Number of TDP for MS SQL Server in use: 3
[...]

But I only have two TDP 5.2 clients.

How can I determine a list of all 3 registered TDP clients?

And how can I delete the offending client?

Or does the TSM server count as a TDP client for licensing purposes?

Any suggestions or insight welcome. Thanks!

-Dan


Re: TDP oddity with an extra registered client?

2006-05-13 Thread Dan Foster
Hot Diggety! Stef Coene was rumored to have written:
  tsm: SERVER>q lic
  [...]
   Number of TDP for MS SQL Server in use: 3
  [...]
 
  But I only have two TDP 5.2 clients.
 
  How can I determine a list of all 3 registered TDP clients?
 select LICENSE_NAME, NODE_NAME from LICENSE_DETAILS

Ah! Thanks.

As I suspected, the problem was due to me making a silly configuration
file error without thinking when I first installed it on a TDP client.

(It registered under the wrong name.)

I changed config file to use the proper name immediately after I
realized my error... but too late, wrong name was already registered in
TSM.

Does anybody know how I can delete the TDP registration for the
incorrectly registered client?

tsm> select LICENSE_NAME, NODE_NAME from LICENSE_DETAILS
MSSQL         VN01-DB
MSSQL         VN02-DB
MSSQL         W3USPHX1
[...]

VN01 and VN02 are correct.

W3USPHX1 was the mistake.

tsm: GBLX-PHX>select * from license_details where node_name='W3USPHX1'

LICENSE_NAME   NODE_NAME   LAST_USED                TRYBUY
------------   ---------   ----------------------   ------
MSSQL          W3USPHX1    2006-04-06 01:31:02.00   FALSE
MGSYSLAN       W3USPHX1    2006-05-13 19:41:46.00   FALSE

So I need to delete MSSQL license registration for W3USPHX1 only, but
leave the MGSYSLAN license registration alone.

Is there a way to do that without having to delete W3USPHX1 entirely
then recreate and re-backup its TSM data?

-Dan


Re: dsmadmc using non-root id access

2004-08-08 Thread Dan Foster
Hot Diggety! Ameerul Mazli was rumored to have written:

 Has anyone got any idea whether it is possible to execute dsmadmc with
 non-root id?

Sure, no problem.

You only need root privileges to start TSM and possibly for a few things
like creating/defining db/log volumes? (May not even need root for these
if the filesystem has appropriate permissions?)

It is possible to set up sudo so that an unprivileged TSM user account
can only start the TSM server with sudo-granted root privileges, and you
get the benefits of logging, too.

 Can we use e.g. a tsmadm UNIX id to run dsmadmc? Will there be any
 'side effects'?

No side effects that I've seen.

That's how I run my TSM jobs, via an unprivileged UNIX user called
'tapeadm' and it uses a privileged TSM user/pass so I can tell the
difference between human administrators and TSM jobs in the activity
logs. Works great.

-Dan


Re: Cancel all processes

2004-07-14 Thread Dan Foster
Hot Diggety! MC Matt Cooper (2838) was rumored to have written:
 I am not aware of a cancel all processes.  However, you can write a
 script to do it.  It would be based on the fact that TSM will tell you
 all the processes that are running.  'select process_num from processes'
 will give you the process numbers that need to be canceled.

One caveat, from the 5.1 help information:

: Some processes, such as reclamation, will generate mount requests in
: order to complete processing. If a process has a pending mount request,
: the process may not respond to a CANCEL PROCESS command until the mount
: request has been answered or cancelled by using either the REPLY or
: CANCEL REQUEST command, or by timing out.

: Use the QUERY REQUEST command to list open requests, or query the
: activity log to determine if a given process has a pending mount
: request.

So a simple q proc (or equivalent) + canc proc num operation will work
most of the time, but may fail if you don't also check for pending mount
requests.
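A minimal sketch of that check in Python, assuming you have already
collected the process numbers and worked out (from QUERY REQUEST or the
activity log) which of them are waiting on a mount request. The function
name and the 'waiting_on_mount' input are hypothetical; only the CANCEL
PROCESS command text comes from TSM itself.

```python
def plan_cancels(process_nums, waiting_on_mount):
    """Split process numbers into cancel commands and blocked processes.

    process_nums     -- process numbers, e.g. from
                        'select process_num from processes'
    waiting_on_mount -- hypothetical set of process numbers you have
                        matched to an open mount request
    """
    cancel_now = []
    blocked = []
    for num in process_nums:
        if num in waiting_on_mount:
            # CANCEL PROCESS may not take effect until the request is
            # answered (REPLY), cancelled (CANCEL REQUEST) or times out.
            blocked.append(num)
        else:
            cancel_now.append(f"cancel process {num}")
    return cancel_now, blocked


if __name__ == "__main__":
    commands, blocked = plan_cancels([12, 15, 19], {15})
    for command in commands:
        print(command)
    for num in blocked:
        print(f"# process {num}: deal with its mount request first")
```

The commands are only built as strings here; feeding them to dsmadmc is
left out on purpose, since the right options depend on your setup.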

-Dan


Re: A quick question

2004-07-06 Thread Dan Foster
Hot Diggety! David Longo was rumored to have written:
 I have used only IBM LTO1 tapes in my 3584 Fibre drives.
 I did have a few (3 or 4 in a year or so.)  As I slowly
 gathered over this time, there were two problems.

 1.  The cases on early LTO 1 tapes didn't have the halves
 welded together.  Therefore if when loading, the leader
 pin was pushed in slightly, it could partially open the case
 and the leader pin would then get stuck and IBM would have
 to open the drive to get tape out.
 Later LTO1 tapes (as of about a year ago or so) are supposed
 to be welded and LTO 2 tapes from the start.

 2.  There also was a mechanical mod to the LTO1 drives,
 that would reduce or eliminate the problem.  It involved
 installing a clip in the LTO1 drive.   I had this done on
 mine and have not had the problem since.  Don't know offhand
 the EC number on this, I think you have to ask about this one,
 and press a little bit to get it.  I understand it originated in Europe.

 If you can't get info on it from your IBM CE, let me know and
 I'll dig up info.

Talked with the IBM CE here, and he looked it up.

The clip is put over the tape retension spring inside the drive.

It's ECA 009, which is not a mandatory EC. Should be applied only if the
customer sees frequent B881 errors in the library.

If so, the CE needs to order part #18P7835 to get the parts and tool for
the EC.

He also says the work takes half an hour per drive.

We're seeing a lot of these B881 (tape unload from drive issue) errors!
Scheduling installation of these clips now. And, yes, we do have the
original generation of IBM LTO1 tapes. :) The B881 errors certainly make
sense now, since there are only a few common major culprits for such
errors; a leader pin being out of alignment certainly does it.

We haven't seen these errors with the other 3584 at another site, but
will keep this EC in mind.

-Dan


Re: TSM 5.2 on AIX 5.1

2004-07-01 Thread Dan Foster
Hot Diggety! Lawrence Clark was rumored to have written:
 We originally decided to put TSM 5.2 on the AIX 5.2 system for the
 migration because of a notice in the 5.2 install doc that said TSM 5.1
 would cause 5.2 to crash (at least the version that was on the Bonus
 disk.)

If my recollections are correct, that combination was an extremely early
version of both and the issues were fixed long ago.

I don't have TSM 5.2, but I believe quite a few people are running TSM
5.2 on AIX 5.2 just fine.

 Are there any people on this list running TSM 5.1 on AIX 5.2?

I'm running TSM 5.1 on AIX 5.2, works fine, for what it's worth.

-Dan


Re: Open VMS and TSM

2004-06-01 Thread Dan Foster
Hot Diggety! Muhammad Sadat was rumored to have written:
 I wonder if TSM supports Open VMS as server or client??

There's no TSM server software for OpenVMS, BUT yes, OpenVMS can be a TSM
client if you purchase a commercial third-party product called ABC,
which uses the TSM API.

I do not have any VMS servers left but I know a VMS system manager who
uses TSM + ABC and is very pleased with both products.

For more information, please see:

http://www.rdperf.com/RDHTML/ABC.HTML

The license pricing information is at:

http://www.rdperf.com/RDDOC/ABC_PRICE_LIST.DOC

I do not have any business relationships with the company, and have
never used it myself... only heard about it from someone who has it.

Also, keep in mind that this does not provide any bare metal restore
(BMR) capabilities... you still have to use the standalone BACKUP
utility to disk or tape to have a supported bootable restore from
scratch for that.

-Dan

(I still have at home: a VAX, two Alphas, and an emulated VAX via SIMH,
all running 7.2 and 7.3, so I am familiar with OpenVMS.)


Re: TSM 32Bit to 64bit Upgrade

2004-05-25 Thread Dan Foster
Hot Diggety! Hart, Charles was rumored to have written:
 We are currently running TSM 5.1.7.3 - 32bit on a p630 AIX 5.2.2.
 We would like to change the TSM from 32bit to 64bit during this
 upgrade.  We ran a preview of the install choosing the 64bit rte and
 server code and lic filesets but the preview failed because it stated
 that the tsm.server.rte was already there.  We decided to just go
 ahead and upgrade the 32bit code.

 Has anyone upgraded TSM 5.x from 32 to 64bit code?  The Quick Start
 and Admin guide only make reference to AIX 4.3 to 5.2, nothing about
 32Bit to 64bit TSM.

Haven't done exactly what you describe, but my guess is you'd have to
upgrade the OS to 5.2, verify bos.64bit is installed and that
'bootinfo -K' says 64 (if it doesn't, you'll need to adjust the two
symlinks, do bosboot, and reboot), then uninstall tsm.server.rte and
install tivoli.tsm.server.aix5.rte64, and make sure TSM and the OS are
patched/upgraded.

You'd be well advised to take a mksysb backup of the system prior to the
upgrade and perhaps do the 4.3.3->5.2 upgrade via alt_disk_migration.

-Dan


Re: 3584/lto2 code

2004-04-26 Thread Dan Foster
Hot Diggety! Stef Coene was rumored to have written:
 Then use the website.  Why don't you visit the homepage of your library:
 http://www-1.ibm.com/servers/storage/support/lto/3584.html

 There is a nice Firmware link and after 2 clicks you can download any
 firmware you want for your library.

Yes, but the firmware there is relatively outdated, too. (At least, it
is for LTO-1, and presumably also true of LTO-2 judging from the
comments seen earlier in this thread.)

We got newer firmware by asking our main IBM CE very nicely, so he
fetched it off the latest of his firmware CDs (don't recall the name,
but it's available only to CEs and other IBM service personnel).

-Dan


Re: Any experience with Sepaton VTL

2004-04-16 Thread Dan Foster
Hot Diggety! Johnson, Milton was rumored to have written:
 I got a call from a rep asking if I was interested in a Sepaton S2100
 VTL (Virtual Tape Library) (www.sepaton.com). It's billed as:

 My questions include where's the down side?  What's the catch?  If your
 choice is between expanding by purchasing a second 3494 frame or a S2100
 VTL, why choose a 3494 frame?

Keep this rep honest -- ask him/her what the down sides (negatives) are,
what the typical failure modes are, and so forth.

You'll soon know if you're about to drink sugary Kool-Aid or not :-)

Well, simply put, I haven't heard of VTLs as a single source replacement
for tape drives per se. They are pretty good when you can't wait a
single second longer (or 2-3 minutes) for a restore to begin -- compare
disk vs tape restore load-to-ready time.

Lots of places have this kind of rapid restore requirement -- financial
firms (banks, Wall St, etc), hospitals, nuclear power plants, utilities,
and other places where any sort of downtime is extraordinarily bad.

However, I don't think disks yet have the long-term reliability that
tape drives do... well, server class SCSI drives *can* usually last 5
years in brutal 24x7 operation, but drives in general aren't too
tolerant of being underutilized or, in the case of the cheaper engineered
hard drives (e.g. typical IDE drives), of being overly utilized.

So the way I see VTLs as being most useful is if you can't wait the 2-3
minutes it takes to load+spool a tape to 'ready to peel data off'; it's
still no replacement for any serious archiving past perhaps 12 months or
so, and still needs another backup source to restore data from in case a
drive goes south and loses the data.

Modern tape drives are pretty zippy, too, at 70 MB/sec, don't forget.
Tapes (not tape drives) also contain far fewer moving parts that can
fail than hard drives, which have a motor, PSU, a dependence on the
sub-1mm air gap (Bernoulli effect), etc.

VTLs have a place, in my honest opinion, but only if you've got the need
and only if it isn't the sole source for backup data. I don't think most
folks have this need; so it just seems like a big push for companies to
make money at your expense, unnecessarily (unless you actually do have a
need and a well-engineered overall setup). After coming off rough
economic times, you can expect lots of these pitches. :-) I've already
gotten two of these so far. :)

-Dan


Directory storage pools

2004-04-03 Thread Dan Foster
As per the TSM redbook at:

http://www.redbooks.ibm.com/redbooks/pdfs/sg245416.pdf

It suggests use of DISKDIRS and OFFDIRS to store directory information
to avoid restore-time directory reconstruction hits.

Everything makes sense, including the need to explicitly bind all
directories on a client to the DISKDIRS management class with DIRMC.

The next question is... *how* do you select ONLY directories for this
management class without selecting any files within them, in dsm.opt?

(Server is 5.1.8.0 on AIX and clients are 5.1.5.14/15 on Windows, Linux,
Solaris, and AIX.)

I had also checked the TSM 5.1.5 UNIX client reference from:

http://publib.boulder.ibm.com/tividd/td/TSMC/GC32-0789-02/en_US/PDF/GC32-0789-02.pdf

...and read through everything referencing include, exclude, dirmc, but
it's still not obvious to me how to select only directory data for use
with DIRMC DISKDIRS. I see there's a dirsonly keyword, but how to use it
within context of dsm.opt?

include / -dirsonly -dirmc diskdirs

?

-Dan


Re: Sizing an AIX platform and tape libraries

2004-03-23 Thread Dan Foster
Hot Diggety! Nancy R. Brizuela was rumored to have written:

 1)  Right now we are backing up about 150 GB/ night, but we need to add
 Exchange and a new student information system (Banner) to this.  We are
 estimating that we will soon grow to at least 500 -600 GB/night.

 2)  Workload consists of 95 clients, consisting of from small 20-30gb
 Unix and Windows systems, to DB (Oracle) and file servers with half to
 one terabyte of storage.  This will grow to 120 or so clients soon.

Sounds like you know your current needs and have some idea of growth --
that's great! Helps to size potential candidate setups so much better
and more accurately.

 3)  We are currently storing 10.7 TB total data in two libraries, one
 local tape library and a second copy of everything at a remote tape
 library (3494 ATLs w/3590 E drives).  Looks like this storage could
 triple, given how much we will be backing up each night (15 TB in each
 location).

My suggestion might be to consider something like the IBM 3584 LTO
library with LTO-2 tape drives. These drives are pretty darned fast at
up to 70 MB/sec (with hardware compression enabled), don't suffer from
the same terrible stop/start issue that LTO-1 had, and hold 200 GB per
tape natively (no compression) -- for *our* library, we're seeing 2.24:1
compression with LTO-1 (so no reason not to expect that to be similar
for our LTO-2 setup) which suggests perhaps 400-450 GB per tape with
hardware compression enabled, depending on data mix.

The 3584 can go up to *16* library frames, with up to 12 drives per
frame (ie, 192 tape drives total) and up to 6,881 tapes in a 16 frame
setup. That's a lot of future expansion potential, but the nice thing is
that you can buy a frame or two now, and add more later as your needs
grow.

A two frame setup with 12 drives (our 3584 setup) has 610 tape
slots... which, in an LTO-2 setup, yields 122 TB of uncompressed data.
A single base frame 3584 LTO library has about 250 tape slots if you
have a 30 slot I/O station... so, 250*200 = only 50 TB. If you say
you expect to eventually triple... 15 TB in each location * 3 = 45 TB,
and you'd still have room for as much future expansion as you want.
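For anyone wanting to redo the sizing arithmetic with their own slot
counts, here is a rough Python sketch. It only uses figures quoted above
(200 GB native per LTO-2 tape, 610 and 250 slots, 16 frames of 12
drives, our observed 2.24:1 ratio); the function names are made up for
illustration, and it assumes decimal units (1 TB = 1000 GB).

```python
LTO2_NATIVE_GB = 200  # native (uncompressed) capacity per LTO-2 tape


def native_capacity_tb(slots, gb_per_tape=LTO2_NATIVE_GB):
    """Uncompressed library capacity in TB, assuming 1 TB = 1000 GB."""
    return slots * gb_per_tape / 1000


def compressed_gb_per_tape(ratio, gb_per_tape=LTO2_NATIVE_GB):
    """Estimated per-tape capacity at a given compression ratio."""
    return gb_per_tape * ratio


def max_drives(frames, drives_per_frame):
    """Drive count for a library with identical frames."""
    return frames * drives_per_frame


if __name__ == "__main__":
    print(native_capacity_tb(610))                # two-frame setup
    print(native_capacity_tb(250))                # single base frame
    print(round(compressed_gb_per_tape(2.24)))    # at the ratio we see on LTO-1
    print(max_drives(16, 12))                     # fully built-out 3584
```

The compression ratio is the soft number here; it depends entirely on
the data mix, so treat the per-tape estimate as a ballpark.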

The 3584 has been a very nice ATL library for us. Footprint isn't bad;
the two frame setup is about 5'x5'.

 4)  Network is Gigabit Ethernet.

No problem. Get as many IBM FC5700 (Gig-E/fiber) or 5701 (Gig-E/RJ-45)
adapters as you want/need, plug in, determine how you want to handle the
networking (EtherChannel trunking, subnetted for each adapter, or
whatever).

 5)  We would like to use one large server vs. multiple servers.

Not a problem. We use the pSeries 660 model 6H1 server, which has been a
perfect fit -- it's modularized with the CPU and I/O units separate so
it hasn't been a physical hassle to rack mount it, and these 6H1s have
more slots than you'd believe. Up to 2 RIO drawers per system;
each drawer has *14* PCI slots! So we don't have slot pressure -- we can
throw in as many adapters as we want! FC cards, SCSI HBAs, Gig-E, SSA
adapters, whatever you want, throw it in there. It also has multiple
busses to spread out the bus traffic as well as a great memory and CPU
interconnection, internally, so performance is not an issue at all.

It also has 2 GB of RAM but can easily go to much larger configurations
as our load grows further. At today's memory prices and systems
configurations, probably wouldn't think twice about 4 GB of RAM... but
if stretching every dollar, 2 GB would be an adequate config for your
current needs (you mentioned number of clients, size of library, etc)
and short term needs.

Also has 4 processors in it due to the high I/O load -- CPUs are kept
busy putting data on / off I/O adapters, processing the data, then
offloading to yet more I/O adapters (network, disk, or tape). 4 is
a healthy number of CPUs that has worked out really well. Load is
nearly nonexistent! (We could probably do fine with 2 CPUs, but decided
it was easier to get 4 and overengineer slightly, to grow into it over
time, than to go back later and beg for expansion in a bad situation
with loading issues. Besides, it'd have meant downtime to add more CPUs
in the future amongst other logistical issues...)

The 6H1 also has dynamic hot-plug slots (you have to run a command to
turn slot power off, do whatever, turn slot power on, cfgmgr, done), a
service processor (good for powering on/off a machine remotely as well
as forcing a halt even if the OS is unresponsive... really rare but have
used that once), monitoring/management features, and many other things.

IBM isn't selling the 6H1 any more; they're selling its successors which
are even better -- the POWER4 based systems. (The 6H1 is basically a
rebadged and modularized H80 server, and one of the last machines
released before the POWER4 systems came out in force.)

So my point is simply that you want to look at a mid-level server that
has remote management, reliability and availability features, 

Re: 3583 Library Firmware upgrades

2004-01-20 Thread Dan Foster
Hot Diggety! Prather, Wanda was rumored to have written:

 You can go to www.storage.ibm.com and download the
 latest 3583 or LTO firmware any time.
 But it doesn't tell me what's CHANGED in each level of firmware.

The local IBM CE usually is able to get something for me if I make a
request for it explaining the rationale behind it (change management,
planning upgrade, need to verify it's fixed known existing issues we've
seen, etc) and asking really nicely. ;)

He gets it via internal request to the tape storage people (IBM Tucson?)
and gets actual info via either internal email or internal web site, prints
it out, and sends it to me.

I sure do wish they'd post the change info on the web, too!

-Dan


Re: 5.2 upgrade path

2004-01-19 Thread Dan Foster
Hot Diggety! Jolliff, Dale was rumored to have written:
 We are looking at an upgrade from 4.2.1.15 to some version that supports
 LTO Gen 2 drives, which I hear is 5.2 or later...

Nope, LTO-2 support arrived in 5.1.6.1. There are *serious* problems with
some of the 5.1.6 tree to the point where IBM actually yanked some code
after release; started to stabilize around .3 or .4; .5 is last in that
tree.

You can see LTO-2 support arrived in 5.1.6.1 by looking at:

ftp://ftp.software.ibm.com/storage/tivoli-storage-management/patches/server/AIX/5.1.6.1/TSMSRVAIX5161.README.SRV

Do note that the initial release (5.1.6.1) with LTO-2 support didn't cover
the MVS or PASE platforms, if you run either of these two platforms.

 I have the 5.0 media.

 Are all the versions between 5.0 and 5.2.x downloadable, or is there another
 step that requires a media set?

Don't know about 5.0... but we got 5.1, and all 5.1.x.x versions are freely
downloadable. 5.2.x.x stuff requires a 5.2 CD for the base filesets.

By 5.0, I think you mean 5.1? I don't think 5.0 was ever released. If you
do indeed mean 5.1, then you're in luck -- just upgrade to anything at or
between 5.1.7.0 and 5.1.8.2 (latest 5.1 release).

To get the fixes of any TSM version for any client or server on any platform:

ftp://ftp.software.ibm.com/storage/tivoli-storage-management/patches/

TSM server patches would be in the 'server' subdirectory of the above URL.
Then pick your server OS platform, then desired server version, and then
download the actual file containing patches.

 What is the oldest/most stable version that understands LTO Gen 2?

I'd say 5.1.7.0 or so... but we've had no problems with 5.1.8.0 and in
fact, something in 5.1.7.x or 5.1.8.0 fixed an annoying thread message that
used to appear in the actlog for every single backup session.

5.1.8.0 fixes a serious security hole; IBM **strongly** advises all 5.1
customers to run at least this rev for that reason. (There are also patched
versions of 4.2.x and 5.2.x for this hole as well.)

 Is anyone else out there running TSM/ACSLS/STK library/IBM Gen2 drives?

Nope, we use the 3584 with Gen 1 LTO drives, for what it's worth.

-Dan


Re: AIX startup of TSM scheduler

2004-01-07 Thread Dan Foster
Hot Diggety! Prather, Wanda was rumored to have written:
 This always works for me:

 nohup dsmc sched 2>&1 >/dev/null </dev/null &

Alternatively, drop this one liner in /etc/inittab on an AIX system:

dsmc::respawn:/usr/bin/dsmc sched -password=quiet >/dev/null 2>&1

(It is usually preferable to put this at or near the very end of the file)

Then do:

# kill -HUP 1

(to make inittab aware)

Subsequently, if you want to force the scheduler to update, it's just a
matter of kill -9'ing the 'dsmc sched' process ID and inittab would
automatically respawn a fresh copy of the dsmc scheduler.

For instance:

# kill -9 `ps -ef|grep '/usr/bin/dsmc sched'|grep -v grep|awk '{print $2}'`

This general approach works if you've got the client password stored as an
encrypted hash in a file locally... if you have a setup that requires
someone to manually enter the password every time, then this may not be a
suitable approach (use of inittab to respawn).
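If you'd rather look before you kill, the same selection the shell
pipeline does can be sketched in Python. This is only an illustration:
it assumes the usual 'ps -ef' layout with the PID in the second column,
and the function name and sample output are made up.

```python
def dsmc_sched_pids(ps_output, pattern="/usr/bin/dsmc sched"):
    """Return PIDs of lines matching pattern, skipping any grep lines.

    Mirrors: ps -ef | grep '/usr/bin/dsmc sched' | grep -v grep
             | awk '{print $2}'
    """
    pids = []
    for line in ps_output.splitlines():
        if pattern in line and "grep" not in line:
            fields = line.split()
            # Second whitespace-separated field is the PID in 'ps -ef'.
            if len(fields) > 1 and fields[1].isdigit():
                pids.append(int(fields[1]))
    return pids


if __name__ == "__main__":
    sample = (
        "     UID    PID   PPID  C    STIME   TTY  TIME CMD\n"
        "    root  12345      1  0   Jan 05     -  0:12 /usr/bin/dsmc sched\n"
        "    root  23456  20000  0 10:01:02 pts/0  0:00 grep /usr/bin/dsmc sched\n"
    )
    print(dsmc_sched_pids(sample))
```

Once the list looks right, killing those PIDs lets init respawn a fresh
scheduler as described above.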

-Dan


Re: List all tapes, highlighting those in library.

2004-01-05 Thread Dan Foster
Hot Diggety! Deon George was rumored to have written:
 Has anybody got an SQL, that can list all VOLUMES, and highlight (either
 showing the element number or something) those that are in the library?

 This report would be useful to see quickly if a list of tapes are already
 in the library - or which tapes in the list are not and need to be checked
 in.

I got a list of tables by doing:

tsm> select tabname,remarks from tables

Then I decided to look at the table called LIBVOLUMES because your request
is essentially the SQL equivalent of 'query libvolume'.

So then I decided to see what fields (columns) were present in the table
called 'LIBVOLUMES' with:

tsm: MYSERVER>select colname from columns where tabname='LIBVOLUMES'

COLNAME
------------------
LIBRARY_NAME
VOLUME_NAME
STATUS
OWNER
LAST_USE
HOME_ELEMENT
CLEANINGS_LEFT

Based on that, I figured only three columns might be useful. So:

tsm> select library_name,volume_name,home_element from libvolumes

...which would produce an output like:

tsm: MYSERVER>select library_name,volume_name,home_element from libvolumes

LIBRARY_NAME       VOLUME_NAME       HOME_ELEMENT
------------       -----------       ------------
3584LIB1           MYS000                    1026
3584LIB1           MYS001                    1027
3584LIB1           MYS002                    1028
3584LIB1           MYS003                    1029
3584LIB1           MYS004                    1030
3584LIB1           MYS005                    1031
[...snip...]

If you have only one library, you are free to leave the library_name off
the select query.

Parse the results as you like, regardless of if it's called via an internal
TSM server-side script or if it's called via a script that parses the
output of calling dsmadmc in batch mode. It is trivial in either case.
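As a rough illustration of the parsing step, here is a Python sketch
that takes the text of the select output plus a list of tapes you care
about, and marks which ones are in the library. It assumes three
whitespace-separated columns as in the sample output above; the function
names and the MYS999 volume are hypothetical, and getting the text out
of dsmadmc is left to you.

```python
def parse_libvolumes(output):
    """Map volume name -> (library name, home element)."""
    in_library = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows have exactly three fields ending in a numeric element;
        # this skips the prompt, header, dashes and '[...snip...]' lines.
        if len(fields) == 3 and fields[2].isdigit():
            library, volume, element = fields
            in_library[volume] = (library, int(element))
    return in_library


def report(wanted, in_library):
    """One line per wanted volume, flagging those not in the library."""
    rows = []
    for volume in wanted:
        if volume in in_library:
            library, element = in_library[volume]
            rows.append(f"{volume}  in {library} at element {element}")
        else:
            rows.append(f"{volume}  NOT in library (needs check-in)")
    return rows


if __name__ == "__main__":
    sample = (
        "LIBRARY_NAME   VOLUME_NAME   HOME_ELEMENT\n"
        "------------   -----------   ------------\n"
        "3584LIB1       MYS000        1026\n"
        "3584LIB1       MYS001        1027\n"
    )
    for row in report(["MYS000", "MYS999"], parse_libvolumes(sample)):
        print(row)
```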

-Dan


Re: 3584 LTO1 new firmware installed : 36UC

2003-12-31 Thread Dan Foster
Hot Diggety! Lambelet,Rene,VEVEY,GL-CSC was rumored to have written:

 we just upgraded the firmware for our LTO1 drives in the 3584. Version is
 now 36UC instead of 36U7. The bad bug (ask your CE about it please, if
 you are using 36U7) should be fixed now!! (Library is still at firmware
 level 3480).

More details from the CE half an hour ago:

LTO gen 1 drives with microcode version 36U6 through 36UB are affected by
this *really* serious bug. IBM is recommending customers put 36U5 on now if
they can't/won't load 36UC right away.
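If you have a pile of drives to check, the affected range can be tested
mechanically. A small Python sketch follows; it ASSUMES the last
character of the level sorts as 0-9 and then A-Z, which matches the
order of the levels mentioned here (36U5, 36U6, 36U7, 36UB, 36UC) but
is my guess, not something IBM documents in this thread. The function
name is hypothetical.

```python
# Assumed ordering of the final character of an LTO-1 microcode level.
ORDER = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def is_affected(level):
    """True if level falls in the 36U6 through 36UB range (inclusive)."""
    level = level.strip().upper()
    if len(level) != 4 or not level.startswith("36U"):
        return False
    rank = ORDER.find(level[3])  # -1 if the character is not in ORDER
    return ORDER.index("6") <= rank <= ORDER.index("B")


if __name__ == "__main__":
    for level in ["36U5", "36U6", "36U7", "36UB", "36UC"]:
        status = "affected" if is_affected(level) else "not in affected range"
        print(level, status)
```

Anything the check flags should still be confirmed with your CE.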

The bug: if certain types of permanent write errors are encountered during
operation AND a subsequent rewind command is issued, then an end of data
mark may be written at the BOT (beginning of tape).

That effectively loses existing data on the tape. The data can be
recovered, but the tapes must be sent to IBM for that, to be arranged
through the customer's CE if this pops up.

Looks like I get to load 36UC on Friday afternoon. :) Otherwise, you don't
want to find out at disaster recovery time that some tapes are not
currently usable for restores, with a wait of days or weeks between repair
and getting it back.

-Dan


Re: anyone had network problems with 3584?

2003-12-17 Thread Dan Foster
Hot Diggety! i love tsm was rumored to have written:

 The two 3584s that I run have both recently gone off the air, i.e. I can't
 ping them.  They both have green link lights on the NIC. They are set to
 100/Full  although I have tried Auto and 100/half.

3584 Operator Guide recommends they be set to 10/half (if hub) or 10/full
(if switch).

They do not work real well at 100 Mbps, and they haven't quite ironed out
the culprit despite multiple revisions of library firmware, from my
understanding. We have given up on ever getting it to work reliably at 100
Mbps. Seems fine at 10 Mbps, especially if the switch port is hardcoded to
match the 3584 enet settings exactly. Don't really trust autonegotiation.

IBM CE was here recently for a firmware upgrade and he had some internal
IBM documentation for himself (recently issued) that still advised if
Ethernet was hooked up, to set it to 10 Mbps only.

-Dan


Re: SV: Query from z/OS-platform to intel-platform

2003-10-24 Thread Dan Foster
Hot Diggety! Richard Sims was rumored to have written:
 3. What does the rumor say - when can we expect a real DB2 ;-)

 Imagine what the licensing fee for TSM would be then!!   ;-))

Might jack up the price quite a bit... but would likely still be several
orders of magnitude cheaper than licensing Oracle ;) /tongue-in-cheek

We've got various of the major databases in-house, and the licensing costs
for each is just simply... *amazing*. On some days, I get the distinct
impression we're singlehandedly paying for Oracle CEO Larry Ellison's
private high-performance jet.

-Dan


Re: TSM 5.1 server on AIX 5.2 64-bit?

2003-10-17 Thread Dan Foster
Thanks to everyone who helped out!

We successfully migrated the AIX 5.1 server (6H1) to AIX 5.2 using
alt_disk_migration cloning and then doing a migration upgrade install on
hdisk1 (while preserving AIX 5.1 on hdisk0). We had also done a bootable
mksysb to DVD-RAM beforehand with mkcd.

Was trivial -- only a few commands required from start to finish.

TSM 5.1 64-bit started up just fine under AIX 5.2 64-bit. Once we were
satisfied everything worked, after a while, we eventually removed the old
AIX 5.1 rootvg on hdisk0 and re-mirrored with AIX 5.2. Had there been a
serious problem with the AIX 5.2 environment, we'd have simply booted off
hdisk0 to gain the pre-upgrade AIX 5.1 environment and junked AIX 5.2 on
hdisk1. (This approach is similar to how you can use Solaris' flash
archives for OS installation on an alternate disk.)

As an unrelated note -- we had dsmc core dump with internal program error
while doing an incremental backup. Turns out it was because the Solaris
snapshot filesystem was corrupted internally with some bad metadata, so it
wasn't dsmc's fault after all. We unmounted, redid the snapshot, and
re-mounted it; dsmc ran fine that time. (We also figured out the cause of
the Veritas Volume Replication for the Oracle DB snapshot filesystem
corruption and applied a small adjustment to prevent it in the future.)

-Dan


TSM 5.1 server on AIX 5.2 64-bit?

2003-10-14 Thread Dan Foster
1. Will that combination work?

2. Is that a supported combination? Or is only TSM 5.2 server on AIX 5.2
   supported?

The docs I've seen to date haven't directly addressed TSM 5.1 server on AIX
5.2 64 bit, and I don't have a spare 64-bit 5.2 machine to test with at the
moment. (Got plenty of 32-bit 5.2 machines; verified the client works fine)

Thanks!

-Dan


Re: Mkcd-AIX help

2003-10-09 Thread Dan Foster
Hot Diggety! PINNI, BALANAND (SBCSI) was rumored to have written:

 Please excuse me for asking aix issue .But I think this was pressing issue
 for me right now.

 I use mkcd to backup AIX os to DVD . My question is what is equivalent to
 /etc/exclude.rootvg for mkcd compared to mksysb. I want to exclude file
 system (application) from backing up on DVD.

Well, you can do a mksysb if rootvg or savevg if non-rootvg volume group,
with your own exclude file. Then you supply that image to mkcd. Ie:

# mkcd -m /path/mksysb.image <any other flags>

or

# mkcd -s /path/savevg.image <any other flags>

-Dan


Re: Non-IBM LTO tapes in a 3584

2003-09-12 Thread Dan Foster
Hot Diggety! Shawn Price was rumored to have written:
  For what it's worth, my boss cheaped out and got some non IBM tapes for
  our 3584, and I've had nothing but problems with them. Mostly Imation
  and a few Emtecs. We are running the first generation LTO 1 drives, so
  there might be an issue with those as well. I have now gone in and just
  started pulling the tapes that have gone read-only more than twice and
  replaced them with IBMs.

Make sure your library and drives are at the latest firmware revision
levels -- this is especially important since more recent firmware revs
fix bugs that cause problems with non-IBM tapes, from what I hear.

Latest 3584 library firmware you can get from IBM's web/ftp site: 3300
if no D42 drives in library; else, 3060.

To check firmware rev for library w/AIX: # lscfg -vl smc0 | grep FW

The library firmware rev # also appears on the main LCD panel screen.

Latest 3584 drive firmware you can get from IBM's web/ftp site: 36U3 for
LTO-1 drives and 38D0 for LTO-2 drives.

To check firmware rev for drive w/AIX: # lscfg -vl rmt0 | grep FW
(...and for each drive... rmt1, rmt2, etc.)
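With a dozen drives it is easy to miss one, so here is a tiny Python
sketch that just prints the check commands for the library and each
drive. It assumes the drives are numbered rmt0 upward without gaps and
the library is smc0, as in our setup; adjust for yours. The function
name is made up.

```python
def firmware_check_commands(drive_count, library="smc0"):
    """Build the lscfg commands for the library and rmt0..rmt(N-1)."""
    commands = [f"lscfg -vl {library} | grep FW"]
    for n in range(drive_count):
        commands.append(f"lscfg -vl rmt{n} | grep FW")
    return commands


if __name__ == "__main__":
    for command in firmware_check_commands(12):
        print(command)
```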

-Dan


Re: New 3584 library and drive microcode

2003-09-02 Thread Dan Foster
Hot Diggety! Lambelet,Rene,VEVEY,GL-CSC was rumored to have written:
 Eric, following some unmount failures, IBM recommended to install 36U3 (36U5
 was chosen because it appeared on the IBM ftp site the day before).

I noticed that several people installed 36U5 but it does not appear to be
on either the web or ftp site... 36U3 is present for both. So they may have
yanked 36U5 -- possible serious bugs?

Just mentioning as an advisory note / heads-up.

-Dan


Re: audit library fails

2003-07-24 Thread Dan Foster
Hot Diggety! Karel Bos was rumored to have written:

 On both sites we had some problems with tapes stuck in drives. On both sites
 we had to do audit libraries to get TSM in synch with the libraries. While
 the tapes were stuck in the drive, the audit library fails and now one of
 the slots is empty (tape was in the drive and is completely gone) and still
 the audit fails. Has anyone seen this type of behaviour? Is it a TSM thing
 (don't think so because of the version difference between sites) or has
 something been altered in the microcode of the library what changes things?

The 3584's a great unit. TSM's audit library requests data from the 3584's
inventory data... so even though you may have removed a stuck tape, the
3584 may still not be aware of it for its scanned inventory data.

What works for us in this situation is to go to the front panel (or the
unit's web interface if you hooked up Ethernet to it) and have it do an
inventory. 45-50 seconds later, it's done. Now go to TSM and do an AUDIT
LIBRARY. It should now work as expected.

-Dan


Re: Wouldn't it be nice...

2003-07-24 Thread Dan Foster
You bring up a good point - these two folks are posting here because they
genuinely want to help out and are willing to tolerate the flak and other
divergent opinions at times. For having gone so far beyond the official
duty requirements, I think it'd be nice to give them due recognition and a
big round of thanks.

They are not official spokespeople; they are not expected or required to
participate here, to forward reports, to come up with ideas, to defend the
organization, amongst many other things, but they still hang out, anyway,
and help out a lot.

I would *never* jump on them for not doing more - they have already
willingly gone so far beyond what is expected of their duties; I appreciate
it and accept it, and allow them to be as human as they wish - that means
maybe they aren't always interested in responding to every single report
concerning their speciality at all times. That's all right; that's why we
have support contracts to guarantee we will always get the appropriate
support that we need at the right times.

If we wish to have greater participation and representation by IBM/Tivoli
experts here in adsm-l, I believe the appropriate folks to discuss it with
or pressure would be their management, directly.

Oftentimes, there are reasons why more people from a vendor don't participate in
their product's public forums -- fear of legal issues, not enjoying being
singled out and flamed (we're all human) for decisions made by others, an
environment that's perceived to be demanding, amongst many other reasons.

So, thank you very much, Mr. Raibeck and Mr. Hoobler.

(amongst numerous others who also post here, of course.)

-Dan


Re: LTO throughput - real world experiences

2003-07-08 Thread Dan Foster
Another comment... LTO numbers should probably be reported as being
LTO-1 or LTO-2.

LTO-1 max uncompressed is 15 MB/sec, max compressed is 30 MB/sec; LTO-2 max
uncompressed is 35 MB/sec, max compressed is 70 MB/sec. This assumes a
typical average compression ratio of 2:1, of course.
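To make the relationship between those numbers explicit, here is a short
Python sketch. The native rates are the ones quoted above and the 2:1
ratio is the stated assumption; the function names and the 100 GB
example are only for illustration, using decimal units (1 GB = 1000 MB).

```python
# Max native (uncompressed) rates in MB/sec, as quoted above.
NATIVE_MB_PER_SEC = {"LTO-1": 15, "LTO-2": 35}


def compressed_rate(generation, ratio=2.0):
    """Max rate in MB/sec for data that compresses at the given ratio."""
    return NATIVE_MB_PER_SEC[generation] * ratio


def seconds_to_write(gigabytes, mb_per_sec):
    """Time to stream a given amount of data at a sustained rate."""
    return gigabytes * 1000 / mb_per_sec


if __name__ == "__main__":
    for generation in NATIVE_MB_PER_SEC:
        print(generation, compressed_rate(generation), "MB/sec at 2:1")
    # Example: 100 GB of 2:1-compressible data on an LTO-1 drive.
    minutes = seconds_to_write(100, compressed_rate("LTO-1")) / 60
    print(round(minutes), "minutes")
```

Real data rarely compresses at exactly 2:1, so pass in your own ratio
when comparing against measured throughput.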

We use tapeutil under AIX (part of the Atape driver) to test the raw
throughput to tape with the rwtest option. May be possible to do similar
tests with other tools under other OSes.

We did this with the drive in question disabled in TSM then put an unused
blank tape (unlabelled and unknown to TSM) in the upper I/O station, closed
the door, then did:

# tapeutil -f /dev/smc0 move 769 257
(move from first I/O station slot to first drive in first frame; AIX knows
this drive as rmt0 in our case. **MAKE SURE YOU VERIFY THE ELEMENT ID FOR
DRIVE TO OS'S DRIVE NAME MAPPING OR YOU COULD DESTROY A TAPE WITH VALID
DATA ON IT!!!**)

# tapeutil -f /dev/rmt0 rwtest -b 262144 -c 20 -r 3
(do a destructive read/write test with a 256K block size, 20 blocks for
each read test and 20 blocks for each write test, and run both sets of
tests 3 times)

# tapeutil -f /dev/smc0 move 257 769
(put the tape back in the upper I/O station slot when done)

This destroys the ANSI tape label so it's best to do this on a previously
unused tape or you'll have to re-label (dsmlabel or 'LABEL VOLUME') it
afterwards.

And then, of course, re-enable the drive within TSM.

That gives us some idea of the raw potential. Then within TSM, I like to
see how fast it can push large files through to tape in a diskpool-tape
pool migration. Of course, this is also another 'test best case' scenario
because tape drives are more optimized for large files than for a number of
small ones, but it still gives you an idea about the upper bound of real
world performance.

Between the tests and TSM tape writes, it appears we get 22.5 MB/sec
sustained in compressed mode for LTO-1 drives. That's pretty decent; it's
halfway between the native (15 MB/sec) and compressed (30 MB/sec) maximums,
and we're happy with its performance so we've got no complaints :-)
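
For planning purposes, the sustained rate converts to a per-hour figure.
A minimal Python sketch of that arithmetic, assuming decimal units
(1 GB = 1000 MB); the function names are illustrative only and not part
of any TSM or Atape tool:

```python
# Rough throughput arithmetic for an LTO-1 drive, using the figures quoted above.
NATIVE_MB_S = 15.0      # LTO-1 max uncompressed rate
COMPRESSED_MB_S = 30.0  # LTO-1 max at an assumed 2:1 compression ratio


def midpoint(low, high):
    """Return the value halfway between two rates."""
    return (low + high) / 2


def gb_per_hour(mb_per_sec, mb_per_gb=1000):
    """Convert a sustained MB/sec rate into GB/hour (decimal GB assumed)."""
    return mb_per_sec * 3600 / mb_per_gb


sustained = midpoint(NATIVE_MB_S, COMPRESSED_MB_S)
print(f"{sustained} MB/sec is about {gb_per_hour(sustained):.0f} GB/hour")
```

With the numbers above this works out to 22.5 MB/sec, or about 81 GB per
hour per drive.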

I should note that we don't really do database backups (well, we usually
export them to a SQL dump file in ASCII and back that up because the TDP
type tools are priced *outrageously* and our needs aren't _that_ pressing)
and that our data generally compresses well to tape -- we get an average of
2.24:1 hardware compression across the entire tape pool with a few bursts
of 3:1 and once, almost 4:1 compression for a LTO-1 tape.

The only way we get this kind of performance is to put a disk pool in front
of the tape drives... and we also limit it to two drives per SCSI bus
(because they're capable of peaking at 30 MB/sec or so in compressed
mode... so 2x30 = 60 MB/sec, which means you need at least an Ultra2 Wide
SCSI controller at 80 MB/sec) to avoid contention or bottlenecks at the bus
level.
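
The two-drives-per-bus rule is really just a bandwidth budget. A rough
Python sketch, assuming nominal bus rates and ignoring SCSI protocol
overhead (both simplifications); the helper name is made up for
illustration:

```python
import math


def max_drives_per_bus(bus_mb_s, drive_peak_mb_s):
    """Number of drives that can peak at once without exceeding the bus rate."""
    return math.floor(bus_mb_s / drive_peak_mb_s)


# 30 MB/sec is the LTO-1 compressed peak quoted above; the bus rates are
# nominal figures assumed for this example.
print(max_drives_per_bus(80, 30))   # 80 MB/sec bus
print(max_drives_per_bus(40, 30))   # 40 MB/sec bus
```

An 80 MB/sec bus has room for two drives at peak, while a 40 MB/sec bus
only has room for one.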

The earlier tips about testing the FC setup were good, too.

-Dan


Re: TSM Staffing level question for Enterprise Shop

2003-07-07 Thread Dan Foster
Hot Diggety! Gordillo, Silvio was rumored to have written:
 TSM Staffing Level Question:

 I'm searching for documentation, industry best practised, bench marks,
 etc... on recommended staffing levels for our TSM Environment.  Need to
 get pointed in the right direction to provide management with levels of
 support for enterprise sized TSM shops.  Maybe some TCO (total cost of
 ownership) validation docs out there to support staffing levels.  Below
 is what we manage.

 4 TSM Servers ver 4.2.2.12 (they will be replaced  upgraded soon) AIX
 4.3.3 780 TSM Client Nodes various flavors databases mostly oracle, SQL,
 Exchange 6 LAN-Free Clients. (Gresham EDT Knowledge) 4 STK 9310 Silos
 running ACS Server 6.0.1 Direct off-site copy pool stg process between
 data centers via Optical Carrier

Well, my take on it is that an enterprise sized backup system needs a
bare minimum of two skilled administrators - mostly for coverage reasons
(vacation, illness, unavailability in crisis, etc).

We're a small shop here - about 150 TSM clients, and we've found that
one administrator is just fine (although we really do want more... but
interesting other co-workers in learning can be a bit of a daunting task
sometimes...).

Reason? Once you've configured TSM to your satisfaction, it pretty much
runs itself if you've set it up properly -- including scripts to manage
routine events, to check for abnormal events, etc.

The high cost of TSM setup for us was purely in the time it took to
read/re-read the manual and experiment with various setups, do testing,
iron out stupid SCSI issues with the tape library, and then to write
about 40 TSM management scripts for the first pass of the updated TSM
platform.

After that, it pretty much ran itself trouble-free after we'd worked out
the initial kinks during the first week of production for the updated
TSM setup (5.1, new library, new server, etc). Only change we had to do
after introducing the tested setup into production was to increase the
tsm db log size. That's it.

So for us, the expense is mostly in the time spent planning and configuring
the setup... post-setup, costs us just about nothing for manpower - two
seconds to scan the daily reports and that's it.

However, with more complex setups, definitely do need more people -
especially useful if you had a serious crisis such as a large scale site
breakage or even just for coverage reasons.

How to calculate how many people required... that's honestly a good
question. I'm of the school of thinking that says with a properly designed
setup, it should be able to pretty much scale up to any size.

I know I could go to 500-1000+ clients with the existing tools without
modification because I designed it to scale up easily and without any
additional work. So in theory, I could continue to run this setup ok just
by myself... although that's not the most recommended idea for a large setup.

So... 'appropriate number'? May vary with actual production workload -
for instance, are TSM administrators expected to do tape swaps? If a *lot*
of tape swaps each week (or day) is required, this can take hours at really
large sites. Also, if you have older tape technologies or a really complex
setup that breaks down often, then that's also very labor intensive in
dealing with it, too... and would require more administrators.

Also, if you have significant tape infrastructure at multiple sites that
are far apart, you would most likely want to have at least one local TSM
administrator per site - for tape swaps, for debugging tape library issues,
for disaster recovery, etc. If it was a couple of sites in the same town or
nearby, could get by with fewer people... but harder to do that if you had
sites 500+ miles apart. I don't mean a couple drives and 50+ tapes at a
site, I mean the more significant ones with multiple robots and hundreds
or thousands of tapes.

A single admin who is very good at churning out decent scripts to manage
the platform in 'automatic' mode (self-running) is worth his/her weight in
gold and is significantly cheaper than 5 or 10 less experienced people who
insist on doing everything manually (very labor and time intensive). So
admin experience and skill level also have to be factored into the TCO.

Because experienced admins making good tools will have setups that pretty
much run themselves well and effectively which saves you heaps of money in
the long term and hence, reduces the overall TCO if averaged and amortized
on a yearly basis.

So, the answer you seek... will depend on a variety of factors... age of
infrastructure, complexity of infrastructure, size of installation, number
of sites with sizable tape infrastructure, skill of personnel, coverage
needs, job expectations/responsibilities, growth rate for tape
infrastructure over 6-36 months, and so forth. Training is also a more
significant cost with more complex and larger enterprise setups -- training
classes, test server and associated hardware to 

Re: oops forgot the error message.

2003-06-22 Thread Dan Foster
Hot Diggety! Scott Figgins was rumored to have written:
 I'm getting this error in my Server log on AIX 5.1 with TSM 5.1.6.5 after
 defining a 7336-205 4mm Tape library. Any ideas? IBM says this library is
 supported in the BASE codeset.

1. What is the output of:

# lsdev -Cc tape

2. What Atape version do you have?

3. What were the commands you used to define the library?

-Dan


Re: oops forgot the error message.

2003-06-22 Thread Dan Foster
Hot Diggety! Dan Foster was rumored to have written:
 Hot Diggety! Scott Figgins was rumored to have written:
  I'm getting this error in my Server log on AIX 5.1 with TSM 5.1.6.5 after
  defining a 7336-205 4mm Tape library. Any ideas? IBM says this library is
  supported in the BASE codeset.

 2. What Atape version do you have?

Er, I just realized that I'm not sure you need the Atape driver for a 7336,
and most likely you don't need it. I've got an IBM 4mm library but last
time we used it was about seven years ago when we did backups with Sysback.

Do you have patches applied to AIX 5.1? Latest maintenance level is either
ML3 or ML4.

-Dan


Re: TSM LTO 3580 Performance

2003-06-13 Thread Dan Foster
Also, is there a diskpool? ie:

client -> server[diskpool] -> server[tapes]

If there's no diskpool in between the client and the tape drives, will be a
lot of stop/go writes to tape, resulting in about 1 MB/sec vs 10-25 MB/sec.

At least, that's true for LTO-1 drives. I've heard that LTO-2 drives handle
this kind of situation better.

-Dan


Re: TSM LTO 3580 Performance

2003-06-13 Thread Dan Foster
Hot Diggety! Colby Morgan was rumored to have written:
 The SCSI adapter is a 2940UW.  It does run great with large files, so if we
 had an adapter hardware bottleneck file size shouldn't make a difference.
 We are running the v5.0.2183.1 of the Microsoft drivers.  I also opened a
 call with IBM, and they didn't make any recommendations to update the
 adapter drivers.

How many drives per SCSI bus? We limit it to two for an 80 MB/sec SCSI bus
because a single drive is capable of pushing up to about 30 MB/sec in
compressed mode over the SCSI bus.

As a side note - with a diskpool, we achieve about 70-80 GB/hour with our
LTO-1 drives, so the fact you're getting 1/5 to 1/8 of that performance
does sound pretty terrible, indeed.

Could it be cabling quality issues - ie, reflection on the bus causing
excessive retries or other related SCSI errors? That's another place where
performance could be killed.

-Dan


Re: Small TSM deployment

2003-04-05 Thread Dan Foster
Hot Diggety! Levinson, Donald A. was rumored to have written:
 has anyone deployed TSM from a B50 or a 43p 140 ?
 I need to back up about 4 GB a night on a local network from 2 clients and
 maintain a total tape pool of about 2.5 TB on about 100,000 objects or less.

A B50 is just a repackaged 43P-150 in a rack mount case. Either one (the
B50 or the 150) is a very capable machine.

Only limitation to the B50 is that it is not slots-happy ;) Has only two
expansion slots so you have to choose *VERY* carefully with what and how you
expand it... and it also can not support gigabit ethernet adapters --
presumably due to the fact it's a 32 bit machine.

Doing some quick calculations shows 8 GB/night to be about 8.3 MB/sec for
about 8 hours which means about 70 Mbps on the network -- which would just
barely saturate a fast ethernet network.

You would be *highly* advised to add a second FE network adapter to the B50
and split the clients -- point half of the clients towards one B50
interface and the other half towards the second FE interface.

Ie:

backup1.foobarbazz.com points to 10.1.1.1
backup2.foobarbazz.com points to 10.1.2.1

(and both IPs would be individual IPs on each ethernet interface...
would put both on separate IP subnets such as a class C or /24
because this would prevent return traffic from hogging one primary
outgoing interface.)

On half of your clients, point them to backup1 in dsm.sys, on other
half, point to backup2 in dsm.sys, etc. How do you decide which clients
to split up? Do it based on data volume for backed up data.

If one client does 50% of the data traffic and the other 99 clients do the
remaining 50%, then that's how you'd split it up.
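
One way to do the volume-based split is a simple greedy pass: sort the
clients by nightly volume and always hand the next one to the less loaded
interface. A minimal Python sketch under that assumption; the client names
and volumes are hypothetical, and the interface names only echo the
backup1/backup2 example above:

```python
def split_clients(nightly_gb, interfaces=("backup1", "backup2")):
    """Greedy split: assign the largest clients first to the least-loaded interface."""
    load = {name: 0.0 for name in interfaces}
    assignment = {name: [] for name in interfaces}
    ordered = sorted(nightly_gb.items(), key=lambda kv: kv[1], reverse=True)
    for client, gb in ordered:
        target = min(interfaces, key=lambda name: load[name])
        assignment[target].append(client)
        load[target] += gb
    return assignment, load


# Hypothetical nightly volumes in GB: one heavy client and several light ones.
volumes = {"dbserver": 50, "web1": 10, "web2": 10, "mail": 20, "files": 10}
assignment, load = split_clients(volumes)
print(assignment)
print(load)
```

In this made-up case the heavy client ends up alone on backup1 and the
four lighter ones share backup2, with 50 GB on each side. Greedy
assignment is not guaranteed to be optimal, but it is usually close
enough for a two-way split.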

I would not recommend the use of a 140 because IIRC, it's a PReP based
machine and I think AIX 5.2 dropped support for these? The current
generation RS6Ks and pSeries servers are CHRP based, including the 150 and
B50.

Also, if you can daisy chain your tape drive(s) off the onboard SCSI
adapter for the B50, that might work... but you would most likely want
to have a diskpool and split the SCSI bus between disks and tapes; which
implies adding a second SCSI adapter for the B50, using up its second
expansion slot (first would be for another FE adapter).

Otherwise you could end up with a serious performance bottleneck that
could totally negate whatever the B50 is capable of handling.

It's also uniprocessor which could potentially mean some CPU contention
for various resources (tapes, disks, network, cpu) when it's busy.

In short, the B50 should work out ok, but you appear to be pushing
the very limits of its packaging for your _current_ needs, nevermind
long-term future growth needs.

You would be highly advised to get a slightly beefier machine, or at least
a machine with more PCI slots than the B50.

Right now, the next step up above the B50 would be the pSeries 6C1 or 6C4,
at about double to triple the cost of the B50. Unfortunately, we have not
found any other intermediate range machines between these.

I do use a B50 for some TSM testing so I know it works fine as long as you
don't try to push more data through it than what its adapters can handle.

-Dan


Re: urgent! server is down!

2003-03-12 Thread Dan Foster
Hot Diggety! Richard Sims was rumored to have written:
 After a reboot yesterday tsm doesnt start. ...
 ...
 ANR0900I Processing options file dsmserv.opt.
 ANR000W Unable to open default locale message catalog, /usr/lib/nls/msg/C/.
 ANR0990I Server restart-recovery in progress.
 ANRD lvminit.c(1872): The capacity of disk '/dev/rtsmvglv11' has
 changed; old capacity 983040 - new capacity 999424.
 ...
 I would take a deep breath and stand back and think about that situation
 first...  There's no good reason for a server to be running fine and
[...]

I agree with what Richard had to say. Taking a deep breath is always step
#1 for handling a crisis without making it worse.

999424 - 983040 = 16384, which is exactly 16 MB and sounds suspiciously
like the PP size. 'rtsm...' sounds like a raw LV rather than a filesystem.
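
The same check can be written down as a tiny Python sketch. It assumes the
capacities in the ANR message are in KB and that the volume group uses a
16 MB physical partition size; both are inferences from the numbers above
rather than anything confirmed, and the function name is illustrative:

```python
def grew_by_whole_pps(old_kb, new_kb, pp_size_mb=16):
    """Return how many whole physical partitions the change equals, else None."""
    delta_kb = new_kb - old_kb
    pp_kb = pp_size_mb * 1024
    if delta_kb % pp_kb != 0:
        return None
    return delta_kb // pp_kb


print(grew_by_whole_pps(983040, 999424))
```

For the capacities in the error message this reports a growth of exactly
one partition, which is what an 'extendlv ... 1' would produce.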

Perhaps someone with root access had done this at some point earlier:

# extendlv tsmvglv11 1

[or had done the equivalent in SMIT.]

(DO NOT EXECUTE THE ABOVE COMMAND! I am only theorizing what may have
happened)

As for en_US vs C, do this:

# grep LANG /etc/environment

If it says LANG=C then try:

1. Changing it to LANG=en_US in /etc/environment
2. At the root prompt: # export LANG=en_US
3. Try starting up TSM now

And you will probably want to ask your operations staff if anyone had
increased the LV's allocation, perhaps by one physical partition with
extendlv or similar. If someone had done it, I'd have made them put Humpty
Dumpty back together as a great learning experience ;) Tell people to *NOT*
mess around with the TSM server if they do not know what they're doing.

I did a quick test with TSM 5.1 by creating a small 16 MB DB logical
volume (1 PP), started up server OK. Then I did 'extendlv tsmdblv 1',
halted server, and tried to start it up again. I got the exact same
errors you got.

I suspect you may have to remove that LV, recreate it with the expected
size that TSM wants, then do a DB restore from your most recent full db
backup tape.

But before you do that, you'll want to save a copy of your current device
config and volume history file if you have these, as well as your
dsmserv.opt file. Then look in the TSM 5.1 Server for AIX Administrator's
guide at:

http://publibfp.boulder.ibm.com/epubs/pdf/c3207680.pdf

(This is assuming you use TSM 5.1 for AIX; if you use another version,
you'll want to consult that guide instead, but the steps will probably
be similar or still exactly the same.)

DB restore is covered in Chapter 22. 'Restoring a Database to its Most
Current State' at bottom of page 524 is probably your easiest option since
it sounds like you have everything else intact -- volume history info,
logvols, stgpool vols, etc.

Then you'll have to delete (with 'rmlv -f tsmvglv11') the offending LV,
and recreate it (with 'mklv -y tsmvglv11 <vg> <number of PPs>'). Then...

Find out which tape has the most recent full DB backup, then do:

# cd /usr/tivoli/tsm/server/bin
# ./dsmserv restore db devclass=<whatever> vol=<tape volser>

If that command worked (it's a preview, basically), then do:

# ./dsmserv restore db devclass=<whatever> vol=<tape volser> commit=yes

...which will make the restore actually happen, for real.

The actual restore operation is no big deal if you have a good and recent
db backup tape, and know which tape it is. I did this as part of testing
recently, and it worked right off the bat with no problems at all.

If you don't know which tape volser has the latest full db backup, then
you could look into your volume history file. For example, with my setup:

backup1:/usr/tivoli/tsm/server/bin# grep BACKUPFULL volhist.cfg
 2003/02/21 14:41:24  BACKUPFULL  5  0  1
3584_DEVCLASS1 ROC010
 2003/03/01 21:06:51  BACKUPFULL  6  0  1
3584_DEVCLASS1 ROC012

(The file might be called 'volhistory.cfg'; I had explicitly defined mine
to be 'volhist.cfg' at server installation time.)

I have two full db backup tapes... one was done on 2/21, is version 5.
The more recent one was done on 3/1, version 6. So I'd restore ROC012,
for example.
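
Picking the newest full backup out of the volume history can also be
scripted. A minimal Python sketch, assuming the entries look exactly like
the grep output above (timestamp, BACKUPFULL, three numeric fields, device
class, volser, possibly wrapped across two lines); other TSM versions may
lay the file out differently, and the function name is illustrative:

```python
import re
from datetime import datetime

# Timestamp, BACKUPFULL, series number, two more numeric fields, then
# device class and volser. \s+ also matches the line wrap seen above.
RECORD = re.compile(
    r"(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s+BACKUPFULL"
    r"\s+(\d+)\s+\d+\s+\d+\s+(\S+)\s+(\S+)"
)


def latest_full_backup(volhist_text):
    """Return (timestamp, series, devclass, volser) of the newest BACKUPFULL entry."""
    records = [
        (datetime.strptime(ts, "%Y/%m/%d %H:%M:%S"), int(series), devclass, volser)
        for ts, series, devclass, volser in RECORD.findall(volhist_text)
    ]
    return max(records) if records else None


sample = """\
 2003/02/21 14:41:24  BACKUPFULL  5  0  1
3584_DEVCLASS1 ROC010
 2003/03/01 21:06:51  BACKUPFULL  6  0  1
3584_DEVCLASS1 ROC012
"""
print(latest_full_backup(sample)[3])
```

For the two entries shown, this picks ROC012.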

I should warn you that any backups done after the date/time of the most
recent DB backup will effectively be lost, so be sure you really do want to
restore the DB before committing to it. Doesn't sound like you have too
much of a choice in this particular case.

Note for the discerning ADSM-L reader: my production server has daily db
backups! The above was from a test box.

-Dan


Re: Automatic checkin on a 3584?

2003-03-04 Thread Dan Foster
Thanks for all the comments, suggestions, ideas, and code examples.

They were invaluable! I also found expect to be an interesting
approach that I hadn't considered, but makes sense.

Looks like I'm on the right track now; thanks again!

-Dan


Automatic checkin on a 3584?

2003-03-03 Thread Dan Foster
Howdy -

I seem to be having some sort of timing issues with the checkin
process via an automated script. The procedure:

1. Fill the I/O station(s) with brought back tapes to be checkin'd
2. Issue a 'checkin libvol library search=bulk status=scratch
checklabel=barcode'
3. Wait a bit for a request to appear in the request queue
4. Issue a reply corresponding to what appears in 'q req'
5. Wait a bit for the whole checkin process to finish in search mode
6. Issue a 'checkin libvol library search=bulk status=private
checklabel=barcode'
7. Wait a bit for a request to appear in the request queue
8. Issue a reply corresponding to what appears in 'q req'

Manually, works great. My script parses all data just fine using
dsmadmc in batch mode (with -id= -pass=...) and the command syntax is
correct -- I know because it works fine by hand.

I'm a little stuck in getting the exact timing *just right* for the
delays as well as the best way to construct a while loop logic to get the
timing right. It just runs through it so fast that it exits before it's
got a chance to answer both requests. Detecting certain corner cases seems
a little interesting when you throw in vagaries of timing, also.

Alas, cheating by direct loading and using search=yes to avoid
the requests isn't an option for the weekly tape swaps. ;)

Does anyone happen to have a similar script fragment or even just
suggestions on how to construct the while loop 'just so'? :) Anything would
be much appreciated! (I'm trying to make the limited human interaction with
tapes reasonably bulletproof without requiring them to manually wade
through TSM - reduces chances of errors and not all personnel are skilled.)
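
One common way around fixed sleeps is to poll for the condition with a
timeout instead of guessing the delay. A rough Python sketch of that idea
follows. It is only an illustration under stated assumptions: run_admin is
a hypothetical callable that would wrap dsmadmc in batch mode and return
its output as text, and the regular expression is a guess at what a
pending request line in 'q req' looks like, so it needs checking against
real output before use:

```python
import re
import time


def wait_for(check, timeout=300.0, interval=5.0,
             sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns a truthy value or the timeout expires."""
    deadline = clock() + timeout
    while True:
        result = check()
        if result:
            return result
        if clock() >= deadline:
            raise TimeoutError("condition not met within timeout")
        sleep(interval)


# Assumed shape of a pending request line in 'q req' output: a message
# number followed by a three-digit request ID and a colon.
REQUEST_LINE = re.compile(r"^ANR\d{4}I\s+(\d{3}):", re.MULTILINE)


def pending_request_ids(q_req_output):
    """Extract request IDs from the text returned by 'q req'."""
    return REQUEST_LINE.findall(q_req_output)


def answer_next_request(run_admin, **wait_args):
    """Wait for a request to show up, then reply to the first one."""
    ids = wait_for(lambda: pending_request_ids(run_admin("q req")), **wait_args)
    run_admin(f"reply {ids[0]}")
    return ids[0]
```

The same wait_for helper could cover steps 3, 5 and 7 of the procedure by
passing a different check each time, e.g. one that returns True once the
checkin process no longer shows up in the process list.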

-Dan


Re: Automatic checkin on a 3584?

2003-03-03 Thread Dan Foster
Hot Diggety! GUILLAUMONT Etienne was rumored to have written:

 You didn't say in what type of OS you where. If it is unix, no problem, you

Oops! I'm normally good about that. Server is AIX 5.1 on pSeries 660 model
6H1 server.

I do have some sleeps, but it's still running too fast or not exactly right.

-Dan


Re: Linux 5.1.5.14-client

2003-02-19 Thread Dan Foster
Hot Diggety! Tom Tann{s was rumored to have written:
 Hello *SM'ers!

 Is there a problem/bug in the libPiSNAP.so and/or libPiIMG.so?

 We don't use the default installation-path, but DSM_DIR, DSMI_DIR etc are
 set to the right paths.

 dsmc generated these entries in dsmerror.log:
 
 02/19/2003 20:18:45 dlopen() of /local/tsm/bin/plugins/libPiSNAP.so failed.
 02/19/2003 20:18:45 libApiDS.so: cannot open shared object file: No such
 file or directory.

You have two options:

1. set up symlinks in /usr/lib pointing at the libraries, then do 'ldconfig'

or...

2. add /local/tsm/bin/plugins to /etc/ld.so.conf then do 'ldconfig'

-Dan



Where to get 5.1.6.0 server for AIX?

2003-02-10 Thread Dan Foster
I have the 5.1.6.1 server code but that apparently requires 5.1.6.0
code be installed...and I can only find 5.1.5.4 and 5.1.6.1, but no
5.1.6.0 on the FTP site...?

Is 5.1.6.0 supposed to be for-pay software for existing v5 customers?

I know they yanked 5.1.6.0 due to serious QA issues, but it seems
unbelievable they'd issue 5.1.6.1 if it requires a yanked version
that is no longer available?

Are there any workarounds for going from 5.1.5.4 to 5.1.6.1 without
having to deal with 5.1.6.0? Or is this going to be a case of calling
TSM support and requesting a special cut of the package for 5.1.6.1?

-Dan



Re: Where to get 5.1.6.0 server for AIX?

2003-02-10 Thread Dan Foster
Hot Diggety! [EMAIL PROTECTED] was rumored to have written:

ftp://service.boulder.ibm.com/storage/tivoli-storage-management/maintenance/server/v5r1/AIX/LATEST/

Ahh! No wonder -- thanks! (I was looking at patches rather than maintenance
tree.)

Got it now.

-Dan



Re: LTO Cleaning Cartridge Mount Points

2003-01-08 Thread Dan Foster
Hot Diggety! Joshua Bassi was rumored to have written:

 Does anybody know offhand the estimated number of cleanings an IBM LTO
 cleaning cartridge supports before it should be disposed of?  Thanks in
 advance.

Exactly 50. ;) It's got a built-in counter on the tape, and the IBM
libraries will report this via front panel LCD (at least, on the 3584,
it does). I believe it returns an error if you try to use it a 51st time,
from my recollection of the manual, but don't hold me to that.

-Dan



Re: 3584 library loading

2002-12-17 Thread Dan Foster
Hot Diggety! Matthew Glanville was rumored to have written:
 But, I have found bulk loading tapes with the door open on the 3584 is a
 problem when tapes are in the drives. I had to figure out which slots NOT
 to put tapes into because those are the slots that the tapes in the drives
 use.  When TSM tries to unload a tape after you closed the door it knew
 which slot it was in and tries to put it there, thus failing if you have
 manually put a tape there.  A simple audit library check=barcode don't work
 as it wont run until all drives are unmounted... Is that called a paradox
 or a conundrum?

With TSM v5, you can parse the output of an undocumented (I think?) command:

tsm> show library <libraryname>

It will have something like this for the drive that has a tape loaded:

  loaded volume home slot=1033
  loaded volume name=ROC007

That tells you a) what the tape label name is that's loaded in drive right
now, and b) which element number it was originally fetched from.

So you just basically parse the output of that command and find out all
the home slot numbers, and *NOT* load tapes in them. Volume name may be
useful if you're going to be physically eyeballing the library, but
otherwise not programmatically useful.
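
A minimal Python sketch of that parsing step, assuming the output contains
lines exactly like the two shown above and that each home slot line is
paired with a volume name line in the same order (an assumption, since the
command is undocumented); the function name is illustrative:

```python
import re

HOME_SLOT = re.compile(r"loaded volume home slot=(\d+)")
VOLUME_NAME = re.compile(r"loaded volume name=(\S+)")


def reserved_home_slots(show_library_output):
    """Map each home slot that must stay empty to the volume currently in a drive."""
    slots = HOME_SLOT.findall(show_library_output)
    names = VOLUME_NAME.findall(show_library_output)
    return {int(slot): name for slot, name in zip(slots, names)}


sample = """
  loaded volume home slot=1033
  loaded volume name=ROC007
"""
print(reserved_home_slots(sample))
```

For the sample lines this yields slot 1033 mapped to ROC007, i.e. slot
1033 is one to leave empty when bulk loading.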

I'm not sure if there's a better and documented command to do the same
sort of thing. Also, don't recall if this existed in previous versions,
but pretty sure I remember seeing it in TSM v4.

Alternatively, if you're using AIX, you could parse the output of:

# tapeutil -f /dev/smc0 inventory

Something along the lines of:

# tapeutil -f /dev/smc0 inventory | grep -p "^Drive Address" | \
grep "Source Element Address" | grep -v Valid | \
awk '{print $NF}'

-Dan



Re: TSM SERVER 5.1.5.2

2002-12-10 Thread Dan Foster
Hot Diggety! David Longo was rumored to have written:
 5.1.5.2 has been out about a month or so now.
 Also that something else isn't broke?

There was a major performance bug fixed in 5.1.5.3; we had a really
nasty situation where a west coast client was sending data at only 2 Mbps
to the east coast server; raw network between the two is up to 32 Mbps
as evidenced by ttcp tests between west coast client and other east coast
servers on the same subnet.

Turns out we needed an efix for the pre-release version of fixed netinet
driver in AIX 5.1 (to be bos.net.tcp.client 5.1.0.37 and released next week)
to fix the OS perf bug *AND* to also upgrade to 5.1.5.3 from 5.1.5.2.

Doing so improved the ITSM network perf from 2 Mbps to 8 Mbps, and raw OS
network perf to 32 Mbps. Merely fixing the OS perf bug wasn't sufficient
for us. (We tested and retested between every single step to rule out
multiple variables.)

From the 5.1.5.3 README:

PQ68076 PERFORMANCE DEGRADATION AFTER UPGRADE TO 5.1.5.0, 5.1.5.1 OR

-Dan



How do you back up 2 PB of data?

2002-11-19 Thread Dan Foster
2 PB is 2,048 TB, or 2,097,152 GB.

A fun thought exercise:

http://www.cnn.com/2002/TECH/biztech/11/19/ibm.supercomputerr.ap/index.html

Well, assuming several things:

1. Using LTO (just because I know the numbers for this best off
   the top of my head) -- a 3584 library

2. LTO delivers maximum of 30 MB/sec in compressed mode, but
   22-23 MB/sec is usually realistic. Let's use 22.5 MB/sec.

3. Typically 1.7:1 to 1.8:1 ratio for hardware compression
   Let's use 1.75, or 175 GB for a 100 GB uncompressed tape.

4. 72 drives per maxed out LTO setup (1 base frame + 5 expansion
   frames) for about 2000 tapes in all frames?

5. A single 3584 complex therefore holds (using hardware
   compression) a grand total of 175 GB * 72 = 12.6 TB of
   compressed data on the tapes mounted in its drives at any one
   time, and assuming the client is constantly streaming data to
   the ITSM server at peak efficiency, each drive can back up
   81 GB per hour (22.5 MB/sec * 3600) at sustained write-to-tape
   speeds.

6. Assuming a 16 hour window for all backups to complete per
   day (so that you have time for other ITSM server processing),
   that's 81 * 16, or 1.3 TB per 3584 _drive_ per day. 72 * 1.3
   means a single 3584 complex can do about 94 TB per day.

7. For a single full backup of 2 PB, that's 2048 TB, or 2,097,152
   GB... or about 12,000 maxed out LTO tapes. Since a single fully
   fleshed out 3584 library is about 2,000 tapes... that would mean
   6 3584 libraries for tape capacity alone.

8. 2048 TB divided by 94 TB yields about 22 3584 libraries.

9. Then you've got the small problem of having to come up with
   an appropriate ITSM server design... for starters, number
   of slots required would be incredible. You'd put max of 2
   3580 drives on a single Ultra HVD SCSI adapter... so 72 drives
   per complex would be 36 slots alone! 36 slots multiplied by
   22 complexes would be 792 slots!

10. Not sure about a p690 but think it's got a couple hundred slots?

11. Then you need more adapters for disk and network controllers.
    To support 22 MB/sec over 1,584 drives concurrently would be...
    465 gigabit ethernet adapters assuming a perfectly tuned setup
    that can push 600 Mbps per adapter through.

12. You'd probably kill the bus with so much data zipping around
    long before you max out the slots... more likely you would need
    multiple (6-10?) p690 Regatta systems *just* to deal with ITSM
    backups for 2 PB of data alone.

13. The HVAC requirements for all these disks must be interesting ;)
    For the disks -- data, diskpool, db... total BTUs/hr would
    possibly be in neighborhood of about 3 million BTUs/hr which
    demands *seriously* beefy HVAC units for the disks alone, and
    nevermind for the servers, routers, etc...!

14. Probably has their own electrical substation for the computer
    room(s) alone. Run on an UPS? If they went to the extent of
    having own electrical substation, they might as well... The
    disks alone are probably going to eat about 15,300 amps at
    the bare minimum... total for entire room could be in
    neighborhood of 30-40,000 amps when you consider the large
    network equipment, servers, and other supporting infrastructure.
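
The sizing arithmetic in steps 2 through 11 can be replayed in a few lines
of Python. This is only a sketch of the back-of-envelope numbers above,
with the same assumptions (22.5 MB/sec per drive, 175 GB per tape, 72
drives and about 2000 tapes per library, a 16 hour window, 600 Mbps per
gigabit adapter) and the same loose mixing of decimal and binary units;
the function and key names are made up for illustration:

```python
import math


def estimate_2pb_backup(
    total_gb=2 * 1024 * 1024,   # 2 PB in GB (binary units, as in the post)
    drive_mb_s=22.5,            # assumed sustained LTO-1 rate per drive
    tape_gb=175,                # 100 GB native at 1.75:1 compression
    drives_per_library=72,
    tapes_per_library=2000,
    window_hours=16,
    nic_mbps=600,               # assumed usable rate per gigabit adapter
):
    """Reproduce the back-of-envelope sizing from the numbered list above."""
    gb_per_drive_hour = drive_mb_s * 3600 / 1000
    tb_per_library_day = (
        gb_per_drive_hour * window_hours * drives_per_library / 1000
    )
    libraries = math.ceil(total_gb / 1024 / tb_per_library_day)
    drives = libraries * drives_per_library
    return {
        "gb_per_drive_hour": gb_per_drive_hour,
        "tapes": math.ceil(total_gb / tape_gb),
        "libraries_for_capacity": math.ceil(total_gb / tape_gb / tapes_per_library),
        "libraries_for_speed": libraries,
        "drives": drives,
        "scsi_adapters": drives // 2,   # two drives per SCSI adapter
        # Step 11 uses 22 MB/sec per drive, so this line does too.
        "gigabit_adapters": math.ceil(22 * drives * 8 / nic_mbps),
    }


for key, value in estimate_2pb_backup().items():
    print(f"{key}: {value}")
```

Run with the defaults, it gives 6 libraries for capacity but 22 for speed,
1,584 drives, 792 SCSI adapters and 465 gigabit adapters, matching the
figures in the list.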

I listed LTO and pSeries here just simply because I know the numbers and
hardware the best, but feel free to offer other possible approaches.

Keep in mind, all that is only a small part of the big picture... this
one is *just* for a single full backup, and doesn't take into account
the long-term needs such as ITSM db sizing or I/O loading of db or diskpool
disks; each hard drive has a finite amount of I/Os it can do at any given
time. Then you've got other issues such as performance vs reliability,
which becomes even more tricky with the extremely large scale setups
because use of RAID-5 could become a *very* real serious bottleneck that
gums up the entire works.

I actually wonder if ITSM on zSeries hardware would actually be better in
this particular scenario because mainframes typically have superior I/O
management, far beyond simple tricks like I/O pacing that exists on
commercial UNIX OSes. Mainframes also have incredible I/O capabilities.
Saw a zSeries box, had about 500 I/O controllers, and was still humming
along just fine even under varying workloads. But I think that's balanced
somewhat by the extensive training and support requirements, along with
licensing and support contract costs.

I do imagine that if I was the data center manager for that site, I'd
be hiring an entire team of senior ITSM administrators with 20 years of
experience ;) Teams of operators to deal with 

Re: How do you back up 2 PB of data?

2002-11-19 Thread Dan Foster
Hot Diggety! Orville Lantto was rumored to have written:
 A 72 drive, 10 I/O slot 3584 library will hold 2207 cartridges.  with 175
 GB/cartridge that works out to 6 libraries.

Aye, in terms of tape capacity. However, if you have a requirement that
it finish an entire full backup in a single day -- say, 20 hours...and
you get between 15-30 MB/sec for HW compressed writes with the LTO drives.

So we pick a common number that's actually sustainable in real life: 22.5
MB/sec.

22.5 MB/sec * 72,000 seconds (20 hrs) * 72 drives * 6 libraries =

683,438 GB (roughly 667 TB), or about 32.6% of 2 PB. Therefore you need 3
times the original number of 6 libraries, and that's assuming you can keep
up this rate constantly for every single second of 20 hours.
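
A short Python sketch of the same calculation, assuming 1 GB = 1024 MB and
2 PB = 2,097,152 GB as elsewhere in this thread; the variable names are
illustrative:

```python
RATE_MB_S = 22.5             # sustained rate per drive
SECONDS = 20 * 3600          # 20 hour window
DRIVES = 72                  # drives per library
LIBRARIES = 6                # libraries needed for capacity alone
TOTAL_GB = 2 * 1024 * 1024   # 2 PB in GB

written_gb = RATE_MB_S * SECONDS * DRIVES * LIBRARIES / 1024
fraction = written_gb / TOTAL_GB

print(f"{written_gb:,.0f} GB written, {fraction:.1%} of 2 PB")
print(f"libraries needed at this rate: about {LIBRARIES / fraction:.1f}")
```

The unrounded result is a little over 18 libraries (about 18.4), so
tripling the 6 libraries to 18 is a slight round-down.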

18 fully decked out 3584 libraries would be something to behold, I think ;)
That'd span 108 frames and 1,296 LTO drives. A single full backup would
run about $18M (list price) worth of tapes in that kind of configuration.
I'd like to work for a place that can afford it! Even after deep
discounting, that's still $8-10M worth of tapes.

-Dan



Re: Help requested !!!

2002-10-06 Thread Dan Foster

Hot Diggety! Murthy V Gongala was rumored to have written:

 I have a 3583 LTO library with 2 drives. TSM Server v4.2 on AIX.

 I have been running Scheduled backups without any problems so far.

 Yesterday the server started reclamation process for one of the Storage
 pools as shown below.

 784 Space Reclamation  Volume 053ABS (storage pool TECH_TAPE_POOL),
                        Moved Files: 0, Moved Bytes: 0, Unreadable
                        Files: 0, Unreadable Bytes: 0. Current Physical
                        File (bytes): 10,487,404
                        Waiting for mount of input volume 053ABS
                        (68221 seconds).

 Some of the activity log entries are :

 ANR1040I Space reclamation started for volume 053ABS, storage pool
 TECH_TAPE_POOL (process number 723).

 10/06/02 14:06:19 ANR1044I Removable volume 053ABS is required for
 space reclamation.

 Even though both the tape Drives are free, and a scratch volume is made
 available, all the migration processes are Held up.

Which AIX version? 4.3.3 or 5.1? And with which maintenance level patch
set? (Ie, 4.3.3 ML10, 5.1 ML02 are the latest) Or if you don't know the
ML set number, what version is listed in 'lslpp -l bos.rte.libc'?

What does the output look like for:

# lsdev -Cc tape

And within TSM, what does TSM think:

tsm> q libv <library name>

specifically, for the tapes that were mentioned in the actlog.

You may want to try a library audit:

tsm> audit library <library name> checklabel=barcode

Finally, how is the 3583 attached to the server? SCSI or FC-AL?

-Dan



Re: Help requested !!!

2002-10-06 Thread Dan Foster

Hot Diggety! Murthy V Gongala was rumored to have written:

 Level indicated by  lslpp -l bos.rte.libc - 5.1.0.25

That sounds like 5.1 ML02 -- latest major patch set collection, very good.

 TSM shows the volumes as  - Private.

Ouch. Ok, the basic problem you're running into is that the TSM server
*wants* at least one more scratch tape, but all the tapes you have
right now are marked as Private.

I think you haven't ruled out a tape communication problem yet, because
TSM marks all unreadable or inaccessible tapes as Private. Most likely
they are in the Private state because they already had existing backups,
but I can't rule out a tape drive communication issue of some sort.

Not running a FC-AL setup rules out a FC-AL switch failure issue, and
since you say the setup's worked fine daily until now AND the 'lsdev -Cc
tape' output shows all devices as 'Available' state, unlikely to be a
drive communication issue, I think.

I'm far from a TSM expert (many folks here are!) but I'm guessing
you'd want to compact the data on the existing tapes to create enough
free space for at least one scratch tape.

May want to check tape utilization of each volume with:

tsm> q vol

Are they at 100% of about 190 GB per tape? Or are there some tapes that
have like 50%, 60%, something below 100%?

You may need to look into several things:

1. 'expire inventory' TSM command that will mark expired files as
being 'free' (reclaimable space). Run it daily after all of your backups
have completed - can be done via a server-scheduled command that you enter
only once.

Then do another 'q vol' to see if that helped clear up more free space,
enough for a 'move data' command to work.
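
To make the "which tapes are worth a 'move data'" check concrete, here is a minimal sketch. It does not talk to TSM at all: the volume names and percentages are hypothetical values typed in by hand from 'q vol' output, and the 60% threshold is just an assumed cut-off.

```python
from typing import NamedTuple


class Volume(NamedTuple):
    name: str
    pct_util: float  # percentage of the volume still holding live data


def move_data_candidates(volumes: list[Volume], threshold: float = 60.0) -> list[Volume]:
    """Return volumes at or below `threshold` percent utilization, emptiest first."""
    picked = [v for v in volumes if v.pct_util <= threshold]
    return sorted(picked, key=lambda v: v.pct_util)


if __name__ == "__main__":
    # Hypothetical figures, not real output.
    sample = [Volume("VOL001", 100.0), Volume("VOL002", 48.5), Volume("VOL003", 12.0)]
    for vol in move_data_candidates(sample):
        print(f"{vol.name}: {vol.pct_util:.1f}% used")
```

The emptiest volumes come first because they free up a whole tape for the least amount of data moved.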

2. look to keep a tight control on retention of data through adjusting
management classes, policy domains, storage pools, etc...  or by limiting
what data you back up through inclexcl include/excludes in client's dsm.opt
file.

3. do you have an off-site copy storage pool? (It can be on-site or
off-site or whatever you want to call it, but it has to be something in
addition to your primary tape storage pool)

If you do, then you will want to make a copy of your existing tape data
into the offsite copy pool regularly -- once a week or whatever frequency
works for you. But you need to do it regularly because it requires scratch
tapes to do that, so you can't do that right now without clearing up some
room first.

Other folks will probably have better suggestions on how to get you out
of this dug hole. My comments may help more with maintaining the setup
to avoid this in future, once you're back to normal. I'm afraid I'm not
experienced enough to give you the best answer for how to recover, but
we've got a lot of good folks on this list who can give pointers.

-Dan



Re: TSM 5.1 license registration failed?

2002-10-03 Thread Dan Foster

Hot Diggety! Zlatko Krastev/ACIT was rumored to have written:

 look out carefully in 'q lic' output:
 Number of Managed System for LAN in use: 1
 Number of Managed System for LAN licensed: 0
 This ought to explain everything. Try 'reg lic file=mgsyslan.lic
 number=<# of nodes you expect>'

D'oh. Thanks. ;) (I was wondering about where clients went, and
looks like I confused mgsyslan with library sharing.)

All good now!

-Dan



Re: TDP for SAP R/3 on Solaris

2002-10-03 Thread Dan Foster

Hot Diggety! Seay, Paul was rumored to have written:
 Is anyone using the TDP for SAP R/3 3.2.0.11 on Solaris 2.8 with Oracle
 8.1.7 at SAP 4.6c2?

Not I, unfortunately.

 No matter what we do we seem to have serious problems with restores and
 difficulty implementing the TDP in production.  The restores hang about half
 way through.  Backups usually work.

 We are just at wits end trying to figure out what is going on.

Well, one place to start might be doing something like this:

# truss -f -o /tmp/dsmc-truss.log -p <dsmc pid>

Then get a UNIX expert to look at the output of that logfile, to see
if there's any really obvious thing such as blocking on a particular
resource or action. (I don't mean to offend -- I don't know how much
you know about UNIX, and you did mention a mainframe background. If
you're used to UNIX internals, then please accept my apologies. :) )

truss is the Solaris utility used to trace system calls performed by
an application; sometimes it tells you exactly why something broke,
but you need someone with an understanding of UNIX system calls, what
looks normal and what isn't, and some experience in decoding truss
output.

It may generate voluminous logs, so it's good to run it for a few minutes
or however long it takes to capture a good 'snapshot' of dsmc's activity.
Can be started once the restore stalls.

It's not guaranteed to give you the answer, but for the situation you
describe (hanging), it's certainly what I'd have started with, myself.
It might point to waiting on a response from the TSM server -- in which
case it might be a TSM server-specific issue (threading, blocking on I/O,
etc), or it might point to some other local oddity and culprit.

Finally, Sun's constantly fixing all sorts of OS-specific things
'under the hood' that may have interesting impact on applications...
so there's always a chance that a recent OS patch set may fix it.

-Dan



Re: TSM 5.1 client error

2002-10-03 Thread Dan Foster

Hot Diggety! Mavis Jenkins was rumored to have written:

 So far, we've installed the server and client on an RS6000 running AIX 4.3.3
 The server is working fine if I use a 4.2 client from another server  but
 if I use the 5.1 client installed on the same box as the TSM server, I get
 the following error when I try to run dsmc or dsmadmc:

 ANS0101E NLInit: Unable to open message repository 'dsmclientV3.cat'

 The file does exist and the NLSPATH points to the correct location.  LANG
 is set to en_US

What does your NLSPATH list? Should say something like this:

NLSPATH=/usr/lib/nls/msg/%L/%N:/usr/lib/nls/msg/%L/%N.cat

Then, next question:

Do you have a 'dsmclientV3.cat' file here:

/usr/lib/nls/msg/en_US/dsmclientV3.cat

or

/usr/tivoli/tsm/client/ba/bin/en_US/dsmclientV3.cat

?
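
For checking this by script rather than by eye, here is a small sketch of how the NLSPATH templates expand. It assumes only the %L (locale) and %N (catalog name) substitutions matter here and ignores the other ones NLSPATH supports; the function names are made up for this example.

```python
import os


def expand_nlspath(nlspath: str, lang: str, catalog: str) -> list[str]:
    """Expand %L (locale) and %N (catalog name) in each NLSPATH template."""
    paths = []
    for template in nlspath.split(":"):
        if not template:
            continue  # skip empty entries
        paths.append(template.replace("%L", lang).replace("%N", catalog))
    return paths


def find_catalog(nlspath: str, lang: str, catalog: str) -> list[str]:
    """Return the expanded candidate paths that exist on this machine."""
    return [p for p in expand_nlspath(nlspath, lang, catalog) if os.path.isfile(p)]


if __name__ == "__main__":
    default = "/usr/lib/nls/msg/%L/%N:/usr/lib/nls/msg/%L/%N.cat"
    nlspath = os.environ.get("NLSPATH", default)
    lang = os.environ.get("LANG", "en_US")
    for path in expand_nlspath(nlspath, lang, "dsmclientV3.cat"):
        status = "found" if os.path.isfile(path) else "missing"
        print(f"{status}: {path}")
```

If every candidate prints as "missing", the catalog is not where NLSPATH says to look, whatever the error message claims.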

If it's a 32 bit TSM 5.1 client, you may want to see if this package
is installed with:

# lslpp -l tivoli.tsm.client.ba.aix43.32bit.common
  Fileset                      Level  State      Description
  ----------------------------------------------------------------------------
Path: /usr/lib/objrepos
  tivoli.tsm.client.ba.aix43.32bit.common
                             5.1.1.5  COMMITTED  TSM Client - Backup/Archive
                                                 Common Files

-Dan



Re: Does TSM use all DB Volumes for I/O?

2002-10-02 Thread Dan Foster

Hot Diggety! Kilchenmann Timo was rumored to have written:
 I would very much appreciate an answer to the question: Does TSM use all DB
 volumes for I/O (like round-robin) or does it fill a volume and then go
 to the next one?

I do not know for sure, because I don't know of a TSM way to report
utilization statistics for db or log volumes on a per-volume basis.

But what I can tell you is that, from my observations, it _DOES_ do some
sort of round-robin on the diskpool volumes: utilization stayed almost
perfectly even across two diskpool volumes the whole time I was filling
them up for the first time.

That does not answer your question about dbvols and logvols, I know, but
it suggests that it might, since it does seem to do that for diskpool vols.

I'm sure that someone here will know the definite answer. :)

-Dan



Re: bad tapes

2002-10-02 Thread Dan Foster

Hot Diggety! Alexander Lazarevich was rumored to have written:

 1) is there any scenario where a tape is still good, yet the server sets
 it to READ ONLY?

Yes, there is. It can happen if you have a drive whose element ID mapping
to rmt device name has gotten out of sync with each other. More common with
new installations, but has happened to at least one other person on this
mailing list when his FC switch broke and was replaced.

It's pretty easy to distinguish from other cases because this will usually
affect all or many tapes, rather than just a single tape.

But I imagine the vast majority of these incidents will normally be either
a tape going bad or its label becoming unreadable.

-Dan



Re: TSM 5.1 license registration failed?

2002-10-01 Thread Dan Foster

Hot Diggety! Zlatko Krastev/ACIT was rumored to have written:
 - instead of grep-ping why you do not use `lslpp -L tivoli.tsm\*` or
 `tivoli.tsm\*license\*`. In this way you can see the version of filesets
 and ensure there is no one left from v4.2. You may also check reverse
 lookup - `lslpp -w /usr/tivoli/tsm/server/bin/library.lic`

There's no TSM 4.2 filesets at all, as this was a fresh OS install (no
preservation, only overwrite) and TSM 5.1 install straight from the 5.1
CDs. I did, however, double check as you suggested, just to make sure.
Looks clean.

 - number of libraries licensed is one. Either you have two libraries or
 3584 is partitioned and look like two libs or some other stanza is having

Only one physical and logical library, no partitioning.

 insufficient number of registered licenses. Can you provide full output of
 `q lic`?

Sure thing.

tsm: GBLX-ROC>q lic

 Last License Audit: 09/29/02   16:33:59
  Number of space management clients in use: 0
Number of space management clients licensed: 0
   Is Tivoli Disaster Recovery Manager in use ?: No
 Is Tivoli Disaster Recovery Manager licensed ?: No
Number of TDP for Oracle in use: 0
  Number of TDP for Oracle licensed: 0
   Number of TDP for Oracle in try buy mode: 0
 Number of TDP for MS SQL Server in use: 0
   Number of TDP for MS SQL Server licensed: 0
Number of TDP for MS SQL Server in try buy mode: 0
   Number of TDP for MS Exchange in use: 0
 Number of TDP for MS Exchange licensed: 0
  Number of TDP for MS Exchange in try buy mode: 0
   Number of TDP for Lotus Notes in use: 0
 Number of TDP for Lotus Notes licensed: 0
  Number of TDP for Lotus Notes in try buy mode: 0
  Number of TDP for Lotus Domino in use: 0
Number of TDP for Lotus Domino licensed: 0
 Number of TDP for Lotus Domino in try buy mode: 0
  Number of TDP for Informix in use: 0
Number of TDP for Informix licensed: 0
 Number of TDP for Informix in try buy mode: 0
   Number of TDP for SAP R/3 in use: 0
 Number of TDP for SAP R/3 licensed: 0
  Number of TDP for SAP R/3 in try buy mode: 0
   Number of TDP for ESS in use: 0
 Number of TDP for ESS licensed: 0
  Number of TDP for ESS in try buy mode: 0
   Number of TDP for ESS R/3 in use: 0
 Number of TDP for ESS R/3 licensed: 0
  Number of TDP for ESS R/3 in try buy mode: 0
 Number of TDP for EMC Symmetrix in use: 0
   Number of TDP for EMC Symmetrix licensed: 0
Number of TDP for EMC Symmetrix in try buy mode: 0
 Number of TDP for EMC Symmetrix R/3 in use: 0
   Number of TDP for EMC Symmetrix R/3 licensed: 0
Number of TDP for EMC Symmetrix R/3 in try buy mode: 0
  Is Library Sharing in use: No
Is Library Sharing licensed: No
Number of Managed System for LAN in use: 1
  Number of Managed System for LAN licensed: 0
Number of Managed System for SAN in use: 0
  Number of Managed System for SAN licensed: 0
 Number of Managed Libraries in use: 0
   Number of Managed Libraries licensed: 0
   Tivoli Data Protection for NDMP in use ?: No
 Tivoli Data Protection for NDMP licensed ?: No
  Server License Compliance: FAILED

I haven't been in the computer room yet because that's where the TSM 5.1
CDs are (but will, shortly) and then I can try Paul's suggestion, also.

Thanks, much appreciated all ideas on this so far. :)

-Dan



Re: Withdrawal of 3570 Magstar

2002-09-28 Thread Dan Foster

Hot Diggety! Emil S. Hansen was rumored to have written:

 How about setting the mount retention for the devclass to something like
 10 to 30 mins? That will keep the tape mounted for at least 10 mins
 after the last access, so that if the tape is needed by the client it
 will likely still be mounted.

That's just the catch; mount retention was indeed set to 10 minutes.

Yet, according to q actlog, TSM 4.2 was in a big hurry to dismount it
for no apparent reason. I'm still rebuilding the 5.1 setup, and will
retry the same tests once ready.

-Dan



Re: Withdrawal of 3570 Magstar

2002-09-28 Thread Dan Foster

Hot Diggety! Dan Foster was rumored to have written:
 Hot Diggety! Emil S. Hansen was rumored to have written:
 
  How about setting the mount retention for the devclass to something like
  10 to 30 mins? That will keep the tape mounted for at least 10 mins
  after the last access, so that if the tape is needed by the client it
  will likely still be mounted.

 That's just the catch; mount retention was indeed set to 10 minutes.

 Yet, according to q actlog, TSM 4.2 was in a big hurry to dismount it
 for no apparent reason. I'm still rebuilding the 5.1 setup, and will
 retry the same tests once ready.

Well, appears that it now works correctly for me in TSM 5.1 -- it does
not unmount tape after the write completes... keeps it available in the
tape drive for any subsequent writes. No need to set KEEPMP=YES in the
client node definition.

Don't know what was wrong with TSM 4.2; I set up 5.1 with the identical
settings (I'd documented the step by step install and followed it) except
with some updated commands such as for library, drive, and path configs.

Oh well. Works great now. :) I'm liking what I see of TSM 5.1 so far.

-Dan



TSM 5.1 license registration failed?

2002-09-28 Thread Dan Foster

I'm stumped by something (simple?) relating to TSM licensing.

We've got a pretty plain vanilla TSM 5.1 setup with a 3584 tape library.
No additional features purchased or in use (SAN, TDP, NDMP, etc).

The 3584 requires a managed library license, as I understand it (has 12
drives and 610 tapes spanning the L32 and D32), so we paid for that
license, along with the number of client nodes we wanted to use.

Didn't receive any paperwork or anything special -- just TSM 5.1 CDs
with all the filesets that were appropriate.

So then, once I've got the 5.1 server set up... went to do:

tsm> register license file=library.lic
tsm> query license

Now says:

[...snip...]
   Number of Managed Libraries licensed: 1
  Server License Compliance: FAILED

What did I do wrong? Why did it fail? How do I fix so that I can have both
1 managed library AND server compliance = SUCCESS? The documentation is
extremely sparse in this area, to put it charitably. ;)

Did I install the wrong filesets for AIX 5.1 server? For the license, I
have these filesets installed:

# lslpp -l|grep tivoli|grep -i license
  tivoli.tsm.license.aix5.rte64
  tivoli.tsm.license.cert    5.1.0.0  COMMITTED  Tivoli Storage Manager License

This is for a brand new TSM 5.1 install on a brand new AIX 5.1 install.
No existing nodelock file was present, nor was anything restored.

The only thing I wonder about, after looking at the library.lic file is
a line that says this:

ProductVersion=4.2

Does that mean the license certificate file is specific only to TSM 4.2?

Finally, in previous versions (ADSM 3.1, TSM 4.2, etc), I had to register
an appropriate license for the number of nodes I'd paid for. But there's
no client license certificate files nor anything mentioning it in the
TSM 5.1 'q license', nor in the 5.1 docs...? Does that mean I continue to
pay for client node licenses (I'm not here to rip anyone off!) but just
simply don't need to register them with the server...?

-Dan



Re: Withdrawal of 3570 Magstar

2002-09-26 Thread Dan Foster

Hot Diggety! Coats, Jack was rumored to have written:
 Try 3 minute :( to get a LTO loaded and started spinning ... great for bulk
 store, but not up to 'interactive' response needs :( ... Using LTO for HSM
 would seem counterproductive IMHO.

3 minutes?! Something sounds wrong there. I've got a 3584 with 12 LTO
drives attached to host system via SCSI, and it takes 3-9 seconds to
have the robot fetch the tape from storage slot, move it to drive, load
it in the drive, and about 46 seconds to read the volser on the tape and
other mount-related processing, for a total of about 55 seconds from
issuing command to load to start using a tape.

We've got a L32 and D32 frame; guess it could conceivably be a little
longer in worst case scenarios if had a decked out setup (one L32 and
five D32s) but I can't see it being much more than 1m to 1m10s or so.

However, that 55 seconds is a little painful in certain situations.

For example:

client->disk stgpool (10MB max)->tape stgpool

Client sends data to TSM server
As long as client data is < 10MB, streams to disk pool
Soon as client sends an 11 MB or 500 MB file, then...
TSM fetches a tape, mounts it (55 sec wait)
TSM starts writing file to tape
Once done, TSM dismounts the tape (!)
Client continues sending data

If it hits another >10MB file, the whole mount-write-dismount process
repeats. This results in a significant performance hit from all the
mounts/dismounts. To alleviate this, I've set the node's KEEPMP option
to YES, so it ends up mounting the tape once on first access, then keeps
it mounted throughout the entire client run so that we get no more
subsequent 55 sec mount delays. When does it dismount the tape? After
the tape retention in drive period expires, but usually the next client
session grabs the same tape if it's got free space.
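
To put a number on that performance hit, here is a rough sketch. It models the worst case described above, where every file too large for the disk pool triggers its own mount, against the KEEPMP=YES case with a single mount. The 55 second figure is the one measured earlier in this post; the function name and the file count are hypothetical.

```python
MOUNT_SECONDS = 55  # robot fetch + load + label read, as measured in the post


def mount_overhead_seconds(large_files: int, keep_mounted: bool) -> int:
    """Total seconds spent waiting on tape mounts during one client session.

    large_files:  number of files too big for the disk pool (sent straight to tape).
    keep_mounted: True models KEEPMP=YES (tape mounted once, then left in the drive).
    """
    if large_files <= 0:
        return 0
    mounts = 1 if keep_mounted else large_files
    return mounts * MOUNT_SECONDS


if __name__ == "__main__":
    files = 40  # hypothetical number of oversized files in one session
    for keep in (False, True):
        waited = mount_overhead_seconds(files, keep)
        print(f"KEEPMP={'YES' if keep else 'NO'}: {waited} sec waiting on mounts")
```

With 40 oversized files that is 2200 seconds of pure mount wait without KEEPMP versus 55 seconds with it, before a single byte of the difference is explained by drive speed.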

The above was with a TSM 4.2 setup; I just installed TSM 5.1, so I've
got to retest to see if this is any different without the KEEPMP option
being enabled.

Anyway, I'd strongly urge you to get that 3 minute wait looked at!

-Dan



Re: tapes getting marked PRIVATE

2002-09-20 Thread Dan Foster

Hot Diggety! Seay, Paul was rumored to have written:

 Suppose when the tapes were labeled the drive that labeled them was bad and
 now none of the drives can read them.

I had that issue pop up while setting up the 3584, and the culprit
was when the /dev/rmtX and element ID mappings had fallen out of sync.

End result? TSM loaded tape by element ID, then tried to access drive
via its incorrect rmtX mapping -- so it hit on a drive that didn't have
the tape loaded, got a read/open error, moved the tape back to its storage
slot (via element ID), marked it as Private in the TSM db, and repeat for
the next tape endlessly until TSM finally timed out this entire process.

Fixing the mappings instantly fixed that particular problem.

So I'd be more inclined to think that or some issue with DEVCLASS definition
was the culprit since one of the original poster's key phrases was '...all
tapes' -- if it was a bad tape, would have been a one-off.

-Dan



3584 tape questions

2002-09-18 Thread Dan Foster

I've finally got the new 3584 library up and worked through all hardware
and TSM issues, and just finished with the first round of successful tests.
Looks sharp! Currently tuning the setup (disk, memory, network, TSM, etc).

Environment: AIX 4.3.3 ML10, pSeries 660-6H1, TSM 4.2 (5.1 next week)
to 4.2.2.12, SCSI attachment to host, 3584-L32 and 3584-D32 frames.

Two questions about 3584 tapes --

1. Is it normal for it to take 45 seconds to read/verify the
   label on the tape? The tape load itself is pretty quick, but
   takes forever-and-a-half to read/verify the label.

2. There's two label sources -- one is the barcode scanner, which
   works great and *very* quick, and it seems the other label
   source is on the tape somewhere.

   Where exactly on the tape is the label preserved, other than the
   volser bar code label? Is it in the LTO-CM area of the tape? Or
   is it on the first part of the tape, or something?

   I'm curious about this because the implication is that if I
   do destructive read-write tests with tapes involving overwrite,
   I might potentially overwrite the label, and have to relabel it?

   The documentation suggests that 'LABEL LIBVOL' for a 3584 would
   update label in the LTO-CM area of the tape cartridge, but it
   isn't real definite or clear on that point.

-Dan



Re: Help for 2108 failure

2002-09-17 Thread Dan Foster

The other possibility is that since the 2108 was replaced, there may
be a small chance the /dev/rmtX mapping no longer matches what the
previously assigned rmtX-to-element ID mapping was.

Which conceivably could produce this sort of error, if TSM and
the library disagreed about what logical/physical drive matched up?

What I'd suggest is:

tsm> q drive f=d

Then note each rmtX device name and what element ID it corresponds to.
(in TSM)

Then do some tracing to figure out which element ID each rmtX device
actually physically corresponds to.

Then you'll soon see if the TSM setup and the probed devices in AIX
match or not.

If they don't match, update the TSM drive entries for rmtX and element ID
ASAP.

Should normally take only about 5-10 minutes of work to figure out the
mapping from scratch. Keep good notes and issue a whole bunch of commands
to assist in the logical-to-physical tracing.
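
Once both sets of notes exist, the comparison itself is mechanical. Here is a minimal sketch, assuming the two mappings have been typed in by hand: one from 'q drive f=d', one from the physical tracing. The device names and element numbers below are hypothetical, and nothing here queries TSM or AIX.

```python
def find_mismatches(tsm_map: dict[str, int], probed_map: dict[str, int]) -> dict[str, tuple]:
    """Compare device->element ID mappings and return the entries that disagree.

    Each value in the result is (element per TSM, element actually observed);
    None marks a device that is missing from one side.
    """
    mismatches = {}
    for device in sorted(set(tsm_map) | set(probed_map)):
        configured = tsm_map.get(device)
        observed = probed_map.get(device)
        if configured != observed:
            mismatches[device] = (configured, observed)
    return mismatches


if __name__ == "__main__":
    # Hypothetical notes: rmt1 and rmt2 swapped after a hardware replacement.
    tsm_view = {"/dev/rmt0": 257, "/dev/rmt1": 258, "/dev/rmt2": 259}
    probed = {"/dev/rmt0": 257, "/dev/rmt1": 259, "/dev/rmt2": 258}
    for dev, (configured, observed) in find_mismatches(tsm_view, probed).items():
        print(f"{dev}: TSM says element {configured}, drive is really {observed}")
```

An empty result means the definitions agree and the problem lies elsewhere.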

I just finished fixing all this after no end of fun with the 3584 setup
and mismatched rmtX/element IDs.

-Dan



Re: baffling tape, 3584 HW compression, and stgpool design

2002-09-04 Thread Dan Foster

Hot Diggety! David Longo was rumored to have written:
 My experience with tapes in this condition is that the write protect tab
 was set on by accident.  AS Dwighty check the tape out. Then check
 the Write Protect.  If it is on, then set it off and check the tape back
 in as scratch.

Looks like two bad tapes in the 3570 library. Thanks! Going to store
them in the 'rejects pile' until I can legally have them degaussed and
then disposed of.

-Dan



Re: baffling tape, 3584 HW compression, and stgpool design

2002-09-04 Thread Dan Foster

Hot Diggety! Koen Willems was rumored to have written:

 Maybe the access state is unavailable; try q vol f=d and
 look at the access state. If it is not in a readw state, use
 update vol <volname> access=readwrite
 Then do a move data on the volume to see if the remaining data moves.
 Do a del vol and/or a checkout if you want to lose the volume.

Apparently, it was bad tapes.

 Compression is automatically used when using format=drive.

Interesting. Guess that makes sense.

 (Do not use client compression)

Nah, wasn't planning to. The backups are fast enough that we wouldn't
have any real benefit to doing client compression.

 Storage pools give better restore performance on file data if they contain
 fewer volumes ( this cuts down on mount and search times )

Ahh, makes sense.

 Make a pool for priority restore nodes ( if possible with collocation )
 Make a pool for non priority nodes without collocation
 Make a pool for every type of TDP you install.

 ( collocation will cost you tape space but will speed up restores )
 ( the fewer volumes an uncollocated storage pool has, the faster the
 restores )
 ( TDP pools have better restore performance because of the large
 sequential backups that can stream back when restoring )

Makes sense. Good tip about TDP pools, thanks. Definitely will keep these
separate.

 Do not leave data on your diskpool longer than necessary.
 Furthermore, one does not want to have backup data on a diskpool for longer
 than one backup cycle. After backup sessions do a:

 update stg diskpool hi=0 lo=0 and move data to tape and then offsite.

Sounds good. ;) I'm looking to concentrate all client backups in a specific
window, then have the scheduler fire off a daily job to do the migrations
(disk->tape, tape->offsite copypool) with some basic error checking, and
also do other things such as expire inventory, dbbackup, etc.

-Dan



Re: baffling tape, 3584 HW compression, and stgpool design

2002-09-04 Thread Dan Foster

Hot Diggety! Zlatko Krastev/ACIT was rumored to have written:

 David already answered to question 2 but as an additional remark - when
 TSM started to write a tape volume uncompressed the tape has to become
 back scratch to start write on it with compression.

Very interesting tidbit. I'll note that and file away for future reference.
Thanks!

-Dan



baffling tape, 3584 HW compression, and stgpool design

2002-08-27 Thread Dan Foster

Couple unrelated questions:

1. When I query a 3570 library, there's one tape that baffles me:

adsm> q libvol 3570lib1

Library Name   Volume Name   Status     Last Use    Home Element
------------   -----------   --------   ---------   ------------
3570LIB1       133060        Private                34

   I can't do anything with it -- ie, I can't discard the data on it because
   ADSM says it doesn't belong to a storage pool, and yet, no indication that
   it's a DbBackup. All other tapes in the library are sane and belong to
   either scratch, a storage pool, or a dbbackup.

   So what could it be, and how do I wipe that lone offending tape and change
   its status to scratch? (This is with ADSM 3.1+patches)

2. How do I enable hardware compression for LTO tapes? Some sort of device
   class or drive definition parameter? (This is with TSM 4.2.2+patches)

   I've looked through docs set and not finding anything that addresses this.

   I'm currently using:

   tsm> DEFINE DEVCLASS 3584_DEVCLASS1 DEVTYPE=LTO FORMAT=DRIVE -
        MOUNTLIMIT=10 MOUNTWAIT=60 MOUNTRETENTION=0 LIBRARY=3584LIB1

   tsm> DEFINE DEVCLASS 3584_DEVCLASS2 DEVTYPE=LTO FORMAT=DRIVE -
        MOUNTLIMIT=2 MOUNTWAIT=60 MOUNTRETENTION=0 LIBRARY=3584LIB1

   The mountlimit of 10 and 2 is to put a hard upper bound on number
   of drives that the backups (10) and offsite copying (2) can use.

   The mountwait of 60 minutes is to avoid having jobs fail if they're
   sitting there, waiting for a particularly huge 300GB job to finish and
   free up a drive for it to use.

   Mountretention=0 is because this is a fast automated library (3584 with
   12 drives); I can see setting it to 10 or so if it was a smaller human
   operated library or requiring operator tape swaps like the 3570 library.

   3584LIB1 refers to the library, obviously. ;) (device /dev/smc0, etc)

   As I understand it, FORMAT=DRIVE means it uses the drive's compression
   settings. I haven't seen any explicit mention of what this defaults to
   nor how to adjust it in either the 3584 or TSM docs.

   Or would I use something like DEVTYPE=LTOC...?

3. Is there really a reason to use multiple storage pools for the same
   tape repository?

   Ie: the TSM 4.2 docs suggest an out-of-box default setup such as DISKDIRS,
   DISKDATA that then migrates to TAPEDATA, which then copies to OFFDIRS
   and OFFDATA (offsite copy pool tapes). This makes sense. (I understand the
   purpose for each and every one of them, and how they're organized.)

   Is there a good reason why someone might want to split up diskdata and
   tapedata into multiple stgpools? I ask because I've heard references to
   other folks having multiple stgpools, and wondering what I'm missing here.

   That's the only thing I think that eludes my understanding in preparing the
   new design, at this point.

Any insight or comments much appreciated. Thanks!

-Dan



SNMP MIB for TSM 4.2?

2002-08-13 Thread Dan Foster

Does anyone know where I can get a SNMP MIB definition file for TSM 4.2?
I know it's capable of SNMP support, but in order to meaningfully use it,
need a MIB to plug into the monitoring system and I don't seem to be able
to find a MIB anywhere.

-Dan



Multiple logical libraries for 3584

2002-08-13 Thread Dan Foster

I've got a question.

One 3584-L32 with 6 drives and one 3584-D32 with 6 drives.

Is it possible to have a logical library that covers 4 drives in
the L32, and a second logical library that covers last 2 drives
in the L32 and all 6 drives in the D32?

Or is that not a valid configuration -- ie, do I need to keep
logical libraries from spanning multiple drives in multiple 3584
units?

-Dan



Re: SNMP MIB for TSM 4.2?

2002-08-13 Thread Dan Foster

Hot Diggety! Mark Stapleton was rumored to have written:
 From: ADSM: Dist Stor Manager On Behalf Of Dan Foster
  Does anyone know where I can get a SNMP MIB definition file for TSM 4.2?
  I know it's capable of SNMP support, but in order to meaningfully use it,
  need a MIB to plug into the monitoring system and I don't seem to be able
  to find a MIB anywhere.

 How about adsmserv.mib, in your server subdirectory?

D'oh! Don't know how I didn't see that. Thanks ;)

-Dan



Re: Multiple logical libraries for 3584

2002-08-13 Thread Dan Foster

Hot Diggety! Don France was rumored to have written:
 Yep (to the last question);  you cannot span multiple physical libraries to
 make a single logical.  You can define multiple logicals within a single
 physical;  that is a common thing I've done, for various reasons.

Darn. ;) But makes sense, all considering. Thanks for confirming --
couldn't find explicit restrictions except for a few other items in
the manuals. Much appreciated!

-Dan



Re: Multiple logical libraries for 3584

2002-08-13 Thread Dan Foster

Hot Diggety! Mark D. Rodriguez was rumored to have written:

 Judging by your configuration I believe that you have just one physical
 library.  I believe the L32 is the base unit and the D32 is an expansion
 cabinet, i.e. the 2 of them are physically attached to one another and
 share the same robotics.  Is this correct?  If so then yes you can
 partition the library as you described.

Aye, that is correct, from what I understand of the setup.

 But the real question is what are you trying to accomplish?  If this is
 to be connected to 2 different servers there are a few more things that
 have to be in place.  If both of these logical libraries are to be used
 by the same TSM server I am not sure I understand the rationale for doing
 so.  You could just as easily manage the drive utilization through other
 TSM server options.  Please explain a little bit more what you are
 trying to accomplish.

Basically I'm trying to split on-site vs off-site (copypool) tapes, with
each group having its own drives and pile of in-use (private) vs spare tapes.

Currently we do it with a 3575 for on-site tapes and a 3570 library for
the off-site tapes, and it avoids significant potential handling issues.
Not wanting to give Murphy more chances to possibly break things anywhere
due to improper or mistaken tape handling, basically.

It also protects availability for doing client restores as well as gives
us a guaranteed number of drives/tapes that can be used at any given time
for doing daily backups. Gives me some hard guarantees about available
resources that I can then plan the entire operation around.

Connection is to only one server. I just didn't think it was possible to
safely handle multiple TSM libraries via a single media changer interface
(/dev/smc0) and from reading the literature, was led to believe that multiple
logical libraries were required (each one providing a smc device) in order to
control the drives in each partitioned group of tape drives, as well as
for determining what tapes were in a pool and available for spare use.

-Dan



Re: Performance of TSM 4.1.4 with Solaris

2002-06-27 Thread Dan Foster

Hot Diggety! Charles Anderson was rumored to have written:
 500% increase?!  We are using UFS right now and not RAW.  I have been a
 little hesitant to migrate to RAW though.  Any suggestions for keeping
 it as painless as possible ?  What did you do, just set your migration
 threshold to 0, backup the log, backup the db, repartition the disks,
 restore above from the backups?  What did you do about getting the data
 back onto the disks?

I believe Veritas has a product that allows you to continue to use ufs
filesystems, but has a special driver that completely bypasses the
double-buffering, which gives you the same performance as raw logical
volumes, but with all the integrity protections of ufs.

May be worth investigating if you'd like the best of both worlds.

-Dan Foster
IP Systems Engineering (IPSE)
Global Crossing Telecommunications



Re: Keeping an handle on client systems' large drives

2002-06-14 Thread Dan Foster

Hot Diggety! Seay, Paul was rumored to have written:
 Ask them where they were on 9-11-2001.  Are they totally brain dead?

Ahhh, so that's what you referred to in passing in the other post.

That's all right, and understandable.

I have a first rate appreciation of this. If you'll allow me to indulge
briefly on a tangentially related (but not completely) issue on this
list, just once...

I used to be a VMS admin. Best, most robust OS that I ever worked with -
probably true for the IBM mainframes but didn't work much with them, alas.
(A little OS/400, DOS/VSE, and one or two other related OSes)

Anyway, come post-9/11, a *lot* of financial firms were in a world of
hurt. The ones who planned and re-tested over and over again, each year,
for an alternate site a good distance away from NYC, were able to reopen
for business only a few days later. Many were based in NJ or about an
hour west/north of NYC... one was even based not too far from home, their
DR site being about 4-5 hours northwest of NYC.

Around this time, I heard that Compaq (company that bought out DEC)
was making a lot of frantic calls all around the country seeking out high
end machines such as the AlphaServer 8400s and VAX 7000s...that had been
discontinued for perhaps 10 years since, because a lot of customers were
suddenly calling in for warranty replacements (under their expensive
support contracts) in NYC and DC -- you can guess what kind of customer
it was in DC. How desperate was Compaq? They were calling up even third
level resellers of used equipment that they would normally never ever think
of talking to.

Compaq was in a nasty hole, because they had run out of set-aside reserve
spares. Fab plants *long* since shut down...they can't just take the
original plans and re-fab, since the engineers are no longer there... I'm not
sure how they eventually resolved that... probably offered newer machines to
customers and provided migration assistance at Compaq's cost, is my guess.

But what the bean counters don't realize is that it doesn't take a
catastrophic national event to mean a bad effect on the business bottom
line, which I find unfortunate. Can be all sorts of more 'mundane' (albeit
not very common) events such as that train which burned in a Baltimore
tunnel and closed a part of downtown near Oriole Park at Camden Yards.
My company (used to also own a telco) was personally affected by a homeless
man burning something in a former abandoned railroad tunnel that melted
fiber optics and took out OC-12 to the area for 12+ hours, with a nice
number of servers based out of here.

It doesn't have to be a corporation for a nasty disaster to mean bad
things for their bottom line. I am very well reminded of a colossal failure
at an academic institution almost a decade ago that was a chain of events
ultimately resulting in failure of a critical drive in a RAID-5 array,
and the tapes weren't really usable for recovery...which they found out
the hard way. An entire semester of classwork was effectively disrupted,
with much data lost, before they were finally able to convince DEC to
send out the very best people to recover about 80% off the RAID-5 array
through some custom work. So many classes, projects, research papers, etc.
were affected that it just simply isn't funny. Same place where if the
IBM mainframe ever went down, school was closed for the day. (Happened
only once ever, to best of my knowledge.)

...and that is truly unfortunate, that the people who are actually tasked
to make things happen, like us, understand and appreciate, whereas others
higher up may not share the same view, knowledge, and experience.

In a D/R scenario, it also behooves you to know your power sources, how
they kick in, at what levels, how fast/when, evacuation plans, how to
config PBXes, have emergency equipment handy (eg flashlights), and a million
other details. Hardware that can be quickly hooked up/activated, written
step by step plan nearby, software CDs handy if needed, dry runs done,
backups/restores/app operation verified, and all of this tested once or
twice a year depending on level of need and impact, etc.

Still, I resolve to do my best to do whatever I can realistically do. :)

With that said, I now return you to the normal *SM discussions. ;)
(with the reason for copy stgpools driven home ;) )

-Dan Foster
IP Systems Engineering (IPSE)
Global Crossing Telecommunications



Re: TSM 4.2/AIX setup questions

2002-06-13 Thread Dan Foster

Hot Diggety! Miles Purdy was rumored to have written:

 1) Is there any particular reason to set a max file size for a disk stgpool?
(Assuming a setup where disk stgpool will migrate to tape stgpool)
 Generally yes, if you will be backing up a file larger than the stgpool. If you will
be backing up a file larger than the stgpool it may be more efficient to send it
right to tape; this is the general rule.

Ahhh, efficiency...makes sense. Also explains why there might be a situation
where client -> disk [but too big or unavailable] -> direct to tape instead,
so the copy stgpool might not have every single file in the disk stgpool,
hence this being the reason for backing up tape stgpool to the copy stgpool
in order to fill in any gaps (in addition to backing up disk stgpool, also).
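
The max-file-size behavior discussed above can be sketched roughly as
follows. This is only an illustration in Python, not TSM's actual logic:
the pool names DISKPOOL and TAPEPOOL, the 2048 MB limit, and the function
first_pool_that_fits are all made up for the example, and the handling of
a file exactly at the limit is an assumption.

```python
def first_pool_that_fits(file_size_mb, pools):
    """Walk an ordered chain of (pool_name, max_size_mb) pairs.

    max_size_mb of None means the pool has no size limit.
    Returns the name of the first pool that accepts the file,
    or None if no pool in the chain does.
    """
    for name, max_size_mb in pools:
        if max_size_mb is None or file_size_mb <= max_size_mb:
            return name
    return None


# Hypothetical chain: a disk pool with a size cap, then a tape pool without one.
pools = [("DISKPOOL", 2048), ("TAPEPOOL", None)]

print(first_pool_that_fits(100, pools))   # small file lands on disk
print(first_pool_that_fits(5000, pools))  # oversized file goes straight to tape
```

Files that skip the disk pool this way never show up there, which is why
the tape pool also needs to be backed up to the copy pool.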

 2) Should the TSM server have its own stgpool for backing up itself?
 A. I don't think so. Do you mean the filesystems on the TSM/AIX server? No, it 
doesn't _need_ its own.

Ok. Must've been an original design decision at this site for the existing
setup. Can't say I know the rationale (and can't ask since the folks in
question are long since gone now), but I'll keep that in mind. We set up
the existing scheme about 5-6 years ago, and in this high-turnover industry
(ISP), people moved on every few years when the grass was still green. ;)

 3) I've heard mixed things about 358x firmware version 22UD... I think we
have 18N2 (but not near it right now to confirm), although what I've
heard about 22UD is generally (but not 100% in agreement) positive. Stable?
 A. I'm using 22U0, things seem good.

Hmmm. Duly noted, thanks.

 4) Who is supposed/allowed to upgrade firmware? IBM CE only?
 It depends how comfortable you are performing the work.

I'm good with any sort of firmware updates from SP nodes to disks to
disk arrays to tape drives, etc. But for this one time, I think I'll
let the CE do it the next time he's on-site for the 3584. :) No reason
to throw caution to the wind.

 5) The only docs for firmware upgrade reference an NT/2000 box and the
NTUtil application, whereas I'm in an all-UNIX (AIX and Solaris, although
I do have a laptop with Linux and Windows XP if need be) environment, so
wonder how to upgrade the firmware without Windows if it's even possible.
 A. You can upgrade the firmware from a drive (drive to drive) or from a tape 
cartridge. Check the docs for the main panel.

Ahhh, yes. See it now, thanks.

 6) To *SM, all backups are incrementals (except for the first backup of a
new client), is my general understanding. Is there a way to force a full
backup of a particular client as a one-time operation? I'm guessing maybe
not, but thought I might try asking, anyway. :)
 A. This is called an 'archive'. There are plenty of docs on this.

Ahhh, sure is! Archive, being TSM's term for that... okay, can deal with that.
Back to reading up more on the archive section.

 7) The biggest single question... I don't have a real good understanding of
the purpose of copy stgpools. I've read a lot of documentation -- hundreds
of pages of multiple docs, re-read, read old adsm-l mail, Google searches,
etc... but still just don't quite 'get it'. I can set up HACMP clusters,
debug really obscure things, but this eludes me. ;)
 A. Copy pools are for offsite. Copy pools are what is in your vault. A copy pool is 
a complete copy, usually offsite.

Ahh, that makes much more sense, thanks. (And rest assured, we *do*
have an off-site storage vendor and bring tapes off-site. :) )

  Almost, you do send the (primary) data to another (copy) pool, but all the time. I 
would do something like this, every day:

 1. backup your clients to disk (disk storage pool)
 2. make a copy to a copy storage pool (OFFSITE)
 3. backup from the disk storage pool to your primary storage pool
 4. backup your database, devconfig, volhist
 5. send the OFFSITE tapes and the database backup offsite
 6. during the day run reclamation on each storage pool

Makes sense.

Hmm, I saw a suggestion in the TSM 4.2 admin guide to disable reclamation
for the copy stgpool (specifically) to avoid a situation where tapes get
sent off site and then it wants them for the reclamation.

But of course, for the other stgpools, reclamation does make sense.

Much appreciated the pointers...big help, thanks!

-Dan Foster
IP Systems Engineering (IPSE)
Global Crossing Telecommunications



Keeping an handle on client systems' large drives

2002-06-13 Thread Dan Foster

I've always been curious about something.

How do you keep a handle on the fact that commodity PC storage is
growing at a far faster rate than tape capacity/system is?

For example, if I had a small LAN of about 300 PCs -- let's say,
an academic or corporate departmental LAN environment... each
has at least a 40 GB HD, and probably a fair amount of apps and files
on them. In the stores, I see drives up to 160 GB, with even larger
ones on the way!

So let's say, an average of 25 GB utilization per system... a single
full backup would be about 7.5 TB, which is quite a few tapes ;)
Not everybody is using LTO or higher capacity.
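
As a rough check on the arithmetic above, here is a small Python sketch.
The 100 GB native tape capacity is an assumption for illustration (no
compression, every tape filled completely), and the function name
full_backup_estimate is made up for this example.

```python
import math


def full_backup_estimate(pcs, avg_used_gb, tape_capacity_gb):
    """Return (total_tb, tapes_needed) for one full backup of every PC."""
    total_gb = pcs * avg_used_gb
    # Round up: a partly filled tape still counts as a whole tape.
    tapes = math.ceil(total_gb / tape_capacity_gb)
    return total_gb / 1000, tapes


# 300 PCs at 25 GB each, written to assumed 100 GB tapes.
total_tb, tapes = full_backup_estimate(pcs=300, avg_used_gb=25, tape_capacity_gb=100)
print(f"{total_tb} TB, {tapes} tapes")  # 7.5 TB, 75 tapes
```

With smaller-capacity media the tape count climbs in proportion, which is
the crux of the question.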

So do those sites rely purely on the incrementals to save you? Or
some site specific policy such as tailoring backups to exclude
(let's say) C:\Program Files, or some such...? Just wondering.

Not every site is lucky enough to be able to convince the beancounters
of the merits of having a backup system that keeps up with the needs of
the end users, even if it means one has to explain doomsday predictions
on the business bottom line -- they invariably hear that, then say "Oh,
pshaw, you're just exaggerating because you want money." It sucks
to be the one that's right ;) And the one who warns well before a
nasty event occurs may also be the first one to be fired out of spite
after something happens, taking the blame for not having prevented it.

-Dan Foster
IP Systems Engineering (IPSE)
Global Crossing Telecommunications



Re: TSM 4.2/AIX setup questions

2002-06-13 Thread Dan Foster

Hot Diggety! Miles Purdy was rumored to have written:
 NO! Remember that a copypool is a complete copy. So everything that is offsite is 
onsite.

 TSM does not need the offsite tapes to reclaim them!

Ah, that is a good point, when put that way. ;)

 But you are correct: You could send a tape out Monday, 10% used (90%
reclaimable). Reclaim it during the day (Monday), and ask for it back Tuesday. With
LTO tapes this can be mitigated by using a high reclaim, like 90%, or 10-to-1. Or run
reclamation less often, like every second day. If you don't run reclamation you could 
have 100GB tape with 1KB used, stuck in your vault.

Ick, that'd be really nasty and a huge waste. Duly noted, thanks.
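
To make the threshold trade-off concrete, here is a minimal Python sketch
of the idea. It is not TSM code: the function names are made up for the
example, and treating a tape as eligible when its reclaimable percentage
is at or above the threshold is an assumption about the exact comparison.

```python
def reclaimable_pct(capacity_gb, used_gb):
    """Percentage of a tape that holds no live data."""
    return 100.0 * (capacity_gb - used_gb) / capacity_gb


def eligible_for_reclamation(capacity_gb, used_gb, reclaim_threshold_pct):
    """True if the tape's reclaimable share meets the threshold (assumed >=)."""
    return reclaimable_pct(capacity_gb, used_gb) >= reclaim_threshold_pct


# A 100 GB tape that is 10% used is 90% reclaimable.
print(eligible_for_reclamation(100, 10, 90))  # True: qualifies even at a high threshold
# A half-full tape is left alone at the same threshold.
print(eligible_for_reclamation(100, 50, 90))  # False
```

A high threshold means fewer tapes get recalled from the vault, at the
cost of leaving more partly empty tapes sitting there.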

 Also make sure that you don't use collocation on a copy pool, unless this is what you
REALLY want. In general I don't think you want to. Not using collocation will keep
fewer tapes, more full. Preventing tape thrashing, if you like.

Indeed. I read about that...sounded great at first, but then soon realized
that potential for tape thrashing was probably going to be too great given
how we manage the tape environment here. So it's definitely not going to be
enabled. (But I appreciate the TSM folks having made that an available option.
Options are always great to have, even if not used.)

-Dan Foster
IP Systems Engineering (IPSE)
Global Crossing Telecommunications



Re: Busy log files

2002-06-13 Thread Dan Foster

Hot Diggety! Thomas Denier was rumored to have written:

 It depends on the purpose of the log files. Some applications append

When put that way, it does make a lot of sense why TSM would
have options to control that behavior.

Fortunately, the log files in question are non-database, and used for
accounting; works by continually appending entries.

 planning, or accounting. This type of log can usually be dealt with
 by using an include statement in the include/exclude file to bind the
 log to a management class whose backup copy group is defined with
 'serialization=dynamic'. This will cause TSM to read the file once

Groovy, thanks. (Also appreciated the other poster who also suggested
looking at the serialization options.)

-Dan Foster
IP Systems Engineering (IPSE)
Global Crossing Telecommunications



Re: Keeping an handle on client systems' large drives

2002-06-13 Thread Dan Foster

Hot Diggety! Seay, Paul was rumored to have written:

 What you have to do is revisit what you are saving and put in exclude.dirs
 for all directories that contain software that can be rebuilt from a common
 desktop image (hard drive replacement).  Have your users save their documents
 in specific folders and only back them up.  Then they just have to customize
 their desktop, configure their node name in the dsm.opt and restore the stuff
 that is backed up.

 This is the trade-off.

Makes sense. Basic education + cost saving vs expense from a brute force
approach. The trick is to have education that works well for a wide range
of users, with differing expertise, and to also clearly communicate
expectations (if you save anywhere else, you won't get it back!).

Now that sounds like I also have to train them to not just blindly click
whenever an application offers them a default directory (often within app
area) to store documents in.

Perhaps a small data area carved out on the hard drive, like say, 5 GB
partition for user documents as Z: or whatever, and similarly for other
platforms (/userdocs/user as a symlink from ~user/docs or whatever), to
provide a consistent and easy-to-use area for end user, yet predictable area
for mass-deployed *SM configurations to use.

I'm sure that the IT shop can help out significantly if they're able to
preconfigure these settings within each application before users get their
hands on the machine. Hard part is when not every place has that luxury,
especially at smaller places where end users may be configuring everything
on their own.

Anyway, the overall education/training approach is definitely cheaper than
having to save everything on the HD, I do agree. ;)

-Dan Foster
IP Systems Engineering (IPSE)
Global Crossing Telecommunications



TSM 4.2/AIX setup questions

2002-06-12 Thread Dan Foster

Environment: TSM 4.2.x on an AIX 4.3.3 ML10 server (660-6H1) with a 3584-L32
and a 3584-D32 expansion frame.

In the process of setting it up, which is a luxury that I'll have only once
to get it right. :) So, some questions (since I am still coming up to speed
on *SM).

1) Is there any particular reason to set a max file size for a disk stgpool?
   (Assuming a setup where disk stgpool will migrate to tape stgpool)

2) Should the TSM server have its own stgpool for backing up itself?

3) I've heard mixed things about 358x firmware version 22UD... I think we
   have 18N2 (but not near it right now to confirm), although what I've
   heard about 22UD is generally (but not 100% in agreement) positive. Stable?

4) Who is supposed/allowed to upgrade firmware? IBM CE only?

5) The only docs for firmware upgrade reference an NT/2000 box and the
   NTUtil application, whereas I'm in an all-UNIX (AIX and Solaris, although
   I do have a laptop with Linux and Windows XP if need be) environment, so
   wonder how to upgrade the firmware without Windows if it's even possible.

6) To *SM, all backups are incrementals (except for the first backup of a
   new client), is my general understanding. Is there a way to force a full
   backup of a particular client as a one-time operation? I'm guessing maybe
   not, but thought I might try asking, anyway. :)

7) The biggest single question... I don't have a real good understanding of
   the purpose of copy stgpools. I've read a lot of documentation -- hundreds
   of pages of multiple docs, re-read, read old adsm-l mail, Google searches,
   etc... but still just don't quite 'get it'. I can set up HACMP clusters,
   debug really obscure things, but this eludes me. ;)

   What I want to do is:

   client -> TSM server -> disk stgpool -> (automatically migrate to tape
   based on space utilization of disk stgpool) tape stgpool

   That's the general concept of what I want to achieve. Is a copy stgpool
   really needed, to be attached to either one of the primary stgpools?

   I was under the impression that a copy stgpool was something you wanted
   when you wanted to copy a primary stgpool so that you could send it to
   another stgpool when ready (based on whatever trigger...space, date),
   such as in a disaster recovery scenario?

-Dan Foster
IP Systems Engineering (IPSE)
Global Crossing Telecommunications



Re: Recovering SP node to Standalone RS6000

2002-05-02 Thread Dan Foster

Hot Diggety! Jolley, Bill was rumored to have written:
 I have a TSM server located on an IBM SP Node (winterhawk) and would like to
 recover to a standalone RS/6000 (H80). Does anyone have/know of a procedure or
 have suggestions?

Offhand, best bet is probably to do a mksysb backup (making sure that
device drivers for both boxes exist, among other key gotchas) and then
a mksysb install from that image. You can't just do a normal file backup
and then restore because there is a bunch of system-specific stuff such
as the ODM entries, installed filesets, etc. Even more critical to get
it right when SP and non-SP are concerned.

I haven't personally done that with our SP and non-SP nodes as you
describe, so that's about all I can suggest from what I've heard from
others who have done similar things in the past.

It's usually done as a way to recover from a total SP node system failure,
rather than migrating to or from the SP, though.

-Dan Foster
IP Systems Engineering (IPSE)
Global Crossing Telecommunications



Re: UNABLE TO DELETE FILESPACE

2002-04-09 Thread Dan Foster

Hot Diggety! Stephen Pole was rumored to have written:
 Hello fellow ADSM/TSM'rs,

 We are trying to DELETE FILESPACES belonging to a NODE so that we can REMOVE the
 NODE.

One way to delete filespaces indiscriminately if you're just going
to remove the whole node, is to do:

DELETE FILESPACE nodename * TYPE=ANY

-Dan Foster
IP Systems Engineering (IPSE)
Global Crossing Telecommunications



Re: mailbox.pst

2002-04-07 Thread Dan Foster

Hot Diggety! Mark Stapleton was rumored to have written:

 As discussed back a month or two ago:

 1. Load kill.exe from the appropriate Windows resource kit.
 2. Create a preschedcommand line that runs
 drive:\path\kill.exe outlook.exe
 3. Run the backup. The .pst file will be available for backup.

 Those pesky users. How dare they keep Outlook open when they lock the
 workstation for the night...

:)

Perhaps it would also be prudent, between steps 2 and 3, to insert
some way of notifying the user about the pending kill 10, 5, 1 minute
beforehand (and tool should be intelligent enough to auto-dismiss the
dialog box if no response after a timeout period).
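
As a rough illustration of the warning schedule, here is a Python sketch
that only works out when the 10, 5, and 1 minute notices would fire before
a given kill time. The function name warning_times and the 23:00 kill time
are made up for the example; actually showing a dialog that dismisses
itself after a timeout is left out.

```python
from datetime import datetime, timedelta


def warning_times(kill_at, minutes_before=(10, 5, 1)):
    """Return the moments at which to warn the user, earliest first."""
    return [kill_at - timedelta(minutes=m)
            for m in sorted(minutes_before, reverse=True)]


# Hypothetical kill time of 23:00, just before the scheduled backup.
kill_at = datetime(2002, 4, 7, 23, 0)
for t in warning_times(kill_at):
    print(t.strftime("%H:%M"))  # 22:50, then 22:55, then 22:59
```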

I'm just afraid that the proposed solution, as is, may really anger
someone who was working on something important during a late night
work session, and seeing it all disappear. It's especially not fun
getting really angry calls from execs. :)

And then you've got some sites that run call centers, and so forth.
(24x7 operations, essentially.)

-Dan Foster
IP Systems Engineering (IPSE)
Global Crossing Telecommunications


