Re: NDMP TOC (was Re: TSM 5.3 new goody)

2005-04-01 Thread Iain Barnetson
Ben,
I've been doing TSM NDMP backups of a pair of clustered FAS940s since
January.
I'm using TSM 5.2.4.1 and Data ONTAP 7.0.0.1.
So far I've not really had any problems, only some minor things to work
round. I have changed the backups from whole-filer jobs to individual
volumes, each with its own TOC, which has speeded up the backups and
restores.

Iain


Regards,

Iain Barnetson
IT Systems Administrator
UKN Infrastructure Operations

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of
Ben Bullock
Sent: 01 April 2005 00:41
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] NDMP TOC (was Re: TSM 5.3 new goody)

Just a quick update on our NDMP/TSM implementation we have been
working on:

RECAP - We were attempting to get NDMP to work with TSM, at the
qtree level (as opposed to the whole volume) and get a TOC for the
files. We had ~some~ qtrees succeed but many failed. 
We've spent the last month with IBM and NetApp, going through logs and
traces to figure out whose fault it was.

Resolution:
For those of you using NDMP on NetApp servers, here is the bug
we ran into:
__

Bug ID: 152072
Title: Tape backups are larger than expected or appear to loop in Phase V.

Description:
If a data set has more than ~4,000 ACLs, then file data may be mistakenly
written out in addition to NT ACL data during Phase V (NT ACLs) of dump.
Symptoms of this bug include: data written taking up significantly more
space than what is being dumped, or dump appearing to loop in Phase V.
The behavior may be seen with dumps initiated from either the filer
console or NDMP.

Fixed-In Version: Data ONTAP 7.0.0.1P2
__

The additional symptom that we bumped into is the inability to
create a Table of Contents for the TSM session.

I'm kinda surprised that nobody else has stumbled across this
bug...

Ben

 
-----Original Message-----
From: bbullock 
Sent: Tuesday, March 22, 2005 10:03 AM
To: 'ADSM: Dist Stor Manager'
Subject: RE: NDMP TOC (was Re: TSM 5.3 new goody)

Hmm, I haven't seen a good rule-of-thumb for the TOC. Perhaps
others who are using it more can address this. 

In my limited testing, I sometimes get a TOC much larger than I
would expect on qtrees with the same approx number of files. Might it
depend on if there are ACLs for the NTFS qtrees? I'm not sure in NDMP if
that type of information is in the actual image file or in the TOC...

Ben

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of
Curtis Stewart
Sent: Tuesday, March 22, 2005 9:47 AM
To: ADSM-L@VM.MARIST.EDU
Subject: NDMP TOC (was Re: TSM 5.3 new goody)

Does anyone know how to estimate the size required for a TOC? I'm looking at
NDMP for our filer, which has about 3 million files.

[EMAIL PROTECTED]


Re: another storage agent question

2005-04-01 Thread Rees, Chris (Corp)
Uwe

Thanks for your reply.   I had already tried what you suggested.
I was dead close but I hadn't got my mgmtclass set up properly on my
library client... what a silly mistake..

It now works.  


-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of
Uwe Schreiber
Sent: Friday, April 01, 2005 12:48 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] another storage agent question

hi chris,

i have this setup working in our AIX TSM environment.

Library Manager : AIX 5.2 TSM 5.2.3.5
Library Client  : AIX 5.2 TSM 5.2.3.5
LANfree Client  : Solaris 9 TSM 5.2.3.4 + Storage Agent 5.2.3.5 + TDP for
                  R/3 3.3.12.0

you have to do the following things:

1. setup the Storage Agent

dsmsta setstorageserver myname=storage_agent_node_name
 mypassword=storage_agent_password
 myhladdress=storage_agent_hl_address
 servername=library_client_name
 serverpassword=password_for_library_client
 hladdress=library_client_hl_address
 lladdress=library_client_ll_address



2. define a server to server communication from the Storage Agent to the
Library Manager via define server .. 

define server LANfree_Client_Name serverpass=LANfree_Client_Password
hladdr=LANfree_Client_hladdress lladdr=LANfree_Client_lladdress


3. define a server to server communication from the Storage Agent to the
Library Client via define server ...

define server LANfree_Client_Name serverpass=LANfree_Client_Password
hladdr=LANfree_Client_hladdress lladdr=LANfree_Client_lladdress



4. define the paths for the drives for the Storage Agent at the Library
Manager via define path ...



now you should be able to start the Storage Agent in foreground at the
LANfree Client and see at least one client session for the LANfree Client
at the Library Client TSM-Server
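
Steps 1 to 3 above can be sketched as a small script that assembles the command strings from one set of values, which makes it harder to drop an '=' or mix up an address. This is only an illustration: the helper names (build_setstorageserver, build_define_server) and every value in the example are placeholders, not part of any TSM tooling. Step 4 is left out because the define path options are site-specific.

```python
# Sketch only: assembles the command strings quoted in steps 1-3.
# build_setstorageserver and build_define_server are hypothetical helpers,
# and all names, passwords and addresses below are placeholders.

SETSTORAGESERVER_KEYS = [
    "myname", "mypassword", "myhladdress",
    "servername", "serverpassword", "hladdress", "lladdress",
]


def build_setstorageserver(params):
    """Return the 'dsmsta setstorageserver' command line for step 1."""
    missing = [key for key in SETSTORAGESERVER_KEYS if key not in params]
    if missing:
        raise ValueError("missing parameters: " + ", ".join(missing))
    options = " ".join(f"{key}={params[key]}" for key in SETSTORAGESERVER_KEYS)
    return "dsmsta setstorageserver " + options


def build_define_server(name, password, hladdr, lladdr):
    """Return the 'define server' command used in steps 2 and 3."""
    return (f"define server {name} serverpass={password} "
            f"hladdr={hladdr} lladdr={lladdr}")


if __name__ == "__main__":
    storage_agent = {
        "myname": "sta1", "mypassword": "pw1", "myhladdress": "10.0.0.5",
        "servername": "libclient", "serverpassword": "pw2",
        "hladdress": "10.0.0.2", "lladdress": "1500",
    }
    print(build_setstorageserver(storage_agent))
    # Run the same define server command on the Library Manager (step 2)
    # and on the Library Client (step 3).
    print(build_define_server("sta1", "pw1", "10.0.0.5", "1502"))
```

Keeping the key list in one place means a missing value is reported before anything is typed into dsmsta or the admin command line.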

Regards Uwe






[EMAIL PROTECTED]
Sent by: ADSM-L@VM.MARIST.EDU
31.03.2005 17:18
Please respond to
ADSM-L@VM.MARIST.EDU


To
ADSM-L@VM.MARIST.EDU
cc

Subject
another storage agent question






Hi All

Env =  Win2k
TSM Server and Storage agent 5.2.2.3
3584 mixed lto1/lto2 libraries
1 tsm server library manager , 1 tsm server library client

The 5.2 and 5.3 manual on the storage agent seems to indicate that I
should be able to do lanfree backups to a tsm server that is a library
client which I don't quite understand because surely there would be no
tape paths defined on the library client. Quotes from manual

When the Tivoli Storage Manager server (data manager server) is also
the library manager for the devices where data is stored by the storage
agent, then the storage agent communicates requests to this Tivoli
Storage Manager server. When the Tivoli Storage Manager server (data
manager server) is another library client, then the storage agent
communicates requests for itself or the metadata server directly to the
library manager. 

A library client requests shared library resources, such as drives or
media, from the library manager, but uses the resources independently.
The library manager coordinates the access to these resources. Data
moves over the SAN between the storage device and either the library
manager or the library client. Either the library manager or any library
client can manage the LAN-free movement of client data as long as the
client system includes a storage agent. 


 Has anyone got this working?  I have only been able to get lanfree
backups working in the normal way to the tsm server which is the library
manager.

What am I missing here?


Cheers







Re: Restore performance problem

2005-04-01 Thread John Naylor
Thomas,
I suspect the media waits are down to the three streams wanting the same
tapes.
If you are really restoring 9 million separate objects then most likely the
majority of the elapsed time is going to be spent at the client end,
writing out the data.
What do the session stats show for the network data transfer rate?
I have not done tracing for a while, but when I did I always found
Perform tracing on the client to be the most useful.
The time is usually found to be spent mainly in:

Transaction:- A general category to capture all time not accounted for
              in other sections. This category includes file open/close
              time and other miscellaneous processing on the client.
              File open/close processing can make total Transaction time
              a large part of elapsed time with smaller files.

File I/O:- Requesting data to be read or written on the client file
           system. Each File I/O usually represents a 32K logical
           request (or the remaining data if less than 32K). File I/O
           may be entered one additional time at the end of the file.
           With compression on some smaller clients a file I/O can
           represent a request for less than 32K. A file I/O request
           may require multiple physical accesses.
           For small files on systems without read ahead, average
           file I/O time for backup is 15 to 40 ms dependent on the
           platform. For large files on systems doing read ahead this can
           be significantly reduced. Slow response times from disks will
           contribute to the amount of time logged here.

Data Verb:- Consists of time spent in the network plus time spent on the
            host server.
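
To get a feel for how much of the elapsed time the File I/O category alone could account for, here is a rough back-of-the-envelope sketch. It assumes one logical request per 32K of file data (at least one per file), ignores the optional extra end-of-file request, and borrows the 15 to 40 ms figure quoted above, which is stated for backup, as a stand-in for restore writes. The function names and the even split across streams are assumptions made for illustration.

```python
import math

IO_CHUNK = 32 * 1024  # logical File I/O request size quoted above, in bytes


def file_io_requests(file_size_bytes):
    """Logical File I/O requests for one file: one per 32K chunk.

    Assumption: every file costs at least one request, and the optional
    extra request at end of file is ignored.
    """
    return max(1, math.ceil(file_size_bytes / IO_CHUNK))


def estimate_io_hours(n_files, avg_file_bytes, ms_per_io, streams=1):
    """Estimated hours spent in File I/O, assuming streams share the work evenly."""
    total_requests = n_files * file_io_requests(avg_file_bytes)
    seconds = total_requests * ms_per_io / 1000.0
    return seconds / 3600.0 / streams


if __name__ == "__main__":
    # 9 million files and about 90 GB give an average of roughly 10,000 bytes
    # per file (taking 1 GB as 10**9 bytes, which is an assumption).
    for ms in (15, 40):
        hours = estimate_io_hours(9_000_000, 10_000, ms, streams=3)
        print(f"{ms} ms per File I/O, 3 streams: about {hours:.1f} hours")
```

With these assumptions the low end of the range already comes to about half a day for three streams, which is consistent with the suspicion that the client file system, not the tape, dominates.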





Thomas Denier [EMAIL PROTECTED]
Sent by: ADSM: Dist Stor Manager ADSM-L@vm.marist.edu
31/03/2005 22:32
Please respond to
ADSM: Dist Stor Manager ADSM-L@vm.marist.edu


To
ADSM-L@vm.marist.edu
cc

Subject
Restore performance problem






We recently restored a large mail server. We restored about nine million
files with a total size of about ninety gigabytes. These were read from
nine 3490 K tapes. The node we were restoring is the only node using the
storage pool involved. We ran three parallel streams. The restore took
just over 24 hours.
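
The figures in this paragraph can be turned into rates with a few lines of arithmetic. The sketch below is only an illustration: it takes "ninety gigabytes" as 90 x 10^9 bytes and "just over 24 hours" as exactly 24 hours, both of which are simplifying assumptions, and restore_rates is a hypothetical helper name.

```python
def restore_rates(n_files, total_bytes, elapsed_hours):
    """Return average file rate, byte rate and file size for a restore."""
    seconds = elapsed_hours * 3600
    return {
        "files_per_second": n_files / seconds,
        "mb_per_second": total_bytes / seconds / 1e6,
        "avg_file_kb": total_bytes / n_files / 1e3,
    }


if __name__ == "__main__":
    rates = restore_rates(9_000_000, 90e9, 24)
    for name, value in rates.items():
        print(f"{name}: {value:.2f}")
```

That works out to roughly 100 files per second and about 1 MB per second across all three streams, with an average file size of about 10 KB, so per-file overhead is a more likely limit than raw data transfer.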

The client is Intel Linux with 5.2.3.0 client code. The server is
mainframe
Linux with 5.2.2.0 server code.

'Query session' commands run during the restore showed the sessions in
'Run'
status most of the time. Accounting records reported the sessions in media
wait most of the time. We think most of this time was spent waiting for
movement of tape within a drive, not waiting for tape mounts.

Our analysis has so far turned up only two obvious problems: the
movebatchsize and movesizethreshold options were smaller than IBM
recommends. On the face of it, these options affect server housekeeping
operations rather than restores. Could these options have any sort of
indirect impact on restore performance? For example, one of my co-workers
speculated that the option values might be forcing migration to write
smaller blocks on tape, and that the restore performance might be
degraded by reading a larger number of blocks.

We are thinking of running a test restore with tracing enabled on the
client, the server, or both. Which trace classes are likely to be
informative without adding too much overhead? We are particularly
interested in information on the server side. The IBM documentation for
most of the server trace classes seems to be limited to the names of the
trace classes.





Re: Restore performance problem

2005-04-01 Thread Rainer Wolf
Hi,
I have done quite similar restores on our mail server.
You may also want to look at what happens to the restore process on the
client. Could it be that the CPU is at 100% for the 'dsmc restore ..'
process? Another thing is the filesystem on the client: check the
filesystem/disk activity and service times for any 'weakness' that may
result from creating that many inodes.

I have recently done a lot of mail server restores (always 3.5 million
files/140 GB) using an old TSM server (v5.1.9.5 with K tapes and the same
configuration as yours ... 10 tapes) and observed that this old TSM server
in particular was at its limit.
The I/O configuration of that old TSM server was especially bad: DB, log
and disk cache are mixed together. This decreases restore performance,
especially when other activity (backups at night) is going on.
So we used
dsmc restore -quiet /mail/ /data2/mail/
(tcpwindowsize 64, tcpbuffsize 32, largecommbuffers no, txnbytelimit 25600,
resourceutilization 3)
and finally received the 3.5 million files/140 GB in 09:53:34.
For me that was OK because I know about the bad state of the server.
The restore time would be much worse if the restore ran at a time
when the TSM DB was handling a lot of other transactions, like nightly
backups.
... restoring the same with only one drive takes 51 hours.


Running the same mail restore test on new hardware (new DB, TSM 5.3, with
3592 drives), using the same restore client, we finally got 3.5 million
files/150 GB restored in 04:52:00
... using just 1 drive because the data fits on one 3599 tape.
But here I have experienced a reproducible bug/behaviour (it is 'closed'
at the moment because Solaris 10 is not yet supported): when starting the
restore everything runs fine and fast (with a restore performance of about
1 million files/hour) ... after some time, maybe 40% of the total restore
time, the CPU of the client rises to 100% and the restore performance
(data/files) slows down accordingly. No reason for this was found at the
server or at the client.
... maybe it happens when a very big directory with a lot of directories
in it is being processed ...
In the end I found a 'workaround': I cancelled this slowed-down restore
process running at 100% CPU
('dsmc restore -quiet /mail/ /data2/mail/')
with Control-C and let it shut down ... and then I just restarted the
restore with
'dsmc restart restore -quiet'. This restarted restore works fast again and
finally ends with the 04:52:00 (total time).
If I do not stop and restart the client restore session, the restore
finishes in 06:49:09.
That is reproducible and it is quite a big difference
(30% faster with interrupting and restarting),
but maybe it's because of our unsupported TSM version
... or has someone else seen this CPU-crunching behaviour?
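
As a quick check of the elapsed times quoted in this message, the following sketch converts the hh:mm:ss values to seconds and compares them. The helper names are hypothetical and the file count of 3.5 million is the round figure from the text, so the rates are approximate.

```python
def hms_to_seconds(hms):
    """Convert an 'hh:mm:ss' string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def time_saved_fraction(slow_hms, fast_hms):
    """Fraction of the slower elapsed time that the faster run saves."""
    slow = hms_to_seconds(slow_hms)
    fast = hms_to_seconds(fast_hms)
    return (slow - fast) / slow


def files_per_hour(n_files, hms):
    """Average number of files restored per hour."""
    return n_files / (hms_to_seconds(hms) / 3600)


if __name__ == "__main__":
    saved = time_saved_fraction("06:49:09", "04:52:00")
    print(f"restart workaround saves about {saved:.0%} of the elapsed time")
    for label, hms in (("old server", "09:53:34"), ("new server", "04:52:00")):
        print(f"{label}: about {files_per_hour(3_500_000, hms):,.0f} files/hour")
```

The saving from interrupting and restarting comes out at a little under 29% of the uninterrupted run, which matches the "30% faster" quoted above to within rounding.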

Greetings 
Rainer




-- 

Rainer Wolf  eMail:   [EMAIL PROTECTED]
kiz - Abt. Infrastruktur   Tel/Fax:  ++49 731 50-22482/22471
Universität Ulm  wwweb: http://kiz.uni-ulm.de


Re: help:can tsm server manage multi-library?

2005-04-01 Thread Bos, Karel
Yes, ITSM server can do that.

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of
ming li
Sent: vrijdag 1 april 2005 6:25
To: ADSM-L@VM.MARIST.EDU
Subject: help:can tsm server manage multi-library?

Hi all, can a TSM server manage more than one library in a LAN-free backup
environment? Thx!


Re: Large Linux clients

2005-04-01 Thread Henk ten Have
An old trick I have used for many years:
to investigate a problem filesystem, do a find in that filesystem.
If the find dies, TSM definitely will die.
I'll bet your find will die, and that's why your backup will die/hang or
whatever as well. A find does a file stat on all files/dirs, which is
actually the same thing the backup does.
So your issue is OS related and not TSM.
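
The same idea can be sketched in a few lines of Python for anyone who wants a list of the failing paths rather than just a dead find. This is an illustration, not something the backup client does internally: stat_walk is a hypothetical helper that walks the tree, calls lstat on every entry below the starting directory and collects any errors.

```python
import os
import sys


def stat_walk(root):
    """Walk root like 'find' and lstat every entry below it.

    Returns (entries_visited, errors), where errors is a list of
    (path, message) tuples for entries that could not be listed or stat'ed.
    """
    count = 0
    errors = []

    def on_error(exc):
        # os.walk reports directories it cannot list through this callback.
        errors.append((exc.filename, str(exc)))

    for dirpath, dirnames, filenames in os.walk(root, onerror=on_error):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                os.lstat(path)
                count += 1
            except OSError as exc:
                errors.append((path, str(exc)))
    return count, errors


if __name__ == "__main__":
    start = sys.argv[1] if len(sys.argv) > 1 else "."
    visited, problems = stat_walk(start)
    print(f"{visited} entries visited, {len(problems)} errors")
    for path, message in problems:
        print(f"  {path}: {message}")
```

Note that the count excludes the starting directory itself, so it will be one lower than the number printed by find . | wc -l on the same tree.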

Cheers
Henk ()

On Tuesday 29 March 2005 12:11, you wrote:
> On Mar 29, 2005, at 12:37 PM, Zoltan Forray/AC/VCU wrote:
>> ...However, when I try to backup the tree at the third-level (e.g.
>> /coyote/dsk3/), the client pretty much seizes immediately and
>> dsmerror.log says "B/A Txn Producer Thread, fatal error, Signal 11".
>> The server shows the session as SendW and nothing else going on
>
> Zoltan -
>
> Signal 11 is a segfault - a software failure.
> The client programming has a defect, which may be incited by a problem
> in that area of the file system (so have that investigated). A segfault
> can be induced by memory constraint, which in this context would most
> likely be Unix Resource Limits, so also enter the command 'limit' in
> Linux csh or tcsh and potentially boost the stack size ('unlimit
> stacksize'). This is to say that the client was probably invoked under
> artificially limited environmentals.
>
> Richard Sims
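
For shells other than csh or tcsh, the stack limit mentioned above can also be inspected from Python's standard resource module. This is a minimal sketch for Linux or other Unix systems. The helper names are made up for illustration, and raising the soft limit this way only affects the Python process and anything it starts afterwards, not a backup client that is already running.

```python
import resource


def describe_limit(value):
    """Render a limit value in bytes, or 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def stack_limits():
    """Return the current soft and hard stack size limits as strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    return {"soft": describe_limit(soft), "hard": describe_limit(hard)}


def raise_stack_soft_limit_to_hard():
    """Raise the soft stack limit to the hard limit for this process."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_STACK)


if __name__ == "__main__":
    # Values are reported in bytes; the csh 'limit' command shows kbytes.
    print(stack_limits())
```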


Re: Restore performance problem

2005-04-01 Thread Richard Sims
On Mar 31, 2005, at 4:32 PM, Thomas Denier wrote:
> We recently restored a large mail server. We restored about nine million
> files with a total size of about ninety gigabytes. These were read from
> nine 3490 K tapes. The node we were restoring is the only node using the
> storage pool involved. We ran three parallel streams. The restore took
> just over 24 hours.
> The client is Intel Linux with 5.2.3.0 client code. The server is
> mainframe Linux with 5.2.2.0 server code. ...

I noticed that you didn't mention the file system type. The effects of
file system type and layout of the subject instance is an often
overlooked contributor to performance in operations which are
mass-populating the file system, as a restoral will. A journaled file
system can exhibit a lot of overhead as its journal is written with at
least metadata, depending upon type; and an ill-located journal can
make for a lot of disk arm diversions during the restoral, aggravating
elapsed time.
IBM's outstanding documentation store includes a great series on Linux
file systems, which one can jump into at
http://www-106.ibm.com/developerworks/linux/library/l-fs7.html .
  Richard Sims


lost volume configuration in my 3583

2005-04-01 Thread Luc Beaudoin
Hi all

The IBM tech updated the firmware on my 3583 ..
I was able to reconfigure the library path and drive path ...

Now I can't see my volumes.
How can I make TSM re-learn the volumes?

HELP

Luc Beaudoin
Administrateur Réseau / Network Administrator
Hopital General Juif S.M.B.D.
Tel: (514) 340-8222 ext:4318


export data

2005-04-01 Thread Joni Moyer
Hello All!

I have created a new TSM environment, but I changed the management class
naming standard.  There is data out on the old system that I want moved to
the new environment since it's supposed to be retained for 7 years.  Is
there a way to export the data directly to the new TSM server and associate
it with a different management class?  How would I get the data from the
old server with the old mc naming standard to the new TSM server with the
new MC names?  Please let me know if anyone has had success in exporting
data and how it was accomplished.  TIA for any advice!



Joni Moyer
Highmark
Storage Systems
Work:(717)302-6603
Fax:(717)302-5974
[EMAIL PROTECTED]



Re: lost volume configuration in my 3583

2005-04-01 Thread PAC Brion Arnaud
Luc,

I believe the only way to do this is using the checkin libvolume command.
Something like:
1) checkin libvol libname status=scratch search=yes checklabel=no
2) checkin libvol libname status=private search=yes checklabel=no

Note that you MUST issue those commands in sequence: scratch first, and then
private. Not doing it that way will result in all of your tapes being
private!
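
Since the order of the two commands matters, a tiny sketch that always emits them in the right sequence may help when scripting this. The function name and the library name in the example are placeholders, and the options are copied from the two commands above rather than taken from any reference.

```python
def checkin_commands(library_name):
    """Return the two check-in commands in the required order."""
    # Order matters: scratch first, then private.
    statuses = ["scratch", "private"]
    return [
        f"checkin libvol {library_name} status={status} "
        "search=yes checklabel=no"
        for status in statuses
    ]


if __name__ == "__main__":
    # 'LIB3583' is a made-up library name for illustration.
    for command in checkin_commands("LIB3583"):
        print(command)
```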
Hope this helped ...
Regards.

Arnaud 

**
Panalpina Management Ltd., Basle, Switzerland, CIT Department
Viadukstrasse 42, P.O. Box 4002 Basel/CH
Phone:  +41 (61) 226 11 11, FAX: +41 (61) 226 17 01
Direct: +41 (61) 226 19 78
e-mail: [EMAIL PROTECTED]
**



Re: Restore performance problem

2005-04-01 Thread Thomas Denier
> I noticed that you didn't mention the file system type. The effects of
> file system type and layout of the subject instance is an often
> overlooked contributor to performance in operations which are
> mass-populating the file system, as a restoral will. A journaled file
> system can exhibit a lot of overhead as its journal is written with at
> least metadata, depending upon type; and an ill-located journal can
> make for a lot of disk arm diversions during the restoral, aggravating
> elapsed time.

The output file system was Ext2, which is not journalled.


Re: lost volume configuration in my 3583

2005-04-01 Thread William Boyer
I've also found that after a firmware update on the library the Default or
Extended label option gets reset. That option determines whether the
library reports 6 or 8 characters of the barcode. Check to make sure that
the library is configured with the correct option for the TSM tapes.
Otherwise TSM will think they are all new tapes (different volume names)
and try to use them as scratch. But the internal label won't match, so
you're safe there.
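
A small sketch of why the label option matters: the same cartridge shows up under two different volume names depending on whether 6 or 8 barcode characters are reported. The barcode value is invented, the helper names are hypothetical, and the sketch assumes the short form is simply the first six characters of the barcode.

```python
def reported_volume_name(barcode, extended):
    """Volume name as reported with the 8-character (extended) or
    6-character (default) label option. Assumes simple truncation."""
    return barcode[:8] if extended else barcode[:6]


def names_match(barcode, known_name, extended):
    """True if the name the library reports equals the name TSM already knows."""
    return reported_volume_name(barcode, extended) == known_name


if __name__ == "__main__":
    barcode = "ABC123L1"   # made-up barcode for illustration
    known = "ABC123"       # name the volume was originally checked in under
    for extended in (False, True):
        name = reported_volume_name(barcode, extended)
        print(f"extended={extended}: reported as {name}, "
              f"matches known volume: {names_match(barcode, known, extended)}")
```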

Bill Boyer
Some days you're the bug, some days you're the windshield - ??



Re: tsm acsls

2005-04-01 Thread Remco Post
Joni Moyer wrote:
> Hello All!
>
> I had an issue where the /tmp directory filled up and acsls stopped
> running.  At that time TSM was doing a backup stgpool process.  The TSM
> server is at 5.2.2.5 and is running on AIX 5.2 and we have ACSLS 7.1.  I
> tried to cancel the job, but it will not.  The tapes are in the drives, but
> I think that something got messed up when the connection was lost and now
> it doesn't appear to be backing up, but it won't let me cancel it or
> dismount the tapes.  Has anyone ever had this issue before and if so, what
> did you have to do to fix it?  Thanks!

There are scripts in /usr/tivoli/tsm/devices/bin to kill and start the
ACSLS agent/daemon/hack thing. Kill it, then restart it.

> Joni Moyer
> Highmark
> Storage Systems
> Work:(717)302-6603
> Fax:(717)302-5974
> [EMAIL PROTECTED]


--
Met vriendelijke groeten,
Remco Post
SARA - Reken- en Netwerkdiensten  http://www.sara.nl
High Performance Computing  Tel. +31 20 592 3000Fax. +31 20 668 3167
I really didn't foresee the Internet. But then, neither did the
computer industry. Not that that tells us very much of course - the
computer industry didn't even foresee that the century was going to
end. -- Douglas Adams


Re: export data

2005-04-01 Thread Remco Post
Joni Moyer wrote:
> Hello All!
>
> I have created a new TSM environment, but I changed the management class
> naming standard.  There is data out on the old system that I want moved to
> the new environment since it's supposed to be retained for 7 years.  Is
> there a way to export the data directly to the new TSM server and associate
> it with a different management class?  How would I get the data from the
> old server with the old mc naming standard to the new TSM server with the
> new MC names?  Please let me know if anyone has had success in exporting
> data and how it was accomplished.  TIA for any advice!

Having data bound to a new MC might be a problem. I have never done that,
so I can't tell if and how that might work. The basic trick is to define
each server on the other (def server), then on the old server run:
export node whatever tos=newserver filedata=all

> Joni Moyer
> Highmark
> Storage Systems
> Work:(717)302-6603
> Fax:(717)302-5974
> [EMAIL PROTECTED]


--
Met vriendelijke groeten,
Remco Post
SARA - Reken- en Netwerkdiensten  http://www.sara.nl
High Performance Computing  Tel. +31 20 592 3000Fax. +31 20 668 3167
I really didn't foresee the Internet. But then, neither did the
computer industry. Not that that tells us very much of course - the
computer industry didn't even foresee that the century was going to
end. -- Douglas Adams


Re: Large Linux clients

2005-04-01 Thread Zoltan Forray/AC/VCU
Thanks for the suggestion.   However, this is not true.  We already tried
this.

We ran find . | wc -l to get the object count (1.1M) with no problems.
But the backup still will not work. It constantly fails, in
unpredictable/inconsistent places, with the same Producer Thread error.

I spent 2+ days drilling through the various sub-directories (of this
directory that causes the failures), one-by-one, and was able to backup 38
of the 40 subdirs, totalling over 980K objects, without a problem.  When
I included the two other directories in the same pile, the backup would
fail.

When I then went back and individually selected the sub-sub directories of
these sub-directories (one at a time), I was able to backup *ALL* of the
sub-sub directories, no problem.  Then I went back and selected the
upper-level directory and backed it up, no problem.

Let me draw a picture of the structure of these directories.

The problem directories are in this directory:
/coyote/dsk3/patients/prostateReOpt/Mount_0/ .

If I try to backup /Mount_0/ as a whole, it crashes every time.   If I
point to the sub-dirs below /Mount_0/ (40 of these, each containing the
same four named sub-sub dirs), two of these cause a crash. I noted that
these two both have 72K objects while the other 38 have fewer than 60K
objects.
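
To compare object counts across the 40 sub-directories without running find by hand in each one, something like the following sketch could be used. The helper names are hypothetical, and the 60,000 threshold in the example is just the figure observed above, not a known client limit.

```python
import os
import sys


def count_objects(path):
    """Count files and directories below path (path itself not included)."""
    total = 0
    for _dirpath, dirnames, filenames in os.walk(path):
        total += len(dirnames) + len(filenames)
    return total


def counts_by_subdir(parent):
    """Map each immediate sub-directory of parent to its object count."""
    result = {}
    with os.scandir(parent) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                result[entry.name] = count_objects(entry.path)
    return result


def over_threshold(counts, threshold):
    """Names of sub-directories holding more than threshold objects."""
    return sorted(name for name, n in counts.items() if n > threshold)


if __name__ == "__main__":
    parent = sys.argv[1] if len(sys.argv) > 1 else "."
    counts = counts_by_subdir(parent)
    for name in sorted(counts):
        print(f"{counts[name]:>10}  {name}")
    print("over 60,000 objects:", over_threshold(counts, 60_000))
```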

Yet when I manually picked the four sub-sub dirs of the Patient_172 dir,
the backup worked (sort of - see below). Same for Patient_173.

To really drive me crazy, on the first attempt at backing up one of the
sub-sub dirs under Patient_172, the backup crashed. Yet I could backup the
other 3 with no issue. So, we started looking at the problem subdir and
noticed a weird file name that ended in a tilde (~).  When I excluded it,
the backup ran. Then when I went back and picked just the file with the
tilde, it backed up fine (my head is getting balder-and-balder !!).  I
then went back and re-selected the whole Patient_172 directory and it
backed up (or at least scanned it since everything was backed-up) just
fine !!!  AGGH !!

This is maddening and shows no rhyme-or-reason.






Lan-free backup limitations?

2005-04-01 Thread Adams, Matt (US - Hermitage)
Has anyone run across a limit (documented or otherwise) for the number
of lan-free clients one TSM server can handle?

We have one TSM server for which we had 23 Exchange servers running
LAN-free backups.  We added servers 24 and 25, and now we are having a
strange problem where it appears these new clients are sending polling or
query commands to the tape drives, which is causing tape errors and in
some cases taking the drives offline.  We have verified that the two new
servers are identical in every way to all the others... Drives, firmware,
BIOS, etc..  When we block these two hosts at the port level from seeing
the tape drives, the problem goes away.


TSM Server AIX 5.2
TSM Version 5.2.3.1
3494 Library
3590H tape drives


Windows Hosts
B/A Client version 5.2.0.3
ITSM for Mail 5.2.1.0
Storage Agent version 5.2.2.3
Qlogic 2310F HBA - Driver version 9.0.0.13



We have submitted dumps to IBM support, but they are painfully slow in
getting back to us.

Just thought I would poll the group...

Thanks,

Matt Adams
Information Technology Services
Deloitte Services LP
615-882-6861
www.deloitte.com







5.3.1.0 server available

2005-04-01 Thread Loon, E.J. van - SPLXM
Hi *SM-ers!
For those of you that haven't noticed it yet: the TSM 5.3.1.0 server code is
available for download:
ftp://service.boulder.ibm.com/storage/tivoli-storage-management/maintenance/server/v5r3/
At this moment only for AIX and Linux, I think.
Kindest regards,
Eric van Loon
KLM Royal Dutch Airlines




Re: curious behavior

2005-04-01 Thread Tyree, David
I just rechecked things with the comments from the list. It seems the
current access level of the tape didn't have any bearing on the
situation.
Right now I have a q mo showing one tape as R/W and the other as R/O.
The q vol f=d is telling me that both tapes are Read/Write. 


tsm: BACKUP1>q pr

 Process Process Description  Status
  Number
-------- -------------------- -------------------------------------------------
     616 Space Reclamation    Offsite Volume(s) (storage pool NEWCOPYPOOL),
                               Moved Files: 5868, Moved Bytes: 22,303,511,754,
                               Unreadable Files: 543, Unreadable Bytes: 0.
                               Current Physical File (bytes): 11,347,126,815
                               Current input volume: 03L2.
                               Current output volume: 93L2.

tsm: BACKUP1>q mo
ANR8330I LTO volume 03L2 is mounted R/O in drive DRV4 (mt0.5.0.5), status: IN USE.
ANR8330I LTO volume 93L2 is mounted R/W in drive DRV2 (mt0.3.0.5), status: IN USE.
ANR8334I 2 matches found.

tsm: BACKUP1>q vol 03l2 f=d

                   Volume Name: 03L2
             Storage Pool Name: NEWTAPEPOOL
             Device Class Name: LTO2
       Estimated Capacity (MB): 431,430.2
       Scaled Capacity Applied:
                      Pct Util: 93.8
                 Volume Status: Full
                        Access: Read/Write
        Pct. Reclaimable Space: 6.6
               Scratch Volume?: Yes
               In Error State?: No
      Number of Writable Sides: 1
       Number of Times Mounted: 18
             Write Pass Number: 1
     Approx. Date Last Written: 03/29/2005 17:13:29
        Approx. Date Last Read: 04/01/2005 12:36:43
           Date Became Pending:
        Number of Write Errors: 0
         Number of Read Errors: 0
               Volume Location:
Volume is MVS Lanfree Capable : No
Last Update by (administrator):
         Last Update Date/Time: 03/28/2005 21:27:17


tsm: BACKUP1>q vol 93l2 f=d

                   Volume Name: 93L2
             Storage Pool Name: NEWCOPYPOOL
             Device Class Name: LTO2
       Estimated Capacity (MB): 204,800.0
       Scaled Capacity Applied:
                      Pct Util: 10.3
                 Volume Status: Filling
                        Access: Read/Write
        Pct. Reclaimable Space: 0.0
               Scratch Volume?: Yes
               In Error State?: No
      Number of Writable Sides: 1
       Number of Times Mounted: 1
             Write Pass Number: 1
     Approx. Date Last Written: 04/01/2005 12:47:32
        Approx. Date Last Read: 04/01/2005 12:02:48
           Date Became Pending:
        Number of Write Errors: 0
         Number of Read Errors: 0
               Volume Location:
Volume is MVS Lanfree Capable : No
Last Update by (administrator):
         Last Update Date/Time: 04/01/2005 12:01:20


Re: curious behavior

2005-04-01 Thread Stapleton, Mark
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On 
Behalf Of Tyree, David
I just rechecked things with the comments from the list. It seems the
current access level of the tape didn't have any bearing on the
situation.
Right now I have a q mo showing one tape as R/W and the other as R/O.
The q vol f=d is telling me that both tapes are Read/Write. 

That is correct behavior. 03L2 is the input tape for your
reclamation process; it is mounted R/O so that nothing can be
written to the source tape while its data is moved to the target tape.
Working as designed (WAD).
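
To watch the mount mode over time without reading q mo output by eye,
something like the following Python sketch could pull the mode out of the
ANR8330I lines. The regular expression is only an assumption derived from
the two lines David posted, not from any documented message format, and
`parse_mounts` is a made-up helper name.

```python
import re

# Pattern inferred from the ANR8330I lines quoted in this thread (assumption).
MOUNT_RE = re.compile(
    r"ANR8330I\s+(?P<devtype>\S+) volume (?P<volume>\S+) is mounted "
    r"(?P<mode>R/[OW]) in drive (?P<drive>\S+) \((?P<device>[^)]+)\), "
    r"status: (?P<status>[^.]+)\."
)


def parse_mounts(text):
    """Return one dict per mounted volume found in 'q mo' output."""
    # Collapse whitespace first so lines wrapped by a mail client still match.
    flat = " ".join(text.split())
    return [m.groupdict() for m in MOUNT_RE.finditer(flat)]


if __name__ == "__main__":
    sample = """
    ANR8330I LTO volume 03L2 is mounted R/O in drive DRV4 (mt0.5.0.5),
    status: IN USE.
    ANR8330I LTO volume 93L2 is mounted R/W in drive DRV2 (mt0.3.0.5),
    status: IN USE.
    ANR8334I 2 matches found.
    """
    for mount in parse_mounts(sample):
        print(mount["volume"], mount["mode"], mount["drive"])
```

The mode reported here describes the current mount only; it says nothing
about the Access value that q vol f=d shows.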

I *have* seen instances where a source tape, which should be mounted
R/O, is instead mounted R/W. I haven't tracked the issue because, quite
frankly, it's not much of an issue; I am usually running down more
serious problems.

--
Mark Stapleton ([EMAIL PROTECTED])
IBM Certified Advanced Deployment Professional
Tivoli Storage Management Solutions 2005
Office 262.521.5627  


Re: curious behavior

2005-04-01 Thread Bos, Karel
Hi,

Not that curious. The input tape for the reclaim process is mounted
read-only; the output tape is being written to, so it is mounted
read/write.

This is normal and has nothing to do with the access state of a volume
as seen by q vol.

If a read-only volume were mounted read/write and were also the output
volume of a process, I would begin to get worried.

Regards, 


_
Karel Bos
Technical Expert level 5 
Server Management - Operations Back-up en Restore
Customer Unit Nuon
Atos Origin Nederland B.V.
Arlandaweg 98
1043 HP Amsterdam
Office:   +31 (0)20
Fax:  +31 (0)20
Mobile:  +31 (0)6.51.29.88.01 
Mail: [EMAIL PROTECTED]

-Original Message-
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of
Tyree, David
Sent: Friday 1 April 2005 20:01
To: ADSM-L@VM.MARIST.EDU
Subject: Re: curious behavior

I just rechecked things with the comments from the list. It seems the
current access level of the tape didn't have any bearing on the
situation.
Right now I have a q mo showing one tape as R/W and the other as R/O.
The q vol f=d is telling me that both tapes are Read/Write. 


tsm: BACKUP1>q pr

 Process Process Description  Status
  Number
-------- -------------------- -------------------------------------------------
     616 Space Reclamation    Offsite Volume(s) (storage pool NEWCOPYPOOL),
                               Moved Files: 5868, Moved Bytes: 22,303,511,754,
                               Unreadable Files: 543, Unreadable Bytes: 0.
                               Current Physical File (bytes): 11,347,126,815
                               Current input volume: 03L2.
                               Current output volume: 93L2.

tsm: BACKUP1>q mo
ANR8330I LTO volume 03L2 is mounted R/O in drive DRV4 (mt0.5.0.5), status: IN USE.
ANR8330I LTO volume 93L2 is mounted R/W in drive DRV2 (mt0.3.0.5), status: IN USE.
ANR8334I 2 matches found.

tsm: BACKUP1>q vol 03l2 f=d

                   Volume Name: 03L2
             Storage Pool Name: NEWTAPEPOOL
             Device Class Name: LTO2
       Estimated Capacity (MB): 431,430.2
       Scaled Capacity Applied:
                      Pct Util: 93.8
                 Volume Status: Full
                        Access: Read/Write
        Pct. Reclaimable Space: 6.6
               Scratch Volume?: Yes
               In Error State?: No
      Number of Writable Sides: 1
       Number of Times Mounted: 18
             Write Pass Number: 1
     Approx. Date Last Written: 03/29/2005 17:13:29
        Approx. Date Last Read: 04/01/2005 12:36:43
           Date Became Pending:
        Number of Write Errors: 0
         Number of Read Errors: 0
               Volume Location:
Volume is MVS Lanfree Capable : No
Last Update by (administrator):
         Last Update Date/Time: 03/28/2005 21:27:17


tsm: BACKUP1>q vol 93l2 f=d

                   Volume Name: 93L2
             Storage Pool Name: NEWCOPYPOOL
             Device Class Name: LTO2
       Estimated Capacity (MB): 204,800.0
       Scaled Capacity Applied:
                      Pct Util: 10.3
                 Volume Status: Filling
                        Access: Read/Write
        Pct. Reclaimable Space: 0.0
               Scratch Volume?: Yes
               In Error State?: No
      Number of Writable Sides: 1
       Number of Times Mounted: 1
             Write Pass Number: 1
     Approx. Date Last Written: 04/01/2005 12:47:32
        Approx. Date Last Read: 04/01/2005 12:02:48
           Date Became Pending:
        Number of Write Errors: 0
         Number of Read Errors: 0
               Volume Location:
Volume is MVS Lanfree Capable : No
Last Update by (administrator):
         Last Update Date/Time: 04/01/2005 12:01:20


Monthly TSM FAQ April 2005 part 1 of 2 (no April Fools here!)

2005-04-01 Thread Stapleton, Mark
This Frequently Asked Question list for the ADSM-L mailing list is
posted on the first day of each  month. It was created to cut down on
the number of questions that are repeated regularly in the ADSM-L
mailing list from vm.marist.edu. I would be grateful for any requests to
include additional  material. (Please send them directly to me, rather
than to the list.)

updated 4/1/2005


Questions marked with $ are new or improved since the last posting.

QUESTIONS Sections 01, 02, and 03

01.   About the list itself
01-01.  How do I subscribe to ADSM-L?
01-02.  How do I unsubscribe to ADSM-L?
01-03.  Why don't I see the questions I post to ADSM-L?
01-04.  How can I see the questions I post to ADSM-L?
01-05.  Who decides what questions go on ADSM-L?
01-06.  Is there a digest or archive of ADSM-L?
01-07.  How do I get more information about ADSM-L?
01-08.  Does IBM/Tivoli participate in ADSM-L?
01-09   How can I get just a digest of ADSM-L, instead of all the
  postings?

02.   Types of questions asked
02-01.  What subjects are covered in this list?
02-02.  What kinds of questions can be asked?
02-03.  What kinds of questions can I expect answers to?
02-04.  What levels of netiquette are expected?
02-05.  What's the first thing to do when I have a question about TSM?
02-06.  What's the second thing to do when I have a question about TSM?
02-07.  What's the third thing to do when I still have a question about
  TSM?
02-08.  What's the fourth thing to do when I STILL have a question 
  about TSM?
02-09.  What's the fifth thing to do when I *STILL* have a question
  about TSM?
02-10.  What's the last thing to do when I *STILL* have a question
  about TSM? 
02-11.  What are those out of office messages I keep seeing in the
  list?
02-12.  What's the single best thing I can do to improve the list?
02-13.  Why don't I get answers to my "I need comparisons between TSM
  and brandX backup software" questions?
02-14.  What kinds of things shouldn't I post on ADSM-L?
02-15.  Is there some sort of acronym list?
02-16.  Whatever happened to Richard Sims?

03.   Available TSM resources
03-01.  What FAQs are already out there?
03-02.  What other sources of help can I find?
03-03.  How do I get official TSM support?

ANSWERS to section 01, 02, and 03

01-01.  How do I subscribe to ADSM-L?
Send an email to [EMAIL PROTECTED] with a blank subject line and a
message consisting only of  the line SUBSCRIBE ADSM-L.

01-02.  How do I unsubscribe to ADSM-L?
Send an email to [EMAIL PROTECTED] with a blank subject line and a
message consisting only of  the line UNSUBSCRIBE ADSM-L. Do NOT try to
unsubscribe by sending email to [EMAIL PROTECTED] All  that does is
annoy the list members, and it doesn't get you unsubscribed.

01-03.  Why don't I see the questions I post to ADSM-L?
That's the normal behavior of ADSM-L.

01-04.  How can I see the questions I post to ADSM-L?
If you want to see your own questions, send an email to
[EMAIL PROTECTED] with a blank subject  line and a message that
consists only of the line SET ADSM-L REPRO.

01-05.  Who decides what questions go on ADSM-L?
The list members. There appears to be no active moderation of the list.
(That's not a license to  abuse the list. Complaints from the list
members are heeded by the list administrator.)

01-06.  Is there a digest or archive of ADSM-L?
Indeed. There are two indexed versions of the mailing list. The first is
at http://search.adsm.org;  the other one is
http://www.mail-archive.com/adsm-l@vm.marist.edu/. (Thanks, Richard!) 
Personally, I prefer the latter; the former's indexing leaves a lot to
be desired, and its message  threading is practically non-existent.

01-07.  How do I get more information about ADSM-L?
Send an email to [EMAIL PROTECTED] with a blank subject line and a
message consisting only of  the line INFO. This will cause an email to
be returned to you with a list of documents available  about the
listserver and instructions on how to get them.

01-08.  Does IBM/Tivoli participate in or post to ADSM-L?
From Andy Raibeck, of the TSM client development group: "This list
server is owned and operated by Marist College, and is not in any way
affiliated with IBM. While some IBMers do participate on ADSM-L, they
do so on an unofficial, voluntary basis, and thus are not *required* to
answer your questions. If you require an answer from IBM, or if your
situation is of an urgent nature, then you should (also) go through
IBM's official support channels for assistance."

01-09   How can I get just a digest of ADSM-L, instead of all the
  postings?
(from Andy Raibeck) Send an email to: [EMAIL PROTECTED]
In the body of the email, put *only* the following:
   info refcard
You do not need a subject line. You will get reference information back
from the list server that  tells you, among other things, how to
configure your subscription to receive the 

Monthly TSM FAQ April 2005 (part 2 of 2) (no April Fools here!)

2005-04-01 Thread Stapleton, Mark
This Frequently Asked Question list for the ADSM-L mailing list is
posted on the first day of each  month. It was created to cut down on
the number of questions that are repeated regularly in the  ADSM-L
mailing list from vm.marist.edu. I would be grateful for any requests to
include additional  material. (Please send them directly to me, rather
than to the list.)

updated 4/1/2005


Questions marked with $ are new or improved since the last posting.

Questions for sections 04 and 05

04.   Frequently-asked questions on ADSM-L
04-01.  Is it called ADSM, or TSM, or ITSM? What's the deal here?
04-02.  What are backupsets? How can I use them?
04-03.  How does TSM do full/incremental/differential backups, just like
  my old backup software fillintheblank used to?
04-04.  How do I unsubscribe to ADSM-L?
04-05.  How do I do mailbox-level restores of Exchange using the Tivoli 
  Data Protection Agent for Exchange?
04-06.  How do I force TSM to do a full backup of a client?
04-07.  Where can I download the latest version of TSM/TDP?
04-08.  What's the very first thing I do after TSM is delivered to me?
04-09.  I'm getting message ANRX from the TSM server. What does it
  mean?
04-10.  I'm getting message ANSX from the TSM client. What does it
  mean?
04-11.  My large-scale restores are slow. How can I speed them up?
04-12.  How do I back up normally open files, like database files?
04-13.  What's all this about TSM and SQL select statements?
04-14.  My boss wants disaster recovery procedures. What's the best way 
  to do it?
04-15.  How do I get TSM to report problems to me?
04-16.  Why does version X of TSM have this bad bug in it?
04-17.  How come my copy pool tape reclamation runs so slowly?
04-18.  I keep getting these "server out of license compliance"
  messages. Why?
04-19.  My scheduled backups fail (or are incomplete), but my manual 
  ones work fine. Why?
04-20.  While backleveling my TSM client from 4.2.1 to 4.1.3, I get a
  downlevel message. Why?
04-21.  Why do I get an "ANR1440I All drives in use. Process being
  preempted by higher priority operation" message when my
  storage pool backup fails?
04-22.  I've deleted all data from a tape volume, but it hasn't come
  back as a scratch tape. Why?
04-23.  What is this ANRD error message? I don't understand it.
04-24.  I'm upgrading my TSM server/client from version X.X to version
  Y.Y Any pitfalls?
04-25.  How do I restore one client's data onto another client?
04-26.  Will my new tape library work with TSM?
04-27.  My Windows client backs up the same 3,000 files or so everyday.
  Why?
04-28.  I'm moving TSM to a new physical server. What's the best way
  to do that?
04-29.  How do I back up my NetWare NDS license files?
04-30.  What's all this fuss about cleanup backupgroups?
04-31.  I'm trying to include some files for backups, but it's not
  working. Why?
04-32.  Can I put TSM db and log volumes on raw devices?
04-33.  Why is my client backup {taking so long|running so
  slowly|sluggish}?
04-34.  I have a tape volume that Q CONTENT says is empty,
  but I can't delete the volume. Why?
04-35   I'm upgrading my TSM server from version x.x.x.x to
  y.y.y.y. What's the best way to do it?
04-36   TSM is asking me to convert my archives? Why?
04-37   What kind/how many/what configuration should I set up for
  database disks/volumes/RAIDs?
04-38   How do I move/resize my database/recoverylog volumes?
04-39   I'm moving my TSM server from operating system BrandX
  to operating system BrandY. Can I just move my database
  volumes from one machine to another? Why not?
04-40   My library is out of space. What's wrong with TSM?
04-41   What's the difference between a TSM database backup and a
  TSM database snapshot?
04-42   How can I change the retention time for an archive I've
  already created?
04-43.  Boss and/or the political situation is forcing me to move my 
  TSM server from one operating system to another. Help!
04-44.  What kind of tape drive technology should I consider for my
  TSM server?
04-45   What is the Deadly Embrace?
04-46   What does the message 'Error 2 deleting row from table 
  Expiring.Objects.' mean? Is it bad?
04-47   I've had problems using the TSAFS module on my NetWare 6.x
   client. How can I make it work?
04-48   How do I back up my SharePoint Portal database?
04-49   How do I perform both full and incremental backups of my
   database/mail server?
04-50   What kind of copy serialization is best?
04-51   What is the Eternal Triangle?

and from IBM, questions about the Tivoli web site. (Thanks for posting
these, Andy!)

05-01.  I have a Tivoli ID and an IBM.com registered ID. Which one do I 
  use for problem submission?
05-02.  The top of 

Backup/restore of links in LINUX

2005-04-01 Thread fred johanson
I've a user trying to restore a LINUX directory with links to itself. (I
confess to not understanding what is going on, but here is a listing from
the directory:
lrwxrwxrwx   1 root root 6 2005-03-29 11:00 swapoff -> swapon)
When he tries a restore from either the CLI or the GUI, swapoff is
restored, but not swapon.  I ran a select on the directory from BACKUPS,
and it appears that only the link is in TSM.  I'm wondering if the file
itself was ever backed up by TSM, or only the link.  The options file has
only the bare minimum needed to get the client to run, with no options
specified.
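
For checking on the client side whether each link still has its target on
disk, a small Python sketch along these lines might help. It only inspects
the local file system and says nothing about what TSM holds;
`describe_link` is a made-up name for this example.

```python
import os


def describe_link(path):
    """Describe a path without following symlinks."""
    if os.path.islink(path):
        target = os.readlink(path)
        # A relative target is resolved against the link's own directory;
        # os.path.join returns the target unchanged if it is absolute.
        resolved = os.path.join(os.path.dirname(path), target)
        state = "target exists" if os.path.exists(resolved) else "target missing"
        return f"{path} -> {target} ({state})"
    if os.path.lexists(path):
        return f"{path} is not a symlink"
    return f"{path} does not exist"
```

Running it over the restored directory would show at a glance which links
came back without the file they point to.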

Fred Johanson
ITSM Administrator
University of Chicago
773-702-8464


Re: Large Linux clients

2005-04-01 Thread Ben Bullock
Ya, 
Sorry, I have no answers for you, but you do have my sympathy.

I've had to do that kind of detective work before. Sometimes it
is an oddly named file or a very, very long-named file, and sometimes it's
a file that somehow got a very bizarre date, like Apr 15 1904. In a
few cases it has also been hung NFS mounts somewhere in the path.

I've had to drill down into each of the subdirs one after another, just
like you did, to figure it out, because there was no filename or other
hint in the schedule or error logs, just a generic "failed" message.

Luckily I only have to do it about once or twice a year, but it
is time consuming.
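
A rough Python sketch of that detective work, for whoever has to do it next
time. The thresholds and the `suspicious`/`scan` names are invented for
illustration, and a hung NFS mount would simply make the scan hang too, so
treat this as a starting point only.

```python
import os

MAX_NAME_BYTES = 200   # hypothetical cut-off for a "very long" name
EARLIEST_MTIME = 0     # seconds since the epoch; below this is pre-1970


def suspicious(name, mtime):
    """Return the reasons a directory entry looks odd (empty list if none)."""
    reasons = []
    if len(os.fsencode(name)) > MAX_NAME_BYTES:
        reasons.append("long name")
    if not name.isprintable():
        reasons.append("non-printable characters")
    if mtime < EARLIEST_MTIME:
        reasons.append("timestamp before 1970")
    return reasons


def scan(root):
    """Yield (path, reasons) for every odd-looking entry under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(full).st_mtime
            except OSError as err:
                yield full, [f"lstat failed: {err.strerror}"]
                continue
            reasons = suspicious(name, mtime)
            if reasons:
                yield full, reasons
```

Calling `list(scan("/some/path"))` gives a short list of candidates to
exclude one at a time, which beats drilling through every subdir by hand.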

 Ben


-Original Message-
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of
Zoltan Forray/AC/VCU
Sent: Friday, April 01, 2005 9:03 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: Large Linux clients

Thanks for the suggestion.  However, this is not true.  We already tried
this.

We ran find . | wc -l to get the object count (1.1M) with no problems.
But the backup still will not work. It constantly fails, in
unpredictable/inconsistent places, with the same Producer Thread
error.

I spent 2+ days drilling through the various sub-directories (of this
directory that causes the failures), one-by-one, and was able to back up
38 of the 40 subdirs, totalling over 980K objects, without a problem.
When I included the two other directories, in the same pile, the
backup would fail.

When I then went back and individually selected the sub-sub directories
of these sub-directories (one at a time), I was able to backup *ALL* of
the sub-sub directories, no problem.  Then I went back and selected the
upper-level directory and backed it up, no problem..

Let me draw a picture of the structure of these directories.

The problem directories are in this directory:
/coyote/dsk3/patients/prostateReOpt/Mount_0/ .

If I try to backup the /Mount_0/ as a whole, crashes every time.   If I
point to sub-dirs below /Mount_0/ (40 of these - all with the same named
4-subsub dirs ), two of these cause a crash. I noted that these two both
have 72K objects while the other 38 have less than 60K objects.

Yet when I manually picked the 4-subsub dirs of the Patient_172 dir, the
backup worked (sort of - see below). Same for the Patient_173.

To really drive me crazy, the first attempt at backing up one of the
subsub dirs under Patient_172, the backup crashed. Yet I could backup
the other 3 with no issue. So, we started looking at the problem subdir
and noticed a weird file name that ended in a tilde (~).  When I
excluded it, the backup ran. Then when I went back and picked just the
file with the tilde, it backed up fine (my head is getting
balder-and-balder !!).  I then went back and re-selected the whole
Patient_172 directory and it backed up (or at least scanned it since
everything was backed-up) just fine !!!
AGGH !!

This is maddening and shows no rhyme-or-reason.




Henk ten Have [EMAIL PROTECTED]
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
04/01/2005 08:29 AM
Please respond to
ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU


To
ADSM-L@VM.MARIST.EDU
cc

Subject
Re: [ADSM-L] Large Linux clients






An old trick I have used for many years:
to investigate a problem filesystem, do a find in that filesystem.
If the find dies, tsm definitely will die.
I'll bet your find will die, and that's why your backup will die/hang or
whatever also. A find will do a stat on all files/dirs, which is actually
the same thing the backup does.
So your issue is OS related and not tsm.
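
The same idea can be sketched in Python for those who want the failures
collected rather than just watching find die. This is an approximation
only: `stat_tree` is a made-up name, and a clean pass does not prove that
the backup will get through.

```python
import os


def stat_tree(root):
    """Walk root, lstat every entry, and return (object_count, errors)."""
    errors = []
    count = 0
    # onerror receives the OSError raised when a directory cannot be listed.
    for dirpath, dirnames, filenames in os.walk(root, onerror=errors.append):
        for name in dirnames + filenames:
            count += 1
            try:
                os.lstat(os.path.join(dirpath, name))
            except OSError as err:
                errors.append(err)
    return count, errors
```

The count excludes the starting directory itself, so it can differ by one
from what find piped into wc -l reports.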

Cheers
Henk ()

On Tuesday 29 March 2005 12:11, you wrote:
 On Mar 29, 2005, at 12:37 PM, Zoltan Forray/AC/VCU wrote:
  ...However, when I try to backup the tree at the third-level (e.g.
  /coyote/dsk3/), the client pretty much seizes immediately and
  dsmerror.log says B/A Txn Producer Thread, fatal error, Signal 11.
  The server shows the session as SendW and nothing else going on

 Zoltan -

 Signal 11 is a segfault - a software failure.
 The client programming has a defect, which may be incited by a problem
 in that area of the file system (so have that investigated). A segfault
 can be induced by memory constraint, which in this context would most
 likely be Unix Resource Limits, so also enter the command 'limit' in
 Linux csh or tcsh and potentially boost the stack size ('unlimit
 stacksize'). This is to say that the client was probably invoked under
 artificially limited environmentals.

 Richard Sims


Re: curious behavior

2005-04-01 Thread Richard Sims
On Apr 1, 2005, at 1:11 PM, Stapleton, Mark wrote:
...I *have* seen instances where a source tape, which should be mounted
R/O, is instead mounted R/W. I haven't tracked the issue because, quite
frankly, it's not much of an issue; I am usually running down more
serious problems.

I wonder if it's a situation where the tape had been mounted for a
preceding R/W operation then, during the MOUNTRetention period, along
came the next request for it, as an input volume.
  Richard Sims


3583 Meltdown

2005-04-01 Thread Curtis Stewart
Hi everyone,

Here's an interesting little piece of activity log from one of my remote
TSM servers. Gotta like TapeAlert.  Other than being an amusing way to
start my day, everything checks out OK and I can't believe it.

1. The 6AM job is a copy stgpool operation.
2. Despite the TapeAlert warnings, there were no stuck tapes.
3. L20014 wasn't snapped when visually inspected.
4. The library wouldn't mount tapes even though no drives were being used.
5. Stopped TSM, rebooted the library, started TSM and all was well again.
6. Audited each volume reported below without a single failure.

TSM Server: 5.2.4.1
OS  AIX 5.2
Library 3583 with 3 LTO2 drives (SCSI)

I don't really have a question, just thought it was odd and someone might
have insight.


Date/Time            Message
-------------------- ----------------------------------------------------------
03/31/2005 06:00:50  ANR8949E Device /dev/smc0, volume  has issued the
                      following Critical TapeAlert: The library can not
                      operate without the magazine.  1. Insert the magazine
                      into the library.  2. Restart the operation. (SESSION:
                      21550, PROCESS: 784)
03/31/2005 06:11:36  ANR8950W Device /dev/rmt1, volume L20056 has issued the
                      following Warning TapeAlert: The cartridge is not
                      data-grade.  Any data you write to the tape is at risk.
                      Replace the cartridge with a data-grade tape. (SESSION:
                      21550, PROCESS: 784)
03/31/2005 06:11:36  ANR8950W Device /dev/rmt1, volume L20056 has issued the
                      following Warning TapeAlert: The tape drive may have a
                      hardware fault.  Run extended diagnostics to verify and
                      diagnose the problem.  Check the tape drive users manual
                      for device specific instruction on running extended
                      diagnostic tests. (SESSION: 21550, PROCESS: 784)
03/31/2005 06:17:33  ANR8950W Device /dev/rmt2, volume L20066 has issued the
                      following Warning TapeAlert: The cartridge is not
                      data-grade.  Any data you write to the tape is at risk.
                      Replace the cartridge with a data-grade tape. (SESSION:
                      21550, PROCESS: 786)
03/31/2005 06:17:33  ANR8950W Device /dev/rmt2, volume L20066 has issued the
                      following Warning TapeAlert: The tape drive may have a
                      hardware fault.  Run extended diagnostics to verify and
                      diagnose the problem.  Check the tape drive users manual
                      for device specific instruction on running extended
                      diagnostic tests. (SESSION: 21550, PROCESS: 786)
03/31/2005 06:39:38  ANR8950W Device /dev/rmt4, volume L20014 has issued the
                      following Warning TapeAlert: The cartridge is not
                      data-grade.  Any data you write to the tape is at risk.
                      Replace the cartridge with a data-grade tape. (SESSION:
                      21550, PROCESS: 785)
03/31/2005 06:39:39  ANR8948S Device /dev/rmt4, volume L20014 has issued the
                      following Critical TapeAlert: The operation has failed
                      because the tape in the drive has snapped: 1. Do not
                      attempt to extract the tape cartridge.  2.  Call the tape
                      drive supplier help line. (SESSION: 21550, PROCESS: 785)
03/31/2005 06:39:39  ANR8949E Device /dev/rmt4, volume L20014 has issued the
                      following Critical TapeAlert: The tape drive needs
                      cleaning:  1. If the operation has stopped, eject the
                      tape and clean the drive.  2. If the operation has not
                      stopped, wait for it to finish and then clean the drive.
                      Check the tape drive users manual for device specific
                      cleaning instructions. (SESSION: 21550, PROCESS: 785)
03/31/2005 06:39:39  ANR8950W Device /dev/rmt4, volume L20014 has issued the
                      following Warning TapeAlert: The tape drive is due for
                      routine cleaning:  1. Wait for the current operation to
                      finish.  2. Then use a cleaning cartridge. Check the
                      tape drive users manual for device specific cleaning
                      instructions. (SESSION: 21550, PROCESS: 785)
03/31/2005 06:39:39  ANR8949E Device /dev/rmt4, volume L20014 has issued the
                      following Critical TapeAlert: The tape drive has a
                      hardware fault:  1. Eject the tape or magazine.
                      2. Reset


Re: 3583 Meltdown

2005-04-01 Thread Bill Kelly
On Fri, 1 Apr 2005, Curtis Stewart wrote:

 Here's an interesting little piece of activity log from one of my remote
 TSM servers Gotta like TapeAlert.  Other than being an amusing way to
 start my day, everything checks out OK and I can't believe it.

[...lots of bogus TapeAlerts omitted...]

And then there were three.  :-)

So you're now the third person who has posted to this list in the past
month or so with this same problem.  At least this *looks* to be the same
problem - yours is with a 3583, the other instances I'm aware of are with
3584s.  Please see recent postings with the subject 'LTO2 corrupted index
question', where Jurjen Oskam and I discussed the 'flurry of silly
TapeAlerts' problems we've been seeing on our 3584 libraries.

In my case, I sometimes see a dozen of these TapeAlerts all at the same
time, telling me the tape just snapped, the drive needs cleaning, the
drive was just cleaned but the cleaning cartridge is no good, the data
cartridge is no good, the drive has a hardware fault, etc., etc.  None of
which appear to be true.  I get no error log entries or Atape dumps at the
time of these messages; I'm current on drive and library firmware and on
Atape.

These may be harmless, or they may not be; in either case, it makes it
harder to track and deal with real problems with the tape drives.  Last
I heard, both Jurjen and I have problems open with IBM hardware support
(personally, I suspect a firmware problem, but obviously that's just an
uneducated guess). Maybe if everyone who's seeing these messages contacts
IBM support, they'll have a better chance of figuring out what's going on.
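
To tell a flurry from a single genuine alert, one could group the TapeAlert
messages by drive and minute, roughly as in this Python sketch. The pattern
is inferred from the ANR8948S/ANR8949E/ANR8950W lines posted in this
thread, and the threshold of three alerts per drive per minute is an
arbitrary assumption, as is the name `find_flurries`.

```python
import re
from collections import defaultdict
from datetime import datetime

# Pattern inferred from the activity log lines in this thread (assumption).
ALERT_RE = re.compile(
    r"(?P<when>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<msgid>ANR89(?:48S|49E|50W)) Device (?P<device>\S+), "
    r"volume(?: (?P<volume>\S+))? has issued the following "
    r"(?P<severity>Critical|Warning) TapeAlert"
)


def find_flurries(log_text, threshold=3):
    """Map (device, minute) to the message IDs seen, for bursts >= threshold."""
    # Collapse whitespace so wrapped activity-log lines match as one message.
    flat = " ".join(log_text.split())
    per_minute = defaultdict(list)
    for m in ALERT_RE.finditer(flat):
        when = datetime.strptime(m["when"], "%m/%d/%Y %H:%M:%S")
        per_minute[(m["device"], when.replace(second=0))].append(m["msgid"])
    return {key: ids for key, ids in per_minute.items() if len(ids) >= threshold}
```

A drive that reports a snapped tape, a cleaning request and a hardware
fault within the same minute is a good candidate for the bogus category,
though that is a heuristic and not a diagnosis.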

Regards,
Bill

Bill Kelly
Auburn University OIT
334-844-9917


Re: 3583 Meltdown

2005-04-01 Thread Stapleton, Mark
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On 
Behalf Of Curtis Stewart
Here's an interesting little piece of activity log from one of my remote
TSM servers Gotta like TapeAlert.  Other than being an amusing way to
start my day, everything checks out OK and I can't believe it.

1. The 6AM job is a copy stgpool operation.
2. Despite the TapeAlert warnings, there were no stuck tapes.
3. L20014 wasn't snapped when visually inspected.
4. The library wouldn't mount tapes even though no drives were being used.
5. Stopped TSM, rebooted the library, started TSM and all was well again.
6. Audited each volume reported below without a single failure.

TSM Server: 5.2.4.1
OS  AIX 5.2
Library 3583 with 3 LTO2 drives (SCSI)

I don't really have a question, just thought it was odd and someone might
have insight.

My first inclination is to not trust the TapeAlert messages'
doom-and-gloom pronouncements.

That being said, I actually don't know a lot about TapeAlert, other than
the fact that in older versions of TSM (before version 5) TSM and
TapeAlert did not work and play well together. It is possible that the
relationship has improved, since one of TSM 5.2's selling points was
improved TapeAlert support. 

It sounds like TapeAlert does something to stop TSM from performing its
operations until the something is cleared or TSM is restarted.

--
Mark Stapleton ([EMAIL PROTECTED])
IBM Certified Advanced Deployment Professional
Tivoli Storage Management Solutions 2005
Office 262.521.5627


Re: 3583 Meltdown

2005-04-01 Thread Rushforth, Tim
We did find Tape Alert messages useful in the past (told us which tapes
were affected by the corrupted index problem).

But we too were getting too many of these useless errors so we decided
to turn tape alert off and only use if we suspect a problem. (Perhaps we
should leave tape alert on but not report on any of the errors ...)

We are on TSM 5.2.2.4 on Windows with SCSI LTO1 and LTO2 in 3584's.

Tim Rushforth
City of Winnipeg

-Original Message-
From: Bill Kelly [mailto:[EMAIL PROTECTED] 
Sent: Friday, April 01, 2005 1:58 PM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: 3583 Meltdown

On Fri, 1 Apr 2005, Curtis Stewart wrote:

 Here's an interesting little piece of activity log from one of my
remote
 TSM servers Gotta like TapeAlert.  Other than being an amusing way
to
 start my day, everything checks out OK and I can't believe it.

[...lots of bogus TapeAlerts omitted...]

And then there were three.  :-)

So you're now the third person who has posted to this list in the past
month or so with this same problem.  At least this *looks* to be the same
problem - yours is with a 3583, the other instances I'm aware of are with
3584s.  Please see recent postings with the subject 'LTO2 corrupted index
question', where Jurjen Oskam and I discussed the 'flurry of silly
TapeAlerts' problems we've been seeing on our 3584 libraries.

In my case, I sometimes see a dozen of these TapeAlerts all at the same
time, telling me the tape just snapped, the drive needs cleaning, the
drive was just cleaned but the cleaning cartridge is no good, the data
cartridge is no good, the drive has a hardware fault, etc., etc.  None of
which appear to be true.  I get no error log entries or Atape dumps at the
time of these messages; I'm current on drive and library firmware and on
Atape.

These may be harmless, or they may not be; in either case, it makes it
harder to track and deal with real problems with the tape drives.  Last
I heard, both Jurjen and I have problems open with IBM hardware support
(personally, I suspect a firmware problem, but obviously that's just an
uneducated guess). Maybe if everyone who's seeing these messages contacts
IBM support, they'll have a better chance of figuring out what's going on.

Regards,
Bill

Bill Kelly
Auburn University OIT
334-844-9917


Re: How to schedule the backup?

2005-04-01 Thread William
Thanks Andy. Unfortunately I am still sitting on TSM 5.1/5.2.

The customer needs backups for weekdays, not weekends. But Monday's
backup is only allowed to start at 2:00am on Tuesday, and likewise
Friday's backup has to run at 2:00am on Saturday.

Maybe there is a workaround, but I haven't gotten a chance to try it.

1. Create a regular incremental backup schedule that starts at 23:59 on
weekdays.

2. On the client, set the preschedulecmd so that the scheduled backup
sleeps for 2 hours first. It is a Unix client, so just: sleep 7200

I will let you guys know how my test goes.
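
The arithmetic behind this workaround can be sketched as follows. This is
only an illustration, assuming the schedule fires exactly at 23:59 and the
preschedulecmd sleeps for exactly 7200 seconds; the function and constant
names are made up for the example, and it ignores any delay the scheduler
itself adds within the startup window.

```python
from datetime import datetime, timedelta

# Assumptions taken from the message above: the schedule starts at 23:59
# and the preschedulecmd is "sleep 7200" (a 2-hour delay).
SCHEDULE_START = (23, 59)
SLEEP_SECONDS = 7200


def effective_start(day: datetime) -> datetime:
    """Return when the backup itself begins for a schedule fired on `day`."""
    hour, minute = SCHEDULE_START
    scheduled = day.replace(hour=hour, minute=minute, second=0, microsecond=0)
    return scheduled + timedelta(seconds=SLEEP_SECONDS)


if __name__ == "__main__":
    monday = datetime(2005, 4, 4)  # 2005-04-04 was a Monday
    for offset in range(5):  # Monday through Friday
        fired = monday + timedelta(days=offset)
        began = effective_start(fired)
        print(fired.strftime("%a"), "schedule ->", began.strftime("%a %H:%M"))
```

Under these assumptions a weekday schedule started at 23:59 begins moving
data at about 01:59 the following day, so Monday's schedule backs up early
on Tuesday and Friday's early on Saturday, which is the Tue-Sat pattern
asked for.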

On Mar 31, 2005 4:30 PM, Andrew Raibeck [EMAIL PROTECTED] wrote:
 As has already been mentioned, TSM 5.3 has an enhanced schedule feature
 that allows you to do this with one schedule.

 Otherwise you will need to define 5 schedules, one for each day the event
 should run. While it might take slightly more effort to set up 5 schedules
 instead of 1, once they are defined, you're done.

 While admin schedules can be used to define and delete client schedules,
 you'll lose prior event information when the schedules are deleted.

 Regards,

 Andy

 Andy Raibeck
 IBM Software Group
 Tivoli Storage Manager Client Development
 Internal Notes e-mail: Andrew Raibeck/Tucson/[EMAIL PROTECTED]
 Internet e-mail: [EMAIL PROTECTED]

 The only dumb question is the one that goes unasked.
 The command line is your friend.
 Good enough is the enemy of excellence.

 ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU wrote on 2005-03-31
 13:01:11:

  I want to setup one client schedule which only starts at
  Tue/Wed/Thu/Fri/Sat. How can I do it? TIA.
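
For servers older than 5.3, the five-schedule approach Andy describes above
could be scripted roughly like this. It is a sketch only: the domain name
STANDARD, the INCR_ schedule-name prefix and the 02:00 start time are
placeholders, and the DEFINE SCHEDULE parameter spellings are written from
memory, so check them against the Administrator's Reference for your
server level before running anything.

```python
# Days on which the client schedule should start, per the original question.
DAYS = ["Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]


def define_schedule_commands(domain: str, prefix: str, start_time: str = "02:00") -> list:
    """Build one weekly DEFINE SCHEDULE command per day (hypothetical names)."""
    return [
        f"define schedule {domain} {prefix}_{day[:3].upper()} "
        f"action=incremental starttime={start_time} "
        f"period=1 perunits=weeks dayofweek={day.lower()}"
        for day in DAYS
    ]


if __name__ == "__main__":
    # STANDARD and INCR are placeholder domain and schedule-name values.
    for command in define_schedule_commands("STANDARD", "INCR"):
        print(command)
```

The printed lines can be pasted into an administrative client session or
saved as a macro. Each client node still has to be associated with all
five schedules.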



Licensing again

2005-04-01 Thread Joe Crnjanski
Hello,

If I have an Exchange server, and I'm backing up only the Exchange
database, no file backup, do I need a TDP license plus a server license,
or just a TDP license?

Regards,

Joe Crnjanski
Infinity Network Solutions Inc.
Phone: 416-235-0931 x26
Fax: 416-235-0265
Web: www.infinitynetwork.com
 


Re: Licensing again

2005-04-01 Thread Stapleton, Mark
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On 
Behalf Of Joe Crnjanski
If I have an Exchange server, and I'm backing up only the Exchange
database, no file backup, do I need a TDP license plus a server license,
or just a TDP license?

Being that you cannot run TSM for Mail (Exchange) unless you install the
TSM client as well, I suspect you'll need to license both. Talk to your
Tivoli reseller; they should be able to tell you.

--
Mark Stapleton ([EMAIL PROTECTED])
IBM Certified Advanced Deployment Professional
Tivoli Storage Management Solutions 2005
Office 262.521.5627  


Re: Licensing again

2005-04-01 Thread William
Yes, he definitely needs both. One is for TDP for Exchange Server, the
other is for the TSM client. Both of them must be registered on the TSM
server.

On Apr 1, 2005 9:48 PM, Stapleton, Mark [EMAIL PROTECTED] wrote:
 From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On
 Behalf Of Joe Crnjanski
 If I have an Exchange server, and I'm backing up only the Exchange
 database, no file backup, do I need a TDP license plus a server license,
 or just a TDP license?

 Being that you cannot run TSM for Mail (Exchange) unless you install the
 TSM client as well, I suspect you'll need to license both. Talk to your
 Tivoli reseller; they should be able to tell you.

 --
 Mark Stapleton ([EMAIL PROTECTED])
 IBM Certified Advanced Deployment Professional
 Tivoli Storage Management Solutions 2005
 Office 262.521.5627



Remove from list

2005-04-01 Thread Dan B
Please remove me from your mailing list.
  Thanks, Dan