Re: Failing incremental Oracle RMAN backups

2005-01-24 Thread Andreas Almroth
Hi all,
Following up on my email about failing Oracle backups: we have probably
found the problem, and it was indeed a misconfiguration.
When going through all configuration files again we realised that
COMMTIMEOUT and IDLETIMEOUT were only set on the TSM server and not in
the storage agent's dsmsta.opt files. Once we added these, we seem to
have working incremental backups.
So, in summary: COMMTIMEOUT and IDLETIMEOUT should be set for the SAN
storage agent as well as on the TSM server.
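For anyone who wants to check their own setup for the same omission, here
is a minimal Python sketch. It assumes the options files (dsmserv.opt on
the server, dsmsta.opt on each storage agent) use the usual
"OPTIONNAME value" layout with "*" starting a comment line, and that the
option names are written out in full rather than abbreviated. The helper
names parse_options and missing_timeouts are my own and not part of any
TSM tooling.

```python
# Hypothetical helper: report which timeout options are absent from the
# text of a TSM options file (dsmserv.opt or dsmsta.opt).
# Assumption: options appear as "NAME value", one per line, unabbreviated.

REQUIRED = ("COMMTIMEOUT", "IDLETIMEOUT")


def parse_options(text):
    """Return {OPTION_NAME: value} for every option line in the text."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("*"):  # skip blanks and comments
            continue
        parts = line.split(None, 1)
        name = parts[0].upper()  # option names are case-insensitive here
        value = parts[1].strip() if len(parts) > 1 else ""
        options[name] = value
    return options


def missing_timeouts(text):
    """Return the required timeout options that the text does not set."""
    options = parse_options(text)
    return [name for name in REQUIRED if name not in options]
```

Reading each dsmsta.opt and passing its contents to missing_timeouts()
gives an empty list when both options are present. The sketch only checks
that the options are set, not that the values match those on the server.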
Thanks,
Andreas


Failing incremental Oracle RMAN backups

2005-01-14 Thread Andreas Almroth
Hi all,
I have been browsing through the archive and haven't found anything on
my particular problem.
Setup:
TSM 5.2.1.2 server on Solaris 9
Oracle TDP 5.2.0 32-bit and 64-bit
TSM 5.2.2 client on Solaris 9
Oracle 9.2.0.5 32-bit
Oracle 9.2.0.5 64-bit
Oracle 8.1.7.4 32-bit
StorageAgent 5.2 (for LAN free)
HP MSL 6060 library with LTO2 drives
Story:
We have implemented RMAN backups straight to tape using the LAN-free
option on various servers with various versions of Oracle.
We are having problems with incremental backups on smaller databases
(100-150GB) using both 8.1.7 and 9.2 of Oracle. We have altered
COMMTIMEOUT to 4800 and IDLETIMEOUT to 60 in order to avoid cancelled
sessions while RMAN is looking for changed blocks.
On a busier and larger database (1.2TB) we do not see the same problem,
as I suspect RMAN continuously writes to tape.
I have noticed that during an incremental backup on a smaller database
the mounted LTO2 volume goes idle (QUERY MOUNT), and it is after this that
we receive a failure from the MML when it tries to start writing again.
When we run full backups everything works as expected.
I suspect there is a problem when writing to the volume stops and then
has to resume later.
I'm not sure which information I should provide to better describe the
situation, but below are a few extracts that may be of interest.
First, the error in RMAN:
RMAN-00571: ===
RMAN-00569: === ERROR MESSAGE STACK FOLLOWS ===
RMAN-00571: ===
RMAN-03015: error occurred in stored script b_incr_backup
RMAN-03015: error occurred in stored script b_incr_1
RMAN-03009: failure of backup command on t1 channel at 01/14/2005 09:36:37
ORA-19502: write error on file "flagsrv3_incr_CCB_729_1_547549821_1", blockno 1 (blocksize=512)
ORA-27030: skgfwrt: sbtwrite2 returned error
ORA-19511: Error received from media manager layer, error text:
  ANS1235E (RC-72)  Unknown system error
It always happens on block no 1.
sbtio.log:
===
Tracing started for:
---
  Application Client :   TDP Oracle SUN
 Version :   5.2.0.0
===
SBT-15692 01/14/2005 09:36:33 send2.cpp(383): sbtwrite2(): Exit - DSMSENDDATA() failed. dsmHandle = 1
SBT-15692 01/14/2005 09:36:33 send2.cpp(383): sbtwrite2(): Exit - DSMSENDDATA() failed. dsmHandle = 1
SBT-15692 01/14/2005 09:36:33 send2.cpp(383): sbtwrite2(): Exit - DSMSENDDATA() failed. dsmHandle = 1
SBT-15692 01/14/2005 09:36:33 send2.cpp(383): sbtwrite2(): Exit - DSMSENDDATA() failed. dsmHandle = 1
On the TSM server the following log entry is created:
01/14/05   11:36:56  ANE4994S (Session: 80103, Node: FLAGSRV3_TDP)
  TDP Oracle SUN ANU0599 ANU2602E The object
  /flag3_db//flagsrv3_incr_CCB_729_1_547549821_1
  was not found on the TSM Server
No other error or warning messages were found.
I'm not sure what is going wrong: whether it is a configuration issue, a
software bug, or a firmware issue.
So any help or pointers to information would be greatly appreciated.
Best regards,
Andreas