Re: [Veritas-bu] Number of retries query

2009-04-07 Thread Dave Markham
Thanks guys, there are some useful options there.

To give more info, we run the RMAN job as follows :-

-We have an Oracle admin station which holds various Oracle DBA scripts.
-We have a policy which controls the scheduling and kicks off a client 
backup of this Oracle management station, backing up a single file to a 
disk storage unit on the master (a simple directory). We back up one file 
to avoid any status 71.
-The reason the policy and schedule are there is to run a bpstart script 
on the management station.
-This bpstart script checks that no Oracle tape DBA script is already 
running (if one is, it exits non-zero, which gives status 73 in NetBackup).
-Once the checks are passed it launches an Oracle DBA script (not 
maintained by me).
-This Oracle script talks to 3 Oracle RAC servers and works out which 
one is running the particular DB instance.
-These Oracle RAC servers are all NetBackup media servers, and they then 
initiate the Oracle backup through a NetBackup Oracle agent on the 
relevant media server. This backs up using the application schedules on 
the master server for the policy associated with each media server 
(sorry, that sounds confusing).
-If the Oracle script fails and exits non-zero, then in turn our 
bpstart script fails with status 73 and we can alert the DBAs.

We want to launch via NetBackup this way so we can trap the exit status 
and report to the DBAs that there has been a problem, plus have it appear 
on a daily report.

The case we have experienced is that when a backup fails, which could be 
due to no tapes or various Oracle failures, the DBAs don't want an 
automatic retry, as it starts doing things with flash recovery areas 
and runs into the normal working day.

Indeed, perhaps some logic in the bpstart script to create a lock file 
would be useful, but the lock file would need to be removed upon 
completion or failure, and so would give us no benefit when try 2 happens.

If a lock file were used, we could do some date matching and perhaps only 
run a job if the lock file was older than x hours (a lot of date parsing, 
though, which could be difficult), then touch it again and run the backup. 
I'll have to explore this method.
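
The lock file age check can probably be done without any date parsing. The following is only a minimal sketch, assuming the client's find supports -mmin (GNU and BSD find do, but it is not in POSIX); the function name and the lock file path in the usage note are made up for illustration.

```shell
# Hypothetical helper: returns 0 when a backup may start, i.e. when the lock
# file is missing or was last touched more than max_age_min minutes ago.
lock_allows_run() {
    lock=$1
    max_age_min=$2
    # No lock file yet: nothing has run, so allow the backup.
    [ -e "$lock" ] || return 0
    # find prints the path only when the file's mtime is older than the limit,
    # so a non-empty result means the lock has expired.
    [ -n "$(find "$lock" -mmin +"$max_age_min" 2>/dev/null)" ]
}
```

In a bpstart script this might be used as `lock_allows_run /tmp/rman_backup.lock 720 || exit 1`, followed by `touch /tmp/rman_backup.lock` before launching the backup (720 minutes = 12 hours). Because the lock is never removed, try 2 finds a fresh lock and exits.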

Cheers






Re: [Veritas-bu] Number of retries query

2009-04-07 Thread Dave Markham
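And if it's of use to anyone else, here is what I shall implement :-

# $client, $policy and $log are assumed to be set earlier in the bpstart script.
# Field numbers assume bperror long (-l) output: $12 client, $14 policy, $19 status.
Prev_Job=`bperror -backstat -l -client "$client" -hoursago 12 |
awk -v policy="$policy" '$14 == policy { print "Client [" $12 "], STATUS [" $19 "]" }'`
if [ -n "$Prev_Job" ]; then
echo "ERROR: A previous job has run in the past [$hours] hours" >> "$log"
echo "$Prev_Job" >> "$log"
exit 1
fi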


ken_zuf...@goodyear.com wrote:

 Wow, glad I don't have your job...that's pretty convoluted :P

 But may have an answer, building off the lock file idea...but much 
 simpler.  Just put the logic in the bpstart to check and see if the 
 policy you're executing has run in the past X hours and failed...if it 
 has, exit gracefully, if it hasn't, continue the backup.  

 Quick and dirty logic:

 bperror -backstat -hoursago [hours] -l | awk '{print $19,$14}' | grep -v '^0 ' | grep [policy_name]

 In the above, $19 = backup status code, $14 = policy name.  Strip out 
 any successful backups, grep for the policy name...if it's not null, 
 you've had a failure in the past X hours.

 Of course, there are different ways to parse the bperror output, but 
 the above would work.  In fact, you shouldn't even have to grep out 
 successes because the process shouldn't be trying to submit the policy 
 if it's run successfully.
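
 The same filter can be done in a single awk pass. This is only a sketch under the field layout described above ($14 = policy name, $19 = status code in bperror -l output); the function name and the policy name in the usage note are hypothetical.

```shell
# Reads bperror long-format (-l) lines on stdin and prints "policy status"
# for every job of the given policy that finished with a non-zero status.
failed_jobs_for_policy() {
    awk -v policy="$1" '$14 == policy && $19 != 0 { print $14, $19 }'
}
```

 Usage would be along the lines of `bperror -backstat -hoursago 12 -l | failed_jobs_for_policy my_oracle_policy`; empty output means no failure in the window. Comparing $14 exactly avoids grep matching a policy whose name merely contains the one you want.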

 Ken Zufall
 Technical Analyst
 D660C
 The Goodyear Tire & Rubber Company
 GTN 446.0592 or 330.796.0592




Re: [Veritas-bu] Number of retries query

2009-04-07 Thread Dave Markham
Dave Markham wrote:

Sorry correction :-

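hours=12
# $client, $policy and $log are assumed to be set earlier in the bpstart script.
# Field numbers assume bperror long (-l) output: $12 client, $14 policy, $19 status.
Prev_Job=`bperror -backstat -l -client "$client" -hoursago $hours |
awk -v policy="$policy" '$14 == policy { print "Client [" $12 "], STATUS [" $19 "]" }'`
if [ -n "$Prev_Job" ]; then
echo "ERROR: A previous job has run in the past [$hours] hours" >> "$log"
echo "$Prev_Job" >> "$log"
exit 1
fi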




[Veritas-bu] Number of retries query

2009-04-06 Thread Dave Markham
Guys, does anyone know if you can change the number of job retries in xx 
time period on a per-client basis?

I currently have the global set at 2 tries per 12 hours, which is fine 
for our needs and good in that it will retry a failed backup.

However, the DBA for an RMAN/Oracle policy doesn't want a backup to be 
re-run if there is a failure, so I need to try and 
find a way of setting it to 1 try for just one client.

Any ideas?

Cheers
___
Veritas-bu maillist  -  Veritas-bu@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu


Re: [Veritas-bu] Number of retries query

2009-04-06 Thread Len Boyle
Good Morning Dave,

I know of no way to change the number of job retries on a policy, client, or
schedule object.
I can see where this would be a nice feature to have.

There are many different reasons that an RMAN backup job can fail.

From the NetBackup end of things one could have a 96 error (no scratch tapes),
a media fault, a network issue, etc.
Or it could be an Oracle issue.

For something like a media issue that is cleared up on the NetBackup end of
things, I would think that the DBAs would want the backup to be retried. For an
Oracle issue I do not know enough.

But either way, I believe that you could add the control you require into the
script that NetBackup runs on the client to run the RMAN commands. Might not be
easy.

I am sure others who know Oracle can give you a better answer than this, and I
look forward to learning.
As a simple case of go or no-go, without any variance based on the prior failure,
you could try the following.
At the beginning of the script you could write a state value of STARTED into a
file on the client. At the end of the script the value could be changed to
COMPLETE.
At the start of the script, if the value is not COMPLETE, the script could give
an error return and exit. Someone would have to change the state value back to
COMPLETE to enable the script to run again. This could be done after clearing the
problem. This can also be used to bypass the running of the backup at the
script level when the Oracle DBAs are doing maintenance work on the Oracle
database. If you use and check for some state value of BYPASS, then the script
could exit with a normal return code and NetBackup would not have a backup but
would think that everything is OK and not retry.
You could also use touch files instead of one state file.

Let us know what you end up doing to solve this issue.
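
The state file idea could look something like the sketch below. It is illustrative only: the function names, the return codes, and the choice to treat a missing file as COMPLETE (so the very first run is allowed) are assumptions, not anything NetBackup provides.

```shell
# Hypothetical state-file gate for the start of the backup script.
# Returns 0 = run the backup, 1 = previous run never completed (refuse),
#         2 = BYPASS requested (skip, but report success to NetBackup).
check_state() {
    state_file=$1
    state=COMPLETE                      # assumed default when no file exists
    if [ -f "$state_file" ]; then
        state=$(cat "$state_file")
    fi
    case $state in
        COMPLETE) echo STARTED > "$state_file"; return 0 ;;
        BYPASS)   return 2 ;;
        *)        return 1 ;;
    esac
}

# Call at the very end of a successful run.
mark_complete() {
    echo COMPLETE > "$1"
}
```

After a failure the file still says STARTED, so the retry exits with an error until someone resets it by hand, for example with `mark_complete /path/to/statefile`.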

len



Re: [Veritas-bu] Number of retries query

2009-04-06 Thread ken_zufall
Dave,

This isn't an ideal fix, but it will work: schedule the backups from the 
client.  Basically, just put entries in cron (root or oracle will work) 
with the commands (or a script wrapper around the command) to launch the 
backup instead of using the NBU scheduler (you will have to remove the current 
full/incremental schedules and replace them with a user-directed schedule that 
has the appropriate windows).  The reason this will work is that the automatic 
retries only affect backups launched from the master...if it's submitted 
by the client, it will not retry on failure.

Only real issues off the top of my head are:

1) If client is down or doesn't have network connectivity, you won't see 
failure to run backup in NBU because the backup will never be submitted.

2) You lose visibility to backup schedules within NBU.

Ken Zufall
Technical Analyst
D660C
The Goodyear Tire & Rubber Company
GTN 446.0592 or 330.796.0592



