Can someone explain the use of logged archive backup?

2012-03-29 Thread Bergquist, Brett
I need to protect against a media crash and it is not so important to go back 
to specific periods of time for the database.   I have read the documentation 
on the Derby Admin guide but am still confused on the use of 
SYSCS_UTIL.SYSCS_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE.

>From what it says, a copy of the last full backup, the archived logs, and the 
>active logs are needed to restore.   I am a little confused here on the 
>archived logs and the active logs.


* Is there something special that should be done with the archived 
logs?   Or is it that the log file location needs to be on another media?

* When SYSCS_UTIL.SYSCS_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE is 
used, are all transactions from that point on logged forever until the next 
full backup is done?   Is this more expensive in terms of disk space usage than 
just doing an SYSCS_UTIL.SYSCS_BACKUP_DATABASE periodically

* Is the difference between SYSCS_UTIL.SYSCS_BACKUP_DATABASE and 
SYSCS_UTIL.SYSCS_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE that with the 
SYSCS_UTIL.SYSCS_BACKUP_DATABASE I can restore to that point in time, but with 
SYSCS_UTIL.SYSCS_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE (assuming the 
backup and log files are on a separate media), that I can recover to the last 
transaction?

Another question is the time to perform a roll-forward recovery if 
SYSCS_UTIL.SYSCS_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE is used.   Might 
it not be better to just have a separate instance of Derby available and use 
database replication.  Would this not provide a fully functional database up to 
just about the last transaction so recovery would be a matter or copying the 
replicated database to the system and restarting?




Re: Can someone explain the use of logged archive backup?

2012-03-29 Thread Bryan Pendleton

confused here on the archived logs and the active logs.


In general, there can be multiple logs covering the time between one
backup and the next backup, and those logs must be applied, serially,
in the correct order, to recover the database fully.

Once you take that "next" backup, you no longer need the previous
backup/logs, though it's probably wise to establish a reasonable
holding period depending on your resources (e.g., backup weekly, keep
3 months of backups and logs, destroy the oldest set when you complete
the most recent backup, etc.)


I need to protect against a media crash and it is not so important to go back 
to specific periods of time for the database.


Perhaps you can entirely use lower-level mechanisms, then, such as
RAID or other redundant storage hardware, or a modern filesystem which
automatically replicates the underlying data against the failure of
the storage, such as ZFS (http://en.wikipedia.org/wiki/ZFS)

I think it's still wise to have an application-level backup strategy,
because sometimes logical recovery is necessary (e.g., to recover from
an application bug or an administrative mistake), so I think that the
exercise you're going through about documenting your backup and
recovery strategies is an excellent one.

And don't forget to test those backup/restore practices, since an
untested restore is no better than no restore at all.

I've found that one useful technique is to provision a secondary
machine, which can be MUCH smaller in terms of CPU, memory, networking,
etc., and just has to have enough disk space, and automate things so
that every time I take a backup, my scripts automatically copy the
backup to this spare machine, restore the backup and apply all the
logs, and then run a few queries to satisfy myself that the database
is correctly recovered.

thanks,

bryan


RE: Can someone explain the use of logged archive backup?

2012-03-29 Thread Bergquist, Brett
Thanks Bryan, but I am a little dense on this one and just want to understand 
the proper procedure to use 
"SYSCS_UTIL.SYSCS_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE" so that I can 
either consider it or not.  

On one system that uses ZFS, we will probably use the 
SYS_UTIL.SYSCS_FREEZE_DATABASE to freeze, perform an ZFS snapshot, and 
SYS_UTIL.SYSCS_FREEZE_DATABASE to unfreeze the database.   I am concerned of a 
failure in the code between freezing the database and unfreezing the database 
as I believe if the connection is lost, the database will remain frozen with no 
way to unfreeze.  I need to test that and maybe patch Derby to perform an 
unfreeze if the connection is lost. I have had this happen when I used a script 
to call IJ to do the freeze, the shell to do the ZFS snapshot, and IJ again to 
unfreeze.  The last IJ to unfreeze could not get a connection and could not 
unfreeze.  And the database was locked up and could not even be shutdown 
cleanly.

Another system is an older system that is running UFS so this will not be 
possible.

Back to "SYSCS_UTIL.SYSCS_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE".  So the 
procedure would be:

- use this procedure to create a backup of the database and enable log 
archiving.  Presumably the database would be created on some other media.
- if the log directory has not been moved with the from 
"/log", what does one do with this?  Periodically copy the file in 
there to the other media?  How does one determine the "archive" logs from the 
"active" logs used for crash recovery or is even necessary.
- time goes on and another full backup is done.  Does one use 
SYSCS_UTIL.SYSCS_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE again?  I assume 
that the new archive logs need to be kept with this backup now, right?
- I guess I confused by the archive logs and where they are located, 
what they look like, and what to do with them

Now there is a disk crash that contains the database, a new disk is put in 
place, formatted, etc. and is available to the OS.  So the procedure to recover 
this is to issue a connect like:

- connect 'jdbc:derby:;rollForwardRecoveryFrom=';

Is that correct?  Is that any different than physically copying the database 
and all of the logs to the new disk simply connecting like:

- connect 'jdbc:derby:'


Any light you can shed on this will be helpful.  I think the Administration 
guide should go through this.  I t starts to, but the part about handling the 
"archive logs" is not really presented.



-Original Message-
From: Bryan Pendleton [mailto:bpendleton.de...@gmail.com] 
Sent: Thursday, March 29, 2012 10:31 AM
To: Derby Discussion
Subject: Re: Can someone explain the use of logged archive backup?

> confused here on the archived logs and the active logs.

In general, there can be multiple logs covering the time between one backup and 
the next backup, and those logs must be applied, serially, in the correct 
order, to recover the database fully.

Once you take that "next" backup, you no longer need the previous backup/logs, 
though it's probably wise to establish a reasonable holding period depending on 
your resources (e.g., backup weekly, keep
3 months of backups and logs, destroy the oldest set when you complete the most 
recent backup, etc.)

> I need to protect against a media crash and it is not so important to go back 
> to specific periods of time for the database.

Perhaps you can entirely use lower-level mechanisms, then, such as RAID or 
other redundant storage hardware, or a modern filesystem which automatically 
replicates the underlying data against the failure of the storage, such as ZFS 
(http://en.wikipedia.org/wiki/ZFS)

I think it's still wise to have an application-level backup strategy, because 
sometimes logical recovery is necessary (e.g., to recover from an application 
bug or an administrative mistake), so I think that the exercise you're going 
through about documenting your backup and recovery strategies is an excellent 
one.

And don't forget to test those backup/restore practices, since an untested 
restore is no better than no restore at all.

I've found that one useful technique is to provision a secondary machine, which 
can be MUCH smaller in terms of CPU, memory, networking, etc., and just has to 
have enough disk space, and automate things so that every time I take a backup, 
my scripts automatically copy the backup to this spare machine, restore the 
backup and apply all the logs, and then run a few queries to satisfy myself 
that the database is correctly recovered.

thanks,

bryan