Item x: Deletion of Disk-Based Volumes
Date:   Nov 25, 2005
Original: Ross Boylan <RossBoylan at stanfordalumni dot org>
Status: Proposal

What:  It would be useful to control how long the actual backups were
kept for those backups that went to disk-based volumes.  A range of
options similar to those currently available for retaining catalog
information would be useful.  An additional option to permit deletion
of the actual backup when the record of the associated job and/or
files is purged from the catalog would also be useful.  However, it
should be possible for the actual backups to be retained for more or
less time than the associated catalog records.

A particular volume should be deleted only when all jobs it contains
are OK to delete.
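
As a rough illustration of this rule (not Bacula code), the check could
look like the sketch below.  The Job record, its end_time field, and the
retention argument are hypothetical names introduced here, and "OK to
delete" is assumed to mean that a job's retention period has expired.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable


@dataclass
class Job:
    """Hypothetical stand-in for one job stored on a volume."""
    job_id: int
    end_time: datetime


def job_expired(job: Job, retention: timedelta, now: datetime) -> bool:
    # Assumption: a job is OK to delete once its retention period has passed.
    return now - job.end_time >= retention


def volume_deletable(jobs: Iterable[Job], retention: timedelta,
                     now: datetime) -> bool:
    # The volume may be deleted only when every job on it is OK to delete.
    # A volume with no jobs is treated as deletable (all() of nothing is True).
    return all(job_expired(job, retention, now) for job in jobs)
```

A single unexpired job is enough to keep the whole volume on disk.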

A possible extension would be to delete backup volumes when the total
size or number of backup volumes on disk exceeds a certain threshold.
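
A minimal sketch of such a threshold check follows.  The Volume record,
the parameter names, and the oldest-first policy are all assumptions
made for illustration; the proposal does not say which volumes should
go first.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Volume:
    """Hypothetical description of one disk-based volume."""
    name: str
    size_bytes: int
    last_written: float  # seconds since the epoch


def select_for_deletion(volumes: List[Volume],
                        max_total_bytes: Optional[int] = None,
                        max_count: Optional[int] = None) -> List[Volume]:
    # Assumed policy: drop the oldest volumes first until both limits hold.
    remaining = sorted(volumes, key=lambda v: v.last_written)
    total = sum(v.size_bytes for v in remaining)
    chosen: List[Volume] = []
    while remaining and (
        (max_total_bytes is not None and total > max_total_bytes)
        or (max_count is not None and len(remaining) > max_count)
    ):
        oldest = remaining.pop(0)
        total -= oldest.size_bytes
        chosen.append(oldest)
    return chosen
```

The function only selects candidates; whether they may actually be
removed would still depend on the retention rules discussed above.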

Why: Bacula currently manages only entries in its catalogs.
Historically, this arose because it is based on a model of backup to
tape.  With disk-based backups, each volume may be online with a
unique name (e.g., one with the date embedded in it).  These need to
be removed periodically to avoid filling the disk.  Manual or automatic
methods of doing so are available outside of Bacula, but it would be
convenient if Bacula could manage this itself.

In particular, methods from outside of Bacula will likely be unable to
reproduce Bacula's precise logic for when to purge things; only a
method built into Bacula can do so.

Finally, one possible implementation would be to reinterpret the
meaning of "recycle" for disk-based volumes.  I, and apparently
others, expected that recycling would delete old disk-based volumes
rather than reuse them.  This alternate implementation might reduce
that confusion.

Notes: The desired behavior depends on the model used for naming
disk-based volumes.  If constant names (e.g., "FridayPool") are in
use, recycling as it now stands is appropriate.  But with variable
names, ones with date or sequence numbers embedded in the volume name,
the features indicated above would be useful.

It is unclear what the appropriate behavior should be if the volume to
be deleted is not found.  Probably a warning is appropriate.  If the
operator is deleting volumes after moving them to a more permanent
medium, the absence of a volume may not be a problem at all.
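
The "warn but carry on" behavior suggested here might be sketched as
follows.  The function name and logger name are hypothetical; only the
standard os and logging modules are used.

```python
import logging
import os

log = logging.getLogger("volume_cleanup")  # hypothetical logger name


def delete_volume_file(path: str) -> bool:
    """Remove a volume file; warn instead of failing if it is already gone."""
    try:
        os.remove(path)
    except FileNotFoundError:
        # The operator may have moved the volume elsewhere, so a missing
        # file is reported as a warning rather than treated as an error.
        log.warning("volume file %s not found; nothing to delete", path)
        return False
    return True
```

The return value lets the caller tell a real deletion from a missing
file without having to treat the latter as a failure.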

I requested the ability to control the lifetime of the volume and the
catalog entries separately primarily because I anticipate wanting to
preserve the catalog entries so I know what files were on my system at
particular times.  Since all necessary information can be recovered
from the volume (I think), one could also use the volumes without the
catalog entries.  An alternate, simpler, implementation would consider
the job on the volume suitable for deletion when its corresponding job
or files entries in the catalog were deleted.

A possible implementation would be to provide events for python
scripts, rather than, or as well as, keywords in the configuration
file.

Thus, at least 3 possible avenues for implementation occur:
1) reinterpretation of the volume Recycling arguments to mean delete
rather than reuse (maybe with a new keyword controlling this behavior);
2) creation of a new set of keywords analogous to existing ones
governing lifetimes of things in the catalog;
3) an event-based framework for scripts.
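
To make option 3 a little more concrete, here is a purely hypothetical
sketch of what an event hook could look like from the script's side.
None of these names (EventRegistry, the "VolumeExpired" event) come from
Bacula; they are invented to show the shape of the idea.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str], None]


class EventRegistry:
    """Hypothetical registry mapping event names to script callbacks."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def on(self, event: str, handler: Handler) -> None:
        # Register a callback to run when the named event fires.
        self._handlers[event].append(handler)

    def fire(self, event: str, volume_name: str) -> None:
        # Call every handler registered for this event, in order.
        for handler in self._handlers.get(event, []):
            handler(volume_name)


# Example: a site-specific script decides what to do with expired volumes.
expired: List[str] = []
registry = EventRegistry()
registry.on("VolumeExpired", expired.append)
registry.fire("VolumeExpired", "Daily-2005-11-25")
```

With hooks of this kind, the deletion policy would live in the site's
script rather than in configuration keywords.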

Because alternate solutions outside of Bacula are straightforward and
usually adequate, this feature does not seem particularly urgent.


_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
