Re: [zfs-discuss] GSoC 09 zfs ideas?

Mike Gerdts Sat, 28 Feb 2009 05:27:08 -0800

On Sat, Feb 28, 2009 at 1:20 AM, Richard Elling
<richard.ell...@gmail.com> wrote:
> David Magda wrote:
>> On Feb 27, 2009, at 20:02, Richard Elling wrote:
>>> At the risk of repeating the Best Practices Guide (again):
>>> The zfs send and receive commands do not provide an enterprise-level
>>> backup solution.
>>
>> Yes, in its current state; hopefully that will change some point in the
>> future (which is what we're talking about with GSoC--the potential to change
>> the status quo).
>
> I suppose, but considering that enterprise backup solutions exist,
> and some are open source, why reinvent the wheel?
> -- richard


The default mode of operation for every enterprise backup tool that I
have used is file-level backup.  These tools appear to decide which
files need to be backed up by crawling the file system, looking for
files that have an mtime later than the previous backup.
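
To make the crawl concrete, here is a minimal sketch of mtime-based
selection.  It is an illustration only, not the code of any particular
backup product; the function name and arguments are made up for this
example.

```python
import os

def files_changed_since(root, last_backup_time):
    """Return paths under root whose mtime is later than last_backup_time.

    last_backup_time is seconds since the epoch.  Every file has to be
    stat'ed, even if nothing changed, which is why this approach is slow
    on file systems with many files and little change.
    """
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                # File vanished between readdir and stat; skip it.
                continue
            if st.st_mtime > last_backup_time:
                changed.append(path)
    return sorted(changed)
```

Note that a file whose mtime has been set back (for example by a tool
that restores timestamps) is never selected, and a deleted file leaves
no trace at all.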

Areas of strength for such tools include:

- Works with any file system that provides a POSIX interface
- Restore of a full backup is an accurate representation of the data backed up
- Restore can happen to a different file system type
- Restoring an individual file is possible

Areas of weakness include:

- Extremely inefficient for file systems with lots of files and little change
- Restore of full + incremental tends to bring back files that were
deleted between backups, because support for tracking deletions is
spotty or carries a performance overhead that discourages its use
- Large files that have blocks rewritten get backed up in full each time
- Restores of file systems with lots of small files (especially in one
directory) are extremely slow
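
As a rough illustration of the large-file point, the sketch below
compares two versions of a file block by block and reports only the
blocks that differ.  This is a generic technique shown under assumed
names and an assumed 4096-byte block size; it is not a description of
how zfs send or any backup product finds changed blocks.

```python
import hashlib

def block_digests(data, block_size):
    """Split data into fixed-size blocks and return one digest per block."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]

def changed_blocks(old, new, block_size=4096):
    """Return indexes of blocks in new that differ from, or do not exist in, old."""
    old_d = block_digests(old, block_size)
    new_d = block_digests(new, block_size)
    changed = []
    for i, digest in enumerate(new_d):
        if i >= len(old_d) or old_d[i] != digest:
            changed.append(i)
    return changed
```

A file-level tool would copy the whole file if a single byte changed; a
block-level incremental only needs the blocks returned here.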

There exist features (sometimes expensive add-ons) that deal with some
of these shortcomings via:

- Keeping track of deleted files so that a restore is more
representative of what is on disk during the incremental backup.
Administration manuals typically warn that this has a big performance
and/or size overhead on the database used by the backup software.
- Including add-ons that hook into other components (e.g. VxFS storage
checkpoints, Oracle RMAN) that provide something similar to
block-level incremental backups

Why re-invent the wheel?

- People are more likely to have snapshots available for file-level
restores, and as such a "zfs send" data stream would only be used in
the event of a complete pool loss.
- It is possible to provide a general block-level backup solution so
that every product doesn't have to invent it.  This gives ZFS another
feature benefit to put it higher in the procurement priority.
- File creation slowness can likely be avoided, allowing restores to
happen at tape speed
- To be competitive with NetApp "snapmirror to tape"
- Even having a zfs(1M) option that could list the files that change
between snapshots could be very helpful to prevent file system crawls
and to avoid being fooled by bogus mtimes.
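
To show what the output of such an option might look like, here is a
small sketch that compares two per-snapshot manifests.  No such zfs
option is assumed to exist; the manifest format (a dict mapping path to
some fingerprint such as a checksum) and the function name are
hypothetical, and a real implementation would presumably work from file
system metadata rather than from prebuilt manifests.

```python
def diff_manifests(old, new):
    """Compare two manifests mapping path -> fingerprint.

    Returns three sorted lists: created, modified, deleted.
    The fingerprint can be anything comparable, e.g. a checksum.
    """
    created = sorted(p for p in new if p not in old)
    deleted = sorted(p for p in old if p not in new)
    modified = sorted(p for p in new if p in old and new[p] != old[p])
    return created, modified, deleted
```

For example:

```python
snap1 = {"/a": "c1", "/b": "c2", "/c": "c3"}
snap2 = {"/a": "c1", "/b": "c9", "/d": "c4"}
created, modified, deleted = diff_manifests(snap1, snap2)
# created == ["/d"], modified == ["/b"], deleted == ["/c"]
```

The deleted list is what a plain mtime crawl cannot provide, and
comparing fingerprints rather than mtimes avoids being misled by bogus
timestamps.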


-- 
Mike Gerdts
http://mgerdts.blogspot.com/
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
