Alexander Block <abloc...@googlemail.com> wrote:

> tar/pax was the initial format that was chosen for btrfs send/receive
> as it looked like the best and most compatible way. In the middle of
> development however I realized that we need more than storing whole
> and incremental files/dirs in the format. We needed to store
> information about moved, renamed, deleted, reflinked and even partial
> clones where only some bits of a file are shared with another. This
> can for sure all be implemented in pax, but then the next problem is
> that in some situations renamed/moved files need multiple entries to
> get to the desired result. For example, file a may be renamed to b
> while at the same time file b got renamed to a. In such cases we need
> 3 entries that use a temporary name so that we don't lose one of the
> files while receiving. There are much more complex examples where it
> gets quite complicated.

The problem of complex renames was already solved in star 8 years ago with the
incremental backup/restore concept. Renames are done based on inode
numbers.
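
To illustrate the swap case from the quoted text (a renamed to b while b is
renamed to a), here is a minimal Python sketch of inode-based rename planning.
It is not star's actual implementation. The function name `plan_renames` and
the temporary-name prefix are hypothetical, and the sketch assumes a flat
namespace in which each inode number maps to exactly one path.

```python
def plan_renames(old_paths, new_paths, tmp_prefix=".rename-tmp-"):
    """Return (src, dst) renames that are safe to apply in order.

    old_paths / new_paths map inode number -> path before / after.
    Hypothetical helper for illustration only.
    """
    # pending maps a file's current path to the path it should end up at,
    # for every inode present in both snapshots whose path changed.
    pending = {
        old_paths[ino]: new_paths[ino]
        for ino in old_paths
        if ino in new_paths and old_paths[ino] != new_paths[ino]
    }
    ops = []
    tmp_count = 0
    while pending:
        # A move is safe when its destination is not occupied by a file
        # that is itself still waiting to be moved.
        safe = [src for src, dst in pending.items() if dst not in pending]
        if safe:
            for src in safe:
                ops.append((src, pending.pop(src)))
        else:
            # Only rename cycles remain: park one file under a temporary
            # name so that its old path becomes free.
            src = next(iter(pending))
            tmp = f"{tmp_prefix}{tmp_count}"
            tmp_count += 1
            ops.append((src, tmp))
            pending[tmp] = pending.pop(src)
    return ops


if __name__ == "__main__":
    # Inode 1 was "a" and is now "b"; inode 2 was "b" and is now "a".
    for src, dst in plan_renames({1: "a", 2: "b"}, {1: "b", 2: "a"}):
        print(f"rename {src} -> {dst}")
```

For the swap, the plan contains three renames, one of which goes through a
temporary name, matching the "3 entries" mentioned in the quoted text.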

> Also, it needed support for metadata (mode, size, uid/gid, ...)
> changes on already existing files/dirs. Reusing already existent
> tar/pax entry types was not possible for this as standard tar would
> overwrite the original files with empty files.

This is not true. Star has implemented this for more than 8 years.


> I had all that implemented with pax, using a lot of custom pax
> entries. A lot...so many that it didn't look like tar/pax anymore. It
> actually mutated from a list of file/dir/link entries (which tar/pax
> is meant to be) to a list of filesystem instructions (rename, link,
> unlink, rmdir, write parts of a file, clone parts of a file, chmod,
> ...).

If you end up with something that does not look like an enhanced tar archive,
you probably did not follow the rules.
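
As a small illustration of extending pax without leaving the format, the
following Python sketch attaches a custom extended-header record to an
ordinary file entry and reads it back with the standard `tarfile` module. The
keyword `EXAMPLE.note`, the file name, and the helper name are made up for
this example; this is neither what star nor what btrfs send/receive writes.

```python
import io
import tarfile


def roundtrip_custom_pax_record():
    """Write one entry carrying a custom pax record, then read it back."""
    data = b"hello\n"
    buf = io.BytesIO()

    # PAX_FORMAT lets per-entry extended header records be written.
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        info = tarfile.TarInfo(name="file.txt")
        info.size = len(data)
        # Hypothetical vendor keyword, chosen only for illustration.
        info.pax_headers = {"EXAMPLE.note": "metadata hint"}
        tar.addfile(info, io.BytesIO(data))

    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r") as tar:
        member = tar.getmember("file.txt")
        content = tar.extractfile(member).read()
        return dict(member.pax_headers), content


if __name__ == "__main__":
    headers, content = roundtrip_custom_pax_record()
    print(headers, content)
```

The entry remains a regular file with its normal contents, and the extra
record travels alongside it in the extended header.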

> My thought was, that this was already a big misuse of tar/pax, so I
> decided to implement a simple format for this purpose only. Using pax
> gave no advantages anymore. In tar/pax every entry must have a file
> name, even the pax header entries need a file name. The problem now
> is, that plain tar will treat every unknown entry type as regular file
> and blindly overwrite existing ones which may result in data loss. To
> prevent this, I always added something to the file name so that
> unpacking with tar would not hurt the user. The unavoidable side
> effect however is that the result of a plain untar is unusable without
> further interpretation, which will be hard because tar by default does
> not dump pax headers but instead ignores unknown entries.

The problem of possible overwrites by archivers that are too dumb was solved
in star long ago. Star lets old software believe that there is an EOF when a
metadata-only entry is found.

> Also, using tar/pax as the format for send/receive may give a user the
> wrong impression that he can later use his good old standard tar to
> restore his backups...this could be fatal for him.

This works fine for the incremental backup/restore system used by star.

Enhancing existing data structures without breaking the philosophy is not 
trivial and may take some time, but it is usually better than reinventing the 
wheel.

Star has now existed for more than 30 years (since August 1982). Star is
where other tar implementations take their ideas from ;-)

Jörg

-- 
 EMail:jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       j...@cs.tu-berlin.de                (uni)  
       joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss