On Fri, Oct 19, 2012 at 11:23 AM, Arne Jansen <sensi...@gmx.net> wrote:
> On 19.10.2012 11:16, Irek Szczesniak wrote:
>> On Wed, Oct 17, 2012 at 2:29 PM, Arne Jansen <sensi...@gmx.net> wrote:
>>> We have finished a beta version of the feature. A webrev for it
>>> can be found here:
>>>
>>> http://cr.illumos.org/~webrev/sensille/fits-send/
>>>
>>> It adds a command 'zfs fits-send'. The resulting streams can
>>> currently only be received on btrfs, but more receivers will
>>> follow.
>>> It would be great if anyone interested could give it some testing
>>> and/or review. If there are no objections, I'll send a formal
>>> webrev soon.
>>
>> Why are you trying to reinvent the wheel? AFAIK some tar versions and
>> AT&T AST pax support deltas based on a standard (I'll have to dig out
>> the exact specification, but from looking at it you did double work).
>>
>
> I haven't done the research myself, but the result was that pax would
> have needed significant extension, but I don't have the details. If
> you dig out a format already in use that supports everything we need
> (like sharing data between files, needed for btrfs reflinks), it should
> be easy to change the format. Stuffing the data into a specific format
> is not an essential part of the work and can be changed with a limited
> amount of work.
>
> -Arne
>
>> Irek
>

tar/pax was the format initially chosen for btrfs send/receive, as it looked like the best and most compatible option. In the middle of development, however, I realized that we need more than storing whole and incremental files/dirs in the format. We needed to store information about moved, renamed, deleted and reflinked files, and even partial clones where only some parts of a file are shared with another.

This can surely all be implemented in pax, but the next problem is that in some situations renamed/moved files need multiple entries to reach the desired result. For example, file a may be renamed to b while at the same time file b is renamed to a. In such cases we need 3 entries that use a temporary name so that we don't lose one of the files while receiving. There are far more complex examples where this gets quite complicated.

Also, the format needed support for metadata (mode, size, uid/gid, ...) changes on already existing files/dirs. Reusing existing tar/pax entry types was not possible for this, as standard tar would overwrite the original files with empty files.

I had all that implemented with pax, using a lot of custom pax entries. A lot... so many that it didn't look like tar/pax anymore. It actually mutated from a list of file/dir/link entries (which tar/pax is meant to be) into a list of filesystem instructions (rename, link, unlink, rmdir, write parts of a file, clone parts of a file, chmod, ...). My thought was that this was already a big misuse of tar/pax, so I decided to implement a simple format for this purpose only. Using pax no longer gave any advantage.

In tar/pax every entry must have a file name; even the pax header entries need one. The problem is that plain tar will treat every unknown entry type as a regular file and blindly overwrite existing files, which may result in data loss. To prevent this, I always added something to the file name so that unpacking with tar would not hurt the user.
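
To make the a/b rename swap mentioned earlier concrete, here is a minimal Python sketch of how a sender could order rename instructions and fall back to a temporary name when two files trade places. This is only an illustration under stated assumptions, not the actual btrfs send code or stream format: the function names `plan_renames` and `apply_steps` and the `.tmp-rename-N` naming scheme are hypothetical, and the filesystem is modelled as a plain dict.

```python
def plan_renames(renames):
    """Order rename instructions so that no file is overwritten.

    `renames` maps old name -> new name. Returns a list of
    (source, destination) steps. Names of the form ".tmp-rename-N"
    are a hypothetical scheme for temporary names.
    """
    pending = {src: dst for src, dst in renames.items() if src != dst}
    steps = []
    tmp_counter = 0
    while pending:
        # A rename is safe when its destination is not still waiting
        # to be moved out of the way itself.
        safe = [src for src, dst in pending.items() if dst not in pending]
        if safe:
            src = safe[0]
            steps.append((src, pending.pop(src)))
        else:
            # Every destination is still occupied, so we are in a cycle
            # (e.g. a -> b and b -> a). Park one file under a temporary name.
            src = next(iter(pending))
            tmp = f".tmp-rename-{tmp_counter}"
            tmp_counter += 1
            steps.append((src, tmp))
            pending[tmp] = pending.pop(src)
    return steps


def apply_steps(files, steps):
    """Replay rename steps on a dict of name -> content, refusing overwrites."""
    result = dict(files)
    for src, dst in steps:
        if dst in result:
            raise ValueError(f"rename {src} -> {dst} would overwrite a file")
        result[dst] = result.pop(src)
    return result


if __name__ == "__main__":
    swap = plan_renames({"a": "b", "b": "a"})
    for src, dst in swap:
        print(f"rename {src} -> {dst}")
    print(apply_steps({"a": "content of a", "b": "content of b"}, swap))
```

For the swap this yields three steps, which matches the 3 entries described above. A stream of such steps is a list of instructions to replay in order, not a list of files to unpack.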
The unavoidable side effect of altering the file names, however, is that the result of a plain untar is unusable without further interpretation, which is hard because tar by default does not dump pax headers but instead ignores unknown entries.

Also, using tar/pax as the format for send/receive may give a user the wrong impression that he can later use his good old standard tar to restore his backups... this could be fatal for him.

Alex.
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss