On Mon, 31 Dec 2007, Darren Reed wrote:

> Frank Hofmann wrote:
>>
>>
>> On Fri, 28 Dec 2007, Darren Reed wrote:
>> [ ... ]
>>> Is this behaviour defined by a standard (such as POSIX or the
>>> VFS design) or are we free to innovate here and do something
>>> that allowed such a shortcut as required?
>>
>> W.r.t. standards, quote from:
>>
>>     http://www.opengroup.org/onlinepubs/009695399/functions/rename.html
>>
>>     ERRORS
>>     The rename() function shall fail if:
>> [ ... ]
>>     [EXDEV]
>>     [CX]  The links named by new and old are on different file systems
>>     and the implementation does not support links between file systems.
>>
>> Hence, it's implementation-dependent, as per IEEE1003.1.
>
> This implies that we'd also have to look at allowing
> link(2) to also function between filesystems where
> rename(2) was going to work without doing a copy,
> correct?  Which I suppose makes sense.

Yes, via copy-on-write. rename() is just defined as an "atomic" sequence of:

        link(old, new);
        unlink(old);

If cross-fs rename is possible, then cross-fs link is as well. It's 
"per-file clone".

Btw, Joerg, this addresses the concern you had in any case. It's cross-fs, 
that means st_dev/st_ino _WILL_ change. Persistence of open files is not 
related to that. If you hold a file open, the st_dev/st_ino associated 
with the open fd will stay around and continue to be accessible with 
fstat() - but not necessarily with stat(). It definitely would not be if 
the file got removed. As far as I can see, a cross-fs rename removing the 
file on the source fs does not violate anything.
The location of the file's data is _NOT_ the only way to derive a unique 
st_dev/st_ino pair.
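
The fstat() versus stat() point can be checked with a small sketch like the following. The function name identity_survives_unlink() and the way it reports results are assumptions made for this example only; it uses a plain unlink() on one file system to stand in for "the file got removed on the source fs".

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical check: create `path`, keep it open, unlink it, then see
 * whether fstat() on the open fd still reports the same st_dev/st_ino
 * while stat() on the name fails with ENOENT.
 * Returns 1 if both hold, 0 if not, -1 on a setup error.
 */
int identity_survives_unlink(const char *path)
{
    struct stat before, after, by_name;
    int fd = open(path, O_CREAT | O_EXCL | O_RDWR, 0600);

    if (fd < 0)
        return -1;
    if (fstat(fd, &before) != 0 || unlink(path) != 0) {
        close(fd);
        return -1;
    }

    /* The open fd still refers to the (now nameless) file. */
    int same = fstat(fd, &after) == 0 &&
               before.st_dev == after.st_dev &&
               before.st_ino == after.st_ino;

    /* The name no longer resolves. */
    int name_gone = stat(path, &by_name) != 0 && errno == ENOENT;

    close(fd);
    return same && name_gone;
}
```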
rename() _within_ a filesystem (as defined by the set of nodes with a 
common st_dev) should preserve st_ino if the fs supports link counts 
larger than one, agreed. But let's not confuse this with cross-fs rename, 
where by definition (cross-fs) st_dev must change. The identity of that 
file, therefore, has changed.
We're just in the happy situation with ZFS that the storage low-level 
implementation can know that the contents haven't.

That's a sad situation for backup utilities, by the way - a backup tool 
would have no way of finding out that file X on fs A already existed as 
file Z on fs B. So what? If the file had been copied byte by byte, the 
same situation would exist: the contents are identical. That backups are 
slower than they could be with an omniscient backup utility is, I think, 
no reason to slow down file copy/rename operations.

Happy new year!
FrankH.
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
