Hi,

Tomas Volf <~@wolfsden.cz> wrote:

> On modern file-systems (BTRFS, ZFS) it is possible to copy a file using
> copy-on-write method. For large files it has the advantage of being
> much faster and saving disk space (since identical extents are not
> duplicated). This feature is stable and for example coreutils' `cp'
> does use it automatically (see --reflink).
>
> This commit adds support for this feature into our
> copy-file (scm_copy_file) procedure. Same as `cp', it defaults to
> 'auto, meaning the copy-on-write is attempted, and in case of failure
> the regular copy is performed.
>
> No tests are provided, because the behavior depends on the system,
> underlying file-system and its configuration. That makes it challenging
> to write a test for it. Manual testing was performed instead:
>
>     $ btrfs filesystem du /tmp/cow*
>          Total   Exclusive  Set shared  Filename
>       36.00KiB    36.00KiB       0.00B  /tmp/cow
>
>     $ cat cow-test.scm
>     (copy-file "/tmp/cow" "/tmp/cow-unspecified")
>     (copy-file "/tmp/cow" "/tmp/cow-always" #:copy-on-write 'always)
>     (copy-file "/tmp/cow" "/tmp/cow-auto" #:copy-on-write 'auto)
>     (copy-file "/tmp/cow" "/tmp/cow-never" #:copy-on-write 'never)
>     (copy-file "/tmp/cow" "/dev/shm/cow-unspecified")
>     (copy-file "/tmp/cow" "/dev/shm/cow-auto" #:copy-on-write 'auto)
>     (copy-file "/tmp/cow" "/dev/shm/cow-never" #:copy-on-write 'never)
>     $ ./meta/guile -s cow-test.scm
>
>     $ btrfs filesystem du /tmp/cow*
>          Total   Exclusive  Set shared  Filename
>       36.00KiB       0.00B    36.00KiB  /tmp/cow
>       36.00KiB       0.00B    36.00KiB  /tmp/cow-always
>       36.00KiB       0.00B    36.00KiB  /tmp/cow-auto
>       36.00KiB    36.00KiB       0.00B  /tmp/cow-never
>       36.00KiB       0.00B    36.00KiB  /tmp/cow-unspecified
>
>     $ sha1sum /tmp/cow* /dev/shm/cow*
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-always
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-auto
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-never
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /tmp/cow-unspecified
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /dev/shm/cow-auto
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /dev/shm/cow-never
>     4c665f87b5dc2e7d26279c4b48968d085e1ace32  /dev/shm/cow-unspecified
>
> This commit also adds two new failure modes for (copy-file).
>
> Failure to copy-on-write when 'always was passed in:
>
>     scheme@(guile-user)> (copy-file "/tmp/cow" "/dev/shm/cow" #:copy-on-write
>     'always)
>     ice-9/boot-9.scm:1676:22: In procedure raise-exception:
>     In procedure copy-file: copy-on-write failed: Invalid cross-device link
>
> Passing in invalid value for the #:copy-on-write keyword argument:
>
>     scheme@(guile-user)> (copy-file "/tmp/cow" "/dev/shm/cow" #:copy-on-write
>     'nevr)
>     ice-9/boot-9.scm:1676:22: In procedure raise-exception:
>     In procedure copy-file: invalid value for #:copy-on-write: nevr
>
> * NEWS: Add note for copy-file supporting copy-on-write.
> * configure.ac: Check for linux/fs.h.
> * doc/ref/posix.texi (File System)[copy-file]: Document the new
> signature.
> * libguile/filesys.c (clone_file): New function cloning a file using
> FICLONE, if supported.
> (k_copy_on_write): New keyword.
> (sym_always, sym_auto, sym_never): New symbols.
> (scm_copy_file): New #:copy-on-write keyword argument. Attempt
> copy-on-write copy by default.
> * libguile/filesys.h: Update signature for scm_copy_file.

The patch looks great (and very useful) to me, modulo one issue:

> -SCM_API SCM scm_copy_file (SCM oldfile, SCM newfile);
> +SCM_API SCM scm_copy_file (SCM oldfile, SCM newfile, SCM rest);

Since this is a public interface, we cannot change this function’s
signature during the 3.0 stable series.

Thus, I would suggest keeping the public ‘scm_copy_file’ unchanged and
internally having a three-argument variant. The Scheme-level
‘copy-file’ would map to that three-argument variant. (See ‘scm_pipe’
and ‘scm_accept’ for examples.)

Could you send an updated patch?

BTW, copyright assignment to the FSF is now optional but encouraged.
Please see
<https://lists.gnu.org/archive/html/guile-devel/2022-10/msg00008.html>.

Thanks,
Ludo’.
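
P.S. To make the suggestion about keeping the two-argument entry point concrete, here is a rough, standalone C sketch of the pattern. It is not libguile code and not the patch itself: `copy_file`, `copy_file_with_mode`, `clone_fd`, `copy_bytes` and `enum cow_mode` are hypothetical names chosen for illustration, and plain C strings stand in for `SCM` values. It assumes a Linux system where `<linux/fs.h>` may provide `FICLONE`; where it does not, cloning simply reports failure. Having the two-argument function default to the "auto" mode is also an assumption, mirroring the default described in the commit message.

```c
/* Sketch only: hypothetical names, not libguile's API. */
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#ifdef __linux__
# include <linux/fs.h>          /* FICLONE, when the headers provide it */
#endif

enum cow_mode { COW_NEVER, COW_AUTO, COW_ALWAYS };

/* Try to clone SRC_FD into DST_FD.  Return 0 on success, -1 (errno set)
   when cloning is unsupported or fails.  */
static int
clone_fd (int dst_fd, int src_fd)
{
#ifdef FICLONE
  return ioctl (dst_fd, FICLONE, src_fd);
#else
  (void) dst_fd;
  (void) src_fd;
  errno = ENOTSUP;
  return -1;
#endif
}

/* Regular copy: read from SRC_FD and write everything to DST_FD.  */
static int
copy_bytes (int dst_fd, int src_fd)
{
  char buf[8192];
  ssize_t n;

  while ((n = read (src_fd, buf, sizeof buf)) > 0)
    {
      ssize_t off = 0;
      while (off < n)
        {
          ssize_t w = write (dst_fd, buf + off, (size_t) (n - off));
          if (w < 0)
            return -1;
          off += w;
        }
    }
  return n < 0 ? -1 : 0;
}

/* Internal three-argument variant: this is where the new behavior lives.  */
int
copy_file_with_mode (const char *oldfile, const char *newfile,
                     enum cow_mode mode)
{
  int src = open (oldfile, O_RDONLY);
  if (src < 0)
    return -1;

  int dst = open (newfile, O_WRONLY | O_CREAT | O_TRUNC, 0666);
  if (dst < 0)
    {
      int e = errno;
      close (src);
      errno = e;
      return -1;
    }

  int ret = -1;
  if (mode != COW_NEVER)
    ret = clone_fd (dst, src);          /* attempt copy-on-write */
  if (ret != 0 && mode != COW_ALWAYS)
    ret = copy_bytes (dst, src);        /* fall back to a regular copy */

  int saved = errno;
  close (src);
  if (close (dst) != 0 && ret == 0)
    {
      ret = -1;
      saved = errno;
    }
  errno = saved;
  return ret;
}

/* Public two-argument entry point: signature unchanged, so existing
   callers keep compiling and linking.  */
int
copy_file (const char *oldfile, const char *newfile)
{
  return copy_file_with_mode (oldfile, newfile, COW_AUTO);
}
```

Existing callers keep using `copy_file (old, new)`, while new code (in Guile's case, the Scheme-level binding) would go through the three-argument variant, for instance `copy_file_with_mode (old, new, COW_ALWAYS)` to get an error instead of a silent fallback.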