Jim Meyering wrote: > Subject: [PATCH 01/10] cp: Add FIEMAP support for efficient sparse file copy
FYI, using those patches, I ran a test for the first time in a few days: check -C tests TESTS=cp/sparse-fiemap VERBOSE=yes It failed like this on an ext4 partition using F13: + timeout 10 cp --sparse=always sparse fiemap + fail=1 ++ stat --printf %s sparse ++ stat --printf %s fiemap + test 1099511628800 = 0 + fail=1 That is very odd. No diagnostic from cp, yet it failed after creating a zero-length file. Here's the corresponding piece of the script: # It takes many minutes to copy this sparse file using the old method. # By contrast, it takes far less than 1 second using FIEMAP-copy. timeout 10 cp --sparse=always sparse fiemap || fail=1 # Ensure that the sparse file copied through fiemap has the same size # in bytes as the original. test $(stat --printf %s sparse) = $(stat --printf %s fiemap) || fail=1 However, so far I've been unable to reproduce the failure, running hundreds of iterations: for i in $(seq 300); do printf .; make check -C tests \ TESTS=cp/sparse-fiemap VERBOSE=yes >& makerr-$i || break; done Have any of you heard of a problem whereby a cold cache can cause such a thing? "echo 3 > /proc/sys/vm/drop_caches" didn't help. I suspect that having so many extents is unusual, so maybe this is a rarely exercised corner case. =============================== As I wrote the above, I realized I probably had enough information to deduce where things were going wrong, even if so far I've been unable to reproduce it. And sure enough. There is a way to provoke exactly that failure. If the *second* (or later) FIEMAP ioctl fails: do { fiemap->fm_length = FIEMAP_MAX_OFFSET; fiemap->fm_extent_count = count; /* When ioctl(2) fails, fall back to the normal copy only if it is the first time we met. */ if (ioctl (src_fd, FS_IOC_FIEMAP, fiemap) < 0) { /* If the first ioctl fails, tell the caller that it is ok to proceed with a normal copy. */ if (i == 0) *normal_copy_required = true; return false; } In that case, fiemap_copy returns false (with no diagnostic) and cp fails silently. Obviously I will now add code to diagnose the failure, but do any of you know off hand how to reproduce this or what the failure might have been? Here's the patch I plan to merge: diff --git a/src/copy.c b/src/copy.c index eb67700..07d605e 100644 --- a/src/copy.c +++ b/src/copy.c @@ -200,6 +200,12 @@ fiemap_copy (int src_fd, int dest_fd, size_t buf_size, ok to proceed with a normal copy. */ if (i == 0) *normal_copy_required = true; + else + { + /* If the second or subsequent ioctl fails, diagnose it, + since it ends up causing the entire copy/cp to fail. */ + error (0, errno, _("%s: FIEMAP ioctl failed"), quote (src_name)); + } return false; }