On 02/01/2023 06:36, Noah Misch wrote:
Because Debian coreutils 9.1-1 "cp" silently falls back to copy_file_range()
after FICLONE reports EIO, "cp" can transfer incorrect bytes. Linux syscalls
having a file descriptor parameter report EIO after a fault in the underlying
device. The affected file is not recoverable in the general case, but syscall
outcomes after the EIO don't reflect that. For example, consider FICLONE
returning EIO for a fault during source file writeback. The kernel will mark
"clean" the affected page cache entries and clear the EIO state. If the page
cache evicts those pages, their file offsets revert to the last written-back
values if any, else zeros. If userspace issues a syscall that bypasses the
page cache, like copy_file_range() or another FICLONE, that syscall clones the
last written-back state or zeroes. See
https://lore.kernel.org/linux-xfs/[email protected]
for a "cp" and "cat" test script, background, and discussion.
I recommend instead reporting the EIO and terminating when FICLONE or
copy_file_range() fails with EIO. One could argue that ENOSPC also warrants
termination, since no fallback reduces space usage. For other errno values,
fallback to the next transfer strategy, like today. An alternative would be
to fallback from FICLONE to copy_file_range() only after known-appropriate
errors EBADF, EINVAL, EOPNOTSUPP, ETXTBSY, and EXDEV. That alternative wins
if future FICLONE reports an additional termination-deserving errno value.
Since just EIO needs termination today, I bet new errno values are more likely
than not to deserve fallback. What do you think?
Yes we should have an allow list that we retry for,
and otherwise fail for EIO, ENOSPC, ENOMEM, ...
We already do this for copy_file_range(),
but should treat FICLONE similarly,
as done in the attached.
thanks!
Pádraig
From 4041d93fa895222064d6c20d380e05158bd939e7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= <[email protected]>
Date: Mon, 2 Jan 2023 13:07:41 +0000
Subject: [PATCH] copy: immediately fail with transient errors from FICLONE
* src/copy.c (copy_reg): Fail with all errors from FICLONE,
unless they're from the set indicating the file system
or file do not support the clone operation.
* NEWS: Mention the bug fix. Also mention explicitly
the --reflink=auto default change to aid searching.
* cfg.mk: Adjust old_NEWS_hash with `make update-NEWS-hash`.
Fixes https://bugs.gnu.org/60489
---
NEWS | 7 +++++++
cfg.mk | 2 +-
src/copy.c | 28 ++++++++++++++++++----------
3 files changed, 26 insertions(+), 11 deletions(-)
diff --git a/NEWS b/NEWS
index f43182757..94f3b9086 100644
--- a/NEWS
+++ b/NEWS
@@ -13,6 +13,12 @@ GNU coreutils NEWS -*- outline -*-
'cp -rx / /mnt' no longer complains "cannot create directory /mnt/".
[bug introduced in coreutils-9.1]
+ cp, install, mv now immediately acknowledge transient errors from
+ the FICLONE ioctl, used to create copy-on-write or reflinked clones.
+ Previously they would have tried again with other copy methods
+ which may have resulted in data corruption.
+ [bug introduced in coreutils-7.5 and enabled by default in coreutils-9.0]
+
'mv --backup=simple f d/' no longer mistakenly backs up d/f to f~.
[bug introduced in coreutils-9.1]
@@ -283,6 +289,7 @@ GNU coreutils NEWS -*- outline -*-
** Changes in behavior
cp and install now default to copy-on-write (COW) if available.
+ I.e., cp now uses --reflink=auto mode by default.
cp, install and mv now use the copy_file_range syscall if available.
Also, they use lseek+SEEK_HOLE rather than ioctl+FS_IOC_FIEMAP on sparse
diff --git a/cfg.mk b/cfg.mk
index 179a51d1c..c9b2d4326 100644
--- a/cfg.mk
+++ b/cfg.mk
@@ -49,7 +49,7 @@ export VERBOSE = yes
# 4914152 9e
export XZ_OPT = -8e
-old_NEWS_hash = f365e0cf2c7890c617df6abe70615fce
+old_NEWS_hash = 49269d8d99f4888a4e955d28cda50091
# Add an exemption for sc_makefile_at_at_check.
_makefile_at_at_check_exceptions = ' && !/^cu_install_prog/ && !/dynamic-dep/'
diff --git a/src/copy.c b/src/copy.c
index 310c8bf14..0f599459b 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -267,6 +267,7 @@ sparse_copy (int src_fd, int dest_fd, char **abuf, size_t buf_size,
}
if (n_copied < 0)
{
+ /* Note error set is consistent with clone_file(). */
if (errno == ENOSYS || is_ENOTSUP (errno)
|| errno == EINVAL || errno == EBADF
|| errno == EXDEV || errno == ETXTBSY)
@@ -1275,23 +1276,30 @@ copy_reg (char const *src_name, char const *dst_name,
{
if (clone_file (dest_desc, source_desc) == 0)
data_copy_required = false;
- else if (x->reflink_mode == REFLINK_ALWAYS)
+ else
{
- error (0, errno, _("failed to clone %s from %s"),
- quoteaf_n (0, dst_name), quoteaf_n (1, src_name));
+ /* Note error set is consistent with copy_file_range(). */
+ bool no_clone_attempted = errno == ENOSYS || is_ENOTSUP (errno)
+ || errno == EINVAL || errno == EBADF
+ || errno == EXDEV || errno == ETXTBSY;
+
+ if (x->reflink_mode == REFLINK_ALWAYS || ! no_clone_attempted)
+ error (0, errno, _("failed to clone %s from %s"),
+ quoteaf_n (0, dst_name), quoteaf_n (1, src_name));
/* Remove the destination if cp --reflink=always created it
- but cloned no data. If clone_file failed with
- EOPNOTSUPP, EXDEV or EINVAL no data were copied so do not
- go to the expense of lseeking. */
+ but cloned no data. */
if (*new_dst
- && (is_ENOTSUP (errno) || errno == EXDEV || errno == EINVAL
- || lseek (dest_desc, 0, SEEK_END) == 0)
+ && x->reflink_mode == REFLINK_ALWAYS
+ && (no_clone_attempted || lseek (dest_desc, 0, SEEK_END) == 0)
&& unlinkat (dst_dirfd, dst_relname, 0) != 0 && errno != ENOENT)
error (0, errno, _("cannot remove %s"), quoteaf (dst_name));
- return_val = false;
- goto close_src_and_dst_desc;
+ if (x->reflink_mode == REFLINK_ALWAYS || ! no_clone_attempted)
+ {
+ return_val = false;
+ goto close_src_and_dst_desc;
+ }
}
}
--
2.26.2