[cc'ing [email protected]; this coreutils thread can be found in <https://lists.gnu.org/r/coreutils/2025-12/threads.html#00055>.]

On 2025-12-20 00:51, Matteo Croce wrote:
> This can be triggered with a huge file:
>
> $ truncate -s $((2**63 - 1)) file1
>
> $ ( dd bs=1M skip=$((2**43 - 2)) count=0 && cat ) < file1
> 0+0 records in
> 0+0 records out
> 0 bytes copied, 2,825e-05 s, 0,0 kB/s
> cat: -: Invalid argument
>
> $ dd if=file1 bs=1M skip=$((2**43 - 2))
> dd: error reading 'file1': Invalid argument
> 1+0 records in
> 1+0 records out
> 1048576 bytes (1,0 MB, 1,0 MiB) copied, 0,103536 s, 10,1 MB/s

OK, but in bleeding-edge coreutils neither of these examples calls copy_file_range. The diagnostics result from plain 'read' syscalls near TYPE_MAXIMUM (off_t). (dd never calls copy_file_range, and ironically the code in 'cat' that does call copy_file_range avoids the overflow itself before invoking copy_file_range, relying on plain 'read' to do the right thing near TYPE_MAXIMUM (off_t).) So these examples have nothing to do with copy_file_range.
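
To make the failure mode concrete, here is a minimal userspace reproducer sketch (mine, untested, assuming a 64-bit off_t, and using the 'file1' created by the 'truncate' command above). A plain read near TYPE_MAXIMUM (off_t) fails with EINVAL even though a short read would be the correct result:

#define _FILE_OFFSET_BITS 64
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main (void)
{
  int fd = open ("file1", O_RDONLY);
  if (fd < 0)
    { perror ("open"); return 1; }

  /* Seek to 4096 bytes before TYPE_MAXIMUM (off_t), which for this
     2**63 - 1 byte file is also 4096 bytes before end of file.  */
  off_t pos = INT64_MAX - 4096;
  if (lseek (fd, pos, SEEK_SET) < 0)
    { perror ("lseek"); return 1; }

  /* pos + sizeof buf would exceed TYPE_MAXIMUM (off_t), so the kernel
     rejects the whole read with EINVAL; the correct behavior would be
     a short read of the 4096 bytes that are present.  */
  char buf[65536];
  ssize_t n = read (fd, buf, sizeof buf);
  if (n < 0)
    printf ("read failed: %s\n", strerror (errno));
  else
    printf ("read returned %zd bytes\n", n);
  return 0;
}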

You've found a Linux kernel bug that affects countless apps, and we can't reasonably expect app developers to patch all the apps to work around the bug. So the fix should be done in the kernel.

I looked at the kernel patch you suggested in <https://lore.kernel.org/linux-fsdevel/[email protected]/T/>. Unfortunately, I see two problems with it, the first minor, the second less so.

The minor problem is that the unpatched kernel code is merely checking, incorrectly, whether pos + count fits into loff_t. MAX_RW_COUNT should not be involved in the fix, as MAX_RW_COUNT is irrelevant to the file offset range. Better would be to do the overflow checks correctly, with something like the attached patch (which I have not compiled or tested).
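
For readers unfamiliar with the idiom in the attached patch: check_add_overflow, from <linux/overflow.h>, wraps the compiler's __builtin_add_overflow and returns true when the mathematically exact sum does not fit in the destination type, and the &(loff_t) {0} compound literal is just a throwaway destination, since only the overflow flag matters here. A tiny userspace sketch of the same idiom, assuming a 64-bit off_t:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

static bool
pos_plus_count_overflows (off_t pos, size_t count)
{
  /* True iff pos + count does not fit in off_t; the compound literal
     is a scratch destination, as in the kernel patch.  */
  return __builtin_add_overflow (pos, count, &(off_t) {0});
}

int
main (void)
{
  printf ("%d\n", pos_plus_count_overflows ((off_t) 1 << 62, 1));  /* prints 0: fits */
  printf ("%d\n", pos_plus_count_overflows (INT64_MAX, 1));        /* prints 1: overflows */
  return 0;
}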

Second, and more important, the patch doesn't fix the real bug, which is that read(FD, BUF, SIZE) fails with -EINVAL if adding SIZE to the current file position would overflow off_t. That's wrong: the syscall should read whatever bytes are present (up to EOF), and then report the number of bytes read. We cannot fix this bug merely via something like the attached patch.

One possible fix for the second problem would be to change rw_verify_area's API to return the possibly-smaller number of bytes that can be read, and then modify its callers to do the right thing ("right" in the sense of "don't try to read past TYPE_MAXIMUM (off_t)"). Alternatively, we could fix rw_verify_area's callers to not try to read past TYPE_MAXIMUM (off_t) without changing the API.
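
To illustrate the second alternative, here is a rough, untested sketch of the idea (mine, not part of the attached patch, and assuming a 64-bit off_t): a caller would clamp the requested count so that pos + count can never pass TYPE_MAXIMUM (off_t), issue the possibly-shorter read, and let the ordinary short-read and EOF semantics report how many bytes were actually transferred.

#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Shrink COUNT so that POS + COUNT cannot exceed TYPE_MAXIMUM (off_t).
   Assumes POS is nonnegative, as it must be for regular files that do
   not use unsigned offsets.  */
size_t
clamp_count_to_off_t_max (off_t pos, size_t count)
{
  uint64_t room = (uint64_t) INT64_MAX - (uint64_t) pos;
  return count < room ? count : (size_t) room;
}

int
main (void)
{
  /* With the file position 4096 bytes below TYPE_MAXIMUM (off_t), a
     64 KiB request is clamped to the 4096 bytes that can legally be
     read; the read then returns 4096 (or fewer at EOF) instead of
     failing with EINVAL.  */
  printf ("%zu\n", clamp_count_to_off_t_max (INT64_MAX - 4096, 65536));
  return 0;
}

Here is the attached patch for the first, minor problem (again, not compiled or tested):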
diff --git a/fs/read_write.c b/fs/read_write.c
index 833bae068770..215d7cdbb1aa 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -459,13 +459,14 @@ int rw_verify_area(int read_write, struct file *file, const loff_t *ppos, size_t
 	if (ppos) {
 		loff_t pos = *ppos;
 
-		if (unlikely(pos < 0)) {
-			if (!unsigned_offsets(file))
-				return -EINVAL;
-			if (count >= -pos) /* both values are in 0..LLONG_MAX */
-				return -EOVERFLOW;
-		} else if (unlikely((loff_t) (pos + count) < 0)) {
-			if (!unsigned_offsets(file))
+		if (unsigned_offsets(file)) {
+			if (check_add_overflow ((uoff_t) pos, count,
+						&(uoff_t) {0}))
+				return -EINVAL;
+		} else {
+			if (unlikely(pos < 0))
+				return -EINVAL;
+			if (check_add_overflow (pos, count, &(loff_t) {0}))
 				return -EINVAL;
 		}
 	}
