On 27/09/2024 18:18, Matthew Wilcox wrote:
Package: coreutils
Version: 9.4-3.1

strace cp --sparse=always dd dd-sparse
[extraneous stuff skipped]
openat(AT_FDCWD, "dd", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0755, st_size=134248, ...}) = 0
openat(AT_FDCWD, "dd-sparse", O_WRONLY|O_CREAT|O_EXCL, 0755) = 4
ioctl(4, BTRFS_IOC_CLONE or FICLONE, 3) = 0
close(4) = 0
close(3) = 0

It makes no attempt to look for sparse regions in the file, just uses FICLONE (which succeeds because it's on XFS). I worked around this by copying to /tmp, which is on a different filesystem (tmpfs):

Without looking at the source code, it seems likely that cp blindly tries FICLONE without checking to see whether the sparse flag is set. I suggest that setting --sparse=always should disable the FICLONE optimisation.
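
For readers following the strace output above, here is a rough Python sketch of the "try FICLONE, otherwise copy the bytes" pattern it shows. It is not the coreutils source: the function name clone_or_copy is hypothetical, the FICLONE request number is the Linux value written out by hand, and the fallback is a plain byte copy with no sparse handling at all. A successful clone shares the source's extents with the destination, so no file data is read and zero-filled blocks in the source cannot be turned into holes.

```python
import fcntl
import shutil

# Linux FICLONE request number, _IOW(0x94, 9, int). Hard-coded here because
# Python's fcntl module does not export it on all versions (assumption: Linux).
FICLONE = 0x40049409


def clone_or_copy(src, dst):
    """Hypothetical helper: reflink src to dst if possible, else copy bytes.

    Returns "reflink" or "copy" so the caller can see which path was taken.
    """
    # "xb" mirrors the O_CREAT|O_EXCL open seen in the strace output.
    with open(src, "rb") as fin, open(dst, "xb") as fout:
        try:
            # Ask the filesystem to share src's extents with dst.
            fcntl.ioctl(fout.fileno(), FICLONE, fin.fileno())
            return "reflink"
        except OSError:
            # Not supported (e.g. tmpfs, or different filesystems): copy data.
            shutil.copyfileobj(fin, fout)
            return "copy"
```

On XFS or Btrfs with both files on the same filesystem this would be expected to return "reflink"; copying to tmpfs would take the "copy" path.
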
I see your point; however, `cp --reflink=auto --sparse=always` was documented as the way to make a copy taking the least amount of space supported by the file system. That would be a more common use case than making a separate copy as sparse as possible.

--reflink=auto is the default since coreutils 9.0, and it could be confusing to behave differently depending on whether --reflink=auto was explicitly or implicitly specified (considering aliases etc.).

In general it's difficult to reason about the combination of factors impacting how a file is copied: for example, the sparseness of the file, what file system it is on, what file system the destination is on, the attributes of the file, and whether they're being copied or not. The --reflink and --sparse options complicate things further.

To help, we added the --debug option to cp (and install and mv), to explain how a file was being copied:

  $ cp --debug --sparse=always dd dd-sparse
  'dd' -> 'dd-sparse'
  copy offload: unknown, reflink: yes, sparse detection: unknown

  $ cp --debug --reflink=never --sparse=always dd dd-sparse
  'dd' -> 'dd-sparse'
  copy offload: avoided, reflink: no, sparse detection: zeros

thanks,
Pádraig
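
As an illustration of what "sparse detection: zeros" refers to in the second --debug run, the sketch below copies a file while seeking over all-zero blocks instead of writing them. It is a minimal approximation of the general technique, not cp's actual implementation; the name sparse_copy and the 4096-byte block size are assumptions chosen for the example.

```python
import os


def sparse_copy(src, dst, block_size=4096):
    """Hypothetical helper: copy src to dst, leaving holes for zero blocks."""
    zero_block = bytes(block_size)
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(block_size)
            if not chunk:
                break
            if chunk == zero_block[:len(chunk)]:
                # All zeros: advance the output offset without writing, so
                # the filesystem can leave this range unallocated.
                fout.seek(len(chunk), os.SEEK_CUR)
            else:
                fout.write(chunk)
        # If the file ended with a skipped block, extend it to the final
        # offset so the destination has the same length as the source.
        fout.truncate(fout.tell())
```

Whether the skipped ranges actually end up as holes depends on the destination file system; comparing `du dd-sparse` with `du --apparent-size dd-sparse` is one way to check.
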

