bug#58521: human readable still wrong output (after 20+ years?)

2023-01-02 Thread Chandler Sobel-Sorenson

Paul Eggert wrote on 1/2/23 1:28 PM:

Unfortunately backward-compatibility concerns are real,


Such as?



which means we're not likely to make a big change to -h's behavior at this 
point. You can use --si instead.


While --si produces correct output, this does not address the subject of this bug, which 
is -h's faulty behavior.  What big change are you thinking of?  All you need to do is add 
an "i" to -h's prefixes.  Optionally and additionally, you could change -h's 
behavior so it outputs the same thing as --si, then create a new flag such as --bi for 
binary prefixes and values.



Assuming you're talking about 'df' and 'du', neither -h nor --si follows the SI 
standard, since they both output prefixes without units.


Don't be silly, we all know what the units are, plus it is mentioned in the 
manuals.



So this is not a question of conforming to any standard.


Never said it was, or maybe I mentioned it, don't remember since it's been so 
long.  Either way, the standard is a guideline to follow, this is about your 
programs outputting WRONG and INCORRECT values, inconsistent with reality.  I'm 
sure I'm not the only one who works with international collaborators involving 
terabytes of data.  After having to spend several days going back and forth 
explaining to them that, yes, we did deliver all the data required by the 
project and, no, there is no data missing, it was imperative to file this 
report, just as it is for your programs to output results that are in 
accordance with reality.





bug#59382: cp(1) tries to allocate too much memory if filesystem blocksizes are unusual

2023-01-02 Thread Pádraig Brady

On 02/01/2023 23:18, Paul Eggert wrote:

On 2023-01-02 15:03, Pádraig Brady wrote:

On 20/11/2022 03:50, Paul Eggert wrote:

Although we inadvertently removed support for weird devices in 2009 by
commit 55efc5f3ee485b3e31a91c331f07c89aeccc4e89, and nobody seems to
care (because people use dd or whatever to deal with weird devices), I
think it'd be better to limit the fix to regular files. And while we're
at it we might as well resurrect support for weird devices.


Paul I'm happy with your patch for this.
I've rebased and tweaked the summary and NEWS slightly in the attached.

OK for me to apply?


Sure, that looks good, thanks.


I forgot about documenting the reinstated I/O block size multiple constraint,
which I'll do with the attached.

cheers,
Pádraig
From e01c06fb9c182151d4bbbe90873f79b5b2295245 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Mon, 2 Jan 2023 23:16:07 +
Subject: [PATCH] doc: copy: mention the reinstated I/O size constraints

* NEWS: Mention the change in behavior re block size multiples
to support unusual devices with this constraint.
* src/copy.c (copy_reg): Likewise.
---
 NEWS   | 4 
 src/copy.c | 4 +++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index 7a655974a..dce09e9ba 100644
--- a/NEWS
+++ b/NEWS
@@ -58,6 +58,10 @@ GNU coreutils NEWS-*- outline -*-
   'cp --reflink=always A B' no longer leaves behind a newly created
   empty file B merely because copy-on-write clones are not supported.
 
+  cp, mv, and install again read in multiples of the reported block size,
+  to support unusual devices that may have this constraint.
+  [behavior inadvertently changed in coreutils-7.2]
+
   'ls -v' and 'sort -V' go back to sorting ".0" before ".A",
   reverting to the behavior in coreutils-9.0 and earlier.
   This behavior is now documented.
diff --git a/src/copy.c b/src/copy.c
index 1058b08ec..251769300 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1387,7 +1387,9 @@ copy_reg (char const *src_name, char const *dst_name,
   if (! make_holes)
 {
   /* Compute the least common multiple of the input and output
- buffer sizes, adjusting for outlandish values.  */
+ buffer sizes, adjusting for outlandish values.
+ Note we read in multiples of the reported block size
+ to support (unusual) devices that have this constraint.  */
   size_t blcm_max = MIN (SIZE_MAX, SSIZE_MAX);
   size_t blcm = buffer_lcm (io_blksize (src_open_sb), buf_size,
 blcm_max);
-- 
2.26.2



bug#59382: cp(1) tries to allocate too much memory if filesystem blocksizes are unusual

2023-01-02 Thread Paul Eggert

On 2023-01-02 15:03, Pádraig Brady wrote:

On 20/11/2022 03:50, Paul Eggert wrote:

Although we inadvertently removed support for weird devices in 2009 by
commit 55efc5f3ee485b3e31a91c331f07c89aeccc4e89, and nobody seems to
care (because people use dd or whatever to deal with weird devices), I
think it'd be better to limit the fix to regular files. And while we're
at it we might as well resurrect support for weird devices.


Paul I'm happy with your patch for this.
I've rebased and tweaked the summary and NEWS slightly in the attached.

OK for me to apply?


Sure, that looks good, thanks.






bug#59382: cp(1) tries to allocate too much memory if filesystem blocksizes are unusual

2023-01-02 Thread Pádraig Brady

On 20/11/2022 03:50, Paul Eggert wrote:

Although we inadvertently removed support for weird devices in 2009 by
commit 55efc5f3ee485b3e31a91c331f07c89aeccc4e89, and nobody seems to
care (because people use dd or whatever to deal with weird devices), I
think it'd be better to limit the fix to regular files. And while we're
at it we might as well resurrect support for weird devices.


Paul I'm happy with your patch for this.
I've rebased and tweaked the summary and NEWS slightly in the attached.

OK for me to apply?

Preemptively marking this as done.

cheers,
PádraigFrom 573979e21af8f9ea7f83660e949902b3330ce25e Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 19 Nov 2022 19:04:36 -0800
Subject: [PATCH] copy: fix possible over allocation for regular files

* bootstrap.conf (gnulib_modules): Add count-leading-zeros,
which was already an indirect dependency, since ioblksize.h
now uses it directly.
* src/ioblksize.h: Include count-leading-zeros.h.
(io_blksize): Treat impossible blocksizes as IO_BUFSIZE.
When growing a blocksize to IO_BUFSIZE, keep it a multiple of the
stated blocksize.  Work around the ZFS performance bug.
* NEWS: Mention the bug fix.
Problem reported by Korn Andras at https://bugs.gnu.org/59382
---
 NEWS|  5 +
 bootstrap.conf  |  1 +
 src/ioblksize.h | 28 +++-
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index 0fbfe7b09..40eb66968 100644
--- a/NEWS
+++ b/NEWS
@@ -20,6 +20,11 @@ GNU coreutils NEWS-*- outline -*-
   which may have resulted in data corruption.
   [bug introduced in coreutils-7.5 and enabled by default in coreutils-9.0]
 
+  cp, mv, and install avoid allocating too much memory, and possibly
+  triggering "memory exhausted" failures, on file systems like ZFS,
+  which can return varied file system I/O block size values for files.
+  [bug introduced in coreutils-6.0]
+
   'mv --backup=simple f d/' no longer mistakenly backs up d/f to f~.
   [bug introduced in coreutils-9.1]
 
diff --git a/bootstrap.conf b/bootstrap.conf
index 247734673..f04c90624 100644
--- a/bootstrap.conf
+++ b/bootstrap.conf
@@ -59,6 +59,7 @@ gnulib_modules="
   config-h
   configmake
   copy-file-range
+  count-leading-zeros
   crypto/md5
   crypto/sha1
   crypto/sha256
diff --git a/src/ioblksize.h b/src/ioblksize.h
index 7a56c1a51..80d7931ba 100644
--- a/src/ioblksize.h
+++ b/src/ioblksize.h
@@ -18,6 +18,7 @@
 
 /* sys/stat.h and minmax.h will already have been included by system.h. */
 #include "idx.h"
+#include "count-leading-zeros.h"
 #include "stat-size.h"
 
 
@@ -75,8 +76,33 @@ enum { IO_BUFSIZE = 128 * 1024 };
 static inline idx_t
 io_blksize (struct stat sb)
 {
+  /* Treat impossible blocksizes as if they were IO_BUFSIZE.  */
+  idx_t blocksize = ST_BLKSIZE (sb) <= 0 ? IO_BUFSIZE : ST_BLKSIZE (sb);
+
+  /* Use a blocksize of at least IO_BUFSIZE bytes, keeping it a
+ multiple of the original blocksize.  */
+  blocksize += (IO_BUFSIZE - 1) - (IO_BUFSIZE - 1) % blocksize;
+
+  /* For regular files we can ignore the blocksize if we think we know better.
+ ZFS sometimes understates the blocksize, because it thinks
+ apps stupidly allocate a block that large even for small files.
+ This misinformation can cause coreutils to use wrong-sized blocks.
+ Work around some of the performance bug by substituting the next
+ power of two when the reported blocksize is not a power of two.  */
+  if (S_ISREG (sb.st_mode)
+  && blocksize & (blocksize - 1))
+{
+  int leading_zeros = count_leading_zeros_ll (blocksize);
+  if (IDX_MAX < ULLONG_MAX || leading_zeros)
+{
+  unsigned long long power = 1ull << (ULLONG_WIDTH - leading_zeros);
+  if (power <= IDX_MAX)
+blocksize = power;
+}
+}
+
   /* Don’t go above the largest power of two that fits in idx_t and size_t,
  as that is asking for trouble.  */
   return MIN (MIN (IDX_MAX, SIZE_MAX) / 2 + 1,
-  MAX (IO_BUFSIZE, ST_BLKSIZE (sb)));
+  blocksize);
 }
-- 
2.26.2



bug#58521: human readable still wrong output (after 20+ years?)

2023-01-02 Thread Paul Eggert

On 2023-01-02 13:22, Chandler Sobel-Sorenson wrote:


Unfortunately backward-compatibility concerns are real,


Such as? 


I imagine lots of programs read the current output format. GNU 'sort' 
does. I haven't investigated all such programs.


The current behavior has been in place since Larry McVoy contributed it 
in 1996. It'd take a pretty strong argument to change it now. Mere 
dislike wouldn't suffice.


this is about your programs outputting WRONG and INCORRECT values, 
inconsistent with reality


"Inconsistent with reality" is hyperbole; this is merely disagreement 
about notation.



All you need to do is add an "i" to -h's prefixes


That would bloat the output and introduce compatibility concerns as 
mentioned above.






bug#58521: human readable still wrong output (after 20+ years?)

2023-01-02 Thread Paul Eggert

On 2022-10-14 09:24, chandler wrote:

Please no excuses about how this will break other programs


Unfortunately backward-compatibility concerns are real, which means 
we're not likely to make a big change to -h's behavior at this point. 
You can use --si instead.


Assuming you're talking about 'df' and 'du', neither -h nor --si follows 
the SI standard, since they both output prefixes without units. So this 
is not a question of conforming to any standard.






bug#60489: Programs should exit after EIO from FICLONE or copy_file_range()

2023-01-02 Thread Pádraig Brady

On 02/01/2023 13:28, Andreas Schwab wrote:

On Jan 02 2023, Pádraig Brady wrote:


+  /* Note error set is consistent with copy_file_range().  */
+  bool no_clone_attempted = errno == ENOSYS || is_ENOTSUP (errno)
+|| errno == EINVAL || errno == EBADF
+|| errno == EXDEV || errno == ETXTBSY;


Should this be refactored to avoid duplication?


Yes good call.
We should also refactor the handling of clone failure
to also apply to the usage of fclonefileat() on macOS.

Updated patch attached.

cheers,
Pádraig
From d13bde46a48036f17c3fb651977d75b8d3eb7eda Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Mon, 2 Jan 2023 13:07:41 +
Subject: [PATCH] copy: immediately fail with transient errors from FICLONE

* src/copy.c (handle_clone_fail): A new function refactored
from copy_reg() to handle failures from FICLONE or fclonefileat().
(copy_reg): Fail with all errors from FICLONE,
unless they're from the set indicating the file system
or file do not support the clone operation.
Also fail with errors from fclonefileat() if they're
from the set indicating a transient failure for the file.
(sparse_copy): Call the refactored is_CLONENOTSUP()
which is now also used by the new handle_clone_fail() function.
* NEWS: Mention the bug fix.  Also mention explicitly
the older --reflink=auto default change to aid searching.
* cfg.mk: Adjust old_NEWS_hash with `make update-NEWS-hash`.
Fixes https://bugs.gnu.org/60489
---
 NEWS   |  8 +
 cfg.mk |  2 +-
 src/copy.c | 94 --
 3 files changed, 79 insertions(+), 25 deletions(-)

diff --git a/NEWS b/NEWS
index f43182757..0fbfe7b09 100644
--- a/NEWS
+++ b/NEWS
@@ -13,6 +13,13 @@ GNU coreutils NEWS-*- outline -*-
   'cp -rx / /mnt' no longer complains "cannot create directory /mnt/".
   [bug introduced in coreutils-9.1]
 
+  cp, mv, and install, now immediately acknowledge transient errors
+  when creating copy-on-write or cloned reflink files, on supporting
+  file systems like XFS, BTRFS, APFS, etc.
+  Previously they would have tried again with other copy methods
+  which may have resulted in data corruption.
+  [bug introduced in coreutils-7.5 and enabled by default in coreutils-9.0]
+
   'mv --backup=simple f d/' no longer mistakenly backs up d/f to f~.
   [bug introduced in coreutils-9.1]
 
@@ -283,6 +290,7 @@ GNU coreutils NEWS-*- outline -*-
 ** Changes in behavior
 
   cp and install now default to copy-on-write (COW) if available.
+  I.e., cp now uses --reflink=auto mode by default.
 
   cp, install and mv now use the copy_file_range syscall if available.
   Also, they use lseek+SEEK_HOLE rather than ioctl+FS_IOC_FIEMAP on sparse
diff --git a/cfg.mk b/cfg.mk
index 179a51d1c..c9b2d4326 100644
--- a/cfg.mk
+++ b/cfg.mk
@@ -49,7 +49,7 @@ export VERBOSE = yes
 # 4914152 9e
 export XZ_OPT = -8e
 
-old_NEWS_hash = f365e0cf2c7890c617df6abe70615fce
+old_NEWS_hash = 49269d8d99f4888a4e955d28cda50091
 
 # Add an exemption for sc_makefile_at_at_check.
 _makefile_at_at_check_exceptions = ' && !/^cu_install_prog/ && !/dynamic-dep/'
diff --git a/src/copy.c b/src/copy.c
index 310c8bf14..1058b08ec 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -224,6 +224,18 @@ create_hole (int fd, char const *name, bool punch_holes, off_t size)
 }
 
 
+/* Whether the errno from FICLONE, or copy_file_range
+   indicates operation is not supported for this file or file system.  */
+
+static bool
+is_CLONENOTSUP (int err)
+{
+  return err == ENOSYS || is_ENOTSUP (err)
+ || err == EINVAL || err == EBADF
+ || err == EXDEV || err == ETXTBSY;
+}
+
+
 /* Copy the regular file open on SRC_FD/SRC_NAME to DST_FD/DST_NAME,
honoring the MAKE_HOLES setting and using the BUF_SIZE-byte buffer
*ABUF for temporary storage, allocating it lazily if *ABUF is null.
@@ -267,9 +279,7 @@ sparse_copy (int src_fd, int dest_fd, char **abuf, size_t buf_size,
   }
 if (n_copied < 0)
   {
-if (errno == ENOSYS || is_ENOTSUP (errno)
-|| errno == EINVAL || errno == EBADF
-|| errno == EXDEV || errno == ETXTBSY)
+if (is_CLONENOTSUP (errno))
   break;
 
 /* copy_file_range might not be enabled in seccomp filters,
@@ -1064,6 +1074,42 @@ infer_scantype (int fd, struct stat const *sb,
 }
 
 
+/* Handle failure from FICLONE or fclonefileat.
+   Return FALSE if it's a terminal failure for this file.  */
+
+static bool
+handle_clone_fail (int dst_dirfd, char const* dst_relname,
+   char const* src_name, char const* dst_name,
+   int dest_desc, bool new_dst, enum Reflink_type reflink_mode)
+{
+  /* If the clone operation is creating the destination,
+ then don't try and cater for all non transient file system errors,
+ and instead only cater for 

bug#60489: Programs should exit after EIO from FICLONE or copy_file_range()

2023-01-02 Thread Andreas Schwab
On Jan 02 2023, Pádraig Brady wrote:

> +  /* Note error set is consistent with copy_file_range().  */
> +  bool no_clone_attempted = errno == ENOSYS || is_ENOTSUP (errno)
> +|| errno == EINVAL || errno == EBADF
> +|| errno == EXDEV || errno == ETXTBSY;

Should this be refactored to avoid duplication?

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."





bug#60489: Programs should exit after EIO from FICLONE or copy_file_range()

2023-01-02 Thread Pádraig Brady

On 02/01/2023 06:36, Noah Misch wrote:

Because Debian coreutils 9.1-1 "cp" silently falls back to copy_file_range()
after FICLONE reports EIO, "cp" can transfer incorrect bytes.  Linux syscalls
having a file descriptor parameter report EIO after a fault in the underlying
device.  The affected file is not recoverable in the general case, but syscall
outcomes after the EIO don't reflect that.  For example, consider FICLONE
returning EIO for a fault during source file writeback.  The kernel will mark
"clean" the affected page cache entries and clear the EIO state.  If the page
cache evicts those pages, their file offsets revert to the last written-back
values if any, else zeros.  If userspace issues a syscall that bypasses the
page cache, like copy_file_range() or another FICLONE, that syscall clones the
last written-back state or zeroes.  See
https://lore.kernel.org/linux-xfs/20221108172436.ga3613...@rfd.leadboat.com
for a "cp" and "cat" test script, background, and discussion.

I recommend instead reporting the EIO and terminating when FICLONE or
copy_file_range() fails with EIO.  One could argue that ENOSPC also warrants
termination, since no fallback reduces space usage.  For other errno values,
fallback to the next transfer strategy, like today.  An alternative would be
to fallback from FICLONE to copy_file_range() only after known-appropriate
errors EBADF, EINVAL, EOPNOTSUPP, ETXTBSY, and EXDEV.  That alternative wins
if future FICLONE reports an additional termination-deserving errno value.
Since just EIO needs termination today, I bet new errno values are more likely
than not to deserve fallback.  What do you think?


Yes we should have an allow list that we retry for,
and otherwise fail for EIO, ENOSPC, ENOMEM, ...

We already do this for copy_file_range(),
but should treat FICLONE similarly,
as done in the attached.

thanks!
Pádraig
From 4041d93fa895222064d6c20d380e05158bd939e7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Mon, 2 Jan 2023 13:07:41 +
Subject: [PATCH] copy: immediately fail with transient errors from FICLONE

* src/copy.c (copy_reg): Fail with all errors from FICLONE,
unless they're from the set indicating the file system
or file do not support the clone operation.
* NEWS: Mention the bug fix.  Also mention explicitly
the --reflink=auto default change to aid searching.
* cfg.mk: Adjust old_NEWS_hash with `make update-NEWS-hash`.
Fixes https://bugs.gnu.org/60489
---
 NEWS   |  7 +++
 cfg.mk |  2 +-
 src/copy.c | 28 ++--
 3 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/NEWS b/NEWS
index f43182757..94f3b9086 100644
--- a/NEWS
+++ b/NEWS
@@ -13,6 +13,12 @@ GNU coreutils NEWS-*- outline -*-
   'cp -rx / /mnt' no longer complains "cannot create directory /mnt/".
   [bug introduced in coreutils-9.1]
 
+  cp, install, mv now immediately acknowledge transient errors from
+  the FICLONE ioctl, used to create copy-on-write or reflinked clones.
+  Previously they would have tried again with other copy methods
+  which may have resulted in data corruption.
+  [bug introduced in coreutils-7.5 and enabled by default in coreutils-9.0]
+
   'mv --backup=simple f d/' no longer mistakenly backs up d/f to f~.
   [bug introduced in coreutils-9.1]
 
@@ -283,6 +289,7 @@ GNU coreutils NEWS-*- outline -*-
 ** Changes in behavior
 
   cp and install now default to copy-on-write (COW) if available.
+  I.e., cp now uses --reflink=auto mode by default.
 
   cp, install and mv now use the copy_file_range syscall if available.
   Also, they use lseek+SEEK_HOLE rather than ioctl+FS_IOC_FIEMAP on sparse
diff --git a/cfg.mk b/cfg.mk
index 179a51d1c..c9b2d4326 100644
--- a/cfg.mk
+++ b/cfg.mk
@@ -49,7 +49,7 @@ export VERBOSE = yes
 # 4914152 9e
 export XZ_OPT = -8e
 
-old_NEWS_hash = f365e0cf2c7890c617df6abe70615fce
+old_NEWS_hash = 49269d8d99f4888a4e955d28cda50091
 
 # Add an exemption for sc_makefile_at_at_check.
 _makefile_at_at_check_exceptions = ' && !/^cu_install_prog/ && !/dynamic-dep/'
diff --git a/src/copy.c b/src/copy.c
index 310c8bf14..0f599459b 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -267,6 +267,7 @@ sparse_copy (int src_fd, int dest_fd, char **abuf, size_t buf_size,
   }
 if (n_copied < 0)
   {
+/* Note error set is consistent with clone_file().  */
 if (errno == ENOSYS || is_ENOTSUP (errno)
 || errno == EINVAL || errno == EBADF
 || errno == EXDEV || errno == ETXTBSY)
@@ -1275,23 +1276,30 @@ copy_reg (char const *src_name, char const *dst_name,
 {
   if (clone_file (dest_desc, source_desc) == 0)
 data_copy_required = false;
-  else if (x->reflink_mode == REFLINK_ALWAYS)
+  else
 {
-  error (0, errno, _("failed to clone %s from %s"),
- quoteaf_n (0, dst_name), quoteaf_n (1, src_name));
+  /* Note error set 

bug#60489: Programs should exit after EIO from FICLONE or copy_file_range()

2023-01-02 Thread Noah Misch
Because Debian coreutils 9.1-1 "cp" silently falls back to copy_file_range()
after FICLONE reports EIO, "cp" can transfer incorrect bytes.  Linux syscalls
having a file descriptor parameter report EIO after a fault in the underlying
device.  The affected file is not recoverable in the general case, but syscall
outcomes after the EIO don't reflect that.  For example, consider FICLONE
returning EIO for a fault during source file writeback.  The kernel will mark
"clean" the affected page cache entries and clear the EIO state.  If the page
cache evicts those pages, their file offsets revert to the last written-back
values if any, else zeros.  If userspace issues a syscall that bypasses the
page cache, like copy_file_range() or another FICLONE, that syscall clones the
last written-back state or zeroes.  See
https://lore.kernel.org/linux-xfs/20221108172436.ga3613...@rfd.leadboat.com
for a "cp" and "cat" test script, background, and discussion.

I recommend instead reporting the EIO and terminating when FICLONE or
copy_file_range() fails with EIO.  One could argue that ENOSPC also warrants
termination, since no fallback reduces space usage.  For other errno values,
fallback to the next transfer strategy, like today.  An alternative would be
to fallback from FICLONE to copy_file_range() only after known-appropriate
errors EBADF, EINVAL, EOPNOTSUPP, ETXTBSY, and EXDEV.  That alternative wins
if future FICLONE reports an additional termination-deserving errno value.
Since just EIO needs termination today, I bet new errno values are more likely
than not to deserve fallback.  What do you think?

Thanks,
nm