* doc/coreutils.texi (dd invocation): Document how the 'lcase' and
'ucase' conversions of 'dd' handle multibyte characters and how they
combine with 'ascii', 'ebcdic', and 'ibm'.  POSIX leaves both
unspecified, which a future POSIX release will document [1].

[1] https://austingroupbugs.net/view.php?id=1959
---
 doc/coreutils.texi | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index d37cf2471..8ae81e110 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -9280,6 +9280,17 @@ @node dd invocation
 
 The @samp{lcase} and @samp{ucase} conversions are mutually exclusive.
 
+@c https://austingroupbugs.net/view.php?id=1959
+POSIX leaves the behavior of @samp{lcase} and @samp{ucase} unspecified
+on multibyte characters.  GNU @command{dd} converts only one byte at a
+time, because multibyte characters may cross block boundaries and case
+conversion may change the length of characters.
+
+POSIX also leaves the behavior of @samp{lcase} and @samp{ucase}
+unspecified if used with @samp{ascii}, @samp{ebcdic}, or @samp{ibm}.
+GNU @command{dd} performs the case conversion first and then the
+character set conversion.
+
 @item sparse
 @opindex sparse
 Try to seek rather than write NUL output blocks.
-- 
2.52.0

