A NOTE has been added to this issue.
======================================================================
https://www.austingroupbugs.net/view.php?id=1920
======================================================================
Reported By:                stephane
Assigned To:
======================================================================
Project:                    1003.1(2013)/Issue7+TC1
Issue ID:                   1920
Category:                   Shell and Utilities
Type:                       Omission
Severity:                   Objection
Priority:                   normal
Status:                     New
Name:                       Stephane Chazelas
Organization:
User Reference:
Section:                    read utility, stdin section
Page Number:                3321
Line Number:                112915
Interp Status:              ---
Final Accepted Text:
======================================================================
Date Submitted:             2025-04-21 07:16 UTC
Last Modified:              2025-04-23 14:59 UTC
======================================================================
Summary:                    read -d '' on invalid text without -r and IFS=
======================================================================
----------------------------------------------------------------------
 (0007143) dwheeler (reporter) - 2025-04-23 14:59
 https://www.austingroupbugs.net/view.php?id=1920#c7143
----------------------------------------------------------------------
This proposed text can't be correct:

> Otherwise, the input shall be text except that it is
> not required to end in a newline character and lines are not
> limited to {LINE_MAX} bytes in length.

This proposal can't be correct because you CANNOT presume the input is
text in the case of read -rd ''. The expected use case of read -rd ''
is to handle arbitrary pathnames which are NOT necessarily text at all.
Pathnames are sequences of bytes, and the spec never guarantees that
those byte sequences are text.

Using read with -d '' but without -r is generally a mistake. Once you
use -d '', you can't assume the fields are "text", and the lack of "-r"
presumes that the input is text and that we know what its encoding is.

The *obvious* solution would be to allow implementations to silently
enable '-r' whenever -d is passed an empty string. Frankly, I'd also
allow returning a failure when using read with -d '' but without -r, as
it's not really a sensible combination. If you assume "all of the world
is UTF-8" I guess it's easy to implement, but it's not clear *why* you
would do it :-).

Instead, after:

> If the -d delim option is specified and delim is the null string,
> the standard input shall contain zero or more bytes (which
> need not form valid characters).

I would add:

> In this case, implementations MAY act as if -r was also used
> or return an error.

In some future version of the spec I would be happy to add a read "-0"
option that did both -d '' and -r. It's good to make security-relevant
options easy.
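
For illustration only (this is not part of the proposed wording), here is a minimal sketch of the use case the note describes: reading NUL-terminated records with and without -r. It assumes bash, which already implements read -d; the names sample, raw, cooked and rec, and the sample records themselves, are hypothetical.

```shell
# Sketch only: assumes bash, which already implements read -d.
# sample, raw, cooked and rec are hypothetical names for illustration.

sample=$(mktemp)

# Three NUL-terminated records: one containing a backslash, one with
# leading and trailing blanks, and one with an embedded newline.
printf '%s\0' 'a\b' '  padded  ' 'two
lines' > "$sample"

# Recommended form: IFS= disables field splitting and trimming,
# -r disables backslash processing, -d '' uses NUL as the delimiter.
raw=()
while IFS= read -r -d '' rec; do
    raw+=("$rec")
done < "$sample"

# Same loop without -r and without IFS=: the records are still split
# on NUL, but backslashes and surrounding blanks are interpreted.
cooked=()
while read -d '' rec; do
    cooked+=("$rec")
done < "$sample"

rm -f "$sample"

printf 'raw:    <%s>\n' "${raw[@]}"
printf 'cooked: <%s>\n' "${cooked[@]}"
```

In bash, the first loop returns each record byte for byte, while the second drops the backslash from the first record and trims the blanks around the second. Other shells may behave differently in the second case, which is the ambiguity this issue is about.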

Issue History
Date Modified    Username       Field                    Change
======================================================================
2025-04-21 07:16 stephane       New Issue
2025-04-21 07:30 stephane       Note Added: 0007139
2025-04-21 07:38 stephane       Note Edited: 0007139
2025-04-22 14:46 geoffclare     Note Added: 0007140
2025-04-22 15:20 hvd            Note Added: 0007141
2025-04-22 18:45 chet_ramey     Note Added: 0007142
2025-04-23 14:59 dwheeler       Note Added: 0007143
======================================================================
