A NOTE has been added to this issue.
======================================================================
https://www.austingroupbugs.net/view.php?id=1920
======================================================================
Reported By:                stephane
Assigned To:
======================================================================
Project:                    1003.1(2013)/Issue7+TC1
Issue ID:                   1920
Category:                   Shell and Utilities
Type:                       Omission
Severity:                   Objection
Priority:                   normal
Status:                     New
Name:                       Stephane Chazelas
Organization:
User Reference:
Section:                    read utility, stdin section
Page Number:                3321
Line Number:                112915
Interp Status:              ---
Final Accepted Text:
======================================================================
Date Submitted:             2025-04-21 07:16 UTC
Last Modified:              2025-04-23 16:47 UTC
======================================================================
Summary:                    read -d '' on invalid text without -r and IFS=
======================================================================
----------------------------------------------------------------------
 (0007144) stephane (reporter) - 2025-04-23 16:47
 https://www.austingroupbugs.net/view.php?id=1920#c7144
----------------------------------------------------------------------
Re: https://www.austingroupbugs.net/view.php?id=1920#c7143

The "otherwise" is:

- if IFS is not empty (or is unset, which is equivalent to the default IFS value), or at least if it contains characters other than dot, slash, CR, LF. In that case, how would you do IFS splitting (which is defined in terms of characters, not bytes) on non-text?

- or if -r is not provided. Otherwise, how would you identify backslash characters (considering that in several charmaps, the encoding of backslash is also found within the encoding of many other characters)?

- or if the delimiter (as passed to -d, defaulting to LF) is something other than dot, slash, CR, LF or the empty string (which means the NUL character, with its guaranteed single 0 byte encoding), as again that character could have an encoding found within that of other characters.

"IFS= read -rd '' filename" is in practice the only reliable way to read an arbitrary file path (if we ignore bugs in some versions of bash, as recently reported, which led me to submit this ticket).

read -d was first introduced in ksh93, and is also found in bash and zsh. I believe all 3 were (at least initially) treating the argument as a byte and were looking for it in the input *before* decoding the input into text (to look for backslashes and IFS characters).

I wouldn't personally object to (and would probably even approve of) POSIX mandating that behaviour even for delimiter values other than NUL, dot, slash, CR, LF (the ones whose encoding is guaranteed to be single byte and not found in the encoding of any other character in the locale), even if in practice that may lead to unwanted behaviour in locales that use GB18030, BIG5 and BIG5-HKSCS (and maybe others).
In practice those character encodings are not workable anyway, and the mere fact of /enabling/ locales with those charmaps (let alone using them) is a sure way to introduce security vulnerabilities on a system.

Issue History
Date Modified    Username       Field                    Change
======================================================================
2025-04-21 07:16 stephane       New Issue
2025-04-21 07:30 stephane       Note Added: 0007139
2025-04-21 07:38 stephane       Note Edited: 0007139
2025-04-22 14:46 geoffclare     Note Added: 0007140
2025-04-22 15:20 hvd            Note Added: 0007141
2025-04-22 18:45 chet_ramey     Note Added: 0007142
2025-04-23 14:59 dwheeler       Note Added: 0007143
2025-04-23 16:47 stephane       Note Added: 0007144
======================================================================
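
For readers who want to try the idiom quoted in note 0007144, here is a minimal sketch of "IFS= read -rd ''" used in a loop over NUL-delimited records, followed by a contrast case with the default IFS and no -r. It assumes bash (read -d is not available in every sh). The helper name collect_paths, the array name paths and the sample records are made up for this illustration and do not come from the ticket.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; requires a shell that supports read -d (bash here).

# collect_paths: hypothetical helper that reads NUL-delimited records from
# stdin into the array "paths".
collect_paths() {
  paths=()
  local path
  # IFS=   -> no field splitting, so leading/trailing blanks are preserved
  # -r     -> backslash is treated as an ordinary character
  # -d ''  -> records end at the NUL byte
  while IFS= read -r -d '' path; do
    paths+=("$path")
  done
}

# Made-up sample records: a leading space, a backslash and an embedded newline.
collect_paths < <(printf '%s\0' ' leading space' 'back\slash' $'new\nline')
printf 'records read: %s\n' "${#paths[@]}"

# Contrast: with the default IFS (unset behaves like the default) and no -r,
# the record is altered: surrounding blanks are trimmed and the backslash is
# consumed as an escape character.
unset IFS
read -d '' altered < <(printf '%s\0' ' a\b ')
printf 'altered record: [%s]\n' "$altered"
```

The first part keeps every record byte-for-byte. The second part shows why dropping either IFS= or -r forces the shell to interpret the input as characters, which is the situation the note describes for input that is not valid text.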
