The following issue has been SUBMITTED.
======================================================================
https://www.austingroupbugs.net/view.php?id=1920
======================================================================
Reported By:                stephane
Assigned To:
======================================================================
Project:                    1003.1(2013)/Issue7+TC1
Issue ID:                   1920
Category:                   Shell and Utilities
Type:                       Omission
Severity:                   Objection
Priority:                   normal
Status:                     New
Name:                       Stephane Chazelas
Organization:
User Reference:
Section:                    read utility, stdin section
Page Number:                3321
Line Number:                112915
Interp Status:              ---
Final Accepted Text:
======================================================================
Date Submitted:             2025-04-21 07:16 UTC
Last Modified:              2025-04-21 07:16 UTC
======================================================================
Summary:                    read -d '' on invalid text without -r and IFS=
Description:
Issue 8 added the -d option to the read utility as a resolution of
https://www.austingroupbugs.net/view.php?id=243 to be able to read
records from the output of find -print0 reliably with
IFS= read -rd '' pathname.

That includes, in the STDIN section:

> If the -d delim option is specified and delim is the null string,
> the standard input shall contain zero or more bytes (which
> need not form valid characters).

However, if -r is not also specified, that leaves it unclear how an
implementation should identify backslash characters (as used to escape
separators and the delimiter) and, if IFS is unset or non-empty, how it
should locate separators in non-text input.

That also implies that shells are required to store variable values
internally as the raw byte encoding, not as decoded text like yash does,
for instance (but then again, there are other parts of the specification
that imply it as well, and IMO yash's approach is not sustainable and
should be discouraged).

Desired Action:
Change to:

> If the -d delim option is specified and delim is the null string, the -r option is specified
> and IFS is set to the empty string, the standard input shall contain zero or more bytes
> (which need not form valid characters).

Instead of "IFS is set to the empty string", I guess we could make it
"IFS is empty or contains only single-byte characters among those whose
encoding is guaranteed not to be found in that of multibyte characters"
(period, slash, newline, carriage return).

Shell variables, environment variables and command line arguments being
required to be able to contain non-text would likely warrant a separate
ticket.
======================================================================

Issue History
Date Modified    Username       Field                    Change
======================================================================
2025-04-21 07:16 stephane       New Issue
======================================================================
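
A minimal sketch of the two forms discussed in the Description, assuming a shell whose read builtin already supports -d (bash, for example) and input that is valid text. It contrasts IFS= read -rd '' with a plain read -d '' under the default IFS. The temporary file, the sample records, and the variable names raw and cooked are hypothetical and chosen only for illustration. How either form should treat bytes that do not form valid characters is the question this report raises, so the sketch does not attempt to show it.

```shell
# Assumption: the shell's read supports -d (e.g. bash); not every sh does yet.
tmp=$(mktemp) || exit 1

# Two NUL-delimited records: one padded with spaces, one with a backslash.
printf '%s\0' ' padded ' 'back\slash' > "$tmp"

# Form from the report: empty IFS and -r, so no field splitting and no
# backslash processing is applied to the record.
raw=
while IFS= read -rd '' rec; do
    raw="$raw[$rec]"
done < "$tmp"

# Without -r and with the default IFS, read must locate backslashes and
# IFS white space inside the record before storing it.
cooked=
while read -d '' rec; do
    cooked="$cooked[$rec]"
done < "$tmp"

rm -f "$tmp"
printf 'raw:    %s\n' "$raw"
printf 'cooked: %s\n' "$cooked"
```

With bash, raw keeps the surrounding spaces and the backslash, while cooked loses both. That extra processing in the second loop is what has no clear definition once the input is not valid text.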
