A NOTE has been added to this issue.
======================================================================
https://www.austingroupbugs.net/view.php?id=1924
======================================================================
Reported By:                stephane
Assigned To:
======================================================================
Project:                    1003.1(2024)/Issue8
Issue ID:                   1924
Category:                   Shell and Utilities
Type:                       Error
Severity:                   Objection
Priority:                   normal
Status:                     New
Name:                       Stephane Chazelas
Organization:
User Reference:
Section:                    Shell word splitting and "read" utility
Page Number:                various
Line Number:                various
Interp Status:              ---
Final Accepted Text:
======================================================================
Date Submitted:             2025-05-05 19:02 UTC
Last Modified:              2025-05-15 15:14 UTC
======================================================================
Summary:                    New word splitting requirements inappropriate in
locales with non-self-synchronising character encodings
======================================================================
----------------------------------------------------------------------
 (0007183) geoffclare (manager) - 2025-05-15 15:14
 https://www.austingroupbugs.net/view.php?id=1924#c7183
----------------------------------------------------------------------
After page 79 line 2388 section 3 Definitions, add:

<b>3.328 Self-synchronizing Character Encoding</b>
<blockquote>A character encoding in which no contiguous subset of bytes from the encoding of any one character or two adjacent characters can also represent the encoding of any valid character on its own.</blockquote>
and renumber the later subsections.

On page 2481 line 80454 section 2.5.3 Shell Variables (IFS), after:
<blockquote>If the value of <i>IFS</i> includes any bytes that do not form part of a valid character, the results of field splitting, expansion of '*', and use of the <i>read</i> utility are unspecified.</blockquote>
add a sentence:
<blockquote>If the character encoding used for the characters in <i>IFS</i> is not self-synchronizing and the value of <i>IFS</i> includes any character for which the byte encoding can overlap with the byte encoding of any other sequence of characters, the results of field splitting, expansion of '*', and use of the <i>read</i> utility are unspecified. (Note: the UTF-8 encoding is self-synchronizing, meaning that no character's encoding can be confused with any other sequence of characters, and thus does not trigger this exception.)</blockquote>

Issue History
Date Modified    Username       Field                    Change
======================================================================
2025-05-05 19:02 stephane       New Issue
2025-05-15 15:14 geoffclare     Note Added: 0007183
======================================================================
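
Illustration (not part of the bug note): the overlap condition in the proposed IFS wording can be sketched in Python. This is only a rough sketch under stated assumptions. The helper names `char_boundaries` and `overlaps` are hypothetical, the check assumes a stateless encoding so that characters can be encoded one at a time, and Windows code page 932 (a Shift-JIS variant, Python codec name `cp932`) is chosen here as an example of an encoding that is not self-synchronizing. It is not taken from the issue text. In that encoding the second byte of KATAKANA LETTER SO is 0x5C, the same byte that encodes a backslash on its own.

```python
def char_boundaries(text: str, encoding: str) -> set[int]:
    """Byte offsets at which a character of `text` starts or ends.

    Assumes a stateless encoding, so each character can be encoded alone.
    """
    offsets = {0}
    pos = 0
    for ch in text:
        pos += len(ch.encode(encoding))
        offsets.add(pos)
    return offsets


def overlaps(delim: str, text: str, encoding: str) -> bool:
    """True if the bytes of `delim` occur inside the bytes of `text` at a
    position that does not line up with character boundaries of `text`."""
    needle = delim.encode(encoding)
    if not needle:
        return False
    haystack = text.encode(encoding)
    bounds = char_boundaries(text, encoding)
    start = haystack.find(needle)
    while start != -1:
        # A match is genuine only if it both starts and ends on a boundary.
        if start not in bounds or start + len(needle) not in bounds:
            return True
        start = haystack.find(needle, start + 1)
    return False


if __name__ == "__main__":
    so = "\u30bd"  # KATAKANA LETTER SO
    print(overlaps("\\", so, "cp932"))      # True: 0x5C is the trail byte
    print(overlaps("\\", so, "utf-8"))      # False: no 0x5C byte in UTF-8 form
    print(so.encode("cp932").split(b"\\"))  # [b'\x83', b'']: character torn apart
```

A splitter that works byte by byte on a backslash delimiter would cut this character in half under `cp932`, which is the kind of result the added sentence leaves unspecified. The same input under UTF-8 has no such overlap.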
