A NOTE has been added to this issue.
======================================================================
https://www.austingroupbugs.net/view.php?id=1649
======================================================================
Reported By:                kre
Assigned To:
======================================================================
Project:                    Issue 8 drafts
Issue ID:                   1649
Category:                   Shell and Utilities
Type:                       Error
Severity:                   Objection
Priority:                   normal
Status:                     Resolved
Name:                       Robert Elz
Organization:
User Reference:
Section:                    XCU 2.6.5
Page Number:                2476
Line Number:                80478 - 80504
Final Accepted Text:        See https://www.austingroupbugs.net/view.php?id=1649#c6483.
Resolution:                 Reopened
Fixed in Version:
======================================================================
Date Submitted:             2023-03-31 01:55 UTC
Last Modified:              2023-09-25 15:43 UTC
======================================================================
Summary:                    Field splitting is woefully under specified, and in places, simply wrong
======================================================================
Relationships       ID      Summary
----------------------------------------------------------------------
related to          0001560 clarify wording of command substitution
======================================================================
----------------------------------------------------------------------
 (0006488) Don Cragun (manager) - 2023-09-25 15:43
 https://www.austingroupbugs.net/view.php?id=1649#c6488
----------------------------------------------------------------------
Replace XCU section 2.6.5 on issue 8 draft 3 P2476, L80478-80504 with:

<blockquote>After parameter expansion (Section 2.6.2), command substitution (Section 2.6.3), and arithmetic expansion (Section 2.6.4), if the shell variable <i>IFS</i> [xref XCU 2.5.3] is set and its value is not empty, or if the <i>IFS</i> variable is unset, the shell shall scan each field containing results of expansions and substitutions that did not occur in double-quotes for field splitting; zero, one or multiple fields can result.

For the remainder of this section, any reference to the results of an expansion, or results of expansions, shall be interpreted to mean the results from one or more unquoted variable or arithmetic expansions, or unquoted command substitutions.

If the <i>IFS</i> variable is set and has an empty string as its value, no field splitting occurs. However, if an input field which contained the results of an expansion is entirely empty, it shall be removed. Note that this occurs before quote removal; any input field that contains any quoting characters can never be empty at this point. After the removal of any such fields from the input, the possibly modified input field list becomes the output.

Each input field is considered in sequence, first to last, with the results of the algorithm described in this section causing output fields to be generated, which remain in the same order as the input fields from which they originated.

Fields which contain no results from expansions shall not be affected by field splitting, and shall remain unaltered, simply moving from the list of input fields to be next in the list of output fields.
In the remainder of this description, it is assumed that there is present in the field at least one expansion result; this assumption will not be restated. Field splitting only ever alters those parts of the field.

For the purposes of this section, the term "<i>IFS</i> white space" shall mean any of the white-space bytes [xref to XBD 3.412, 3.413, and 3.414] <space>, <tab> or <newline> from the Portable Character Set [xref XBD 6.1] which are present in the value of the <i>IFS</i> variable, and perhaps other white-space characters. It is implementation-defined whether other white-space characters which appear in the value of <i>IFS</i> are also considered as "<i>IFS</i> white space". The three characters above specified as <i>IFS</i> white-space bytes are always <i>IFS</i> white space, when they occur in the value of <i>IFS</i>, regardless of whether they are white-space characters in any relevant locale. For other locale-specific white-space characters allowed by the implementation, it is unspecified whether the character is considered as <i>IFS</i> white space if it is white space at the time it is assigned to the <i>IFS</i> variable, or if it is white space at the time field splitting occurs (the locale may have changed between those events).

If the <i>IFS</i> variable is unset, then for the purposes of this section, but without altering the value of the variable, its value shall be considered to contain the three single-byte characters <space>, <tab> and <newline> from the portable character set, all of which are <i>IFS</i> white-space characters.

The shell shall use the byte sequences that form the characters in the value of the <i>IFS</i> variable as delimiters. Each of the characters <space>, <tab> and <newline> which appears in the value of <i>IFS</i> shall be a single-byte delimiter. The shell shall use these delimiters as field terminators to split the results of expansions, along with other adjacent bytes, into separate fields, as described below.
Note that these delimiters terminate a field; they do not, of themselves, cause a new field to start. Subsequent data bytes that are not from the results of an expansion, or that do not form <i>IFS</i> white-space characters, are required for a new field to begin.

Note that the shell processes arbitrary bytes from the input fields; there is no requirement that those bytes form valid characters.

If the result of the algorithm is that no fields are delimited, that is, if the input field is wholly empty or consists entirely of <i>IFS</i> white space, the result shall be zero fields (rather than an empty field).

For the purposes of this section, when a field is said to be delimited, the candidate field, as generated below, shall become an output field. When the algorithm transforms a candidate into an output field it shall be appended to the current list of output fields.

Each field containing the results from an expansion shall be processed in order, intermixed with fields not containing the results of expansions, processed as described above, as if as follows, examining bytes in the input field, from beginning to end:

Begin with an empty candidate field and the input as specified above.

When instructed to start the next iteration of the loop, this is the start of the loop.

While the input (as modified by earlier iterations of this loop) is not empty:<blockquote>Consider the leading remaining byte or byte sequence of the input. No such byte sequence shall contain data such that some bytes in the sequence resulted from an expansion, and others did not, or which contains bytes resulting from the results of more than one expansion.

If the byte or sequence of bytes is:
<ol><li>A byte (or sequence of bytes) in the input that did not result from an expansion:<blockquote>Append this byte (or sequence) to the candidate, and remove it from the input.
Start the next iteration of the loop.</blockquote></li>
<li>A byte sequence in the input which resulted from an expansion that does not form a character in <i>IFS</i>:<blockquote>Append the first byte of the sequence to the candidate, and remove that byte from the input. Start the next iteration of the loop.</blockquote></li>
<li>A byte sequence in the input which resulted from an expansion that forms an <i>IFS</i> white-space character:<blockquote>Remove that byte sequence from the input, consider the new leading input byte sequence, and repeat this step.</blockquote></li>
<li>A byte sequence in the input that resulted from an expansion that forms an <i>IFS</i> character, which is not <i>IFS</i> white space:<blockquote>Remove that byte sequence from the input, but note it was observed.</blockquote></li></ol>

At this point, if the candidate is not empty, or if a sequence of bytes representing an <i>IFS</i> character that is not <i>IFS</i> white space was seen at step 4, then a field is said to have been delimited, and the candidate becomes an output field. Empty (clear) the candidate, and start the next iteration of the loop.</blockquote>

Once the input is empty, the candidate becomes an output field if and only if it is not empty.
The ordered list of output fields so produced, which may be empty, replaces the list of input fields.</blockquote>

Issue History
Date Modified    Username       Field Change
======================================================================
2023-03-31 01:55 kre            New Issue
2023-03-31 01:55 kre            File Added: ifs
2023-03-31 01:55 kre            Name => Robert Elz
2023-03-31 01:55 kre            Section => XCU 2.6.5
2023-03-31 01:55 kre            Page Number => 2476
2023-03-31 01:55 kre            Line Number => 80478 - 80504
2023-07-31 16:13 Don Cragun     Note Added: 0006412
2023-09-07 14:14 kre            Note Added: 0006459
2023-09-07 14:15 kre            Note Added: 0006460
2023-09-07 14:30 kre            Note Added: 0006462
2023-09-07 14:32 kre            Note Added: 0006463
2023-09-07 14:41 kre            Note Deleted: 0006463
2023-09-07 14:43 kre            Note Edited: 0006462
2023-09-07 14:45 kre            Note Added: 0006464
2023-09-07 14:54 kre            Note Added: 0006465
2023-09-07 15:01 kre            Note Added: 0006466
2023-09-07 15:03 kre            Note Added: 0006467
2023-09-07 15:06 kre            File Added: IFS-test
2023-09-07 15:07 kre            File Added: POSIX-bug-1649-impl.sh
2023-09-07 15:09 kre            File Added: Expected-Results
2023-09-07 15:15 kre            Note Added: 0006468
2023-09-07 15:21 kre            Note Added: 0006469
2023-09-07 15:22 kre            Note Edited: 0006469
2023-09-07 16:12 kre            Note Added: 0006471
2023-09-07 16:40 geoffclare     Note Added: 0006472
2023-09-07 16:41 geoffclare     Relationship added related to 0001560
2023-09-07 17:06 kre            Note Added: 0006473
2023-09-07 17:09 kre            Note Edited: 0006473
2023-09-11 03:35 kre            Note Added: 0006477
2023-09-11 03:36 kre            Note Added: 0006478
2023-09-11 03:39 kre            Note Edited: 0006469
2023-09-11 03:59 kre            Note Added: 0006479
2023-09-11 03:59 kre            File Added: Revised-bug-1649-suggestion
2023-09-11 04:18 kre            Note Added: 0006480
2023-09-11 04:20 kre            Note Edited: 0006480
2023-09-21 15:51 Don Cragun     Note Added: 0006482
2023-09-21 15:58 Don Cragun     Note Deleted: 0006482
2023-09-21 16:48 Don Cragun     Note Added: 0006483
2023-09-21 16:52 Don Cragun     Note Edited: 0006483
2023-09-21 17:01 Don Cragun     Note Edited: 0006483
2023-09-21 17:03 Don Cragun     Final Accepted Text => See https://www.austingroupbugs.net/view.php?id=1649#c6483.
2023-09-21 17:03 Don Cragun     Status New => Resolved
2023-09-21 17:03 Don Cragun     Resolution Open => Accepted As Marked
2023-09-21 17:04 Don Cragun     Tag Attached: issue8
2023-09-21 19:28 kre            Note Added: 0006485
2023-09-25 09:43 Don Cragun     Note Edited: 0006483
2023-09-25 09:47 Don Cragun     Note Edited: 0006483
2023-09-25 14:21 Don Cragun     Note Added: 0006487
2023-09-25 14:21 Don Cragun     Resolution Accepted As Marked => Reopened
2023-09-25 15:43 Don Cragun     Note Added: 0006488
======================================================================
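
The boundary cases in the replacement text of note 0006488 (a delimiter terminates a field but does not start one, a non-white-space IFS character delimits an empty field, an input of only IFS white space yields zero fields) can be checked against a given shell with a small sketch like the one below. The helper name ifs_split is hypothetical and is not taken from the note or from the files attached to this issue. It relies on the running shell's own field splitting and assumes IFS holds only single-byte characters; the trailing comments give the results the proposed wording appears to require.

```shell
#!/bin/sh
# ifs_split IFS_VALUE STRING   (hypothetical helper, for illustration only)
# Prints the number of fields produced by splitting STRING with IFS set to
# IFS_VALUE, followed by each field in angle brackets.
# The body is a subshell, so the caller's IFS and options are untouched.
ifs_split() (
    set -f            # no pathname expansion; only field splitting acts on $2
    IFS=$1
    set -- $2         # unquoted expansion: its result is subject to splitting
    printf '%s' "$#"
    for field in "$@"; do
        printf ' <%s>' "$field"
    done
    printf '\n'
)

ifs_split ' :' 'a: b::c '   # 4 <a> <b> <> <c>
ifs_split ':'  'a:'         # 1 <a>      trailing delimiter only terminates
ifs_split ':'  ':a'         # 2 <> <a>   step 4 seen with an empty candidate
ifs_split ':'  'a::'        # 2 <a> <>   second ':' delimits an empty field
ifs_split ' '  '   '        # 0          only IFS white space: zero fields
ifs_split ''   'a b'        # 1 <a b>    IFS set but empty: no splitting
```

A shell that prints something different for any of these lines differs from the proposed text on that case; the sketch does not cover an unset IFS or multi-byte IFS characters.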