    Date:        Fri, 23 Aug 2024 23:47:06 +0200
    From:        Steffen Nurpmeso <stef...@sdaoden.eu>
    Message-ID:  <20240823214706.oskn9OEF@steffen%sdaoden.eu>

  | So IFS whitespace only if part of $IFS.

That is the definition of IFS whitespace.

  | So this "adjacent" even if *not* part of $IFS.

No, only characters that are in IFS are ever delimiters (really
terminators).

  | So this means that *regardless* of whatever $IFS is, the three IFS
  | whitespace characters are $IFS anyway *if* that is set to
  | a non-empty non-default value.

No.  Only if they are in IFS.  If we have IFS=': ' then colon and
space are IFS characters, space is IFS whitespace, and tab and newline
are simply characters.

What is important about space (0x20), tab (0x09) and newline (0x0a) is
that if they appear in IFS, they are IFS whitespace.  Whether other
characters for which isspace() might return true (or the wide equivalent
thereof where appropriate) are IFS whitespace or not is implementation
defined (and usually, not).

Since you're clearly looking at the new (Issue 8) standard, look at
the 6th paragraph of XCU 2.6.5, which starts "For the purposes of this
section,..." and goes on to define exactly what the term "IFS whitespace"
means.

  | If the value of IFS is null, no word splitting occurs.

Correct.

  | I have to say i still have a lot of problems wrapping my head
  | against the term

Almost everyone has problems understanding how field splitting
really works.  It is odd (historically odd).

  | It seems to me, now, that the actual point here is that IFS
  | whitespace can give no empty output in say a IFS=: case, whereas
  | the colon in $IFS *can* create empty output tokens.

First, fields, not tokens.  Tokens are what passes from lexical
analysis to the parser (and IFS has nothing whatever to do with that) -
there is no parsing happening here (doing field splitting).

But yes.  With F=foo::bar and G='foo  bar' (two spaces), then
with IFS=' :' (the order only matters when expanding "$*")
argc $F gives 3 and argc $G gives 2.
[argc is just argc() { echo $#; }]

If H were ' foo: : bar: ' argc $H would be 3 ("foo", "" and "bar").
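
The examples above can be tried directly.  What follows is only a sketch,
assuming a POSIX-style sh such as dash or bash (zsh does not field split
unquoted expansions by default).  The variable T and the final null-IFS
check are additions for illustration; they are not from the original
examples.

```shell
#!/bin/sh
# argc prints how many fields the shell produced from its unquoted arguments.
argc() { echo $#; }

IFS=' :'            # space is IFS whitespace; colon is a non-whitespace IFS character

F=foo::bar
G='foo  bar'        # two spaces
H=' foo: : bar: '

argc $F             # 3: "foo", "" and "bar" (each colon terminates a field)
argc $G             # 2: a run of IFS whitespace is a single delimiter
argc $H             # 3: "foo", "" and "bar"

# Illustrative addition: tab is not in IFS here, so it is an ordinary
# character and stays inside the field.
T=$(printf 'a\tb c')
argc $T             # 2: "a<TAB>b" and "c"

# Illustrative addition: with a null IFS no field splitting occurs at all.
IFS=''
argc $G             # 1
```

In $H the leading and trailing spaces produce no fields of their own, and
the " : " in the middle is one delimiter (a colon with adjacent IFS
whitespace) that terminates an empty field.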

kre

ps: the hope was that the text that is now in 2.6.5 would finally be
explicit enough that it would be possible to read that, completely,
and then implement it properly.