Date: Sat, 1 Apr 2023 18:49:56 -0600
From: Felipe Contreras <felipe.contre...@gmail.com>
Message-ID: <camp44s1nv0+4r34_+4zyocvg+81subm_-nr0pphi1b52vzh...@mail.gmail.com>

| Fortunately kre did listen.

Not really. I agree that what POSIX currently says is not correct, which is why the defect report got filed. You may have noticed that no new wording was proposed there, and still hasn't been. That is because this is very hard to get correct, other than possibly by simply giving the code that should be executed (no, that won't happen).

But the others are correct: POSIX (in general) standardises what shells actually do. You can see this if you read a few pages: all kinds of things lead to unspecified (or worse, undefined) behaviour. That's because different implementations do different things in those cases. It is not because some specific behaviour could not be required, nor because requiring it would not be better all around. But implementations don't do the same thing in those cases, and so users cannot rely upon anything particular happening (sometimes behaviour is unspecified, but only between a limited number of choices).

The standard has two purposes. One is to allow application writers (users) to work out what they can expect to work, and what they should not do if they expect code to be portable. The other is so that implementors of new implementations (of the shell, or anything else included) know what to implement (and where they can do things differently).

You're right, when the standard uses "shall" it is being prescriptive, and implementations must do that if they want to claim to conform. But the standard only does that when the existing implementations (at least the major ones that intend to conform), at the time the standard is written, actually do what the "shall" proposes to require.

There are odd occasions (such as the read errors in scripts) where something that (almost all) implementations do is so obviously the wrong thing to do that the standard requires implementations to change. But if you looked at that issue, and I believe you did, that was only done after checking with implementors to see if they were willing to make the change.

In this case, the standard will certainly end up saying that IFS characters (both white space and others; there are differences in how they work, but not in this regard) terminate fields. If there is nothing after the final IFS character (or characters, in the case of IFS white space), then there is no additional field. If there is something there, then that makes an additional field, even if there is no IFS terminator following it. That's because that's what all (or essentially all) shells do, and always have done (for almost 45 years now).

That is, if we have "IFS=," then both a,b,c and a,b,c, produce 3 fields: "a", "b" and "c".

On the other hand, the standard is likely to say that it is unspecified whether characters other than space/tab/newline, which are white space according to the definition of that term in the standard, can be IFS white space. That is because shell implementations are split on the question (about 60/40 for "no", even though the standard currently seems to say "yes"). That is, unless shell implementors can be persuaded to change their implementations, which in this case is probably unlikely (as no-one can be sure that there aren't scripts around which rely upon the current behaviour, and no-one wants to break backward compat).

The effect will probably be that using any white space char in IFS, other than the blessed 3, will make a script non-portable (it might work with one shell, and not another).

kre
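
The "IFS=," case described above can be checked with a short script. This is only a sketch: show_fields is a hypothetical helper name (not something from the standard or from any shell), and the expected output in the comments is what the message says essentially all shells produce.

```shell
# show_fields (hypothetical helper): split $2 using $1 as IFS, then
# print the number of fields followed by each field in brackets.
show_fields() {
    (
        IFS=$1
        set -f       # no pathname expansion, so only field splitting applies
        set -- $2    # unquoted expansion is subject to field splitting
        printf '%d:' "$#"
        for f in "$@"; do
            printf ' [%s]' "$f"
        done
        printf '\n'
    )
}

show_fields , 'a,b,c'     # 3: [a] [b] [c]
show_fields , 'a,b,c,'    # 3: [a] [b] [c]  (the trailing "," only terminates "c")
```

The subshell keeps the IFS and "set -f" changes from leaking into the caller. Running the same helper with a white space character other than space, tab or newline as IFS is the case that may give different results in different shells, so no expected output is shown for it.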