A NOTE has been added to this issue.
======================================================================
https://www.austingroupbugs.net/view.php?id=1778
======================================================================
Reported By:                kre
Assigned To:
======================================================================
Project:                    Issue 8 drafts
Issue ID:                   1778
Category:                   Shell and Utilities
Type:                       Enhancement Request
Severity:                   Objection
Priority:                   normal
Status:                     New
Name:                       Robert Elz
Organization:
User Reference:
Section:                    XCU 3/read
Page Number:                3291-3294
Line Number:                111869-111878, 111961-111963, 11946, 11979-11980
Final Accepted Text:
======================================================================
Date Submitted:             2023-10-02 13:58 UTC
Last Modified:              2023-10-16 21:43 UTC
======================================================================
Summary:                    The read utility needs field splitting
updates/corrections )and a little more)
======================================================================
Relationships       ID      Summary
----------------------------------------------------------------------
related to          0001649 Field splitting is woefully under speci...
======================================================================
----------------------------------------------------------------------
 (0006537) kre (reporter) - 2023-10-16 21:43
 https://www.austingroupbugs.net/view.php?id=1778#c6537
----------------------------------------------------------------------
Re https://www.austingroupbugs.net/view.php?id=1778#c6533

> I can imagine situations in which an application writer might choose
> to do that, and the choice would be reasonable.

I cannot, or not in any case where using IFS would not also be acceptable (perhaps even more acceptable than altering the locale vars).

In the case in the example you gave

    IFS="" read -r LANG

there is no problem, as field splitting never happens in this case, and so changes of locale cannot generate unspecified results. One could also do

    IFS= ; read -r IFS

where I made a separate assignment to IFS, rather than using a var-assign, purely because the effects of a var-assign and an assignment to the same variable in a non-special builtin are not very well defined (many shells will simply reset the var-assigned variable to its value in the shell env from before the command was executed, nullifying any change the builtin might have made to it). That issue, however, has nothing to do with the current one (in this example, returning IFS to some other more appropriate value for the commands that follow is assumed to be done elsewhere if required). Similarly the -r is irrelevant to this; whether \ escaping (and line joining) happens is also immaterial.

Where things get weird is in a case like:

    unset IFS
    read [-r] LANG var1 var2 LANG var3 var4

> Doesn't my suggestion at the end of
> https://www.austingroupbugs.net/view.php?id=1778#c6530
> adequately distinguish cases

I have two issues with it.

First, it is too complex, and very few people are going to be able to work out just when it applies. Worse, as a script writer, how am I supposed to know whether the user-supplied data being read is going to:

    change how the bytes in IFS form characters or which characters
    in IFS are considered to be IFS white space

?? If the script writer knew what value was in the data, they wouldn't use read to set it, they'd just do LANG=xxx and be done.

Beyond that, the https://www.austingroupbugs.net/view.php?id=1778#c6526 text doesn't specify whether vars are assigned as the fields are split, or after all the field splitting has completed. I understand why, as shells exist which do it both ways, but it means users cannot rely upon it doing anything useful at all for field splitting during the read command, nor can they assume it won't (even if the script writer somehow believes that the conditions in the https://www.austingroupbugs.net/view.php?id=1778#c6530 text are met, and so no effect upon the field splitting results was planned).

Last, for this, this technique

    while IFS="" read -r LANG
    do
        .. validate $LANG ...

is OK for most variables, where in the time between when the var is set by read and the time it is validated, we know the var is not going to be used (as it is only used when expanded in the script). For the variables the shell uses itself, this is a horrible and very unreliable technique, and should never even be hinted at as acceptable practice. The "validate $LANG" (unspecified there how it is done) is likely to use some comparisons using test, or pattern matching in a case statement, for which the value of LANG can affect the results (different collating sequences, different char orders for [a-z] type expressions, lots of possible variations).

The only safe way to update any of the variables the shell actually uses as part of its processing (PATH, IFS, all the locale vars, OPTIND, etc), based upon externally supplied input (however it is supplied: as a command line arg, via read, from the results of a command substitution which is just returning user data (like $(sed 1q < file)), or anything else), is to validate the data first, and if it is OK, assign it to the variable afterwards.

So:

> True, but why force them to do that if there's no need?

There is a need. In your example, as you wrote it, what LC_MESSAGES file is going to be used for any message from the "validate $LANG" code, if that discovers there is a problem? I suppose the script is supposed to reset things to LANG=C in that case. But then we'd have lost the value that we consider faulty, which we'd want to include in the error message, so we'd need to have another variable and do

    BADLANG=$LANG
    LANG=C
    printf whatever ... "${BADLANG}" ...

so in practice, we're saving nothing: the extra var is still needed, we're making the validation process almost impossible to code, and making the language in the standard almost incomprehensible, and all for what purpose?

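For illustration only, the "validate first, assign afterwards" approach described in this note might be sketched as follows. The function names is_valid_lang and set_lang_from_input, the variable newlang, and the set of characters accepted in a locale name are all assumptions made for this sketch; none of them come from the standard or from this issue.

```shell
#!/bin/sh
# Hypothetical validator: accept C, POSIX, or a name built only from a
# conservative character set. The accepted set is an assumption.
# The bracket ranges are evaluated in the script's current (still
# trusted) locale, because LANG has not been touched at this point.
is_valid_lang() {
    case $1 in
        C | POSIX) return 0 ;;
        '' | *[!A-Za-z0-9_.@-]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical reader: read one line into a scratch variable, validate
# it, and only then assign it to LANG.
set_lang_from_input() {
    # Empty IFS means no field splitting; -r keeps backslashes literal.
    IFS= read -r newlang || return 1
    if is_valid_lang "$newlang"; then
        LANG=$newlang
    else
        # LANG is unchanged here, so this message is produced under the
        # old locale, and the rejected value is still available.
        printf 'rejected locale name: %s\n' "$newlang" >&2
        return 1
    fi
}
```

A caller would invoke set_lang_from_input with its standard input redirected from the data source. Calling it as the last element of a pipeline is best avoided, since many shells run that element in a subshell and the assignment to LANG would then be lost.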
Issue History
Date Modified    Username       Field                    Change
======================================================================
2023-10-02 13:58 kre            New Issue
2023-10-02 13:58 kre            Name                      => Robert Elz
2023-10-02 13:58 kre            Section                   => XCU 3/read
2023-10-02 13:58 kre            Page Number               => 3291-3294
2023-10-02 13:58 kre            Line Number               => 111869-111878, 111961-111963, 11946, 11979-11980
2023-10-02 14:00 kre            Note Added: 0006500
2023-10-02 14:02 kre            Note Edited: 0006500
2023-10-02 14:20 kre            Note Added: 0006502
2023-10-02 14:22 kre            Note Edited: 0006502
2023-10-02 14:24 kre            Note Edited: 0006502
2023-10-02 14:30 kre            Note Added: 0006503
2023-10-02 14:33 kre            Note Edited: 0006502
2023-10-02 14:34 kre            Note Edited: 0006502
2023-10-02 14:44 kre            Note Edited: 0006500
2023-10-02 16:18 geoffclare     Project                  1003.1(2013)/Issue7+TC1 => Issue 8 drafts
2023-10-02 16:19 geoffclare     version                   => Draft 3
2023-10-02 16:20 geoffclare     Note Added: 0006507
2023-10-02 16:21 geoffclare     Relationship added       related to 0001649
2023-10-03 09:23 geoffclare     Note Added: 0006509
2023-10-03 11:59 kre            Note Added: 0006510
2023-10-03 12:11 kre            Note Added: 0006511
2023-10-03 13:42 geoffclare     Note Added: 0006512
2023-10-04 14:42 kre            Note Added: 0006513
2023-10-04 17:36 kre            Note Added: 0006514
2023-10-05 08:59 geoffclare     Note Added: 0006515
2023-10-05 11:08 kre            Note Added: 0006516
2023-10-05 14:53 geoffclare     Note Added: 0006517
2023-10-05 15:34 kre            Note Edited: 0006516
2023-10-05 15:35 kre            Note Edited: 0006516
2023-10-05 15:36 kre            Note Edited: 0006516
2023-10-05 15:38 kre            Note Added: 0006519
2023-10-05 15:41 kre            Note Edited: 0006513
2023-10-06 07:58 geoffclare     Note Edited: 0006515
2023-10-06 21:41 chet_ramey     Note Added: 0006524
2023-10-10 11:06 geoffclare     Note Edited: 0006515
2023-10-10 14:36 geoffclare     Note Added: 0006526
2023-10-10 17:19 kre            Note Added: 0006527
2023-10-11 08:04 geoffclare     Note Added: 0006528
2023-10-11 17:13 kre            Note Added: 0006529
2023-10-12 08:30 geoffclare     Note Added: 0006530
2023-10-12 18:33 kre            Note Added: 0006531
2023-10-12 18:45 kre            Note Edited: 0006516
2023-10-12 18:54 kre            Note Added: 0006532
2023-10-16 09:02 geoffclare     Note Added: 0006533
2023-10-16 09:03 geoffclare     Note Edited: 0006533
2023-10-16 09:19 geoffclare     Note Added: 0006534
2023-10-16 21:43 kre            Note Added: 0006537
======================================================================