Re: variable set in exec'ing shell cannot be unset by child shell
On Sun, 15 Oct 2023, 02:03 Robert Elz wrote:
> Date: Sat, 14 Oct 2023 14:46:12 +1000
> From: Martin D Kealey
> Message-ID: a2+3nnknhm5a+...@mail.gmail.com>
>
> | Back when I used the Bourne Shell we didn't have `local`, so we used to
> | write `var= func` to make sure that `func` couldn't mess with *our* `var`.
>
> If you were using the original Bourne shell you couldn't have done that,
> as it had no functions either.

Fair point, it was just whatever was /bin/sh on Ultrix at the time. I was a uni student so I don't even know what version of Ultrix we were using.

I take your point that the Shell (and especially Bash) has grown Frankenfeatures way beyond a mere command interpreter, in ways that are fundamentally irreconcilable. But I don't think sticking to our guns about "let's go back to simple" is the best way forward.

The one thing to be said for the Shell is that it's universal. If we kill it, what will take its place? I already have to install Bash, Awk, Perl, Python, and Node just to have a running system. How many more will be needed after Bash finally dies?

If the Shell is left out in no man's land, with a shortfall in features so it can't be a "real" programming language, but at the same time with the crazy complexity for users to learn, we pretty much doom it to extinction.

If the Shell is truly a moribund legacy language, we should stop changing it. No new features. No "bug fixes". No new safety guards.

Or we design a new language that feels more like a regular programming language even if its syntax is weird. In my opinion it should have:

 - proper per-package feature selection;
 - proper lexically scoped variables & functions;
 - opt-in rather than opt-out globbing & word splitting;
 - opt-in rather than opt-out filedescriptor inheritance;
 - strongly typed variables, with string/number/array/compound/filehandle values;
 - distinguishable binary (octet-stream) and text (Unicode/utf-8), with support for null bytes in strings, and a Cstring attribute to prohibit assignments that include null bytes (because execve is so central to everything);
 - support for AF_LOCAL sockets as bidirectional pipes;
 - exceptions separate from exit-status, with ability to enrol some but not all commands for the set-e treatment.

Yes that's far too much work for one person; I do not expect Chet to do all this, I expect there to be a governance team.
Re: variable set in exec'ing shell cannot be unset by child shell
On Sun, 15 Oct 2023, 03:15 Ti Strga wrote:
> On Fri, Oct 13, 2023 at 5:59 PM Grisha Levit wrote:
> > IMHO you'd be better off just putting a `{` line at the start and `}`
> > line at the end of your scripts
>
> The big weakness of the "{}" approach is that if a writer forgets to do
> that, there's no way to detect it until a script is modified and the
> running one crashes. But in the case of cloning, we can add such explicit
> test-and-detection for "did you forget to trigger the cloning" in the few
> scripts that really, really need it.

I think I would attack this from an entirely different angle: what about simply modifying Bash so that it slurps in the entire file upon opening it? You could even hide that inside an LD_PRELOAD module so you don't have to recompile Bash, and so that it's inherited automatically.

-Martin
Re: variable set in exec'ing shell cannot be unset by child shell
On Sun, 15 Oct 2023, 03:05 Greg Wooledge wrote:
> On Sat, Oct 14, 2023 at 12:55:21PM -0400, Ti Strga wrote:
> > it's just the "[[ -v foo ]]" tests to see where along the cloning
> > process we are.
>
> *Shudder*

Likewise, b.

> If the *real* goal is to overwrite a running script with a new version of
> itself, and then re-exec it, then the correct solution is to wrap the
> script in a single compound command so that it gets read and parsed up
> front, before beginning execution of the main loop. Either wrap the whole
> thing in "{" ... "}" as Grisha suggested, or wrap the whole thing in a
> "main()" function and then call main "$@".

Agreed.

Either way, don't forget to put "exit;" just before the closing "}". Or write « exec main "$@" ». (For good measure I would also make sure it's a valid posix text file with a terminal newline, so that cat rubbish >> script can't break it.)

Personally I don't much care for the main "$@" style as it makes an extra copy of argv for no particularly good reason, and Shell Is Not C™; but it's better than allowing the script to blow up with parse errors after it's started running.

-Martin
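A minimal sketch of the "exit; just before the closing }" advice, under stated assumptions: the demo script, its contents and the variable names (`guarded`, `guarded_output`, `guarded_status`) are made up for illustration. The demo writes a brace-wrapped script to a temporary file, appends a line that is not valid shell, and runs it with bash. Because the body is a single compound command that ends in `exit`, the appended line should never be read.

```shell
# Build a throwaway script whose whole body is one compound command.
guarded=$(mktemp)
cat > "$guarded" <<'EOF'
{
    echo "main body ran"
    # Leave the shell before it can read anything after the closing brace.
    exit 0
}
EOF

# Simulate `cat rubbish >> script`: append a line that would not parse.
printf 'this ( is not valid shell\n' >> "$guarded"

# Run the damaged script and record what happened.
guarded_output=$(bash "$guarded" 2>&1)
guarded_status=$?
rm -f "$guarded"

echo "$guarded_output (exit status $guarded_status)"
```

Without the `exit`, bash would carry on past the closing brace and trip over the appended line.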
Re: variable set in exec'ing shell cannot be unset by child shell
On Fri, Oct 13, 2023 at 5:59 PM Grisha Levit wrote:
> On Fri, Oct 13, 2023, 10:03 Ti Strga wrote:
>>
>> [*] Alternatively, there's the trick about putting the entire script
>> contents inside a compound statement to force the parser to read it all,
>> but that just makes the script harder for a human to read. Copy-and-exec
>> makes the top-level scripts cleaner IMHO.
>
> IMHO you'd be better off just putting a `{` line at the start and `}` line at
> the end of your scripts,

Enh, that clutters up the calling scripts, and unlike setting a variable at the top (the "OUTSIDE" in the example, with a real name in the real code), it's not immediately clear to future coworkers why we're doing it and what effect it has. Semi-self-documenting variables that can be easily grepped for are always better than apparently arbitrary isolated curly braces. Having to play tricks with the parser to avoid something tangentially related to parsing is not my style, but I appreciate that others may feel differently.

The big weakness of the "{}" approach is that if a writer forgets to do that, there's no way to detect it until a script is modified and the running one crashes. But in the case of cloning, we can add such explicit test-and-detection for "did you forget to trigger the cloning" in the few scripts that really, really need it.

> and avoid a whole host of other potential problems. (Do you make a separate
> holding directory for each run of the outer script? If so, what happens if
> someone starts another copy after making changes? If not, how do you clean it
> up? Etc.)

Already taken care of. Honestly, this part of the functionality is pretty solid, I just didn't put it in the example. :-) Yes, we use different holding copies, it's not a hardcoded "COPY_OF_SCRIPT" in the real script. Several simultaneous copies are fine. We clean things up with a combination of chained EXIT traps in the scripts, and some systemd-tmpfiles work for the parts that aren't scripts.
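The chained EXIT traps are mentioned but never shown in the thread, so the following is only a guess at the general shape, not the actual code. The names `exit_chain`, `on_exit` and `run_exit_chain` are hypothetical. Each registration is prepended to a list, and the single EXIT trap runs the list, so the most recently registered handler runs first.

```shell
# Newline-separated list of handlers, newest first (hypothetical name).
exit_chain=''

# Register one command line to run when the shell exits.
on_exit() {
    exit_chain="$1
$exit_chain"
}

# The one and only EXIT trap: run every registered handler in turn.
run_exit_chain() {
    eval "$exit_chain"
}

# Demonstrate inside a subshell so the trap fires when the subshell ends
# and its output can be captured.
cleanup_log=$(
    trap run_exit_chain EXIT
    on_exit 'echo "remove holding copy"'
    on_exit 'echo "script-specific cleanup"'
)

echo "$cleanup_log"
```

Running newest-first is a design choice assumed here; it lets a top-level script's own cleanup happen before the holding copy is removed.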
Re: variable set in exec'ing shell cannot be unset by child shell
On Sat, Oct 14, 2023 at 12:55:21PM -0400, Ti Strga wrote:
> it's just the "[[ -v foo ]]" tests to see where along the cloning
> process we are.

*Shudder*

I foresee so much more pain in your future. Seriously, this is going to blow up in your face at some point. -v peeks into some incredibly dark and spooky corners of the shell, and will expose *precisely* how your assumptions about the shell differ from those of the bash author. Also, it's been historically buggy.

I'm inclined to agree with Grisha Levit. This whole thing looks like a massively out-of-control X-Y problem.

If the *real* goal is to overwrite a running script with a new version of itself, and then re-exec it, then the correct solution is to wrap the script in a single compound command so that it gets read and parsed up front, before beginning execution of the main loop. Either wrap the whole thing in "{" ... "}" as Grisha suggested, or wrap the whole thing in a "main()" function and then call main "$@".

That way, you can overwrite the file without sabotaging running instances of the script.
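As a rough illustration of the main() approach, here is a sketch in which a throwaway script overwrites its own file while it is running. The file contents and the names `selfmod` and `overwrite_output` are invented for the demo. Putting `exit` on the same line as the call to main is an extra precaution assumed here, so that bash has parsed everything it needs before main starts.

```shell
# Build a throwaway script that replaces itself part-way through main().
selfmod=$(mktemp)
cat > "$selfmod" <<'EOF'
main() {
    echo "step 1"
    # Overwrite this very file in place while it is still running.
    echo 'echo "replacement ran"' > "$0"
    echo "step 2"
}
# The whole line is parsed before main starts, so bash never needs to go
# back to the (now different) file.
main "$@"; exit
EOF

overwrite_output=$(bash "$selfmod" 2>&1)
rm -f "$selfmod"

echo "$overwrite_output"
```

The running instance finishes with the code it parsed at startup, and only later invocations see the replacement.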
Re: variable set in exec'ing shell cannot be unset by child shell
On Fri, Oct 13, 2023 at 5:35 PM Chet Ramey wrote:
> This is what happens. First, you have to remember that variables supplied
> as temporary assignments to builtins like eval and source persist for the
> entire life of that builtin's execution, and appear in the environment of
> child processes those builtins create (this is what the man page text
> "added to the environment of the executed command" means for a builtin).

Yep, that part I'm extremely familiar with...

> these temporary variables can shadow global variables.

...but I was not aware of that part in this context! That's what I was missing.

> 6. inner.sh calls unset, which unsets the temporary variable (clone) and
> `unshadows' the global variable (clone2)

And that makes it very clear what's going on. Thank you for that walkthrough.

> There is code in bash to make unsetting a function's local copy of a
> dynamically-scoped variable that shadows a global variable remain `unset'
> instead of unshadowing the global, but I've never done that for source or
> eval. It's not clear that would help in this case, either -- it depends
> on what the rest of the code does and expects.

I could see "helpful or not" going either way, honestly. In this specific case, there isn't really any "rest of the code" that's relevant to the variables being shadowed, etc, it's just the "[[ -v foo ]]" tests to see where along the cloning process we are.

The only part I didn't include in the example was the code that does such tests to see if it's inside the cloned copy, and if it is, arranges to delete the temporary copy on exit. (The arbitrary top-level script might also be doing on-exit actions, so there's this whole thing of registering a chain of functions to be called by the single permitted EXIT trap. All of that is working, and is independent of the optional cloning, so I didn't want to litter up the example with distractions.)
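The shadowing Chet describes can be reproduced in a few lines. This is a cut-down sketch, not the original test case: the file names, the variable `stage` and its values are made up, and it assumes bash in its default (non-POSIX) mode. A temporary assignment in front of `source` shadows the global `stage`; when the sourced file runs `unset stage`, only the temporary copy goes away and the global becomes visible again.

```shell
workdir=$(mktemp -d)

# inner.sh: unset the variable, then report what is visible afterwards.
cat > "$workdir/inner.sh" <<'EOF'
unset stage
echo "inside inner.sh after unset: ${stage-UNSET}"
EOF

# outer.sh: set a global, then source inner.sh with a temporary assignment
# of the same name in front of the builtin.
cat > "$workdir/outer.sh" <<'EOF'
stage=global
stage=temporary source "$(dirname "$0")/inner.sh"
echo "after source returns: ${stage-UNSET}"
EOF

shadow_output=$(bash "$workdir/outer.sh")
rm -rf "$workdir"

echo "$shadow_output"
```

A `[[ -v stage ]]` test placed after the `unset` in inner.sh would therefore still succeed, which is the surprise behind the thread's subject line.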
Activating the code you mention for source/eval might have helped for my particular use case, in that it would have saved me some debugging time that I would otherwise have spent... I dunno, probably drinking more coffee. But it would likely have introduced confusion for all the other users who were accustomed to the shadowing behavior, and caused them to spend even more time debugging and writing bug report emails. I agree it's probably not helpful to have that code for source/eval. :-)

We'll either be writing a solid comment in the code explaining why that particular 'unset' has to be where it is, or we'll change to testing the values of those trigger variables instead of just "is it set or not" and using varying values to track where in the process it is ("clone" vs "clone2" in your example).

Thank you again!

-Ti
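A sketch of what testing values rather than "is it set or not" could look like. Everything named here is hypothetical: the trigger variable `CLONE_STAGE`, the function `describe_stage`, and the meaning attached to each value are assumptions for illustration. Only the values "clone" and "clone2" come from the thread.

```shell
# Map the trigger variable's value to a stage name.
describe_stage() {
    case "${CLONE_STAGE-}" in
        '')     echo "original script, nothing cloned yet" ;;
        clone)  echo "first stage of cloning" ;;
        clone2) echo "second stage of cloning" ;;
        *)      echo "unexpected value: $CLONE_STAGE" ;;
    esac
}

# Try each state in a subshell so the settings do not leak into this shell.
stage_none=$(unset CLONE_STAGE; describe_stage)
stage_one=$(CLONE_STAGE=clone; describe_stage)
stage_two=$(CLONE_STAGE=clone2; describe_stage)

printf '%s\n' "$stage_none" "$stage_one" "$stage_two"
```

With this style, an `unset` that merely uncovers a shadowed global still leaves a value that says which copy is visible, instead of a bare set/unset answer.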
Re: variable set in exec'ing shell cannot be unset by child shell
Date: Sat, 14 Oct 2023 14:46:12 +1000
From: Martin D Kealey

| Back when I used the Bourne Shell we didn't have `local`, so we used to
| write `var= func` to make sure that `func` couldn't mess with *our* `var`.

If you were using the original Bourne shell you couldn't have done that, as it had no functions either.

The changes made to it beyond that were very often badly designed or just plain broken, and in POSIX (because of the way it worked in some shells) that behaviour was prohibited: VAR=foo func was required to set VAR=foo in the shell environment (unless func altered it).

That requirement has only relatively recently been changed - changed because it violated another POSIX requirement, that being that (ignoring execution speed, etc) it should not be possible to tell the difference between a utility implemented as a function and one implemented as a file system command (that is, if the function sets out to implement the same thing as the external utility - which of course precludes it from making any changes to the shell environment).

Even now it is unspecified what happens. This is from XCU 2.9.1.2 (in the latest available Issue 8 draft, but I think it is the same text in the current issued standard (Issue 7 + TCs)):

  * If the command name is a function that is not a standard utility
    implemented as a function, variable assignments shall affect the
    current execution environment during the execution of the function.

    [Aside: "standard utilities implemented as functions" are required
    to act identically to the external utility, but they aren't the
    issue here]

    It is unspecified:

    -- Whether or not the variable assignments persist after the
       completion of the function
       [ie: your trick is not guaranteed to work]

    -- Whether or not the variables gain the export attribute during
       the execution of the function
       [ie: such a variable isn't even guaranteed to be exported]

    -- Whether or not export attributes gained as a result of the
       variable assignments persist after the completion of the
       function (if variable assignments persist after the completion
       of the function)
       [ie: it is possible that a variable that wasn't exported before
       being used as "VAR=val func" might now be exported]

A good implementation will revert the value if the func doesn't alter it, and will put it in the environment during the lifetime of the function, but none of that is guaranteed. At least now that is permitted, rather than prohibited, which it used to be.

There is nothing here anywhere that permits an implementation to make an assignment to a variable within a function fail to persist when the function terminates - regardless of whether the variable was named in a var-assign that precedes the command name (the command being a function). Of course if some non standard feature is used (like for example, "local") then all bets are off, and whatever happens depends upon what the shell defines to happen.

However in the case of a special built-in utility (which "." is) then the requirements are much stricter:

  * If the command name is a special built-in utility, variable
    assignments shall affect the current execution environment before
    the utility is executed and remain in effect when the command
    completes; if an assigned variable is further modified by the
    utility, the modifications made by the utility shall persist.
    Unless the set -a option is on (see set), it is unspecified:

    -- Whether or not the variables gain the export attribute during
       the execution of the special built-in utility

    -- Whether or not export attributes gained as a result of the
       variable assignments persist after the completion of the
       special built-in utility

That is, in the case of "VAR=val . script" (which is what the OP was doing, there were no functions involved) POSIX actually requires that VAR=val be done before the utility is invoked (and never undone) and that if the script modifies VAR, that modification remain after it has completed. (It is just unspecified whether anything gets exported by this, and if it does, whether that attribute remains after the script ends).

Note that some of that text is new in Issue 8, to deal with making it clear what happens if a script does "X=Y unset X" where previously it might have seemed permitted for X to remain set after that command (to Y or whatever value it had before) - now it is (will be) clear that is not permitted, and X must be unset after that command completes. Similarly "X=foo export X" must result in X being exported with value "foo" when that command completes. "." is no different (conceptually) than those.

| Given that "put in the environment" actually means "create a shell variable
| and mark it as exported",

That's an implementation detail, it doesn't require that at all for external utilities,