    Date:        Thu, 2 Feb 2023 15:36:30 -0500
    From:        Greg Wooledge <g...@wooledge.org>
    Message-ID:  <y9wezobgtznjv...@wooledge.org>

  | There's a legitimate reason to support function names that contain *some*
  | punctuation characters beyond underscore.

There is a very good reason to allow function names to contain almost
any character at all.   See below.

  | A case might be made that slashes should also be disallowed,

That (and \0) are the exceptions.   But not because:

  | because it allows exported function names like /bin/echo to be
  | inherited by a script,

which is harmless, unless the shell is badly breaking the command
execution rules of POSIX (zsh does I believe) but because for any shell
that follows those rules, it is impossible to invoke a function with a
'/' in its name.   The rules for executing commands require that any
command name with a '/' in it simply be handed to exec*() as is - it can
never be a built in command, can never be a function, and never searches
PATH.   There's no actual need to forbid creating functions with '/' in
their names, but users are likely to be annoyed if the shell allows them
to exist but provides no mechanism to invoke them.

But apart from that, IMO, any name which can be a command which will be
found by a PATH search, ought to be able to be a function, so that users
can write functions to replace or augment the filesystem commands.

Technically that would not require "." or ".." to be allowed as function
names, but as "." is a built in command already, it needs to be allowed
so the built in command can be replaced by a function if desired, and
if "." is going to be OK, ".." ought to be as well.

Consider the following script (sorry, I know it isn't easy to follow when
just reading, because of what is needed to quote things properly, but
extract it and run it ... you just need to make the first entry in PATH
be a directory that you can write into, and execute from .. it doesn't
strictly need to be first, any entry in PATH would do, but then you'd need
to manually set DIR in the script, rather than having it just pick the
first).

I will also attach these scripts as MIME attachments, which might be
easier for some users to deal with, if the list allows attachments through.

------------------------------------------------- cut after this line
DIR="${PATH%%:*}"
test -z "${DIR}" && DIR=.

if ! [ -w "${DIR}" ] || ! [ -x "${DIR}" ]
then
    printf >&2 'Cannot use "%s" (from PATH[0]) for testing\n' "${DIR}"
fi

for name in ' ' '<' '$' '$1' '|' '`e`' '{' '()' '"' "'" \\
do
    if [ -e "${DIR}/${name}" ]
    then
        printf 'Name "%s" exists in %s already\n' "${name}" "${DIR}"
        continue
    fi

    Q=\'
    case "${name}" in
    *\'*)   Q=\" ;;
    esac

    printf "#! /bin/sh\\n\\nprintf '..%%s.. works %%s\\\\n' %s%s%s \"\$*\"\\n" \
        "${Q}" "${name}" "${Q}" > "${DIR}/${name}" || continue
    chmod +x "${DIR}/${name}"

    printf ':%s: ' "${name}"
    case "${name}" in
    "'")    \' ok;;
    *)      eval "'${name}' ok";;
    esac

    rm -f "${DIR}/${name}"
done
------------------------------------------------- cut before this line

Put that in /tmp/txf (or any other name you like) and run it with
any (bourne type) shells you have available (except bosh, which has
a bug which makes the final name test fail).

It should simply work...

jacaranda$ sh /tmp/txf
: : .. .. works ok
:<: ..<.. works ok
:$: ..$.. works ok
:$1: ..$1.. works ok
:|: ..|.. works ok
:`e`: ..`e`.. works ok
:{: ..{.. works ok
:(): ..().. works ok
:": ..".. works ok
:': ..'.. works ok
:\: ..\.. works ok

jacaranda$ bash /tmp/txf
: : .. .. works ok
:<: ..<.. works ok
:$: ..$.. works ok
:$1: ..$1.. works ok
:|: ..|.. works ok
:`e`: ..`e`.. works ok
:{: ..{.. works ok
:(): ..().. works ok
:": ..".. works ok
:': ..'.. works ok
:\: ..\.. works ok

jacaranda$ yash /tmp/txf
: : .. .. works ok
:<: ..<.. works ok
:$: ..$.. works ok
:$1: ..$1.. works ok
:|: ..|.. works ok
:`e`: ..`e`.. works ok
:{: ..{.. works ok
:(): ..().. works ok
:": ..".. works ok
:': ..'.. works ok
:\: ..\.. works ok

jacaranda$ mksh /tmp/txf
: : .. .. works ok
:<: ..<.. works ok
:$: ..$.. works ok
:$1: ..$1.. works ok
:|: ..|.. works ok
:`e`: ..`e`.. works ok
:{: ..{.. works ok
:(): ..().. works ok
:": ..".. works ok
:': ..'.. works ok
:\: ..\.. works ok

jacaranda$ ksh93 /tmp/txf
: : .. .. works ok
:<: ..<.. works ok
:$: ..$.. works ok
:$1: ..$1.. works ok
:|: ..|.. works ok
:`e`: ..`e`.. works ok
:{: ..{.. works ok
:(): ..().. works ok
:": ..".. works ok
:': ..'.. works ok
:\: ..\.. works ok

(you get the point, I could include similar results from more shells),

Now extract the following script, which uses functions of the same
names, instead of file system commands

------------------------------------------------- cut after this line
DIR="${PATH%%:*}"
test -z "${DIR}" && DIR=.

if ! [ -w "${DIR}" ] || ! [ -x "${DIR}" ]
then
    printf >&2 'Cannot use "%s" (from PATH[0]) for testing\n' "${DIR}"
fi

for name in ' ' '<' '$' '$1' '|' '`e`' '{' '()' '"' "'" \\
do
    Q=\'
    name_of_name=${name}
    case "${name}" in
    *\'*)   Q=\" ;;
    *\\*)   name_of_name=${name}${name};;
    esac

    eval "${Q}${name}${Q}() { printf $Q..${name_of_name}().. works %s\n$Q \"$*\"; }"

    printf ':%s: ' "${name}"
    case "${name}" in
    "'")    \' ok;;
    *)      eval "'${name}' ok";;
    esac
done
------------------------------------------------- cut before this line

Note that in that, the code that invokes the function is identical to
the code which invoked the script in the previous script.   That's why
we needed to put the executable files in a directory that's in PATH,
otherwise the two couldn't be executed in the same way - doing that is
the whole reason for this.

The code to build the function is different to the code which creates
the file, but that part is irrelevant, and mostly caused by the desired
name being a variable, which normally would not be the case, usually
you'd just write something like

    '()'() { : whatever code should be run; }

for the function definition, or

    printf '#! /bin/sh\n\n: whatever code should be run\n' > "${DIR}/()"

for the executable file version (or create that with an editor).
In either case

    '()' whatever args are needed

should run it (or \(\) ...
or "()" or any mixture of quoting you like).

Put this second one in /tmp/tfn (or name of your choice) and run it
just the same way:

jacaranda$ sh /tmp/tfn
: : .. ().. works
:<: ..<().. works
:$: ..$().. works
:$1: ..$1().. works
:|: ..|().. works
:`e`: ..`e`().. works
:{: ..{().. works
:(): ..()().. works
:": .."().. works
:': ..'().. works
:\: ..\().. works

jacaranda$ zsh /tmp/tfn
: : .. ().. works
:<: ..<().. works
:$: ..$().. works
:$1: ..$1().. works
:|: ..|().. works
:`e`: ..`e`().. works
:{: ..{().. works
:(): ..()().. works
:": .."().. works
:': ..'().. works
:\: ..\().. works

I'll stop there ... those are the only shells I know of that behave
sensibly for this, bash behaves as it documents (not usefully).
None of those names work in most other shells.

All of this is an extension to POSIX of course, there is no requirement
there to allow this.   POSIX only specifies that a "name" (as in the same
syntax as a variable name) must work - but explicitly allows implementations
to extend the namespace.

The reason POSIX syntax is so limited is not because anyone believes it
to be the right thing to do, but because that's what the first shells to
implement functions implemented (a poor choice IMO) and consequently
what was copied elsewhere.   POSIX specifies what users can expect will
work in all conforming shells, not what would be ideal if it did work.

That POSIX explicitly says the name space for function names is allowed
to be extended is a strong hint that it ought to be - and definitely means
that this is not an extension that needs to be disabled in "posix mode".

Quoting isn't always needed for all of these names (a function or script
called ] should not need quoting around its name for definition or
execution) - but should always be possible when the user requires or
desires to use it (just like you can say:

    echo "hello"

where the quotes are 100% pointless.   You can also say:

    "echo" hello

if you like.
Works just the same whether echo is a file system command, built-in
command, or a function.   If not, the shell is buggy (because quote
removal happens before it starts actually executing anything, so at the
point of seeing what to run, there are no quoting chars left).

Note that no expansions are specified to happen on the function name
in a function definition, so $var() isn't ever going to create a function
named by the value of var - you need eval for that.   Because of that the
NetBSD shell requires any function definition which contains something
that looks like a var/command/arith expansion to be quoted such as to make
it clear the user knows that isn't going to happen - we'd reject the
$var() case, but allow '$var'() or \$var() (but not "$var"() - "\$var"()
works however.)   ie: if it would expand if used as a normal shell word
eg: as a command arg, then we forbid it as a function name, but with
quoting to prevent that, the same name is just fine.

Lastly, that I believe all this should be possible, doesn't mean that I
advocate using function names like any of the above (the first entry in
my PATH is $HOME/bin ... you'll have observed that I have no commands
with any of those names already).   Rather just that the shell should not
prevent users from calling things whatever they want (including using
names made of non-ascii characters, for both file names and function
names), and that *any* file system command should be able to be replaced
by a function.

kre

ps: in the above, "sh" is the NetBSD shell.   The others should be obvious.
Also, note that the scripts are not very general, you cannot put any
arbitrary name you like in the list, and expect the rest of the script
to cope - it won't always, the scripts were (believe it or not) kept as
simple as possible, just for illustration in this e-mail.
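
As a footnote to the points about eval and quote removal earlier in this
message, here is a minimal sketch that should behave the same in any POSIX
shell, because it sticks to a plain "name" and so does not depend on the
extended function namespace at all.   The names fname, greet and the out_*
variables are made up purely for illustration.

```shell
# Hypothetical names throughout: fname, greet and out_* are illustration only.
fname=greet

# No expansion is applied to the name in a function definition, so the
# variable's value has to be substituted before the shell parses the
# definition.  eval does that; after expansion the shell sees:
#     greet() { printf 'hello %s\n' "$*"; }
eval "${fname}() { printf 'hello %s\n' \"\$*\"; }"

# Quote removal happens before the command is looked up, so every
# spelling of the command word below runs the same function.
out_plain=$(greet world)
out_dquote=$("greet" world)
out_squote=$('greet' world)
out_bslash=$(\greet world)
out_mixed=$(gr"ee"'t' world)

printf '%s\n' "$out_plain" "$out_dquote" "$out_squote" "$out_bslash" "$out_mixed"
```

All five invocations should print the same line.   Swapping greet for one
of the odd names from the scripts above would only work in shells that
extend the function namespace, which is the whole point of the discussion.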