On 12/04/2021 00:25, Robert Elz wrote:
Date: Sun, 11 Apr 2021 22:27:19 +0100
From: Harald van Dijk <a...@gigawatt.nl>
Message-ID: <79b98e30-46ba-d468-153f-c1a2a0416...@gigawatt.nl>
| Okay, but that is a technicality. The pre-seeding is only permitted at
| startup time,
No, what it says is "an unspecified shell start-up activity".
"unspecified" means it can be anything.
No, not anything. It still has to be shell start-up activity.
Anything includes starting
a thread which monitors what commands are about to be executed and
loads the hash table just in time. Or one which populates the hash table
with every possible command every tenth of a micro-second. Anything.
It is unspecified.
Starting a thread would be shell start-up activity. The actions
performed on that thread while some other thread is running the script
clearly aren't.
| so cannot depend on the contents of the script.
Of course, it can, the script is available at startup time of the
shell, the startup activity can read the entire script, parse it,
find all the command names and possible command names, and add them
to the hash table.
It cannot do this either: parsing the whole script in advance is not
only not allowed (it would break the use of aliases defined in the
script, at the least) but also impossible, as command names need not be
named literally inside the script.
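
To make the "need not be named literally" point concrete, here is a rough sketch of a scanner that collects the first non-assignment word of each line without expanding anything. The function name and the sample script are made up for illustration; no shell is claimed to work this way.

```python
import re
import shlex

# A word of the form NAME=... is a variable assignment, not a command name.
ASSIGNMENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*=")


def literal_command_words(script):
    """Collect the first non-assignment word of each line, unexpanded."""
    words = []
    for line in script.splitlines():
        for token in shlex.split(line, comments=True):
            if ASSIGNMENT.match(token):
                continue  # skip leading assignments
            words.append(token)
            break
    return words


# Hypothetical script: the command actually run is "gcc", but that
# string never appears literally anywhere in the text.
SCRIPT = '''\
tool=gc
tool="${tool}c"
"$tool" --version
'''

print(literal_command_words(SCRIPT))
```

The scan only ever sees the unexpanded word `$tool`, so there is nothing for it to put in the hash table under the name that is eventually executed.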
Alternatively, it can examine PATH and load
every executable in every directory in PATH into the hash table.
zsh (seems to) do something like the latter.
This is something that I agree is valid for a shell to do. It does not
make any fundamental difference.
Incidentally, I only see this in zsh's interactive mode. I am not sure
whether this depends on interactive mode directly, or on another option
automatically turned on or off in interactive mode.
| I want to say this is a theoretical concern, that there are no shells
| where hash -r is implemented as doing anything other than clearing the
| hash table. I cannot prove this but will be quite disappointed if any
| turn out to do something else.
zsh comes close: it appears to empty the hash table on "hash -r", but
do anything at all and it fills up again. And I mean fills. And I
understand that - if you're going to search the directories in PATH
over and over again, every time a command is executed, better to read
them once, and remember what they contain - no more useless I/O.
(I vaguely recall deciding that zsh read as many directories as needed
to find the command, and then stopped - getting a "command not found"
would result in everything possible from PATH now being in the hash table.)
Yes, that is exactly what it is doing.
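
The behaviour described above can be sketched roughly as follows. This is a model of the description in this thread, not zsh's actual code: the class and method names are invented, and "executable" is approximated as "regular file with any execute bit set".

```python
import os
import stat


class LazyPathHash:
    """Hash table that reads PATH directories one at a time, on demand."""

    def __init__(self, path_dirs):
        self.path_dirs = list(path_dirs)
        self.table = {}      # command name -> full path
        self.dirs_read = 0   # how many PATH entries have been scanned

    def clear(self):
        """Rough analogue of "hash -r": forget everything."""
        self.table.clear()
        self.dirs_read = 0

    def lookup(self, name):
        # Read further directories only while the name is still unknown.
        while name not in self.table and self.dirs_read < len(self.path_dirs):
            directory = self.path_dirs[self.dirs_read]
            self.dirs_read += 1
            try:
                entries = os.listdir(directory)
            except OSError:
                continue  # missing or unreadable PATH entry
            for entry in entries:
                full = os.path.join(directory, entry)
                try:
                    mode = os.stat(full).st_mode
                except OSError:
                    continue
                # Assumption: any execute bit counts. Earlier PATH
                # entries win, so never overwrite an existing name.
                if stat.S_ISREG(mode) and mode & 0o111:
                    self.table.setdefault(entry, full)
        return self.table.get(name)
```

A lookup that succeeds in the first directory leaves later directories unread, while a lookup for a name that does not exist reads every directory, so the table ends up holding everything reachable through PATH.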
| > That is, find an entry for cmd in PATH for which exec() succeeds.
| > Only fail if there is none.
|
| Yes, that is what dash is doing.
The way PATH searches should be done.
| Well, that is sort of what dash does. dash takes an extra integer that
| specifies which PATH component was hashed and uses that as the starting
| point for the search,
I know. This is irrelevant here. If this algorithm doesn't produce the
required results, that would be a bug, and like most bugs, if it is
considered serious enough, it can be fixed.
The important issue is that the intent is to examine each element in
PATH until we get success from exec() (or ENOEXEC with a file we're
willing to treat as a script, and so exec a shell to interpret it).
So, if there is a /bin/gcc that is "#!/bad" and a later one in path
that is a real executable, we should exec the later one, right?
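
For reference, the search being described here can be sketched as below. It models the reading put forward in the quoted text and is not dash's code. `find_runnable` and `try_exec` are hypothetical names: `try_exec` stands in for execve() and returns 0 or an errno value. The `start` parameter loosely mirrors the hashed PATH index mentioned earlier, and treating every ENOEXEC as "a script we are willing to run" is a simplification.

```python
import errno
import os


def find_runnable(name, path_dirs, try_exec, start=0):
    """Try name in each PATH directory from index start onwards.

    Returns (path, "exec") if try_exec succeeds, or (path, "sh") if it
    reports ENOEXEC and the file would be handed to a shell instead.
    Raises OSError if no PATH entry works.
    """
    last_error = errno.ENOENT
    for directory in path_dirs[start:]:
        candidate = os.path.join(directory, name)
        err = try_exec(candidate)
        if err == 0:
            return candidate, "exec"
        if err == errno.ENOEXEC:
            return candidate, "sh"
        if err != errno.ENOENT:
            # Assumption: report e.g. EACCES in preference to ENOENT.
            last_error = err
    raise OSError(last_error, os.strerror(last_error), name)


# Hypothetical outcomes for the /bin/gcc example: the first candidate
# has an unusable "#!/bad" interpreter line, for which execve() on
# Linux typically reports ENOENT; the later one is a real executable.
outcomes = {"/bin/gcc": errno.ENOENT, "/usr/local/bin/gcc": 0}


def fake_exec(path):
    return outcomes.get(path, errno.ENOENT)


print(find_runnable("gcc", ["/bin", "/usr/local/bin"], fake_exec))
```

Under this reading the failed first candidate is skipped and the later one is chosen. Whether that outcome is intended is the question at issue.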
I am not convinced that that is the intent at all.
This code is shared between ordinary command execution and the exec
builtin. The former needs no second PATH lookup if the hashing was done
correctly, the latter does: no hashing happens for exec, as it would be
useless. The code looks like the simplest way to shoehorn both into a
single function based on the assumption that the first execve() in the
ordinary command execution case would not fail. The fact that it can
fail tells us nothing about what was intended to happen in such a case.
Cheers,
Harald van Dijk