Re: [Shell Command Language][shortcomings of command utlity][improving robustness of POSIX shells]

Harald van Dijk via austin-group-l at The Open Group Mon, 12 Apr 2021 10:42:49 -0700

On 12/04/2021 00:25, Robert Elz wrote:

     Date:        Sun, 11 Apr 2021 22:27:19 +0100
     From:        Harald van Dijk <a...@gigawatt.nl>
     Message-ID:  <79b98e30-46ba-d468-153f-c1a2a0416...@gigawatt.nl>


   | Okay, but that is a technicality. The pre-seeding is only permitted at
   | startup time,

No, what it says is "an unspecified shell start-up activity".
"unspecified" means it can be anything.


No, not anything. It still has to be shell start-up activity.

                                          Anything includes starting
a thread which monitors what commands are about to be executed and
loads the hash table just in time.   Or one which populates the hash table
with every possible command every tenth of a micro-second.   Anything.
It is unspecified.

Starting a thread would be shell start-up activity. The actions performed on that thread while some other thread is running the script clearly aren't.

   | so cannot depend on the contents of the script.

Of course, it can, the script is available at startup time of the
shell, the startup activity can read the entire script, parse it,
find all the command names and possible command names, and add them
to the hash table.

It cannot do this either. Parsing the whole script in advance is not only not allowed (it would break the use of aliases defined in the script, at the least) but also impossible, as command names need not be named literally inside the script.
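
To illustrate the last point, here is a minimal sketch of a script in which the command name never appears literally. The helper name pick_cmd and the way the name is assembled are made up for the example; any run-time computation would do.

```shell
# Hypothetical helper: assembles the command name at run time, so a
# scan of the script text for command names would not find "basename".
pick_cmd() {
    first=base
    second=name
    echo "$first$second"
}

cmd=$(pick_cmd)

# The shell only learns which utility to look up in PATH (and possibly
# hash) at the moment this line is executed.
out=$("$cmd" /some/dir/file)
echo "$out"
```

A start-up pass over the script could at best guess at such names, which is why pre-seeding cannot depend on them.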

                    Alternatively, it can examine PATH and load
every executable in every directory in PATH into the hash table.
zsh (seems to) do something like the latter.

This is something that I agree is valid for a shell to do. It does not make any fundamental difference.

Incidentally, I only see this in zsh's interactive mode. I am not sure whether this depends on interactive mode directly, or on another option automatically turned on or off in interactive mode.

   | I want to say this is a theoretical concern, that there are no shells
   | where hash -r is implemented as doing anything other than clearing the
   | hash table. I cannot prove this but will be quite disappointed if any
   | turn out to do something else.

zsh comes close, it appears to empty the hash table on "hash -r", but
do anything at all, and it fills up again.  And I mean fills.   And I
understand that - if you're going to search the directories in PATH
over and over again, every time a command is executed, better to read
them once, and remember what they contain - no more useless I/O.
(I vaguely recall deciding that zsh read as many directories as needed
to find the command, and then stopped - getting a "command not found"
would result in everything possible from PATH now being in the hash table.)


Yes, that is exactly what it is doing.
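
For comparison, this is a small sketch of the behaviour I would expect from a shell that fills its table lazily (dash, or bash running a script). It assumes ls is an external utility found via PATH; the exact listing format printed by "hash" differs between shells.

```shell
# Remember the location of ls, then list the table.
hash ls
before=$(hash)

# Forget all remembered locations and list the table again.
# Some shells print a "table empty" notice here, so discard stderr.
hash -r
after=$(hash 2>/dev/null)

printf 'before: %s\n' "$before"
printf 'after:  %s\n' "$after"
```

In such a shell the second listing no longer mentions ls, and nothing is added back until another command is looked up.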

   | > That is, find an entry for cmd in PATH for which exec() succeeds.
   | > Only fail if there is none.
   |
   | Yes, that is what dash is doing.

The way PATH searches should be done.

   | Well, that is sort of what dash does. dash takes an extra integer that
   | specifies which PATH component was hashed and uses that as the starting
   | point for the search,

I know.  This is irrelevant here.  If this algorithm doesn't produce the
required results, that would be a bug, and like most bugs, if it is
considered serious enough, it can be fixed.

The important issue, is that the intent is to examine each element in
PATH, until we get success from exec(), (or ENOEXEC with a file we're
willing to treat as a script, and so exec a shell to interpret it).
So, if there is a /bin/gcc that is "#!/bad" and a later one in path
that is a real executable, we should exec the later one, right?


I am not convinced that that is the intent at all.

This code is shared between ordinary command execution and the exec builtin. The former needs no second PATH lookup if the hashing was done correctly; the latter does: no hashing happens for exec, as it would be useless. The code looks like the simplest way to shoehorn both into a single function, based on the assumption that the first execve() in the ordinary command execution case would not fail. The fact that it can fail tells us nothing about what was intended to happen in such a case.
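
To make the behaviour under discussion concrete, here is a rough emulation in shell of "try each PATH entry until one can be executed", using the "#!/bad" scenario from above. This is a sketch only, not dash's code: the function name run_first_working and the demo directory layout are invented, it assumes /bad does not exist, and it treats exit statuses 126 and 127 as "execution failed", which cannot be distinguished from a command that itself exits with one of those statuses.

```shell
# Hypothetical helper: run the first file named $1 in PATH that can
# actually be executed, moving on to later entries when it cannot.
run_first_working() {
    rfw_name=$1
    shift
    rfw_ifs=$IFS
    IFS=:
    for rfw_dir in $PATH; do
        IFS=$rfw_ifs
        [ -n "$rfw_dir" ] || rfw_dir=.
        rfw_file=$rfw_dir/$rfw_name
        [ -f "$rfw_file" ] && [ -x "$rfw_file" ] || continue
        # stderr is discarded only to keep the demo output quiet.
        "$rfw_file" "$@" 2>/dev/null
        rfw_status=$?
        case $rfw_status in
            126|127) continue ;;   # could not be executed: keep searching
        esac
        return "$rfw_status"
    done
    IFS=$rfw_ifs
    return 127
}

# Demo: the first PATH entry holds a file with a bad interpreter line,
# the second holds a working script of the same name.
demo=$(mktemp -d)
mkdir "$demo/first" "$demo/second"
printf '#!/bad\n' > "$demo/first/tool"
printf '#!/bin/sh\necho real tool ran\n' > "$demo/second/tool"
chmod +x "$demo/first/tool" "$demo/second/tool"

result=$(PATH=$demo/first:$demo/second run_first_working tool)
echo "$result"
rm -rf "$demo"
```

Whether a shell's own command lookup is meant to behave like this helper is exactly the point in dispute; the sketch only shows what the "keep searching" reading would amount to.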


Cheers,
Harald van Dijk
