Re: [1003.1(2013)/Issue7+TC1 0001123]: Problematic specification of execution environment for word expansions

Robert Elz Fri, 10 Aug 2018 08:29:49 -0700

    Date:        Fri, 10 Aug 2018 13:30:28 +0100
    From:        Geoff Clare <g...@opengroup.org>
    Message-ID:  <20180810123028.GA23963@lt2.masqnet>


  | Actually, I think the existing description of Field Splitting handles
  | it correctly.

I disagree, but not for the reason that I think you believe...

  | It may be easier to see if you consider:

Yes, I know how that works, and I know how 2.6.7 results in
a "nothing" from the modified example that you gave.

  | (The $* isn't an issue because parameter expansion produces "one field
  | for each positional parameter that is set".  There are no positional
  | parameters set, so there are no fields.)

And this one is (as things are currently worded) the same issue
(that I have).  Again, I agree that 2.5.2 is saying the correct thing,
just not 2.6.
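
To make the behaviour under discussion concrete, here is a minimal sketch
(not taken from the quoted messages; count_fields is a hypothetical helper
that only reports how many arguments it was given):

```sh
# count_fields: hypothetical helper, prints the number of fields it was passed
count_fields() { echo "$#"; }

set --                            # unset all positional parameters

n_star=$(count_fields $*)         # unquoted $*: no parameters, so no fields
n_at=$(count_fields "$@")         # "$@" with no parameters: also no fields
n_qstar=$(count_fields "$*")      # "$*" is always one field, here an empty one

echo "$n_star $n_at $n_qstar"     # prints: 0 0 1
```

The first two words vanish from the argument list entirely, which is the
"one word in, zero fields out" case at issue here.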

  | So I think just deleting that paragraph, as bugnote 4082 currently has
  | it, is the right thing to do.

I don't.  Though there are other changes that could be made which
would allow that to be done - but we'd also need the additional changes.

Here I will intersperse a short reply to your subsequent message ...

g...@opengroup.org said:
  | The best way forward might be for someone familiar with the source of ksh93
  | or bash (or another popular shell) to look at how the shell actually decides
  | whether the complete expansion produces nothing or a single empty field.

I don't know what bash or ksh93 do, though I do know the NetBSD and FreeBSD
sh approach (though these days they are done differently) - and while neither
of those might qualify as a "popular" shell, I'd expect that dash is probably
quite similar to one of them.

In the *BSD shells, the parser has produced a list of words (several actually,
one for command words/args, one for var assigns, and one for redirects),
and the shell just goes through each list (at the appropriate point) producing
a replacement list (we can change the nomenclature from words to fields at
this point, but that doesn't make any substantive difference to the 
implementation). 
For each new field that is produced from any of the expansions, a new entry is
made on the output list (appended to it).  If an expansion produces no fields,
then nothing new appears on the output list.  As each word is processed from
the input list (in order) it is removed.   Eventually the input list is empty, 
and the output list is what is left.   (nb: this is a conceptual description, 
the "list" is actually an array of pointers, and nothing is actually removed
until it is all done, the array index just advances.)
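
The observable effect of that append-only output list can be sketched from
the shell itself.  This is only an illustration, assuming the default IFS;
the variable names are made up for the example and it says nothing about
how any particular shell is coded internally.

```sh
empty=''
two='a b'

# Three input words; each contributes whatever fields its expansion produces.
#   $empty    -> 0 fields (expands to nothing, so nothing is appended)
#   "$empty"  -> 1 field  (quoted, so the empty field is kept)
#   $two      -> 2 fields (field splitting on the default IFS)
set -- $empty "$empty" $two

nfields=$#
first=$1
echo "$nfields"                   # prints: 3
```

Three words went in and three fields came out, but not one field per word:
the first word contributed none and the last contributed two.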

The primary difference between the NetBSD and FreeBSD shells, as I
understand the latter anyway, is that the FreeBSD shell applies the 4
steps to each word as it is processed, whereas the NetBSD shell
tends to do step 1 for all words, then step 2, ...   This difference seems
to be more related to how easy it is (believed to be) to get the code
correct, and perhaps to some degree, shell efficiency, rather than to
any expected difference in results.

OK, back from that to the issue at hand...

The problem I see with simply deleting the paragraph in question
is that nowhere (else) is there (in the standard) any allowance for an
input word not producing an output field.   In the cases explicitly called
out there is the possibility for an input field to produce multiple output
fields, but never zero.

Now I guess that it is possible to read "multiple" as any multiple of 1,
which includes 0*1, 1*1, 2*1, ... and so no fields could be interpreted as
one of the possible multiple fields results - but I don't think most people
would read it that way.  And as pathname expansion is one of the
places where producing multiple fields is permitted, we really
do not want anyone assuming that pathname expansion is permitted
to produce nothing (or people might expect that to be the logical
result from a pattern that matches nothing).
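
For comparison, a pattern that matches nothing is left as a single field
rather than vanishing.  A small sketch, assuming that the directory named
here does not exist (the path is made up for the example) and that no
non-standard option such as bash's nullglob is in effect:

```sh
# Hypothetical path, assumed not to exist on the system running this.
set -- /nonexistent-dir-for-this-example/*

nmatch=$#
pattern=$1
echo "$nmatch: $pattern"          # prints: 1: /nonexistent-dir-for-this-example/*
```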

But given the way that the text is written (lines 74994-5):

        It is only field splitting or pathname expansion that can create
        multiple fields from a single word. The single exception...

it is really hard to read it as "0 is permitted", rather it looks (at least to
me) as if it means one in, one (and in the two cases, plus the exception, "or
more") out.

The proposed text in 1193, while different, leads to the same conclusion
(to me anyway) ...

        The shell shall create multiple fields from a single word only as a
        result of field splitting, pathname expansion, or the following cases   

again, to me, that means one in, one (or more when permitted) out.

Assuming my reading is (at least) plausible, from where, apart from in the
relevant paragraph, does the shell obtain its authority to delete words
(not include them in the final list of fields)?

Again here, at least for this issue, I do not believe there is any dispute
at all about what is supposed to happen - the question is just how the
words are written so that it is obvious what happens to people who do
not "already know" how the shell works.

kre
