Re: [Issue 8 drafts 0001560]: clarify wording of command substitution

Geoff Clare via austin-group-l at The Open Group Fri, 22 Apr 2022 04:04:27 -0700

> ---------------------------------------------------------------------- 
>  (0005802) calestyo (reporter) - 2022-04-15 00:41
>  https://www.austingroupbugs.net/view.php?id=1560#c5802 
> ---------------------------------------------------------------------- 
> AFAIU, this now involves three types of changes:
> 
> 1) The first one, which improves on the wording of trailing newlines.
> => seems good to me.
> 
> 
> 2) "comprising each character of the IFS" and similar
> "The shell shall treat the byte sequence comprising each character of the
> IFS as a delimiter"
> 
> It took me a while to understand what is meant. I would reword this;
> the "each" in particular is a bit strange here, I think.
> 
> AFAIU, you want to say that any byte sequence in a word that equals one
> of the characters in IFS is to be taken as a split point.
> So isn't that *any* character, not *each* character?
> 
> What about:
> "The shell shall treat a byte sequence forming any character of the
> characters in the IFS value as a delimiter"


I like this suggestion, although "any character of the characters"
is a bit strange.  I'll go with "any of the characters".

> The same in:
> "The term ``IFS white space'' is used to mean any sequence (zero or more
> instances) of the byte sequences that comprise white-space characters in
> the IFS value (for example, if IFS contains <space>/<comma>/<tab>, any
> sequence of bytes that have the encoded values of <space> and <tab>
> characters is considered IFS white space)."
> 
> rather something like:
> "The term ``IFS white space'' is used to mean any sequence (zero or more
> instances) of the byte sequences that form any of the white-space
> characters in the IFS value..."

Okay.

> Perhaps also instead of "is used to mean" just "means".

That's what's in the existing text - I didn't feel the need to change
that part.

> 3) You introduce bytes/byte sequences vs. characters.
> 
> I don't understand why you need that at all?

The current wording in terms of characters implies that the word being
subjected to field splitting can be treated as a character string.
I wanted to ensure that the new text cannot possibly be read as
allowing that.
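
As an illustration only (not part of the proposed text), here is a minimal
sketch of why the distinction matters. The word below contains the byte with
octal value 377, which does not form a valid character in a UTF-8 locale, so
the word as a whole cannot be treated as a character string; under a
byte-oriented reading the <space> byte should still act as a delimiter. The
variable names are hypothetical, and shells may differ in how they handle
such input.

```shell
# Build a word containing a byte (octal 377) that is not a valid
# character in a UTF-8 locale, followed by a <space> and another byte.
word=$(printf 'a\377 b')

# Count the fields in a subshell so IFS, set -f and the positional
# parameters of the calling shell are left untouched.
nfields=$(
    IFS=' '
    set -f          # disable pathname expansion; only field splitting applies
    set -- $word    # unquoted expansion is subject to field splitting
    echo "$#"
)
echo "$nfields"
```

If the shell splits on the byte sequence that encodes <space>, regardless of
whether the surrounding bytes form characters, the word yields two fields.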

> Perhaps it would be better to generally mention that somewhere in the field
> splitting chapter?

That could invite complaints that it conflicts with the use of "character"
elsewhere.

> => But there is one thing that's IMO lost on the way:
> The old:
> "    any sequence of <space>, <tab>, or <newline> characters at the
> beginning or end of the input shall be ignored and any sequence of those
> characters within the input shall delimit a field"
> 
> "sequence of those characters" indicated that a sequence of 1-n IFS
> characters were still regarded as one single field splitter.
> 
> With the new:
> "ignored and any sequence of such bytes"
> that's IMO a bit lost... sequence of bytes is rather considered like ONE
> "multi-byte" character.

Each of the bytes in question encodes a single-byte character, so it's
impossible for them to combine to form one multi-byte character.

> You don't have that problem with the 4th change, where you explicitly say:
> "any sequence (zero or more instances) of the byte sequences that comprise
> white-space characters"

I'll insert "(one or more instances)".
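
For readers following along, the behaviour being described can be sketched
as below. This is an illustration under the usual POSIX field splitting
rules, not text from the proposal; `count_fields` is a hypothetical helper
that reports how many fields a word splits into for a given IFS value.

```shell
# Hypothetical helper: print the number of fields produced by splitting
# the word $1 with IFS set to $2. Runs in a subshell so nothing leaks out.
count_fields() {
    (
        IFS=$2
        set -f          # disable pathname expansion; only field splitting applies
        set -- $1       # unquoted expansion is subject to field splitting
        echo "$#"
    )
}

# A run of IFS white space is one delimiter; leading and trailing runs are ignored.
count_fields '  a   b  ' ' '    # prints 2

# Each non-white-space IFS character delimits a field, so the empty field counts.
count_fields 'a,,b' ','         # prints 3

# IFS white space adjacent to a non-white-space IFS character joins it
# to form a single delimiter.
count_fields 'a , b' ' ,'       # prints 2
```

The first call shows that a sequence of one or more <space> bytes acts as a
single field splitter, which is the point the "(one or more instances)"
wording is meant to keep explicit.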

-- 
Geoff Clare <[email protected]>
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
