The following issue has been SUBMITTED.
======================================================================
https://austingroupbugs.net/view.php?id=1784
======================================================================
Reported By: kre
Assigned To:
======================================================================
Project: Issue 8 drafts
Issue ID: 1784
Category: Shell and Utilities
Type: Error
Severity: Objection
Priority: normal
Status: New
Name: Robert Elz
Organization:
User Reference:
Section: XCU 3 / getopts
Page Number: 2955 - 2959
Line Number: 98803 - 98966
Final Accepted Text:
======================================================================
Date Submitted: 2023-10-22 06:14 UTC
Last Modified: 2023-10-22 06:14 UTC
======================================================================
Summary: getopts specification needs fixing (multiple issues)
Description:
First:
Line 98807
and the index of the next argument to be processed in the
shell variable OPTIND.
Much the same is in the ENVIRONMENT VARIABLES section, lines 98888-9
say:
OPTIND This variable shall be used by the getopts utility as
the index of the next argument to be processed.
Which is the "next argument to be processed" - the argument after the
one that supplied the option written into the <i>name</i> arg, or the
argument that will be processed by the next call to getopts ?
It makes a difference when the argument in question has two (or more)
options in it, and anything but the last of them is being processed now.
Eg: (given an optstring with "xy" in it (no colons))
script -xy -d
if getopts is used in script to process those options, then
where name is set to 'x', this same arg will be processed again
next time to return 'y', but the "next argument" is the one
containing -d in many people's interpretation (and different shells
interpret it each way, in some OPTIND is 1 for 'x' and '2' for 'y',
in others it is 2 for both 'x' and 'y'). yash is different: its
(intermediate) OPTIND settings contain the index of the arg being
processed, a colon, and the index of the option char within that arg
(so would be 1:2 and 1:3 in this case).
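To make the ambiguity concrete, here is a minimal sketch that prints OPTIND after each successful getopts call. The function name trace_optind is made up for illustration, and the optstring "xyd" is an assumption so that -d is accepted as an option. No expected output is shown for the intermediate values, because those are exactly what differs between shells; only the final value (3 here, after getopts returns non-zero) is expected to agree.

```shell
# Hypothetical helper: show the option found and the value of OPTIND
# after each successful getopts call, then the value at end of options.
trace_optind() {
    OPTIND=1                       # restart option parsing
    while getopts xyd opt "$@"; do
        # Intermediate OPTIND values are shell-dependent.
        printf 'opt=%s OPTIND=%s\n' "$opt" "$OPTIND"
    done
    # Once getopts has returned non-zero, OPTIND is a plain integer.
    printf 'end OPTIND=%s\n' "$OPTIND"
}

trace_optind -xy -d
```

Running this under several shells side by side is a quick way to see which interpretation each one has chosen.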
The standard is unclear about what is intended here. It would be better to
simply say that the value of OPTIND at this point is unspecified, as
in practice there isn't anything much a script can do with it anyway,
even if we did pick one of the plausible interpretations. Pretending
that a simple integer is useful to the implementation (which the
definition at line 98888 does) is not helpful to anyone - to keep
track of what it is up to, the implementation either needs to use
some other mechanism (ie: not use OPTIND for anything except when
the application does OPTIND=1) or it needs (as yash does) to encode
more than just an integer into OPTIND.
Beyond that, is the term "index of" defined anywhere? (It isn't in XBD 3.)
If it is, there should be an xref; otherwise there should be a definition
given here. What is its format? For the usage when getopts returns
an exit status of 1, it is clearly intended to contain an integer, as
the EXAMPLES section shows at line 98951
shift $(($OPTIND - 1))
which wouldn't work if OPTIND were not an integer. But is that
also actually required of the OPTIND returned upon other invocations?
If the intent here was to rely upon the standard English use of
the term, then that fails, as there really isn't one of those. To
be useful, an index has to be relative to some base: is the
first option index 0 or index 1 (or something else) ?
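The one case where the format clearly matters is the shift idiom quoted above. A minimal sketch of it follows; the function name print_operands and the optstring "xy" are made up for illustration. It only works if, once getopts has returned non-zero, OPTIND is an integer that counts arguments starting from 1.

```shell
# Hypothetical helper: consume the options, then print the operands left.
print_operands() {
    OPTIND=1                       # restart option parsing
    while getopts xy opt "$@"; do
        :                          # the options are ignored in this sketch
    done
    # Discards exactly the arguments that getopts consumed, which
    # relies on OPTIND being a 1-based integer at this point.
    shift $((OPTIND - 1))
    printf '%s\n' "$@"
}

print_operands -x -y file1 file2
```

With a 0-based index, or with a non-integer value such as the intermediate ones yash uses, the arithmetic in the shift would either drop the wrong number of arguments or fail outright.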
On line 98836 it is stated:
The shell variables OPTIND and OPTARG shall be local to the
caller of getopts
WTF? What is that supposed to mean, that is, what does it mean
to be local to something, and what exactly is the "caller of getopts" ??
Really!
This is particularly absurd, as in the immediately following paragraph
(lines 98840-1) it says:
The shell variable specified by the name operand, OPTIND, and OPTARG
shall affect the current shell execution environment;
which makes sense, and is what implementations actually do. If that
shell environment is "the caller" then what does it mean to be "local",
that it isn't allowed to be exported? That it doesn't survive the
termination of that shell environment? If this last one, then why does
it need stating, what variables do survive the termination of the shell
environment? Or was something else fanciful intended there ?
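For reference, a minimal sketch of what "affect the current shell execution environment" looks like in practice. The function name demo_env and the optstring "ab" are made up for illustration. It runs the same getopts loop once in a subshell and once in the current environment, and prints OPTIND after each.

```shell
# Hypothetical demo: the same loop in a subshell and in the current shell.
demo_env() {
    OPTIND=1
    # Subshell: whatever getopts does to OPTIND and opt is lost on exit.
    ( while getopts ab opt -a -b; do :; done )
    printf 'after subshell: OPTIND=%s\n' "$OPTIND"
    # Current shell execution environment: the change persists.
    while getopts ab opt -a -b; do :; done
    printf 'after current shell: OPTIND=%s\n' "$OPTIND"
}

demo_env
```

This shows only the subshell boundary, which is the ordinary behaviour of any variable; it says nothing about what "local to the caller" might have been meant to add.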
Next, at lines 98862-3
the value in OPTARG shall be stripped of the option character and the
'-'.
So, if we have an optstring of "abc:d" and the invocation of
getopts is
getopts abc:d var -abcfoo -d
then when 'var' is set to 'c' OPTARG is supposed to be "abfoo" ? (that
is, we remove the 'c' and the '-' as instructed).
No, that can't be right. According to (or at least as implied by) XBD 12.1
(which isn't referenced anywhere in XCU 3/getopts, directly or indirectly;
only XBD 12.2 is), the option-argument is the string which follows the
option when it is included in the same argument as the option. So the 'ab'
should not be included, just "foo" - but the '-' does not follow the option
there either, so why is the standard saying that the '-' must be removed?
Why not just say that OPTARG is the option-argument (properly
defined by an xref) and leave it at that?
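The invocation above can be tried directly. A minimal sketch, with the function name show_optarg made up for illustration, that prints OPTARG only for the option that takes an option-argument:

```shell
# Hypothetical helper: report each option and, for 'c', its OPTARG.
show_optarg() {
    OPTIND=1                       # restart option parsing
    while getopts abc:d var "$@"; do
        case $var in
            c) printf 'var=%s OPTARG=%s\n' "$var" "$OPTARG" ;;
            *) printf 'var=%s\n' "$var" ;;
        esac
    done
}

show_optarg -abcfoo -d
# prints:
#   var=a
#   var=b
#   var=c OPTARG=foo
#   var=d
```

So OPTARG is just the text after the 'c', which is the XBD 12.1 reading and not what a literal reading of "stripped of the option character and the '-'" would give.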
Incidentally, XBD 3.244 is not very helpful here, all it says is an
Option-Argument is:
A parameter that follows certain options. In some cases an option-argument
is included within the same argument string as the option--in most cases
it is the next argument.
The "follows" is suggestive, but "included within the same argument string"
leaves more possibilities open. And why does that say "certain options"?
If it means options that require one, those aren't "certain". Just
"some options" would be better there.
In the RATIONALE, at lines: 98964-6 :
Although a leading <plus-sign> in optstring is required to have no
effect on the behavior of getopt(), this standard intentionally allows
implementations of the getopts utility to use a leading
<plus-sign> as an extension that alters behavior.
First, I am not sure just where it intentionally does that, the RATIONALE
isn't a normative part of the standard, so that paragraph can't be it,
did I miss something? But ignoring that...
Implementations are to be allowed to support a leading '+' in optstring.
But how does that affect (at line 98821, and I think other places, like
line 98895, there might be more):
If the first character of optstring is a <colon> ...
In XSH/getopt it is clear that the optional '+' precedes the optional ':'
in optstring, but if that is followed here, how can that ':' be the
first character of optstring? Must the application use only one or
the other, or is getopts doing the reverse of getopt() and requiring the
order be ":+..." (and if so, where does it say so) or should the wording here
be fixed so it works like the getopt() function ?
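For comparison, a minimal sketch of the case the current wording does cover, where the <colon> really is the first character of optstring. The function name show_silent and the optstring ":ab:" are made up for illustration. The '+' combination is deliberately not shown, since how it should be written is exactly the open question.

```shell
# Hypothetical helper: leading ':' in optstring, so getopts reports
# errors through the name variable and OPTARG instead of diagnostics.
show_silent() {
    OPTIND=1                       # restart option parsing
    while getopts :ab: var "$@"; do
        case $var in
            '?') printf 'unknown option: %s\n' "$OPTARG" ;;
            :)   printf 'missing argument for: %s\n' "$OPTARG" ;;
            *)   printf 'var=%s\n' "$var" ;;
        esac
    done
}

show_silent -a -z -b
# prints:
#   var=a
#   unknown option: z
#   missing argument for: b
```

If a '+' had to come before the ':' as it does for getopt(), the ':' would no longer be the first character, and the text as written would not say whether this behaviour still applies.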
And while we're here, the first mention of <i>options</i> (line 98803)
should contain an xref to XBD 3.243, the first mention of <i>option-arguments</i>
(also on line 98803) should have an xref to XBD 3.244, and the first mention of
<i>operand</i> (I think on line 98831) should have an xref to XBD 3.241.
These xrefs then each refer to XBD 12.1 which shows better than the
definitions how those things are formed (particularly in bullet point 1) -
but referencing the definitions is better I think (XBD 12.1 does not refer
back to XBD 3).
Desired Action:
Fix it all...
Maybe some wording, for some of it, may follow sometime later, in a note.
======================================================================
Issue History
Date Modified Username Field Change
======================================================================
2023-10-22 06:14 kre New Issue
2023-10-22 06:14 kre Name => Robert Elz
2023-10-22 06:14 kre Section => XCU 3 / getopts
2023-10-22 06:14 kre Page Number => 2955 - 2959
2023-10-22 06:14 kre Line Number => 98803 - 98966
======================================================================