Re: Pushing/restoring a file descriptor for a compound command

2018-04-27 Thread Stephane Chazelas
2018-04-28 00:23:51 +0200, Martijn Dekker:
[...]
> That said, do you have any opinion on whether something like
>{ ... ; } 3>&-
> should push/restore a closed file descriptor if it's already closed, so that
> the effect of exec-ing that descriptor within the compound command is local
> to that compound command?
[...]

Actually, in my list of "issues" I plan to some day raise to the
Austin Group, I had:

sh -c '{ exec > a; } > b; echo x' > c

which sounds like it's the same.

Do we honour the compound command asking for stdout to be
redirected, or do we honour the "> b" redirection being
temporary and applying only to the compound command?

I'm undecided here, though I'd tend to lean the same way as you
do.
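
For anyone who wants to see which way a given sh goes, here is a minimal
sketch that runs the one-liner above in a scratch directory and reports
which file ended up holding the "x". The scratch directory, the use of
mktemp -d (not specified by POSIX) and the variable names are assumptions
made for illustration; the result is expected to be either "a" or "c"
depending on the shell, which is exactly the open question.

```sh
# Run the one-liner in a scratch directory and report which file got "x".
# mktemp -d is assumed to be available; it is not specified by POSIX.
tmpdir=$(mktemp -d) || exit 1
(
    cd "$tmpdir" || exit 1
    sh -c '{ exec > a; } > b; echo x' > c
)

# Collect the names of the files that are non-empty afterwards.
landed=
for f in a b c; do
    if [ -s "$tmpdir/$f" ]; then
        landed="$landed$f"
    fi
done
rm -rf "$tmpdir"

echo "x landed in: $landed"
```

If the "exec > a" persists past the braces, the output names "a"; if the
"> b" redirection is undone and stdout is restored, it names "c".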

The problem can also be expressed as:

redir_stdout() {
  exec > "$1"
}

redir_stdout file1 > file2

Or:

close() { eval "exec $1>&-"; }
close 1 > file

f() {
  close 1
  redir_stdout file
  echo  foo
  close 1
}

f > /dev/null

POSIX could arbitrate based on the number of implementations
that go one or the other way. Or leave it unspecified (I've not
checked what it currently says).

-- 
Stephane



Re: Pushing/restoring a file descriptor for a compound command

2018-04-27 Thread Robert Elz
Date: Sat, 28 Apr 2018 03:53:12 +0200
From: Martijn Dekker
Message-ID:  <81b93245-e42f-ad62-4005-8ad676733...@inlv.org>

  |  How does NetBSD sh handle this?

This isn't really the best place for code samples, but ...

"fd" is the file descriptor in question:

if ((flags & REDIR_PUSH) && !is_renamed(sv->renamed, fd)) {
    INTOFF;
    if (big_sh_fd < 10)
        find_big_fd();
    if ((i = fcntl(fd, F_DUPFD, big_sh_fd)) == -1) {
        switch (errno) {
        case EBADF:
            i = CLOSED;
            break;
        case EMFILE:
        case EINVAL:
            find_big_fd();
            i = fcntl(fd, F_DUPFD, big_sh_fd);
            if (i >= 0)
                break;
            /* FALLTHRU */
        default:
            i = errno;
            INTON;  /* XXX not needed here ? */
            error("%d: %s", fd, strerror(i));
            /* NOTREACHED */
        }
    }
    if (i >= 0)
        (void)fcntl(i, F_SETFD, FD_CLOEXEC);
    fd_rename(sv, fd, i);

CLOSED is < 0 

fd_rename() does (aside from bookkeeping)

rl->orig = from;
rl->into = to;

The "sv" arg is the data struct where all this is
saved (rl is alloc'd memory, linked to it).

and then later, when things are being restored

if (rl->into < 0)
    close(rl->orig);
else
    movefd(rl->into, rl->orig);

movefd() ends up translating into dup2() with
error, and close-on-exec handling.

The FreeBSD sh is similar, but simpler - they don't deal
with user fd's >= 10, which makes the data structs needed
simpler, and much easier to avoid user fd's stepping all
over the shell's internal fds - we allow user fds to be anything
the system allows, and the shell makes sure it moves its
own fds around if needed to avoid issues.

The EINVAL and EMFILE handling is dealing with the
consequences of the user/script playing with ulimit, or
at least as much as is possible.

  | If it's a bug, surely it would meet that requirement.

Not for me to say, but if it doesn't have to be different
(rather than just would be nicer if it was different) then
it can be hard to justify making changes.

kre



Re: Pushing/restoring a file descriptor for a compound command

2018-04-27 Thread Martijn Dekker

On 28-04-18 at 01:55, Robert Elz wrote:

>  Date: Sat, 28 Apr 2018 00:23:51 +0200
>  From: Martijn Dekker
>  Message-ID:  <8800d6d5-67ea-fad4-19c3-dac4bbfd8...@inlv.org>
>
>    | That said, do you have any opinion on whether something like
>    | { ... ; } 3>&-
>    | should push/restore a closed file descriptor if it's already closed,
>
> I suspect that it is just a bug, the usual way to do this is
>
> saved_fd = fcntl(3, F_DUPFD, BIG_NUM);
> close(3);
>
> and later
>
> dup2(saved_fd, 3);
> close(saved_fd);
>
> Which works fine, provided fd 3 is open at the beginning, otherwise
> all of those fail with EBADF, which tends to just be ignored (and the
> dup2() failing leaves fd 3 with whatever it was set to in between.)

It works fine on NetBSD sh though, and on every other ash derivative I
know of except dash -- even for a file descriptor 3 that is closed at
the beginning. How does NetBSD sh handle this?

> Fixing it means adding code - as I understand it, in dash, that is
> something they avoid unless it is absolutely essential

If it's a bug, surely it would meet that requirement.

> - and if no-one
> can show where POSIX requires it, probably just won't happen.


The closest thing I've found is the first sentence of POSIX 2.7 
Redirection: "Redirection is used to open and close files for the 
current shell execution environment (see Shell Execution Environment) or 
for any command."


The first part of that sentence refers to exec'ing it, so is irrelevant 
here. The relevant bit is "or for any command". This includes compound 
commands.


I claim that this implies that a redirection added to a compound command 
should always cause the specified descriptor to be saved and restored, 
no matter the initial state of it, or the state initialised by the 
redirection.


The alternative seems inherently broken: the effect of an 'exec 3[...]&- redirection for the same FD creates a reasonable
expectation that it should restore that FD's state when leaving the
compound command.


Moreover, every current POSIX-compliant shell I know of works as I would 
expect, except bash and dash. The behaviour of dash appears to be unique 
for current ash derivatives, as FreeBSD sh, NetBSD sh, and Busybox ash 
also work as I would expect. I think that's further evidence that the 
behaviour of bash and dash should be considered a bug.


That's what I've got. Is that a sane interpretation?

It would be nice if there were something more unequivocal in the 
standard, but it seems there isn't...



> You might have more luck with bash (perhaps.)


Chet, what do you think?

Thanks,

- M.



Re: [1003.1(2016)/Issue7+TC2 0001193]: Brace expansion and {var}>file redirects in the shell

2018-04-27 Thread Chet Ramey
On 4/26/18 4:49 AM, Joerg Schilling wrote:

> See e.g. even a minor builtin like "times" that does not follow the standard:

This isn't exactly a compliance issue, since the standard only specifies
the format for the POSIX locale, and it doesn't include fractional seconds
anyway. It's a good idea to honor the locale, though.

Chet

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: Pushing/restoring a file descriptor for a compound command

2018-04-27 Thread Robert Elz
Date: Sat, 28 Apr 2018 00:23:51 +0200
From: Martijn Dekker
Message-ID:  <8800d6d5-67ea-fad4-19c3-dac4bbfd8...@inlv.org>

  | That said, do you have any opinion on whether something like
  | { ... ; } 3>&-
  | should push/restore a closed file descriptor if it's already closed,

I suspect that it is just a bug, the usual way to do this is

saved_fd = fcntl(3, F_DUPFD, BIG_NUM);
close(3);

and later

dup2(saved_fd, 3);
close(saved_fd);

Which works fine, provided fd 3 is open at the beginning, otherwise
all of those fail with EBADF, which tends to just be ignored (and the
dup2() failing leaves fd 3 with whatever it was set to in between.)

Fixing it means adding code - as I understand it, in dash, that is
something they avoid unless it is absolutely essential - and if no-one
can show where POSIX requires it, probably just won't happen.

You might have more luck with bash (perhaps.)

kre



Re: Pushing/restoring a file descriptor for a compound command

2018-04-27 Thread Martijn Dekker

On 27-04-18 at 23:38, Stephane Chazelas wrote:

> 2018-04-27 20:28:57 +0200, Martijn Dekker:
> [...]
>
>> : <&8 || echo "oops, closed"
>
> [...]
>
> Remember ":" is a special builtin, so its failure causes the
> shell to exit. Another one of those accidents of implementation
> of the Bourne shell that ended up being specified by POSIX, but
> which makes little sense (but that other shells ended up
> implementing for conformance).


Good point. Ignore the '|| echo "oops, closed"', it's pointless because
a failed redirection prints an error message anyway. I should have said
that no shell produces an error there.


That said, do you have any opinion on whether something like
   { ... ; } 3>&-
should push/restore a closed file descriptor if it's already closed, so 
that the effect of exec-ing that descriptor within the compound command 
is local to that compound command?


- M.



Re: Pushing/restoring a file descriptor for a compound command

2018-04-27 Thread Stephane Chazelas
2018-04-27 20:28:57 +0200, Martijn Dekker:
[...]
> : <&8 || echo "oops, closed"
[...]

Remember ":" is a special builtin, so its failure causes the
shell to exit. Another one of those accidents of implementation
of the Bourne shell that ended up being specified by POSIX, but
which makes little sense (but that other shells ended up
implementing for conformance).
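
One way to probe a descriptor without risking the calling shell is to do
the redirection in a subshell, so that a failed redirection on ":" can
only take down the subshell. The following is a minimal sketch of that
idea; the helper name fd_is_open and the choice of fd 8 are assumptions
made for illustration, not something taken from any shell's test suite.

```sh
# fd_is_open N: succeed if file descriptor N can be duplicated for input.
# The probe runs in a subshell, so a failed redirection on the special
# builtin ":" can only terminate that subshell, not the calling shell.
fd_is_open() {
    ( : <&"$1" ) 2>/dev/null
}

exec 8<&-                      # make sure fd 8 starts out closed
if fd_is_open 8; then state1=open; else state1=closed; fi

exec 8</dev/null               # now open it
if fd_is_open 8; then state2=open; else state2=closed; fi
exec 8<&-                      # and close it again

echo "$state1 $state2"         # the script is still running at this point
```

The cost is one fork per probe, but the exit-on-error rule for special
builtins no longer matters to the script doing the probing.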

-- 
Stephane



Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Robert Elz
Date: Fri, 27 Apr 2018 10:00:50 +0100
From: Geoff Clare
Message-ID:  <20180427090050.GA2538@lt2.masqnet>

quoting me:
  | > 4.  On the question of bug 985 ... (kind of related) - if quote removal is
  | > added to case pattern processing, it makes that into a different case from all
  | > of the others. [...]
  |
  | The danger here is that there are references to quote removal elsewhere

This isn't about any such potential dangers, which I don't think exist, but a
case where it seems to make a difference.

Consider this, where different shells produce different results:

$SHELL -c 'LC_ALL=C; case B in ([[:"alpha":]]) printf M;; (*) printf X;; esac'

bash, bosh and pdksh print 'X' (fail to match); everything else I have tested
(not posh or ksh88 - or a v7 sh) prints 'M' (matches).   That includes mksh,
ksh93 and all the ash derived shells I have access to.

In pdksh the issue is just that char classes don't match at all (not 
implemented) so that one we can ignore.  A true v7 sh would be the same.
(In those the input word 'p]' matches - or variants of that.)

The original test had var=alpha and the pattern was [[:"$var":]] but that
makes no difference at all (after expansion the two cases look the same).
"No difference" means the different shells produce the same results this way
as they do the other way, whether matching or not.

If either quote removal is specified to happen before pattern matching (but I 
really think that would break too many other cases) or if the way quoted 
strings are encoded in the shell is not literally as "string" then this matches
(quoted "alpha" is still alpha) (similarly if the pattern match code was
"clever" about quotes in patterns, aside from \ - but it is not, in any shell,
so I think that option is out of consideration).

This works (with either the literal [[:alpha:]] or with [[:$var:]]) when the double
quotes are not present (except in pdksh of course.)
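
The unquoted forms can be checked with a short script. This is a sketch
only: the function name match_b is made up for illustration, and it
assumes a shell that implements character classes in case patterns (so
not pdksh or a v7 sh, as noted above).

```sh
# Match the word B against [[:alpha:]] written literally and built from
# a variable, with no double quotes inside the bracket expression.
match_b() {
    LC_ALL=C
    var=alpha
    case B in ([[:alpha:]]) printf M;; (*) printf X;; esac
    case B in ([[:$var:]]) printf M;; (*) printf X;; esac
    printf '\n'
}

# Run in a command substitution so LC_ALL and var do not leak out.
result=$(match_b)
echo "$result"
```

Each M is a match and each X a failure to match, so a shell that handles
both unquoted forms prints MM.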

It does not work anywhere, and I would not really expect it to, with the pattern
being [[:$var:]] (no quotes) with var='"alpha"' (though that would not be out of
the question if the "clever" quotes in patterns model was adopted.)   (The actual
test case gets a bit ugly to get the quoting right to allow that to be input, but
that is not the issue.)

kre



Re: Laundry list

2018-04-27 Thread Steffen Nurpmeso
Eric Blake  wrote:
 |On 04/27/2018 12:10 PM, Martijn Dekker wrote:
 |> I don't know of any way to accomplish that except by the de-facto
 |> standard mechanism of "#! /usr/bin/env sh". There is a long-time and
 |> highly widespread expectation that this will work.
 |> 
 |>>   In addition to shell
 |>> scripts, the shebang hack is also commonly used with awk and sed
 |>> scripts (just to name two other POSIX-specified languages).
 |> 
 |> IMO, that's another good reason to standardise the hashbang path plus
 |> the location of /usr/bin/env.
 |
 |If we standardize #! and the existence of /usr/bin/env, we should also
 |consider standardizing the BSD invention of 'env -S' that GNU coreutils
 |is now copying, as it serves as a very nice workaround for passing
 |multiple arguments to the real interpreter through the #! line even when
 |the OS passes only a single argument to env (as the #! interpreter).

Oh yes, this would be a tremendous improvement!
Older FreeBSD (and/or MacOS) supported things like

  #!/usr/bin/env PERL5OPT=-C0 perl

and this is just impossible to do except by providing a complete
wrapper script otherwise (making self-containment a real problem
thus).  Having a standardized portable -S would be great.

--steffen
|
|Der Kragenbaer,The moon bear,
|der holt sich munter   he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)



Re: Laundry list

2018-04-27 Thread Eric Blake
On 04/27/2018 12:10 PM, Martijn Dekker wrote:

> 
> I don't know of any way to accomplish that except by the de-facto
> standard mechanism of "#! /usr/bin/env sh". There is a long-time and
> highly widespread expectation that this will work.
> 
>>   In addition to shell
>> scripts, the shebang hack is also commonly used with awk and sed
>> scripts (just to name two other POSIX-specified languages).
> 
> IMO, that's another good reason to standardise the hashbang path plus
> the location of /usr/bin/env.

If we standardize #! and the existence of /usr/bin/env, we should also
consider standardizing the BSD invention of 'env -S' that GNU coreutils
is now copying, as it serves as a very nice workaround for passing
multiple arguments to the real interpreter through the #! line even when
the OS passes only a single argument to env (as the #! interpreter).

https://www.freebsd.org/cgi/man.cgi?query=env
https://lists.gnu.org/archive/html/coreutils/2018-04/msg00011.html

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: Pushing/restoring a file descriptor for a compound command

2018-04-27 Thread Martijn Dekker

On 27-04-18 at 18:49, Martijn Dekker wrote:
> The author of dash, Herbert Xu, said in response to my bug report that
> dash is not obligated to push and restore that file descriptor if it is
> already closed when entering the compound command -- implying that a
> file descriptor should not be pushed if the local state would be
> different from the parent state.


Actually, that implication that I understood is not even correct. No
shell, not even bash and dash, outputs "oops, closed" for the following:


exec 8[...]

To rephrase my request: I would welcome opinions on whether the
dash/bash behaviour shown in the original message should reasonably be
considered a bug in standards terms.


Thanks,

- M.



Re: Laundry list

2018-04-27 Thread Martijn Dekker

On 27-04-18 at 18:52, Garrett Wollman wrote:

> [...] said:
>
>> The "#!" should be standardized - at this point, if your system doesn't
>> support it, everyone will consider your system broken.
>>
>> I realize that there's some concern about standardizing pathnames, but
>> standardizing "/usr/bin/env" seems extremely reasonable, as well as bare
>> names like "bash".
>
> There's no need to standardize the actual pathname; a script can
> include installation instructions to add the shebang (with an
> appropriate, system-specified path), or indeed can come with an
> installation script that does so automatically by invoking "command
> -v" to find the path to the desired interpreter.

That installation script would need a way to run without editing that
path (or explicitly specifying the shell on the command line) in the
first place, or it would be a bit pointless.

I don't know of any way to accomplish that except by the de-facto
standard mechanism of "#! /usr/bin/env sh". There is a long-time and
highly widespread expectation that this will work.

>   In addition to shell
> scripts, the shebang hack is also commonly used with awk and sed
> scripts (just to name two other POSIX-specified languages).


IMO, that's another good reason to standardise the hashbang path plus 
the location of /usr/bin/env.


Thanks for the feedback,

- M.



RE: Laundry list

2018-04-27 Thread Garrett Wollman
[...] said:

> The "#!" should be standardized - at this point, if your system doesn't
> support it, everyone will consider your system broken.

> I realize that there's some concern about standardizing pathnames, but
> standardizing "/usr/bin/env" seems extremely reasonable, as well as bare
> names like "bash".

There's no need to standardize the actual pathname; a script can
include installation instructions to add the shebang (with an
appropriate, system-specified path), or indeed can come with an
installation script that does so automatically by invoking "command
-v" to find the path to the desired interpreter.  In addition to shell
scripts, the shebang hack is also commonly used with awk and sed
scripts (just to name two other POSIX-specified languages).
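
As a rough illustration of the "command -v" approach, an installation
step along these lines could write the shebang at install time. This is a
sketch under stated assumptions: the function name add_shebang, the file
names and the use of mktemp -d (not specified by POSIX) are all made up
for the example, and it does not address how the installer itself gets
started.

```sh
# add_shebang INTERP SRC DEST: write DEST as SRC preceded by a "#!" line
# naming the absolute path that "command -v" reports for INTERP here.
add_shebang() {
    interp_path=$(command -v "$1") || return 1
    case $interp_path in
        /*) ;;              # an absolute pathname, usable after "#!"
        *)  return 1 ;;     # a builtin, function or alias: not usable
    esac
    { printf '#!%s\n' "$interp_path"; cat "$2"; } > "$3" && chmod +x "$3"
}

tmpdir=$(mktemp -d) || exit 1
printf 'echo "hello from the installed script"\n' > "$tmpdir/body"
add_shebang sh "$tmpdir/body" "$tmpdir/script"
first_line=$(head -n 1 "$tmpdir/script")
rm -rf "$tmpdir"

echo "$first_line"
```

The first line printed is whatever path this system uses for sh, which is
the point: the pathname is discovered rather than standardized.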

-GAWollman



Pushing/restoring a file descriptor for a compound command

2018-04-27 Thread Martijn Dekker

I need references and opinions about the following, please.

Consider:

{
exec 8[...]

Should the effect of the 'exec' persist past the compound command (the
curly braces block)?


My expectation is that the '8<&-' should push file descriptor 8 onto the 
shell's internal stack in a closed state, so that it is restored at the 
end of the block.


I think this should allow the effect of 'exec' to be local to that 
compound command, and I think this construct should be nestable.


According to my testing, nearly all shells do this. However, two very 
widely used shells, dash and bash, leave the file descriptor open beyond 
the block.


The author of dash, Herbert Xu, said in response to my bug report that 
dash is not obligated to push and restore that file descriptor if it is 
already closed when entering the compound command -- implying that a 
file descriptor should not be pushed if the local state would be 
different from the parent state.


He asked me for a POSIX reference proving otherwise. I can't find any.

Without this behaviour, an awkward workaround is required. To guarantee 
that the FD will be restored at the end of the block, you'd need to 
attempt to push it twice in two different states in two nested compound 
command blocks, for instance:


{
{
exec 8[...]/dev/null

Does anyone have pointers to the POSIX text, or other strong evidence, 
that this workaround should not be necessary?
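
Parts of the example code in this message did not survive in the archive.
As a sketch only, assuming the inner redirections were of the form
8</dev/null and the outer one 8<&- (which fits the description above),
the nested workaround could be tried like this. The helper fd_is_open is
hypothetical and probes the descriptor in a subshell so that a failure
cannot exit the script.

```sh
# fd_is_open N: succeed if file descriptor N can be duplicated for input.
fd_is_open() {
    ( : <&"$1" ) 2>/dev/null
}

exec 8<&-                        # start with fd 8 closed

{
    {
        exec 8</dev/null         # open fd 8 inside the inner block
    } 8</dev/null                # inner push: fd 8 in an open state
} 8<&-                           # outer push: fd 8 in a closed state

if fd_is_open 8; then after=open; else after=closed; fi
echo "fd 8 after the blocks: $after"
```

With the double push, the descriptor should be closed again after the
outer block regardless of which of the two behaviours the shell has.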


Thanks,

- M.



RE: Laundry list

2018-04-27 Thread Wheeler, David A
> From: Martijn Dekker [mailto:mart...@inlv.org]
...
> And that's the point of this message. I think that (A) certain
> universally implemented but unstandardised features should be
> standardised, and (B) standardised features that nobody implements
> should be removed, especially if they are inherently broken. Before
> cluttering the Austin group bug system with each of the points below,
> I'd be interested in your opinions on these.

I applaud this approach in general.  In my mind, widespread implementation
is a very good indication that something may be ready for standardization.
Sure, it's not ALWAYS true, but many of these look reasonable.

A few points below.

> A. Universally implemented but unstandardised features
> 
> A1. File names longer than 14 bytes. (_POSIX_NAME_MAX == 14)
> 
> What system still limits file names to 14 bytes?

For non-embedded systems, this should be increased.
This is too large for FAT filesystems (which are slowly going extinct),
and excessively small for everything else.  I don't know what the right
number is, but 255 seems pretty common for pathname components.


> A2. /usr/bin/env
> 
> This is very, very commonly used in hashbang paths to deal with varying
> locations of the shell, e.g.:
> 
>   #! /usr/bin/env bash

The "#!" should be standardized - at this point, if your system doesn't
support it, everyone will consider your system broken.

I realize that there's some concern about standardizing pathnames, but
standardizing "/usr/bin/env" seems extremely reasonable, as well as bare
names like "bash".

> A3. /dev/urandom
> 
> Access to cryptographically strong randomness is essential in 2018.

Agreed, and having a standard way to access it from the shell would
encourage its use.


> A5. The -nt, -ot, -ef operators in test/[
> 
> All current shells support these. This is the only built-in way shells
> have to determine if a file is newer, older or the same file as another
> file. As far as I know, other POSIX utilities do not provide for this
> possibility. So I think these should be standardised.

Agreed.  I believe you can trick make into doing this,
but that would be hideous.

If shells commonly support it, then it should be in the standard.
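
For reference, a minimal sketch of the three operators in use. It assumes
a shell whose test/[ implements them (they are not in the standard, which
is the point of A5); the file names and the use of mktemp -d are
assumptions made for the example.

```sh
tmpdir=$(mktemp -d) || exit 1

# Give the two files clearly different modification times.
touch -t 201801010000 "$tmpdir/old"
touch -t 201804270000 "$tmpdir/new"
ln "$tmpdir/new" "$tmpdir/alias"     # hard link: same file, second name

if [ "$tmpdir/new" -nt "$tmpdir/old" ]; then nt=yes; else nt=no; fi
if [ "$tmpdir/old" -ot "$tmpdir/new" ]; then ot=yes; else ot=no; fi
if [ "$tmpdir/new" -ef "$tmpdir/alias" ]; then ef=yes; else ef=no; fi
rm -rf "$tmpdir"

echo "nt=$nt ot=$ot ef=$ef"
```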


> A7. 'var=foo exec somecommand ...' exports var to somecommand

Agreed.
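
A quick way to check A7 on a given shell, as a sketch: the exec runs
inside a command substitution so that it replaces only that subshell, and
the variable name var is taken from the example above.

```sh
unset var

# The assignment prefixes exec; the exec'd sh reports what it inherited.
result=$( var=foo exec sh -c 'printf "%s\n" "${var-unset}"' )

echo "$result"
```

A shell that exports the assignment to the exec'd command prints "foo";
one that does not would print "unset".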

--- David A. Wheeler




Re: Laundry list

2018-04-27 Thread Geoff Clare
Robert Elz  wrote, on 27 Apr 2018:
>
>   | In the unlikely event we do another TC, 
> 
> With no supporting evidence whatever (not even rumor)
> I had been supposing that there might be a new TC, perhaps
> next year (and hence the tc3 tags) - and that issue8 was not
> likely until perhaps 2022 or later.
> 
> It would be nice to hear of an advanced timeline for issue8
> (over my guess, I mean)
> 
> What are the plans?  (Such as they are known currently.)

We are using tc3 tags on bugs that would be suitable for inclusion
in a TC just in case we end up doing one, but we don't plan to.

There's a lot of work to do for Issue 8, so your guess of 2022 at
the earliest is probably not far wrong.

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Robert Elz
Date: Fri, 27 Apr 2018 15:24:30 +0100
From: Geoff Clare
Message-ID:  <20180427142430.GB9716@lt2.masqnet>

  | This discussion seems to have come round to the same issue that was
  | raised recently in some comments in bug 1190, specifically Stephane's
  | notes 3960 and 3962 and my reply in note 3963.

Yes, I remembered seeing something like that, somewhere...

  | In summary: the need for a way to store a pattern in a variable such
  | that a pattern-magic character can be treated literally

Yes, that is the need.

  | is a reason to keep the first paragraph of 2.13.1 as-is and say that
  | shells which behave differently than bash here do not conform.

That would be nice.   I was going to say that I expect that Jörg would
not agree - but I see he has already done that.

For now the best that might be possible, given that almost no shells
do this,  would be to make it unspecified whether this works, and
mark it as a future direction that a later rev will require it.

kre




Re: Laundry list

2018-04-27 Thread Robert Elz
Date: Fri, 27 Apr 2018 15:38:19 +0100
From: Geoff Clare
Message-ID:  <20180427143819.GA10446@lt2.masqnet>

Completely changing the topic ...

  | In the unlikely event we do another TC, 

With no supporting evidence whatever (not even rumor)
I had been supposing that there might be a new TC, perhaps
next year (and hence the tc3 tags) - and that issue8 was not
likely until perhaps 2022 or later.

It would be nice to hear of an advanced timeline for issue8
(over my guess, I mean)

What are the plans?  (Such as they are known currently.)

kre



Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Robert Elz
Date: Fri, 27 Apr 2018 16:20:01 +0200
From: Joerg Schilling
Message-ID:  <5ae33191.adgpivkbwgx8dc1y%joerg.schill...@fokus.fraunhofer.de>

  | But you forgot that after this variable content is expanded, it is quoted
  | in a way to keep the content in the final result.

I didn't forget that, because it doesn't happen.   That's what the

bosh -c 'var="???";printf "%s\n" ${var}'

was meant to show.   The "???" is not kept in the final result,
it is expanded to produce all the 3 character filenames.
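
The same effect can be seen in a controlled directory. This is a sketch
with made-up file names (abc, xyz, toolong) and mktemp -d, which is not
specified by POSIX; it only shows that the result of an unquoted
expansion is still subject to pathname expansion.

```sh
tmpdir=$(mktemp -d) || exit 1
touch "$tmpdir/abc" "$tmpdir/xyz" "$tmpdir/toolong"

# Unquoted: the ??? that results from the expansion is used as a pattern.
matches=$(
    cd "$tmpdir" || exit 1
    var="???"
    printf '%s\n' ${var}
)

# Quoted in the original word: the ??? stays literal.
quoted=$(
    cd "$tmpdir" || exit 1
    var="???"
    printf '%s\n' "${var}"
)
rm -rf "$tmpdir"

printf '%s\n' "$matches" "$quoted"
```

The unquoted case lists the two three-character names, the quoted case
prints ??? itself.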

  | This however requires the macro 
  | expansion code (parameter expansion) to quote the \ at the end of the macro 
  | expansion to allow the \ to be kept visible after the final quote removal.

It doesn't require anything of the kind.   That \ is not subject to quote
removal, as it was not part of the original word.   Only quotes that were
in the original word get removed.   Sure, quoting it might be one way to
make that work, provided you can do it properly - but that does not
duplicate the original shell.

Remember, as you showed the code earlier, the original Bourne sh
parsed original word quoting by setting the QUOTE bit on the quoted
text.   Results of expansions don't get that.  Then quote removal is
just clearing that bit - it is all simple (and easy to code, and small,
which is why I assume it was done that way - despite all the idiotic
quoting rules it has left us with).

  | If this is not in the POSIX text,

It isn't, and should not be, as it is simply wrong.

The way the NetBSD sh (and original ash) copes with field splitting,
(and quote removal, or could, though that's actually done differently)
is by remembering (and updating as it changes) offsets into the word
to keep track of which chars are originals, and which are the results
of expansions.   The FreeBSD sh (being based upon ash)
used to be the same, but they rewrote all of that part and now do it
a different way (but certainly not quoting the results of expansions).

kre




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Geoff Clare
Joerg Schilling  wrote, on 27 Apr 2018:
>
> Geoff Clare  wrote:
> 
> > In summary: the need for a way to store a pattern in a variable such
> > that a pattern-magic character can be treated literally is a reason to
> > keep the first paragraph of 2.13.1 as-is and say that shells which
> > behave differently than bash here do not conform.
> 
> I am not convinced since _all_ other shells behave the same and since
> changing this in the shell would result in other misbehavior as well.
>
> Your wish would e.g. result in a misbehaving "case".

The comments in bug 1190 that I referred to (in the part you snipped)
are about "case"!

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



Re: Laundry list

2018-04-27 Thread Stephane Chazelas
2018-04-27 15:11:26 +0200, Martijn Dekker:
[...]
> For: someshell -c 'foo; echo does not exit'
> 
> Historical Bourne:
> - Xenix sh (1988): exits!
[...]

For the record, using a PDP11 emulator running Unix V7 (so the
original implementation of the Bourne shell in the late 70s):

$ sh -c 'foo; echo $?'
sh: foo: not found
1

So, it must have been broken later on.

-- 
Stephane



Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Robert Elz
Date: Fri, 27 Apr 2018 09:33:49 -0400
From: Shware Systems
Message-ID:  <163074f534e-c83-4...@webjas-vaa062.srv.aolmail.net>

  | For my analysis, 2.6.5 says it is results which are subject to field
  | splitting,

Yes, but irrelevant here

  |  with the parameter expand and direct entry both being one field as the
  | pattern to evaluate according to 2.6.6,

yes.

  | and the treatment of the double quotes follows from 2.13.1

that is how I read the text.   I kind of doubt that is how it is intended to
work, but that is what it looks like to me as well.

  | before removal by 2.6.7

those quotes would not be removed by that, but that should only
matter if the pattern matches no files - otherwise the pattern, and its
quotes, is removed, and the file names produced appear instead.

  | processing. 2.13.1 effectively has the quotes ignored,

That's how I read it.   Of course, all this is based upon the (frankly
bogus) specification that quoting characters in words are retained
as is in the word for later processing.

  | using only the chars in between (the one ?), for matching purposes.

Yes, again, that is how I would read the current text.

  | 2.6.7 does not properly account for that when a pattern has been evaluated,
  | the ignored quotes are required to be removed to reflect the intent of the
  | pattern.

No, that's not what happens.   If the pattern matches any files, the pattern 
vanishes, and the matched file names replace it (as many fields as needed).
Any quote characters produced there (files that contain quote characters
in their names) must be retained (I have plenty of those in my test directory.)

If the pattern does not match, the word will be retained unchanged, and the
quotes will remain in it.   That's actually useful.

  | What is there now is more the requirements when set -f in effect,

No, it is not that - filename generation still happens, what's missing is
any processing of the quote characters.

  | and then quotes from var expansions, not being in the original input, would
  | be expected to stay in the result as literals.

Yes, agreed - either when filename expansion does not happen, or when
no files are matched.

kre
 
ps: please could you avoid top posting - my messages are long and boring 
enough the first time, no-one needs to get them resent in full as a part of
a reply!



Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Joerg Schilling
Geoff Clare  wrote:

> In summary: the need for a way to store a pattern in a variable such
> that a pattern-magic character can be treated literally is a reason to
> keep the first paragraph of 2.13.1 as-is and say that shells which
> behave differently than bash here do not conform.

I am not convinced since _all_ other shells behave the same and since changing
this in the shell would result in other misbehavior as well.

Your wish would e.g. result in a misbehaving "case".

Jörg

-- 
 EMail:jo...@schily.net(home) Jörg Schilling D-13353 Berlin
joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/



Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Robert Elz
Date: Fri, 27 Apr 2018 15:23:10 +0200
From: Joerg Schilling
Message-ID:  <5ae3243e.8dyd5s4eftmrpyui%joerg.schill...@fokus.fraunhofer.de>

  | Robert Elz  wrote:
  |
  | > But it looked right, so I changed (not yet committed,
  |
  | This would be a mistake.

Perhaps.

  | > Then I started pondering other quote characters, since the quote
  | > characters are still in the string, that is, if the command were
  | >
  | >   $SHELL -c 'printf "%s\n" [a-e]\?.*'
  |
  | This is a different example, as you here have a quoted '?' instead of a
  | quoted \ as in the first example.

There was never a quoted \ (except in the assignment to var).

  | >   bosh  -c 'printf "%s\n" [a-e]\?.*'
  | >   a?.??
  | >   b?.??
  | >   c?.??
  | >   e?.??
  |
  | See above, a different example results in a different behavior.

Of course, but the original example was

${SHELL} -c 'var="[a-e]\?.*";printf "%s\n" ${var}'
or
${SHELL} -c 'var="[a-e]\\?.*";printf "%s\n" ${var}'

which are identical to each other in effect.   The only difference
from the bosh example above is that this one has the pattern
(the same pattern) in a variable, where the bosh one had it
on the command line.
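
That the two assignments store the same value can be checked directly. A
small sketch, with var1 and var2 as illustrative names: inside double
quotes a backslash before ? is kept as-is, and \\ collapses to a single
backslash, so both variables end up holding [a-e]\?.*

```sh
var1="[a-e]\?.*"      # \? inside double quotes: the backslash is retained
var2="[a-e]\\?.*"     # \\ inside double quotes: becomes one backslash

printf '%s\n' "$var1" "$var2"

if [ "$var1" = "$var2" ]; then same=yes; else same=no; fi
echo "same=$same"
```

What the shell then does with that stored backslash when ${var} is
expanded unquoted is the part the shells disagree on.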

  | >   bosh -c 'var="???";printf "%s\n" ${var}' | wc -l
  | >   2297
  |
  | I am not sure what this should point to.

It indicates that the results of a variable expansion are not
"internally quoted" which is how you justified the earlier
example not working.   If the ${var} result was somehow
quoted, the ? chars that result would be quoted, and so
would not be matching characters.   But they're not, so
they are.   This is working as it should be, and there is
no "internal quoting" being performed.

kre




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Geoff Clare
Robert Elz  wrote, on 27 Apr 2018:
>
> Date: Fri, 27 Apr 2018 15:06:52 +0200
> From: Joerg Schilling
> 
>   | Since bash seems to be the only shell that works this way,
> 
> Until I changed the NetBSD sh (if that change is retained), yes.
> 
>   | I would call this a bug.
> 
> Then I think it would be also a bug in POSIX (as I think it
> actually specifies this result) and a deficiency - as there
> really needs to be a way to store a pattern in a variable
> such that a pattern-magic character can be treated literally.

This discussion seems to have come round to the same issue that was
raised recently in some comments in bug 1190, specifically Stephane's
notes 3960 and 3962 and my reply in note 3963.

In summary: the need for a way to store a pattern in a variable such
that a pattern-magic character can be treated literally is a reason to
keep the first paragraph of 2.13.1 as-is and say that shells which
behave differently than bash here do not conform.

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Robert Elz
Date:Fri, 27 Apr 2018 15:17:41 +0200
From:Joerg Schilling 
Message-ID:  <5ae322f5.uw3u84gim9o+bvrx%joerg.schill...@fokus.fraunhofer.de>

  | See my recent reply, this does not result in a quoted \.

Of course it doesn't - no-one wants (or ever attempted) a quoted \,
we want a quoted '?'

kre




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Joerg Schilling
Robert Elz  wrote:

> The examples with "" characters I expect will simply remain as they
> are in all shells, and the code I have been in the process of writing
> to allow that to "work" (based on the assumption that there is no reason
> why not - and even now, except that it doesn't work that way in other
> shells, I see no good reason to doubt) should just be consigned to the
> scrap heap (that code doesn't even compile yet, so no big loss.)
>
>   | In your example, expand() is told to expand:
>   |
>   |   [a-e]\\?.*
>
> No it isn't.  I said the \\ was irrelevant and I meant it.
>
> In
>   var="[a-e]\\?.*"
>
> which is the command that was used, the first \ is a quoting
> character, and is removed by quote removal (as are the
> enclosing "") just before the assignment to var is performed.
>
> The value assigned to var is
>
>   [a-e]\?.*

But you forgot that after this variable content is expanded, it is quoted in a 
way to keep the content in the final result. This however requires the macro 
expansion code (parameter expansion) to quote the \ at the end of the macro 
expansion to allow the \ to be kept visible after the final quote removal.

If this is not in the POSIX text, this is a bug of the same quality as the
incorrect Backus-Naur grammar for the shell in the POSIX standard text.

Jörg

-- 
 EMail:jo...@schily.net(home) Jörg Schilling D-13353 Berlin
joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/'



Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Geoff Clare
Robert Elz  wrote, on 27 Apr 2018:
>
> Date:Fri, 27 Apr 2018 10:00:50 +0100
> From:Geoff Clare 
> 
>   | I believe the former text is misleading and should be deleted.  It is
>   | effectively duplicating the requirements regarding backslashes stated in
>   | 2.2.1 and 2.2.3, but gets the details wrong.
> 
> Except that here it is talking about quoting characters in patterns,

Oops, you're right.  For some reason I had it in my head that this
special pattern-matching meaning was covered elsewhere, but now that I
look again I see that this is the place.

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Robert Elz
Date:Fri, 27 Apr 2018 15:06:52 +0200
From:Joerg Schilling 
Message-ID:  <5ae3206c.gzrnd81xboh3e0x7%joerg.schill...@fokus.fraunhofer.de>

  | Since bash seems to be the only shell that works this way,

Until I changed the NetBSD sh (if that change is retained), yes.

  | I would call this a bug.

Then I think it would be also a bug in POSIX (as I think it
actually specifies this result) and a deficiency - as there
really needs to be a way to store a pattern in a variable
such that a pattern-magic character can be treated literally.

I will leave it for Chet to say whether or not he considers this
to be a bug in bash.

  | I tested Historic Bourne, ksh88, ksh92, dash, yash, mksh posh, zsh, bosh.

I agree, and the FreeBSD and currently released (and all available)
NetBSD shells as well.

  | BTW: with the previous example, the "expand" function is told to expand:
  |
  | a*"?

That's the one where I missed the closing quote (deliberately) - let's
just forget that one for now until we get a real conclusion on what
should happen with pairs of quotes (and more importantly, \ quoting).

The examples with "" characters I expect will simply remain as they
are in all shells, and the code I have been in the process of writing
to allow that to "work" (based on the assumption that there is no reason
why not - and even now, except that it doesn't work that way in other
shells, I see no good reason to doubt) should just be consigned to the
scrap heap (that code doesn't even compile yet, so no big loss.)

  | In your example, expand() is told to expand:
  |
  | [a-e]\\?.*

No it isn't.  I said the \\ was irrelevant and I meant it.

In
var="[a-e]\\?.*"

which is the command that was used, the first \ is a quoting
character, and is removed by quote removal (as are the
enclosing "") just before the assignment to var is performed.

The value assigned to var is

[a-e]\?.*

which is exactly  the same as when the command was

var="[a-e]\?.*"

as there the \ is not a quoting character, as '?' isn't one
of the magic few that \ can quote inside a double quoted
string -- but another \ is.

If I had used
var='[a-e]\\?.*'

that would be different, there neither \ is a quoting char, and
what you said would be expanded would be correct.  But that
is not what was done (as I was using, as I always do when I
can, single quotes around the arg to sh -c - using single quotes
inside that string then gets ugly (bad for examples when the quoting
is not the point) so I avoid that when possible (of course, the
test cases include examples like that - doesn't matter if they're
incomprehensible.)
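
The three assignments discussed above can be compared directly. This is only
a sketch; the variable names a, b and c are made up, and the expansions are
quoted so that no pathname expansion gets in the way of seeing the stored
values:

```shell
a="[a-e]\\?.*"    # inside "", \\ is a quoted backslash: one \ is stored
b="[a-e]\?.*"     # inside "", \ cannot quote ?, so the \ is kept as-is
c='[a-e]\\?.*'    # inside '', nothing is special: both \ are stored

# Quoted expansions, so the values are printed exactly as stored.
printf '%s\n' "$a" "$b" "$c"
```

The first two lines printed are both [a-e]\?.* and the third is [a-e]\\?.*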

  | But:
  |
  | sh -c  'var="[a-e]?.*";printf "%s\n" ${var}'   
  | a?.??
  |
  | ...I have only one matching file.

This is an entirely different pattern, which matches a whole
different set of files (including the ones that the other pattern
matches - sometimes)

bosh -c  'var="[a-e]?.*";printf "%s\n" ${var}' |wc -l
  84

again, the wc is just because you really don't want to see the
list of odd filenames that match that pattern.

bosh is correct incidentally, all shells produce the same 84
files, but this is a very easy case.

The idea is to match files that contain a letter (one of the 5)
followed by a literal character '?' followed by a literal character '.'
followed by anything at all.   And to store that pattern in a
variable.  The literal '.' is no problem, the question is how
to encode the literal ?.

I showed one way, using pattern magic, in my reply to Geoff,
the question is why not using shell quoting as well.

Note that the section in 2.13.1 (which Geoff says is the correct
explanation of quoting in patterns) says:

When pattern matching is used where shell quote removal is not performed
[...]
special characters can be escaped to remove their special meaning by
preceding them with a <backslash> character.

"special characters" there is referring to the '*' '?' and '[' chars, and the
section goes on to allow \\ for matching a literal '\'.
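
As an illustration of backslash escaping where the shell is not the one doing
the matching, here is a sketch using find -name, whose pattern argument is
matched by find itself after the shell has removed its own quotes. The scratch
directory and file names are assumptions:

```shell
# Scratch directory and file names are made up for this sketch.
dir=$(mktemp -d) || exit 1
touch "$dir/a?.xx" "$dir/ab.xx"

# The single quotes are removed by the shell; find receives a\?.* unchanged,
# and its matcher treats \? as a literal question mark.
escaped=$(( $(find "$dir" -type f -name 'a\?.*' | wc -l) ))

# Without the backslash, ? matches any single character.
plain=$(( $(find "$dir" -type f -name 'a?.*' | wc -l) ))

printf 'escaped: %d, plain: %d\n' "$escaped" "$plain"
rm -rf "$dir"
```

The escaped pattern finds only the file with a literal ? in its name, the
plain one finds both.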

Since 2.6.7 (Quote Removal) says ...

The quote characters (<backslash>, single-quote, and double-quote) that
were present in the original word shall be removed unless they have
themselves been quoted.

which means that quote removal is not performed on text in a word that came
from the results of an expansion (that's not the original word) and so one
could read 2.13.1 as saying that \ quoting of special characters is available
in this context, since quote removal is not performed there (which then makes
it just the same as in literal patterns in the text, though there the \ acts as
a quoting character, and quotes the special characters that way).

Now I am quite willing to admit (especially given that shells have not
historically implemented it this way) that this might not be intended,
and that perhaps the spec needs to be changed to 

Re: Laundry list

2018-04-27 Thread Martijn Dekker

Op 27-04-18 om 15:11 schreef Martijn Dekker:

Op 26-04-18 om 16:19 schreef Geoff Clare:

[...]

| > B4. Shell may exit if command not found. (XCU 2.8.1, last row in table)
| >
| > I was only just made aware by kre that POSIX allows this. It's another thing
| > that no shell actually does.
|
| Are you sure?  If no shell does it, then it seems odd that it was
| allowed in the resolution of bug 882 which was relatively recent (2015).


Hmm. OK, time for some testing.

[...]
The original inspiration for POSIX, ksh88, does not exit, and neither 
does anything since -- unless there is still a shell I haven't 
discovered. Do you know of any others?


Yes, that would be the z/OS POSIX shell then. Does anyone know what it does?

Also, does anyone know whether this shell is an entirely separate entity 
or a port/derivative of another shell incarnation?


Thanks,

- M.



Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Shware Systems
For my analysis, 2.6.5 says it is results which are subject to field splitting, 
with the parameter expand and direct entry both being one field as the pattern 
to evaluate according to 2.6.6, and the treatment of the double quotes follows 
from 2.13.1 before removal by 2.6.7 processing. 2.13.1 effectively has the 
quotes ignored, using only the chars in between (the one ?), for matching 
purposes. 2.6.7 does not properly account for that when a pattern has been 
evaluated, the ignored quotes are required to be removed to reflect the intent 
of the pattern. What is there now is more the requirements when set -f in 
effect, and then quotes from var expansions, not being in the original input, 
would be expected to stay in the result as literals.

On Friday, April 27, 2018 Robert Elz  wrote:

Date: Fri, 27 Apr 2018 11:03:57 +0200
From: Joerg Schilling 
Message-ID: <5ae2e77d.95ubF707FXNl6/H/%joerg.schill...@fokus.fraunhofer.de>

First, a (minor) apology - I should have made it clear that, yes, "set +f" was
intended, and that IFS was not intended to contain any unusual values (no 'a'
'*' "'"' '\' or '?' in it... ) Obviously anything like that would alter the 
results, and that kind of bizarreness is not what I was seeking to
query - and if I was, those pre-conditions would not have been forgotten.

| XCU 2.6.5 explains what happens after parameter expansion, the quoting
| happens as the last action during parameter expansion.

2.6.5 is field splitting, which while it would normally be attempted in the
example I gave, would do nothing - and we could disable it by assuming IFS=''
if wanted - that should change nothing.

But in any case, unless some new text has been added in the resolution of
some bug that I am unaware of (which is most of them...) I see nothing in 2.6.5
which is even remotely similar to what you said. Can you cut/paste the 
relevant words, or quote line numbers, or if there's a change that is not yet
in the published text, the bug number ?

| The text related to double quotes refers only to "spaces" inside the result.

No, it means IFS characters - that is, something that was quoted is not
subject to field splitting - that's usually white space, but doesn't have to
be, but I agree, that's not relevant to anything here (since field splitting is
not going to change anything anyway, we can simply disable it, with IFS='')

| If you like, check:
|
| $shell -c "var='a*\"?\"'; echo \$var"
|
| all shells agree here ;-)

Yes, they probably do in that case. They don't however in the case that
originally caused me to start looking at this.

[Aside: Martijn Dekker's modernish found some problems with NetBSD's
pattern matching - minor and obscure ones - but clearly bugs, and then
when I started testing, I found a few more ... so I created a large set of
tests for everything obscure and weird I could think of  and these
messages are the result of that: before I can "fix" anything I need to
understand what is the correct result, and why.]

The problem case is:

${SHELL} -c 'var="[a-e]\\?.*";printf "%s\n" ${var}'

There are 4 files in $PWD (when the above command is executed)
with names that start with a char in [a-e] followed by a '?' followed
by a '.' followed by two more '?' chars - and lots more irrelevant files).

Almost all shells simply print
[a-e]\?.*
which is the string assigned to "var" (whether the original input has
one or two \ characters makes no difference, and nor should it.)

But bash doesn't: (the -o posix given here makes no difference)

bash -o posix -c 'var="[a-e]\\?.*";printf "%s\n" ${var}'
a?.??
b?.??
c?.??
e?.??

So I started wondering why, and looked at the spec, and could find
nothing to suggest this should not be the result, rather, the text to
me reads as if it should be.

Even though nothing else I have available to test does that.

But it looked right, so I changed (not yet committed, nor are the other
bug fixes I have made to this) the NetBSD sh to produce the same
result as bash:

${SH} -c 'var="[a-e]\\?.*";printf "%s\n" ${var}'
a?.??
b?.??
c?.??
e?.??

(${SH} is the obscure pathname to the uninstalled test build of my
development version of the NetBSD sh - I have it in a var because
it is way too long to type...) whereas the old way:

sh -c 'var="[a-e]\\?.*";printf "%s\n" ${var}'
[a-e]\?.*

the same as everyone else.

Then I started pondering other quote characters, since the quote
characters are still in the string, that is, if the command were

$SHELL -c 'printf "%s\n" [a-e]\?.*'

(here it is important that there just be one '\') all shells agree, that the
result where the 4 file names are printed is correct. For example:

bosh -c 'printf "%s\n" [a-e]\?.*'
a?.??
b?.??
c?.??
e?.??

In your earlier reply you said ...

| The result of a shell macro expansion is quoted internally before quote
| removal is applied.

but I cannot find any text anywhere which mandates that, and what's more,
it is nothing like what 


Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Joerg Schilling
Robert Elz  wrote:

> But it looked right, so I changed (not yet committed, nor are the other
> bug fixes I have made to this) the NetBSD sh to produce the same
> result as bash:
>
>   ${SH} -c 'var="[a-e]\\?.*";printf "%s\n" ${var}'
>   a?.??
>   b?.??
>   c?.??
>   e?.??

This would be a mistake.

> Then I started pondering other quote characters, since the quote
> characters are still in the string, that is, if the command were
>
>   $SHELL -c 'printf "%s\n" [a-e]\?.*'

This is a different example, as you here have a quoted '?' instead of a quoted 
\ as in the first example.


> (here it is important that there just be one '\') all shells agree, that the
> result where the 4 file names are printed is correct.  For example:
>
>   bosh  -c 'printf "%s\n" [a-e]\?.*'
>   a?.??
>   b?.??
>   c?.??
>   e?.??

See above, a different example results in a different behavior.

> In your earlier reply you said ...
>
>   | The result of a shell macro expansion is quoted internally before quote
>   | removal  is applied.
>
> but I cannot find any text anywhere which mandates that, and what's more,
> it is nothing like what really happens:
>
>   bosh -c 'var="???";printf "%s\n" ${var}' | wc -l
>   2297

I am not sure what this should point to.

Jörg

-- 
 EMail:jo...@schily.net(home) Jörg Schilling D-13353 Berlin
joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/'



Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Joerg Schilling
Robert Elz  wrote:

> We could require, than when stored in a variable, we quote
> things in pattern style "quoting" rather than shell style, that is,
> to take the example from my immediately previous message,
>
>   $SHELL -c 'var="[a-e][?].*";printf "%s\n" ${var}'
>
> lists the 4 filenames expected, for all values of $SHELL.

See my recent reply, this does not result in a quoted \.

Jörg

-- 
 EMail:jo...@schily.net(home) Jörg Schilling D-13353 Berlin
joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/'



Re: Laundry list

2018-04-27 Thread Martijn Dekker

[Resending to list. I'd accidentally replied to Geoff only.]

Op 26-04-18 om 16:19 schreef Geoff Clare:

Robert Elz  wrote, on 26 Apr 2018:


   | > B4. Shell may exit if command not found. (XCU 2.8.1, last row in table)
   | >
   | > I was only just made aware by kre that POSIX allows this. It's another
   | > thing that no shell actually does.
   |
   | Are you sure?  If no shell does it, then it seems odd that it was
   | allowed in the resolution of bug 882 which was relatively recent (2015).


Hmm. OK, time for some testing.

For: someshell -c 'foo; echo does not exit'

Historical Bourne:
- Xenix sh (1988): exits!
- Solaris 10.3 sh (2010): does *not* exit.

Current Bourne:
- schily's bosh: does not exit, including in POSIX mode.

AT&T ksh: none of the following exit:
- Solaris 10.3 ksh88 (2010)
- ksh93, 1993
- ksh93, 2012

pdksh: none of the following exit:
- original pdksh 5.2.14
- NetBSD ksh
- OpenBSD ksh
- mksh

bash: does not exit, including in POSIX mode (tested 2.05b [2002] 
through current git).


zsh: does not exit, including in 'emulate sh' mode (tested 4.3.11 
through current 5.5).


yash: does not exit, including in POSIX mode (tested 2.17 through 
current 2.47).


Almquist: none of the following exit:
- NetBSD sh
- FreeBSD sh
- dash
- Busybox ash
- Slackware Linux ash (an older ash, not up to POSIX scratch)

osh (from oil 0.4.0, http://www.oilshell.org/): does not exit. (This is 
a new shell written in Python that (cl)aims to have a bash-compatible 
mode, but a very quick initial test says it parses builtins at the 
parsing stage instead of the execution stage, so invalid arguments to 
builtins cause the program to refuse to even start running.)
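
The test above can be repeated over whatever shells happen to be installed.
This is only a sketch: the list of shell names and the name of the missing
command are assumptions, and shells that are not found on PATH are skipped.

```shell
# Candidate shell names are assumptions; missing ones are simply skipped.
probe='no_such_command_for_this_test; echo does not exit'

for sh in sh bash dash ksh mksh zsh yash; do
    command -v "$sh" >/dev/null 2>&1 || continue
    # stderr is discarded: only the echo after the failed command matters.
    out=$("$sh" -c "$probe" 2>/dev/null)
    if [ "$out" = "does not exit" ]; then
        printf '%s: does not exit\n' "$sh"
    else
        printf '%s: exits\n' "$sh"
    fi
done
```

Each shell is run non-interactively with -c, as in the test command above.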


Anyway, so I had to go back to 1988 to find a shell that exits. I think 
the fact this was fixed in the Solaris Bourne shell is probably evidence 
that this behaviour was considered broken early on.


The original inspiration for POSIX, ksh88, does not exit, and neither 
does anything since -- unless there is still a shell I haven't 
discovered. Do you know of any others?


kre wrote:

> Since it is just a "may" it is very hard to argue against - one needs to
> examine every possible sh and see what it does, and almost no-one
> has the resources to do that.   Nothing is required to exit, so shell
> authors just ignore this.
>
> Script writers (who do have to deal with this)
> are not well represented,

I agree. That's why I'm here.

> and also tend not to read the std anyway,
> they just do what works, and "not exit" is what works.

One good reason to standardise it, IMHO.

Geoff wrote:

> While working on bug 882 we did try to examine the behaviour of as
> many shells as possible for the various cases in the table.  What's
> particularly odd is that there are two "may exit" entries that have
> a superscript 3 to indicate that a future version of this standard may
> require the shell to not exit, but for some reason we decided not to
> add a superscript 3 to the command not found case.  This makes me
> wonder whether someone found a shell that does exit.  (The alternative
> would be that we left it alone because it was already that way in the
> old table - the superscript 3's are on new additions.)


I think maybe someone remembered an ancient historical practice. My 
testing says it's time to consign it to history where it belongs.


Thanks,

- M.



Re: Minutes of the 26th April 2018 Teleconference

2018-04-27 Thread Robert Elz
Date:Fri, 27 Apr 2018 11:54:15 +0100
From:Andrew Josey 
Message-ID:  

This is kind of odd...

  | Attendees:
  | Nick Stoughton, USENIX, ISO/IEC JTC 1/SC 22 OR
  | Joerg Schilling, FOKUS Fraunhofer
[...]

  | We deferred bugs 1084, 1085 and 1100 until Joerg is on the call.

It looks as if he was. 

What causes one of those (many, not just those 3) considerations
of bugs that have been deferred to ever get considered again?
Clearly the stated pre-condition was not enough.

kre



Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Robert Elz
Date:Fri, 27 Apr 2018 10:00:50 +0100
From:Geoff Clare 
Message-ID:  <20180427090050.GA2538@lt2.masqnet>

  | I believe the former text is misleading and should be deleted.  It is
  | effectively duplicating the requirements regarding backslashes stated in
  | 2.2.1 and 2.2.3, but gets the details wrong.

Except that here it is talking about quoting characters in patterns,
where different ones need to be quoted than when parsing.  If we
were to require that only "original" quotes can quote characters
in patterns, this wouldn't matter, but if we do that, I don't think
there is any way that we can (reasonably) store a pattern in a
variable where the pattern is to match a literal magic char (say
an asterisk, or question-mark) - that is, unless in that context
we were to require only "pattern" type quoting to ever be used.

Note "eval" doesn't really help - that removes quotes,  where we
need to add them, and while it is possible to write a pattern in
a form where it can be eval'd and produce the desired result,
that isn't something that I would normally expect almost anyone
to be able to work out how to do correctly (and safely - given that
the entire command needs to be eval'd there's no way to do just
the pattern word in question).
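
For what it is worth, a sketch of the eval route, to show why it is awkward.
The scratch directory and file names are assumptions, and var holds a value
fully under the script's control; with anything else in var this would be
unsafe, since the whole string is parsed as shell code:

```shell
# Scratch directory and file names are made up for this sketch.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
touch 'a?.??' 'b?.??' 'ab.cd'

var='[a-e]\?.*'
# eval re-parses the text, so the \ stored in var is seen as an "original"
# quote character. The whole set command has to be eval'd, not just the word.
eval "set -- $var"
eval_count=$#
printf '%s\n' "$@"

cd / && rm -rf "$dir"
```

Here only the two names containing a literal ? are listed; ab.cd is not.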

We could require that, when stored in a variable, we quote
things in pattern style "quoting" rather than shell style, that is,
to take the example from my immediately previous message,

$SHELL -c 'var="[a-e][?].*";printf "%s\n" ${var}'

lists the 4 filenames expected, for all values of $SHELL.

This means to quote a * ? or [ (and to be safe) \ outside
a bracket expression, one must include it in a (single character)
bracket expression, and in a bracket expression, to quote
! (or ^ if applicable) ] and '-' they need to be written in the
correct magic order so their special properties are lost.
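
A minimal sketch of that pattern-style quoting, assuming a scratch directory
and made-up file names. Each magic character is wrapped in its own
single-character bracket expression, and the variable is expanded unquoted:

```shell
# Scratch directory and file names are assumptions for illustration.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
touch 'a?.??' 'b?.??' 'ab.cd' 'a*x' 'abx'

var='[a-e][?].*'     # [?] matches only a literal question mark
set -- $var
bracket_count=$#     # a?.?? and b?.?? match, ab.cd does not

star='a[*]x'         # [*] matches only a literal asterisk
set -- $star
star_match=$1        # a*x matches, abx does not

printf '%s %s\n' "$bracket_count" "$star_match"
cd / && rm -rf "$dir"
```

This prints 2 followed by a*x.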

But I think if that is to be the solution, we will need to spell
it out very clearly, and at the same time explain why a
pattern in a variable has a whole set of different rules
than a pattern simply written on the command line.

  | > But in a pattern?  Which of these two applies?
  |
  | Depends where the pattern is.  Anywhere double quotes have an effect,
  | the backslash-within-double-quotes rule applies.  Elsewhere the "normal"
  | rule applies.

But the backslash within double quotes only applies the \ to quote the
double quote string magic chars ($ " ` \ and newline) whereas for
patterns what matters is the pattern magic chars (* ? [ etc).

Is that really what is supposed to happen?


  | > 4.  On the question of bug 985 ... (kind of related) - if quote removal is
  | > added to case pattern processing, [...]
  |
  | The danger here is that there are references to quote removal elsewhere
  | that could mean the wrong thing if case patterns are not subject to
  | quote removal.  You actually quoted one of these above from 2.13.1,

You could "fix" that by specifying that the pattern in a case statement
be subject to quote removal after the pattern has been used to match
against the word (the same way that filename patterns are subject
to quote removal after they have been used to match).   That would be
easy to implement, as the expanded pattern is just discarded after it has
failed to match (the original text remains for the next iteration of the
enclosing loop or whatever, if any, but that's unchanged in all cases.)
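
As a reference point, a sketch of the part of case pattern quoting that is
not in dispute: an unquoted expansion in a case pattern is a pattern, a quoted
one is literal, and the quote characters never take part in the match. The two
helper functions are hypothetical, named here only for illustration:

```shell
# Hypothetical helpers: $1 is the word, $2 the pattern text.
matches_pattern() { case $1 in $2) return 0 ;; esac; return 1; }
matches_literal() { case $1 in "$2") return 0 ;; esac; return 1; }

matches_pattern ab 'a?' && echo "unquoted: a? matches ab"
matches_literal ab 'a?' || echo "quoted: a? does not match ab"
matches_literal 'a?' 'a?' && echo "quoted: a? matches a?"
```

All three lines are printed.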

  | When pattern matching is used where shell quote removal is not
  | performed, ...
  |
  | This would apply to case patterns if quote removal is not performed
  | for them.

Yes, it would.   But ...

  | Okay, we could change this condition to something else but
  | can we be sure there aren't other similar side effects?  Are you 
  | willing to search through the standard for every occurrence of the
  | substring "quot"?

Huh?   I'm confused - what other side effects are possible to
changing the wording about how case pattern matching in case
statements is done?

No-one is proposing altering what quote removal means, or how
that is performed.  Just whether it should be done in this particular
case, and what that means.   But yes, I do believe that the whole
of 2.13 needs extensive revision, not just fiddling here and there.

I'll leave your answer to the 2nd half of (or the addendum to) my message
from this morning until you have had time to consider my reply to
Jörg (and Mark), as you (more or less) said the same as Jörg.

kre




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Robert Elz
Date:Fri, 27 Apr 2018 11:03:57 +0200
From:Joerg Schilling 
Message-ID:  <5ae2e77d.95ubF707FXNl6/H/%joerg.schill...@fokus.fraunhofer.de>

First, a (minor) apology - I should have made it clear that, yes, "set +f" was
intended, and that IFS was not intended to contain any unusual values (no 'a'
'*' "'"' '\' or '?' in it... )  Obviously anything like that would alter the 
results, and that kind of bizarreness is not what I was seeking to
query - and if I was, those pre-conditions would not have been forgotten.

  | XCU 2.6.5 explains what happens after parameter expansion, the quoting
  | happens as the last action during parameter expansion.

2.6.5 is field splitting, which while it would normally be attempted in the
example I gave, would do nothing - and we could disable  it by assuming IFS=''
if wanted - that should change nothing.

But in any case, unless some new text has been added in the resolution of
some bug that I am unaware of (which is most of them...) I see nothing in 2.6.5
which is even remotely similar to what you said.   Can you cut/paste the 
relevant words, or quote line numbers, or if there's a change that is not yet
in the published text, the bug number ?

  | The text related to double quotes refers only to "spaces" inside the result.

No, it means IFS characters - that is, something that was quoted is not
subject to field splitting - that's usually white space, but doesn't have to
be, but I agree, that's not relevant to anything here (since field splitting is
not going to change anything anyway, we can simply disable it, with IFS='')
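
A small sketch of what IFS='' does, with a made-up value in var. Field
splitting is switched off, but pathname expansion would still be applied to
the unsplit result (the value here has no pattern characters, so that does not
show):

```shell
var='one two  three'

set -- $var          # default IFS: split at blanks
default_fields=$#

old_ifs=$IFS
IFS=''
set -- $var          # empty IFS: no field splitting at all
empty_ifs_fields=$#
IFS=$old_ifs

printf 'default IFS: %s fields, empty IFS: %s field\n' \
    "$default_fields" "$empty_ifs_fields"
```

This reports 3 fields with the default IFS and 1 with the empty one.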

  | If you like, check:
  |
  | $shell -c "var='a*\"?\"'; echo \$var"
  |
  | all shells agree here ;-)

Yes, they probably do in that case.   They don't however in the case that
originally caused me to start looking at this.

[Aside: Martijn Dekker's modernish found some problems with NetBSD's
pattern matching - minor and obscure ones - but clearly bugs, and then
when I started testing, I found a few more ... so I created a large set of
tests for everything obscure and weird I could think of  and these
messages are the result of that: before I can "fix" anything I need to
understand what is the correct result, and why.]

The problem case is:

${SHELL} -c  'var="[a-e]\\?.*";printf "%s\n" ${var}'

There are 4 files in $PWD (when the above command is executed)
with names that start with a char in [a-e] followed by a '?' followed
by a '.' followed by two more '?' chars (and lots more irrelevant files).

Almost all shells simply print
[a-e]\?.*
which is the string assigned to "var" (whether the original input has
one or two \ characters makes no difference, and nor should it.)

But bash doesn't: (the -o posix given here makes no difference)

bash -o posix -c 'var="[a-e]\\?.*";printf "%s\n" ${var}'
a?.??
b?.??
c?.??
e?.??

So I started wondering why, and looked at the spec, and could find
nothing to suggest this should not be the result, rather, the text to
me reads as if it should be.

Even though nothing else I have available to test does that.

But it looked right, so I changed (not yet committed, nor are the other
bug fixes I have made to this) the NetBSD sh to produce the same
result as bash:

${SH} -c 'var="[a-e]\\?.*";printf "%s\n" ${var}'
a?.??
b?.??
c?.??
e?.??

(${SH} is the obscure pathname to the uninstalled test build of my
development version of the NetBSD sh - I have it in a var because
it is way too long to type...) whereas the old way:

sh -c 'var="[a-e]\\?.*";printf "%s\n" ${var}'
[a-e]\?.*

the same as everyone else.

Then I started pondering other quote characters, since the quote
characters are still in the string, that is, if the command were

$SHELL -c 'printf "%s\n" [a-e]\?.*'

(here it is important that there just be one '\') all shells agree, that the
result where the 4 file names are printed is correct.  For example:

bosh  -c 'printf "%s\n" [a-e]\?.*'
a?.??
b?.??
c?.??
e?.??
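A minimal sketch for reproducing the comparison above in a scratch directory. The variable name SHELL_UNDER_TEST, the use of mktemp, and the extra file name are assumptions made for illustration; they are not from the original message. No expected output is given for the variable case, since that is exactly where shells differ.

```shell
#!/bin/sh
# Sketch only: SHELL_UNDER_TEST is a hypothetical name for the shell being
# examined; it defaults to plain "sh".
SHELL_UNDER_TEST=${SHELL_UNDER_TEST:-sh}
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1

# Four files named: one char from [a-e], a literal '?', a '.', then "??".
for c in a b c e; do
    : > "${c}?.??"
done
# One irrelevant file that neither command should print.
: > unrelated.txt

# Pattern obtained from a variable: results differ between shells.
from_var=$("$SHELL_UNDER_TEST" -c 'var="[a-e]\\?.*";printf "%s\n" ${var}')

# Pattern written directly on the command line: the 4 names are expected.
direct=$("$SHELL_UNDER_TEST" -c 'printf "%s\n" [a-e]\?.*')

printf 'from variable:\n%s\n' "$from_var"
printf 'direct:\n%s\n' "$direct"

cd / && rm -rf "$scratch"
```

Running it once per shell (for example with SHELL_UNDER_TEST=bash, then SHELL_UNDER_TEST=dash) puts the two behaviours side by side.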

In your earlier reply you said ...

  | The result of a shell macro expansion is quoted internally before quote
  | removal  is applied.

but I cannot find any text anywhere which mandates that, and what's more,
it is nothing like what really happens:

bosh -c 'var="???";printf "%s\n" ${var}' | wc -l
2297

(the wc is there just because (as shown) there are way too many 3 character
filenames to include the printf output directly...)
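The same point can be checked on a small scale. This is only a sketch, assuming a scratch directory made with mktemp, three made-up file names, "set +f" and a default IFS: an unquoted expansion of "???" is subject to pathname expansion, while a quoted one is not.

```shell
#!/bin/sh
# Sketch only: the scratch directory and file names are illustrative.
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
: > abc
: > xyz
: > 123

var='???'

# Unquoted: the expansion result is used as a pattern, one field per match.
set -- ${var}
n_unquoted=$#

# Quoted: the expansion result is the literal three-character string.
set -- "${var}"
n_quoted=$#

printf 'unquoted fields: %s\n' "$n_unquoted"
printf 'quoted fields:   %s\n' "$n_quoted"

cd / && rm -rf "$scratch"
```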

If "The result of a shell macro expansion is quoted internally" was happening,
then this example would look like

bosh -c 'printf "%s\n" "???" | wc -l'
   1

(the '1' being the literal string "???" of course).   Instead, what we're getting is:

bosh -c 'printf "%s\n" ??? | wc -l'
2297

which shows that the results of the macro expansion are 

Minutes of the 26th April 2018 Teleconference

2018-04-27 Thread Andrew Josey
All
These are the minutes from this week's teleconference
regards
Andrew


Minutes of the 26th April 2018 Teleconference Austin-865 Page 1 of 1
Submitted by Andrew Josey, The Open Group. 27th April 2018

Attendees:
Nick Stoughton, USENIX, ISO/IEC JTC 1/SC 22 OR
Joerg Schilling, FOKUS Fraunhofer
Don Cragun, IEEE PASC OR
Andreas Grapentin, University Potsdam, HPI
Geoff Clare, The Open Group
Eric Blake, Red Hat
David Clissold, IBM
Martin Rehak, Oracle, The Open Group OR

Apologies:
Richard Hansen, Google
Andrew Josey, The Open Group 
Mark Ziegast, SHware Systems

* General news 

No news

* Outstanding actions

( Please note that this section has been flushed to shorten the minutes -
to locate the previous set of outstanding actions, look to the minutes
from 9 March 2018 and earlier)


* Current Business

Bug 1077: Recommend support for wide-character regcomp and regexec and/or 
specify multi-byte behavior OPEN
http://austingroupbugs.net/bug_view_page.php?bug_id=1077

Andrew reported he has completed his action and will report back
when he has an update from Apple.

We deferred bugs 1084, 1085 and 1100 until Joerg is on the call.



Bug 1102: inet_addr() cannot parse 255.255.255.255. Mark it as obsolete, add 
inet_aton(), or both? Accepted as Marked
http://austingroupbugs.net/view.php?id=1102

This item is tagged for Issue 8.

After applying the changes from 0001101, apply the following changes:

On page 934 line 31730-31731 (getaddrinfo() DESCRIPTION) change:
If the specified address family is AF_INET or AF_UNSPEC, address
strings using Internet standard dot notation as specified in
inet_addr( ) are valid

to:
If the specified address family is AF_INET or AF_UNSPEC, address
strings using Internet standard dot notation as specified in
inet_pton( ) are valid.

On page 1137 lines 38404-38406 (inet_addr(), inet_ntoa() SYNOPSIS),
mark the entire SYNOPSIS obsolescent and on the remainder of the
page remove the other OB shading that was added by 0001101.

On page 1138 after line 38441 but before the new paragraph added
from 0001101, insert a new paragraph:

Applications should prefer inet_pton( ) over inet_addr( ) for
the following reasons:
The return value from inet_addr( ) when converting
255.255.255.255 is indistinguishable from an error.

The inet_pton( ) function supports multiple address families.

The alternative textual representations supported by
inet_addr( ) (but not by inet_pton( )) are often used
maliciously to confuse or mislead users (e.g., for phishing).


On page 1138 line 38445 replace the FUTURE DIRECTIONS section with:
These functions are included only for compatibility with older
implementations and may be removed in a future version.


On page 1139 line 38478 (inet_pton() DESCRIPTION) delete:
(see inet_addr( ))



Bug 1103: Spacing issues in mailx "Command Escapes" after pdftotext:  OPEN
http://austingroupbugs.net/view.php?id=1103

Andrew has an action to check whether the fix in 1103 made it into
the PDF. 
Andrew confirms that the change to the troff was made, and the pdf was
rebuilt with that change in it.

Bug 1104: Missing period. Accepted
http://austingroupbugs.net/view.php?id=1104
This item is tagged for tc3-2008.

Bug 1105: problems with backslashes in awk strings and EREs OPEN
http://austingroupbugs.net/view.php?id=1105

We will pick up on this item next time.

Next Steps
----------
The next call is on May 3rd, 2018 (a Thursday)

Calls are anchored on US time. (8am Pacific) 
This call will be for the regular 90 minutes.

http://austingroupbugs.net

An etherpad is usually up for the meeting, with a URL using the date format as 
below:

https://posix.rhansen.org/p/201x-mm-dd
username=posix password=2115756#


Andrew Josey                    The Open Group
Austin Group Chair
Email: a.jo...@opengroup.org
Apex Plaza, Forbury Road, Reading, Berks. RG1 1AX, England
Tel: +44 118 9023044







Austin Group teleconference +1-888-426-6840 PIN: 2115756

2018-04-27 Thread Single UNIX Specification
Web/Project: Single UNIX Specification
Title: Austin Group teleconference +1-888-426-6840 PIN: 2115756
Date/Time: 03-May-2018 at 11:00 America/New_York
Duration: 1.50 hours
URL: https://collaboration.opengroup.org/platform/single_unix_specification/events.php
Organizer: Single UNIX Specification (do-not-re...@opengroup.org)
Chair: a.jo...@opengroup.org
Status: CONFIRMED

** All calls are anchored on US time **

Topic: Austin Group teleconference

Audio conference information

Call-in toll free number (US/Canada): +1-888-426-6840
Participant PIN: 2115756.

All Austin Group participants are most welcome to join the call.
The call will last for 1.5 hours.
This call is handling defect report processing.

An etherpad is usually up for a meeting, with a URL using the date format as below:

http://posix.rhansen.org/p/201x-mm-dd
username=posix password=2115756#

Additional Call-in numbers:
Country           Caller Paid       Toll-Free
Germany           0-69-2443-2290    0800-000-1018
United Kingdom    0-20-30596451     0800-368-0638
USA               215-861-6239      888-426-6840
Denmark           32711870          80-717000
Czech Republic    2-39016353        800-143-484
Call-in numbers for other countries are available on request

Bug reports are available at:
http://www.austingroupbugs.net


meeting.ics
Description: application/ics


Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Joerg Schilling
Geoff Clare  wrote:

> Robert Elz  wrote, on 27 Apr 2018:
> >
> > Oh, one more thing about patterns - a question this time, though the
> > answer might end up suggesting more text that needs to be in
> > the standard.
> > 
> > If I have
> > 
> > var='a*"?"'
> > 
> > and then I do
> > 
> > echo $var
> > 
> > what should the result be?   Is this absolutely the same as
> > 
> > echo a*"?"
> > 
> > ?
>
> No it's not the same.  The shell expands $var to all filenames that
> start with 'a' and end with double-quote, any character, double-quote.

Which is a result of the way the internal quoting is added to the parameter
expansion result.
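The difference between the two commands can be seen with a small sketch. The scratch directory (made with mktemp) and the three file names are assumptions chosen for illustration: one name ends in double-quote, any character, double-quote, and another ends in a literal question mark.

```shell
#!/bin/sh
# Sketch only: file names are made up to exercise both readings.
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
: > 'a1"x"'    # starts with 'a', ends with double-quote, any char, double-quote
: > 'az?'      # starts with 'a', ends with a literal question mark
: > 'ab'       # matches neither pattern

var='a*"?"'

# Quotes coming out of the expansion are ordinary characters in the pattern.
from_var=$(printf '%s\n' $var)

# Quotes typed in the command quote the '?', so it matches only itself.
direct=$(printf '%s\n' a*"?")

printf 'from variable: %s\n' "$from_var"
printf 'direct:        %s\n' "$direct"

cd / && rm -rf "$scratch"
```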

Jörg

-- 
 EMail:jo...@schily.net(home) Jörg Schilling D-13353 Berlin
joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/'



Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Joerg Schilling
Shware Systems  wrote:

> According to XCU 2.6.5, it's treated literally only when double quoted, e.g. 
> "$var", otherwise quote removal should still occur on the variable's contents 
> after any field splitting...

XCU 2.6.5 explains what happens after parameter expansion, the quoting happens 
as the last action during parameter expansion.

The text related to double quotes refers only to "spaces" inside the result.

If you like, check:

$shell -c "var='a*\"?\"'; echo \$var"

all shells agree here ;-)

Jörg




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Geoff Clare
Robert Elz  wrote, on 27 Apr 2018:
>
> 1.  There is text dealing with backslash processing at 2 separate places in
> 2.13.1.  First at lines 76212-3
> 
>   A <backslash> character shall escape the following character.
>   The escaping <backslash> shall be discarded.
> 
> and then at lines 76232-8 (which is on the following page)
> 
>   When pattern matching is used where shell quote removal is not performed
>   (such as in the argument to the find -name primary when find is being called
>   using one of the exec functions as defined in the System Interfaces volume
>   of POSIX.1-2008, or in the pattern argument to the fnmatch( ) function), special
>   characters can be escaped to remove their special meaning by preceding
>   them with a <backslash> character. This escaping <backslash> is discarded.
>   The sequence "\\" represents one literal <backslash>. All of the requirements
>   and effects of quoting on ordinary, shell special, and special pattern characters
>   shall apply to escaping in this context.
> 
> Given the former, which is simple, and easy to follow, what is the point of 
> the latter?

I believe the former text is misleading and should be deleted.  It is
effectively duplicating the requirements regarding backslashes stated in
2.2.1 and 2.2.3, but gets the details wrong.

> What's more, in the latter, only special characters can be 
> escaped, after which the escaping \ is removed - in that version, what
> happens to a \ that is not followed by a special character ?

Unspecified.

> These two are kind of like backslash quoting in unquoted shell text (where the
> \ escapes anything (ignoring the \newline for this)) and backslash quoting in 
> double quoted strings, where the \ only escapes a specific set of characters,
> and other backslashes are left untouched.
> 
> In parsing and processing words it is no problem, as we know if we're in a
> double quoted string or not.
> 
> But in a pattern?   Which of these two applies?

Depends where the pattern is.  Anywhere double quotes have an effect,
the backslash-within-double-quotes rule applies.  Elsewhere the "normal"
rule applies.
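For reference, the two word-level backslash rules being compared (unquoted text, and text inside double quotes) behave as in the sketch below. This illustrates ordinary shell words only; how the rules carry over into patterns is the question under discussion, and the variable names are just for illustration.

```shell
#!/bin/sh
# Unquoted: the backslash quotes whatever character follows and is removed.
unquoted=$(printf '%s\n' \a)

# Double quoted, ordinary character: 'a' is not one of the characters a
# backslash can escape inside double quotes, so the backslash is kept.
dquoted=$(printf '%s\n' "\a")

# Double quoted, special character: '$' can be escaped, so the backslash
# is removed and a literal dollar sign remains.
dq_special=$(printf '%s\n' "\$")

printf 'unquoted \\a        -> %s\n' "$unquoted"
printf 'double-quoted \\a   -> %s\n' "$dquoted"
printf 'double-quoted \\$   -> %s\n' "$dq_special"
```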

> 2.  Lines 76219-21:
> 
>   If any character (ordinary, shell special, or pattern special) is quoted,
>   that pattern shall match the character itself.
> [that's fine]
>   The shell special characters always require quoting.
> [that's nonsense].

Agreed.  That sentence should be deleted.

> 3.  Lines 76247-9
> 
>   In such patterns, each <asterisk> shall match a string of zero or more characters,
> [fine]
>   matching the greatest possible number of characters that still allows the
>   remainder of the pattern to match the string.
> 
> the "greatest possible" is unnecessary, and in some cases, actually incorrect
> (that's an idea taken from '*' in REs where a specification of this is 
> needed.)
> 
> It is not generally needed, as in general, shell patterns are just match or
> no-match - it is irrelevant exactly what matched where.
> 
>   So given the word (or file name)   abcdxbz
> the pattern
>   a*b*z
> matches, but no-one cares in the slightest whether the 'b' that was
> selected was the one after a or the one before z.   Which * matched the
> null string, and which matched the rest of the characters is irrelevant.
> There is no need to specify "greatest possible number" - the * just
> needs to match any number of characters that allows the remainder
> of the pattern to match.
> 
> The one place where we need more than match/no-match is in the variable
> expansion substring operators (# ## % %%).
> 
> There, assuming var contains the word above, we want (require) ${var#a*b}
> to match such that the 'b' that matches is the one after 'a', and ${var##a*b}
> to match so that the b that matches is the one before 'z'.
> 
> In the single char substring operators we want the '*' to match the smallest,
> not greatest, possible number of chars that allows the remainder of the
> pattern to match.   The only time "greatest" is relevant is for the double char
> substring operators.

All true.  The descriptions of parameter expansions with %, %%, # and ##
cover this, so the "greatest possible number" clause in 2.13.2 should
just be deleted.
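A short sketch of the four substring operators applied to the word from the example, abcdxbz. The suffix pattern b*z and the variable names are additions for illustration; only the prefix pattern a*b comes from the message above.

```shell
#!/bin/sh
var=abcdxbz

# '#' removes the smallest matching prefix: '*' matches nothing, "ab" goes.
shortest_prefix=${var#a*b}

# '##' removes the largest matching prefix: '*' matches "bcdx", "abcdxb" goes.
longest_prefix=${var##a*b}

# '%' removes the smallest matching suffix: "bz" goes.
shortest_suffix=${var%b*z}

# '%%' removes the largest matching suffix: "bcdxbz" goes.
longest_suffix=${var%%b*z}

printf '%s\n' "$shortest_prefix" "$longest_prefix" \
    "$shortest_suffix" "$longest_suffix"
```

Only in these operators does it matter which 'b' the pattern selects; for a plain match or no-match test either choice gives the same answer.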

> 4.  On the question of bug 985 ... (kind of related) - if quote removal is
> added to case pattern processing, it makes that into a different case from all
> of the others.   In filename generation, pattern matching is done before
> quote removal, so the quotes are still there.  In parameter expansion 
> (substring matching) the pattern matching happens before quote removal,
> so the quotes in the pattern are still there.   To be consistent, it would be 
> best to leave the quotes in the pattern in a case statement, so processing of
> it is consistent with all of the others.

The danger here is that there are references to quote removal elsewhere
that could mean the wrong thing if case patterns are not subject to
quote 

Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Shware Systems
According to XCU 2.6.5, it's treated literally only when double quoted, e.g. 
"$var", otherwise quote removal should still occur on the variable's contents 
after any field splitting...

On Friday, April 27, 2018 Joerg Schilling wrote:

Robert Elz  wrote:

> Oh, one more thing about patterns - a question this time, though the
> answer might end up suggesting more text that needs to be in
> the standard.
>
> If I have
>
> var='a*"?"'
>
> and then I do
>
> echo $var
>
> what should the result be? Is this absolutely the same as
>
> echo a*"?"

No, it isn't.

The result of a shell macro expansion is quoted internally before quote removal 
is applied.

For this reason echo $var will print a*"?", while the latter prints a*?

Jörg






Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Joerg Schilling
Robert Elz  wrote:

> Oh, one more thing about patterns - a question this time, though the
> answer might end up suggesting more text that needs to be in
> the standard.
>
> If I have
>
>   var='a*"?"'
>
> and then I do
>
>   echo $var
>
> what should the result be?   Is this absolutely the same as
>
>   echo a*"?"

No, it isn't.

The result of a shell macro expansion is quoted internally before quote removal 
is applied.

For this reason echo $var will print a*"?", while the latter prints a*?

Jörg
