Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-05-02 Thread Joerg Schilling
Robert Elz  wrote:

> Date: Fri, 27 Apr 2018 15:06:52 +0200
> From: Joerg Schilling
> Message-ID: <5ae3206c.gzrnd81xboh3e0x7%joerg.schill...@fokus.fraunhofer.de>
>
>   | Since bash seems to be the only shell that works this way,
>
> Until I changed the NetBSD sh (if that change is retained), yes.
>
>   | I would call this a bug.
>
> Then I think it would be also a bug in POSIX (as I think it
> actually specifies this result) and a deficiency - as there
> really needs to be a way to store a pattern in a variable
> such that a pattern-magic character can be treated literally.
>
> I will leave it for Chet to say whether or not he considers this
> to be a bug in bash.

If bash passed the bosh conformance test suite, this would be a suitable
proposal.

Unfortunately it does not, and for this reason there is no way to verify
whether this bash deviation from other implementations is unrelated to its
other deviations.

Jörg

-- 
 EMail:jo...@schily.net(home) Jörg Schilling D-13353 Berlin
joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/



Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-05-02 Thread Joerg Schilling
Robert Elz  wrote:

> Date: Fri, 27 Apr 2018 10:00:50 +0100
> From: Geoff Clare
> Message-ID: <20180427090050.GA2538@lt2.masqnet>
>
> quoting me:
>   | > 4.  On the question of bug 985 ... (kind of related) - if quote removal
>   | > is added to case pattern processing, it makes that into a different case
>   | > from all of the others. [...]
>   |
>   | The danger here is that there are references to quote removal elsewhere
>
> This isn't about any such potential dangers, which I don't think exist, but a
> case where it seems to make a difference.
>
> Consider this, where different shells produce different results:
>
>   $SHELL -c 'LC_ALL=C; case B in ([[:"alpha":]]) printf M;; (*) printf X;; esac'
>
> bash, bosh and pdksh print 'X' (fail to match); everything else I have tested
> (not posh or ksh88 - or a v7 sh) prints 'M' (matches).   That includes mksh,
> ksh93 and all the ash-derived shells I have access to.

Since the POSIXyfied ksh88 prints "X", it seems that this is the result of a
change in ksh93 that may not be POSIX compliant.

Jörg




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-29 Thread Chet Ramey
On 4/27/18 10:02 AM, Robert Elz wrote:
> Date: Fri, 27 Apr 2018 15:06:52 +0200
> From: Joerg Schilling
> Message-ID: <5ae3206c.gzrnd81xboh3e0x7%joerg.schill...@fokus.fraunhofer.de>
> 
>   | Since bash seems to be the only shell that works this way,
> 
> Until I changed the NetBSD sh (if that change is retained), yes.
> 
>   | I would call this a bug.
> 
> Then I think it would be also a bug in POSIX (as I think it
> actually specifies this result) and a deficiency - as there
> really needs to be a way to store a pattern in a variable
> such that a pattern-magic character can be treated literally.
> 
> I will leave it for Chet to say whether or not he considers this
> to be a bug in bash.

I don't. If a shell variable contains a literal backslash, that backslash
should be treated as an escape character by the pattern matching engine.
This is as the standard specifies.


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Robert Elz
Date: Fri, 27 Apr 2018 10:00:50 +0100
From: Geoff Clare
Message-ID: <20180427090050.GA2538@lt2.masqnet>

quoting me:
  | > 4.  On the question of bug 985 ... (kind of related) - if quote removal is
  | > added to case pattern processing, it makes that into a different case
  | > from all of the others. [...]
  |
  | The danger here is that there are references to quote removal elsewhere

This isn't about any such potential dangers, which I don't think exist, but a
case where it seems to make a difference.

Consider this, where different shells produce different results:

$SHELL -c 'LC_ALL=C; case B in ([[:"alpha":]]) printf M;; (*) printf X;; esac'

bash, bosh and pdksh print 'X' (fail to match); everything else I have tested
(not posh or ksh88 - or a v7 sh) prints 'M' (matches).   That includes mksh,
ksh93 and all the ash-derived shells I have access to.

In pdksh the issue is just that char classes don't match at all (not 
implemented) so that one we can ignore.  A true v7 sh would be the same.
(In those the input word 'p]' matches - or variants of that.)

The original test had var=alpha and the pattern was [[:"$var":]] but that
makes no difference at all (after expansion the two cases look the same).
"No difference" means the different shells produce the same results this way
as they do the other way, whether matching or not.
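
The comparison described above can be reproduced with a small script run
under each shell of interest.  This is only a sketch: the variable names
plain, quoted and quoted_var are made up for illustration, and the two quoted
results are printed rather than checked, since they are exactly what differs
between shells.

```shell
# Run this under each shell to be compared.
LC_ALL=C; export LC_ALL
var=alpha

# Unquoted character class: matches in any shell that implements [[:alpha:]].
case B in ([[:alpha:]]) plain=M;; (*) plain=X;; esac

# Double quotes inside the class name: the case under discussion.
case B in ([[:"alpha":]]) quoted=M;; (*) quoted=X;; esac

# The original form of the test, with the class name coming from a variable.
case B in ([[:"$var":]]) quoted_var=M;; (*) quoted_var=X;; esac

printf 'plain=%s quoted=%s quoted_var=%s\n' "$plain" "$quoted" "$quoted_var"
```

Only the first result is expected to be M everywhere (pdksh and a v7 sh
excepted); the other two depend on how the shell treats the quotes.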

If either quote removal is specified to happen before pattern matching (but I 
really think that would break too many other cases) or if the way quoted 
strings are encoded in the shell is not literally as "string" then this matches
(quoted "alpha" is still alpha) (similarly if the pattern match code was
"clever" about quotes in patterns, aside from \ - but it is not, in any shell,
so I think that option is out of consideration).

This works (with either the literal [[:alpha:]] or with [[:$var:]]) when the
double quotes are not present (except in pdksh of course.)

It does not work anywhere, and I would not really expect it to, with the pattern
being [[:$var:]] (no quotes) with var='"alpha"' (though that would not be out of
the question if the "clever" quotes in patterns model was adopted.)   (The
actual test case gets a bit ugly to get the quoting right to allow that to be
input, but that is not the issue.)

kre



Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Robert Elz
Date: Fri, 27 Apr 2018 15:24:30 +0100
From: Geoff Clare
Message-ID: <20180427142430.GB9716@lt2.masqnet>

  | This discussion seems to have come round to the same issue that was
  | raised recently in some comments in bug 1190, specifically Stephane's
  | notes 3960 and 3962 and my reply in note 3963.

Yes, I remembered seeing something like that, somewhere...

  | In summary: the need for a way to store a pattern in a variable such
  | that a pattern-magic character can be treated literally

Yes, that is the need.

  | is a reason to keep the first paragraph of 2.13.1 as-is and say that
  | shells which behave differently than bash here do not conform.

That would be nice.   I was going to say that I expect that Jörg would
not agree - but I see he has already done that.

For now the best that might be possible, given that almost no shells
do this,  would be to make it unspecified whether this works, and
mark it as a future direction that a later rev will require it.

kre




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Robert Elz
Date: Fri, 27 Apr 2018 16:20:01 +0200
From: Joerg Schilling
Message-ID: <5ae33191.adgpivkbwgx8dc1y%joerg.schill...@fokus.fraunhofer.de>

  | But you forgot that after this variable content is expanded, it is quoted
  | in a way to keep the content in the final result.

I didn't forget that, because it doesn't happen.   That's what the

bosh -c 'var="???";printf "%s\n" ${var}'

was meant to show.   The "???" is not kept in the final result,
it is expanded to produce all the 3 character filenames.

  | This however requires the macro 
  | expansion code (parameter expansion) to quote the \ at the end of the macro 
  | expansion to allow the \ to be kept visible after the final quote removal.

It doesn't require anything of the kind.   That \ is not subject to quote
removal, as it was not part of the original word.   Only quotes that were
in the original word get removed.   Sure, quoting it might be one way to
make that work, provided you can do it properly - but that does not
duplicate the original shell.

Remember, as you showed the code earlier, the original Bourne sh
parsed original word quoting by setting the QUOTE bit on the quoted
text.   Results of expansions don't get that.  Then quote removal is
just clearing that bit - it is all simple (and easy to code, and small,
which is why I assume it was done that way - despite all the idiotic
quoting rules it has left us with).

  | If this is not in the POSIX text,

It isn't, and should not be, as it is simply wrong.

The way the NetBSD sh (and original ash) copes with field splitting,
(and quote removal, or could, though that's actually done differently)
is by remembering (and updating as it changes) offsets into the word
to keep track of which chars are originals, and which are the results
of expansions.   The FreeBSD sh (which is based upon ash)
used to be the same, but they rewrote all of that part and now do it
a different way (but certainly not quoting the results of expansions).

kre




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Geoff Clare
Joerg Schilling  wrote, on 27 Apr 2018:
>
> Geoff Clare  wrote:
> 
> > In summary: the need for a way to store a pattern in a variable such
> > that a pattern-magic character can be treated literally is a reason to
> > keep the first paragraph of 2.13.1 as-is and say that shells which
> > behave differently than bash here do not conform.
> 
> I am not convinced since _all_ other shells behave the same and since
> changing this in the shell would result in other misbehavior as well.
>
> Your wish would e.g. result in a misbehaving "case".

The comments in bug 1190 that I referred to (in the part you snipped)
are about "case"!

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Robert Elz
Date: Fri, 27 Apr 2018 09:33:49 -0400
From: Shware Systems
Message-ID: <163074f534e-c83-4...@webjas-vaa062.srv.aolmail.net>

  | For my analysis, 2.6.5 says it is results which are subject to field
  | splitting,

Yes, but irrelevant here

  |  with the parameter expand and direct entry both being one field as the
  | pattern to evaluate according to 2.6.6,

yes.

  | and the treatment of the double quotes follows from 2.13.1

that is how I read the text.   I kind of doubt that is how it is intended to
work, but that is what it looks like to me as well.

  | before removal by 2.6.7

those quotes would not be removed by that, but that should only
matter if the pattern matches no files - otherwise the pattern, and its
quotes, is removed, and the file names produced appear instead.

  | processing. 2.13.1 effectively has the quotes ignored,

That's how I read it.   Of course, all this is based upon the (frankly
bogus) specification that quoting characters in words are retained
as is in the word for later processing.

  | using only the chars in between (the one ?), for matching purposes.

Yes, again, that is how I would read the current text.

  | 2.6.7 does not properly account for that when a pattern has been evaluated,
  | the ignored quotes are required to be removed to reflect the intent of the
  | pattern.

No, that's not what happens.   If the pattern matches any files, the pattern 
vanishes, and the matched file names replace it (as many fields as needed).
Any quote characters produced there (files that contain quote characters
in their names) must be retained (I have plenty of those in my test directory.)

If the pattern does not match, the word will be retained unchanged, and the
quotes will remain in it.   That's actually useful.
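
The no-match case can be seen in isolation with a sketch like the following.
It assumes mktemp -d is available (common, but not required by POSIX); the
names startdir, tmpdir and unmatched are purely illustrative.

```shell
# Work in a fresh, empty directory so that no pattern can match anything.
startdir=$PWD
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1

var='"a*"'          # the value contains two literal double-quote characters
set -- ${var}       # unquoted expansion: filename generation is attempted
unmatched=$1        # nothing matched, so the word is kept as it was

printf '%s\n' "$unmatched"

cd "$startdir" && rm -rf "$tmpdir"
```

With nothing to match, the word comes back unchanged, double quotes included.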

  | What is there now is more the requirements when set -f in effect,

No, it is not that - filename generation still happens, what's missing is
any processing of the quote characters.

  | and then quotes from var expansions, not being in the original input, would
  | be expected to stay in the result as literals.

Yes, agreed - either when filename expansion does not happen, or when
no files are matched.

kre
 
ps: please could you avoid top posting - my messages are long and boring 
enough the first time, no-one needs to get them resent in full as a part of
a reply!



Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Joerg Schilling
Geoff Clare  wrote:

> In summary: the need for a way to store a pattern in a variable such
> that a pattern-magic character can be treated literally is a reason to
> keep the first paragraph of 2.13.1 as-is and say that shells which
> behave differently than bash here do not conform.

I am not convinced since _all_ other shells behave the same and since changing
this in the shell would result in other misbehavior as well.

Your wish would e.g. result in a misbehaving "case".

Jörg




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Robert Elz
Date: Fri, 27 Apr 2018 15:23:10 +0200
From: Joerg Schilling
Message-ID: <5ae3243e.8dyd5s4eftmrpyui%joerg.schill...@fokus.fraunhofer.de>

  | Robert Elz  wrote:
  |
  | > But it looked right, so I changed (not yet committed,
  |
  | This would be a mistake.

Perhaps.

  | > Then I started pondering other quote characters, since the quote
  | > characters are still in the string, that is, if the command were
  | >
  | >   $SHELL -c 'printf "%s\n" [a-e]\?.*'
  |
  | This is a different example, as you here have a quoted '?' instead of a
  | quoted \ as in the first example.

There was never a quoted \ (except in the assignment to var).

  | >   bosh  -c 'printf "%s\n" [a-e]\?.*'
  | >   a?.??
  | >   b?.??
  | >   c?.??
  | >   e?.??
  |
  | See above, a different example results in a different behavior.

Of course, but the original example was

${SHELL} -c 'var="[a-e]\?.*";printf "%s\n" ${var}'
or
${SHELL} -c 'var="[a-e]\\?.*";printf "%s\n" ${var}'

which are identical to each other in effect.   The only difference
from the bosh example above is that this one has the pattern
(the same pattern) in a variable, where the bosh one had it
on the command line.

  | >   bosh -c 'var="???";printf "%s\n" ${var}' | wc -l
  | >   2297
  |
  | I am not sure what this should point to.

It indicates that the results of a variable expansion are not
"internally quoted" which is how you justified the earlier
example not working.   If the ${var} result was somehow
quoted, the ? chars that result would be quoted, and so
would not be matching characters.   But they're not, so
they are.   This is working as it should be, and there is
no "internal quoting" being performed.
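
A small reproduction of that argument, in a scratch directory rather than my
2297-entry test directory.  It is a sketch only: mktemp -d is assumed to be
available, and the file and variable names are invented for the example.

```shell
startdir=$PWD
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1
: > abc; : > xyz; : > toolong     # two 3-character names and one longer one

var="???"

# Unquoted expansion: the ? characters from the variable still act as
# pattern characters, so the 3-character names are produced.
set -- ${var}
matched="$*"

# Quoted in the original word: no filename generation, the value is kept.
set -- "${var}"
literal=$1

printf 'unquoted: %s\nquoted:   %s\n' "$matched" "$literal"
cd "$startdir" && rm -rf "$tmpdir"
```

If expansion results were quoted internally, the first result would be the
same as the second.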

kre




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Geoff Clare
Robert Elz  wrote, on 27 Apr 2018:
>
> Date: Fri, 27 Apr 2018 15:06:52 +0200
> From: Joerg Schilling
> 
>   | Since bash seems to be the only shell that works this way,
> 
> Until I changed the NetBSD sh (if that change is retained), yes.
> 
>   | I would call this a bug.
> 
> Then I think it would be also a bug in POSIX (as I think it
> actually specifies this result) and a deficiency - as there
> really needs to be a way to store a pattern in a variable
> such that a pattern-magic character can be treated literally.

This discussion seems to have come round to the same issue that was
raised recently in some comments in bug 1190, specifically Stephane's
notes 3960 and 3962 and my reply in note 3963.

In summary: the need for a way to store a pattern in a variable such
that a pattern-magic character can be treated literally is a reason to
keep the first paragraph of 2.13.1 as-is and say that shells which
behave differently than bash here do not conform.




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Robert Elz
Date: Fri, 27 Apr 2018 15:17:41 +0200
From: Joerg Schilling
Message-ID: <5ae322f5.uw3u84gim9o+bvrx%joerg.schill...@fokus.fraunhofer.de>

  | See my recent reply, this does not result in a quoted \.

Of course it doesn't - no-one wants (or ever attempted) a quoted \,
we want a quoted '?'

kre




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Joerg Schilling
Robert Elz  wrote:

> The examples with "" characters I expect will simply remain as they
> are in all shells, and the code I have been in the process of writing
> to allow that to "work" (based on the assumption that there is no reason
> why not - and even now, except that it doesn't work that way in other
> shells, I see no good reason to doubt) should just be consigned to the
> scrap heap (that code doesn't even compile yet, so no big loss.)
>
>   | In your example, expand() is told to expand:
>   |
>   |   [a-e]\\?.*
>
> No it isn't.  I said the \\ was irrelevant and I meant it.
>
> In
>   var="[a-e]\\?.*"
>
> which is the command that was used, the first \ is a quoting
> character, and is removed by quote removal (as are the
> enclosing "") just before the assignment to var is performed.
>
> The value assigned to var is
>
>   [a-e]\?.*

But you forgot that after this variable content is expanded, it is quoted in a 
way to keep the content in the final result. This however requires the macro 
expansion code (parameter expansion) to quote the \ at the end of the macro 
expansion to allow the \ to be kept visible after the final quote removal.

If this is not in the POSIX text, this is a bug of the same quality as the 
incorrect Backus-Naur grammar for the shell in the POSIX standard text.

Jörg




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Geoff Clare
Robert Elz  wrote, on 27 Apr 2018:
>
> Date: Fri, 27 Apr 2018 10:00:50 +0100
> From: Geoff Clare
> 
>   | I believe the former text is misleading and should be deleted.  It is
>   | effectively duplicating the requirements regarding backslashes stated in
>   | 2.2.1 and 2.2.3, but gets the details wrong.
> 
> Except that here it is talking about quoting characters in patterns,

Oops, you're right.  For some reason I had it in my head that this
special pattern-matching meaning was covered elsewhere, but now that I
look again I see that this is the place.




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Robert Elz
Date: Fri, 27 Apr 2018 15:06:52 +0200
From: Joerg Schilling
Message-ID: <5ae3206c.gzrnd81xboh3e0x7%joerg.schill...@fokus.fraunhofer.de>

  | Since bash seems to be the only shell that works this way,

Until I changed the NetBSD sh (if that change is retained), yes.

  | I would call this a bug.

Then I think it would be also a bug in POSIX (as I think it
actually specifies this result) and a deficiency - as there
really needs to be a way to store a pattern in a variable
such that a pattern-magic character can be treated literally.

I will leave it for Chet to say whether or not he considers this
to be a bug in bash.

  | I tested Historic Bourne, ksh88, ksh92, dash, yash, mksh posh, zsh, bosh.

I agree, and the FreeBSD and currently released (and all available)
NetBSD shells as well.

  | BTW: with the previous example, the "expand" function is told to expand:
  |
  | a*"?

That's the one where I missed the closing quote (deliberately) - let's
just forget that one for now until we get a real conclusion on what
should happen with pairs of quotes (and more importantly, \ quoting).

The examples with "" characters I expect will simply remain as they
are in all shells, and the code I have been in the process of writing
to allow that to "work" (based on the assumption that there is no reason
why not - and even now, except that it doesn't work that way in other
shells, I see no good reason to doubt) should just be consigned to the
scrap heap (that code doesn't even compile yet, so no big loss.)

  | In your example, expand() is told to expand:
  |
  | [a-e]\\?.*

No it isn't.  I said the \\ was irrelevant and I meant it.

In
var="[a-e]\\?.*"

which is the command that was used, the first \ is a quoting
character, and is removed by quote removal (as are the
enclosing "") just before the assignment to var is performed.

The value assigned to var is

[a-e]\?.*

which is exactly  the same as when the command was

var="[a-e]\?.*"

as there the \ is not a quoting character, as '?' isn't one
of the magic few that \ can quote inside a double quoted
string -- but another \ is.

If I had used
var='[a-e]\\?.*'

that would be different, there neither \ is a quoting char, and
what you said would be expanded would be correct.  But that
is not what was done (as I was using, as I always do when I
can, single quotes around the arg to sh -c - using single quotes
inside that string then gets ugly (bad for examples when the quoting
is not the point) so I avoid that when possible (of course, the
test cases include examples like that - doesn't matter if they're
incomprehensible.)
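
The three assignments can be compared directly.  A minimal sketch (the names
v1, v2 and v3 are only for illustration) that prints what each one actually
stores:

```shell
v1="[a-e]\\?.*"    # inside "", \\ is a quoted backslash: one \ is stored
v2="[a-e]\?.*"     # inside "", \ does not quote ?, so it is kept as is
v3='[a-e]\\?.*'    # inside '', nothing is special: both \ are stored

printf '%s\n' "$v1" "$v2" "$v3"
# prints:
#   [a-e]\?.*
#   [a-e]\?.*
#   [a-e]\\?.*
```

The first two values are identical; only the single quoted form keeps both
backslashes.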

  | But:
  |
  | sh -c  'var="[a-e]?.*";printf "%s\n" ${var}'   
  | a?.??
  |
  | ...I have only one matching file.

This is an entirely different pattern, which matches a whole
different set of files (including the ones that the other pattern
matches - sometimes)

bosh -c  'var="[a-e]?.*";printf "%s\n" ${var}' |wc -l
  84

again, the wc is just because you really don't want to see the
list of odd filenames that match that pattern.

bosh is correct incidentally, all shells produce the same 84
files, but this is a very easy case.

The idea is to match files that contain a letter (one of the 5)
followed by a literal character '?' followed by a literal character '.'
followed by anything at all.   And to store that pattern in a
variable.  The literal '.' is no problem, the question is how
to encode the literal ?.

I showed one way, using pattern magic, in my reply to Geoff,
the question is why not using shell quoting as well.
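
As a sketch of the difference: one portable encoding is to put the ? inside a
bracket expression.  Whether that is the same "pattern magic" as in the reply
referred to above is an assumption here, since that reply is not reproduced in
this message.  mktemp -d is likewise assumed to be available, and the file and
variable names are invented.

```shell
startdir=$PWD
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1
# Four files with a literal ? after the first letter, plus one decoy.
: > 'a?.??'; : > 'b?.??'; : > 'c?.??'; : > 'e?.??'; : > 'ax.yy'

# Pattern typed directly in the command: the \ quotes the ?.
set -- [a-e]\?.*
direct="$*"

# Pattern in a variable, with the ? made literal by a bracket expression.
var='[a-e][?].*'
set -- ${var}
bracket="$*"

# Pattern in a variable with a backslash: the disputed case, so the
# result is only printed, not relied upon.
var='[a-e]\?.*'
set -- ${var}
backslash="$*"

printf 'direct:    %s\nbracket:   %s\nbackslash: %s\n' \
    "$direct" "$bracket" "$backslash"
cd "$startdir" && rm -rf "$tmpdir"
```

The first two lines list the four ? files and omit the decoy.  The third line
is where shells disagree: it is either the same four names or the unexpanded
[a-e]\?.* string.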

Note: that the section in 2.13.1 (which Geoff says is the correct
explanation of quoting in patterns) says:

When pattern matching is used where shell quote removal is not performed
[...]
special characters can be escaped to remove their special meaning by
preceding them with a <backslash> character.

"special characters" there is referring to the '*' '?' and '[' chars, and the
section goes on to allow \\ for matching a literal '\'.

Since 2.6.7 (Quote Removal) says ...

The quote characters ( <backslash>, single-quote, and double-quote) that
were present in the original word shall be removed unless they have
themselves been quoted.

which means that quote removal is not performed on text in a word that came
from the results of an expansion (that's not the original word) and so one 
could read 2.13.1 as saying that \ quoting of special characters is available
in this context, since quote removal is not performed there, (which then makes
it just the same as in literal patterns in the text, though there the \ acts as
a quoting character, and quotes the special characters that way.)

Now I am quite willing to admit (especially given that shells have not 
historically implemented this this way) that this might not be intended,
and that perhaps the spec needs to be changed to make this more
clear - but as it is wr

Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Shware Systems
For my analysis, 2.6.5 says it is results which are subject to field splitting, 
with the parameter expand and direct entry both being one field as the pattern 
to evaluate according to 2.6.6, and the treatment of the double quotes follows 
from 2.13.1 before removal by 2.6.7 processing. 2.13.1 effectively has the 
quotes ignored, using only the chars in between (the one ?), for matching 
purposes. 2.6.7 does not properly account for that when a pattern has been 
evaluated, the ignored quotes are required to be removed to reflect the intent 
of the pattern. What is there now is more the requirements when set -f in 
effect, and then quotes from var expansions, not being in the original input, 
would be expected to stay in the result as literals.

On Friday, April 27, 2018 Robert Elz  wrote:

Date: Fri, 27 Apr 2018 11:03:57 +0200
From: Joerg Schilling 
Message-ID: <5ae2e77d.95ubF707FXNl6/H/%joerg.schill...@fokus.fraunhofer.de>

First, a (minor) apology - I should have made it clear that, yes, "set +f" was
intended, and that IFS was not intended to contain any unusual values (no 'a'
'*' "'"' '\' or '?' in it... ) Obviously anything like that would alter the 
results, and that kind of bizarreness is not what I was seeking to
query - and if I was, those pre-conditions would not have been forgotten.

| XCU 2.6.5 explains what happens after parameter expansion, the quoting
| happens as the last action during parameter expansion.

2.6.5 is field splitting, which while it would normally be attempted in the
example I gave, would do nothing - and we could disable it by assuming IFS=''
if wanted - that should change nothing.

But in any case, unless some new text has been added in the resolution of
some bug that I am unaware of (which is most of them...) I see nothing in 2.6.5
which is even remotely similar to what you said. Can you cut/paste the 
relevant words, or quote line numbers, or if there's a change that is not yet
in the published text, the bug number ?

| The text related to double quotes refers only to "spaces" inside the result.

No, it means IFS characters - that is, something that was quoted is not
subject to field splitting - that's usually white space, but doesn't have to
be, but I agree, that's not relevant to anything here (since field splitting is
not going to change anything anyway, we can simply disable it, with IFS='')

| If you like, check:
|
| $shell -c "var='a*\"?\"'; echo \$var"
|
| all shells agree here ;-)

Yes, they probably do in that case. They don't however in the case that
originally caused me to start looking at this.

[Aside: Martijn Dekker's modernish found some problems with NetBSD's
pattern matching - minor and obscure ones - but clearly bugs, and then
when I started testing, I found a few more ... so I created a large set of
tests for everything obscure and weird I could think of  and these
messages are the result of that: before I can "fix" anything I need to
understand what is the correct result, and why.]

The problem case is:

${SHELL} -c 'var="[a-e]\\?.*";printf "%s\n" ${var}'

There are 4 files in $PWD (when the above command is executed)
with names that start with a char in [a-e] followed by a '?' followed
by a '.' followed by two more '?' chars - and lots more irrelevant files).

Almost all shells simply print
[a-e]\?.*
which is the string assigned to "var" (whether the original input has
one or two \ characters makes no difference, and nor should it.)

But bash doesn't: (the -o posix given here makes no difference)

bash -o posix -c 'var="[a-e]\\?.*";printf "%s\n" ${var}'
a?.??
b?.??
c?.??
e?.??

So I started wondering why, and looked at the spec, and could find
nothing to suggest this should not be the result, rather, the text to
me reads as if it should be.

Even though nothing else I have available to test does that.

But it looked right, so I changed (not yet committed, nor are the other
bug fixes I have made to this) the NetBSD sh to produce the same
result as bash:

${SH} -c 'var="[a-e]\\?.*";printf "%s\n" ${var}'
a?.??
b?.??
c?.??
e?.??

(${SH} is the obscure pathname to the uninstalled test build of my
development version of the NetBSD sh - I have it in a var because
it is way too long to type...) whereas the old way:

sh -c 'var="[a-e]\\?.*";printf "%s\n" ${var}'
[a-e]\?.*

the same as everyone else.

Then I started pondering other quote characters, since the quote
characters are still in the string, that is, if the command were

$SHELL -c 'printf "%s\n" [a-e]\?.*'

(here it is important that there just be one '\') all shells agree, that the
result where the 4 file names are printed is correct. For example:

bosh -c 'printf "%s\n" [a-e]\?.*'
a?.??
b?.??
c?.??
e?.??

In your earlier reply you said ...

| The result of a shell macro expansion is quoted internally before quote
| removal is applied.

but I cannot find any text anywhere which mandates that, and what's more,
it is nothing like what really happens:

bosh -c 'var="???";printf "%s\n" ${var}' | wc

| removal is applied.

but I cannot find any text anywhere which mandates that, and what's more,
it is nothing like what really happens:

bosh -c 'var="???";printf "%s\n" ${var}' | wc

Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Joerg Schilling
Robert Elz  wrote:

> But it looked right, so I changed (not yet committed, nor are the other
> bug fixes I have made to this) the NetBSD sh to produce the same
> result as bash:
>
>   ${SH} -c 'var="[a-e]\\?.*";printf "%s\n" ${var}'
>   a?.??
>   b?.??
>   c?.??
>   e?.??

This would be a mistake.

> Then I started pondering other quote characters, since the quote
> characters are still in the string, that is, if the command were
>
>   $SHELL -c 'printf "%s\n" [a-e]\?.*'

This is a different example, as you here have a quoted '?' instead of a quoted 
\ as in the first example.


> (here it is important that there just be one '\') all shells agree, that the
> result where the 4 file names are printed is correct.  For example:
>
>   bosh  -c 'printf "%s\n" [a-e]\?.*'
>   a?.??
>   b?.??
>   c?.??
>   e?.??

See above, a different example results in a different behavior.

> In your earlier reply you said ...
>
>   | The result of a shell macro expansion is quoted internally before quote
>   | removal  is applied.
>
> but I cannot find any text anywhere which mandates that, and what's more,
> it is nothing like what really happens:
>
>   bosh -c 'var="???";printf "%s\n" ${var}' | wc -l
>   2297

I am not sure what this should point to.

Jörg




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Joerg Schilling
Robert Elz  wrote:

> We could require that, when stored in a variable, we quote
> things in pattern style "quoting" rather than shell style, that is,
> to take the example from my immediately previous message,
>
>   $SHELL -c 'var="[a-e][?].*";printf "%s\n" ${var}'
>
> lists the 4 filenames expected, for all values of $SHELL.

See my recent reply, this does not result in a quoted \.

Jörg




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Joerg Schilling
Robert Elz  wrote:

Hi,

first the easy case:

> [Aside: Martijn Dekker's modernish found some problems with NetBSD's
> pattern matching - minor and obscure ones - but clearly bugs, and then
> when I started testing, I found a few more ... so I created a large set of
> tests for everything obscure and weird I could think of  and these
> messages are the result of that: before I can "fix" anything I need to
> understand what is the correct result, and why.]
>
> The problem case is:
>
>   ${SHELL} -c  'var="[a-e]\\?.*";printf "%s\n" ${var}'
>
> There are 4 files in $PWD (when the above command is executed)
> with names that start with a char in [a-e] followed by a '?' followed
> by a '.' followed by two more '?' chars - and lots more irrelevant files).
>
> Almost all shells simply print
>   [a-e]\?.*
> which is the string assigned to "var" (whether the original input has
> one or two \ characters makes no difference, and nor should it.)
>
> But bash doesn't: (the -o posix given here makes no difference)
>
>   bash -o posix -c 'var="[a-e]\\?.*";printf "%s\n" ${var}'
>   a?.??
>   b?.??
>   c?.??
>   e?.??

Since bash seems to be the only shell that works this way, I would call this a 
bug.

I tested Historic Bourne, ksh88, ksh92, dash, yash, mksh, posh, zsh, bosh.

BTW: with the previous example, the "expand" function is told to expand:

a*"?

In your example, expand() is told to expand:

[a-e]\\?.*

and this must not match the files you mention. The double backslash is the
quoting caused at the end of the macro expansion that I mentioned before.

sh -c  'var="[a-e]\?.*";printf "%s\n" ${var}'
[a-e]\?.*

But:

sh -c  'var="[a-e]?.*";printf "%s\n" ${var}'   
a?.??

...I have only one matching file.

Jörg




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Robert Elz
Date:Fri, 27 Apr 2018 10:00:50 +0100
From:Geoff Clare 
Message-ID:  <20180427090050.GA2538@lt2.masqnet>

  | I believe the former text is misleading and should be deleted.  It is
  | effectively duplicating the requirements regarding backslashes stated in
  | 2.2.1 and 2.2.3, but gets the details wrong.

Except that here it is talking about quoting characters in patterns,
where different ones need to be quoted than when parsing.  If we
were to require that only "original" quotes can quote characters
in patterns, this wouldn't matter, but if we do that, I don't think
there is any way that we can (reasonably) store a pattern in a
variable where the pattern is to match a literal magic char (say
an asterisk, or question-mark) - that is, unless in that context
we were to require only "pattern" type quoting to ever be used.

Note "eval" doesn't really help - that removes quotes,  where we
need to add them, and while it is possible to write a pattern in
a form where it can be eval'd and produce the desired result,
that isn't something that I would normally expect almost anyone
to be able to work out how to do correctly (and safely - given that
the entire command needs to be eval'd there's no way to do just
the pattern word in question).

We could require that, when stored in a variable, we quote
things in pattern style "quoting" rather than shell style, that is,
to take the example from my immediately previous message,

$SHELL -c 'var="[a-e][?].*";printf "%s\n" ${var}'

lists the 4 filenames expected, for all values of $SHELL.

This means to quote a * ? or [ (and to be safe) \ outside
a bracket expression, one must include it in a (single character)
bracket expression, and in a bracket expression, to quote
! (or ^ if applicable) ] and '-' they need to be written in the
correct magic order so their special properties are lost.
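
The bracket-expression style of quoting described above can be tried out in a scratch directory. This is only a sketch: it assumes `mktemp -d` is available (it is not required by POSIX), and the file names and the variable names `scratch`, `var` and `bracket_matches` are made up for illustration.

```shell
# Scratch directory with two names containing a literal '?' and one decoy.
scratch=$(mktemp -d) || exit 1
: > "$scratch/a?.??"
: > "$scratch/b?.??"
: > "$scratch/ax.yy"     # would match if the second character were a wildcard

# The '?' is quoted pattern-style, as a one-character bracket expression.
var='[a-e][?].*'

# Unquoted $var undergoes pathname expansion inside the scratch directory;
# the matches are joined with spaces for display.
bracket_matches=$(cd "$scratch" && set -- $var && printf '%s\n' "$*")
printf '%s\n' "$bracket_matches"

rm -rf "$scratch"
```

Only the two names with a literal '?' in second position should be listed; the decoy is there to show that the bracketed '?' no longer acts as a wildcard.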

But I think if that is to be the solution, we will need to spell
it out very clearly, and at the same time explain why a
pattern in a variable has a whole set of different rules
than a pattern simply written on the command line.

  | > But in a pattern?   Which of these two applies?
  |
  | Depends where the pattern is.  Anywhere double quotes have an effect,
  | the backslash-within-double-quotes rule applies.  Elsewhere the "normal"
  | rule applies.

But the backslash within double quotes only applies the \ to quote the
double quote string magic chars ($ " ` \ and newline) whereas for
patterns what matters is the pattern magic chars (* ? [ etc).

Is that really what is supposed to happen?
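
For reference, the double-quote rule being discussed can be checked directly. The sketch below only exercises the parsing rule for backslash inside double quotes (XCU 2.2.3); it says nothing about how the resulting string is later treated as a pattern. The variable names are illustrative.

```shell
# Inside double quotes a backslash only escapes $ ` " \ and newline.
kept="\?"          # '?' is not special inside double quotes: backslash stays
dropped="\$"       # '$' is special inside double quotes: backslash is removed
one="[a-e]\?.*"    # a single backslash before '?' is kept as-is
two="[a-e]\\?.*"   # "\\" collapses to one backslash, giving the same string
printf '%s\n' "$kept" "$dropped" "$one" "$two"
```

This is why it makes no difference whether the assignment to var in the problem case is written with one backslash or two.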


  | > 4.  On the question of bug 985 ... (kind of related) - if quote removal is
  | > added to case pattern processing, [...]
  |
  | The danger here is that there are references to quote removal elsewhere
  | that could mean the wrong thing if case patterns are not subject to
  | quote removal.  You actually quoted one of these above from 2.13.1,

You could "fix" that by specifying that the pattern in a case statement
be subject to quote removal after the pattern has been used to match
against the word (the same way that filename patterns are subject
to quote removal after they have been used to match).   That would be
easy to implement, as the expanded pattern is just discarded after it has
failed to match (the original text remains for the next iteration of the
enclosing loop or whatever, if any, but that's unchanged in all cases.)

  | When pattern matching is used where shell quote removal is not
  | performed, ...
  |
  | This would apply to case patterns if quote removal is not performed
  | for them.

Yes, it would.   But ...

  | Okay, we could change this condition to something else but
  | can we be sure there aren't other similar side effects?  Are you 
  | willing to search through the standard for every occurrence of the
  | substring "quot"?

Huh?   I'm confused - what other side effects are possible to
changing the wording about how case pattern matching in case
statements is done?

No-one is proposing altering what quote removal means, or how
that is performed.  Just whether it should be done in this particular
case, and what that means.   But yes, I do believe that the whole
of 2.13 needs extensive revision, not just fiddling here and there.

I'll leave your answer to the 2nd half of (or the addendum to) my message
from this morning until you have had time to consider my reply to
Jörg (and Mark), as you (more or less) said the same as Jörg.

kre




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Robert Elz
Date:Fri, 27 Apr 2018 11:03:57 +0200
From:Joerg Schilling 
Message-ID:  <5ae2e77d.95ubF707FXNl6/H/%joerg.schill...@fokus.fraunhofer.de>

First, a (minor) apology - I should have made it clear that, yes, "set +f" was
intended, and that IFS was not intended to contain any unusual values (no 'a'
'*' "'"' '\' or '?' in it... )  Obviously anything like that would alter the 
results, and that kind of bizarreness is not what I was seeking to
query - and if I was, those pre-conditions would not have been forgotten.

  | XCU 2.6.5 explains what happens after parameter expansion, the quoting
  | happens as the last action during parameter expansion.

2.6.5 is field splitting, which while it would normally be attempted in the
example I gave, would do nothing - and we could disable  it by assuming IFS=''
if wanted - that should change nothing.

But in any case, unless some new text has been added in the resolution of
some bug that I am unaware of (which is most of them...) I see nothing in 2.6.5
which is even remotely similar to what you said.   Can you cut/paste the 
relevant words, or quote line numbers, or if there's a change that is not yet
in the published text, the bug number ?

  | The text related to double quotes refers only to "spaces" inside the result.

No, it means IFS characters - that is, something that was quoted is not
subject to field splitting - that's usually white space, but doesn't have to
be, but I agree, that's not relevant to anything here (since field splitting is
not going to change anything anyway, we can simply disable it, with IFS='')
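
That disabling field splitting leaves pathname expansion untouched can be checked with a small sketch like the following. It assumes `mktemp -d` is available; the three file names and the variable names are invented for the test.

```shell
# Scratch directory: two 3-character names and one 4-character name.
scratch=$(mktemp -d) || exit 1
: > "$scratch/one"
: > "$scratch/two"
: > "$scratch/four"

var='???'

# Count the words produced with field splitting disabled (IFS='').
# Unquoted, $var is still subject to pathname expansion; quoted, it is not.
unquoted_count=$(cd "$scratch" && IFS='' && set -- $var && echo "$#")
quoted_count=$(cd "$scratch" && IFS='' && set -- "$var" && echo "$#")
printf 'unquoted: %s, quoted: %s\n' "$unquoted_count" "$quoted_count"

rm -rf "$scratch"
```

The unquoted expansion yields one word per matching file name, while the quoted one yields the single literal word ???.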

  | If you like, check:
  |
  | $shell -c "var='a*\"?\"'; echo \$var"
  |
  | all shells agree here ;-)

Yes, they probably do in that case.   They don't however in the case that
originally caused me to start looking at this.

[Aside: Martijn Dekker's modernish found some problems with NetBSD's
pattern matching - minor and obscure ones - but clearly bugs, and then
when I started testing, I found a few more ... so I created a large set of
tests for everything obscure and weird I could think of  and these
messages are the result of that: before I can "fix" anything I need to
understand what is the correct result, and why.]

The problem case is:

${SHELL} -c  'var="[a-e]\\?.*";printf "%s\n" ${var}'

There are 4 files in $PWD (when the above command is executed)
with names that start with a char in [a-e] followed by a '?' followed
by a '.' followed by two more '?' chars - and lots more irrelevant files).

Almost all shells simply print
[a-e]\?.*
which is the string assigned to "var" (whether the original input has
one or two \ characters makes no difference, and nor should it.)

But bash doesn't: (the -o posix given here makes no difference)

bash -o posix -c 'var="[a-e]\\?.*";printf "%s\n" ${var}'
a?.??
b?.??
c?.??
e?.??

So I started wondering why, and looked at the spec, and could find
nothing to suggest this should not be the result, rather, the text to
me reads as if it should be.

Even though nothing else I have available to test does that.

But it looked right, so I changed (not yet committed, nor are the other
bug fixes I have made to this) the NetBSD sh to produce the same
result as bash:

${SH} -c 'var="[a-e]\\?.*";printf "%s\n" ${var}'
a?.??
b?.??
c?.??
e?.??

(${SH} is the obscure pathname to the uninstalled test build of my
development version of the NetBSD sh - I have it in a var because
it is way too long to type...) whereas the old way:

sh -c 'var="[a-e]\\?.*";printf "%s\n" ${var}'
[a-e]\?.*

the same as everyone else.
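
To see which of the two behaviours a given shell has, a probe along these lines could be used. It is a sketch under assumptions: `mktemp -d` must be available, only two of the four file names are created, and the labels stored in `behaviour` are made up here, not terms from the standard.

```shell
# Scratch directory holding two of the file names from the example.
scratch=$(mktemp -d) || exit 1
: > "$scratch/a?.??"
: > "$scratch/b?.??"

var='[a-e]\?.*'          # same value as var="[a-e]\\?.*" in the example

# Expand $var unquoted inside the scratch directory, one word per line.
result=$(cd "$scratch" && set -- $var && printf '%s\n' "$@")

expected_names='a?.??
b?.??'
case $result in
  "$var")            behaviour=literal-backslash ;;   # pattern printed unchanged
  "$expected_names") behaviour=escaping-backslash ;;  # the bash-style result
  *)                 behaviour=unexpected ;;
esac
printf '%s\n' "$behaviour"

rm -rf "$scratch"
```

Run under each shell of interest, this reports whether a backslash that came out of the expansion was treated as a literal character or as an escape for the following '?'.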

Then I started pondering other quote characters, since the quote
characters are still in the string, that is, if the command were

$SHELL -c 'printf "%s\n" [a-e]\?.*'

(here it is important that there just be one '\') all shells agree, that the
result where the 4 file names are printed is correct.  For example:

bosh  -c 'printf "%s\n" [a-e]\?.*'
a?.??
b?.??
c?.??
e?.??

In your earlier reply you said ...

  | The result of a shell macro expansion is quoted internally before quote
  | removal  is applied.

but I cannot find any text anywhere which mandates that, and what's more,
it is nothing like what really happens:

bosh -c 'var="???";printf "%s\n" ${var}' | wc -l
2297

(the wc is there just because (as shown) there are way too many 3 character
filenames to include the printf output directly...)

If "The result of a shell macro expansion is quoted internally" was happening,
then this example would look like

bosh -c 'printf "%s\n" "???" | wc -l'
   1

(the '1' being the literal string "???" of course).   Instead, what we're 
getting is:

bosh -c 'printf "%s\n" ??? | wc -l'
2297

which shows that the results of the macro expansion are not internally
quoted.   All shells do

Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Joerg Schilling
Geoff Clare  wrote:

> Robert Elz  wrote, on 27 Apr 2018:
> >
> > Oh, one more thing about patterns - a question this time, though the
> > answer might end up suggesting more text that needs to be in
> > the standard.
> > 
> > If I have
> > 
> > var='a*"?"'
> > 
> > and then I do
> > 
> > echo $var
> > 
> > what should the result be?   Is this absolutely the same as
> > 
> > echo a*"?"
> > 
> > ?
>
> No it's not the same.  The shell expands $var to all filenames that
> start with 'a' and end with double-quote, any character, double-quote.

Which is a result of the way, the internal quoting is added to the parameter 
expansion result.
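
The difference between the two forms can be observed with a sketch like this one, which assumes `mktemp -d` is available and uses invented file and variable names.

```shell
scratch=$(mktemp -d) || exit 1
: > "$scratch/ab\"c\""   # ends with: double-quote, any character, double-quote
: > "$scratch/ab?"       # ends with a literal '?'
: > "$scratch/abc"       # decoy, should match neither form

var='a*"?"'

# Pattern taken from the variable: the double quotes are ordinary characters.
from_variable=$(cd "$scratch" && set -- $var && printf '%s\n' "$*")

# Pattern typed directly: the double quotes make the '?' literal.
typed_directly=$(cd "$scratch" && set -- a*"?" && printf '%s\n' "$*")

printf 'from variable:  %s\n' "$from_variable"
printf 'typed directly: %s\n' "$typed_directly"

rm -rf "$scratch"
```

Each form should pick up exactly one of the three files, and not the same one.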

Jörg




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Joerg Schilling
Shware Systems  wrote:

> According to XCU 2.6.5, it's treated literally only when double quoted, e.g. 
> "$var", otherwise quote removal should still occur on the variable's contents 
> after any field splitting...

XCU 2.6.5 explains what happens after parameter expansion, the quoting happens 
as the last action during parameter expansion.

The text related to double quotes refers only to "spaces" inside the result.

If you like, check:

$shell -c "var='a*\"?\"'; echo \$var"

all shells agree here ;-)

Jörg




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Geoff Clare
Robert Elz  wrote, on 27 Apr 2018:
>
> 1.  There is text dealing with backslash processing at 2 separate places in
> 2.13.1.  First at lines 76212-3
> 
>   A <backslash> character shall escape the following character.
>   The escaping <backslash> shall be discarded.
> 
> and then at lines 76232-8 (which is on the following page)
> 
>   When pattern matching is used where shell quote removal is not performed
>   (such as in the argument to the find -name primary when find is being
>   called using one of the exec functions as defined in the System
>   Interfaces volume of POSIX.1-2008, or in the pattern argument to the
>   fnmatch( ) function), special characters can be escaped to remove their
>   special meaning by preceding them with a <backslash> character. This
>   escaping <backslash> is discarded. The sequence "\\" represents one
>   literal <backslash>. All of the requirements and effects of quoting on
>   ordinary, shell special, and special pattern characters shall apply to
>   escaping in this context.
> 
> Given the former, which is simple, and easy to follow, what is the point of 
> the latter?

I believe the former text is misleading and should be deleted.  It is
effectively duplicating the requirements regarding backslashes stated in
2.2.1 and 2.2.3, but gets the details wrong.

> What's more, in the latter, only special characters can be 
> escaped, after which the escaping \ is removed - in that version, what
> happens to a \ that is not followed by a special character ?

Unspecified.

> These two are kind of like backslash quoting in unquoted shell text (where the
> \ escapes anything (ignoring the \newline for this)) and backslash quoting in 
> double quoted strings, where the \ only escapes a specific set of characters,
> and other backslashes are left untouched.
> 
> In parsing and processing words it is no problem, as we know if we're in a
> double quoted string or not.
> 
> But in a pattern?   Which of these two applies?

Depends where the pattern is.  Anywhere double quotes have an effect,
the backslash-within-double-quotes rule applies.  Elsewhere the "normal"
rule applies.

> 2.  Lines 76219-21:
> 
>   If any character (ordinary, shell special, or pattern special) is quoted,
>   that pattern shall match the character itself.
> [that's fine]
>   The shell special characters always require quoting.
> [that's nonsense].

Agreed.  That sentence should be deleted.

> 3.  Lines 76247-9
> 
>   In such patterns, each <asterisk> shall match a string of zero or more
>   characters,
> [fine]
>   matching the greatest possible number of characters that still allows the
>   remainder of the pattern to match the string.
> 
> the "greatest possible" is unnecessary, and in some cases, actually incorrect
> (that's an idea taken from '*' in REs where a specification of this is
> needed.)
> 
> It is not generally needed, as in general, shell patterns are just match or
> no-match - it is irrelevant exactly what matched where.
> 
>   So given the word (or file name)   abcdxbz
> the pattern
>   a*b*z
> matches, but no-one cares in the slightest whether the 'b' that was
> selected was the one after a or the one before z.   Which * matched the
> null string, and which matched the rest of the characters is irrelevant.
> There is no need to specify "greatest possible number" - the * just
> needs to match any number of characters that allows the remainder
> of the pattern to match.
> 
> The one place where we need more than match/no-match is in the variable
> expansion substring operators (# ## % %%).
> 
> There, assuming var contains the word above, we want (require) ${var#a*b}
> to match such that the 'b' that matches is the one after 'a', and ${var##a*b}
> to match so that the b that matches is the one before 'z'.
> 
> In the single char substring operators we want the '*' to match the smallest,
> not greatest, possible number of chars that allows the remainder of the
> pattern to match.   The only time "greatest" is relevant is for the double
> char substring operators.

All true.  The descriptions of parameter expansions with %, %%, # and ##
cover this, so the "greatest possible number" clause in 2.13.2 should
just be deleted.
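
The four substring operators can be compared on the example word from the quoted text. This is a small sketch; the variable names on the left are illustrative only.

```shell
var=abcdxbz

# '#' and '%' remove the shortest match, '##' and '%%' the longest.
shortest_prefix=${var#a*b}     # '*' matches nothing: removes "ab"
longest_prefix=${var##a*b}     # '*' matches "bcdx":  removes "abcdxb"
shortest_suffix=${var%b*z}     # '*' matches nothing: removes "bz"
longest_suffix=${var%%b*z}     # '*' matches "cdxb":  removes "bcdxbz"

printf '%s\n' "$shortest_prefix" "$longest_prefix" \
              "$shortest_suffix" "$longest_suffix"
```

In the single-character forms the '*' takes the smallest possible number of characters, which is the opposite of what the "greatest possible number" clause says.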

> 4.  On the question of bug 985 ... (kind of related) - if quote removal is
> added to case pattern processing, it makes that into a different case from all
> of the others.   In filename generation, pattern matching is done before
> quote removal, so the quotes are still there.  In parameter expansion 
> (substring matching) the pattern matching happens before quote removal,
> so the quotes in the pattern are still there.   To be consistent, it would be 
> best to leave the quotes in the pattern in a case statement, so processing of
> it is consistent with all of the others.

The danger here is that there are references to quote removal elsewhere
that could mean the wrong thing if case patterns are not subject to
quote removal.  You actually qu

Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Shware Systems
According to XCU 2.6.5, it's treated literally only when double quoted, e.g. 
"$var", otherwise quote removal should still occur on the variable's contents 
after any field splitting...

On Friday, April 27, 2018 Joerg Schilling  
wrote:

Robert Elz  wrote:

> Oh, one more thing about patterns - a question this time, though the
> answer might end up suggesting more text that needs to be in
> the standard.
>
> If I have
>
> var='a*"?"'
>
> and then I do
>
> echo $var
>
> what should the result be? Is this absolutely the same as
>
> echo a*"?"

No, it isn't.

The result of a shell macro expansion is quoted internally before quote removal 
is applied.

For this reason echo $var will print a*"?", while the latter prints a*?

Jörg






Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-27 Thread Joerg Schilling
Robert Elz  wrote:

> Oh, one more thing about patterns - a question this time, though the
> answer might end up suggesting more text that needs to be in
> the standard.
>
> If I have
>
>   var='a*"?"'
>
> and then I do
>
>   echo $var
>
> what should the result be?   Is this absolutely the same as
>
>   echo a*"?"

No, it isn't.

The result of a shell macro expansion is quoted internally before quote removal 
is applied.

For this reason echo $var will print a*"?", while the latter prints a*?
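
That statement holds when no file name matches either pattern, which can be checked in an empty scratch directory. The sketch assumes `mktemp -d` is available; the variable names are made up.

```shell
# Empty scratch directory, so neither pattern can match anything.
scratch=$(mktemp -d) || exit 1

var='a*"?"'

# Quotes that came from the expansion are not removed by quote removal.
unmatched_from_variable=$(cd "$scratch" && printf '%s\n' $var)

# Quotes typed in the command itself are removed by quote removal.
unmatched_typed_directly=$(cd "$scratch" && printf '%s\n' a*"?")

printf '%s\n' "$unmatched_from_variable" "$unmatched_typed_directly"

rm -rf "$scratch"
```

With matching files present, the two forms would instead expand to different sets of names.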

Jörg




Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-26 Thread Shware Systems
Assuming set +f in effect, the first 2 should expand identically, how I read 
XCU 2.6.5 and 2.6.6; treating the * as a glob special character and the ? as a 
literal. For the 3rd case the standard is silent on whether the closing " is 
assumed on reaching the end of the field established during token recognition, 
by the  after the 'r' in '$var', or is a syntax error when the glob is 
evaluated. The text assumes, in XCU 2.13 by use of 'quoting' generically, if 
single quotes or double quotes are used to begin a literal pattern sequence the 
application will ensure the closing quote is always present. I agree a 
statement should be added to XCU 2.13.1, Line 76221, about what is the required 
interpretation. At present it only says that a trailing '\' is undefined behavior.


In a message dated 4/26/2018 8:24:33 PM Eastern Standard Time, 
k...@munnari.oz.au writes:

 
Oh, one more thing about patterns - a question this time, though the

answer might end up suggesting more text that needs to be in
the standard.

If I have

 var='a*"?"'

and then I do

 echo $var

what should the result be? Is this absolutely the same as

 echo a*"?"

?

And if so, what would happen if instead I had

 var='a*"?'

(and used it the same way?)

kre



Re: More questions/comments on XCU 2.13 (sh Pattern Matching)

2018-04-26 Thread Robert Elz
Oh, one more thing about patterns - a question this time, though the
answer might end up suggesting more text that needs to be in
the standard.

If I have

var='a*"?"'

and then I do

echo $var

what should the result be?   Is this absolutely the same as

echo a*"?"

?

And if so, what would happen if instead I had

var='a*"?'

(and used it the same way?)

kre