    Date:        Fri, 27 Apr 2018 15:06:52 +0200
    From:        Joerg Schilling <joerg.schill...@fokus.fraunhofer.de>
    Message-ID:  <5ae3206c.gzrnd81xboh3e0x7%joerg.schill...@fokus.fraunhofer.de>

  | Since bash seems to be the only shell that works this way,

Until I changed the NetBSD sh (if that change is retained), yes.

  | I would call this a bug.

Then I think it would also be a bug in POSIX (as I think it
actually specifies this result) and a deficiency - as there
really needs to be a way to store a pattern in a variable
such that a pattern-magic character can be treated literally.

I will leave it for Chet to say whether or not he considers this
to be a bug in bash.

  | I tested Historic Bourne, ksh88, ksh92, dash, yash, mksh posh, zsh, bosh.

I agree, and the FreeBSD and currently released (and all available)
NetBSD shells as well.

  | BTW: with the previous example, the "expand" function is told to expand:
  |
  |     a*"?

That's the one where I missed the closing quote (deliberately) - let's
just forget that one for now until we get a real conclusion on what
should happen with pairs of quotes (and more importantly, \ quoting).

The examples with "" characters I expect will simply remain as they
are in all shells, and the code I have been in the process of writing
to allow that to "work" (based on the assumption that there is no reason
why not - and even now, except that it doesn't work that way in other
shells, I see no good reason to doubt) should just be consigned to the
scrap heap (that code doesn't even compile yet, so no big loss.)

  | In your example, expand() is told to expand:
  |
  |     [a-e]\\?.*

No it isn't.  I said the \\ was irrelevant and I meant it.

In
        var="[a-e]\\?.*"

which is the command that was used, the first \ is a quoting
character, and is removed by quote removal (as are the
enclosing "") just before the assignment to var is performed.

The value assigned to var is

        [a-e]\?.*

which is exactly  the same as when the command was

        var="[a-e]\?.*"

as there the \ is not a quoting character, as '?' isn't one
of the magic few that \ can quote inside a double quoted
string -- but another \ is.

If I had used
        var='[a-e]\\?.*'

that would be different: there neither \ is a quoting char, and
what you said would be expanded would be correct.  But that
is not what was done, as I was using (as I always do when I
can) single quotes around the arg to sh -c.  Using single quotes
inside that string then gets ugly (bad for examples when the quoting
is not the point), so I avoid that when possible.  (Of course, the
test cases include examples like that - doesn't matter if they're
incomprehensible.)
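
To make the three assignments easy to compare, here is a minimal sketch
(the names v1, v2 and v3 are just for illustration, they are not from the
original test).  It relies only on the double-quote rule that \ quotes
just $, `, ", \ and newline, so the first two values should come out
identical and only the single-quoted one keeps both backslashes.

```sh
# Double quotes: \\ collapses to a single \, while \? is left alone,
# because ? is not one of the characters \ can quote inside "...".
v1="[a-e]\\?.*"
v2="[a-e]\?.*"

# Single quotes: nothing is special, so both backslashes survive.
v3='[a-e]\\?.*'

# Print the stored values; %s does not interpret backslashes in its argument.
printf '%s\n' "$v1" "$v2" "$v3"
```

The first two lines printed are [a-e]\?.* and the third is [a-e]\\?.*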

  | But:
  |
  | sh -c  'var="[a-e]?.*";printf "%s\n" ${var}'                   
  | a?.??
  |
  | ...I have only one matching file.

This is an entirely different pattern, which matches a whole
different set of files (including the ones that the other pattern
matches - sometimes)

        bosh -c  'var="[a-e]?.*";printf "%s\n" ${var}' |wc -l
              84

again, the wc is just because you really don't want to see the
list of odd filenames that match that pattern.

bosh is correct incidentally, all shells produce the same 84
files, but this is a very easy case.

The idea is to match files that contain a letter (one of the 5)
followed by a literal character '?' followed by a literal character '.'
followed by anything at all.   And to store that pattern in a
variable.  The literal '.' is no problem, the question is how
to encode the literal ?.

I showed one way, using pattern magic, in my reply to Geoff,
the question is why not using shell quoting as well.
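
As a rough illustration of the difference between the two patterns, the
sketch below builds a scratch directory with a few made-up file names
(they are assumptions for the example, not the files from my test
directory, and mktemp -d is assumed to exist) and expands both the plain
? pattern and the bracket form from a variable.

```sh
# Scratch directory so the globs only see the files created here.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
: > 'a?.txt'; : > 'ab.txt'; : > 'c?.log'; : > 'z?.txt'

# Unquoted ? stored in the variable: matches any single character.
any='[a-e]?.*'
set -- $any
n_any=$#

# Bracketed ?: matches only a literal question mark.
lit='[a-e][?].*'
set -- $lit
n_lit=$#

printf 'any: %s files, literal: %s files\n' "$n_any" "$n_lit"
printf '%s\n' "$@"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

With those four files the first pattern picks up ab.txt as well, while
the bracket form is limited to a?.txt and c?.log.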

Note that section 2.13.1 (which Geoff says is the correct
explanation of quoting in patterns) says:

        When pattern matching is used where shell quote removal is not performed
[...]
        special characters can be escaped to remove their special meaning by
        preceding them with a <backslash> character.

"special characters" there is referring to the '*' '?' and '[' chars, and the
section goes on to allow \\ for matching a literal '\'.

Since 2.6.7 (Quote Removal) says ...

        The quote characters (<backslash>, single-quote, and double-quote) that
        were present in the original word shall be removed unless they have
        themselves been quoted.

which means that quote removal is not performed on text in a word that came
from the results of an expansion (that's not the original word) and so one 
could read 2.13.1 as saying that \ quoting of special characters is available
in this context, since quote removal is not performed there, (which then makes
it just the same as in literal patterns in the text, though there the \ acts
as a quoting character, and quotes the special characters that way.)
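
The point that quote removal applies only to the original word can be
seen with a small sketch using double quotes instead of \ (the variable
name q and the count variables are just illustrative).  Quotes written
in the script are removed and keep "a b" together; quotes that arrive
via an expansion are ordinary data, so the value is field split and the
quote characters stay.

```sh
# Quotes in the original word: removed, and they protect the space.
set -- "a b"
literal_count=$#

# Quotes that come from an expansion: not removed, not protecting anything.
q='"a b"'
set -- $q
expanded_count=$#

printf 'literal: %s field(s), expanded: %s field(s)\n' \
    "$literal_count" "$expanded_count"
printf '<%s>\n' "$@"
```

The expanded case gives two fields, <"a> and <b">, with the quote
characters still in them.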

Now I am quite willing to admit (especially given that shells have not
historically implemented it this way) that this might not be intended,
and that perhaps the spec needs to be changed to make this more
clear - but as it is written now, I think it quite reasonable to assume
that \ quoting should work (as it does in bash, and in my newest NetBSD
sh - for now) in var expansions, and the [?] trick should not be the
only way to achieve this.
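
For anyone who wants to try this on their own shell, here is a minimal
sketch of the disputed case (the two file names are made up, and
mktemp -d is assumed to be available).  The result depends on the
shell: one that treats the \ from the expansion as quoting the ? should
print a?.txt, and one that treats it as a literal character should find
no match and print the pattern unchanged.

```sh
# Scratch directory with one file containing a literal ? and one without.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
: > 'a?.txt'; : > 'ab.txt'

# The pattern, with a backslash stored in the variable's value.
var='[a-e]\?.*'

# Unquoted expansion, so the value is subject to pathname expansion.
set -- $var
printf '%s\n' "$@"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

Either way ab.txt should not appear, since neither reading of the
pattern lets the ? act as a wildcard.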

The real question is, why should text in a variable (after it has been
expanded) be treated differently than text written in the script directly?

Historical accident is one answer I suppose, though not a very
appealing one - especially if it makes life harder for script writers.

kre

