Re: [1003.1(2008)/Issue 7 0000249]: Add standard support for $'...' in shell

Robert Elz via austin-group-l at The Open Group Fri, 05 Feb 2021 14:16:39 -0800

    Date:        Fri, 05 Feb 2021 21:54:52 +0100
    From:        Steffen Nurpmeso <stef...@sdaoden.eu>
    Message-ID:  <20210205205452.7tbl2%stef...@sdaoden.eu>



  | Well .. if i recall correctly quoting inside of ${xYz} has been
  | clarified not too long ago

Not the way that you seem to think.

  |  |And last (for now anyway), after "set -- A B C" what's the effect of
  |  |$'pfx\${@}sfx' ?
  |
  | This is interesting.  I would say it is identical to ${*} here.

In that case $'' could not be the only quoting mechanism that users use.
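
For readers following along, the distinction at stake can be sketched
with ordinary double quotes. This only illustrates today's "$@" versus
"$*" behaviour; how $'...' would treat ${@} is the open question and is
not shown. The helper name count is made up for the illustration.

```shell
set -- A B C

# Hypothetical helper: report how many arguments it received.
count() { echo "$#"; }

# "$@" inside double quotes keeps each positional parameter as its own
# field; the prefix joins the first field and the suffix joins the last.
n_at=$(count "pfx${@}sfx")

# "$*" inside double quotes joins all parameters into a single field.
n_star=$(count "pfx${*}sfx")

echo "at=$n_at star=$n_star"    # prints: at=3 star=1
```

If ${@} inside $'...' behaved like ${*}, the three-field result would
have no equivalent there, so double quotes would still be needed.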

  | My MUA just turns it into UTF-8 (via a utf32_to_utf8 function that
  | uses the Unicode replacement character for erroneous codepoints)

The generation of the UTF-8 is not the issue, and the (relatively few)
values that are reserved can be handled.

  | You have to be careful a bit with Unicode.  There are guarantees
  | that must be fulfilled, see for example [1].  Since the shell is
  | producing UTF-8 it should ensure that no invalid UTF-8 sequences
  | are exposed to consumers.

Of course.

But: users are permitted to write $'\xfc\x13' and similar, and no-one
suggests that the shell should validate such sequences for valid UTF-8
encoding, nor would anyone (I hope) claim the shell should object
to $'\u0207\xfc\x13' just because it happens to have a \u in it.
This is all just bits until it gets used somehow, at which point if
it is invalid, then so be it.
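
A minimal sketch of the "just bits" point, assuming only POSIX printf,
od and tr. Since $'...' is not yet standard, octal escapes in printf
stand in for $'\u0207\xfc\x13'; the variable names are illustrative.

```shell
# \310\207 is the UTF-8 encoding of U+0207; \374\023 are the raw bytes
# 0xfc 0x13, which do not form valid UTF-8 after it.
s=$(printf '\310\207\374\023')

# The shell stores and passes the bytes on unchanged, without checking
# whether the sequence as a whole is well-formed UTF-8.
hex=$(printf '%s' "$s" | od -An -tx1 | tr -d ' \n')

echo "$hex"    # prints: c887fc13
```

Whether those four bytes are acceptable is decided only by whatever
later tries to decode them as UTF-8.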

  |   When a process interprets a code unit sequence which purports to
  |   be in a Unicode character encoding form, it shall treat
  |   ill-formed code unit sequences as an error condition and shall
  |   not interpret such sequences as characters.

That has to be a requirement on the application, not upon the programming
language implementation (the shell here) - when the shell is converting
the string, it has no idea how the script will interpret it, nothing
requires that a $'\uxxxx' value ever be used as "characters" (though
that would be a common use).

kre
