Re: Usage of \[oxdb] (was Re: String Literals, take 2)

Larry Wall Wed, 04 Dec 2002 12:21:59 -0800

On Wed, Dec 04, 2002 at 11:38:35AM -0800, Michael Lazzaro wrote:
: We still need to verify whether we can have, in qq strings:
: 
:    \033      - octal       (p5; deprecated but allowed in p6?)


I think it's disallowed.

:    \o33      - octal       (p5)
:    \x1b      - hex         (p5)
:    \d123     - decimal     (?)
:    \b1001    - binary      (?)

Can't really have \d and \b if they keep their current regex meanings.
I think the general form is:

   \0o33      - octal
   \0x1b      - hex 
   \0d123     - decimal
   \0b1001    - binary

\x and \o are then just shortcuts.
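
To make the "general form" concrete, here is a minimal sketch in Python (not Perl 6) of how the four radix-prefixed escapes could map to characters. The helper name char_from_escape is hypothetical, and the 0o/0x/0d/0b prefixes are simply the ones proposed above, not shipped syntax.

```python
# Hypothetical helper, for illustration only: maps the text after the
# backslash in a proposed radix escape (e.g. "0x1b") to a character.
RADIX = {"o": 8, "x": 16, "d": 10, "b": 2}

def char_from_escape(body: str) -> str:
    # Expect "0", then a radix letter, then at least one digit.
    if len(body) < 3 or body[0] != "0" or body[1] not in RADIX:
        raise ValueError(f"not a radix escape: {body!r}")
    codepoint = int(body[2:], RADIX[body[1]])
    return chr(codepoint)

# Octal 33 and hex 1b are the same codepoint (ESC, decimal 27).
assert char_from_escape("0o33") == char_from_escape("0x1b")
```

Under this reading, \x1b and \o33 would just be the same lookup with the leading 0 dropped.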

: and if so, if these are allowed too:
: 
:    \o{777}   -             (?)
:    \x{1b}    - "wide" hex  (p5)
:    \d{123}   -             (?)
:    \b{1001}  -             (?)

The general form could be

   \0o[33]      - octal
   \0x[1b]      - hex 
   \0d[123]     - decimal
   \0b[1001]    - binary

Or it could be

   \c[0o33]      - octal
   \c[0x1b]      - hex 
   \c[0d123]     - decimal
   \c[0b1001]    - binary

since \c is taking over \N's (rather ill-defined) duties.

: Note that \b conflicts with backspace.  I'd rather keep backspace than 
: binary, personally; I have yet to feel the need to call out a char in 
: binary.  :-)  Or we can make it dependent on the trailing digits, or 
: require the brackets, or require backspace to be spelt differently.

\c[^H], for instance.  We can overload the \c notation to our heart's
desire, as long as we don't conflict with its use for named characters:

    \c[GREEK CAPITAL LETTER OMEGA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI]
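
As a rough illustration of how one bracketed notation could carry all three duties (control characters, radix-prefixed numbers, and named characters), here is a sketch in Python rather than Perl 6. The function decode_c and its dispatch order are assumptions made for the example; only unicodedata.lookup is a real library call.

```python
import unicodedata

# Radix prefixes as proposed in this thread (assumed, not shipped syntax).
RADIX = {"0o": 8, "0x": 16, "0d": 10, "0b": 2}

def decode_c(body: str) -> str:
    # body is the text between the brackets of a proposed \c[...] escape.
    if len(body) == 2 and body[0] == "^":
        # Caret notation: ^H is U+0008 (backspace), ^? is U+007F.
        return chr(ord(body[1].upper()) ^ 0x40)
    if body[:2] in RADIX and len(body) > 2:
        # Numeric form such as 0x1b or 0d123.
        return chr(int(body[2:], RADIX[body[:2]]))
    # Anything else is treated as a Unicode character name.
    return unicodedata.lookup(body)

backspace = decode_c("^H")
escape = decode_c("0x1b")
omega = decode_c(
    "GREEK CAPITAL LETTER OMEGA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI"
)
```

The three forms do not collide here because no character name starts with a caret or with a 0o/0x/0d/0b prefix, which is the "as long as we don't conflict" condition in miniature.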

: But I think we'd definitely like to introduce \d.

Can't, unless we change \d to <digit> in regexen.

: There is also the question of what the bracketed format does.  "Wide" 
: chars, e.g. for Unicode, seem appropriate only in hex.  But it would 
: seem useful to allow a bracketed form for the others that prevents 
: ambiguities:
: 
:    "\o164" ne "\o{16}4"
:    "\d100" ne "\d{10}0"
: 
: Whether that means you can actually specify wide chars in \o, \d, and 
: \b or it's just a disambiguification of the Latin-1 case is open to 
: question.

There ain't no such thing as a "wide" character.  \xff is exactly
the same character as \x[ff].  A character in Perl is an abstract
codepoint number--how it's represented is of no concern to the
programmer (though it might be of concern to any interface to the
outside world, of course).  Do not think of Perl 6 strings as arrays
of bytes (except when they are (and probably not even then...)).
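
The same point can be seen in any language whose strings are sequences of codepoints. The sketch below uses Python, not Perl 6, and the helper encoded_lengths is a hypothetical name introduced only for this example: the short and long escapes give the identical one-character string, and byte counts only appear once an encoding is chosen at the boundary.

```python
# Two spellings of the same codepoint, U+00FF.
short_form = "\xff"
long_form = "\u00ff"

# They are the same abstract character, and the string has length 1.
same_character = short_form == long_form == chr(0xFF)
length_in_characters = len(short_form)

def encoded_lengths(ch: str) -> dict:
    # Representation only matters at the interface to the outside world:
    # the byte count depends on the encoding picked there.
    return {enc: len(ch.encode(enc)) for enc in ("latin-1", "utf-8", "utf-16-le")}

lengths = encoded_lengths(short_form)
```

Here lengths differs per encoding even though the character itself never changed.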

Larry
