Chip Salzenberg wrote:
> Would this be a good time to ask for explanation for C<str> being
> never Unicode, while C<Str> is always Unicode, thus leading to an
> inability to box a non-Unicode string?

That's not quite it. C<str> is a forced Unicode level of "Bytes", with encoding "raw", which happens to not have any Unicode semantics attached to it.

> And might I also ask why in Perl 6 (if not Parrot) there seems to be
> no type support for strings with known encodings which are not subsets
> of Unicode?

There are two different things to consider at the P6 level: Unicode level, and encoding. Level is one of Bytes, CodePoints, Graphemes, or Language Dependent Characters (aka LChars aka Chars). It's the way of determining what a "character" means. This can all get a bit confusing for people who only speak English, since our language happens to map nicely into all the levels at once, with no "merging of multiple code points into a grapheme" monkey business.
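
As a rough illustration of the Unicode levels named above, here is a small sketch in Python (not P6). The function names are made up for this example, and the grapheme count is a deliberate simplification: it only folds combining marks into the preceding code point and does not implement full Unicode grapheme segmentation. Language Dependent Characters are left out entirely.

```python
import unicodedata

def count_bytes(text, encoding="utf-8"):
    # "Bytes" level: the length depends on which encoding is chosen.
    return len(text.encode(encoding))

def count_codepoints(text):
    # "CodePoints" level: Python's str is a sequence of code points.
    return len(text)

def count_graphemes(text):
    # "Graphemes" level, approximated: a combining mark is merged into
    # the code point before it. A leading combining mark counts alone.
    count = 0
    for i, ch in enumerate(text):
        if i == 0 or unicodedata.combining(ch) == 0:
            count += 1
    return count

plain = "cat"           # plain English text: every level agrees
accented = "e\u0301"    # "e" followed by U+0301 COMBINING ACUTE ACCENT

for s in (plain, accented):
    print(count_bytes(s), count_codepoints(s), count_graphemes(s))
```

For "cat" all three counts come out the same, which is the English-speaker's case described above. For the accented "e", the byte, code point, and grapheme counts all differ.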
Encoding is how a particular string gets mapped into bits. I see P6 as needing to support all the common encodings (raw, ASCII, UTF\d+[be|le]?, UCS\d+) "out of the box", but then allowing the user to add more as they see fit (EBCDIC, etc).
Level and Encoding can be mixed and matched independently, except for the combos that don't make any sense.
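
To make the independence of level and encoding concrete, here is another small Python sketch (again not P6; `byte_lengths` is a hypothetical helper). It holds one string fixed and varies only the encoding: the count at the CodePoints level stays put, while the count at the Bytes level moves with the encoding.

```python
def byte_lengths(text, encodings):
    # Map each encoding name to the number of bytes the text occupies in it.
    return {enc: len(text.encode(enc)) for enc in encodings}

text = "na\u00efve"  # "naive" with a diaeresis on the i: five code points

sizes = byte_lengths(text, ["latin-1", "utf-8", "utf-16-le", "utf-32-be"])

print("code points:", len(text))
for enc, size in sizes.items():
    print(enc, "bytes:", size)
```

The encodings used here are just ones Python ships with; which encodings P6 would provide out of the box is a separate question from this sketch.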
-- Rod Adams