On Fri, Mar 25, 2005 at 07:38:10PM -0000, Chip Salzenberg wrote:
: Would this be a good time to ask for explanation for C<str> being
: never Unicode, while C<Str> is always Unicode, thus leading to an
: inability to box a non-Unicode string?

As Rod said, "str" is just a way of declaring a byte buffer, for which
"characters", "graphemes", "codepoints", and "bytes" all mean the
same thing.  Conversion or coercion to more abstract types must be
specified explicitly.
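
A rough analogy, sketched in Python rather than Perl 6 (Python's bytes type is only assumed here to behave like the "str" described above; the variable names are illustrative): a byte buffer has a length in bytes and nothing more, and it becomes text only when an encoding is named explicitly.

```python
# Analogy only: Python's bytes stands in for a raw byte buffer.
raw = "caf\u00e9".encode("latin-1")    # four bytes: b'caf\xe9'
byte_count = len(raw)                  # for a byte buffer, length is just bytes

# Conversion to an abstract string must name the encoding.
text = raw.decode("latin-1")

# Guessing the wrong encoding fails instead of silently converting.
try:
    raw.decode("utf-8")
    wrong_guess_failed = False
except UnicodeDecodeError:
    wrong_guess_failed = True

# Mixing the buffer with an abstract string is refused outright.
try:
    raw + "!"
    implicit_mix_failed = False
except TypeError:
    implicit_mix_failed = True
```

The same four bytes could mean something else under another encoding, which is why nothing is assumed until the conversion is requested.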

: And might I also ask why in Perl 6 (if not Parrot) there seems to be
: no type support for strings with known encodings which are not subsets
: of Unicode?

Well, because the main point of Unicode is that there *are* no encodings
that cannot be considered subsets of Unicode.  Perl 6 considers
itself to have abstract Unicode semantics regardless of the underlying
representation of the data, which could be Latin-1 or Big5 or UTF-76.

That being said, abstract Unicode itself has varying levels of
abstraction, which is how we end up with .codes, .graphs, and .chars
in addition to .bytes.
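
To make the levels concrete, here is a minimal sketch in Python (not Perl 6). The function name and dictionary keys are hypothetical, and the grapheme count is only an approximation that skips combining marks; real grapheme segmentation follows more rules than this.

```python
import unicodedata

def level_counts(s, encoding="utf-8"):
    """Count one string at three levels of abstraction (illustrative only)."""
    nbytes = len(s.encode(encoding))   # depends on the chosen representation
    ncodes = len(s)                    # Unicode codepoints
    # Approximation: treat each non-combining codepoint as starting a grapheme.
    ngraphs = sum(1 for ch in s if unicodedata.combining(ch) == 0)
    return {"bytes": nbytes, "codes": ncodes, "graphs": ngraphs}

decomposed = "e\u0301"                                # 'e' + combining acute accent
composed = unicodedata.normalize("NFC", decomposed)   # single codepoint U+00E9

decomposed_counts = level_counts(decomposed)
composed_counts = level_counts(composed)
latin1_counts = level_counts(composed, encoding="latin-1")
```

Under these assumptions the grapheme count stays at one in every case, while the codepoint count depends on normalization and the byte count depends on the encoding.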

Larry
