It's already very much the way you want it to be.  Most Str objects
do not in fact have any byte semantics.  If you say "foo".bytes, that
is shorthand for "foo".bytes(:nf<c>, :enc<UTF-8>).  In other words,
you have to tell it which normalization and encoding the bytes are
to be counted under.  It just assumes NFC and UTF-8 as convenient
defaults.  Likewise a Str does
not have any codepoint semantics unless you tell it the normalization
to assume.  Most strings are sequences of abstract graphemes; see also
http://www.nntp.perl.org/group/perl.perl6.language/2008/01/msg28281.html
as well as the recent definitions of .bytes, .codes, .graphs, and .chars
in http://svn.pugscode.org/pugs/docs/Perl6/Spec/Functions.pod .
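
To make that concrete, here is a rough sketch in Python rather than
Perl 6, only because it runs today.  The helper names byte_length and
code_length are made up for illustration and are not the spec'd .bytes
and .codes methods; the point is just that neither count means anything
until a normalization (and, for bytes, an encoding) has been chosen.

```python
import unicodedata


def byte_length(text, nf="NFC", enc="utf-8"):
    """Count bytes after fixing both a normalization form and an encoding."""
    normalized = unicodedata.normalize(nf, text)
    return len(normalized.encode(enc))


def code_length(text, nf="NFC"):
    """Count codepoints after fixing a normalization form; no encoding needed."""
    return len(unicodedata.normalize(nf, text))


word = "caf\u00e9"  # "cafe" with an acute accent on the final letter

# The same abstract string yields different counts under different assumptions.
print(byte_length(word))                      # NFC, UTF-8
print(byte_length(word, nf="NFD"))            # NFD, UTF-8
print(byte_length(word, enc="utf-16-le"))     # NFC, UTF-16 (little endian, no BOM)
print(code_length(word), code_length(word, nf="NFD"))
```

The defaults here (NFC and UTF-8) mirror the :nf<c>, :enc<UTF-8>
shorthand above, but that correspondence is my assumption about the
sketch, not a statement about any implementation.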

We do still talk about the possibility of multi-level strings,
but that's basically the same as your object that presents both
Str and Buf interfaces.  That's an exception rather than the rule,
and certainly, as you say, it would need to be well typed as to its
encoding and normalization.  The same considerations apply
between grapheme and codepoint views of the same string, except
there only the normalization is needed, since codepoints are
above the encoding abstraction level.
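
A minimal sketch of such a multi-level object, again in Python and
purely hypothetical: the class name MultiLevelString and its methods
are invented for this example.  The grapheme view is a crude
approximation that simply attaches combining marks to the preceding
base character; it is not the full Unicode grapheme cluster algorithm.
It does show which view depends on which declaration: graphs on
neither, codes on the normalization only, and the byte view on both.

```python
import unicodedata
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MultiLevelString:
    """Hypothetical string that carries its normalization and encoding."""
    text: str
    nf: str = "NFC"      # normalization assumed for the codepoint and byte views
    enc: str = "utf-8"   # encoding assumed for the byte view only

    def graphs(self) -> List[str]:
        # Approximate graphemes: decompose, then glue combining marks
        # onto the preceding base character.
        clusters: List[str] = []
        for ch in unicodedata.normalize("NFD", self.text):
            if unicodedata.combining(ch) and clusters:
                clusters[-1] += ch
            else:
                clusters.append(ch)
        return clusters

    def codes(self) -> List[int]:
        # The codepoint view needs the normalization but not the encoding.
        return [ord(ch) for ch in unicodedata.normalize(self.nf, self.text)]

    def buf(self) -> bytes:
        # The byte view needs both the normalization and the encoding.
        return unicodedata.normalize(self.nf, self.text).encode(self.enc)


composed = MultiLevelString("cafe\u0301", nf="NFC")
decomposed = MultiLevelString("cafe\u0301", nf="NFD")
print(len(composed.graphs()), len(composed.codes()), len(composed.buf()))
print(len(decomposed.graphs()), len(decomposed.codes()), len(decomposed.buf()))
```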

Larry
