Simon Cozens wrote:
> =head1 The Parrot String API

Have you guys seen Topaz? One of many things I think Chip
did right was to build strings from a low-level buffer
concept. This moves memory management (and possibly raw I/O)
out of the string class and into the buffer class.
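
A rough sketch of that layering in C, to make the split concrete.
Every name here (demo_buffer, demo_str, and their functions) is
made up for illustration; this is not Topaz's or Parrot's actual
API, just one way the buffer could own all allocation while the
string keeps only string-level bookkeeping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer: the only layer that talks to the allocator. */
typedef struct {
    unsigned char *mem;
    size_t used;   /* bytes in use */
    size_t cap;    /* bytes allocated */
} demo_buffer;

/* Grow so at least `extra` more bytes fit. Returns 0 on success, -1 on failure. */
int demo_buffer_reserve(demo_buffer *b, size_t extra)
{
    assert(b != NULL);
    if (extra <= b->cap - b->used)
        return 0;
    size_t ncap = b->cap ? b->cap : 16;
    while (ncap - b->used < extra)
        ncap *= 2;
    unsigned char *p = realloc(b->mem, ncap);
    if (p == NULL)
        return -1;
    b->mem = p;
    b->cap = ncap;
    return 0;
}

/* Copy n raw bytes onto the end of the buffer. */
int demo_buffer_append(demo_buffer *b, const void *src, size_t n)
{
    if (n == 0)
        return 0;
    if (demo_buffer_reserve(b, n) != 0)
        return -1;
    memcpy(b->mem + b->used, src, n);
    b->used += n;
    return 0;
}

void demo_buffer_release(demo_buffer *b)
{
    free(b->mem);
    b->mem = NULL;
    b->used = b->cap = 0;
}

/* Hypothetical string: a buffer plus a character count, no malloc calls. */
typedef struct {
    demo_buffer buf;
    size_t nchars;
} demo_str;

int demo_str_append_ascii(demo_str *s, const char *text)
{
    size_t n = strlen(text);
    if (demo_buffer_append(&s->buf, text, n) != 0)
        return -1;
    s->nchars += n;   /* one byte per character in ASCII */
    return 0;
}
```

The string code never calls realloc() itself, so a change in
allocation strategy stays inside the buffer functions.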

The other major suggestion I have is to avoid "void *"
interfaces. Do we really need a string_make() that takes
the encoding enum? Aren't encodings different enough that
we'd want string_make_ascii(), string_make_utf32(), ...?
Maybe I've been doing C++ too long, but taking advantage
of compiler type checks as much as possible seems like an
excellent goal -- especially since the various string_*
functions are going to be visible in user code.
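
For illustration, typed constructors might look something like
the sketch below. The function names come from the suggestion
above, but the signatures, the demo_string struct, and the
encoding constants are assumptions, not the Parrot API. The
shared "void *" helper is kept static so user code never sees it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types; not Parrot's actual string layout. */
typedef enum { DEMO_ENC_ASCII, DEMO_ENC_UTF32 } demo_encoding;

typedef struct {
    demo_encoding encoding;
    size_t        byte_len;   /* bytes stored in data */
    void         *data;
} demo_string;

/* Untyped helper, file-local on purpose. */
static demo_string *make_raw(const void *src, size_t byte_len, demo_encoding enc)
{
    demo_string *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->data = malloc(byte_len ? byte_len : 1);
    if (s->data == NULL) {
        free(s);
        return NULL;
    }
    if (byte_len)
        memcpy(s->data, src, byte_len);
    s->encoding = enc;
    s->byte_len = byte_len;
    return s;
}

/* Typed entry points: handing a uint32_t* to the ASCII version
 * (or a char* to the UTF-32 one) draws a compiler diagnostic. */
demo_string *string_make_ascii(const char *src, size_t nchars)
{
    assert(src != NULL || nchars == 0);
    return make_raw(src, nchars, DEMO_ENC_ASCII);
}

demo_string *string_make_utf32(const uint32_t *src, size_t nchars)
{
    assert(src != NULL || nchars == 0);
    return make_raw(src, nchars * sizeof *src, DEMO_ENC_UTF32);
}

void string_free(demo_string *s)
{
    if (s != NULL) {
        free(s->data);
        free(s);
    }
}
```

With a single string_make(void *, ..., encoding) the same
mistake compiles silently and shows up only at run time.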

The use of an encoding enum seems a little weird, but it
will probably make sense once you explain why. Right now the
only thing it seems good for is the transcoding system --
everything else is slower, and vtables are more cumbersome,
because you have to manage a vtable[num_encodings] array.

I'd make a string's encoding a direct pointer to its
vtable struct. The transcoding algorithm seems like it
could be implemented using a string iterator on the source
with a character-by-character append to the destination.
We would need an abstract character object, but that
seems necessary anyway. (Perl doesn't have characters,
but does Perl 6 or Python?) The destination can be
pre-extended to the length of the source to avoid
fragmenting memory. How many encoding-specific
optimizations are there? Is it worth having a table of
custom transcoders instead of using a generic algorithm?
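
A minimal sketch of the generic approach in C (C99), assuming a
uint32_t code point as the abstract character. The vtable layout,
the vstring struct, and all function names are hypothetical; only
ASCII and native-byte-order UTF-32 are shown, with no validation
of malformed input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct vstring vstring;

/* Hypothetical per-encoding vtable; a string points straight at it. */
typedef struct {
    const char *name;
    /* Read the character at byte offset *pos and advance *pos. */
    uint32_t (*next_char)(const vstring *s, size_t *pos);
    /* Append one character. Returns 0 on success, -1 on failure. */
    int (*append_char)(vstring *s, uint32_t c);
} encoding_vtable;

struct vstring {
    const encoding_vtable *enc;   /* direct pointer, no enum lookup */
    unsigned char *data;
    size_t len;                   /* bytes used */
    size_t cap;                   /* bytes allocated */
};

static int reserve(vstring *s, size_t extra)
{
    if (s->len + extra <= s->cap)
        return 0;
    size_t ncap = s->cap ? s->cap * 2 : 16;
    while (ncap < s->len + extra)
        ncap *= 2;
    unsigned char *p = realloc(s->data, ncap);
    if (p == NULL)
        return -1;
    s->data = p;
    s->cap = ncap;
    return 0;
}

static uint32_t ascii_next(const vstring *s, size_t *pos)
{
    return s->data[(*pos)++];
}

static int ascii_append(vstring *s, uint32_t c)
{
    if (c > 0x7F || reserve(s, 1) != 0)
        return -1;                /* not representable, or out of memory */
    s->data[s->len++] = (unsigned char)c;
    return 0;
}

static uint32_t utf32_next(const vstring *s, size_t *pos)
{
    uint32_t c;
    memcpy(&c, s->data + *pos, sizeof c);
    *pos += sizeof c;
    return c;
}

static int utf32_append(vstring *s, uint32_t c)
{
    if (reserve(s, sizeof c) != 0)
        return -1;
    memcpy(s->data + s->len, &c, sizeof c);
    s->len += sizeof c;
    return 0;
}

const encoding_vtable ascii_vtable = { "ascii", ascii_next, ascii_append };
const encoding_vtable utf32_vtable = { "utf32", utf32_next, utf32_append };

/* Generic transcoder: iterate the source, append to the destination. */
int transcode(const vstring *src, vstring *dst)
{
    assert(src && dst && src->enc && dst->enc);
    if (reserve(dst, src->len) != 0)   /* pre-extend to the source length */
        return -1;
    for (size_t pos = 0; pos < src->len; ) {
        uint32_t c = src->enc->next_char(src, &pos);
        if (dst->enc->append_char(dst, c) != 0)
            return -1;
    }
    return 0;
}
```

transcode() needs no knowledge of either encoding, so adding an
encoding means writing one vtable rather than N custom
transcoders. The cost is two indirect calls per character, which
is what a table of specialized transcoders would be buying back.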

- Ken
