On Tue, Sep 12, 2000 at 12:24:50AM +0200, Gisle Aas wrote:
> Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:
>
> > Please take a look at the (very rough) first draft of Encode, an extension
> > for character encoding conversions for Perl 5:
> >
> > http://www.iki.fi/jhi/Encode.tgz
> >
> > Download, plop it into the Perl 5.7 source directory, unpack,
> > re-Configure, rebuild. (Or, if you have a Perl 5.7 in your path,
> > cd to ext/Encode, perl Makefile.PL, make).
>
> I did not really understand the interface. It seems that you expose
> too much of the fact that perl (currently) uses utf8 internally.

Before we have the character mapping tables we don't have a choice --
and after that it shouldn't matter anyway since we should have
from_to(). Then people will never see the underlying utf8ness.
(Unless, of course, they say from_to('latin1', 'utf8'), but that's
transparent and orthogonal.)
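
To make the from_to() idea concrete, here is a minimal sketch. It assumes
a calling convention of from_to(DATA, FROM, TO) that converts DATA in
place; that signature is an assumption for illustration (it is what the
Encode module on CPAN provides), not something taken from the draft
tarball. The sample string is also made up.

```perl
use strict;
use warnings;
use Encode qw(from_to);

# Latin-1 octets for "cafe" with an e-acute: the accented
# letter is the single byte 0xE9.
my $octets = "caf\xE9";

# Assumed signature: from_to(DATA, FROM, TO), converting in place.
# The caller only names the two encodings and never touches
# perl's internal representation.
from_to($octets, 'latin1', 'utf8');

# $octets is still a plain byte string, now UTF-8 encoded:
# 0xE9 has become the two bytes 0xC3 0xA9.
print unpack("H*", $octets), "\n";   # prints 636166c3a9
```

The point of the sketch is that both ends are byte strings in a named
encoding, so the utf8ness of perl's internals only shows up if the
caller explicitly asks for 'utf8' as one of the two encodings.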
> I would like to see these convert perl strings to bytes:
>
> to_utf7
> to_utf8
> ...
> And these convert a sequence of bytes to perl strings:
>
> from_utf8
>
> You seem to want to define these functions the opposite way. Perhaps

*Now* I understand why I couldn't ever figure out how to use Unicode::Map8 :-)

> The is_utf8() also seems wrong to me. I believe that the SV invariant
> should be that a string marked with the UTF8 flag should not contain
> illegal UTF8 sequences. Why is it not so?

I'm being paranoid. Keeps me alive.

> Regards,
> Gisle
--
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen