Re: string encoding

Simon Cozens Fri, 16 Feb 2001 16:19:55 -0800
On Fri, Feb 16, 2001 at 02:25:59PM -0800, Hong Zhang wrote:
> I think you have mixed up codepoint vs. character. What you will get is
> the 10th codepoint, not the 10th character.

I think you're confused. Codepoints *are* characters. Combining characters are
taken care of as per the RFC.

> UTF-32 has its problems too, such as cache locality, memory footprint,
> and encoding conversion.

I'm talking about UTF16. You're talking about UTF32.
Try talking about what I'm talking about.

> I said it is not the common case

And I am saying that it is.

> You need to examine the first two bytes for UTF-16 too, right?

Wrong.

> > UTF16 : s += 2;            : O(1) : Good
> > UTF8  : s += UTF8WIDTH(*s) : O(n) : Bad
> 
> What I don't understand is where you really use random access into a string?

I have been through this many, many times. I am not going through it
again.
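
For reference, the two stepping rules in the quoted table can be sketched
in C roughly as follows. This is only an illustration: utf8_width is a
hypothetical stand-in for UTF8WIDTH, the function names are made up here,
and utf16_offset assumes every character occupies exactly one 2-byte unit
(no surrogate pairs), which is the assumption the table makes.

```c
#include <stddef.h>

/* Hypothetical stand-in for the UTF8WIDTH macro in the quoted table:
   byte length of a UTF-8 sequence, judged from its lead byte.
   Assumes well-formed input; any other byte is stepped over as 1. */
static size_t utf8_width(unsigned char lead)
{
    if (lead < 0x80) return 1;            /* 0xxxxxxx: ASCII */
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 1;
}

/* Byte offset of the n-th codepoint (0-based) in a UTF-8 buffer.
   Each step depends on the byte just read, so reaching index n
   means walking n sequences: O(n). */
static size_t utf8_offset(const unsigned char *s, size_t n)
{
    const unsigned char *p = s;
    while (n-- > 0)
        p += utf8_width(*p);
    return (size_t)(p - s);
}

/* Byte offset of the n-th character with fixed 2-byte units: O(1).
   Assumption: no surrogate pairs occur in the string. */
static size_t utf16_offset(size_t n)
{
    return n * 2;
}
```

With the UTF-8 bytes for "a", U+00E9, U+20AC, "z", utf8_offset(s, 3) has to
walk three sequences of 1, 2 and 3 bytes to reach offset 6, while
utf16_offset(3) is a single multiplication.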

You are mistaken, and explaining our point is not helping.
Let us agree to differ.

-- 
sub UNIVERSAL::AUTOLOAD{ $UNIVERSAL::AUTOLOAD =~ s/::/ /; print
"$UNIVERSAL::AUTOLOAD "}; hacker perl another just;