On Tue, Aug 17, 2010 at 10:34, Bulat Ziganshin <bulat.zigans...@gmail.com> wrote:

> Hello Johan,
>
> Tuesday, August 17, 2010, 12:20:37 PM, you wrote:
>
> >  I agree, Data.Text is great.  Unfortunately, its internal use of UTF-16
> >  makes it inefficient for many purposes.
>
> > It's not clear to me that using UTF-16 internally does make
> > Data.Text noticeably slower.
>
> not slower but require 2x more memory. speed is the same since
> Unicode contains 2^20 codepoints
>
>
This is not entirely correct, because it all depends on your data.
For western languages it normally holds true that UTF-16 occupies twice the
memory of UTF-8, but for other languages a code point can take up to 3 bytes
in UTF-8, so UTF-16 can be the smaller encoding there. Code points outside
the Basic Multilingual Plane take 4 bytes in both encodings. See
http://en.wikipedia.org/wiki/UTF-8.
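
To make the size comparison concrete, here is a minimal sketch in Python
(the helper name encoded_sizes and the sample strings are my own, purely
for illustration) that encodes a few strings both ways and compares the
byte counts:

```python
def encoded_sizes(s):
    """Return (utf8_bytes, utf16_bytes) for the string s."""
    # "utf-16-le" is used so that no 2-byte byte-order mark is prepended.
    return len(s.encode("utf-8")), len(s.encode("utf-16-le"))


# Illustrative samples only; escapes are used so the source stays ASCII.
samples = {
    "ascii": "hello",                          # 1 byte each in UTF-8, 2 in UTF-16
    "greek": "\u03b1\u03b2\u03b3\u03b4\u03b5",  # 2 bytes each in both encodings
    "japanese": "\u65e5\u672c\u8a9e",          # 3 bytes each in UTF-8, 2 in UTF-16
    "outside BMP": "\U0001D11E",               # 4 bytes in both encodings
}

for name, text in samples.items():
    utf8, utf16 = encoded_sizes(text)
    print(f"{name}: UTF-8 {utf8} bytes, UTF-16 {utf16} bytes")
```

Only the ASCII sample shows the 2x overhead for UTF-16; for the Japanese
sample UTF-8 is the larger of the two.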

That Wikipedia page is a nice read anyway; it mentions some of the
advantages and disadvantages of the different encodings.
(For example, the complexity of the code that determines the length of a
UTF string depends on the encoding.)
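
As a rough illustration of that last point, here is a sketch in Python of
counting code points directly on encoded bytes. The function names are
hypothetical and the input is assumed to be well-formed; this is not how
Data.Text or any particular library does it. Both versions need a linear
scan, they just skip different things:

```python
import struct


def utf8_length(data: bytes) -> int:
    """Count code points in well-formed UTF-8 data."""
    # Continuation bytes look like 0b10xxxxxx; every other byte starts
    # a new code point.
    return sum(1 for b in data if (b & 0xC0) != 0x80)


def utf16le_length(data: bytes) -> int:
    """Count code points in well-formed little-endian UTF-16 data."""
    # Split the bytes into 16-bit code units.
    units = struct.unpack(f"<{len(data) // 2}H", data)
    # A low surrogate (0xDC00-0xDFFF) is the second half of a pair,
    # so it does not start a new code point.
    return sum(1 for u in units if not 0xDC00 <= u <= 0xDFFF)


# Four code points: 'a', U+00E9, U+65E5 and U+1D11E.
text = "a\u00e9\u65e5\U0001D11E"
print(utf8_length(text.encode("utf-8")))
print(utf16le_length(text.encode("utf-16-le")))
```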

Cheers,
 -Tako
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
