Re: [Patch] optimize utf8_to_ucs4

Georg Baum Mon, 30 Oct 2006 09:24:58 -0800

Joost Verburg wrote:

> Georg Baum wrote:
>> For ucs4 -> utf8 we would have to use a result string with a length of 6
>> times the input length, with the average length close to the input
>> length if we want to be able to convert everything. That is probably too
>> much to be efficient.
> 
> ucs4 uses 4 bytes per character and utf8 1-4 bytes. I don't understand
> where you get this number from.


I read somewhere that the highest possible number of bytes for a single
character in utf8 is 6, but I forgot where. Abdel reported the same, and
now I am unsure, because Wikipedia says 4. Does anybody know which number
is correct?
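
For reference, both numbers appear in the specifications: the older
definition of UTF-8 (RFC 2279) covered 31-bit values and needed up to 6
bytes per character, while the current one (RFC 3629) stops at U+10FFFF
and needs at most 4. Below is a minimal sketch of the length calculation
under the older 31-bit scheme. The name utf8_length is a hypothetical
helper for illustration only and is not part of the patch.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the patch): number of bytes needed to
// encode one UCS-4 value in UTF-8, following the 31-bit scheme of
// RFC 2279. Returns 0 if the value does not fit into 31 bits.
std::size_t utf8_length(std::uint32_t c)
{
    if (c < 0x80)       return 1;  // 7 bits, plain ASCII
    if (c < 0x800)      return 2;  // 11 bits
    if (c < 0x10000)    return 3;  // 16 bits
    if (c < 0x200000)   return 4;  // 21 bits, includes all of U+0000..U+10FFFF
    if (c < 0x4000000)  return 5;  // 26 bits, only valid in the older scheme
    if (c < 0x80000000) return 6;  // 31 bits, only valid in the older scheme
    return 0;                      // not encodable
}
```

Under that assumption, the worst-case result buffer for n input
characters is 4 * n bytes if only values up to U+10FFFF have to be
handled, and 6 * n bytes otherwise.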


Georg
