On Wednesday, 16 August 2006 18:41, Abdelrazak Younes wrote:
> Hum... I am not sure I follow everything, but let me summarize what I
> understand from the current code. The std::vectors I am talking about are:
> 
> * vector<char>: could be replaced by std::basic_string<char>
> * vector<unsigned short>: that is ucs2, right? That could be replaced by
> std::basic_string<unsigned short>
> * vector<boost::uint32_t>: I guess that is ucs4 and that could be
> replaced by std::basic_string<boost::uint32_t>

aka lyx::docstring

> Internally we should just use one of those three types.

IMO only the last one. ucs2 is only needed for talking to Qt, and that can
easily be wrapped in fromqstr/toqstr, so we don't really need a ucs2 string
type.
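
A minimal sketch of what "wrapping" could look like, kept free of Qt so it
compiles on its own. The names to_ucs2/from_ucs2 and both typedefs are
hypothetical stand-ins, not the actual fromqstr/toqstr signatures or the real
lyx::docstring definition. It also assumes a modern compiler with
std::u16string and std::u32string.

```cpp
#include <string>

// Assumption: stand-in for the internal UCS-4 string type.
using ucs4_string = std::u32string;
// Assumption: stand-in for the 16-bit buffer handed to the GUI toolkit.
using ucs2_string = std::u16string;

// Internal -> toolkit. UCS-2 has no surrogate pairs, so any code point
// outside the Basic Multilingual Plane becomes U+FFFD.
ucs2_string to_ucs2(const ucs4_string& in) {
    ucs2_string out;
    out.reserve(in.size());
    for (char32_t c : in) {
        out.push_back(c <= 0xFFFF ? static_cast<char16_t>(c) : u'\uFFFD');
    }
    return out;
}

// Toolkit -> internal. Every 16-bit unit widens to one 32-bit unit.
ucs4_string from_ucs2(const ucs2_string& in) {
    return ucs4_string(in.begin(), in.end());
}
```

With two such helpers at the GUI boundary, the rest of the code only ever
sees the UCS-4 type.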

> The conversion
> to this complicated utf8 encoding should happen on input/output only.
> Handling a multi-byte encoding internally is just a recipe for a buggy
> future IMHO.
> 
> So what am I not getting right here?

multibyte != variable-byte. Multibyte is not bad per se. Both ucs2 and ucs4 
use a fixed number of bytes for one character (2 and 4, respectively, 
surprise, surprise!). The problem is a variable-byte encoding such as 
utf8.
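
To illustrate the difference, here is a small sketch (not code from the tree).
The helper names are made up for this example, the UTF-8 input is assumed to
be well formed, and std::u32string stands in for a ucs4 string type.

```cpp
#include <cstddef>
#include <string>

// Count characters in a UTF-8 byte string by skipping continuation
// bytes (bit pattern 10xxxxxx). Assumes well-formed input.
std::size_t utf8_length(const std::string& bytes) {
    std::size_t count = 0;
    for (unsigned char byte : bytes) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

// Hypothetical sample text: 'n', e-acute (U+00E9), 'e'.
// In UTF-8 the e-acute takes two bytes (0xC3 0xA9), so the string is
// four bytes long. The literal is split so that "\xA9" does not swallow
// the following 'e' as a hex digit.
std::string sample_utf8() { return "n\xC3\xA9" "e"; }

// The same text in UCS-4: exactly one 32-bit unit per character.
std::u32string sample_ucs4() { return U"n\u00E9e"; }
```

With the UCS-4 string, size() is the character count and operator[] gives the
n-th character directly. With the UTF-8 string, size() is a byte count, and
finding the n-th character needs a scan like the one in utf8_length.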


Georg
