On Wed, May 14, 2014 at 2:25 PM, Armin Ronacher <[email protected] > wrote:
> Hi,
>
> On 02/05/2014 00:03, John Downey wrote:
>
>> I have actually always been a fan of how .NET did this. The System.String
>> type is opinionated in how it is stored internally and does not allow
>> anyone to change that (unlike Ruby). The conversion from String to byte[]
>> is done using explicit conversion methods like:
>>
> Unfortunately the .NET string type does not support UCS4 and as such is a
> nightmare to deal with. Also because the internal encoding is not UTF-8
> *any* interaction with the outside world (ignoring the win32 api) is going
> through an encode/decode step which can be unnecessary.
>
> For instance if you would do that on Linux you would decode from utf-8 to
> your internal UCS4 encoding, then encode back to utf-8 on the way back to
> the terminal. (Aside from that, 32bit for a charpoint is too large as
> unicode does not go in more than 21bit or something. Useless)
>

Even keeping whole bytes, 3 bytes (24 bits) is effectively sufficient for the
whole of Unicode. If you don't mind some arithmetic, you could thus use a
backing array of bytes and just recompose the value on output.

> Regards,
> Armin
>
> _______________________________________________
> Rust-dev mailing list
> [email protected]
> https://mail.mozilla.org/listinfo/rust-dev
>
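To make the 3-byte idea concrete, here is a rough sketch of what I have in
mind. The names (pack, unpack, encode, decode) and the big-endian byte order
are my own assumptions for illustration, not an existing API or a proposal
for the standard library. It relies on the fact that the largest Unicode code
point is U+10FFFF, which fits in 21 bits.

```rust
// Hypothetical helpers: store each Unicode scalar value in exactly 3 bytes.

/// Split a scalar value into 3 big-endian bytes (assumed layout).
fn pack(c: char) -> [u8; 3] {
    let v = c as u32; // never exceeds 0x10FFFF, so the top byte is unused
    [(v >> 16) as u8, (v >> 8) as u8, v as u8]
}

/// Recompose the value; returns None for surrogates or out-of-range values.
fn unpack(b: [u8; 3]) -> Option<char> {
    let v = ((b[0] as u32) << 16) | ((b[1] as u32) << 8) | (b[2] as u32);
    std::char::from_u32(v)
}

/// Build the backing byte array for a whole string.
fn encode(s: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(s.len() * 3);
    for c in s.chars() {
        out.extend_from_slice(&pack(c));
    }
    out
}

/// Recompose a string on output; None if the length is not a multiple of 3
/// or any triple is not a valid scalar value.
fn decode(bytes: &[u8]) -> Option<String> {
    if bytes.len() % 3 != 0 {
        return None;
    }
    bytes
        .chunks(3)
        .map(|t| unpack([t[0], t[1], t[2]]))
        .collect()
}
```

Indexing stays O(1) (character i lives at bytes 3*i .. 3*i+3), at the cost of
a shift-and-or per access and a conversion step whenever the data has to
leave as UTF-8.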
