[EMAIL PROTECTED] wrote:
> Speak of char, what about unicode? Will this be added/change ? You need
> 16bit for unicode char.

This is a common misconception. Unicode is nothing more than a huge set of characters. Unicode basically only assigns a number (a code point) to every character it deals with.
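
To make the difference between a character's number and its storage size concrete, here is a minimal sketch in C. It is only an illustration, not code from our tree: the helper names utf8_encode and utf16_units are made up for this mail, and it assumes the current Unicode limit of U+10FFFF. It encodes one code point with UTF-8 and reports how many 16-bit units UTF-16 would need for the same code point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the UTF-8 encoding of code point cp into out
 * (room for 4 bytes) and returns the number of bytes written, or 0 if cp
 * is not a valid code point (a surrogate, or above the assumed U+10FFFF). */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);

    if (cp < 0x80) {                     /* ASCII range: one byte, unchanged */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                    /* two bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                  /* three bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                    /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                /* four bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Hypothetical helper: number of 16-bit units UTF-16 needs for cp:
 * 1, or 2 for a surrogate pair, or 0 if cp is not a valid code point. */
size_t utf16_units(uint32_t cp)
{
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;
    if (cp < 0x10000)
        return 1;
    if (cp <= 0x10FFFF)
        return 2;
    return 0;
}
```

With these, 'A' (U+0041) takes one byte in UTF-8 and one 16-bit unit in UTF-16, the euro sign (U+20AC) takes three bytes and one unit, and U+1D11E takes four bytes and two units. Same characters, same numbers, different storage.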
The number of bits you need for storing Unicode characters depends entirely on the encoding. And yes, one of those encodings happens to use 16-bit units. This particular encoding is UTF-16, which btw is _not_ endian-safe: the byte order has to be agreed upon or signalled with a byte order mark. One important thing to note btw: the original Unicode/ISO 10646 code space was 31 bits large (it has since been restricted to 21 bits, U+0000 through U+10FFFF), so 16 bits are too few to store every possible Unicode character. UTF-16 covers the rest with pairs of 16-bit units (surrogate pairs).

Then another common encoding for Unicode is UTF-8. Its advantages:

- All characters with a code point below 128 are encoded in one byte (8 bits), identical to ASCII.
- UTF-8 (unlike UTF-16) never has a 0x0 byte in its encoding unless a 0x0 character actually is part of the encoded string. This allows 0x0-terminated strings (I figure you understand the advantage of this for C's string functions: they keep working).
- UTF-8 is defined as a sequence of bytes and as such is automatically endian-safe.
- Also very important: non-ASCII characters will never, ever have any byte as part of their encoding that represents an ASCII value. Every byte of a multi-byte sequence has its high bit set.

Aside from that, the two most commonly used Unicode encodings are:

- UTF-8: (almost?) all unix-like OSes use it;
- UTF-16: Windows uses this (I guess MS just doesn't like to play along, as usual).

Of these two, UTF-8 will be the easiest to implement, because we can keep using char types for strings and we can keep using the C library's string functions (in other words: fewer changes required).

So all that we would need to add support for, to be able to use Unicode, is I/O code. That means: code that decodes the UTF-8 string and renders it to screen, plus code that takes keyboard input and encodes it into UTF-8 strings where required.

> I read that people can no enter some characters in text boxes.

This is the direct result of the fact that we currently only support the ASCII character set.

-- 
Giel
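
P.S. As a rough idea of what the decoding half of that I/O code could look like, here is a minimal sketch in C. It is not existing Warzone code: the names utf8_decode and utf8_count are hypothetical, and it assumes NUL-terminated input and the current Unicode limit of U+10FFFF. It reads one code point at a time from an ordinary char string, which is roughly what a text renderer would have to do before looking up a glyph.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decodes one code point from the NUL-terminated UTF-8
 * string s into *cp. Returns the number of bytes consumed, or 0 at the end
 * of the string or on malformed input. */
size_t utf8_decode(const char *s, uint32_t *cp)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t len, i;
    uint32_t min; /* smallest code point allowed for this sequence length */

    assert(s != NULL && cp != NULL);

    if (p[0] == 0)
        return 0;                                    /* end of string */
    if (p[0] < 0x80) {
        *cp = p[0];                                  /* plain ASCII */
        return 1;
    } else if ((p[0] & 0xE0) == 0xC0) {
        len = 2; *cp = p[0] & 0x1F; min = 0x80;
    } else if ((p[0] & 0xF0) == 0xE0) {
        len = 3; *cp = p[0] & 0x0F; min = 0x800;
    } else if ((p[0] & 0xF8) == 0xF0) {
        len = 4; *cp = p[0] & 0x07; min = 0x10000;
    } else {
        return 0;                 /* stray continuation or invalid lead byte */
    }

    for (i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80)
            return 0;             /* not a continuation byte (or early NUL) */
        *cp = (*cp << 6) | (p[i] & 0x3F);
    }

    /* Reject overlong forms, surrogates and values above U+10FFFF. */
    if (*cp < min || *cp > 0x10FFFF || (*cp >= 0xD800 && *cp <= 0xDFFF))
        return 0;
    return len;
}

/* Hypothetical helper: counts the code points in s, or returns (size_t)-1
 * if the string contains malformed UTF-8. */
size_t utf8_count(const char *s)
{
    size_t count = 0, n;
    uint32_t cp;

    assert(s != NULL);
    while (*s != '\0') {
        n = utf8_decode(s, &cp);
        if (n == 0)
            return (size_t)-1;
        s += n;
        count++;
    }
    return count;
}
```

For the string "h\xC3\xA9llo" (the word "héllo" in UTF-8), strlen() still works and reports 6 bytes, while utf8_count() reports 5 characters. Searching for an ASCII character such as 'l' with strchr() also keeps working, because no byte of the two-byte "é" sequence can be mistaken for ASCII.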
_______________________________________________
Warzone-dev mailing list
Warzone-dev@gna.org
https://mail.gna.org/listinfo/warzone-dev