[EMAIL PROTECTED] wrote:
> Speak of char, what about unicode?  Will this be added/change ? You need 
> 16bit for unicode char.
>   
This is a common misconception. Unicode itself is nothing more than a
huge set of characters: it basically only assigns a number (a code
point) to every character it deals with.

The number of bits you need for storing Unicode characters depends
entirely on the encoding. And yes, one of those encodings happens to
use 16-bit units. That encoding is UTF-16, which btw is _not_
endian-safe: you need a byte order mark or an explicit UTF-16LE/BE
label to know the byte order. One important thing to note: Unicode code
points run up to U+10FFFF, which takes 21 bits (ISO 10646 originally
reserved 31), so 16 bits are too few to store every possible character.
UTF-16 has to use a pair of 16-bit units for anything above U+FFFF.

Then another common encoding for Unicode is UTF-8. It has the advantage
of encoding every character with a code below 128 (i.e. all of ASCII)
in a single byte, identical to its ASCII value. An additional advantage
is that UTF-8 (unlike UTF-16) never has a 0x0 byte in its encoding
unless a 0x0 character actually is part of the encoded string. This
allows 0x0-terminated strings (I figure you understand the advantage of
this for C's string functions: they keep working). Another advantage is
that UTF-8 is defined as a sequence of bytes and as such is
automatically endian-safe. Plus, also very important: the encoding of a
non-ASCII character never, ever contains a byte with an ASCII value;
every byte of a multi-byte sequence has its high bit set.
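
To make those properties concrete, here is a minimal sketch of a UTF-8
encoder in C. The name utf8_encode and its signature are made up for
illustration (nothing like it exists in our tree yet); the bit layout
itself is the standard one.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-8.
 * Writes up to 4 bytes to out and returns the number of bytes written,
 * or 0 if cp cannot be encoded (a UTF-16 surrogate, or above U+10FFFF). */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        /* ASCII range: one byte, identical to the ASCII value */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0; /* surrogates are not valid code points to encode */
        /* 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Note that every byte written by the multi-byte branches has its high
bit set, so none of them can be mistaken for an ASCII character or for
the 0x0 terminator. For example, U+00E9 (e with acute accent) comes out
as the two bytes 0xC3 0xA9.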

Aside from that, the two most commonly used Unicode encodings are UTF-8,
which (almost?) all Unix-like OSes use, and UTF-16, which Windows uses
(I guess MS just doesn't like to play along, as usual).

Of these two, UTF-8 will be the easiest to implement because we can
keep using char types for strings, and we can keep using the C library's
string functions (in other words: fewer changes required).
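
One caveat with "keep working": the C library functions operate on
bytes, so strlen() gives the size in bytes, not the number of
characters. Where we actually need a character count, something like
the sketch below would do. The name utf8_count_codepoints is
hypothetical, and the function assumes its input is valid UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper: count the code points in a 0x0-terminated UTF-8
 * string. Continuation bytes always look like 10xxxxxx, so we count
 * every byte that is NOT a continuation byte.
 * Assumes s is valid UTF-8; no validation is done here. */
size_t utf8_count_codepoints(const char *s)
{
    size_t count = 0;

    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For the string "h\xC3\xA9llo" (that is "hello" with an accented e),
strlen() returns 6 while utf8_count_codepoints() returns 5. Searching
for an ASCII character with strchr() is still safe, because no byte of
a multi-byte sequence can be equal to an ASCII value.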

So all that we would need to add support for, to be able to use Unicode,
is I/O code. That means: code that decodes the UTF-8 string and renders
it to the screen, plus code that takes keyboard input and encodes it
into UTF-8 strings where required.
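
For the decoding half, a rough sketch of what the rendering code could
call to pull one character at a time out of a string. The name
utf8_decode and its return convention are assumptions for illustration
only, not existing code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode the first code point of the
 * 0x0-terminated UTF-8 string s into *cp.
 * Returns the number of bytes consumed, or 0 at the end of the string
 * or on malformed input (the caller can check *s to tell them apart). */
size_t utf8_decode(const char *s, uint32_t *cp)
{
    const unsigned char *p = (const unsigned char *)s;
    uint32_t c, min;
    size_t len, i;

    if (p[0] == 0)
        return 0;                     /* end of string */
    if (p[0] < 0x80) {
        *cp = p[0];                   /* plain ASCII */
        return 1;
    } else if ((p[0] & 0xE0) == 0xC0) {
        len = 2; c = p[0] & 0x1F; min = 0x80;
    } else if ((p[0] & 0xF0) == 0xE0) {
        len = 3; c = p[0] & 0x0F; min = 0x800;
    } else if ((p[0] & 0xF8) == 0xF0) {
        len = 4; c = p[0] & 0x07; min = 0x10000;
    } else {
        return 0;                     /* stray continuation or bad lead byte */
    }

    for (i = 1; i < len; ++i) {
        /* Each following byte must be 10xxxxxx; this check also stops
         * us at a premature 0x0 terminator before reading past it. */
        if ((p[i] & 0xC0) != 0x80)
            return 0;
        c = (c << 6) | (p[i] & 0x3F);
    }

    /* Reject overlong forms, surrogates and values above U+10FFFF. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;

    *cp = c;
    return len;
}
```

A text renderer would call this in a loop, advancing the string pointer
by the returned byte count and looking up a glyph for each code point,
and would stop (or substitute a placeholder) when it gets 0 back.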
> I read that people can no enter some characters in text boxes.
>   
This is the direct result of the fact that we currently only support
the ASCII character set.

-- 
Giel

_______________________________________________
Warzone-dev mailing list
Warzone-dev@gna.org
https://mail.gna.org/listinfo/warzone-dev
