>>> I'd prefer to have an option to use UTF-16 (treated as a 2-byte
>>> character set with surrogate pairs) as that will only halve the
>>> maximum allowed number of characters.

The maximum allowed number of characters in Unicode is about 1 Million. Which can be perfectly represented by either UTF-8 or UTF-16.

>> Nope. If you take into account surrogates, UTF-16 will have the
>> same maximum of 4 bytes per character.

You should think of that not as 4 bytes but as two 16-bit words.

> You are missing my point. There are two ways to consider UTF-16, one is
> your interpretation where each character is 2-4 bytes, or as 2 byte
> 'characters', where some codepoints are built from a surrogate pair
> (which essential means that some codepoints require two 'characters',
> which in isolation don't make much sense).

I don't get your point. UTF-16 is a standard that uses one or two 16-bit words to represent one Unicode character (code point). That's the only way to consider it. (UCS-2 uses one 16-bit word, which is only usable for BMP characters, making it completely useless today.)

> As most languages don't need those surrogate pairs for their
> codepoints/glyphs, it is easier to consider UTF-16 to be 2 byte. As far
> as I know this is how most UTF-16 implementations handle it.

You mix up words like "byte", "character", "codepoint", and "glyph". In the good old ASCII days we had a 1:1:1 relationship between "bytes", "characters" and "glyphs". Today there is no such relationship anymore. In the Unicode system, you usually need more than one byte to represent a character. You may need more than one character to represent a glyph.

Regards
Stefan

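To make the distinction between bytes, 16-bit words and code points concrete, here is a minimal Python sketch. It is not part of the original discussion, and the helper names `describe` and `surrogate_pair` are made up for illustration. It assumes Python 3, counts one string at each level, and shows how a code point above U+FFFF is split into two 16-bit words.

```python
def describe(text):
    """Count the same string as code points, UTF-16 words and bytes."""
    utf16 = text.encode("utf-16-le")  # little-endian, no byte order mark
    return {
        "code_points": len(text),        # Python str is indexed by code point
        "utf16_units": len(utf16) // 2,  # number of 16-bit words
        "utf16_bytes": len(utf16),
        "utf8_bytes": len(text.encode("utf-8")),
    }


def surrogate_pair(code_point):
    """Split a code point in U+10000..U+10FFFF into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000  # 20 significant bits remain
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


# BMP character: one code point, one 16-bit word.
print(describe("A"))
# Outside the BMP (U+1D11E): one code point, two 16-bit words.
print(describe("\U0001D11E"))
print([hex(unit) for unit in surrogate_pair(0x1D11E)])
# 'e' followed by U+0301 COMBINING ACUTE ACCENT: two code points
# that are normally rendered as a single glyph.
print(describe("e\u0301"))
```

For U+1D11E this reports 1 code point, 2 UTF-16 words and 4 bytes in both UTF-16 and UTF-8, with the surrogate pair 0xd834, 0xdd1e. The last string has 2 code points although it is displayed as one glyph, which is the "more than one character to represent a glyph" case.
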
Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel