On 11/11/19 2:57 PM, Jose Isaias Cabrera wrote:
> Igor Tandetnik, on Monday, November 11, 2019 02:24 PM, wrote...
>> On 11/11/2019 12:50 PM, Richard Damon wrote:
>>> Writing 20 UTF-32 characters may ALSO print less than 20 glyphs to the
>>> screen.
>> Or more, depending on what you mean by "glyph". See e.g. U+FDFB (ARABIC
>> LIGATURE JALLAJALALOUHOU,
>> https://www.fileformat.info/info/unicode/char/fdfb/index.htm ) or U+FB03
>> (LATIN SMALL LIGATURE FFI,
>> https://www.fileformat.info/info/unicode/char/fb03/index.htm)
> Thanks for this, Igor. Again, UTF32 has lots of space, still. If you look
> at the representation of these two characters,
>
> ARABIC LETTER JALLAJALALOUHOU UTF-32 (hex) 0x0000FDFB (fdfb)
> LATIN SMALL LIGATURE FFI UTF-32 (hex) 0x0000FB03 (fb03)
>
> Look at their hex representations in UTF32:
> 1. 0x0000FDFB
> 2. 0x0000FB03
>
> The first 4 0's are still unused spaces. Japanese, Chinese, etc., glyphs
> have an unique UTF32 code, so, it will always work.
>
> josé

Unicode has decreed that the highest value that can be called a code point is 0x10FFFF, because going any higher would break UTF-16, so there isn't as much room as you might think. This gives us 1,114,112 possible code points. There are currently 137,994 code points assigned to characters, 66 designated as noncharacters, 2,048 reserved for the surrogates, and a number reserved for private use, leaving 836,536 currently unassigned. So we have some room to grow, but there are still a lot of archaic and unusual scripts being proposed or worked on.

-- 
Richard Damon

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
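
P.S. The arithmetic above can be checked with a short Python sketch. It derives the 0x10FFFF ceiling from what a UTF-16 surrogate pair can encode, then tallies the code space. The assigned and noncharacter counts are the figures quoted in the message. The private-use total is not stated in the message; it is an assumption computed here from the three fixed private-use ranges (U+E000..U+F8FF, plus planes 15 and 16 up to ..FFFD). The helper name `utf16_max_code_point` is purely illustrative.

```python
# Figures marked "quoted" come from the message above; the rest are derived
# from fixed ranges in the Unicode code space.
MAX_CODE_POINT = 0x10FFFF


def utf16_max_code_point() -> int:
    """Largest value a UTF-16 surrogate pair can encode."""
    high_bits = 0xDBFF - 0xD800   # 10 payload bits from the high surrogate
    low_bits = 0xDFFF - 0xDC00    # 10 payload bits from the low surrogate
    return 0x10000 + (high_bits << 10) + low_bits


total = MAX_CODE_POINT + 1                    # every value from 0 to 0x10FFFF
assigned = 137_994                            # quoted
noncharacters = 66                            # quoted
surrogates = 0xDFFF - 0xD800 + 1              # U+D800..U+DFFF
# Assumption (not in the message): BMP private-use area plus planes 15 and 16.
private_use = (0xF8FF - 0xE000 + 1) + 2 * (0xFFFD + 1)
unassigned = total - assigned - noncharacters - surrogates - private_use

print(hex(utf16_max_code_point()))
print(f"{total:,} total, {private_use:,} private use, {unassigned:,} unassigned")
```

This prints `0x10ffff` and `1,114,112 total, 137,468 private use, 836,536 unassigned`, which agrees with the numbers quoted. The leading zeros in a UTF-32 code unit such as 0x0000FDFB are therefore mostly not usable space: any value above 0x10FFFF is outside the code space, whatever the width of the encoding.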