On Fri, 8 Nov 2024 at 11:37, Markus Scherer <[email protected]> wrote:
> On Thu, Nov 7, 2024 at 3:03 PM Jim Breen via Unicode
> <[email protected]> wrote:
>>
>> On rare occasions, I need to dig into UTF-8 at the bit level. I have a
>> note pinned near my desk as an aide memoire. It has 3 lines:
>>
>> UTF-8
>> zzzzyyyyyyxxxxxx
>> 1110zzzz 10yyyyyy 10xxxxxx
>
> 11110nnn 10zzzzzz 10yyyyyy 10xxxxxx
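
For readers who want to check the two layouts quoted above, here is a minimal sketch in Python 3. The function names `utf8_bits` and `encode_manual` are hypothetical and not from this thread. The sketch covers only the 3-byte and 4-byte forms and compares the hand-packed bytes with Python's built-in encoder.

```python
def utf8_bits(ch: str) -> str:
    """Return the UTF-8 bytes of one character as space-separated bit strings."""
    return " ".join(f"{b:08b}" for b in ch.encode("utf-8"))


def encode_manual(cp: int) -> bytes:
    """Pack a code point into the 3- or 4-byte UTF-8 layout by hand."""
    if 0xD800 <= cp <= 0xDFFF:
        # Surrogate code points are not valid in UTF-8.
        raise ValueError("surrogates cannot be encoded")
    if 0x800 <= cp <= 0xFFFF:
        # 16 bits: zzzzyyyyyyxxxxxx -> 1110zzzz 10yyyyyy 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    if 0x10000 <= cp <= 0x10FFFF:
        # 21 bits: nnnzzzzzzyyyyyyxxxxxx -> 11110nnn 10zzzzzz 10yyyyyy 10xxxxxx
        return bytes([
            0xF0 | (cp >> 18),
            0x80 | ((cp >> 12) & 0x3F),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    raise ValueError("only the 3- and 4-byte forms are sketched here")


if __name__ == "__main__":
    # U+6F22 sits in the BMP (3 bytes); U+20B9F is a supplementary-plane kanji (4 bytes).
    for cp in (0x6F22, 0x20B9F):
        assert encode_manual(cp) == chr(cp).encode("utf-8")
        print(f"U+{cp:04X}", utf8_bits(chr(cp)))
```

Run as a script, it prints the bit pattern of each sample character, which can be read off directly against the note's third line and the 4-byte line added in the reply.
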
I haven't had any occasion to poke around at 21-bit Unicode codepoints.
The JIS standards contain only 303 kanji that need them, all added in
JIS X 0213, which was introduced in 2000. [As I wrote in my "A Brief
History of Japanese Character Set Standards"
(https://www.edrdg.org/~jwb/paperdir/kanjicomp.html): "the main lasting
impact of the JIS X 0213 standard will probably be the additional 303
kanji it contributed to Unicode."]

Jim

--
Jim Breen
Adjunct Snr Research Fellow, Japanese Studies Centre, Monash University
http://www.jimbreen.org/
http://nihongo.monash.edu/
