Some brief and not complete answers follow.

> I'm trying to get a grasp on exactly how many planes
> are defined in Unicode
> [...]
> How many planes are defined in Unicode 3.1?

There are 17 planes, and everything will be re-written to reflect that, eventually. Most of the planes are empty (except for the non-characters). And two of the planes are full of user-defined private-use characters. The ISO 10646 standard is being revised (or has been already?) so that Unicode and 10646 all agree on 17 planes. Appendix C of Unicode 3.0 talks about planes.

> BTW, it doesn't make sense for every code position
> ending in FFFF or FFFE to be a non character.

Perhaps it doesn't. But as I have said before, in other places: "If everything made sense, we wouldn't need surrealism to explain it."

> 32 non character
> code values in the arabic presentation form block

Which are those? Can you point to precise codepoint values?

> Why isn't the same rule applied to the hidden non
> characters, so that every code value ending in FDD0 to
> FDEF is also a non character? Is it to contribute to
> their hidden nature?

I don't understand this. What is special about those codepoints?
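
To make the arithmetic in this thread concrete, here is a rough Python sketch. It assumes 17 planes of 65,536 code positions each, treats the last two positions of every plane (those ending in FFFE or FFFF) as non-characters, and takes the FDD0 to FDEF range from the quoted question as plane-0-only positions. The names plane_of and is_noncharacter are made up for this illustration and do not come from any library.

```python
# Sketch only. The constants are assumptions taken from this thread,
# not values checked against the text of the standard.
PLANE_COUNT = 17        # planes 0 through 16
PLANE_SIZE = 0x10000    # 65,536 code positions per plane
LAST_CODE_POINT = PLANE_COUNT * PLANE_SIZE - 1   # 0x10FFFF


def plane_of(code_point):
    """Return the plane (0..16) that a code position falls in."""
    if not 0 <= code_point <= LAST_CODE_POINT:
        raise ValueError("outside the 17 planes: %#x" % code_point)
    return code_point // PLANE_SIZE


def is_noncharacter(code_point):
    """True for xxFFFE/xxFFFF in any plane, or U+FDD0..U+FDEF in plane 0."""
    plane_of(code_point)  # rejects values outside the 17 planes
    ends_plane = (code_point % PLANE_SIZE) >= 0xFFFE
    in_fdd0_block = 0xFDD0 <= code_point <= 0xFDEF
    return ends_plane or in_fdd0_block


# Enumerate both groups rather than scanning all 1,114,112 positions.
plane_enders = [p * PLANE_SIZE + low
                for p in range(PLANE_COUNT)
                for low in (0xFFFE, 0xFFFF)]
fdd0_block = list(range(0xFDD0, 0xFDEF + 1))
noncharacters = plane_enders + fdd0_block

print("last code position: U+%04X" % LAST_CODE_POINT)
print("plane-ending noncharacters:", len(plane_enders))
print("FDD0..FDEF noncharacters:", len(fdd0_block))
print("U+1FDD0 a noncharacter?", is_noncharacter(0x1FDD0))
```

Under these assumptions the plane-ending rule covers two positions in each of the 17 planes, and the FDD0 to FDEF range adds 32 more in plane 0 only, so a value such as 1FDD0 is not caught by either test.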