Why is there no UTF-24?
Well, I once proposed UTF-20...
See, these MathText characters take up a lot of space. No matter how you encode them (UTF-8, UTF-16 or UTF-32), they are always 4 bytes long.
True for them alone, in those UTFs. Short of defining another Unicode encoding, there are two answers that I can offer you:
1. Such characters are expected to be a minority of the text, I suppose even in math text, because such documents contain lots of other characters (punctuation, spaces, digits, regular text) that are mostly in the BMP and thus shorter. So a whole math document with some MathText supplementary characters will use, on average, fewer than 3 bytes per code point in UTF-8 or UTF-16.
2. If you want compression, use the existing SCSU (UTR #6) or BOCU-1 (UTN #6), or a general-purpose compressor like bzip2.
Note that this is only for text interchange - the majority of Unicode-aware software uses UTF-16 internally.
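To make the two answers concrete, here is a minimal Python sketch. It assumes U+1D465 MATHEMATICAL ITALIC SMALL X as a stand-in for a MathText character, and the sample sentence is made up for illustration; real documents will have a different mix of characters, so treat the numbers it prints as indicative only.

```python
import bz2

# U+1D465 MATHEMATICAL ITALIC SMALL X: a supplementary-plane character,
# used here only as an illustrative stand-in for a "MathText" character.
MATH_X = "\U0001D465"

# Made-up sample: mostly ASCII prose with three math letters mixed in.
SAMPLE = f"Let {MATH_X} be a real number; then {MATH_X} + 1 > {MATH_X}."

# Explicit endianness, so Python does not prepend a byte order mark.
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")


def bytes_per_code_point(text: str, encoding: str) -> float:
    """Average encoded size per code point (len(str) counts code points)."""
    return len(text.encode(encoding)) / len(text)


for enc in ENCODINGS:
    alone = len(MATH_X.encode(enc))
    average = bytes_per_code_point(SAMPLE, enc)
    print(f"{enc:10} math char alone: {alone} bytes, "
          f"sample average: {average:.2f} bytes/code point")

# Answer 2 with a general-purpose compressor. The sample is repeated only so
# the input is long enough to outweigh bzip2's fixed overhead; that repetition
# makes the ratio far better than a real document would achieve.
raw = (SAMPLE * 200).encode("utf-8")
compressed = bz2.compress(raw)
print(f"bzip2: {len(raw)} bytes -> {len(compressed)} bytes")
```

The character on its own is 4 bytes in all three encodings, but in mixed text only UTF-32 stays at 4 bytes per code point; UTF-8 and UTF-16 average well below 3 for this sample.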
Best regards, markus
-- Opinions expressed here may not reflect my company's positions unless otherwise noted.