> Yuck, then you're saddled with endianness issues.

Good, that'll shake out the last of the big-endian systems.

> Plus null bytes can then be part of the data, so most
> charset-oblivious software breaks.

I thought breaking 8-bit-only software was a good thing.

> Not worth it, considering that 99.99% of text processing
> is either gluing strings together without looking inside,
> or processing them character-by-character.

Processing them character by character in UCS-4 is so much easier than doing it in UTF-8. So is gluing them together.

> Blindly indexing into a string without having scanned it
> previously is so rare it doesn't merit consideration.

Blindly indexing into a file without having scanned it previously is so common that you don't even remark on it happening. A file, remember, is a string.
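
To make the indexing point concrete, here is a rough Python sketch of what it takes to fetch the n-th code point from a byte buffer in each encoding. The function names and the sample string are made up for illustration. It assumes well-formed input and uses UTF-32LE as a stand-in for UCS-4 with a fixed byte order.

```python
def nth_codepoint_utf32le(buf: bytes, n: int) -> str:
    """Fixed width: code point n occupies bytes [4n, 4n + 4), so seek directly."""
    chunk = buf[4 * n : 4 * n + 4]
    if n < 0 or len(chunk) != 4:
        raise IndexError(n)
    return chunk.decode("utf-32-le")


def nth_codepoint_utf8(buf: bytes, n: int) -> str:
    """Variable width: scan from the start, counting bytes that begin a code point."""
    count = -1
    start = None
    for i, b in enumerate(buf):
        if b & 0xC0 != 0x80:  # anything but a continuation byte starts a code point
            count += 1
            if count == n:
                start = i
            elif count == n + 1:
                return buf[start:i].decode("utf-8")
    if start is None:
        raise IndexError(n)
    return buf[start:].decode("utf-8")  # n was the last code point in the buffer


text = "naïve café 日本語"        # hypothetical sample with 1-, 2- and 3-byte sequences
u8 = text.encode("utf-8")
u32 = text.encode("utf-32-le")    # explicit byte order, so no BOM is prepended

print(nth_codepoint_utf32le(u32, 11), nth_codepoint_utf8(u8, 11))
print(b"\x00" in u32, b"\x00" in u8)
```

The fixed-width version is a single slice at a computed offset, which is what a seek into a file amounts to. The UTF-8 version has to walk every byte before the target. The last line also shows the null-byte complaint from the quoted text: the UTF-32 buffer is full of zero bytes, the UTF-8 one has none.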