Mark E. Shoulson wrote:
Well, that's the difference under discussion. The "plain text" would seem to be either the qere or the ketiv (but not the combined "blended" form), since each of those is somewhat sensible.
Is there some place in the standard where it says text must be sensible?
No. The intent is to allow the author to present his text unambiguously. If that includes deliberate misspellings or other funny business, that's between the author and the reader.
In scripts with complex layout, of course, not all random character soup would be rendered the same way by all systems. Which, I think, is the point here. If this is a fairly commonly used device, then in principle it's fair to ask why it cannot be part of plain text.
If the mechanisms needed to do this are cheap and simple, the answer is often to bring such things under the plain-text umbrella. If it's complicated, the answer should be to leave it to mechanisms such as markup that deal well with whatever kind of complexity is required.
This segues nicely into an answer to a different issue raised earlier in this thread:
Interlinear annotation characters were added to Unicode before we discovered a more general mechanism. Their main intent was never interchange but internal representation, where the special character codes serve as anchors or placeholders in the text stream while the formatting information is kept in a side buffer. Nowadays, we have 66 generic noncharacters that are the correct tool for such process-internal use.
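As a quick illustration of where those 66 code points live (the function name here is my own, not from any library): the noncharacters are the 32 code points U+FDD0..U+FDEF plus the last two code points of each of the 17 planes, and because they are guaranteed never to be assigned, a process can use them freely as internal sentinels, as long as they never leak into interchanged text.

```python
def is_noncharacter(cp: int) -> bool:
    """True if cp is one of the 66 Unicode noncharacters."""
    # 32 contiguous noncharacters in the Arabic Presentation Forms-A block:
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # The last two code points of each of the 17 planes
    # (U+FFFE/U+FFFF, U+1FFFE/U+1FFFF, ..., U+10FFFE/U+10FFFF):
    return (cp & 0xFFFE) == 0xFFFE and cp <= 0x10FFFF

# Sanity check: exactly 66 noncharacters in the whole code space.
count = sum(1 for cp in range(0x110000) if is_noncharacter(cp))
print(count)  # 66
```

The `(cp & 0xFFFE) == 0xFFFE` test simply asks whether the low 16 bits are FFFE or FFFF, i.e. whether the code point is one of the two plane-final positions.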
A./

