On 2012-07-02 2:45 PM, "Kurt Zeilenga" <[email protected]> wrote:
> But I wonder if the XEP needs to say something about changes from valid text to valid text which might produce invalid text in the edit? Consider, if a user replaces a single glyph in a message, is it allowed to send just the code points that changed, or is it necessary to send all the code points of each glyph that was changed?
>
> That is, consider the text "tschuss" changed to add a diaeresis over the 'u'. Using decomposed characters, that's a change of U+75 to U+75,U+308. Is it okay to send RTT which inserts U+308 instead of replacing U+75 with U+75,U+308?

Either way is allowed, though all my implementations use a "send differences only" approach, which has worked on every public XMPP server tried so far. This is already covered in the second paragraph of the rewritten Section 4.5.4.2 "Guideline for Recipients", shown below:

> Note that [[Element <t/> – Insert Text]] is allowed to contain any subset sequence of Unicode characters from the real-time message. This may result in certain situations where the text transmitted in <t/> elements is allowed to be temporarily an incorrectly-formed Unicode string (i.e. orphaned standalone combining mark, orphaned direction-change character for bidi Unicode, etc.) but becomes correct when inserted into the middle of the recipient's real-time message, and passes recipient validation/normalization with no character modifications. Note that a compliant XML processor does not modify or fix Unicode errors caused by taking only a subset of characters from correctly-formed Unicode text. One alternative way for implementers to visualize this is to visualize the Unicode text as an array of individual code points, and treat the p and n values accordingly.

A minor edit to clarify this for multiple characters forming one glyph would be to add "incompletely formed glyphs" to the list in the parentheses. Would that make sense?

Thanks,
Mark Rejhon
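
To make the "array of individual code points" view concrete, here is a minimal Python sketch of the "tschuss" example. It is only an illustration, not code from any of the implementations mentioned: `insert_text` and `erase_text` are hypothetical helpers that loosely model the insert and erase actions, assuming `p` is a 0-based code point index and that an erase removes the `n` code points ending at `p`.

```python
import unicodedata


def insert_text(message: str, text: str, p: int) -> str:
    """Hypothetical helper: insert `text` at code point index `p`."""
    code_points = list(message)  # a Python str is indexed by code point
    code_points[p:p] = list(text)
    return "".join(code_points)


def erase_text(message: str, p: int, n: int) -> str:
    """Hypothetical helper: remove the `n` code points ending at index `p`."""
    code_points = list(message)
    del code_points[p - n:p]
    return "".join(code_points)


original = "tschuss"
diaeresis = "\u0308"  # COMBINING DIAERESIS

# On its own the fragment is an orphaned combining mark
# (a non-zero canonical combining class marks it as combining).
is_orphan_mark = unicodedata.combining(diaeresis) != 0

# Strategy 1: send only the difference, i.e. insert U+0308 after the 'u'.
via_insert = insert_text(original, diaeresis, 5)

# Strategy 2: erase the 'u', then insert the whole glyph 'u' + U+0308.
via_replace = insert_text(erase_text(original, 5, 1), "u" + diaeresis, 4)

# Both strategies leave the recipient with the same code point sequence.
same_result = via_insert == via_replace
```

Under these assumptions both strategies produce the same eight code points, so the fragment being malformed in isolation does not affect the recipient's final message.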
