On 12/19/19 1:59 PM, Andrew Nenakhov wrote:
Is it really any better than escaped XML text?

Yes. Any sane implementation of XML parsers would resolve references as part of the parsing, so you would have to do extra work to find out what references were in the text before.
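To illustrate the point about parsers resolving references, here is a minimal sketch using Python's standard xml.etree.ElementTree. The three serializations and the helper name body_text are made up for this example; the only claim is that a conforming parser hands the application the same text regardless of how it was escaped on the wire.

```python
import xml.etree.ElementTree as ET

# Three serializations of the same five-character body; only the escaping differs.
serializations = [
    "<body>&gt;&gt;&gt;&gt;&gt;</body>",
    "<body>&gt;>>>></body>",
    "<body>&#62;&#x3E;>>></body>",
]


def body_text(xml: str) -> str:
    """Parse a single element and return its text as the application sees it."""
    return ET.fromstring(xml).text


texts = [body_text(s) for s in serializations]

# After parsing, the information about which references were used is gone.
print(texts)
```

Recovering the original references would require working below the parser, on the raw bytes.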

Plus, in a web client this means an additional escape/unescape round trip every time something is sent or displayed, because browsers require their own escaping.

I hope that no web client uses innerHTML or similar techniques to display the message body, and instead relies on document.createTextNode(), which expects a string without references. Similarly, inputElement.value and element.textContent give you their strings without references. In general, HTML/JS do their best to abstract away from references, because why should an application developer deal with that?

Also, HTML uses a different set of predefined references than XML and has different requirements: &auml; is valid in HTML but not in XML (unless it is defined as an entity in a DTD).
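The difference between the two sets of predefined references can be checked with Python's standard library. This is only a sketch: html.unescape follows the HTML named character references, ElementTree follows XML without a DTD, and the helper names html_resolves and xml_resolves are invented for this example.

```python
import html
import xml.etree.ElementTree as ET


def html_resolves(ref: str) -> str:
    """Resolve a reference using the HTML named character references."""
    return html.unescape(ref)


def xml_resolves(ref: str):
    """Resolve a reference as an XML parser without a DTD would, or return None."""
    try:
        return ET.fromstring(f"<body>{ref}</body>").text
    except ET.ParseError:
        # Named entities other than the five predefined XML ones are undefined here.
        return None


for ref in ("&gt;", "&amp;", "&auml;"):
    print(ref, repr(html_resolves(ref)), repr(xml_resolves(ref)))
```

A client that applies HTML escaping rules to XMPP text, or the other way around, will therefore disagree with its peers on some inputs.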

Why should the standard be concerned about different server implementations converting anything?  If a server converts, for some reason, from one way of escaping XML to another, of course it should recalculate all references.

On the XML layer (which is what XMPP builds on) this "conversion" does not change anything (the texts stay the same); that's why it is perfectly valid for a server to do it. The protocol on top of XML (and thus XMPP) should not deal with references; they are resolved on the layer below. That's why it is a bad idea to assume that specific characters are represented using certain references: you can't control that (you can only assume things).

So I tried with Xabber/xabber.org and either your server or the client (I guess it's the server) seems to fail to properly do what you just said it should: When sending the message

<message type="chat">
  <body>>>>>></body>
  <reference xmlns='urn:xmpp:reference:0' begin='1' end='1' type='markup'><bold/></reference>
  <reference xmlns='urn:xmpp:reference:0' begin='3' end='3' type='markup'><bold/></reference>
</message>

it is displayed as

&gt;>>>>

with g and ; in bold.
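The observed result is what you get when offsets are applied to the escaped string rather than to the parsed text. A small sketch, assuming the inclusive reading of end that the Xabber behaviour above suggests; the helper inclusive_slice and both variable names are hypothetical.

```python
def inclusive_slice(s: str, begin: int, end: int) -> str:
    """Return s[begin..end] with an inclusive end index (assumed Xabber-style)."""
    return s[begin:end + 1]


parsed_text = ">>>>>"      # what an XML parser delivers for the body
escaped_text = "&gt;>>>>"  # the string as it was displayed, first character escaped

for begin, end in ((1, 1), (3, 3)):
    print(
        (begin, end),
        repr(inclusive_slice(parsed_text, begin, end)),
        repr(inclusive_slice(escaped_text, begin, end)),
    )
```

On the parsed text both ranges select a ">" character; on the escaped string they select "g" and ";", which matches the bold characters seen in the client.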


So far our 'non-standard' way of using references is in fact way more 'standard' than what is currently suggested by this mish-mash of different XEPs.

I guess we have different definitions of a standard. This mish-mash of different XEPs is a publicly viewable standard proposal. I am not aware of any documentation of what Xabber is doing.

Not really cool, right?

What's bad about that? I would say that having "0..0 bold" is pretty weird, because it sounds like an empty range (it starts and ends at the same point, so it must be empty).


    The second integer represents the location of the first non-URL
    character occurring after the URL *(or the end of the string if the
    URL is the last part of the Tweet text)*


I think you are misunderstanding them here. I am pretty sure "the end of the string" is *after* the last character, not the last character.

The cited example of programming languages is valid only in part. Yes, it is so in Java or Python, but not in Swift, Objective-C or Erlang. The last three use the index of the first character and a length, which is actually my favourite approach.

I don't think it really makes sense to discuss which programming language is the one that matters most, but:
- Swift has two operators "ABCDE"[2...4] = "CDE" and "ABCDE"[2..<4] = "CD"
- Objective-C substring functions require index and length
- Erlang uses 1-based indices: string:sub_string("ABCDE", 2, 4) = "BCD", which is equivalent to Python's [1:4]

Also, if you prefer the index of the first character plus a length, why not use <ref begin="2" length="2" /> then? For languages that take a length, you currently have to calculate length = end+1-begin (because you chose to have end one less than everyone else does).
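The three conventions discussed here differ only by a constant, which a short sketch makes explicit. The function names are invented for illustration and are not taken from any XEP or client.

```python
def length_from_half_open(begin: int, end: int) -> int:
    """Half-open range [begin, end): end is one past the last character."""
    return end - begin


def length_from_inclusive(begin: int, last: int) -> int:
    """Inclusive range [begin, last]: last is the index of the last character."""
    return last + 1 - begin


def inclusive_to_half_open(begin: int, last: int) -> tuple:
    """Convert an inclusive range to the half-open form used by Python slices."""
    return begin, last + 1


text = "ABCDE"

# Half-open (1, 4) and inclusive (1, 3) both select "BCD".
begin, end = inclusive_to_half_open(1, 3)
print(text[begin:end], length_from_half_open(1, 4), length_from_inclusive(1, 3))

# With the inclusive form, "0..0" is one character; with half-open it is empty.
print(length_from_inclusive(0, 0), length_from_half_open(0, 0))
```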


On Wed, 18 Dec 2019 at 21:59, Marvin W <x...@larma.de> wrote:

    I don't think it really is a "change", in XEP-394 it is already defined
    this way ("the last affected codepoint is the one just before end" [1])
    and the example in XEP-372 [2] also counts that way (char 72 is the "J"
    of "Juliet" and char 78 is the space after "Juliet"). Only the text misleadingly
    says "An end attribute is similarly used for the index of the last
    character of the reference.", so this may need a clarification.
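The counting can be checked mechanically. This sketch assumes the XEP-372 example body is the Shakespeare line below; it is reproduced from memory, so treat the exact wording as an assumption. Only the offsets 72 and 78 come from the message above.

```python
# Assumed example body; the exact wording is not taken from this thread.
body = (
    "But, soft! What light through yonder window breaks? "
    "It is the east, and Juliet is the sun."
)

begin, end = 72, 78

# Half-open reading: end is the index just after the last affected character.
print(repr(body[begin]), repr(body[end]), repr(body[begin:end]))
```

Under the half-open reading the range covers exactly the name; under an inclusive reading it would also cover the following space.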


Well. I strongly object.

Either we need to change the text in XEP-372 slightly or we have to change the examples in XEP-372 and the text and examples in XEP-394 (because both should do the same). I see you have a strong opinion on the one side for some reason.

(Btw, did anyone but us implement this XEP at all?)

Converse has an implementation of XEP-372 for mentions (the only use case that is properly defined in that XEP, IMO).

On the 'already defined' XEP-394: as we have learned from the XEP-0071 debacle, even widely implemented XEPs can be deprecated with vague reasoning, so deprecating a contradictory XEP that, to my knowledge, wasn't even implemented anywhere shouldn't be too much of an issue.

Sure, we could deprecate XEP-394, but I don't see a proper replacement for it yet. I consider what Xabber is doing to be more of a misuse of XEP-372, which according to its abstract defines a method for one XMPP stanza to provide references to another entity, such as mentioning users, HTTP resources, or other XMPP resources - not a way of putting markup everywhere. I'd rather get rid of XEP-372 (which has a lot of unclear things and pending TODOs in it) than XEP-394 (which can surely be improved).
_______________________________________________
Standards mailing list
Info: https://mail.jabber.org/mailman/listinfo/standards
Unsubscribe: standards-unsubscr...@xmpp.org
_______________________________________________
