Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-18 Thread Dave Cridland
On Wed, 9 Dec 2020 at 19:21, Sam Whited wrote: > I believe this is a mischaracterization of my argument. My argument is > "everything will have a way to get at the underlying bytes, not > everything will have them pre-converted into code points". I think this, in particular, is not correct.

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-09 Thread Tedd Sterr
XML is a sequence of characters (not bytes.) References mark a portion of displayed text which is rendered as a sequence of characters (not bytes.) So it makes perfect sense to define references in terms of bytes.

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-09 Thread Sam Whited
I believe this is a mischaracterization of my argument. My argument is "everything will have a way to get at the underlying bytes, not everything will have them pre-converted into code points". Also "this gives us the option to do certain optimizations on systems that support them, but using code

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-09 Thread Tedd Sterr
>> The decoding _should_ be done upfront - that's how you get a valid XML >> document. > I don't think this is true. XML is defined as UTF-8 (in this case), > which is a collection of bytes. They don't have to be separated out and > transformed into some higher representation of code points.

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-09 Thread Sam Whited
I don't think this is true. XML is defined as UTF-8 (in this case), which is a collection of bytes. They don't have to be separated out and transformed into some higher representation of code points. Just because Python et al. convert things into UTF-32 strings first doesn't mean everything has

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-09 Thread Jonas Schäfer
For the record: On Tuesday, 8 December 2020 23:13:08 CET Sam Whited wrote: > I don't understand how this is part of the XML data model. Do you mean > that only Unicode encodings are supported by XML? If so, that's fair and > removes one of my arguments, I did not know that was the case.

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-09 Thread Tedd Sterr
Sam, your argument appears to be "I want to handle everything as bytes without doing any string decoding, so any other option would be more effort (less efficient) for me." XML is defined as a sequence of characters, not bytes - those characters subsequently need to be transformed into bytes

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-09 Thread Sam Whited
To try and show why I'm pushing back on this so hard here is an example of doing this three different ways: one assuming the references are bytes, two assuming the references are code points. https://play.golang.org/p/kKbr2hXd56U The third one I was forgetting I can do, and it looks quite nice
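
For readers without access to the linked playground, the difference between the two interpretations can be sketched in a few lines of Python. This is not the code from the link above; the function names and the sample body are made up for illustration, and Python is used only because its str type indexes by code point.

```python
def slice_by_bytes(body: str, begin: int, end: int) -> str:
    """Treat begin/end as offsets into the UTF-8 encoding of body."""
    raw = body.encode("utf-8")
    return raw[begin:end].decode("utf-8")


def slice_by_code_points(body: str, begin: int, end: int) -> str:
    """Treat begin/end as code point offsets (Python str indexes by code point)."""
    return body[begin:end]


def byte_offset_to_code_point_offset(body: str, byte_offset: int) -> int:
    """Convert a UTF-8 byte offset into a code point offset.

    Raises UnicodeDecodeError if the offset lands inside a multi-byte sequence.
    """
    return len(body.encode("utf-8")[:byte_offset].decode("utf-8"))


# Hypothetical message body: "héllo wörld" (11 code points, 13 UTF-8 bytes).
body = "h\u00e9llo w\u00f6rld"

# The same word needs different offsets under each convention.
print(slice_by_bytes(body, 7, 13))                 # wörld
print(slice_by_code_points(body, 6, 11))           # wörld
print(byte_offset_to_code_point_offset(body, 7))   # 6
```

Either convention can be converted to the other when the full body is available; the disagreement in the thread is about which one should appear on the wire.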

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-09 Thread Marvin W
Hi, On 09.12.20 08:59, Florian Schmaus wrote: > But the recipient would be able to apply the same rules regarding > localization as the sender when counting grapheme clusters. Which rules? Unicode does not provide a locale specific grapheme clustering algorithm, TR29 only mentions that those

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-09 Thread Florian Schmaus
On 12/7/20 11:34 PM, Marvin W wrote: > On 07.12.20 19:34, Florian Schmaus wrote: We do have xml:lang, don't we? Unfortunately, it doesn't help in all cases. It's perfectly fine to write a message with xml:lang="en": "chlapec" is "boy" in slowak This is 27 grapheme clusters, but I guess most

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-08 Thread Kevin Smith
On 8 Dec 2020, at 22:13, Sam Whited wrote: > I still think the data on the wire should describe the other data on the > wire, not some higher- level "decoded” representation Agree 100%. References et al. need to calculate how the data are going to be encoded on the wire, not some high level

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-08 Thread Sam Whited
The XML library I use does not give me a string or slice of code points, it gives me a slice of bytes because that's the level I'm operating at. Even at the higher level if I decode the bytes into a string (A Go string in this case), that is still just a slice of UTF-8 bytes (it does not decode

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-08 Thread Jonas Schäfer
On Friday, 4 December 2020 21:33:38 CET Sam Whited wrote: > On Fri, Dec 4, 2020, at 20:23, Florian Schmaus wrote: > > I begin to feel that a lot of your rationale is based on the idea that > > you always (/often?) have access to the raw UTF-8 bytes as they > > appeared on the wire. > > Yes,

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-07 Thread Marvin W
Hi, On 07.12.20 19:34, Florian Schmaus wrote: > We do have xml:lang, don't we? Unfortunately, it doesn't help in all cases. It's perfectly fine to write a message with xml:lang="en": > "chlapec" is "boy" in slowak This is 27 grapheme clusters, but I guess most western people would count it as
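
A small Python sketch of the two counts for the example sentence. It assumes the figure of 27 comes from a Slovak tailoring that treats the digraph "ch" as a single grapheme cluster; the snippet above is cut off before it says so, which makes that reading an assumption. The helper count_with_ch_digraph is hypothetical and is not a real grapheme segmentation algorithm.

```python
# The example message body, quoted verbatim from the mail above.
text = '"chlapec" is "boy" in slowak'


def count_code_points(s: str) -> int:
    """Python's len() on str counts Unicode code points."""
    return len(s)


def count_with_ch_digraph(s: str) -> int:
    """Toy tailoring: every "ch" pair counts as one unit instead of two."""
    return len(s) - s.count("ch")


print(count_code_points(text))      # 28
print(count_with_ch_digraph(text))  # 27
```

The point of the example is that xml:lang="en" on the message would not tell a recipient to apply the tailored rule, so sender and recipient could disagree by one.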

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-07 Thread Florian Schmaus
Hi Marvin :) On 12/7/20 4:22 PM, Marvin W wrote: On 04.12.20 21:23, Florian Schmaus wrote: And I am in favor of code points because it allows us to aim for the extended grapheme cluster algorithm, while also allowing for the "simply count code points" fallback. XEP-0426 already discusses

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-07 Thread Marvin W
Hi, On 04.12.20 21:23, Florian Schmaus wrote: > And I am in favor of code points because it allows us to aim for the > extended grapheme cluster algorithm, while also allowing for the > "simply count code points" fallback. XEP-0426 already discusses why it's using codepoints instead of

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-04 Thread Sam Whited
On Fri, Dec 4, 2020, at 20:53, Florian Schmaus wrote: > If you count the bytes of the UTF-8 encoded representation, then there > is no way to have any fallback (as the indexes would be wrong). Maybe I don't understand the fallback you're proposing. I do understand your example, and assert that

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-04 Thread Florian Schmaus
On 12/4/20 9:33 PM, Sam Whited wrote: And I am in favor of code points because it allows us to aim for the extended grapheme cluster algorithm, while also allowing for the "simply count code points" fallback. If you do bytes you could also easily convert to codepoints and then to grapheme

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-04 Thread Sam Whited
On Fri, Dec 4, 2020, at 20:23, Florian Schmaus wrote: > I begin to feel that a lot of your rationale is based on the idea that > you always (/often?) have access to the raw UTF-8 bytes as they > appeared on the wire. Yes, most of it is. > While it is probably true for languages where the

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-04 Thread Florian Schmaus
On 12/4/20 8:25 PM, Sam Whited wrote: On Fri, Dec 4, 2020, at 19:00, Florian Schmaus wrote: My problem with your proposal is that it uses bytes. I don't get why you want to use bytes here. Naturally. Likewise my problem with your proposal is that it uses code points and I don't get why you'd

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-04 Thread Sam Whited
On Fri, Dec 4, 2020, at 19:00, Florian Schmaus wrote: > Often you don't get raw bytes from your XML parser, but an instance of > your programming language's native String type. But often your > programming language provides an API to encode that String to UTF-8 > encoded bytes, which *should*

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-04 Thread Florian Schmaus
On 12/4/20 7:29 PM, Sam Whited wrote: I don't understand this, if you get out bytes why would they be different to what was in the stream? Often you don't get raw bytes from your XML parser, but an instance of your programming language's native String type. But often your programming
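
To make the "native String type" point concrete, here is a Python sketch that measures one string in three units. The sample string is invented for illustration. The UTF-16 figure corresponds to what a length call would report in languages whose native strings are UTF-16 code units.

```python
def lengths(s: str) -> dict:
    """Return the length of s in code points, UTF-8 bytes and UTF-16 code units."""
    return {
        "code_points": len(s),
        "utf8_bytes": len(s.encode("utf-8")),
        # utf-16-le avoids the byte order mark; each code unit is 2 bytes.
        "utf16_code_units": len(s.encode("utf-16-le")) // 2,
    }


# "a", "é" (U+00E9) and an emoji outside the Basic Multilingual Plane (U+1F600).
sample = "a\u00e9\U0001F600"

print(lengths(sample))
# {'code_points': 3, 'utf8_bytes': 7, 'utf16_code_units': 4}
```

Whichever unit is chosen for the wire format, implementations whose native representation differs have to do a conversion like this before indexing.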

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-04 Thread Sam Whited
I don't understand this, if you get out bytes why would they be different to what was in the stream? If you get a string in a language that assumes strings have some specific format (ie. are valid UTF-8 or UTF-16 or something) it makes sense that they might have had to be different, but would

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-04 Thread Kevin Smith
On 4 Dec 2020, at 16:41, Sam Whited wrote: > Bytes are the only way to not make assumptions about the libraries, > languages, etc. being used. Except that bytes are making significant assumptions about the libraries and languages being used. It’s assuming that what you get out of your parser

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-04 Thread Sam Whited
On Fri, Dec 4, 2020, at 16:10, Florian Schmaus wrote: > XMPP uses Unicode because XML, upon which XMPP is built, uses Unicode, > hence I doubt that you will ever find an API where e.g. > Message.getBody() will return data that is not Unicode encoded, but > uses some other encoding scheme. Wasn't

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-04 Thread Florian Schmaus
On 12/4/20 4:01 PM, Sam Whited wrote: On Fri, Dec 4, 2020, at 14:50, Florian Schmaus wrote: But this String will be represented in your programming language's native String representation, which may or may not match the bytes on the wire. That's the point, we can't guarantee what the

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-04 Thread Sam Whited
On Fri, Dec 4, 2020, at 14:50, Florian Schmaus wrote: > But this String will be represented in your programming language's > native String representation, which may or may not match the bytes on > the wire. That's the point, we can't guarantee what the representation is. It might be something

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-04 Thread Florian Schmaus
On 12/4/20 3:27 PM, Sam Whited wrote: FWIW I was a big proponent of doing it this way too, but I've changed my mind after seeing too many grapheme segmentation implementations be broken in small, different, ways. My new position is that we have to just count bytes and figure out a sane behavior

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-04 Thread Tedd Sterr
> FWIW I was a big proponent of doing it this way too, but I've changed my > mind after seeing too many grapheme segmentation implementations be > broken in small, different, ways. My new position is that we have to > just count bytes and figure out a sane behavior in case someone sends us > an

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-04 Thread Sam Whited
FWIW I was a big proponent of doing it this way too, but I've changed my mind after seeing too many grapheme segmentation implementations be broken in small, different, ways. My new position is that we have to just count bytes and figure out a sane behavior in case someone sends us an invalid

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-04 Thread Florian Schmaus
On 12/4/20 3:03 PM, Andrew Nenakhov wrote: Upping a year-old email thread for Florian. Thanks, but I am well aware of the thread and the situation. I think this below mixes aspects of the XML layer with the Unicode layer, which do not have to get mixed when counting "characters". Ultimately

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2020-12-04 Thread Andrew Nenakhov
Upping a year-old email thread for Florian. On Wed, 18 Dec 2019 at 20:41, Marvin W wrote: > > [inline] > > On 12/18/19 3:22 PM, Andrew Nenakhov wrote: > > In the end we have settled for counting characters of escaped string, so > > This sounds like a terrible idea. In encoded XML, ">", "", "" > and

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-21 Thread Jonas Schäfer
On Wednesday, 18 December 2019 17:27:04 CET Jonas Schäfer wrote: > On Wednesday, 18 December 2019 16:40:42 CET Marvin W wrote: > > [inline] > > > > On 12/18/19 3:22 PM, Andrew Nenakhov wrote: > > > In the end we have settled for counting characters of escaped string, so > > > > This sounds like

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-21 Thread Jonas Schäfer
On Wednesday, 18 December 2019 16:40:42 CET Marvin W wrote: > [inline] > > On 12/18/19 3:22 PM, Andrew Nenakhov wrote: > > In the end we have settled for counting characters of escaped string, so > > This sounds like a terrible idea. In encoded XML, ">", "", "" > and "]]>" are equivalent. I just

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-21 Thread Andrew Nenakhov
*counting symbols differently. On Sat, 21 Dec 2019 at 17:22, Andrew Nenakhov < andrew.nenak...@redsolution.com> wrote: > On Sat, 21 Dec 2019 at 17:12, Ralph Meijer wrote: > >> So, having unescaped > is valid for case 2, and serializers may choose to >> do so. >> > > Okay, whatever. We are already counting

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-21 Thread Andrew Nenakhov
On Sat, 21 Dec 2019 at 17:12, Ralph Meijer wrote: > So, having unescaped > is valid for case 2, and serializers may choose to > do so. > Okay, whatever. We are already counting messages and escaping symbols all the time (cause servers to escape them anyway). It's far from being the first thing we do

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-21 Thread Ralph Meijer
On December 21, 2019 12:32:03 PM GMT+01:00, Andrew Nenakhov wrote: >On Sat, 21 Dec 2019 at 16:21, Ralph Meijer wrote: > >> Just making sure everyone has the same interpretation: >> >> Case 1) The text has the sequence ]]>. In this case, in XML the > >MUST be >> escaped (with , or equivalent character

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-21 Thread Florian Schmaus
On 21.12.19 12:32, Andrew Nenakhov wrote: > > > On Sat, 21 Dec 2019 at 16:21, Ralph Meijer wrote: > > Just making sure everyone has the same interpretation: > > Case 1) The text has the sequence ]]>. In this case, in XML the > > MUST be escaped (with , or

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-21 Thread Andrew Nenakhov
On Sat, 21 Dec 2019 at 16:21, Ralph Meijer wrote: > Just making sure everyone has the same interpretation: > > Case 1) The text has the sequence ]]>. In this case, in XML the > MUST be > escaped (with , or equivalent character reference). > Case 2) All occurrences of > not preceded by ]]. Here > MAY

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-21 Thread Ralph Meijer
On December 21, 2019 11:57:19 AM GMT+01:00, Andrew Nenakhov wrote: >On Sat, 21 Dec 2019 at 15:45, Philipp Hörist wrote: > >> >> I think you misunderstood the RFC, it's not a violation to send ">" >> unescaped. >> >> > The right angle bracket (>) *may *be represented using the string " > >> ", and

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-21 Thread Andrew Nenakhov
On Sat, 21 Dec 2019 at 15:45, Philipp Hörist wrote: > > I think you misunderstood the RFC, it's not a violation to send ">" > unescaped. > > > The right angle bracket (>) *may *be represented using the string " > ", and *MUST*, for compatibility > , be escaped

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-21 Thread Philipp Hörist
On Sat, 21 Dec 2019 at 11:39, Andrew Nenakhov < andrew.nenak...@redsolution.com> wrote: > > We assumed as much but weren't sure. Anyway, Marvin had sent a malformed > stanza, which was corrected (escaped) by the server. Next, a client that > counted characters in a different way than he

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-21 Thread Andrew Nenakhov
On Fri, 20 Dec 2019 at 20:34, Dave Cridland wrote: > > I think we've just conclusively proven it does get changed during sending. > We certainly cannot rely on it not being changed, since absolutely nothing > in XML or XMPP prevents it being changed. > If you form the stanza according to the standard

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-21 Thread Andrew Nenakhov
On Sat, 21 Dec 2019 at 14:53, Florian Schmaus wrote: > On 21.12.19 10:50, Andrew Nenakhov wrote: > > > > > > On Fri, 20 Dec 2019 at 19:25, Marvin W wrote: > > > > On 12/20/19 1:15 PM, Andrew Nenakhov wrote: > > > You have sent a string '>', which was escaped to > > >

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-21 Thread Ralph Meijer
On December 21, 2019 10:57:02 AM GMT+01:00, Florian Schmaus wrote: >On 18.12.19 16:00, Marvin W wrote: >> It's indeed a good question if anything in XMPP allows servers or >> in-between entities to do normalization. I was under the assumption >that >> servers do not change the codepoints. In XML

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-21 Thread Florian Schmaus
On 18.12.19 16:00, Marvin W wrote: > It's indeed a good question if anything in XMPP allows servers or > in-between entities to do normalization. I was under the assumption that > servers do not change the codepoints. In XML [1] Characters with > multiple possible representations in ISO/IEC 10646

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-21 Thread Florian Schmaus
On 21.12.19 10:50, Andrew Nenakhov wrote: > > > On Fri, 20 Dec 2019 at 19:25, Marvin W wrote: > > On 12/20/19 1:15 PM, Andrew Nenakhov wrote: > > You have sent a string '>', which was escaped to > > '' before sending to the server. > > I have sent ">"

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-21 Thread Andrew Nenakhov
On Fri, 20 Dec 2019 at 19:25, Marvin W wrote: > On 12/20/19 1:15 PM, Andrew Nenakhov wrote: > > You have sent a string '>', which was escaped to > > '' before sending to the server. > > I have sent ">" verbatim (exactly the stanza I send you in the last > mail was what went (TLS encrypted) to

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-21 Thread Florian Schmaus
On 18.12.19 15:22, Andrew Nenakhov wrote: > We're totally onboard with this XEP, and it is, in fact, the way we > already do count characters for references in all versions of Xabber. > > However, there is one important case not addressed in this XEP: XML > predefined entities. As others have

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-20 Thread Tedd Sterr
A few thoughts… If we consider character-range indices as referring to the points between characters, not the positions of the characters themselves, then there's no confusion over whether a character should be included - a character is either inside the range or outside of it. XML is a
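
The "points between characters" model is the same half-open convention many languages use for slicing. A minimal Python sketch, with a hypothetical helper name and a made-up body, and with offsets taken to be code points:

```python
from typing import Tuple


def apply_range(body: str, begin: int, end: int) -> Tuple[str, str, str]:
    """Split body at two boundary points into (before, inside, after).

    begin and end name positions between characters, so every character is
    unambiguously inside or outside the range, and begin == end is an empty
    range at an insertion point.
    """
    if not 0 <= begin <= end <= len(body):
        raise ValueError("range outside body")
    return body[:begin], body[begin:end], body[end:]


body = "Hello, world"
print(apply_range(body, 7, 12))  # ('Hello, ', 'world', '')
print(apply_range(body, 3, 3))   # ('Hel', '', 'lo, world')
```

With this convention the length of a range is simply end minus begin, and adjacent ranges share a boundary without overlapping.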

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-20 Thread Dave Cridland
On Fri, 20 Dec 2019 at 14:43, Andrew Nenakhov < andrew.nenak...@redsolution.com> wrote: > > > On Fri, 20 Dec 2019 at 17:53, Dave Cridland wrote: > >> >> >> On Fri, 20 Dec 2019 at 12:15, Andrew Nenakhov < >> andrew.nenak...@redsolution.com> wrote: >> >>> You have sent a string '>', which was escaped

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-20 Thread Andrew Nenakhov
On Fri, 20 Dec 2019 at 17:53, Dave Cridland wrote: > > > On Fri, 20 Dec 2019 at 12:15, Andrew Nenakhov < > andrew.nenak...@redsolution.com> wrote: > >> You have sent a string '>', which was escaped to >> '' before sending to the server. >> > > Well, maybe. XML doesn't require you to escape '>' in

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-20 Thread Marvin W
On 12/20/19 1:15 PM, Andrew Nenakhov wrote: You have sent a string '>', which was escaped to '' before sending to the server. I have sent ">" verbatim (exactly the stanza I send you in the last mail was what went (TLS encrypted) to the server. According to XML standard "the ampersand

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-20 Thread Dave Cridland
On Fri, 20 Dec 2019 at 12:15, Andrew Nenakhov < andrew.nenak...@redsolution.com> wrote: > You have sent a string '>', which was escaped to > '' before sending to the server. > Well, maybe. XML doesn't require you to escape '>' in text, only in attribute values. Presumably, in order to

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-20 Thread Andrew Nenakhov
On Fri, 20 Dec 2019 at 00:49, Marvin W wrote: > So I tried with Xabber/xabber.org and either your server or the client > (I guess it's the server) seems to fail to properly do what you just > said it should: When sending the message > > >> > type='markup'> > type='markup'> > > > it is

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-20 Thread Ralph Meijer
On 20-12-2019 12:55, Andrew Nenakhov wrote: On Thu, 19 Dec 2019 at 19:02, Ralph Meijer wrote: If you want consistent counting on all platforms and languages, counting Unicode characters seems to be the best way forward. We do not dispute that 'counting unicode

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-20 Thread Ralph Meijer
Oops, the following should have been sent to the list. On 19-12-2019 15:02, Ralph Meijer wrote: On 19-12-2019 13:59, Andrew Nenakhov wrote: On Wed, 18 Dec 2019 at 20:12, Ralph Meijer wrote: My assumption was that we are looking at character data on the abstract layer

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-19 Thread Marvin W
On 12/19/19 1:59 PM, Andrew Nenakhov wrote: Is it really any better than escaped XML text? Yes. Any sane implementation of XML parsers would resolve references as part of the parsing, so you would have to do extra work to find out what references were in the text before. Plus, when doing
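
A quick way to see that a parser resolves references before the application sees the text, sketched with Python's standard xml.etree.ElementTree. The sample bodies are invented for illustration; they spell the same character data five different ways on the wire.

```python
import xml.etree.ElementTree as ET

# Five serializations of the same body text: a > b
variants = [
    "<body>a &gt; b</body>",           # predefined entity
    "<body>a &#62; b</body>",          # decimal character reference
    "<body>a &#x3E; b</body>",         # hexadecimal character reference
    "<body>a <![CDATA[>]]> b</body>",  # CDATA section
    "<body>a > b</body>",              # unescaped, legal in element content
]

texts = [ET.fromstring(v).text for v in variants]

for wire, parsed in zip(variants, texts):
    # The wire lengths differ; the parsed text is always 5 characters long.
    print(len(wire), repr(parsed), len(parsed))
```

After parsing, the application cannot tell which spelling was used, so offsets that count escaped characters would have to be computed before parsing or reconstructed by re-escaping.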

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-19 Thread Andrew Nenakhov
On Wed, 18 Dec 2019 at 20:12, Ralph Meijer wrote: > My assumption was that we are looking at character data on the abstract > layer /after/ parsing XML. You shouldn't see entities there (they'd be > resolved to their respective characters), nor should you see wrappers. > Hm, please, define 'abstract'

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-18 Thread Marvin W
On 12/18/19 5:00 PM, Ralph Meijer wrote: I'd not be opposed to changing the definition of 'end' here. Twitter Entities [1] also points to the character after. I don't think it really is a "change", in XEP-394 it is already defined this way ("the last affected codepoint is the one just before

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-18 Thread Ralph Meijer
On 18-12-2019 16:40, Marvin W wrote: [..] Also that's a weird counting there, usually I would expect end to point to the position after the last referenced character - at least that's what you do in most programming languages (e.g. ""[0:14] will give you "" without the last ";"). I'd not

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-18 Thread Marvin W
[inline] On 12/18/19 3:22 PM, Andrew Nenakhov wrote: In the end we have settled for counting characters of escaped string, so This sounds like a terrible idea. In encoded XML, ">", "", "" and "]]>" are equivalent. I just tried it out and servers indeed do convert all of those to their

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-18 Thread Ralph Meijer
My assumption was that we are looking at character data on the abstract layer /after/ parsing XML. You shouldn't see entities there (they'd be resolved to their respective characters), nor should you see

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-18 Thread Marvin W
It's indeed a good question if anything in XMPP allows servers or in-between entities to do normalization. I was under the assumption that servers do not change the codepoints. In XML [1] Characters with multiple possible representations in ISO/IEC 10646 (e.g. characters with both precomposed

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-18 Thread Andrew Nenakhov
We're totally onboard with this XEP, and it is, in fact, the way we already do count characters for references in all versions of Xabber. However, there is one important case not addressed in this XEP: XML predefined entities. Symbols that are to be escaped, as listed in

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-18 Thread Philipp Hörist
On Wed, 18 Dec 2019 at 12:02, Florian Schmaus wrote: > I do like to point out that it is probably not really XMPP specific > (similar to XEP-0392: Consistent Color Generation), but I don't see a > reason why this shouldn't get XEP'ed up. > > I don't see the similarities, one is a pure UI

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-18 Thread Florian Schmaus
On 12/17/19 12:18 PM, p...@bouah.net wrote: > The XMPP Extensions Editor has received a proposal for a new XEP. > > Title: Character counting in message bodies > Abstract: > This document describes how to correctly count characters in message > bodies. This is required when referencing a position

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-17 Thread Lance Stout
An additional reference from XEP-0301 (In-Band Real Time Text) in support of this: https://xmpp.org/extensions/xep-0301.html#unicode_character_counting > On Dec 17, 2019, at 3:18 AM, p...@bouah.net wrote: > > The XMPP Extensions Editor has received a proposal for a new XEP. > > Title:

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-17 Thread Jonas Schäfer
On Tuesday, 17 December 2019 12:18:53 CET p...@bouah.net wrote: > The XMPP Extensions Editor has received a proposal for a new XEP. > > Title: Character counting in message bodies > Abstract: > This document describes how to correctly count characters in message > bodies. This is required when

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-17 Thread Kevin Smith
I don’t have feedback to give at the moment, but this is a thing we’ve needed for a long time, so a big thank you to Marvin for getting something submitted. /K > On 17 Dec 2019, at 11:18, p...@bouah.net wrote: > > The XMPP Extensions Editor has received a proposal for a new XEP. > > Title:

Re: [Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-17 Thread Guus der Kinderen
This XEP might want to add an implementation note that relates to https://xmpp.org/extensions/xep-0245.html. When XEP-0245 is used, clients often use a different representation of the message from what's in the body (eg: replacing "/me" with a nickname). This makes it very easy to make mistakes in

[Standards] Proposed XMPP Extension: Character counting in message bodies

2019-12-17 Thread pep
The XMPP Extensions Editor has received a proposal for a new XEP. Title: Character counting in message bodies Abstract: This document describes how to correctly count characters in message bodies. This is required when referencing a position in the body. URL: