Re: [Standards] Proposed XMPP Extension: Message Markup

2017-11-15 Thread Evgeny Khramtsov
Tue, 07 Nov 2017 20:34:04 -
Jonas Wielicki (XSF Editor)  wrote:

> The XMPP Extensions Editor has received a proposal for a new XEP.
> 
> Title: Message Markup
> Abstract:
> This specification provides an alternative to XHTML-IM with rigid
> separation of content and markup information, improving the resilience
> against spoofing and injection attacks.
> 
> URL: https://xmpp.org/extensions/inbox/markup.html

Decent XEP. It doesn't require external parsers and is hard to screw
up. I hope the XSF advances it.
___
Standards mailing list
Info: https://mail.jabber.org/mailman/listinfo/standards
Unsubscribe: standards-unsubscr...@xmpp.org
___


Re: [Standards] Proposed XMPP Extension: Message Markup

2017-11-08 Thread Jonas Wielicki
On Tuesday, 7 November 2017 20:34:04 CET Jonas Wielicki wrote:
> URL: https://xmpp.org/extensions/inbox/markup.html

It has been brought up in xsf@ that this XEP seems contradictory, because it 
states in the Requirements:

> Textual data and markup metadata MUST be separated strictly.

But it also contains ">" (for quotations) and "*" (for bullet point lists) in the 
<body/>.

Let me explain why I think this is not a contradiction and why I think this 
still makes sense (and please also consider my most recent message in the 
Message Styling thread [1] in this context).

- These characters are not needed at all for the markup itself to work. They 
are thus not a "markup language" per se and not "markup information" which is 
in the body, thus not violating the quoted requirement.

- They are useful to help human(!) users of plain-text clients to interpret 
the message correctly. As they are not needed for parsing, their placement can 
be more ambiguous than with strictly defined markup languages like Styling. 
For example, it would be valid to have a marked up plain-text fallback like 
this:

The right term is c_a_ffeine.


(This would not be emphasised with Message Styling, as it violates the rule 
that markup "MUST start with a whitespace character or be the beginning of the 
line or the byte stream".)

- I’d like to state for the record (so that Council can take this into 
consideration when voting on this XEP, for better or worse) that I am 
considering removing the "Requirements on the contents of the <body/> MUST NOT 
be imposed." requirement and in fact defining plain-text fallbacks for each of 
the <markup/> child elements. This would make the <body/> of marked-up 
messages more consistent (while still not necessarily being parseable as 
ASCII-based markup) and allow us to specify deletion of characters/codepoints 
(e.g. the "> " for a quotation).

  The obvious example would be to define how <bquote/> uses ">" as its symbol in 
the plain-text fallback. We could also define fallbacks for the inline span 
styles. Please note that this is exactly the other way round compared to 
Styling. I sincerely think that both can meet in the middle and co-exist for the 
greater good.
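
To illustrate the first two points, here is a minimal Python sketch. It assumes that a span is addressed by code point offsets into the body, as the proposal describes, and that the range is half-open. The helper name styling_may_start_at is made up for this example and only encodes the Message Styling rule quoted above.

```python
body = "The right term is c_a_ffeine."

# Offsets of the fallback characters "_a_" in the body, in code points.
# An external span could point at this range (assumed half-open: [start, end)).
start = body.index("_a_")
end = start + len("_a_")


def styling_may_start_at(text: str, index: int) -> bool:
    """Quoted Styling rule: a directive must follow whitespace or begin the text."""
    return index == 0 or text[index - 1].isspace()


# The external range finds the characters without parsing the body at all,
# while the in-band rule refuses to treat them as markup.
print(start, end, repr(body[start:end]))
print(styling_may_start_at(body, start))
```

The underscores stay in the body purely for human readers; nothing in the sketch needs them to locate the range.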

Thoughts please?

kind regards,
Jonas


   [1]: https://mail.jabber.org/pipermail/standards/2017-November/033797.html



Re: [Standards] Proposed XMPP Extension: Message Markup

2017-11-08 Thread Goffi
On Wednesday, 8 November 2017, 10:11:52 CET Jonas Wielicki wrote:
> On Wednesday, 8 November 2017 08:29:49 CET Georg Lukas wrote:
> > * Goffi  [2017-11-08 08:17]:
> > > about the stars in the list items, it's not really nice to keep them.
> > > 
> > > It would be good to have an attribute to say which plain text characters
> > > can be safely removed without changing the meaning.
> > > For instance type="numeric" means that "^[0-9]+\)" can be removed,
> > > type="star" means that the first character must be a "*" and it can be
> > > removed.
> > 
> > That's a nice idea. We need a mechanism where characters can not be
> > removed (so we can't end up different meanings depending on client
> > capabilities), but replaced in a fashion that is directly mapped to the
> > body. Rendering a number at the beginning of an item differently, or
> > replacing a "* " with some bullet point seems like a sane (albeit
> > slightly complex) approach.
> 
> This isn’t trivial, depending on the level of safety we want against
> spoofing attacks. For example, ordered lists usually come in different
> shapes, even though *western* people usually only use arabic numerals,
> probably closely followed by alphabetic lists, I can see other locales
> using e.g. japanese numerals. At that point any simple "strip the number
> until the next dot +whitespace" rule falls apart.
> 
> More complex rules and erasing of characters in general open the door for
> possible attacks which need to be thought about (by removing critical parts
> of a message with plausibly deniable markup), which is why I omitted that
> functionality for now. It can be added later thanks to the "clients
> MUST ignore unknown elements and attributes" rule (clients which do not
> understand it will simply leave the characters in place).
> 
> I can’t think of good examples right now, but that doesn’t mean that those
> attacks aren’t there, unfortunately. It would also be nice to be able to
> specify erasing of for example "*" for emphasized text, which would give us
> nice compatibility with the Message Styling proposal.
> 
> In any case, *if* such a thing is added to the XEP, the set of characters
> which can be erased by each markup must be thought about carefully and it
> must be restricted. I fear that this might end up being easy to get wrong.
> 
> kind regards,
> Jonas


I agree this is tough and may be an extension to this XEP. I was suggesting a 
restricted set though, not trying to cover every way to present a list in text. 
If we use a markup (and declare it), then it's reasonable to restrict it to a 
single way to do a numerical list, just to be compatible with text clients. In 
the same spirit, unordered lists can be restricted solely to "*", forgetting 
about "-", "+", "o" or other markers.

For bold or italic markup, I think this may be more complicated, as the word 
can be anywhere in the sentence and the meaning could be changed (and we are 
back to putting ugly markup in the body, even if this time we can remove it 
and there is no more ambiguity, so this is more acceptable).

Anyway, let's first see if this protoXEP gets an official number.

++
Goffi


Re: [Standards] Proposed XMPP Extension: Message Markup

2017-11-08 Thread JC Brand
On Wed, Nov 08, 2017 at 10:14:43AM +0100, Jonas Wielicki wrote:
> On Wednesday, 8 November 2017 08:44:16 CET Remko Tronçon wrote:
> > Hi,
> > 
> > On 7 November 2017 at 21:34, Jonas Wielicki  wrote:
> > > The XMPP Extensions Editor has received a proposal for a new XEP.
> > 
> > Minor remark: the XEP says that spans MUST NOT overlap. Is there a reason
> > for this? I'm asking, because the systems I have seen that use external
> > markup
> > like this don't impose this. 
> 
> I think the rule can be loosened to "MUST NOT overlap each other's boundaries".

Could this be changed to "Spans MUST NOT overlap each other's boundaries but may
be nested (fully contained) within other spans"?

This supports the use case of striking through emphasized text.
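
The difference between nesting and crossing boundaries can be sketched as follows. This is only an illustration in Python: spans are assumed to be half-open (start, end) pairs counted in code points, and crosses is a hypothetical helper, not something the XEP defines.

```python
from typing import Tuple

Span = Tuple[int, int]  # assumed half-open range [start, end) in code points


def crosses(a: Span, b: Span) -> bool:
    """True if the spans overlap without one fully containing the other."""
    (first_start, first_end), (second_start, second_end) = sorted([a, b])
    # After sorting, `first` starts no later than `second`. The boundaries
    # cross only if `second` starts strictly inside `first` and ends outside it.
    return first_start < second_start < first_end < second_end


strike = (0, 20)    # e.g. a struck-through sentence
emphasis = (4, 9)   # an emphasized word inside it
print(crosses(strike, emphasis))   # nested
print(crosses((0, 5), (3, 8)))     # boundaries cross
```

Under the proposed wording, the first pair would be allowed and the second would not.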

> This will be needed if we ever allow erasing characters (see the other branch 
> of this thread) to do so accurately.
> 
> Overlapping boundaries however is tricky; XHTML and the big toolkits (Gtk and 
> Qt) don’t support that either (you have to merge the styles manually in the 
> overlapping region, if I’m not mistaken). So I’d like to have the sender fix 
> that up, because they’ll have to anyways.

JC


Re: [Standards] Proposed XMPP Extension: Message Markup

2017-11-08 Thread Dave Cridland
On 8 November 2017 at 09:14, Jonas Wielicki  wrote:
> On Wednesday, 8 November 2017 08:44:16 CET Remko Tronçon wrote:
>> Hi,
>>
>> On 7 November 2017 at 21:34, Jonas Wielicki  wrote:
>> > The XMPP Extensions Editor has received a proposal for a new XEP.
>>
>> Minor remark: the XEP says that spans MUST NOT overlap. Is there a reason
>> for this? I'm asking, because the systems I have seen that use external
>> markup
>> like this don't impose this.
>
> I think the rule can be loosened to "MUST NOT overlap each other's boundaries".
>
> This will be needed if we ever allow erasing characters (see the other branch
> of this thread) to do so accurately.
>
> Overlapping boundaries however is tricky; XHTML and the big toolkits (Gtk and
> Qt) don’t support that either (you have to merge the styles manually in the
> overlapping region, if I’m not mistaken). So I’d like to have the sender fix
> that up, because they’ll have to anyways.

What would a receiver do if spans partially overlap? Should the
receiver reject the message, ignore the offending span[s], or what?
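
One possible answer, sketched in Python purely as an assumption (nothing in this thread settles the question): keep spans in document order and ignore any span whose boundaries cross a span that was already accepted. The function name and the half-open (start, end) representation are made up for the example.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # assumed half-open range [start, end) in code points


def drop_crossing_spans(spans: List[Span]) -> List[Span]:
    """Accept spans in order; skip those that partially overlap an accepted one."""
    accepted: List[Span] = []
    for start, end in spans:
        conflict = any(
            (s < start < e < end) or (start < s < end < e)
            for s, e in accepted
        )
        if not conflict:
            accepted.append((start, end))
    return accepted


print(drop_crossing_spans([(0, 5), (3, 8), (1, 2)]))
```

Rejecting the whole message would be the stricter alternative; this variant merely loses the offending styling while the body text stays readable.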

Dave.


Re: [Standards] Proposed XMPP Extension: Message Markup

2017-11-08 Thread Jonas Wielicki
On Wednesday, 8 November 2017 08:44:16 CET Remko Tronçon wrote:
> Hi,
> 
> On 7 November 2017 at 21:34, Jonas Wielicki  wrote:
> > The XMPP Extensions Editor has received a proposal for a new XEP.
> 
> Minor remark: the XEP says that spans MUST NOT overlap. Is there a reason
> for this? I'm asking, because the systems I have seen that use external
> markup
> like this don't impose this. 

I think the rule can be loosened to "MUST NOT overlap each other's boundaries".

This will be needed if we ever allow erasing characters (see the other branch 
of this thread) to do so accurately.

Overlapping boundaries, however, are tricky; XHTML and the big toolkits (Gtk and 
Qt) don’t support that either (you have to merge the styles manually in the 
overlapping region, if I’m not mistaken). So I’d like to have the sender fix 
that up, because they’ll have to anyways.
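
As a rough sketch of what "the sender fixes it up" could look like, the Python below splits overlapping styled ranges at every boundary and merges the styles per segment. The (start, end, style) tuples, the style names and the function name are assumptions for the example, not the XEP's wire format.

```python
from typing import List, Tuple

StyledSpan = Tuple[int, int, str]          # (start, end, style), half-open range
Segment = Tuple[int, int, List[str]]       # (start, end, merged styles)


def flatten(spans: List[StyledSpan]) -> List[Segment]:
    """Turn possibly overlapping spans into non-overlapping segments."""
    # Every start or end is a point where the set of active styles may change.
    boundaries = sorted({point for s, e, _ in spans for point in (s, e)})
    segments: List[Segment] = []
    for left, right in zip(boundaries, boundaries[1:]):
        styles = sorted({style for s, e, style in spans if s <= left and right <= e})
        if styles:  # skip gaps that no span covers
            segments.append((left, right, styles))
    return segments


print(flatten([(0, 5, "emphasis"), (3, 8, "deleted")]))
```

The overlapping region (3 to 5 in the example) ends up as its own segment carrying both styles, which is the manual merge mentioned above.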

kind regards,
Jonas



Re: [Standards] Proposed XMPP Extension: Message Markup

2017-11-08 Thread Jonas Wielicki
On Wednesday, 8 November 2017 08:29:49 CET Georg Lukas wrote:
> * Goffi  [2017-11-08 08:17]:
> > about the stars in the list items, it's not really nice to keep them.
> > 
> > It would be good to have an attribute to say which plain text characters
> > can be safely removed without changing the meaning.
> > For instance type="numeric" means that "^[0-9]+\)" can be removed,
> > type="star" means that the first character must be a "*" and it can be
> > removed.
> That's a nice idea. We need a mechanism where characters can not be
> removed (so we can't end up different meanings depending on client
> capabilities), but replaced in a fashion that is directly mapped to the
> body. Rendering a number at the beginning of an item differently, or
> replacing a "* " with some bullet point seems like a sane (albeit
> slightly complex) approach.

This isn’t trivial, depending on the level of safety we want against spoofing 
attacks. For example, ordered lists come in different shapes: even though 
*western* people usually only use Arabic numerals, probably closely followed 
by alphabetic lists, I can see other locales using e.g. Japanese numerals. At 
that point any simple "strip the number until the next dot + whitespace" rule 
falls apart.
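
A small Python illustration of how quickly such a rule runs out, using the pattern quoted above. The sample items and helper name are made up; the second pattern is only there to show that widening the digit class helps with some scripts but not with others.

```python
import re

ASCII_NUMBERED = re.compile(r"^[0-9]+\)")  # the pattern quoted above
UNICODE_DIGITS = re.compile(r"^\d+\)")     # \d also matches non-ASCII decimal digits

# "1)", "a)", Arabic-Indic digit one, and the CJK numeral for one.
items = ["1) first", "a) second", "\u0661) third", "\u4e00) fourth"]


def strip_prefix(pattern: "re.Pattern[str]", text: str) -> str:
    """Remove the list marker if the pattern matches at the start of the item."""
    return pattern.sub("", text, count=1)


ascii_result = [strip_prefix(ASCII_NUMBERED, item) for item in items]
unicode_result = [strip_prefix(UNICODE_DIGITS, item) for item in items]
print(ascii_result)
print(unicode_result)
```

Only the first item is stripped by the ASCII rule. The wider rule additionally catches the Arabic-Indic digit, while the alphabetic and CJK markers survive both.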

More complex rules and erasing of characters in general open the door for 
possible attacks which need to be thought about (by removing critical parts of 
a message with plausibly deniable markup), which is why I omitted that 
functionality for now. It can be added later thanks to the "clients MUST 
ignore unknown elements and attributes" rule (clients which do not understand 
it will simply leave the characters in place).

I can’t think of good examples right now, but that doesn’t mean that those 
attacks aren’t there, unfortunately. It would also be nice to be able to 
specify erasing of for example "*" for emphasized text, which would give us 
nice compatibility with the Message Styling proposal. 

In any case, *if* such a thing is added to the XEP, the set of characters 
which can be erased by each markup must be thought about carefully and it must 
be restricted. I fear that this might end up being easy to get wrong.

kind regards,
Jonas



Re: [Standards] Proposed XMPP Extension: Message Markup

2017-11-07 Thread Remko Tronçon
Hi,

On 7 November 2017 at 21:34, Jonas Wielicki  wrote:

> The XMPP Extensions Editor has received a proposal for a new XEP.
>

Minor remark: the XEP says that spans MUST NOT overlap. Is there a reason
for this? I'm asking because the systems I have seen that use external
markup like this don't impose this. I'm not sure if this will ever matter,
but this makes adding a bit of style to a piece of text non-incremental
(you can't just add a <span/>, you need to break up existing spans).

thanks,
Remko


Re: [Standards] Proposed XMPP Extension: Message Markup

2017-11-07 Thread Georg Lukas
* Goffi  [2017-11-08 08:17]:
> about the stars in the list items, it's not really nice to keep them.
> 
> It would be good to have an attribute to say which plain text characters can 
> be safely removed without changing the meaning.
> For instance type="numeric" means that "^[0-9]+\)" can be removed,
> type="star" means that the first character must be a "*" and it can be
> removed.

That's a nice idea. We need a mechanism where characters can not be
removed (so we can't end up different meanings depending on client
capabilities), but replaced in a fashion that is directly mapped to the
body. Rendering a number at the beginning of an item differently, or
replacing a "* " with some bullet point seems like a sane (albeit
slightly complex) approach.


Georg
-- 
|| http://op-co.de ++  GCS d--(++) s: a C+++ UL+++ !P L+++ !E W+++ N  ++
|| gpg: 0x962FD2DE ||  o? K- w---() O M V? PS+ PE-- Y++ PGP+ t+ 5 R+  ||
|| Ge0rG: euIRCnet ||  X(+++) tv+ b+(++) DI+++ D- G e h- r++ y?   ||
++ IRCnet OFTC OPN ||_||




Re: [Standards] Proposed XMPP Extension: Message Markup

2017-11-07 Thread Goffi
On Tuesday, 7 November 2017, 21:34:04 CET Jonas Wielicki wrote:
> The XMPP Extensions Editor has received a proposal for a new XEP.
> 
> Title: Message Markup
> Abstract:
> This specification provides an alternative to XHTML-IM with rigid
> separation of content and markup information, improving the resilience
> against spoofing and injection attacks.
> 
> URL: https://xmpp.org/extensions/inbox/markup.html
> 
> The Council will decide in the next two weeks whether to accept this
> proposal as an official XEP.


about the stars in the list items, it's not really nice to keep them.

It would be good to have an attribute to say which plain text characters can 
be safely removed without changing the meaning.
For instance type="numeric" means that "^[0-9]+\)" can be removed, and type="star" 
means that the first character must be a "*" and it can be removed.
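
A minimal sketch of that idea in Python. The type names and the numeric pattern are the ones suggested above; the rule table, the function name and the choice to leave unknown types untouched are assumptions made for the example.

```python
import re

# One removable prefix per declared list type (patterns as suggested above).
STRIP_RULES = {
    "numeric": re.compile(r"^[0-9]+\)"),
    "star": re.compile(r"^\*"),
}


def strip_marker(item_text: str, list_type: str) -> str:
    """Remove the declared marker from a list item, or return it unchanged."""
    rule = STRIP_RULES.get(list_type)
    if rule is None:
        return item_text  # unknown type: leave the characters in place
    match = rule.match(item_text)
    if match is None:
        return item_text  # declared marker is not there: remove nothing
    return item_text[match.end():]


print(repr(strip_marker("1) milk", "numeric")))
print(repr(strip_marker("* eggs", "star")))
print(repr(strip_marker("- bread", "star")))
```

Note that the space after the marker stays in place, because only the declared characters are removed.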

++
Goffi


Re: [Standards] Proposed XMPP Extension: Message Markup

2017-11-07 Thread Goffi
On Tuesday, 7 November 2017, 22:41:21 CET Marvin Gülker wrote:

> §9 on security: one issue that comes to my mind is specifying
> out-of-range values for the "start" and "end" attributes by a malicious
> client.

Or a start without an end (or an end without a start): if a client replaces them with 
HTML tags without checking, it could lead to an open tag without the corresponding 
closing one.
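
One way to avoid that class of bug, sketched in Python under the assumption of a single span with a half-open range: slice the body first, escape each piece, and emit the opening and closing tag together. The function name and the fallback of rendering plain escaped text are choices made for this example.

```python
from html import escape
from typing import Optional


def render_single_span(body: str, start: Optional[int], end: Optional[int],
                       tag: str = "em") -> str:
    """Render one span as HTML; fall back to plain escaped text if it is unusable."""
    if start is None or end is None or not (0 <= start <= end <= len(body)):
        return escape(body)
    before, inner, after = body[:start], body[start:end], body[end:]
    # Both tags are written in one place, so they can never become unbalanced.
    return f"{escape(before)}<{tag}>{escape(inner)}</{tag}>{escape(after)}"


print(render_single_span("a <b> c", 2, 5))
print(render_single_span("hi", 1, None))
```

Because the body is escaped piece by piece, text that merely looks like a tag is rendered as text.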

++
Goffi


Re: [Standards] Proposed XMPP Extension: Message Markup

2017-11-07 Thread Marvin Gülker
This is an interesting approach. Specifying the markup completely
externally would not have occurred to me, but is a really cool idea.

Some notes:

§4.1 has:

> The start and end attributes define the range at which the span is
> applied. They are in units of unicode code points in the character
> data if the body element. 

There's a typo at the end; it should be "of the body element", not "if
the body element".

§4.3 copes with lists, but does not have an example for nested lists,
which are not unusual I think. There's nesting already allowed in §4.4
for blockquotes, so there should be nesting for lists as well.

Do we need ordered lists?

§9 on security: one issue that comes to my mind is specifying
out-of-range values for the "start" and "end" attributes by a malicious
client.
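
A receiving client could guard against this with a check along these lines. This is a Python sketch only: it assumes the attributes arrive as strings (or are missing), that the range is half-open, and that an invalid span is simply ignored; the function name is hypothetical.

```python
import re
from typing import Optional, Tuple

DIGITS = re.compile(r"[0-9]+")


def validate_range(body: str, start: Optional[str],
                   end: Optional[str]) -> Optional[Tuple[int, int]]:
    """Return (start, end) as integers if the span is usable, otherwise None."""
    if start is None or end is None:
        return None  # a missing attribute makes the span unusable
    if not DIGITS.fullmatch(start) or not DIGITS.fullmatch(end):
        return None  # rejects signs, whitespace and other non-digit input
    s, e = int(start), int(end)
    # len() counts code points in Python 3, matching the unit used in §4.1.
    if not (0 <= s <= e <= len(body)):
        return None
    return s, e


print(validate_range("hello", "0", "5"))
print(validate_range("hello", "3", "99"))
```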

Marvin

-- 
Blog: https://www.guelkerdev.de
PGP/GPG ID: F1D8799FBCC8BC4F


Re: [Standards] Proposed XMPP Extension: Message Markup

2017-11-07 Thread Peter Saint-Andre
On 11/7/17 1:48 PM, Goffi wrote:
> On Tuesday, 7 November 2017, 21:34:04 CET Jonas Wielicki wrote:
>> The XMPP Extensions Editor has received a proposal for a new XEP.
>>
>> Title: Message Markup
>> Abstract:
>> This specification provides an alternative to XHTML-IM with rigid
>> separation of content and markup information, improving the resilience
>> against spoofing and injection attacks.
>>
>> URL: https://xmpp.org/extensions/inbox/markup.html
>>
>> The Council will decide in the next two weeks whether to accept this
>> proposal as an official XEP.
> 
> 
> Fantastic, I really love this one, this is by far the best proposal we have had, 
> thanks for that!
> 
> I think it solves all my concerns: it's a clean separation, extensible, easy to 
> implement, not polluting the <body/>, and standardized.

Yes, this looks good to me.

Note that XEP-0301 has some text about Unicode character counting - we
could copy some of that to this spec.
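
To show why the counting unit matters, here is a short Python sketch. It assumes offsets are half-open and counted in code points, as the proposal states; the helper names are made up, and the UTF-16 count is included only to show how an implementation that indexes strings in UTF-16 code units would disagree.

```python
def code_point_slice(body: str, start: int, end: int) -> str:
    """Slice by code points; Python 3 strings are indexed this way natively."""
    return body[start:end]


def utf16_length(text: str) -> int:
    """Length in UTF-16 code units (two bytes each in UTF-16-LE)."""
    return len(text.encode("utf-16-le")) // 2


body = "I \U0001F600 XMPP"  # contains one character outside the BMP

print(len(body), utf16_length(body))
print(code_point_slice(body, 4, 8))
```

The two lengths differ by one because the emoji takes a single code point but two UTF-16 code units, so every offset after it would shift in a UTF-16 based implementation.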

> I'm all in favor of deprecating XHTML-IM if this one is accepted.

As the author of the XHTML-IM specification, I agree.

Peter






Re: [Standards] Proposed XMPP Extension: Message Markup

2017-11-07 Thread Goffi
On Tuesday, 7 November 2017, 21:34:04 CET Jonas Wielicki wrote:
> The XMPP Extensions Editor has received a proposal for a new XEP.
> 
> Title: Message Markup
> Abstract:
> This specification provides an alternative to XHTML-IM with rigid
> separation of content and markup information, improving the resilience
> against spoofing and injection attacks.
> 
> URL: https://xmpp.org/extensions/inbox/markup.html
> 
> The Council will decide in the next two weeks whether to accept this
> proposal as an official XEP.


Fantastic, I really love this one, this is by far the best proposal we have had, 
thanks for that!

I think it solves all my concerns: it's a clean separation, extensible, easy to 
implement, not polluting the <body/>, and standardized.

I'm all in favor of deprecating XHTML-IM if this one is accepted.

Great work and thanks a lot !

Goffi

P.S.: it was really worth having those debates; in the end this looks like a 
perfect technical solution.


[Standards] Proposed XMPP Extension: Message Markup

2017-11-07 Thread XSF Editor
The XMPP Extensions Editor has received a proposal for a new XEP.

Title: Message Markup
Abstract:
This specification provides an alternative to XHTML-IM with rigid
separation of content and markup information, improving the resilience
against spoofing and injection attacks.

URL: https://xmpp.org/extensions/inbox/markup.html

The Council will decide in the next two weeks whether to accept this
proposal as an official XEP.