Re: [Standards] Security issues with XHTML-IM (again)

Dave Cridland Thu, 12 Oct 2017 07:58:53 -0700

On 12 October 2017 at 15:19, Sam Whited <s...@samwhited.com> wrote:
> On Thu, Oct 12, 2017, at 03:09, Dave Cridland wrote:
>> I would note that in principle, a content security policy ought to
>> prevent such attacks outright.
>>
>> But there would, probably, remain several other innovative attacks,
>> such as passing client-specific markup intended to duplicate existing
>> UI elements.
>
> Indeed. Using a restricted subset of a complicated system always
> introduces the risk that some part of that complexity will not be
> understood and will leak out, possibly causing security issues. We see
> that on the web fairly regularly.
>
> It's my belief that it's always better to use a simple, complete system
> instead of a restricted, complex system. We see the same thing with
> XMPP's use of XML: we may use a sane subset of it, but since the
> underlying libraries still handle things like proc insts and whatever
> the ampersand escape thing is called you still get attacks based on
> those every so often (even though they're forbidden in XMPP).
>
> I didn't bring this up in the original mail because it tends to get a
> bit abstract, but it's worth discussing if we move to make a
> replacement.
>


I think the problem isn't simply that it's a subset of a complex
system; it's that sanitizing HTML is difficult and error-prone, and
has repeatedly been the cause of security problems.

I appreciate it's entirely possible, but even a simplified ruleset is
something like:

1) For each child element:
  a) Discard if this is an unsupported element.
  b) Remove any unsupported attributes.
  c) For the style attribute, parse the CSS and:
    i) Remove any unsupported properties.
    ii) For properties which (might) contain a URL, ensure the URL is
of a scheme we think might be OK, although we won't tell you which
those are.
  d) For each remaining HTML attribute which (might) contain a URL,
ensure that any URL is of a scheme we think might be OK, although we
won't tell you which those are.
  e) Recurse for each child element.
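
To give a feel for how much code even that simplified ruleset implies,
here is a rough Python sketch of it. Every allowlist below (elements,
attributes, CSS properties, URL schemes) is an assumption made up for
illustration, not what XHTML-IM actually specifies, and the CSS
handling is deliberately crude: it drops any value containing url(...)
rather than parsing it. Treat it as a sketch, not a safe sanitizer.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical allowlists, for illustration only.
ALLOWED_ELEMENTS = {"body", "p", "span", "strong", "em", "a", "br", "img"}
ALLOWED_ATTRS = {"style", "href", "src", "alt"}
URL_ATTRS = {"href", "src"}
ALLOWED_CSS_PROPS = {"color", "font-weight", "font-style", "text-decoration"}
ALLOWED_SCHEMES = {"http", "https", "xmpp", "mailto"}


def url_ok(url):
    # Relative URLs have an empty scheme and are rejected as well.
    return urlparse(url.strip()).scheme.lower() in ALLOWED_SCHEMES


def clean_style(style):
    # Step c: keep only allowlisted properties, and drop URL-bearing values.
    kept = []
    for decl in style.split(";"):
        name, sep, value = decl.partition(":")
        name, value = name.strip().lower(), value.strip()
        if not sep or name not in ALLOWED_CSS_PROPS:
            continue
        if "url(" in value.lower():
            continue
        kept.append(f"{name}: {value}")
    return "; ".join(kept)


def local_name(tag):
    # ElementTree spells namespaced tags as "{namespace}name".
    return tag.rsplit("}", 1)[-1]


def drop(parent, child):
    # Discard the element but keep the text that followed it.
    idx = list(parent).index(child)
    if child.tail:
        if idx == 0:
            parent.text = (parent.text or "") + child.tail
        else:
            prev = parent[idx - 1]
            prev.tail = (prev.tail or "") + child.tail
    parent.remove(child)


def sanitize(elem):
    for child in list(elem):
        # Step a: discard unsupported elements (and their whole subtree).
        if local_name(child.tag) not in ALLOWED_ELEMENTS:
            drop(elem, child)
            continue
        for attr in list(child.attrib):
            value = child.attrib[attr]
            if attr not in ALLOWED_ATTRS:
                del child.attrib[attr]            # step b
            elif attr == "style":
                cleaned = clean_style(value)      # step c
                if cleaned:
                    child.attrib[attr] = cleaned
                else:
                    del child.attrib[attr]
            elif attr in URL_ATTRS and not url_ok(value):
                del child.attrib[attr]            # step d
        sanitize(child)                           # step e
    return elem
```

Even this much leaves open questions the ruleset does not answer, such
as which schemes are acceptable and whether a discarded element's text
content should be kept or thrown away.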

>> So overall, I think we should move rich IM formatting to Markdown and
>> call it done.
>
> Let's discuss this in a separate thread. I'd really like to try and keep
> this about deprecating XHTML-IM, which I think is an orthogonal track of
> work (unless you disagree, in which case, please voice that here!).

It's clearly not orthogonal, since simply getting rid of XHTML-IM is
not deprecating it in favour of anything else.

But several clients have supported a basic Markdown-like syntax for
emphasis for years - Gajim, for example, supports both *bold* and
/italic/ at a quick test, and I think it has done so for a long time.

Slack does fine on just a handful more items (`preformat`, for example).
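
By way of comparison, here is a rough Python sketch of what parsing
those three markers might look like on the receiving side. The
delimiter rules (a marker must not touch a word character on its outer
side, and spans do not cross lines) are my own assumptions, not how
Gajim or Slack actually behave. The output is a list of (style, text)
pairs, so no markup from the sender is ever interpreted.

```python
import re

# Assumed delimiters: *bold*, /italic/, `preformat`.
_SPAN = re.compile(
    r"`(?P<pre>[^`\n]+)`"
    r"|(?<![\w*])\*(?P<bold>[^\s*](?:[^*\n]*[^\s*])?)\*(?![\w*])"
    r"|(?<![\w/:])/(?P<italic>[^\s/](?:[^/\n]*[^\s/])?)/(?![\w/])"
)


def parse_spans(text):
    """Split plain text into (style, text) pairs."""
    spans, pos = [], 0
    for m in _SPAN.finditer(text):
        if m.start() > pos:
            spans.append(("plain", text[pos:m.start()]))
        style = m.lastgroup            # name of the group that matched
        spans.append((style, m.group(style)))
        pos = m.end()
    if pos < len(text):
        spans.append(("plain", text[pos:]))
    return spans
```

Anything that does not match stays plain text, so a path such as
/usr/bin/env or an expression such as 2*3*4 passes through unchanged,
and the worst case for a malformed message is a stray asterisk.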

I appreciate Goffi's argument that Markdown-like syntaxes do not
handle tables, but guess what? Nor does XHTML-IM.

So my argument for keeping it in this thread is really in order to
understand what features of XHTML-IM are desirable rather than to
fully specify a replacement - once we know that we want XHTML-IM's
feature set to support bold, or tables, or inline images, or whatever,
then we can move on to design a replacement.

Dave.
_______________________________________________
Standards mailing list
Info: https://mail.jabber.org/mailman/listinfo/standards
Unsubscribe: standards-unsubscr...@xmpp.org
_______________________________________________
