On Tue, Mar 4, 2014 at 11:26 AM, Andrew Sutherland
<asutherl...@asutherland.org> wrote:
> On 03/04/2014 03:13 AM, Henri Sivonen wrote:
>>
>> It saddens me that we are using non-compliant ad hoc parsers when we
>> already have two spec-compliant (at least at some point in time) ones.
>
> Interesting!  I assume you are referring to:
> https://github.com/davidflanagan/html5/blob/master/html5parser.js

That's most likely it.

> While we have a defense-in-depth strategy (CSP and iframe sandbox should be
> protecting us from the worst possible scenarios) and we're hopeful that
> Service Workers will eventually let us provide nsIContentPolicy-level
> protection, the quality of the HTML parser is of course fairly important[1]
> to the operation of the HTML sanitizer.  If you'd like to bless a specific
> implementation for workers to perform streaming HTML parsing or some
> other explicit strategy, I'd be happy to file a bug for us to go in that
> direction.  Because we are using a white-list based mechanism and are fairly
> limited and arguably fairly luddite in what we whitelist, it's my hope that
> our errors are on the side of safety (and breaking adventurous HTML email
> :), but that is indeed largely hope.  Your input is definitely appreciated,
> especially as it relates to prioritizing such enhancements and potential
> risk from our current strategy.

Using David's parser you get idiomatic JS that you can modify without
GWT or Java in the way and that already works outside of the context
of a browser window object. With the Validator.nu parser plus GWT,
you'd have to port the glue code to the worker context but then you
could track parser changes from Gecko with a GWT recompile. In theory,
the algorithm shouldn't change often, but in practice, David's parser
lacks e.g. <template> awareness. Still, being able to see idiomatic JS
in a debugger probably outweighs being able to track Gecko with a
recompile, so I'm leaning toward recommending David's parser.
(And it would be good to have more of a reason to keep the idiomatic
JS version up-to-date!)

Either way, it should be possible to do a line-by-line port of
nsTreeSanitizer to replicate the sanitizing mode of Thunderbird.
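
To make the shape of such a port concrete, here is a minimal sketch of a
whitelist-based tree walk in plain JS. It is not the nsTreeSanitizer
algorithm itself. The node shape, the tiny whitelists, the URL scheme check
and the choice to unwrap unknown elements while dropping script/style
subtrees are all assumptions made for illustration; a real port would take
its lists and rules from the Gecko source.

```javascript
// Hypothetical whitelists; a real port would copy these from nsTreeSanitizer.
const ALLOWED_ELEMENTS = new Set(["a", "b", "i", "p", "br", "blockquote"]);
const ALLOWED_ATTRIBUTES = new Set(["href", "title"]);
// Elements whose entire subtree is discarded instead of being unwrapped.
const DROP_WITH_CONTENT = new Set(["script", "style"]);
// Assumed set of safe URL schemes for href values.
const SAFE_URL = /^(https?|mailto):/i;

// Assumed node shape (not tied to any particular parser's output):
//   { type: "text", value }
//   { type: "element", name, attrs: { ... }, children: [ ... ] }
function sanitize(nodes) {
  const out = [];
  for (const node of nodes) {
    if (node.type === "text") {
      out.push({ type: "text", value: node.value });
      continue;
    }
    if (node.type !== "element") continue; // drop comments, doctypes, etc.

    const name = node.name.toLowerCase();
    if (DROP_WITH_CONTENT.has(name)) continue;

    const children = sanitize(node.children || []);
    if (!ALLOWED_ELEMENTS.has(name)) {
      // Unknown element: keep its sanitized children, drop the wrapper.
      out.push(...children);
      continue;
    }

    const attrs = {};
    for (const [rawKey, value] of Object.entries(node.attrs || {})) {
      const key = rawKey.toLowerCase();
      if (!ALLOWED_ATTRIBUTES.has(key)) continue;
      if (key === "href" && !SAFE_URL.test(String(value).trim())) continue;
      attrs[key] = value;
    }
    out.push({ type: "element", name, attrs, children });
  }
  return out;
}
```

Because the walk only needs plain objects, it has no dependency on a
window or document, so it would run in a worker on top of whatever tree
the chosen parser produces.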

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
