On 05/14/2014 01:40 PM, Daniel Kinzler wrote:
>> This means that HTML returned from the preprocessor needs to be valid in
>> wikitext to avoid being stripped out by the sanitizer. Maybe that's actually
>> possible, but my impression is that you are shooting for something that's
>> closer to the behavior of a tag extension. Those already bypass the
>> sanitizer, so would be less troublesome in the short term.
> 
> Yes. Just treat <html>...</html> like a tag extension, and it should work fine.
> Do you see any problems with that?

First of all, you'll have to make sure that users cannot inject <html> tags
themselves, as that would enable arbitrary XSS. I might have missed it, but I
believe that this is not yet done in your current patch.
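
As a rough sketch of the kind of guard I mean (Python used purely for
illustration; escape_user_html_tags is a hypothetical helper, not something
in the patch or in MediaWiki), user-supplied wikitext could have literal
<html> tags neutralized before the preprocessor runs, so that only the
preprocessor itself can emit them:

```python
import re

# Assumption: matches an opening or closing <html> tag, case-insensitively,
# with optional attributes. The \b keeps names like <htmlx> from matching.
_HTML_TAG = re.compile(r"<(/?)(html)\b([^>]*)>", re.IGNORECASE)


def escape_user_html_tags(wikitext: str) -> str:
    """Turn user-typed <html> / </html> tags into inert text.

    Only the pseudo tag itself is escaped here; everything else is left
    untouched for the normal sanitizer to deal with.
    """
    return _HTML_TAG.sub(
        lambda m: "&lt;" + m.group(1) + m.group(2) + m.group(3) + "&gt;",
        wikitext,
    )
```

This only covers well-formed tags; a real implementation would have to hook
into the preprocessor rather than rely on a regular expression.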

In contrast to normal tag extensions, <html> would also contain fully
rendered HTML, and should not be piped through action=parse, as Parsoid
currently does for tag extensions (in the absence of a direct tag extension
expansion API end point). We and other users of the expandtemplates API will
have to add special-case handling for this pseudo tag extension.

In HTML, the <html> tag is also not meant to be used inside the body of a
page. I'd suggest using a different tag name to avoid issues with HTML
parsers and potential name conflicts with existing tag extensions.

Overall, it does not feel like a very clean way to do this. My preference
would be to let the consumer directly ask for pre-expanded wikitext *or*
HTML, without overloading action=expandtemplates. Even indicating the
content type explicitly in the API response (rather than inline with an HTML
tag) would be a better stop-gap as it would avoid some of the security and
compatibility issues described above.
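
To illustrate the stop-gap, here is a minimal sketch of how a consumer could
dispatch on an explicit content type instead of sniffing for an inline tag.
The Expansion type, its field names and the two content type strings are
assumptions made for the example, not the actual API response format:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Expansion:
    """Hypothetical shape of an expansion result with an explicit type."""
    content_type: str  # assumed values: "text/x-wiki" or "text/html"
    body: str


def to_html(exp: Expansion, parse_wikitext: Callable[[str], str]) -> str:
    """Return HTML for an expansion, parsing only when the body is wikitext."""
    if exp.content_type == "text/html":
        # Already fully rendered: must not be sent through the parser again.
        return exp.body
    if exp.content_type == "text/x-wiki":
        return parse_wikitext(exp.body)
    raise ValueError(f"unsupported content type: {exp.content_type}")
```

With the type carried out of band, nothing in the body needs to be trusted
or escaped based on a tag name.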

>> So it is important to think of renderers as services, so that they are
>> usable from the content API and Parsoid. For existing PHP code this could
>> even be action=parse, but for new renderers without a need or desire to tie
>> themselves to MediaWiki internals I'd recommend to think of them as their
>> own service. This can also make them more attractive to third party
>> contributors from outside the MediaWiki world, as has for example recently
>> happened with Mathoid.
> 
> True, but that has little to do with my patch. It just means that 3rd party
> Content objects should preferably implement getHtml() by calling out to a
> service object.

You are right that it is not an immediate issue with your patch. The point
is about the *longer-term* role of the ContentHandler vs. the content API.
The ContentHandler could either try to be the central piece of our new
content API, or could become an integration point that normally calls out to
the content API and other services to retrieve HTML.

To me the latter is preferable as it enables us to optimize the content API
for high request rates by concentrating on doing one job well, and lets us
leverage this API from the server-side MediaWiki front-end through
ContentHandler.
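
A minimal sketch of the "integration point" variant, again in Python and
with hypothetical names throughout (ServiceBackedContentHandler, get_html
and the fetch_html callable are not existing MediaWiki interfaces): the
handler holds no rendering logic and simply delegates to whatever service
client it is given.

```python
from typing import Callable


class ServiceBackedContentHandler:
    """Hypothetical handler that delegates rendering to an external service."""

    def __init__(self, fetch_html: Callable[[str, int], str]) -> None:
        # fetch_html stands in for a client of the content API or another
        # rendering service; it takes a title and a revision id.
        self._fetch_html = fetch_html

    def get_html(self, title: str, revision: int) -> str:
        # No rendering happens here; the service does the one job.
        return self._fetch_html(title, revision)
```

The same service can then be used directly by other clients such as
Parsoid, without going through the handler at all.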

Gabriel

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
