Re: Content Rewriter Modularization: Design/Change

Ben Laurie Fri, 08 Aug 2008 06:11:05 -0700

[+google-caja-discuss]

On Thu, Aug 7, 2008 at 9:27 PM, John Hjelmstad <[EMAIL PROTECTED]> wrote:
> On Thu, Aug 7, 2008 at 3:20 AM, Ben Laurie <[EMAIL PROTECTED]> wrote:
>
>> On Wed, Aug 6, 2008 at 11:34 PM, John Hjelmstad <[EMAIL PROTECTED]> wrote:
>> > This proposal effectively enables the renderer to become a multi-pass
>> > compiler for gadget content (essentially, arbitrary web content). Such a
>> > compiler can provide several benefits: static optimization of gadget
>> > content (auto-proxying of images, whitespace/comment removal,
>> > consolidation of CSS blocks), security benefits (caja et al), new
>> > functionality (annotation of content for stats, document analysis,
>> > container-specific features), etc. To my knowledge no such
>> > infrastructure exists today (with the possible exception of Caja itself,
>> > which I'd like to dovetail with this work).
>>
>> Caja clearly provides a large chunk of the code you'd need for this.
>> I'd like to hear how we'd manage to avoid duplication between the two
>> projects.
>>
>> A generalised framework for manipulating content sounds like a great
>> idea, but probably should not live in either of the two projects (Caja
>> and Shindig) but rather should be shared by both of them, I suspect.
>
>
> I agree on both counts. As I mentioned, the piece of this idea that I expect
> to change the most is the parse tree, and Caja's .parser.html and
> .parser.css packages contain much of what I've thrown in here as a base.
>
> My key requirements are:
> * Lightweight framework.
> * Parser modularity, mostly for HTML parsers (to re-use the good work done
> by WebKit or Gecko; CSS/JS can come directly from Caja, I'd bet)
> * Automatic maintenance of DOM<->String conversion.
> * Easy to manipulate structure.


I'm not sure what the value of parser modularity is. If the resulting
tree is different, then that's a problem for people processing the
tree. And if it is not, then why do we care?
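
To make the concern concrete, here is a minimal Java sketch. Every name
in it (HtmlParserSketch, AngleBracketParser, WholeTextParser, TagCounter)
is hypothetical and invented for illustration; none of it comes from
Shindig or Caja, and a flat token list stands in for a real parse tree.
It shows a rewriter written against a pluggable parser interface getting
a different answer when the parser is swapped for one that produces a
different structure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pluggable parser interface; a flat token list stands in
// for a real parse tree.
interface HtmlParserSketch {
    List<String> parse(String content);
}

// Toy parser: splits content into "<...>" tokens and the text between them.
class AngleBracketParser implements HtmlParserSketch {
    public List<String> parse(String content) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < content.length()) {
            int end;
            if (content.charAt(i) == '<') {
                int close = content.indexOf('>', i);
                end = (close < 0) ? content.length() : close + 1;
            } else {
                int open = content.indexOf('<', i);
                end = (open < 0) ? content.length() : open;
            }
            tokens.add(content.substring(i, end));
            i = end;
        }
        return tokens;
    }
}

// A second toy parser that returns the whole content as a single token.
class WholeTextParser implements HtmlParserSketch {
    public List<String> parse(String content) {
        List<String> tokens = new ArrayList<>();
        if (!content.isEmpty()) {
            tokens.add(content);
        }
        return tokens;
    }
}

// A "rewriter" that depends only on the interface, not on a parser.
class TagCounter {
    static int countTags(HtmlParserSketch parser, String content) {
        int count = 0;
        for (String token : parser.parse(content)) {
            if (token.startsWith("<")) {
                count++;
            }
        }
        return count;
    }
}
```

With the input "<b>hi</b>", TagCounter sees two tag tokens through
AngleBracketParser but only one through WholeTextParser. Under these
assumptions, swapping parsers is only safe for downstream code if every
parser is held to the same output structure.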

>
> I'd love to see both projects share the same base syntax tree
> representations. I considered .parser.html(.DomTree) and .parser.css for
> these, but at the moment these appeared to be a little more tied to Caja's
> lexer/parser implementation than I preferred (though I admit
> AbstractParseTreeNode contains most of what's needed).
>
> To be sure, I don't see this as an end-all-be-all transformation system in
> any way. I'd just like to put *something* reasonable in place that we can
> play with, provide some benefit, and enhance into a truly sophisticated
> vision of document rewriting.
>
>
>>
>>
>> >  c. Add Gadget.getParsedContent().
>> >    i. Returns a mutable GadgetContentParseTree used to manipulate Gadget
>> >       Contents.
>> >    ii. Mutable tree calls back to the Gadget object indicating when any
>> >        change is made, and emits an error if setContent() has been called
>> >        in the interim.
>>
>> In Caja we have been moving towards immutable trees...
>
>
> Interested to hear more about this. The whole idea is for the gadget's tree
> representation to be modifiable. Doing that with immutable trees to me
> suggests that a rewriter would have to create a completely new tree and set
> it as a representation of new content. That's convenient as far as the
> Gadget's maintenance of String<->Tree representations is concerned... but
> seems pretty heavyweight for many types of edits: in-situ modifications of
> text, content reordering, etc. That's particularly so in a single-threaded
> (viz rewriting) environment.

Never having been entirely sold on the concept, I'll let those on the
Caja team who advocate immutability explain why.
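
For readers following the thread, the mutable-tree contract quoted above
(the tree calls back to the Gadget on every change, and raises an error
if setContent() was called in the interim) can be sketched roughly as
follows. This is an illustration under stated assumptions, not Shindig's
actual code: GadgetSketch and TreeSketch are hypothetical stand-ins for
Gadget and GadgetContentParseTree, the version counter is one assumed way
to detect a stale tree, and "parsing" is reduced to holding a list of
text chunks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Gadget: keeps the String and tree views in sync.
class GadgetSketch {
    private String content;
    private int version = 0;           // bumped on every setContent()
    private boolean treeDirty = false; // true when the tree has unsaved edits
    private TreeSketch tree;

    GadgetSketch(String content) {
        this.content = content;
    }

    void setContent(String newContent) {
        content = newContent;
        version++;          // any tree handed out earlier is now stale
        tree = null;
        treeDirty = false;
    }

    TreeSketch getParsedContent() {
        if (tree == null) {
            tree = new TreeSketch(this, version, content);
        }
        return tree;
    }

    String getContent() {
        if (treeDirty) {
            content = tree.render(); // lazily re-serialize after tree edits
            treeDirty = false;
        }
        return content;
    }

    // Callback invoked by the tree before each mutation.
    void onTreeChanged(int treeVersion) {
        if (treeVersion != version) {
            throw new IllegalStateException(
                "setContent() was called after this tree was obtained");
        }
        treeDirty = true;
    }
}

// Hypothetical stand-in for GadgetContentParseTree.
class TreeSketch {
    private final GadgetSketch owner;
    private final int version;
    private final List<String> chunks = new ArrayList<>();

    TreeSketch(GadgetSketch owner, int version, String content) {
        this.owner = owner;
        this.version = version;
        chunks.add(content); // placeholder "parse": one text chunk
    }

    void appendText(String text) {
        owner.onTreeChanged(version); // throws if this tree is stale
        chunks.add(text);
    }

    String render() {
        return String.join("", chunks);
    }
}
```

An immutable variant would drop the callback entirely: a rewriter would
build a new tree and hand its serialized form to setContent(), which is
the heavier-weight path John describes for small in-place edits.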
