> -----Original Message-----
> From: wikitech-l-boun...@lists.wikimedia.org 
> [mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of 
> Platonides
> Sent: 17 December 2008 00:20
> To: wikitech-l@lists.wikimedia.org
> Subject: Re: [Wikitech-l] Future of Javascript and mediaWiki
> 
> Jared Williams wrote:
> > SDCHing MediaWiki HTML would take some effort, as the page output is
> > between skin classes and OutputPage etc.
> > 
> > Also would want the translation text from
> > \languages\messages\Messages*.php in there too I think. Handling the
> > $1 style placeholders is easy, it's just determining what message goes
> > through which wfMsg*() function, and if the WikiText translations can
> > be preconverted to html.
> > 
> > But most of the HTML comes from article wikitext, so I wonder whether
> > it'd beat gzip by anything significant.
> > 
> > Jared
> 
> Note that SDCH is expected to be then gzipped, as they 
> fulfill different needs. They aren't incompatible.
> You would use a dictionary for common skin bits, perhaps also 
> adding some common page features, like the TOC code, 
> 'amp;action=edit&redlink=1" class="new"'...
> 
> Having a second dictionary for language-dependent output
> could also be interesting, but not all messages should be provided.

Unfortunately, whilst the user agent can announce that it has multiple
dictionaries, the SDCH response can only indicate that it used a single
dictionary.
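
A minimal sketch of what that constraint means server-side, assuming the
client lists its dictionary IDs comma-separated in an Avail-Dictionary
request header (as I read the SDCH proposal). pickDictionary() and the
preference list are hypothetical names for illustration, not anything in
MediaWiki:

```php
<?php
// Hypothetical helper: choose the ONE dictionary to encode against.
// $availDictionary is the raw value of the client's Avail-Dictionary header
// (assumed format: comma-separated dictionary IDs).
// $preferred is the server's own list of dictionary IDs, best first.
function pickDictionary(string $availDictionary, array $preferred): ?string
{
    // Split the header value, trim whitespace, and drop empty entries.
    $advertised = array_filter(
        array_map('trim', explode(',', $availDictionary)),
        'strlen'
    );

    // Only one dictionary can be named in the response, so take the first
    // preferred one that the client actually holds.
    foreach ($preferred as $id) {
        if (in_array($id, $advertised, true)) {
            return $id;
        }
    }

    // No overlap: fall back to a plain (e.g. gzip-only) response.
    return null;
}
```

So a skin dictionary and a per-language dictionary could not both be applied
to the same response; they would have to be merged into one dictionary per
skin/language combination.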

> 
> Simetrical wrote:
> > What happens if you have parser functions that depend on the value of
> > $1 (allowed in some messages AFAIK)?  What if $1 contains wikitext
> > itself (I wouldn't be surprised if that were true somewhere)?  How do
> > you plan to do this substitution anyway, JavaScript?  What about
> > clients that don't support JavaScript?
> 
> /Usually/, you don't create the dictionary output by hand,
> but pass the page to a "dictionary compressor" (or so is
> expected, this is still too experimental). If a parser
> function changed it completely, they will just be literals.
> If you have a parametrized block, the vcdiff would see, "this
> piece up to Foo matches this dictionary section, before $1.
> And this other matches the text following Foo..."

What I have atm just traverses a directory of templates, using PHP's
built-in tokenizer to extract T_INLINE_HTML tokens into the dictionary
(if greater than 3 bytes long), and replacing them with a call to
output the vcdiff copy opcodes.

So
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="<?php $e($this->lang); ?>">
        <head>
                <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
                <title><?php $e($this->title); ?>

Becomes
<?php $this->copy(0, 53);$e($this->lang); $this->copy(53, 91);$e($this->title);
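
The rewrite step can be sketched roughly like this. It is a simplified
illustration, not the actual code: rewriteTemplate() is a made-up name, it
handles one template string rather than a directory, and it emits a separate
<?php ... ?> block per copy instead of merging it into the neighbouring one.
Only token_get_all() and T_INLINE_HTML are real PHP here.

```php
<?php
// Hypothetical sketch: move inline HTML from a template into a shared
// dictionary and replace it with $this->copy(offset, length) calls.
// $dictionary is appended to, so it can be reused across several templates.
function rewriteTemplate(string $source, string &$dictionary): string
{
    $out = '';
    foreach (token_get_all($source) as $token) {
        // token_get_all() yields arrays [id, text, line] or bare
        // one-character strings such as ';' and '('.
        $isInlineHtml = is_array($token) && $token[0] === T_INLINE_HTML;

        if ($isInlineHtml && strlen($token[1]) > 3) {
            // Long enough to be worth a dictionary entry.
            $offset = strlen($dictionary);
            $length = strlen($token[1]);
            $dictionary .= $token[1];
            $out .= "<?php \$this->copy($offset, $length); ?>";
        } else {
            // PHP code and very short HTML fragments pass through unchanged.
            $out .= is_array($token) ? $token[1] : $token;
        }
    }
    return $out;
}
```

Run over the first line of the template above, this puts the 53 bytes up to
and including xml:lang=" into the dictionary at offset 0, which is where the
copy(0, 53) comes from.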

PHP's output buffering captures the output from the PHP code within the
template, which essentially becomes the data section of the vcdiff.
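
As a rough illustration of the buffering, assuming a template object with a
copy() method like the one in the rewritten example: SketchTemplate and its
members are hypothetical, and the instructions are recorded as plain arrays
rather than encoded as real VCDIFF (RFC 3284) bytes.

```php
<?php
// Hypothetical sketch: whatever the template echoes between copy() calls is
// pulled out of the output buffer and appended to the data section.
final class SketchTemplate
{
    /** @var array[] Recorded instructions, e.g. ['COPY', 0, 53] or ['ADD', 2] */
    public array $instructions = [];

    /** Concatenated literal output: the "data section". */
    public string $data = '';

    // Move anything echoed since the last instruction into the data section.
    private function flushAdd(): void
    {
        $chunk = ob_get_contents();
        ob_clean();
        if ($chunk !== '') {
            $this->instructions[] = ['ADD', strlen($chunk)];
            $this->data .= $chunk;
        }
    }

    // Called by the rewritten template in place of inline HTML.
    public function copy(int $offset, int $length): void
    {
        $this->flushAdd();
        $this->instructions[] = ['COPY', $offset, $length];
    }

    // Run a template callable with output buffering active.
    public function render(callable $template): void
    {
        ob_start();
        try {
            $template($this);
            $this->flushAdd();
        } finally {
            ob_end_clean();
        }
    }
}
```

Rendering a callable that does copy(0, 53), echoes the language code, does
copy(53, 91) and echoes the title would leave two COPY and two ADD
instructions, with the language code and title concatenated in $data.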

> 
> 
> 
> Jared wrote:
> > I do have working PHP code that can parse PHP templates & language
> > strings to generate the dictionary, and a new set of templates
> > rewritten to output the vcdiff efficiently.
> 
> Please share?
> 

Intend to; I should probably document it and add some comments first :)

Jared


_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
