> -----Original Message-----
> From: wikitech-l-boun...@lists.wikimedia.org
> [mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Platonides
> Sent: 17 December 2008 00:20
> To: wikitech-l@lists.wikimedia.org
> Subject: Re: [Wikitech-l] Future of Javascript and mediaWiki
>
> Jared Williams wrote:
> > SDCHing MediaWiki HTML would take some effort, as the page output is
> > spread between the skin classes, OutputPage, etc.
> >
> > I think we would also want the translation text from
> > \languages\messages\Messages*.php in there. Handling the $1-style
> > placeholders is easy; it's just a matter of determining which message
> > goes through which wfMsg*() function, and whether the wikitext
> > translations can be pre-converted to HTML.
> >
> > But most of the HTML comes from article wikitext, so I wonder whether
> > it would beat gzip by anything significant.
> >
> > Jared
>
> Note that SDCH output is expected to be gzipped afterwards, as the two
> fulfil different needs; they aren't incompatible. You would use a
> dictionary for common skin bits, perhaps also adding some common page
> features, like the TOC code, '&amp;action=edit&redlink=1" class="new"'...
>
> Having a second dictionary for language-dependent output could also be
> interesting, but not all messages should be provided.
Unfortunately, whilst the user agent can announce that it has multiple
dictionaries, the SDCH response can only indicate that it used a single
dictionary.

> Simetrical wrote:
> > What happens if you have parser functions that depend on the value
> > of $1 (allowed in some messages AFAIK)? What if $1 contains wikitext
> > itself (I wouldn't be surprised if that were true somewhere)? How do
> > you plan to do this substitution anyway, JavaScript? What about
> > clients that don't support JavaScript?
>
> /Usually/ you don't create the dictionary output by hand, but pass the
> page to a "dictionary compressor" (or so is expected; this is all still
> very experimental). If a parser function changed a piece completely, it
> would just be emitted as a literal. If you have a parametrised block,
> the vcdiff would say: "this piece up to Foo matches this dictionary
> section before $1, and this other piece matches the text following
> Foo..."

What I have at the moment just traverses a directory of templates, using
PHP's built-in tokenizer to extract T_INLINE_HTML tokens into the
dictionary (if longer than 3 bytes) and replace them with a call that
outputs the corresponding vcdiff COPY opcodes. So

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="<?php $e($this->lang); ?>">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<title><?php $e($this->title);

becomes

<?php $this->copy(0, 53); $e($this->lang); $this->copy(53, 91); $e($this->title);

PHP's output buffering captures the output of the PHP code within the
template, which essentially becomes the data section of the vcdiff.

> Jared wrote:
> > I do have working PHP code that can parse PHP templates & language
> > strings to generate the dictionary, and a new set of templates
> > rewritten to output the vcdiff efficiently.
>
> Please share?
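For concreteness, here is a minimal sketch of the extraction step described above: walk a template's source with PHP's tokenizer, pull T_INLINE_HTML chunks longer than 3 bytes into a shared dictionary, and record the (offset, length) pairs that the rewritten template would emit as vcdiff COPY opcodes. All names here are illustrative, not taken from the actual code being discussed, and the real tool would also write out the rewritten templates.

```php
<?php
// Sketch: collect static HTML chunks from one template into a dictionary.
// A real pass would also rewrite the template, replacing each chunk with
// a $this->copy($offset, $length) call as in the example above.
function buildDictionary(string $templateSource, int $minLen = 4): array {
    $dictionary = '';
    $copies = [];   // one [offset, length] pair per extracted chunk
    foreach (token_get_all($templateSource) as $token) {
        if (is_array($token)
            && $token[0] === T_INLINE_HTML
            && strlen($token[1]) >= $minLen
        ) {
            $copies[] = [strlen($dictionary), strlen($token[1])];
            $dictionary .= $token[1];
        }
    }
    return [$dictionary, $copies];
}

$tpl = '<html lang="<?php $e($this->lang); ?>"><title><?php $e($this->title);';
[$dict, $copies] = buildDictionary($tpl);
// $dict now holds the concatenated static HTML ('<html lang="' . '"><title>'),
// and each entry in $copies is the (offset, length) a copy() call in the
// rewritten template would reference.
```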
Intend to; I should probably document it and add some comments first :)

Jared

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l