https://bugzilla.wikimedia.org/show_bug.cgi?id=47437

       Web browser: ---
            Bug ID: 47437
           Summary: ResourceLoader: Implement support for enhanced
                    minification (e.g. support UglifyJS)
           Product: MediaWiki
           Version: 1.22-git
          Hardware: All
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: Unprioritized
         Component: ResourceLoader
          Assignee: wikibugs-l@lists.wikimedia.org
          Reporter: krinklem...@gmail.com
                CC: krinklem...@gmail.com, roan.katt...@gmail.com,
                    tpars...@wikimedia.org
    Classification: Unclassified
   Mobile Platform: ---

Right now we use a very basic but fast minifier. It has to perform very well
because we do on-demand package generation[1], even though we have a very high
cache hit ratio.

Though this is nice, it drastically limits our options and ability to implement
additional features.

Three features in particular:

* Implementing source maps[2] for easier debugging. At the moment, with our
basic minification, enabling "Prettification" in Chrome Dev Tools makes the
debugging experience just about bearable, but everything is still squashed into
one file (it doesn't map back to the original file names). Once we do even more
advanced minification, this becomes even more important.

* Conditional code / stripping blocks. One of the things more sophisticated
minifiers are capable of is stripping dead code. Aside from the obvious rare
case of consistently unreachable code (which should just be wiped from the code
base), this is useful for debugging purposes. See also bug 37763. Right now we
have very few mw.log calls. I believe we avoid them because they take up space.
They are a no-op in production mode (the log method is an empty function by
default; in debug mode we load the actual module that populates the method), so
the issue isn't that they would pollute the console in production, but that
they take up JavaScript bytes. By wrapping them in something like `if (MW_DEBUG)
{ mw.log(...); }` we can have UglifyJS strip them in production and preserve
them in debug mode, by predefining a global constant MW_DEBUG in UglifyJS as
false or true respectively.

* Better minification: renaming variables, optimising for gzip, rewriting
statements into shorter notation, etc. [3]
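The debug-stripping idea above could look like the sketch below. MW_DEBUG and
the module code are illustrative, not existing MediaWiki source; in a real
build, UglifyJS would be told the constant's value (via its global_defs
compressor option or the --define CLI flag, per its README) and would then drop
the guarded block as dead code in production.

```javascript
// Minimal sketch of the proposed pattern. MW_DEBUG is a hypothetical
// global constant; UglifyJS would be given its value at compress time
// and would remove the whole `if` block when it is known to be false.
var MW_DEBUG = false; // stands in for the compile-time definition

var mw = {
    // In production, mw.log is an empty function by default, as
    // described above; the debug module replaces it with a real one.
    log: function () {}
};

var logged = [];
if (MW_DEBUG) {
    // With MW_DEBUG known to be false at compress time, UglifyJS
    // strips this entire block, so it costs no bytes in production.
    mw.log('initialising module');
    logged.push('initialising module');
}

console.log(logged.length); // 0 in production mode
```

Something like `uglifyjs module.js -c -d MW_DEBUG=false` (the exact invocation
is from memory and may differ between UglifyJS versions) would then emit the
module without the debug block at all.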

So that's all great, but the problem is that, although UglifyJS[4] (for
example) is getting faster, it is still much too slow to run on many files at
once, on demand, from the web server.

Last February, when I was in San Francisco, Roan and I thought about this. I
recall the following, though Roan might have a better version of it:

* We'd run the quick minifier on a cache miss to populate the cache quickly and
respond to the request, then enqueue a job to run the advanced minifier
asynchronously.
* The job queue will then run the elaborate minification process and replace
the cache item. We don't have to worry about overwriting a newer version with
an older one, because the cache keys contain a hash of the raw contents; worst
case, we save something that won't be used.

There are two details in particular I'm not sure about:
* How do we deliver the improved version to the client? We have unique URLs
with version timestamps.
- One option is to keep track of all URLs in Varnish that contain the module
name and order a purge in Varnish (after we update memcached, of course, so it
would be a quick roundtrip to Apache to compose a response from cached
components).
- Alternatively, cause a version bump in the module (touch() the files).

* The job queue: we could enqueue generic jobs that check everything, or
enqueue a job per cache item. In either case we need to account for the
possibility that an enqueued job is no longer needed by the time it runs (if we
use generic jobs, once the first one runs it should cancel any others in the
queue; with module- or item-specific jobs, it should cancel any others for the
same module or item).

And then there is the question of how to deploy the JavaScript code and nodejs,
and how to execute it from PHP. Installing nodejs on every Apache server and
shelling out is probably not a good idea. Alternatively, we could wrap it in a
private service (like Parsoid): we would set up a few instances in the bits
cluster, and PHP would open a socket or HTTP request, POST or stream the input,
and get the output back.


[1]
https://www.mediawiki.org/wiki/ResourceLoader/Features#On-demand_package_generation
[2] http://www.html5rocks.com/en/tutorials/developertools/sourcemaps/
https://github.com/mozilla/source-map
http://www.youtube.com/watch?v=HijZNR6kc9A
[3] https://github.com/mishoo/UglifyJS2#compressor-options
[4] https://github.com/mishoo/UglifyJS2
