On Mon, Mar 1, 2010 at 10:10, Domas Mituzas <midom.li...@gmail.com> wrote:
> Howdy,
>
>> Most of the code in MediaWiki works just fine with it (since most of
>> it is mundane) but things like dynamically including certain files,
>> declaring classes, eval() and so on are all out.
>
> There are two types of includes in MediaWiki: ones I fixed for AutoLoader and 
> ones I didn't - HPHP has all classes loaded, so AutoLoader is redundant.
> Generally, every include that just defines classes/functions is fine with 
> HPHP; it is just some of MediaWiki's startup logic (Setup/WebStart) that 
> depends on files being included in a certain order, so we have to make sure 
> HipHop understands those includes.
> There was also some different behavior with file inclusion - in Zend you can 
> say require("File.php") and it will try the current script's directory, but if 
> you do require("../File.php") it will be resolved against the current working 
> directory instead.
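> A contrived sketch of the difference (not the real MediaWiki code, just the 
> shape of it):
>
>   <?php
>   // No path prefix: Zend checks include_path and then falls back to the
>   // directory of the including script, so this finds a Defines.php that
>   // sits next to the including file.
>   require( "Defines.php" );
>
>   // A leading ../ is resolved against the current working directory only,
>   // not against the including script's directory, so this depends on
>   // wherever the web server happened to chdir() to.
>   require( "../LocalSettings.php" );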
>
> We don't have any eval() at the moment, and actually there's a mode in which 
> eval() works; people are just too scared of it.
> We had some double class definitions (depending on whether certain components 
> are available), as well as double function definitions (ProfilerStub vs 
> Profiler).
>
> One of the major problems is simply the still-incomplete set of functions 
> that we'd need:
>
> * session - though we could surely work around it by setting up our own 
> Session abstraction, the team at Facebook is already busy implementing full 
> support
> * xdiff, mhash - the only calls to them are from DiffHistoryBlob, so getting 
> that feature to work is mandatory for production, but not needed for testing :)
> * tidy - we have to call the binary for now
>
> function_exists() is somewhat crippled, as far as I understand, so I had to 
> work around certain issues there.
> There are some other crippled functions, which we hit during testing...
>
> It is quite fun to hit all the various edge cases in the PHP language (e.g. 
> interfaces may have constants) which are broken in HipHop.
> The good thing is having developers carefully reading/looking at those. Some 
> things are still broken, and some can be worked around in MediaWiki.
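> The interface-constants one, for example, is perfectly legal PHP since PHP 5 
> (names made up here):
>
>   <?php
>   interface Limits {
>       // Constants are allowed in interfaces
>       const MAX_DIFF_LINES = 10000;
>   }
>
>   class PageLimits implements Limits {
>       public function limit() {
>           // Implementing classes inherit the constant
>           return self::MAX_DIFF_LINES;
>       }
>   }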
>
> Some of the crashes I hit are quite difficult to reproduce - it is easier to 
> bypass that code for now and come up with good reproduction cases later.
>
>> Even if it wasn't, hotspots like the parser could still be compiled
>> with HipHop and turned into a PECL extension.
>
> HipHop provides a major boost for actual MediaWiki initialization too - while 
> Zend has to reinitialize objects and data all the time, having all that in the 
> core process image is quite efficient.
>
>> One other nice thing about HipHop is that the compiler output is
>> relatively readable compared to most compilers. Meaning that if you
>
> That especially helps with debugging :)
>
>> need to optimize some particular function, it's easy to take the
>> generated .cpp output and replace the generated code with something
>> more native to C++ that doesn't lose speed because it needs to
>> manipulate everything as a PHP object.
>
> Well, that is not entirely true - if it manipulated everything as a PHP object 
> (zval), it would be as slow and inefficient as PHP. The major cost benefit 
> here is that it does strict type inference, and falls back to Variant only 
> when it cannot come up with a decent type.
> And yes, one can find the offending code that causes the expensive paths. I 
> don't see manual C++ code optimizations as the way to go, though - they'd be 
> overwritten by the next code build.

The case I had in mind is when you have, say, a function in the parser
that takes a $string and munges it. If that turns out to be a
bottleneck you could just get a char* out of that $string and munge it
at the C level instead of calling the PHP wrappers for things like
explode() and other PHP string/array munging.

That's some future project for once it's working and those bottlenecks
are found, though; I was just pleasantly surprised that HipHop makes
this relatively easy.

One large practical upshot of this, though, is that hacky things like
the parser - which are the way they are because that's how you optimize
this sort of thing in PHP - could be written in some babytalk version
of PHP that produces a real parse tree. It would be slower in pure PHP,
but maybe HipHop's speed could make up for it.
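To be clear about what I mean by a real parse tree, something like this
(a totally hypothetical shape, not a proposal for the actual API):

  <?php
  // A hypothetical node type the rewritten parser could emit instead of
  // string-munging its way straight to HTML.
  class ParseNode {
      public $type;     // e.g. 'text', 'link', 'template'
      public $children; // array of ParseNode for non-leaf nodes
      public $value;    // literal text for leaf nodes

      public function __construct( $type, $children = array(), $value = '' ) {
          $this->type = $type;
          $this->children = $children;
          $this->value = $value;
      }
  }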

Then you could take that component, compile it to C++ (maybe with
some manual munging) and make libmediawiki-parse++, which would be
quite awesome :)
