Edit report at https://bugs.php.net/bug.php?id=63380&edit=1
ID:                 63380
Updated by:         rricha...@php.net
Reported by:        tstarl...@php.net
Summary:            Allocation via libxml does not use PHP's per-request allocator
Status:             Assigned
Type:               Bug
Package:            XML related
Operating System:   Linux
PHP Version:        master-Git-2012-10-29 (Git)
-Assigned To:       rrichards
+Assigned To:       tstarling
Block user comment: N
Private report:     N

New Comment:

There is a major problem with doing this, which is why I didn't end up tying libxml into the PHP memory allocator. Depending on the setup, it is very likely to cause memory corruption and/or mix memory allocations between modules. For example, when mod_perl and mod_php are loaded together, PHP will override the libxml memory handling functions (which are global), and the change will bleed into mod_perl (or any other module using libxml2), with any number of results (crashes, security issues, etc.).

The only way to do something like this would be to make it a compile-time option that is disabled by default, so that those who know their environment intimately can use it at their own risk. I don't know whether you want to write a patch for that. Otherwise, I don't see any way this could safely be added.

Previous Comments:
------------------------------------------------------------------------
[2012-10-29 21:55:03] tstarl...@php.net

https://github.com/php/php-src/pull/223

------------------------------------------------------------------------
[2012-10-29 03:25:17] tstarl...@php.net

Description:
------------
Allocation via libxml does not use PHP's per-request allocator, so any memory used by libxml is not counted by memory_get_usage() or against memory_limit.

At Wikimedia we use libxml DOM trees to store wikitext parse trees, because they are more compact in memory than the equivalent pure-PHP data structures. When these parse trees are cached, the memory requirements can become excessive, and the memory is typically not returned to the system after request termination.

Using xmlMemSetup() to set hook functions which use PHP's per-request allocation functions will allow us to more effectively monitor and limit the use of libxml in production. I've developed a patch and will submit it to GitHub as a pull request.

Test script:
---------------
$doc = new DOMDocument;
for ( $i = 0; $i < 1000000; $i++ ) {
    $doc->appendChild($doc->createElement('foo'));
}
print memory_get_usage()."\n";

Expected result:
----------------
312331440 (with debug and ZTS)

Actual result:
--------------
694256

------------------------------------------------------------------------
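
Illustration (not part of the original report): the concern in the new comment is that allocator hooks installed through a process-global setup call affect every user of the library, not just the caller. The sketch below models that with stand-ins only. `lib_mem_setup`, `request_malloc`, `request_free`, and the two "module" functions are hypothetical names invented for this example; they are not libxml2's or PHP's actual code, and the real xmlMemSetup() also takes realloc and strdup hooks, which are omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a library with process-global allocator hooks,
 * in the style of xmlMemSetup(). Not libxml2's real implementation. */
typedef void *(*malloc_fn)(size_t);
typedef void (*free_fn)(void *);

static malloc_fn lib_malloc = malloc;
static free_fn lib_free = free;

/* A single call swaps the hooks for every module in the process. */
void lib_mem_setup(free_fn f, malloc_fn m) {
    lib_free = f;
    lib_malloc = m;
}

/* Hypothetical per-request accounting allocator, standing in for the role
 * that emalloc()/efree() would play. Each block carries a size header. */
static size_t request_bytes = 0;

typedef union { size_t size; max_align_t align; } header_t;

void *request_malloc(size_t n) {
    header_t *h = malloc(sizeof(header_t) + n);
    if (h == NULL) return NULL;
    h->size = n;
    request_bytes += n;   /* charge the allocation to the current request */
    return h + 1;         /* caller sees the memory just past the header */
}

void request_free(void *p) {
    if (p == NULL) return;
    /* Only valid for pointers that came from request_malloc(). A pointer
     * obtained from plain malloc() before the hooks were installed would
     * corrupt the heap here; that is the mixing hazard described above. */
    header_t *h = (header_t *)p - 1;
    assert(request_bytes >= h->size);
    request_bytes -= h->size;
    free(h);
}

size_t request_bytes_in_use(void) { return request_bytes; }

/* Two independent modules that both allocate through the shared library. */
void *module_a_alloc_node(void) { return lib_malloc(64); }
void *module_b_alloc_node(void) { return lib_malloc(64); }

/* Module A installs accounting hooks; returns how many bytes of module B's
 * allocation end up charged to module A's request accounting. */
size_t demo_shared_hooks(void) {
    lib_mem_setup(request_free, request_malloc);
    void *a = module_a_alloc_node();
    size_t before = request_bytes;
    void *b = module_b_alloc_node();   /* module B never asked for this */
    size_t charged = request_bytes - before;
    lib_free(a);
    lib_free(b);
    lib_mem_setup(free, malloc);       /* restore the default hooks */
    return charged;
}
```

In this model, calling `demo_shared_hooks()` reports that module B's 64-byte allocation was routed through module A's accounting allocator. That is the accounting benefit the report asks for and, at the same time, the cross-module bleed the comment warns about.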