From:             
Operating system: 
PHP version:      5.3.4
Package:          DOM XML related
Bug Type:         Feature/Change Request
Bug description:  Improve speed of DOMNode::C14N() on large XML documents

Description:
------------
The runtime of the C14N() function appears to be O(N^2) (or possibly worse)
in the number of nodes, which means that it becomes very slow as the input
grows. For example, an input with around 196000 nodes takes about 290
seconds, while an input with 486000 nodes takes 2200 seconds.
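
As a rough check on the growth rate, the two timings quoted above can be
turned into an empirical exponent. This is only a sketch: it assumes the
runtime follows a power law T ~ N^k, which two data points cannot confirm.

```php
<?php
// Measurements quoted in the report above.
$nodes   = [196000, 486000];
$seconds = [290, 2200];

// Assumption: T ~ N^k, so k = log(T2 / T1) / log(N2 / N1).
$k = log($seconds[1] / $seconds[0]) / log($nodes[1] / $nodes[0]);

printf("Estimated exponent: %.2f\n", $k);
```

Under that assumption the exponent comes out at a little over 2, which fits
the "O(N^2) or possibly worse" estimate.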



Note that this problem only occurs when canonicalizing a subtree of the
document. If we canonicalize the whole document, it completes almost
immediately.
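
The difference between the two cases can be observed with a small synthetic
document. The following is a minimal sketch; the element names and the
element count are arbitrary choices, and the actual timings will depend on
the PHP and libxml2 versions in use.

```php
<?php
// Build a synthetic document; the element count is an arbitrary choice.
$doc  = new DOMDocument();
$root = $doc->appendChild($doc->createElement('root'));
for ($i = 0; $i < 2000; $i++) {
    $root->appendChild($doc->createElement('item', (string) $i));
}

// Whole document: the fast case described above.
$start     = microtime(true);
$whole     = $doc->C14N();
$wholeTime = microtime(true) - $start;

// Subtree (here the document element): the slow case described above.
$start       = microtime(true);
$subtree     = $doc->documentElement->C14N();
$subtreeTime = microtime(true) - $start;

printf("whole document: %.4f s\n", $wholeTime);
printf("subtree:        %.4f s\n", $subtreeTime);
```

Increasing the element count should make the gap between the two timings
more visible on affected versions.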



The problem is that canonicalization uses an XPath expression to find the
nodeset that should be canonicalized. Evaluation of the XPath expression
takes a lot of time as the input size grows. In addition, the libxml2
xmlC14NDocSaveTo() function has to do a lookup in the nodeset returned by
the XPath expression for every node it encounters.
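
To illustrate why a per-node lookup can lead to quadratic behaviour, here is
a simplified model in PHP. It assumes that the membership check is a linear
scan over the nodeset; the function name countComparisons() is made up for
this illustration, and this is not the libxml2 code itself.

```php
<?php
// Count comparisons when each of $n nodes is looked up in an $n-entry
// nodeset by scanning it from the start (a stand-in for the real lookup).
function countComparisons(int $n): int
{
    $nodeset     = range(0, $n - 1); // hypothetical nodeset: one id per node
    $comparisons = 0;
    foreach ($nodeset as $node) {          // every node encountered
        foreach ($nodeset as $candidate) { // linear scan of the nodeset
            $comparisons++;
            if ($candidate === $node) {
                break;
            }
        }
    }
    return $comparisons;
}

foreach ([1000, 2000, 4000] as $n) {
    printf("%5d nodes: %d comparisons\n", $n, countComparisons($n));
}
```

In this model, doubling the number of nodes roughly quadruples the number of
comparisons.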



I believe a better solution would be to do this the way it is done in the
xmlsec library. That library uses the xmlC14NExecute() function instead,
which accepts a callback that determines whether a node should be included
in the result. This should make the speed of canonicalization linear in
the input size.
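
The callback idea can be sketched in PHP as a predicate that is asked once
per node whether the node belongs to the subtree. The function isVisible()
and the sample document are hypothetical; the actual callback is a C
function passed to xmlC14NExecute(), and the one used by xmlsec may work
differently.

```php
<?php
// Hypothetical PHP analogue of a visibility callback: a node is included
// when it is $subtreeRoot itself or one of its descendants.
function isVisible(DOMNode $node, DOMNode $subtreeRoot): bool
{
    for ($n = $node; $n !== null; $n = $n->parentNode) {
        if ($n->isSameNode($subtreeRoot)) {
            return true;
        }
    }
    return false;
}

$doc = new DOMDocument();
$doc->loadXML('<root><keep><a/><b/></keep><skip><c/></skip></root>');
$keep = $doc->getElementsByTagName('keep')->item(0);

// Single pass over all elements, asking the predicate once per element.
$included = [];
foreach ($doc->getElementsByTagName('*') as $element) {
    if (isVisible($element, $keep)) {
        $included[] = $element->nodeName;
    }
}

echo implode(', ', $included), "\n";
```

This sketch only looks at element nodes, and the ancestor walk costs time
proportional to the depth of each node, so it is linear only for documents
of bounded depth. The point is that no nodeset has to be built or searched.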



Test script:
---------------
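<?php
// Replace the file name with the path to a large XML file.
$doc = new DOMDocument();
$doc->load('some-large-xml-file.xml');

$start = microtime(TRUE);
// Canonicalize the subtree rooted at the document element
// (non-exclusive, without comments).
$doc->documentElement->C14N(FALSE, FALSE);
echo "Done in " . (microtime(TRUE) - $start) . " seconds.\n";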




-- 
Edit bug report at http://bugs.php.net/bug.php?id=53655&edit=1
