We encountered a strange performance problem related to the cost of building large nodesets. In particular, the cost of building a nodeset does not appear to be linear in its size. To demonstrate this, we created three XML files of the following form:
<?xml version="1.0" ?>
<topnode>
<inner-node/>
<inner-node/>
...
</topnode>

The first had 1,000 of the inner nodes (and the corresponding newlines), the second 3,000 and the last 10,000. In the tests, we use the following stylesheet:

<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:template match="topnode">
    <xsl:call-template name="nodes-only"/>
    <xsl:call-template name="text-only"/>
    <xsl:call-template name="nodes-or-text"/>
  </xsl:template>

  <xsl:template name="nodes-only">
    <xsl:value-of select="count(*)"/>
    <xsl:value-of select="' '"/>
  </xsl:template>

  <xsl:template name="text-only">
    <xsl:value-of select="count(text())"/>
    <xsl:value-of select="' '"/>
  </xsl:template>

  <xsl:template name="nodes-or-text">
    <xsl:value-of select="count(*|text())"/>
    <xsl:value-of select="' '"/>
  </xsl:template>

</xsl:stylesheet>

From the profiling using version 1.1.12, we see the following times:

Template Name    1,000 nodes   3,000 nodes   10,000 nodes
nodes-or-text           4890         77834        1870993
text-only               1274         12357         213519
nodes-only                27           257           1526

which correspond to the following time ratios, relative to the 1,000-node file:

Template Name    1,000 nodes   3,000 nodes   10,000 nodes
nodes-or-text              1    15.9 times    382.6 times
text-only                  1     9.7 times    167.6 times
nodes-only                 1     9.5 times     56.5 times

This is definitely not scaling linearly: a 10-fold increase in input size costs between 56 and 383 times as much. A second interesting point is that count(*|text()) is several times slower than counting * and then counting text() separately (about 3.8, 6.2 and 8.7 times slower for the three files).

Regards,

Jerome Pesenti

_______________________________________________
xslt mailing list, project page http://xmlsoft.org/XSLT/
[email protected]
http://mail.gnome.org/mailman/listinfo/xslt
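To put a number on "not scaling linearly", the timings reported above can be turned into an approximate scaling exponent k, assuming the cost behaves like a power law t ~ n^k between the smallest and largest file. This is only a rough sketch: the power-law model is an assumption, the helper names (`scaling_exponent`, `exponents`) are made up for illustration, and the timings are copied from the profile figures in this message.

```python
import math

# Profiler timings copied from the message above (units as reported by the profiler).
SIZES = [1000, 3000, 10000]
TIMINGS = {
    "nodes-or-text": [4890, 77834, 1870993],
    "text-only": [1274, 12357, 213519],
    "nodes-only": [27, 257, 1526],
}


def scaling_exponent(n_small, t_small, n_large, t_large):
    """Return k such that t is proportional to n**k between two measurements.

    Assumes a pure power law; k == 1 would mean linear scaling.
    """
    return math.log(t_large / t_small) / math.log(n_large / n_small)


def exponents(timings=TIMINGS, sizes=SIZES):
    """Exponent between the smallest and largest input for each template."""
    return {
        name: scaling_exponent(sizes[0], t[0], sizes[-1], t[-1])
        for name, t in timings.items()
    }


if __name__ == "__main__":
    for name, k in exponents().items():
        print(f"{name:14s} k = {k:.2f}")
```

Under that assumption the exponents come out at roughly 2.6 for nodes-or-text, 2.2 for text-only and 1.8 for nodes-only, i.e. close to quadratic or worse rather than linear.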
