Stefan Behnel, 09.11.2009 09:57: > It's the last operation, merging and sorting large sets of results, that > makes this extremely slow - it takes 92% of the evaluation time in my tests > (using libxml2 2.7.5). It's much faster to traverse the document in a > single step, and just select single attributes from it, that can quickly be > appended to the node set. > > I imagine that this step could actually be optimised away in many cases > (like the case above, where results are guaranteed to be found in doc > order), so I guess it's just in there to avoid too much special casing. But > it seriously kills the performance here.
Would it be ok to add a new "int isInDocumentOrder" field to the xmlNodeSet struct that would be true for node sets that are known to be in document order? That would make it easy to skip the sorting step in all cases where building the node set follows document order anyway. Given that node comparison is horribly expensive (more than 90% of the sort time in my tests), I think it's absolutely worth avoiding the sorting step whenever possible. Also, for sorted node sets, xmlXPathNodeSetAdd() could compare the new node to the last node in the node-set and clear the flag if the new node breaks the document order. That way, only N-1 comparisons would be required for a sorted set of N nodes, instead of something like O(N^2) currently. Stefan _______________________________________________ xml mailing list, project page http://xmlsoft.org/ [email protected] http://mail.gnome.org/mailman/listinfo/xml
