Re: [xml] confusing xpath performance characteristics

Stefan Behnel Wed, 27 Jan 2010 05:00:21 -0800

[bump]

Any comments?


Stefan Behnel, 09.11.2009 19:23:
> Stefan Behnel, 09.11.2009 09:57:
>> It's the last operation, merging and sorting large sets of results, that
>> makes this extremely slow - it takes 92% of the evaluation time in my tests
>> (using libxml2 2.7.5). It's much faster to traverse the document in a
>> single step, and just select single attributes from it, that can quickly be
>> appended to the node set.
>>
>> I imagine that this step could actually be optimised away in many cases
>> (like the case above, where results are guaranteed to be found in doc
>> order), so I guess it's just in there to avoid too much special casing. But
>> it seriously kills the performance here.
> 
> Would it be ok to add a new "int isInDocumentOrder" field to the xmlNodeSet
> struct that would be true for node sets that are known to be in document 
> order?
> 
> That would make it easy to skip the sorting step in all cases where
> building the node set follows document order anyway. Given that node
> comparison is horribly expensive (more than 90% of the sort time in my
> tests), I think it's absolutely worth avoiding the sorting step whenever
> possible.
> 
> Also, for sorted node sets, xmlXPathNodeSetAdd() could compare the new node
> to the last node in the node-set and clear the flag if the new node breaks
> the document order. That way, only N-1 comparisons would be required for a
> sorted set of N nodes, instead of something like O(N^2) currently.
> 
> Stefan
> _______________________________________________
> xml mailing list, project page  http://xmlsoft.org/
> [email protected]
> http://mail.gnome.org/mailman/listinfo/xml
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml

Re: [xml] confusing xpath performance characteristics

Reply via email to