On Thu, 2012-10-18 at 18:00 -0700, Zhigang Chen wrote:
> Hi
> 
> We sometimes run into the situation where a pretty expensive xpath
> (e.g. .//table//td[@class]) is run on a big document (~ 9M) and it
> takes very very long. In fact we never see it finish.

[resending from the right account, sorry]

I routinely run queries like that against a 50 MByte document, but with
XQuery implementations (XPath 2) rather than XPath 1, and I get results
on the order of a few milliseconds.

It would probably be worth adding element indexes to libxml2, even if
they can't easily be built during parsing.

In the meantime you could try to speed this query up yourself by writing
it as
     .//td[@class]
or, if this is not HTML,
     .//td[@class][ancestor::table]
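
If you want to convince yourself that a rewrite like this returns the
same nodes before trying it on the big document, a small test file is
enough. Below is a minimal sketch using Python's standard-library
ElementTree rather than libxml2, so it says nothing about libxml2's
timings. The sample document, the has_ancestor helper and the variable
names are made up for illustration. ElementTree has no ancestor axis,
so the [ancestor::table] filter is emulated by hand with a parent map.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: two classed cells inside a table, plus one stray
# <td> outside any table (possible in generic XML, not in valid HTML).
DOC = """<html><body>
<table>
  <tr><td class="a">1</td><td>2</td></tr>
  <tr><td class="b">3</td></tr>
</table>
<div><td class="stray">4</td></div>
</body></html>"""

root = ET.fromstring(DOC)

# Original form: find every table, then search below each one.
original = root.findall(".//table//td[@class]")

# Simplified form: one descendant search, no table step.
simple = root.findall(".//td[@class]")

# Map each element to its parent so ancestors can be walked.
parent = {child: p for p in root.iter() for child in p}


def has_ancestor(elem, tag):
    """Return True if any ancestor of elem has the given tag."""
    node = parent.get(elem)
    while node is not None:
        if node.tag == tag:
            return True
        node = parent.get(node)
    return False


# Hand-written stand-in for .//td[@class][ancestor::table]
filtered = [td for td in simple if has_ancestor(td, "table")]

print([td.get("class") for td in original])
print([td.get("class") for td in simple])
print([td.get("class") for td in filtered])
```

On this sample the simplified form also picks up the stray cell, which
is why the ancestor test is needed when the input is not HTML; the
filtered result matches the original query.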

Liam

-- 
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/
Ankh: irc.sorcery.net irc.gnome.org freenode/#xml
Co-author, 5th edition of "Beginning XML", Wrox, July 2012

_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml
