> The xml.dom.minidom object is too slow when parsing such a big XML file
> to a DOM object, while pulldom would spend quite a long time going
> through the whole database file. How can I speed up the search?
> Are there existing solutions or algorithms? Thank you for your
> suggestions...

I've told you that before, and I'll tell you again: an RDBMS is the way to go. There might be XML parsers that work faster - I suppose cElementTree can gain you some speed - but ultimately the problems are inherent in the DOM representation: no type information, no indices, no nothing. Just a huge pile of nodes in memory. So every search is linear in the number of nodes.

Of course you might be able to create indices yourself, and even devise a clever scheme to make using them as declarative as possible. But in the end that would mean nothing but re-creating RDBMS technology - why do that if it's already there? Maybe there are frameworks out there that support you in this, but the very nature of XML surely makes that a more tedious task than just defining a simple SQL schema.

If I had to search for XML tools that go beyond DOM, I'd go for Uche Ogbuji's 4Suite as a starter and work my way down from there - maybe Amara is what you need?

Now, having said that: I'm not an SQL bigot. Just use the right tool for the job.

Regards,

Diez
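
P.S. To make the RDBMS suggestion a bit more concrete, here is a minimal sketch of one way it could look: stream the XML once, store the records in SQLite, and let an index answer the lookups. I don't know what your file looks like, so the `<record id="..."><name>...</name></record>` layout, the table and column names, and the helper functions `load_records` and `find_ids_by_name` are all assumptions made up for illustration. It uses only the standard library (`xml.etree.ElementTree.iterparse` and `sqlite3`).

```python
import io
import sqlite3
import xml.etree.ElementTree as ET


def load_records(xml_source, conn):
    """Stream hypothetical <record> elements into an indexed SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records (id INTEGER PRIMARY KEY, name TEXT)"
    )
    # The index is what turns a linear scan into a fast lookup by name.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_records_name ON records (name)")
    # iterparse yields each element as soon as its closing tag has been read,
    # so the whole document never has to be built as one tree up front.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == "record":
            conn.execute(
                "INSERT INTO records (id, name) VALUES (?, ?)",
                (int(elem.get("id")), elem.findtext("name")),
            )
            elem.clear()  # drop the record's children and text once stored
    conn.commit()


def find_ids_by_name(conn, name):
    """Return the ids of all records with the given name, in ascending order."""
    rows = conn.execute(
        "SELECT id FROM records WHERE name = ? ORDER BY id", (name,)
    )
    return [row[0] for row in rows]


# Tiny stand-in for the real database file (assumed layout).
SAMPLE = b"""<db>
  <record id="1"><name>spam</name></record>
  <record id="2"><name>eggs</name></record>
  <record id="3"><name>spam</name></record>
</db>"""

conn = sqlite3.connect(":memory:")
load_records(io.BytesIO(SAMPLE), conn)
print(find_ids_by_name(conn, "spam"))  # prints [1, 3]
```

For a real file you would pass the file name (or an open file) instead of the `BytesIO` object and connect to a database file rather than `:memory:`, so the import only has to run once and later searches go straight to the index.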