Jeffrey L. Taylor wrote:
> Quoting David Kahn <[email protected]>:
>> parser (I know REXML is notorious for being slow, but given how fast it was
>> to call a regex on the file, I am thinking that this will still be faster
>> than all parsers).
>>
>
> Look at using LibXML::XML::Reader
>
> http://libxml.rubyforge.org/rdoc/index.html
>
> What most XML parsing libraries are doing is reading the entire XML file into
> memory, probably storing the raw text, parsing it, and creating an even bigger
> data structure for the whole file, then searching over it. Nokogiri at least
> does some of the searching in C, instead of Ruby (it uses libxml2).
>
> With LibXML::XML::Reader it is possible (with some not very pretty code) to
> make one pass thru the XML file, parsing as you go, and create data
> structures for just the information of interest. Enormously faster.
Interesting; that seems worth knowing about. But wouldn't Reader still
have to create a DOM tree to do the searching in the first place?

>
> HTH,
> Jeffrey

Best,
--
Marnen Laibow-Koser
http://www.marnen.org
[email protected]
--
Posted via http://www.ruby-forum.com/.

--
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.

