On Tue, Dec 9, 2008 at 8:09 PM, Chris Cosner <[EMAIL PROTECTED]> wrote:
> Question: What is the speediest tool to pull data from an XML feed that will
> only be a few hundred lines at most? Some regexes will be necessary.
>
> Context:
> I am playing with the Google Books data API. They provide a feed, which you
> can see an example of here:
> http://code.google.com/apis/books/docs/gdata/developers_guide_protocol.html
> (scroll about halfway down)
>
> I can send search terms to the API and get back some information about the
> first three results in Google Book Search to integrate with our own search
> results. [Done] So in some cases the user may click through to GBS, and in
> others stay on our site. The GBS feed duplicates some tags, such as
> "dc:identifier", and the only way to distinguish them will be with a regex on
> the contents, or by noting tag order.
>
> With the CPAN module XML::XSLT I am able to transform this pretty rapidly. I
> tried using XML::Twig, but it seemed too slow for this purpose.
>
> However, XML::XSLT does not support regexes.
>
> So I expect that I'll just have to transform the text as far as possible
> with XML::XSLT and then use Perl directly to finish the job.

I would avoid regexes when working with XML. They break much too easily.
Stick to XML tools. There are any number of XML parsers available on CPAN.

Sean
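
To make the "stick to XML tools" point concrete, here is a rough sketch of telling the duplicated dc:identifier elements apart after parsing, rather than by running a regex over the raw markup. It is written in Python only because its standard library includes a parser; the same approach should carry over to any of the CPAN parsers. Everything specific in it is an assumption for illustration: the sample feed is made up, the namespace URIs and the "ISBN:" prefix are guesses at what the real feed contains, and identifiers_by_kind is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Made-up sample shaped like the feed described in the question.
# This is NOT real API output; element values are placeholders.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:dc="http://purl.org/dc/terms">
  <entry>
    <title>Example Book</title>
    <dc:identifier>abc123XYZ</dc:identifier>
    <dc:identifier>ISBN:0000000000</dc:identifier>
    <dc:identifier>ISBN:9780000000002</dc:identifier>
  </entry>
</feed>"""

# Assumed namespace URIs; check them against the actual feed.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "dc": "http://purl.org/dc/terms",
}


def identifiers_by_kind(feed_xml):
    """Return one dict per entry, splitting identifiers by their content."""
    root = ET.fromstring(feed_xml)
    results = []
    for entry in root.findall("atom:entry", NS):
        ids = {"volume_id": None, "isbn": []}
        # The parser hands back each element's text, so a plain prefix
        # check on the text is enough; no pattern matching on markup.
        for node in entry.findall("dc:identifier", NS):
            text = (node.text or "").strip()
            if text.startswith("ISBN:"):
                ids["isbn"].append(text[len("ISBN:"):])
            elif ids["volume_id"] is None:
                # Assumption: the first unprefixed identifier is the volume ID.
                ids["volume_id"] = text
        results.append(ids)
    return results


if __name__ == "__main__":
    for ids in identifiers_by_kind(SAMPLE_FEED):
        print(ids)
```

The parser takes care of namespaces, whitespace and escaping, so the only string handling left is a prefix test on text that has already been extracted. For a feed of a few hundred lines, parsing cost is unlikely to be the bottleneck, though that is worth measuring rather than assuming.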