Re: XML::Reader

Dana Hudes Tue, 11 May 2010 17:33:21 -0700

Klaus,
Thanks for contributing to CPAN.

On the question of callbacks: this is Perl, so there's more than one way to do
it (whatever 'it' is). That said, the use of callbacks is very Perlish. Indeed,
Perl itself uses callbacks in its builtins: look at sort(), where the
comparison function is a callback.
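
To make the sort() point concrete, here is a minimal sketch in core Perl. The
sample data and the by_length sub are made up for illustration.

```perl
use strict;
use warnings;

my @versions = (0.35, 0.4, 0.34);    # hypothetical sample data

# The block passed to sort is a callback: sort calls it for each pair
# of elements it needs to compare, with the pair in $a and $b.
my @ascending = sort { $a <=> $b } @versions;

# A named comparison sub is used the same way.
sub by_length { length($a) <=> length($b) or $a cmp $b }
my @words = sort by_length qw(twig reader simple);

print "@ascending\n";
print "@words\n";
```

The caller never loops over the pairs itself; sort decides when and how often
to call the comparison code.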

It is also true that while callbacks provide a clean interface (instead of
overloading a generic object method, for example), they add another function
call to the processing. But instead of a clean function call, you have
substituted a call to an iterator method on your object. This is actually more
costly.

Look at XML::Simple. You get back a data structure and deal with it yourself
using native Perl iteration (foreach). You can then do whatever you want: write
to a file, find the piece you want, or do some transformation and turn it back
into XML.
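
A rough sketch of that workflow, assuming XML::Simple is installed. The sample
document and its element names (catalog, item, name, price) are invented for
the example.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin XMLout);

# Hypothetical sample document.
my $xml = <<'XML';
<catalog>
  <item><name>reader</name><price>10</price></item>
  <item><name>twig</name><price>25</price></item>
</catalog>
XML

# ForceArray keeps <item> as an array reference even if there is only one;
# KeyAttr => [] turns off folding of the items into a hash keyed on 'name'.
my $data = XMLin($xml, ForceArray => ['item'], KeyAttr => []);

# Native iteration: find the pieces you want...
my @cheap;
foreach my $item (@{ $data->{item} }) {
    push @cheap, $item->{name} if $item->{price} < 20;
}
print "cheap: @cheap\n";

# ...or transform the structure and turn it back into XML.
$_->{price} *= 2 for @{ $data->{item} };
print XMLout($data, RootName => 'catalog', NoAttr => 1);
```

Note that XMLin builds the whole structure in memory, so this suits documents
that fit comfortably in RAM.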

-----Original Message-----
From: Klaus <klau...@gmail.com>
Date: Tue, 11 May 2010 10:10:33 
To: Jonathan Rockway<j...@jrock.us>
Cc: <module-authors@perl.org>
Subject: Re: XML::Reader

On Tue, May 11, 2010 at 7:49 AM, Jonathan Rockway <j...@jrock.us> wrote:

> * On Tue, Apr 27 2010, Klaus wrote:
> > I have released XML::Reader (ver 0.34)
> > http://search.cpan.org/~keichner/XML-Reader-0.34/lib/XML/Reader.pm
>

By the way, I have now released a new version of XML::Reader (ver 0.35)
with some bug fixes, warts removed, relicensing, etc.
http://search.cpan.org/~keichner/XML-Reader-0.35/lib/XML/Reader.pm

> > To explain the module, I have created a small demonstration program
> > that extracts XML-subtrees (for example any path that ends with
> > '/.../a') memory efficiently.
> >
> > An XML document can be very large (possibly many gigabytes), but is
> > composed of XML-subtrees, each of which is only a few kilobytes in
> > size. The demonstration program reads XML-subtrees one by one, so only
> > the memory for one subtree is held at a time. Each subtree can then be
> > processed further at your convenience (for example by using regular
> > expressions, or by using other XML modules, such as XML::Simple). In
> > principle, XML::Reader has no event-driven callback functions: you
> > have to loop over the XML document yourself, and the resulting
> > XML-subtree is represented in text format.
>
> So apparently I am rather behind on module-authors, but I just thought
> I'd ask if you've taken a look at XML::Twig?  That seems to be the main
> module for this sort of thing, and seems to have an established
> userbase.  Maybe patches to that would be more productive than
> reinventing the wheel?

Thanks for your message.

I would position XML::Reader in the same space as XML::Twig and
XML::TokeParser.

I have taken a look at XML::Twig, which has an established userbase,
and I agree that XML::Reader duplicates many of the features
already provided by XML::Twig.

However, unlike XML::Twig, XML::Reader does not rely on callback
functions to parse the XML. With XML::Reader you loop over the
XML-document yourself and the resulting XML-elements (and/or
XML-subtrees) are represented in text format. This style of processing
XML is similar to the classic pattern:

"open my $fh, '<', 'file.txt'; while (<$fh>) { do_sth($_); } close $fh;"

This pattern is also implemented by XML::TokeParser. However,
unlike XML::TokeParser, XML::Reader records the full XML path as
it processes the XML-document, so it can target not only
specific tags but also a full path of nested element
tags (a simplified XPath-like expression).
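
To illustrate the idea only (this is neither XML::Reader's implementation nor
its API), here is a minimal core-Perl sketch of a pull-style loop that keeps
track of the current path. The helper make_path_iterator is hypothetical, and
a naive regex tokenizer stands in for a real XML parser.

```perl
use strict;
use warnings;

# Hypothetical helper: returns an iterator (a closure) over a small,
# well-formed XML string. Each call yields [path, text] for the next
# non-blank text node, or an empty return at the end of input.
# Naive on purpose: no XML declaration, comments, CDATA or entities,
# and no '>' inside attribute values.
sub make_path_iterator {
    my ($xml) = @_;
    my @stack;    # names of the currently open elements
    return sub {
        # \G with /gc resumes where the previous call stopped.
        while ($xml =~ m{\G(?:<(/?)([\w:.-]+)[^<>]*?(/?)>|([^<]+))}gc) {
            my ($close, $name, $empty, $text) = ($1, $2, $3, $4);
            if (defined $name) {
                if    ($close)  { pop @stack }            # </name>
                elsif (!$empty) { push @stack, $name }    # <name ...>
                next;                                     # <name/> changes nothing
            }
            next unless $text =~ /\S/;    # skip whitespace-only text
            $text =~ s/^\s+|\s+$//g;
            return [ '/' . join('/', @stack), $text ];
        }
        return;
    };
}

my $xml  = '<root><a><b>one</b></a><c>two<b>three</b></c></root>';
my $next = make_path_iterator($xml);

# The caller drives the loop and selects by full path, not just by tag.
my @got;
while (my $item = $next->()) {
    my ($path, $text) = @$item;
    push @got, "$path=$text";
    print "$path: $text\n" if $path =~ m{/b$};
}
```

The loop has the same shape as the while (<$fh>) pattern above: the caller
asks for the next item and decides what to do with it.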

I would say that XML::Reader fills an ecological niche that is
filled neither by XML::Twig nor by XML::TokeParser.

Regards,
Klaus
