Klaus,

Thanks for contributing to CPAN.

On the question of callbacks: this is Perl, so there's more than one way to do it (whatever 'it' is). That said, the use of callbacks is very Perlish. Indeed, Perl itself uses callbacks in its builtins: look at sort(), where the comparison function is a callback.
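To make the sort() point concrete, here is a minimal sketch using core Perl only. The module list and the by_length helper are made up for illustration; they are not taken from any of the modules discussed.

```perl
use strict;
use warnings;

# Hypothetical sample data, for illustration only.
my @modules = ('XML::Twig', 'XML::Reader', 'XML::TokeParser', 'XML::Simple');

# An inline block is the most common form of comparison callback.
my @by_name = sort { $a cmp $b } @modules;

# A named subroutine works as well. sort calls it once per comparison,
# with the two elements available in the package variables $a and $b.
sub by_length { length($a) <=> length($b) or $a cmp $b }
my @by_length = sort by_length @modules;

# A code reference held in a scalar can also be passed.
my $reverse  = sub { $b cmp $a };
my @reversed = sort $reverse @modules;

print join(', ', @by_name),   "\n";
print join(', ', @by_length), "\n";
print join(', ', @reversed),  "\n";
```

In all three forms the caller supplies only the comparison; sort() drives the iteration and invokes the callback as often as it needs to.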
It is also true that while callbacks provide a clean interface (instead of overloading a generic object method, for example), you are adding another function call to the processing. But instead of a clean function call, you have substituted a call to an iterator on your object. This is actually more costly.

Look at XML::Simple. You get back a data structure and deal with it yourself using native Perl iteration (foreach). You can then do whatever you want: write it to a file, find the piece you want, or do some transformation and turn it back into XML.

-----Original Message-----
From: Klaus <klau...@gmail.com>
Date: Tue, 11 May 2010 10:10:33
To: Jonathan Rockway <j...@jrock.us>
Cc: <module-authors@perl.org>
Subject: Re: XML::Reader

On Tue, May 11, 2010 at 7:49 AM, Jonathan Rockway <j...@jrock.us> wrote:

> * On Tue, Apr 27 2010, Klaus wrote:
> > I have released XML::Reader (ver 0.34)
> > http://search.cpan.org/~keichner/XML-Reader-0.34/lib/XML/Reader.pm
>

By the way, I have now released a new version of XML::Reader (ver 0.35) with some bug fixes, warts removed, relicensing, etc.

http://search.cpan.org/~keichner/XML-Reader-0.35/lib/XML/Reader.pm

> > To explain the module, I have created a small demonstration program
> > that extracts XML-subtrees (for example any path that ends with
> > '/.../a') memory efficiently.
> >
> > An XML document can be very large (possibly many gigabytes), but is
> > composed of XML-subtrees, each of which is only a few kilobytes in
> > size. The demonstration program reads XML-subtrees one by one, only
> > the memory for one subtree is held at a time. Each subtree can then be
> > processed further at your convenience (for example by using regular
> > expressions, or, by using other XML-Modules, such as XML::Simple). In
> > principle, XML::Reader has no event driven callback functions, you
> > have to loop over the XML-document yourself and the resulting
> > XML-subtree is represented in text format.
>
> So apparently I am rather behind on module-authors, but I just thought
> I'd ask if you've taken a look at XML::Twig? That seems to be the main
> module for this sort of thing, and seems to have an established
> userbase. Maybe patches to that would be more productive than
> reinventing the wheel?

Thanks for your message.

I would position XML::Reader in the same space as XML::Twig and XML::TokeParser.

I have taken a look at XML::Twig, which has an established userbase, and I agree that XML::Reader duplicates much of the functionality already provided by XML::Twig.

However, unlike XML::Twig, XML::Reader does not rely on callback functions to parse the XML. With XML::Reader you loop over the XML-document yourself, and the resulting XML-elements (and/or XML-subtrees) are represented in text format. This style of processing XML is similar to the classic pattern:

    open my $fh, '<', 'file.txt';
    while (<$fh>) { do_sth($_); }
    close $fh;

This pattern is also implemented by XML::TokeParser.

However, unlike XML::TokeParser, XML::Reader records the full XML path as it processes the XML-document. It can therefore target not only specific tags, but also a full path of nested element tags (a simplified XPath-like expression).

I would say that XML::Reader fills an ecological niche that is filled neither by XML::Twig nor by XML::TokeParser.

Regards,
Klaus