Re: Extract attribute from huge xml file

Jenda Krynicky Thu, 03 Jan 2008 16:48:46 -0800

From: [EMAIL PROTECTED]
> On Dec 10 2007, 3:43 am, [EMAIL PROTECTED] (Chas. Owens) wrote:
> > On Dec 10, 2007 8:24 AM, Tim Bowden <[EMAIL PROTECTED]> wrote:
> > Unfortunately that won't work with structured data like XML.  Your best
> > bet is to use something like XML::Twig to grab the top-level records
> > and output them to a new file.  For instance, say we have an XML file
> > that looks like this:
> >
> > <root>
> >         <records set="1">
> >                 <record>foo</record>
> >                 <record>bar</record>
> >                 <record>baz</record>
> >         </records>
> >         <records set="2">
> >                 <record>quux</record>
> >         </records>
> >         <records set="3">
> >                 <record>foofoo</record>
> >                 <record>foobar</record>
> >         </records>
> > </root>
> >
> > and we only want the first two sets of records.  We could use this
> > code to produce a new file with only those records:
> >
> > #!/usr/bin/perl
> >
> > use strict;
> > use warnings;
> >
> > use XML::Twig;
> >
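> > my $i = 0;
> > my $t = XML::Twig->new(
> >         twig_handlers => {
> >                 records => sub {
> >                         my ($twig, $elt) = @_;
> >                         # Stop parsing after two sets.  Unlike exit, this
> >                         # returns from parsefile so </root> still prints.
> >                         $twig->finish_now if ++$i > 2;
> >                         $elt->print;
> >                         # Free what was printed; flush would print it again.
> >                         $twig->purge;
> >                 }
> >         }
> > );
> >
> > print "<root>";
> > $t->parsefile("t.xml");
> > print "</root>";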
> 
> 
> BTW, I forgot to mention.  I have a huge XML file that has 4,621
> records.  I am trying to compare with an array of 1,187 specific
> record identifiers, and just print out those records (as XML).  So I
> will have a new working XML file with 1,187 records.


First change that array of IDs to a hash like this:

 my %wanted;
 @[EMAIL PROTECTED] = ();

and then print only the records for which 
exists($wanted{$record_id}).
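
A minimal sketch of that lookup on its own, without any XML. The 
array name @ids and the sample values are made up for illustration; 
only %wanted comes from the text above.

```perl
use strict;
use warnings;

# @ids and its contents are hypothetical stand-ins for the real list.
my @ids = qw(A17 B42 C99);

# Hash slice: creates one key per id, every value is undef.
my %wanted;
@wanted{@ids} = ();

# exists() tests for the key itself, so the undef values do not matter.
my @kept = grep { exists $wanted{$_} } qw(A17 X00 C99 Z11);
print "@kept\n";    # prints "A17 C99"
```

Each lookup is a single hash access, so the cost does not grow with 
the number of ids the way a scan over the array would.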

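It's easy to get the current record id inside the record handler, 
the anonymous subroutine passed to the XML::Twig->new() constructor: 
the handler receives the twig and the current element as its 
arguments, and the element is also available in $_. See the 
XML::Twig documentation for the accessor methods.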
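
Put together, the handler might look roughly like the sketch below. 
It assumes each record is a <record> element carrying its identifier 
in an "id" attribute; the real element and attribute names are not 
given in this thread, and filter_records, the sample ids and the 
sample document are all invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Print to $out only the <record> elements whose id is a key of %$wanted.
# The element name 'record' and the attribute name 'id' are assumptions.
sub filter_records {
    my ($xml, $wanted, $out) = @_;
    my $twig = XML::Twig->new(
        twig_handlers => {
            record => sub {
                my ($t, $elt) = @_;      # $elt is also available as $_
                my $id = $elt->att('id');
                $elt->print($out) if defined $id && exists $wanted->{$id};
                $t->purge;               # release records already handled
            },
        },
    );
    print {$out} "<root>";
    $twig->parse($xml);    # for a file on disk, parsefile($path) instead
    print {$out} "</root>\n";
}

# Hypothetical ids and sample data, for illustration only.
my %wanted;
@wanted{qw(r2 r3)} = ();

my $sample = '<root><record id="r1">foo</record>'
           . '<record id="r2">bar</record>'
           . '<record id="r3">baz</record></root>';

# Write into an in-memory handle here; a real run would open a file.
my $buffer = '';
open my $buf_fh, '>', \$buffer or die "cannot open in-memory handle: $!";
filter_records($sample, \%wanted, $buf_fh);
close $buf_fh;
print $buffer;
```

Calling purge after each record keeps memory use flat, which is the 
point of using twig handlers on a large file.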

Jenda
===== [EMAIL PROTECTED] === http://Jenda.Krynicky.cz =====
When it comes to wine, women and song, wizards are allowed 
to get drunk and croon as much as they like.
        -- Terry Pratchett in Sourcery

