Hi,

I am trying to parse the data out of an XML file. The file is below.
Most of the data is easily grabbed, but the keywords stretch over 
several lines and there can be anywhere between 0 and 20 entries. I 
have tried using /m and /s but these don't seem to work. I have set 
$/="<image>"; I don't know if this is affecting my attempts, but 
changing it doesn't help either.

Here is what I am using at the moment:
==============
my $datafile = "news.xml";
open(FH,$datafile)|| die "Can't open $datafile: $!\n";
while (defined($i=<FH>)) {
        $/ = </image>;
        if ( $i =~ /\?xml version*/ ) {
                next;
        }
        (my $splnum) = ($i =~ /<image number=.(\w\d+\/\d+)/i);
        (my $title) = ($i =~ /<title>(.*)<\/title>/);
        (my $date ) = ($i =~ /<date>(.*)<\/date>/);
        (my $credit) = ($i =~ /<credit>(.*)<\/credit>/);
        (my $caption) = ($i =~ /<caption>(.*)<\/caption>/);
        (my $keywords) = ($i =~ /<keyword>(.*)<\/keyword>/);
        chomp($splnum,$title,$date,$credit);
        print "$splnum $title $date $credit $keywords\n";
 }
===============

This only grabs the first keyword (NERVE FIBRE, OVERLAPPING) and I 
need them all. Also, the processing seems to stop after two records 
when there are 470 in $datafile! I can't work that out either.
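
For comparison, here is a minimal sketch of how the regex approach might be
made to collect every keyword. It rests on three assumptions that are not in
the code above: the record separator is the quoted string "</image>", set
once before the loop (unquoted, </image> is parsed by Perl as a glob, which
may be why reading stops early); the captures are non-greedy and use /s so
they can span newlines; and the keyword match uses /g in list context so it
returns all matches. The in-memory sample and the @records array are made up
for illustration only.

```perl
use strict;
use warnings;

# Trimmed sample standing in for news.xml (illustrative only).
my $xml = <<'XML';
<?xml version='1.0'?>
<images>
<image number='P350/041'>
<title>Coloured SEM of two overlapping nerve fibres</title>
<date>09-Jul-98</date>
<credit>CREDIT: JUERGEN BERGER, MAX-PLANCK
INSTITUTE/SCIENCE PHOTO LIBRARY</credit>
<keywords>
<keyword>NERVE FIBRE, OVERLAPPING</keyword>
<keyword>AXON, NERVE FIBRE, OVERLAPPING</keyword>
</keywords>
</image>
</images>
XML

my @records;    # hypothetical container, one hash per <image>
{
    # Quoted string, set once, before the first read.
    local $/ = "</image>";

    # Read from the in-memory string; a real run would open news.xml here.
    open my $fh, '<', \$xml or die "Can't open in-memory file: $!\n";

    while (defined(my $rec = <$fh>)) {
        # Chunks without an <image> tag (e.g. the trailing </images>) are skipped.
        my ($splnum) = $rec =~ /<image number=.(\w\d+\/\d+)/i or next;

        # Non-greedy captures with /s so a value may span several lines.
        my ($title)  = $rec =~ /<title>(.*?)<\/title>/s;
        my ($date)   = $rec =~ /<date>(.*?)<\/date>/s;
        my ($credit) = $rec =~ /<credit>(.*?)<\/credit>/s;

        # /g in list context returns every captured keyword, or an empty list.
        my @keywords = $rec =~ /<keyword>(.*?)<\/keyword>/gs;

        # Fold embedded newlines and runs of spaces into single spaces.
        for ($title, $date, $credit) {
            next unless defined;
            s/\s+/ /g;
        }

        push @records, {
            splnum   => $splnum,
            title    => $title,
            date     => $date,
            credit   => $credit,
            keywords => \@keywords,
        };
    }
    close $fh;
}

for my $r (@records) {
    print "$r->{splnum} $r->{title} $r->{date} $r->{credit} ",
          join('; ', @{ $r->{keywords} }), "\n";
}
```

This still treats the XML as plain text, so it would break on nested or
reordered tags; it is only meant to show the effect of /g and /s.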

Any ideas? There are a lot of XML modules out there but I don't know if 
any would help.
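
As a rough idea of what a module-based version might look like, here is a
sketch that assumes XML::Simple from CPAN is installed (it is not a core
module, and it is only one of several candidates). The ForceArray option is
used so that <image> and <keyword> always come back as array references,
even when there is only one; KeyAttr => [] turns off attribute folding. The
trimmed sample string is made up for illustration.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);    # CPAN module; assumed to be installed

# Trimmed sample standing in for news.xml (illustrative only).
my $xml = <<'XML';
<?xml version='1.0'?>
<images>
<image number='P350/041'>
<title>Coloured SEM of two overlapping nerve fibres</title>
<date>09-Jul-98</date>
<keywords>
<keyword>NERVE FIBRE, OVERLAPPING</keyword>
<keyword>AXON, NERVE FIBRE, OVERLAPPING</keyword>
</keywords>
</image>
</images>
XML

# Parse the whole document into nested hashes and arrays.
my $data = XMLin(
    $xml,
    ForceArray => [ 'image', 'keyword' ],    # always array refs for these
    KeyAttr    => [],                        # no folding on attributes
);

for my $img (@{ $data->{image} || [] }) {
    # <keywords> may be empty, so check for a hash before looking inside it.
    my $list = ref($img->{keywords}) eq 'HASH'
             ? $img->{keywords}{keyword}
             : undef;
    my @keywords = @{ $list || [] };

    print "$img->{number} $img->{title} $img->{date} ",
          join('; ', @keywords), "\n";
}
```

With 470 records the whole file is held in memory at once, which is probably
acceptable at that size; a streaming parser would be the alternative if not.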
Thanx.
Dp.


=========== news.xml ============
<?xml version='1.0'?>
<images>
<image number='P350/041'>
<title>Coloured SEM of two overlapping nerve fibres</title>
<date>09-Jul-98</date>
<credit>CREDIT: JUERGEN BERGER, MAX-PLANCK 
INSTITUTE/SCIENCE PHOTO LIBRARY</credit>
<caption>CREDIT: JUERGEN BERGER, MAX-PLANCK INSTITUTE/ 
SCIENCE PHOTO LIBRARY Nerve fibres. Coloured scanning electron 
micrograph (SEM) of overlapping nerve fibres. Each fibre is made up 
of several individual axons. An axon is a long extension from a nerve 
cell (or neurone) which is the main output process of the cell. Some 
small neurone cell bodies (rounded) can be seen here alongside the 
axons. Nerve fibres rapidly relay signals between the central nervous 
system (the brain and spinal cord) and muscles and organs in the 
body. This allows the body to react quickly to any situation. 
Magnification unknown.</caption>
<keywords>
<keyword>NERVE FIBRE, OVERLAPPING</keyword>
<keyword>AXON, NERVE FIBRE, OVERLAPPING</keyword>
<keyword>FIBRE, NERVE, OVERLAPPING</keyword>
<keyword>NERVE CELL, WITH FIBRES</keyword>
<keyword>NEURONE, WITH NERVE FIBRES</keyword>
<keyword>HUMAN BODY, ANATOMY, NERVOUS</keyword>
<keyword>SYSTEM, NERVE FIBRE, FIBRES</keyword>
</keywords>
</image>
</images>
~~
Dermot Paikkos * [EMAIL PROTECTED]
Network Administrator @ Science Photo Library
Phone: 0207 432 1100 * Fax: 0207 286 8668

