Here is the Perl code I am using to run substitutions on the XML file
UIIE_A_123456.xml.
The search and replacement strings come from another
file (cleanup.txt).
After the substitution, however, the $1 in the replacement string is not
interpolated as defined in cleanup.txt, i.e. the output is
<!DOCTYPE $1 SYSTEM "http://cats.tfinforma.com/dtd/tfja/dtd/TFJA.dtd">
but it should be
<!DOCTYPE article SYSTEM "http://cats.tfinforma.com/dtd/tfja/dtd/TFJA.dtd">
Perl Code:
#!/usr/bin/perl -w -s
$/ = undef;    # slurp mode: read each file in one go

open(INFILE, "<c:/UIIE_A_123456.xml") or die "Cannot open UIIE_A_123456.xml: $!";
$_ = <INFILE>;
xml_cleanup();
close(INFILE);

open(OUTFILE, ">c:/UIIE_A_123456") or die "Cannot open output file: $!";
print OUTFILE $_;
close(OUTFILE);

# Apply every <text>pattern</text><with>replacement</with> pair to $_
sub xml_cleanup
{
    open(IN, "<c:/cleanup.xml") or die "Cannot open cleanup.xml: $!";
    $cleanup = <IN>;
    close(IN);
    while ($cleanup =~ /<text>(.*?)<\/text><with>(.*?)<\/with>/gc)
    {
        $txt = $1;
        $rep = $2;
        # this is the substitution where the $1 inside $rep is not expanded
        if ($_ =~ s/$txt/$rep/g) { print "Replaced $txt with $rep\n"; }
        else                     { print "Not found $txt\n"; }
    }
}
File: UIIE_A_123456.xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE $1 SYSTEM "http://cats.tfinforma.com/dtd/tfja/dtd/TFJA.dtd">
<article doi="10.1080/10826080600847985" tagger="TechBooks" articleid="123456"
copyrighttf="yes" copyrightowner="TF" documenttype="Original" yearofpub="2009"
numcolorpages="???" id="a123456" coverdate="2009">
<meta contenttype="Review-static/background" productid="UIIE" volumenum="116"
issuenum="10" firstpage="1" lastpage="00" pagecount="?"
pdffilename="UIIE_A_123456_O.pdf" pdffilesize="?????" pdfpagecount="???"
seq="??" partofspecissue="no">
<journalcode>UIIE</journalcode>
<issn type="print">0740-817X</issn>
<issn type="electronic">1545-8830</issn>
<coden>IIE Transactions, Vol. 116, No. 10, 2009, pp. 1–00</coden>
<author primaryauthor="yes" corresponding="yes" seq="1">
<name><prefix>Dr.</prefix><givenname>Yoshiyuki</givenname><middlename>M.
K.</middlename><surname>Ohno</surname></name>
<contactinfo>
<contact>
<position primaryaffiliation="yes" affilref="AF0001">
<email url="yohno-...@umin.ac.jp">yohno-...@umin.ac.jp</email>
<email url="mkyohno-...@umin.ac.jp">mkyohno-...@umin.ac.jp</email>
</position>
</contact>
<toctitle>Systemic Effect of Timolol Maleate Ophthalmic</toctitle>
<tocauthor>Yoshiyuki et al.</tocauthor>
<contact>
<position primaryaffiliation="yes" affilref="AF0003"/>
</contact>
<contact>
<address><internat><addline>Address correspondence to Yoshiyuki Ohno,
Department of Pharmacy, University of Tokyo Hospital, Faculty of Medicine,
University of Tokyo, 7-3-1, Hongo,
Bunkyo-ku</addline><city>Tokyo</city><postalcode>113-8655</postalcode><country>Japan</country><phone>+81-3-5800-9446</phone><fax>81-3-5800-8689</fax></internat></address>
</contact>
</contactinfo>
File: cleanup.txt
<text><toctitle>.*?<\/toctitle>\n?</text><with></with>
<text><tocauthor>.*?<\/tocauthor>\n?</text><with></with>
<text><\?xml version[^>]+></text><with><?xml version="1.0" encoding="utf-8"?></with>
<text><\!DOCTYPE (article|unarticle) [^>]+></text><with><!DOCTYPE $1 SYSTEM "http://cats.tfinforma.com/dtd/tfja/dtd/TFJA.dtd"></with>
<text><\?xml:stylesheet type="text\/xsl" href="[^"]+"\?>\n</text><with></with>
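For reference, here is a minimal sketch of the behaviour and of one possible workaround. It is not a tested fix for the real files. In s/$txt/$rep/ the contents of $rep are inserted as plain text, so a "$1" read from cleanup.txt is never expanded again. The sketch uses the /e modifier and passes the captures to a helper that fills in $1..$9 itself. The helper name expand_template, the variable names, and the sample DOCTYPE line are assumptions made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): expand $1..$9 in a
# replacement template using the capture values passed in.
sub expand_template {
    my ($template, @captures) = @_;    # copy first: @_ aliases $1, $2, ...
    $template =~ s/\$([1-9])/defined $captures[$1 - 1] ? $captures[$1 - 1] : ''/ge;
    return $template;
}

# Sample data mirroring one <text>/<with> pair from cleanup.txt.
my $txt = '<\!DOCTYPE (article|unarticle) [^>]+>';
my $rep = '<!DOCTYPE $1 SYSTEM "http://cats.tfinforma.com/dtd/tfja/dtd/TFJA.dtd">';

# Assumed input line; the real file may look different.
my $xml = qq{<!DOCTYPE article PUBLIC "-//TF//DTD" "old.dtd">\n<article>\n};

# Plain s/$txt/$rep/ inserts the text of $rep verbatim, so "$1" stays literal.
(my $plain = $xml) =~ s/$txt/$rep/g;

# With /e the replacement is code, so the captures can be handed to the helper.
(my $fixed = $xml) =~ s/$txt/expand_template($rep, $1, $2, $3, $4, $5, $6, $7, $8, $9)/ge;

print "plain: $plain";
print "fixed: $fixed";
```

Run as-is, the "plain" line should still show $1 and the "fixed" line should show article. Another option is s///ee, but that evaluates text from cleanup.txt as Perl code and needs the double quotes in the replacement escaped, which is why this sketch avoids it.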
Thanks
_______________________________________________
Perl-Win32-Users mailing list