Here is the Perl code I am using to run substitutions on the XML file 
UIIE_A_123456.xml.
The search strings and replacement strings come from another file 
(cleanup.txt).

After substitution, however, the variable in the replacement string is not 
interpolated as defined in cleanup.txt. That is, the output contains
<!DOCTYPE $1 SYSTEM "http://cats.tfinforma.com/dtd/tfja/dtd/TFJA.dtd";>
when it should be
<!DOCTYPE article SYSTEM "http://cats.tfinforma.com/dtd/tfja/dtd/TFJA.dtd";>

Perl Code:

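#!/usr/bin/perl
use strict;
use warnings;

# Slurp whole files instead of reading them line by line.
local $/ = undef;

open(my $in, '<', 'c:/UIIE_A_123456.xml') or die "Cannot open UIIE_A_123456.xml: $!";
$_ = <$in>;
close($in);

xml_cleanup();

open(my $out, '>', 'c:/UIIE_A_123456') or die "Cannot open output file: $!";
print {$out} $_;
close($out);

# Apply every <text>pattern</text><with>replacement</with> rule to $_.
sub xml_cleanup
{
    open(my $rules_fh, '<', 'c:/cleanup.txt') or die "Cannot open cleanup.txt: $!";
    my $cleanup = <$rules_fh>;
    close($rules_fh);

    while ($cleanup =~ /<text>(.*?)<\/text><with>(.*?)<\/with>/gc)
    {
        my ($txt, $rep) = ($1, $2);
        # Problem: $rep is inserted as-is, so a "$1" inside it is not expanded.
        if ($_ =~ s/$txt/$rep/g) { print "Replaced $txt with $rep\n"; }
        else                     { print "Not found $txt\n"; }
    }
}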
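
One possible way around this, as a sketch only: the loop above puts $rep into the replacement side of s/// as a plain string, so Perl never looks inside it for $1. The code below walks the matches by hand and expands $1, $2, ... in the replacement text itself, without eval. The name apply_rule, the sample input and the shortened DTD URL (example.com) are assumptions made for illustration and are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): applies one
# search/replace rule and expands $1, $2, ... in the replacement text.
sub apply_rule {
    my ($text, $pattern, $template) = @_;
    my ($out, $last, $count) = ('', 0, 0);

    while ($text =~ /$pattern/g) {
        my ($start, $end) = ($-[0], $+[0]);

        # Copy the capture groups now; the inner s/// below resets @- and @+.
        my @groups = map {
            defined $-[$_] ? substr($text, $-[$_], $+[$_] - $-[$_]) : ''
        } 1 .. $#+;

        # Expand $N placeholders; unknown group numbers become empty strings.
        (my $replacement = $template) =~
            s/\$([1-9]\d*)/defined $groups[$1 - 1] ? $groups[$1 - 1] : ''/ge;

        # Keep the text between matches, then append the expanded replacement.
        $out .= substr($text, $last, $start - $last) . $replacement;
        $last = $end;
        $count++;
    }
    return ($out . substr($text, $last), $count);
}

# Sample input and rule; the DTD URL is shortened for the example.
my $xml = qq{<!DOCTYPE article PUBLIC "old" "old.dtd">\n<article/>\n};
my ($result, $n) = apply_rule(
    $xml,
    '<\!DOCTYPE (article|unarticle) [^>]+>',
    '<!DOCTYPE $1 SYSTEM "http://example.com/TFJA.dtd">',
);
print $result;
print "Replacements made: $n\n";
```

In the script, the s/$txt/$rep/g line could then be replaced by a call such as ($_, my $n) = apply_rule($_, $txt, $rep). Another route that is often suggested is the /ee modifier, but that evaluates the replacement text as Perl code, which looks risky when the rules come from a file.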

File: UIIE_A_123456.xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE $1 SYSTEM "http://cats.tfinforma.com/dtd/tfja/dtd/TFJA.dtd";>
<article doi="10.1080/10826080600847985" tagger="TechBooks" articleid="123456" 
copyrighttf="yes" copyrightowner="TF" documenttype="Original" yearofpub="2009" 
numcolorpages="???" id="a123456" coverdate="2009">
<meta contenttype="Review-static/background" productid="UIIE" volumenum="116" 
issuenum="10" firstpage="1" lastpage="00" pagecount="?" 
pdffilename="UIIE_A_123456_O.pdf" pdffilesize="?????" pdfpagecount="???" 
seq="??" partofspecissue="no">
<journalcode>UIIE</journalcode>
<issn type="print">0740-817X</issn>
<issn type="electronic">1545-8830</issn>
<coden>IIE Transactions, Vol. 116, No. 10, 2009, pp. 1&ndash;00</coden>
<author primaryauthor="yes" corresponding="yes" seq="1">
<name><prefix>Dr.</prefix><givenname>Yoshiyuki</givenname><middlename>M. 
K.</middlename><surname>Ohno</surname></name>
<contactinfo>
<contact>
<position primaryaffiliation="yes" affilref="AF0001">
<email url="yohno-...@umin.ac.jp">yohno-...@umin.ac.jp</email>
<email url="mkyohno-...@umin.ac.jp">mkyohno-...@umin.ac.jp</email>
</position>
</contact>
<toctitle>Systemic Effect of Timolol Maleate Ophthalmic</toctitle>
<tocauthor>Yoshiyuki et al.</tocauthor>
<contact>
<position primaryaffiliation="yes" affilref="AF0003"/>
</contact>
<contact>
<address><internat><addline>Address correspondence to Yoshiyuki Ohno, 
Department of Pharmacy, University of Tokyo Hospital, Faculty of Medicine, 
University of Tokyo, 7-3-1, Hongo, 
Bunkyo-ku</addline><city>Tokyo</city><postalcode>113-8655</postalcode><country>Japan</country><phone>&plus;81-3-5800-9446</phone><fax>81-3-5800-8689</fax></internat></address>
</contact>
</contactinfo>

File: cleanup.txt
<text><toctitle>.*?<\/toctitle>\n?</text><with></with>
<text><tocauthor>.*?<\/tocauthor>\n?</text><with></with>
<text><\?xml version[^>]+></text><with><?xml version="1.0" encoding="utf-8"?></with>
<text><\!DOCTYPE (article|unarticle) [^>]+></text><with><!DOCTYPE $1 SYSTEM "http://cats.tfinforma.com/dtd/tfja/dtd/TFJA.dtd";></with>
<text><\?xml:stylesheet type="text\/xsl" href="[^"]+"\?>\n</text><with></with>
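
For reference, the rule format above could also be read into a list of pattern/replacement pairs before anything is substituted. This is only a sketch: parse_rules is a hypothetical name, the two sample rules are shortened copies of the ones above, and the /s modifier (which lets a rule continue over a line break) is my assumption about how the file should be read.

```perl
use strict;
use warnings;

# Hypothetical helper: split the rule text into [pattern, replacement] pairs.
sub parse_rules {
    my ($rules_text) = @_;
    my @rules;
    while ($rules_text =~ m{<text>(.*?)</text><with>(.*?)</with>}gs) {
        my ($pattern, $replacement) = ($1, $2);

        # Compile each pattern once; skip any that are not valid regexes.
        my $compiled = eval { qr/$pattern/ };
        if (!defined $compiled) {
            warn "Skipping invalid pattern: $pattern\n";
            next;
        }
        push @rules, [$compiled, $replacement];
    }
    return @rules;
}

# Shortened sample rules in the cleanup.txt format.
my $rules_text = <<'END_RULES';
<text><toctitle>.*?<\/toctitle>\n?</text><with></with>
<text><\!DOCTYPE (article|unarticle) [^>]+></text><with><!DOCTYPE $1 SYSTEM "TFJA.dtd"></with>
END_RULES

my @rules = parse_rules($rules_text);
printf "%d rule(s) loaded\n", scalar @rules;
```

Each pair holds a compiled regex and the raw replacement text, so a "$1" in the replacement is still a literal string at this point and has to be expanded at substitution time.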


Thanks