On 4/26/05, N. Ganesh Babu wrote:
> 
> the above code is not working if the input is like this.   "A Practical
> Guide to <u>CD-Rom</u> and <u>DVD</u>"
> the output "A Practical Guide to CD-Rom and CD-Rom"
> 

One way is to collect the texts between the <u> and </u> tags in a list. I
chose to do that in the same step as the substitution, using the "e"
modifier:
   my @un;
   $line=~s!<u>(.+?)</u>!push @un,$1;"<u></u>"!ige;

After processing, put it back in using a for loop over the list of
saved texts, or an s///e construct:
   $line=~s!<u></u>!shift @un!ige;

Note the question mark in ".+?". In reference to your question, yes,
you must use it, or the match will be greedy and run from the first <u>
to the last </u> on the line.
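
To make the greedy/non-greedy difference concrete, here is a small sketch that runs both patterns over the two-tag line from the question. The array names @lazy and @greedy are only illustrative; nothing else is assumed beyond core Perl.

```perl
use strict;
use warnings;

my $line = 'A Practical Guide to <u>CD-Rom</u> and <u>DVD</u>';

# Non-greedy: ".+?" stops at the first </u>, so each pair is matched separately.
my @lazy = $line =~ m!<u>(.+?)</u>!ig;

# Greedy: ".+" runs on to the last </u> on the line, so there is only one match.
my @greedy = $line =~ m!<u>(.+)</u>!ig;

print "non-greedy: $_\n" for @lazy;     # CD-Rom, then DVD
print "greedy:     $_\n" for @greedy;   # CD-Rom</u> and <u>DVD
```

With the greedy pattern the single capture contains the inner "</u> and <u>" as well, which is why the two tags can no longer be saved and restored separately.
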
Personally I think you're working too hard - you should be able to do
any processing on the line and not touch the <u> delimited text,
without having to resort to removing it. But of course TIMTOWTDI :-)
Here is a complete working example:
###################### begin code
use strict;
use warnings;
while(defined(my $line=<DATA>)) {
   print '-' x 80 , "\n";
   print "Original line: $line";
   my @un;
   $line=~s!<u>(.+?)</u>!push @un,$1;"<u></u>"!ige;
   #do something with line...
   print "line after data removal: $line";
   # put back the data
   $line=~s!<u></u>!shift @un!ige;
   print "line after data replace: $line";
}

__DATA__
A Practical Guide to <u>CD-Rom</u>
A Practical Guide to <u>CD-Rom</u> and <u>DVD</u>
###################### end code
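
As for processing the line without removing the <u> delimited text at all, one possible sketch is to split the line on the tagged chunks and only touch the pieces in between. The lowercasing below is a stand-in for whatever processing is really needed, and @parts and $result are illustrative names; this is an assumption about the goal, not the only way to do it.

```perl
use strict;
use warnings;

my $line = 'A Practical Guide to <u>CD-Rom</u> and <u>DVD</u>';

# A capturing group in the split pattern keeps the <u>...</u> chunks in the list.
my @parts = split m!(<u>.+?</u>)!i, $line;

for my $part (@parts) {
    next if $part =~ m!^<u>!i;   # leave the underlined text untouched
    $part = lc $part;            # placeholder processing: lowercase the rest
}

my $result = join '', @parts;
print "$result\n";   # a practical guide to <u>CD-Rom</u> and <u>DVD</u>
```

Since the tagged chunks never leave the list, there is nothing to put back afterwards, and the tags themselves stay in the line.
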

BTW, I would be happy if any of the gurus on the list could shorten:
   my @un;
   $line=~s!<u>(.+?)</u>!push @un,$1;"<u></u>"!ige;
To a single line. Unlike "m//", "s///" never seems to return the
results of "()" in the RE, even in list context. Annoying :-(
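
For comparison, here is a sketch of what the two operators return on the same line: "m//g" in list context gives back the captured texts, while "s///g" gives back the number of substitutions it made. This is still two statements, so it does not answer the one-liner question; $count is just an illustrative name.

```perl
use strict;
use warnings;

my $line = 'A Practical Guide to <u>CD-Rom</u> and <u>DVD</u>';

# m//g in list context returns every captured group.
my @un = $line =~ m!<u>(.+?)</u>!ig;

# s///g returns the number of substitutions made, not the captures.
my $count = ($line =~ s!<u>.+?</u>!<u></u>!ig);

print "saved: @un\n";     # saved: CD-Rom DVD
print "count: $count\n";  # count: 2
print "line:  $line\n";   # line:  A Practical Guide to <u></u> and <u></u>
```
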

HTH,
-- 
Offer Kaye
