Eugeny Altshuler on Wednesday, 8 March 2006, 11:35:
> Hello!
> I have a problem: I need to do a multi-line replacement. How can
> I do this?
> I tried to write a Perl script that reads a file from STDIN and prints
> the result to STDOUT:
> cat file.html | ./myscript

That's one way to pass the file content to a script via the STDIN filehandle.
A shorter way is to pass the filename to the script:

$ ./myscript file.html

Try out this code:

use strict;
use warnings;

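@ARGV or die "usage: $0 file.html\n";

# three-argument open: the filename is never interpreted as a mode
open my $fh, '<', $ARGV[0] or die "can't open passed file '$ARGV[0]': $!";

local $/; # undef the input record separator to slurp the file at once
my $joined=<$fh>;
close $fh;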

print $joined;

# end of script

See:
perldoc perlvar (for @ARGV and $/)
perldoc -f open
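
The slurp idiom can also be wrapped in a small helper so that the change to $/ stays confined to one sub. This is only a minimal sketch: slurp_fh is a hypothetical name, not part of the script above, and the in-memory filehandle is there just to make the example runnable without a file.

```perl
use strict;
use warnings;

# slurp_fh is a hypothetical helper name, used only for illustration.
sub slurp_fh {
    my ($fh) = @_;
    local $/;                 # input record separator is undef only inside this sub
    my $content = <$fh>;      # reads everything that is left on the handle
    return defined $content ? $content : '';
}

# Demo with an in-memory filehandle instead of a real file.
my $sample = "line one\nline two\n";
open my $fh, '<', \$sample or die "can't open in-memory file: $!";
my $joined = slurp_fh($fh);
close $fh;

print $joined;
```

With a real file, open my $fh, '<', $ARGV[0] takes the place of the in-memory open; passing \*STDIN to the helper should cover the "cat file.html | ./myscript" form.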

> I work with HTML files and want to convert them into the necessary format.
> I want to delete everything before <BODY> and after </BODY>. I also
> want to delete <SPAN> tags, which can be split across several lines.
> $joined = join("", <>);
> $joined =~ s/.*<BODY>//ig;
> $joined =~ s/<SPAN[^<>]*>//ig;
> $joined =~ s/<\/SPAN>//ig;
> $joined =~ s/<FONT[^<>]*>//ig;
> $joined =~ s/<\/FONT>//ig;
> $joined =~ s/<BODY[^<>]*>//ig;
> $joined =~ s/<\/BODY.*>//ig;
> $joined =~ s/STYLE="[^"]*"//ig;

Without testing, the main problems here are:
- the missing /s modifier: your HTML has more than one line, and without /s
  the '.' does not match a newline
- use of .* (greedy) instead of .*? (non-greedy)
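
To make the two points concrete, here is a small sketch. The sample strings are made up for illustration and are not taken from your file.

```perl
use strict;
use warnings;

# Made-up sample input, not from the original post.
my $html = "<HTML>\n<HEAD><TITLE>t</TITLE></HEAD>\n<BODY>\n<P>one</P>\n</BODY>\n</HTML>\n";

# Without /s, '.' does not match "\n": only text on the same line
# as <BODY> can be removed, so the <HEAD> part survives.
(my $without_s = $html) =~ s/.*<BODY>//i;

# With /s, '.' also matches newlines, so everything up to and
# including the first <BODY> is removed.
(my $with_s = $html) =~ s/.*?<BODY>//is;

print "without /s:\n$without_s\n";
print "with /s:\n$with_s\n";

# Greedy vs. non-greedy on a string with two tag pairs.
my $two = "<B>one</B> and <B>two</B>";
my ($greedy) = $two =~ m,<B>(.*)</B>,s;    # runs on to the last </B>
my ($lazy)   = $two =~ m,<B>(.*?)</B>,s;   # stops at the first </B>

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```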

Studying the regexp man pages will help you get the regexps right.
See perldoc perlre and the related pages (perlretut, perlrequick).

Here is some untested code; it may or may not do what you expect :-)

($joined) = $joined =~ m,(<BODY>.*?</BODY>),is or die "no body";

This code is just a "hack" and assumes correct HTML syntax as well as no '>'
and '<' in tag attributes. Correct regexps would be longer.
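
Putting the pieces together, the whole cleanup could look roughly like the sketch below. It rests on the same assumptions as the hack (correct HTML, no '<' or '>' inside attributes). clean_html is a hypothetical name and the sample document is made up; unlike the one-liner above, the body pattern here also allows attributes on the BODY tag and keeps only the text between the two tags.

```perl
use strict;
use warnings;

# clean_html is a hypothetical helper name, used only for illustration.
sub clean_html {
    my ($html) = @_;

    # Keep only what lies between <BODY ...> and </BODY>.
    my ($body) = $html =~ m,<BODY\b[^<>]*>(.*?)</BODY\s*>,is
        or die "no body";

    # Remove opening and closing SPAN and FONT tags. [^<>]* also
    # matches newlines, so a tag split over several lines is caught.
    $body =~ s,</?(?:SPAN|FONT)\b[^<>]*>,,ig;

    # Remove STYLE="..." attributes from the remaining tags.
    $body =~ s/\s+STYLE="[^"]*"//ig;

    return $body;
}

# Made-up sample document.
my $sample = <<'END';
<HTML><HEAD><TITLE>demo</TITLE></HEAD>
<BODY BGCOLOR="white">
<P STYLE="color: red"><SPAN
  CLASS="x">Hello</SPAN> <FONT SIZE="2">world</FONT></P>
</BODY>
</HTML>
END

my $cleaned = clean_html($sample);
print $cleaned;
```

Doing the body extraction first means the later substitutions never see the header, which keeps each pattern short.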

A more structured way to do such things is to use one of the HTML
parsers. Some of them can deal with erroneous HTML, others
cannot.

