Eugeny Altshuler wrote on Wednesday, 8 March 2006, 11:35:
> Hello!
>
> I have such problem, I need to make multistring replacement... How can
> I do this?
>
> I tried to make perl script which acquires file from STDIN and prints
> result into STDOUT
>
> cat file.html | ./myscript

That's one way to pass the file content to a script via the STDIN filehandle.
A shorter way is to pass the filename to the script:

$ ./myscript file.html

Try out this code:

#!/usr/bin/perl
use strict;
use warnings;

open my $fh, '<', $ARGV[0] or die "can't open passed file '$ARGV[0]': $!";

local $/; # to slurp the file at once
my $joined=<$fh>;

print $joined;

# end of script

See:
perldoc perlvar (for @ARGV and $/)
perldoc -f open
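
If the script should accept both invocation styles (data piped to STDIN, or a filename as argument), the diamond operator <> is one option. This is only a minimal sketch; the name slurp_input is made up for this example and is not part of the script above:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the script above): returns the whole content
# of the first file named in @ARGV, or of STDIN when no argument was given.
sub slurp_input {
    local $/;              # undefined record separator: read everything at once
    my $content = <>;      # <> opens the file in $ARGV[0], else reads STDIN
    return defined $content ? $content : '';
}

# Usage in a script:  my $joined = slurp_input();
```

With this, both "cat file.html | ./myscript" and "./myscript file.html" should feed the same data into $joined.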

> I work with html-files and I want to convert them into necessary format.
> I want to delete all stuff before <BODY> and after </BODY>... Also I
> want to delete <SPAN> tags that can be split over several lines...
>
> $joined = join("", <>);
> $joined =~ s/.*<BODY>//ig;
> $joined =~ s/<SPAN[^<>]*>//ig;
> $joined =~ s/<\/SPAN>//ig;
> $joined =~ s/<FONT[^<>]*>//ig;
> $joined =~ s/<\/FONT>//ig;
> $joined =~ s/<BODY[^<>]*>//ig;
> $joined =~ s/<\/BODY.*>//ig;
> $joined =~ s/STYLE="[^"]*"//ig;

Without testing, the main problems here are:
- the missing /s modifier: your HTML has more than one line, and without /s
  the '.' does not match a newline
- use of .* (greedy) instead of .*? (non-greedy)

Studying the regexp man pages will help you get the regexps right.
See perldoc perlre and others.
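
To illustrate these two points, here is a small self-contained sketch. The sample string in $html is invented for the demonstration and is not taken from your file:

```perl
use strict;
use warnings;

# Invented sample input spanning three lines.
my $html = "<p>\n<SPAN class=a>one</SPAN> and <SPAN>two</SPAN>\n</p>";

# Without /s, '.' does not match "\n", so a pattern spanning lines fails.
my $without_s = $html =~ /<p>.*<\/p>/  ? 1 : 0;
my $with_s    = $html =~ /<p>.*<\/p>/s ? 1 : 0;

# Greedy .* runs on to the LAST </SPAN>; non-greedy .*? stops at the first.
my ($greedy) = $html =~ /(<SPAN.*<\/SPAN>)/s;
my ($lazy)   = $html =~ /(<SPAN.*?<\/SPAN>)/s;

print "without /s: $without_s, with /s: $with_s\n";  # without /s: 0, with /s: 1
print "greedy: $greedy\n";  # <SPAN class=a>one</SPAN> and <SPAN>two</SPAN>
print "lazy:   $lazy\n";    # <SPAN class=a>one</SPAN>
```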

Here is some untested code; it may or may not do what you expect :-)

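# keep only the BODY element (the opening tag may carry attributes)
($joined) = $joined =~ m,(<BODY[^>]*>.*?</BODY>),is or die "no body";
# remove opening and closing SPAN/FONT tags
$joined =~ s,</?(?:SPAN|FONT)\b[^>]*>,,gi;
# remove STYLE attributes, quoted with either " or '
$joined =~ s,STYLE\s*=\s*(["']).*?\1,,gis;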

This code is just a "hack" and assumes correct HTML syntax as well as no '>'
and '<' in tag attributes. Correct regexps would be longer.
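
Under the same assumptions, the steps can be wrapped in a function and tried on a small sample. This is only a sketch: clean_body and the sample HTML are made up for illustration, and unlike the snippet above it also drops the BODY tags themselves, as you asked for in your mail:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the content between <BODY ...> and </BODY>
# with SPAN/FONT tags and STYLE attributes removed, or undef if the input
# contains no BODY element. Assumes no '<' or '>' inside attribute values.
sub clean_body {
    my ($html) = @_;
    my ($body) = $html =~ m,<BODY[^>]*>(.*?)</BODY>,is or return;
    $body =~ s,</?(?:SPAN|FONT)\b[^>]*>,,gi;       # opening and closing tags
    $body =~ s,\s*STYLE\s*=\s*(["']).*?\1,,gis;    # quoted STYLE attributes
    return $body;
}

# Invented sample with a SPAN tag split over two lines.
my $sample = qq{<html><head><title>t</title></head>\n}
           . qq{<BODY bgcolor="white">\n}
           . qq{<SPAN\n class="x">Hello</SPAN> <FONT size=2>world</FONT>\n}
           . qq{<p STYLE="color: red">text</p>\n}
           . qq{</BODY></html>};

print clean_body($sample);
```

For the sample this should leave "Hello world" and "<p>text</p>" on separate lines, with all markup outside the body gone.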

A more structured way to do such things is to use one of the HTML
parsers on search.cpan.org. Some of them can deal with erroneous HTML, others
cannot.


hth,
Hans
