> > > > -----Original Message-----
> > > > From: Johnstone, Colin [mailto:[EMAIL PROTECTED]]
> > > > Sent: Thursday, January 23, 2003 12:29 PM
> > > > To: '[EMAIL PROTECTED]'
> > > > Subject: Removing HTML Tags
> > > > 
> > > > 
> > > > Gidday all,
> > > >  
> > > > When using our CMS (Interwoven Teamsite) I want to remove 
> > > > from any textarea any html tags that I don't want content 
> > > > contributors to use. One in particular is the font tag. Can 
> > > > one use a regex to remove these?
> > > >  
> > > > I guess I'm looking for a regex to remove anything between the 
> > > > font tags, e.g. <font> and </font>. Of course there could be 
> > > > any number of attributes in the opening font tag.
> > > >  
> > > > Any help appreciated
> > > > Thanking you in anticipation  
> > > >  
> > > > Colin Johnstone 
> > > > 
> > > >  
> > > > 
> > > > [Toby wrote]
> > > use strict;
> > > use warnings;
> > > 
> > > my @remove_tags = qw(i b font);
> > > 
> > > my $html = 'Some text. <i>italics</i> <b>bold</b> <font  
> > > class="myclass"
> > > size="2">Hi There</font> <font>ABC
> > > </font> <h1>Hi</h1>';
> > > 
> > > foreach my $tag (@remove_tags)
> > > {
> > >   $html =~ s!<$tag.*?>(.*?)</$tag>!$1!gs;
> >       
> >     # Actually this is probably a bit better :)
> >       $html =~ s!<$tag.*?>(.*?)</$tag>!$1!gsim;
> 
> 
> 
> [simran wrote]
> does it make sense to use the 's' and 'm' modifiers 
> together... doesn't
> 's' mean treat the text as a "single line" and 'm' mean "treat it as
> multiple lines"! ?
> 
> 

m = Makes ^ and $ match at the start and end of each line within the string
(just after and just before an embedded newline), rather than only at the
start and end of the whole string

s = Allows use of '.' to match a newline char

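The two do not conflict, so they can be combined, but I don't need both
here; adding 'm' was just a bad habit. The pattern has no ^ or $ anchors,
so the 'm' modifier does nothing in this case.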
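
To see the two modifiers side by side, here is a small sketch. The sample
string and the variable names are made up purely for illustration:

```perl
use strict;
use warnings;

my $text = "first line\nsecond line";

# Without /m, ^ matches only at the very start of the string.
my @plain = $text =~ /^(\w+)/g;

# With /m, ^ also matches just after each embedded newline.
my @multi = $text =~ /^(\w+)/mg;

# Without /s, '.' stops at a newline; with /s it crosses it.
my ($no_s)   = $text =~ /(first.*)/;
my ($with_s) = $text =~ /(first.*)/s;

print "plain:   @plain\n";
print "multi:   @multi\n";
print "no /s:   $no_s\n";
print "with /s: $with_s\n";
```

So 'm' only changes what the anchors do and 's' only changes what the dot
does, which is why they can be used together without clashing.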

Observe the following:

use strict;
use warnings;

my @remove_tags = qw(i b font);

# This html contains nested tags and
# some tags span multiple lines

my $html = 'Some text. <i>italics</i> <b>bold</b> <font  class="myclass"
size="2"><i>Hi There</i></font> <font>ABC
</font> <h1>Hi</h1>';

foreach my $tag (@remove_tags)
{
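        # /s lets '.' match newlines, so tags that span lines are removed
        $html =~ s!<$tag.*?>(.*?)</$tag>!$1!gis;
        # With /m instead of /s, '.' stops at each newline and the
        # multi-line font tags are left in place:
        #$html =~ s!<$tag.*?>(.*?)</$tag>!$1!gim;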
}

print $html;
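
One caveat with the pattern above: <$tag.*?> also matches longer tag names
that start with the same letters, so with 'b' in the list a <br> can be
taken as the opening tag. A possible tightening, offered as a rough sketch
rather than a complete answer (the sample html is made up for illustration),
is to put a word boundary after the tag name and keep the attribute match
from running past the closing '>':

```perl
use strict;
use warnings;

my @remove_tags = qw(i b font);

# Sample input: a <br> sits in front of a real <b> tag
my $html = 'line one<br>line two <b>bold</b> <font
size="2">Hi</font>';

foreach my $tag (@remove_tags)
{
        # \b stops "b" from matching <br> (or "i" from matching <img>);
        # [^>]* keeps the attribute match inside a single tag;
        # \Q...\E quotes the tag name in case it has regex metacharacters
        $html =~ s!<\Q$tag\E\b[^>]*>(.*?)</\Q$tag\E\s*>!$1!gis;
}

print $html;
```

This still will not cope with everything (a '>' inside an attribute value,
or the same tag nested inside itself, for example), so for arbitrary input
a real parser such as HTML::Parser from CPAN is probably the safer route.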



