> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, June 13, 2006 1:51 PM
> To: [email protected]
> Subject: Angle Brackets remain when tags removed using
> HTML::TreeBuilder
>
> Hello -
> Hopefully, this is an easy one.
>
> I have some ugly HTML like this:
>
> <span style="font-weight: bold;"><font
> style="font-family: Arial;"
> face=Arial>MMCM4</font></span>
>
> I am trying to get rid of the <font> tags using HTML::TreeBuilder.
>
> Here is my script:
>
> #!/usr/bin/perl
> use strict;
> use warnings;
> use HTML::TreeBuilder;
>
> my $filename = "test.htm";
> open OUT, ">", "output.txt" or die "Can't open $!";
>
> my $root = HTML::TreeBuilder->new;
> $root->ignore_text(0);
> $root->ignore_ignorable_whitespace(0);
> $root->no_space_compacting(1);
> $root->parse_file($filename);
>
> my @fonts = $root->look_down('_tag', 'font');
>
> foreach my $font (@fonts) {
> $font->tag(undef);
> $font->attr('face',undef);
> $font->attr('style',undef);
> }
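
Try replacing that loop with:

foreach my $font (@fonts) { $font->replace_with_content->delete; }

replace_with_content puts the <font> element's children in its place in
the parent's content list and returns the now-empty element, which
delete then frees. That's untested, but I think it will do what you want.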
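
For what it's worth, here is a slightly fuller sketch of the same idea, run against your sample markup as a string instead of test.htm. Treat it as a rough illustration rather than a tested fix: the sub name strip_font_tags and the $ugly variable are made up for this example, and I'm assuming you want to keep everything inside the <font> element, just not the element itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper (not part of HTML::TreeBuilder): removes every <font>
# element from a chunk of HTML while keeping whatever the element contained.
sub strip_font_tags {
    my ($html) = @_;

    my $root = HTML::TreeBuilder->new;
    $root->no_space_compacting(1);
    $root->parse($html);
    $root->eof;    # tell the parser there is no more input

    # In list context, look_down returns every matching element.
    foreach my $font ($root->look_down('_tag', 'font')) {
        # Move the element's children up into its parent, then free the
        # now-empty <font> element that replace_with_content returns.
        $font->replace_with_content->delete;
    }

    my $out = $root->as_HTML;
    $root->delete;    # release the tree once we have the output
    return $out;
}

# Sample markup from the original question, joined onto one line.
my $ugly = '<span style="font-weight: bold;">'
         . '<font style="font-family: Arial;" face=Arial>MMCM4</font></span>';

print strip_font_tags($ugly), "\n";
```

The printed result should keep the <span> and the MMCM4 text with no <font> element around it. Since TreeBuilder adds the implicit <html>, <head> and <body> elements by default, expect those in the output as well.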
Forrest Cahoon
not speaking for merrill corporation