Hello - Hopefully, this is an easy one. I have some ugly HTML like this:
    <span style="font-weight: bold;"><font style="font-family: Arial;" face=Arial>MMCM4</font></span>

I am trying to get rid of the <font> tags using HTML::TreeBuilder. Here is my script:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use HTML::TreeBuilder;

    my $filename = "test.htm";
    # "or" rather than "||" so that die applies to open(), not to the filename string
    open OUT, ">", "output.txt" or die "Can't open output.txt: $!";

    my $root = HTML::TreeBuilder->new;
    $root->ignore_text(0);
    $root->ignore_ignorable_whitespace(0);
    $root->no_space_compacting(1);
    $root->parse_file($filename);

    my @fonts = $root->look_down('_tag', 'font');
    foreach my $font (@fonts) {
        $font->tag(undef);
        $font->attr('face', undef);
        $font->attr('style', undef);
    }

    print OUT $root->as_HTML("", "", {});
    $root->delete();

And here is what the output looks like:

    <span style="font-weight: bold;"><>MMCM4</></span>

The problem is that although the font tag names and attributes are removed, the angle bracket pairs <> and </> are left behind. This causes the starting <> to be rendered in the browser.

I've tried using $font->detach and $font->delete, but these methods also remove the text content, which must be preserved.

It seems there must be something obvious I am missing.

Thanks
Dave
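
For reference, here is a minimal sketch of one approach that seems to fit the problem described above. HTML::Element (the class behind each node in the tree) provides replace_with_content, which puts an element's children in its place within the parent and returns the emptied element, so the text survives while the tag goes away. The helper name strip_font_tags is hypothetical and not part of the original script, and the sketch parses from a string rather than from test.htm so that it is self-contained. It assumes HTML::TreeBuilder is installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper (not from the original script): parse an HTML string,
# unwrap every <font> element, and return the serialized document.
sub strip_font_tags {
    my ($html) = @_;

    my $root = HTML::TreeBuilder->new;
    $root->ignore_ignorable_whitespace(0);
    $root->no_space_compacting(1);
    $root->parse($html);
    $root->eof;

    # Collect all <font> elements first, then unwrap each one.
    # replace_with_content moves the element's children into its parent
    # at the same position and returns the now-empty element, which is
    # then deleted.
    for my $font ($root->look_down('_tag', 'font')) {
        $font->replace_with_content->delete;
    }

    my $out = $root->as_HTML("", "", {});
    $root->delete;
    return $out;
}

print strip_font_tags(
    '<span style="font-weight: bold;"><font style="font-family: Arial;" face=Arial>MMCM4</font></span>'
), "\n";
```

Note that TreeBuilder wraps the fragment in html, head and body elements, so the printed result is a full document rather than just the span. In the original script, the same loop body could replace the three tag/attr calls inside the foreach.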