I love it when I answer my own questions 5 minutes later. I had been
sweating over this all night into this morning. Now I finally figured it
out: create a super-literal (to use Sean Burke's terminology):
use strict;
use HTML::TreeBuilder;
use utf8;
my $string = "m\x{c3}\x{b8}\x{c3}\x{b8}se";
my $outfile = 'treebuild3.html';
open O, ">$outfile" or die $!;
my $tree = HTML::TreeBuilder->new_from_content(<<"EOHTML");
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body>
</body>
</html>
EOHTML
# THE WINNING LINE!
my $literal = HTML::Element->new('~literal', text => $string);
my $body = $tree->look_down('_tag' => 'body');
$body->push_content($literal);
# Write the tree out, as in the original program quoted below.
print O $tree->as_HTML;
close O;
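For anyone who finds this later, here is a rough sketch of another route that might also work. As far as I can tell, the original version failed because as_HTML, called with no arguments, entity-encodes every high-bit byte in a text node. The ~literal element is emitted without any escaping, which is why it gets through. Passing '<>&' as the first argument to as_HTML should restrict escaping to those three characters, so a plain text node keeps its bytes. This assumes your installed HTML::Element accepts the entities argument as documented, and I have only reasoned it through for a byte string like the one above.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# UTF-8 encoded bytes for "møøse", the same string used in the program above.
my $string = "m\x{c3}\x{b8}\x{c3}\x{b8}se";

my $tree = HTML::TreeBuilder->new_from_content(<<"EOHTML");
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body>
</body>
</html>
EOHTML

my $body = $tree->look_down('_tag' => 'body');
$body->push_content($string);    # ordinary text node, no ~literal element

# Default call: high-bit bytes in text nodes come out as named entities
# (for example \xC3 becomes &Atilde;), which is what broke the output.
my $escaped = $tree->as_HTML;

# Restrict escaping to <, > and & so the UTF-8 bytes pass through unchanged.
my $raw = $tree->as_HTML('<>&');

binmode STDOUT;                  # emit the bytes as-is
print "default:\n$escaped\n";
print "with '<>&':\n$raw\n";

$tree->delete;                   # free the tree when done
```

One difference to keep in mind: a ~literal is never escaped at all, so a stray < or & in the string would end up in the page as raw markup, while the as_HTML('<>&') route still escapes those.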
On 11/17/05, Terrence Brannon <[EMAIL PROTECTED]> wrote:
>
> I would like to know how to place Unicode character sequences in an HTML
> file whose charset is utf-8. The plain perl program below works fine for
> this purpose, but I don't know what to do to get the HTML::TreeBuilder
> version to work.
>
> Also: I am not sure if this will remain the official support channel for
> HTML::Tree now that it has changed hands, so I am cc'ing the new maintainer
> as well.
>
> # Working Program
>
> use strict;
> #use utf8;
>
> my $string = "m\x{c3}\x{b8}\x{c3}\x{b8}se";
>
> open O, '>moose.html' or die $!;
>
> print O <<"EOHTML";
> <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
> </head>
> <body>
> $string
> </body>
> </html>
> EOHTML
>
> # Fails to preserve unicode characters
>
> use strict;
> use HTML::TreeBuilder;
>
>
> my $string = "m\x{c3}\x{b8}\x{c3}\x{b8}se";
>
> open O, '>tbmoose.html' or die $!;
>
> my $tree = HTML::TreeBuilder->new_from_content(<<"EOHTML");
> <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
> </head>
> <body>
>
> </body>
> </html>
> EOHTML
>
> my $body = $tree->look_down('_tag' => 'body');
> $body->push_content($string);
>
> print O $tree->as_HTML;
>
>
>