On 23/02/2012 00:59, Webley Silvernail wrote:
I have an HTML page that is updated automatically each day. I am using HTML::TreeBuilder to create and insert the new content. Most of the time this works fine, but I've hit a snag when existing text nodes on the page include a gt or lt symbol.

For example, I might have an existing element on the page that looks like this:

    <td>&lt;B</td>

When the page is updated, depending on how I print the output, this may cause problems. Some techniques I use to print the output work OK for the new part but affect the existing content adversely. Other techniques work well with the existing content but cause problems with the new content.

Here are some of the output approaches I have tried:

I.
    print OUT $root->as_HTML('', '', {});

Results: new content looks good, but the existing content is affected:

    <td><B</td>

The browser won't render this and generally just blanks out the text node.

II.
    print OUT $root->as_HTML('<>&', '', {});

Results: existing content looks good; new content is output with all of the < > in the HTML source encoded as entity references (i.e. raw HTML is rendered by the browser).

III.
    use Encode qw(encode decode);
    ...
    my $string_rep = $root->as_HTML('<>&', '', {});
    print OUT encode('UTF-8', $string_rep);

Results: same as test II.

IV.
    use HTML::Entities;
    ...
    my $string_rep = $root->as_HTML('<>&', '', {});
    print OUT encode_entities($string_rep);

Results: the entire page is output with all of the < > in the HTML source encoded as entity references (i.e. raw HTML is rendered by the browser).

V. Various iterations of the above approaches using a subsequent call to HTML Tidy to attempt to clean up the HTML.
Hey Webley,

Approach II is the correct one. The problem is with the way you are adding your new content, which is presumably as a text string (in which case HTML::Element is correct to render it as text!).

The correct way is to build an HTML::Element tree with calls like

    use HTML::TreeBuilder;    # also loads HTML::Element

    # Parse the existing page
    my $tree = HTML::TreeBuilder->new_from_content($content);

    # Build the new content as an element, not as a string of markup
    my $new = HTML::Element->new('b');
    $new->push_content('This text in BOLD');

    # Locate the insertion point and attach the new element
    my $place = $tree->look_down(_tag => 'div', id => 'insertion');
    $place->push_content($new);

all depending on what you want to insert and how you locate the place in the document to insert it. The above will build content like

    <b>This text in BOLD</b>

and insert it under an element marked

    <div id="insertion">

An alternative is to pass your new string to HTML::TreeBuilder to build a new HTML fragment from your string and then insert that into your document.

HTH,

Rob
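
The alternative mentioned above, parsing the new string into a fragment and inserting the resulting nodes, might look roughly like the sketch below. The sample page in $content, the new markup in $new_html, and the id "insertion" are made up for illustration. The sketch assumes the parser wraps the fragment in an implicit <body>, so the parsed nodes are taken from there.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical existing page: one text node contains an encoded "<".
my $content = '<html><body><table><tr><td>&lt;B</td></tr></table>'
            . '<div id="insertion"></div></body></html>';

# Hypothetical new content, held as a string of markup.
my $new_html = '<p>Updated <b>today</b></p>';

my $tree     = HTML::TreeBuilder->new_from_content($content);
my $fragment = HTML::TreeBuilder->new_from_content($new_html);

# The parser adds implicit <html>/<head>/<body> elements around the
# fragment, so pull the parsed nodes out of the <body>.
my $body  = $fragment->look_down(_tag => 'body');
my @nodes = $body->detach_content;

# Attach the parsed nodes (elements, not text) at the insertion point.
my $place = $tree->look_down(_tag => 'div', id => 'insertion');
$place->push_content(@nodes);

# Approach II: encode < > & in text nodes only; element tags stay as tags.
my $html = $tree->as_HTML('<>&', '', {});
print $html, "\n";

# Free the trees explicitly (needed on older HTML::Tree releases).
$fragment->delete;
$tree->delete;
```

Because the new content enters the tree as elements rather than as a string, the same as_HTML('<>&', '', {}) call should keep the existing text node encoded while emitting the new markup as real tags.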