James.Q.L wrote:
> --- Rob Dixon <[EMAIL PROTECTED]> wrote:
>
> > James.Q.L wrote:
> > >
> > > I am trying to add html tag to an existing html file using HTML::TreeBuiler.
> > >
> > > the problem is that the added tags isn't encoded after the output. everything
> > > els is fine.
> > >
> > > ###########
> > > my $tree = HTML::TreeBuilder->new;
> > > $tree->no_space_compacting(1);
> > > $tree->store_comments(1);
> > > $tree->parse_file("f.html");
> > >
> > > my $ele;
> > > my $tag = '<tr><td>TEST</td></tr>';
> > >
> > > for (@{ $tree->extract_links('a', 'href') }) {
> > >     my($link, $element, $attr, $tag) = @$_;
> > >     if ($link=~ m{/mylink\.htm}) {
> > >         $ele = $element;
> > >         last;
> > >     }
> > > }
> > > $ele = $ele->parent(); # td
> > > $ele = $ele->parent(); # tr
> > > my $r = $ele->postinsert($tag);
> > >
> > > open F,">f.html";
> > > print F $tree->as_HTML('<>&',' ',{});
> > > close F;
> > > $tree->delete;
> > >
> > > the output in html turn out to be
> > >
> > > &lt;tr&gt;&lt;td&gt;&lt;b&gt;TEST&lt;/b&gt;&lt;/td&gt;&lt;/tr&gt;
> > >
> > >
> >
> > The 'postinsert' method expects an HTML::Element object to add
> > into the tree, not simple HTML text. The easiest way to build
> > what you want here is probably with the 'new_from_lol' method
> > like this.
> >
> >   my $tag = HTML::Element->new_from_lol (['tr', ['td', 'TEST']])
> >
> > HTH,
> >
> > Rob
> >
> >
> From HTML::Element doc,
> $h->postinsert($element_or_text...)
> i suppose it means postinsert accept $element or text. and the insertion didn't
> yield any error
> with text as parameter.and it looks much simpler/easier than the new_from_lol.
>
> using HTML::Element->new_from_lol (['tr', ['td', 'TEST']]) did lead to the correct
> html output.
> but when i am not using 'TEST' directly in new_from_lol
>
> $insert= <<TEST;
> <tr><td>some more tags here</td></tr>
> TEST
> $h = HTML::Element->new_from_lol (['tr', ['td', $insert]]);
> then do postinsert($h);
>
> the tags in $insert still don't get decoded after output.

If you want to insert a new piece of HTML then you have to supply
'postinsert' with an HTML::Element object. If you supply plain text
then that text is inserted literally, as a text node. That's why
the < and > brackets have been translated to &lt; and &gt; in the
output, which is great if you want to display HTML source in the
browser, but not for much else. The same applies to a string passed
as content to 'new_from_lol': it is treated as text, not parsed.

The 'new_from_lol' method lets you build HTML elements from lists
of lists ('lols'), but if you need to create an element directly
from HTML then you have a problem, as HTML::TreeBuilder->new_from_content
will wrap whatever you give it in <html>, <head>, <body>, <table> etc.
tags to make it into a valid HTML page. As Sean says, you can search
the resulting tree for the part you really want, but there is a
better way. TreeBuilder is kind enough to set an attribute called
'_implicit' on those elements that weren't present in the original
HTML, so you can just search for the first element with '_implicit'
undefined, like this:

  #!perl
  use strict;
  use warnings;

  use HTML::TreeBuilder;

  my $insert = <<TEST;
  <tr>
    <td>
      some more tags here
    </td>
  </tr>
  TEST

  my $element = HTML::TreeBuilder->new_from_content($insert);
  $element = $element->look_down('_implicit', undef);
  print $element->as_HTML;

  __END__

  OUTPUT

  <tr><td>some more tags here</td></tr>

So you can then go ahead and 'postinsert' your new $element.

If you have a fixed piece of HTML that you want to add then you
would still be better off coding it up using 'new_from_lol', but if
the content varies then you could package the lines above as a
subroutine:

  sub html_element {
    my $html = shift;
    my $element = HTML::TreeBuilder->new_from_content($html);
    # look_down returns every match in list context, so force
    # scalar context to get only the first (outermost) element
    return scalar $element->look_down('_implicit', undef);
  }

and then call it like this:

  my $r = $ele->postinsert(html_element($tag));

HTH,

Rob