James.Q.L wrote:

> --- Rob Dixon <[EMAIL PROTECTED]> wrote:
>
> > James.Q.L wrote:
> > >
> > > I am trying to add html tag to an existing html file using HTML::TreeBuilder.
> > > 
> > > the problem is that the added tags aren't decoded in the output. everything 
> > > else is fine.
> > > 
> > > ###########
> > > my $tree = HTML::TreeBuilder->new; 
> > > $tree->no_space_compacting(1);
> > > $tree->store_comments(1);
> > > $tree->parse_file("f.html");
> > > 
> > > my $ele;
> > > my $tag = '<tr><td>TEST</td></tr>';
> > > 
> > > for (@{  $tree->extract_links('a', 'href')  }) {
> > >       my($link, $element, $attr, $tag) = @$_;
> > >   if ($link=~ m{/mylink\.htm}) {
> > >   $ele = $element;
> > >   last;
> > >   }
> > > }
> > > $ele = $ele->parent(); # td
> > > $ele = $ele->parent(); # tr
> > > my $r = $ele->postinsert($tag);
> > > 
> > > open F,">f.html";
> > > print F $tree->as_HTML('<>&',' ',{});
> > > close F;
> > > $tree->delete;
> > > 
> > > the output in html turn out to be
> > > 
> > > &lt;tr&gt;&lt;td&gt;&lt;b&gt;TEST&lt;/b&gt;&lt;/td&gt;&lt;/tr&gt;
> > > 
> > > 
> > 
> > The 'postinsert' method expects an HTML::Element object to add
> > into the tree, not simple HTML text. The easiest way to build
> > what you want here is probably with the 'new_from_lol' method
> > like this.
> > 
> >   my $tag = HTML::Element->new_from_lol (['tr', ['td', 'TEST']])
> > 
> > HTH,
> > 
> > Rob
> > 
> 
> 
> From HTML::Element doc,
>     $h->postinsert($element_or_text...)
> i suppose it means postinsert accept $element or text. and the insertion didn't 
> yield any error
> with text as parameter.and it looks much simpler/easier than the new_from_lol. 
> 
> using HTML::Element->new_from_lol (['tr', ['td', 'TEST']]) did lead to the correct 
> html output.
> but when i am not using 'TEST' directly in new_from_lol
> 
> $insert= <<TEST;
> <tr><td>some more tags here</td></tr>
> TEST
> $h = HTML::Element->new_from_lol (['tr', ['td', $insert]]);
> then do postinsert($h);
> 
> the tags in $insert still don't get decoded after output.

If you want to insert a new piece of HTML then you have to supply
'postinsert' with an HTML::Element object. If you supply plain text
then that text will be inserted literally as a text node. That's why
the < and > brackets have been translated to &lt; and &gt;, which is
great if you want to display HTML source in the browser, but not for
much else. The 'new_from_lol' method lets you build HTML elements from
lists of lists ('lols'), and any string inside a lol is again treated
as plain text. If you need to create an element directly from HTML
then you have a problem, as HTML::TreeBuilder->new_from_content
will wrap whatever you give it in <html>, <head>, <body>, <table> etc.
tags to make it into a valid HTML page.
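
To make the difference concrete, here is a minimal sketch (my own toy
example, not taken from your file) that inserts the same kind of row
twice, once as a string and once as an element. The table and the cell
texts are made up for illustration.

```perl
use strict;
use warnings;
use HTML::Element;

# Build a tiny table so the <tr> has a parent; postinsert needs one.
my $table = HTML::Element->new_from_lol(['table', ['tr', ['td', 'first']]]);
my $row   = $table->look_down(_tag => 'tr');

# A plain string becomes a text node, so as_HTML escapes its angle brackets.
$row->postinsert('<tr><td>TEXT</td></tr>');

# An HTML::Element object is inserted as real markup.
$row->postinsert(HTML::Element->new_from_lol(['tr', ['td', 'ELEMENT']]));

# Same as_HTML arguments as in the original script, minus the indent.
my $html = $table->as_HTML('<>&', undef, {});
print $html, "\n";
```

The string version should show up with &lt;tr&gt; in the output, while
the element version appears as a real <tr> row.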

As Sean says, you can search the resulting tree for the part you
really want, but there is a better way. TreeBuilder is kind enough
to set an attribute called '_implicit' on those elements that weren't
present in the original HTML, so you can just search for the first
element with '_implicit' undefined, like this:


#!perl
use strict;
use warnings;

use HTML::TreeBuilder;

my $insert= <<TEST;
<tr>
  <td>
    some more tags here
  </td>
</tr>
TEST

my $element = HTML::TreeBuilder->new_from_content($insert);
$element = $element->look_down('_implicit', undef);

print $element->as_HTML;

__END__

OUTPUT

<tr><td>some more tags here</td></tr>


So you can then go ahead and 'postinsert' your new $element.

If you have a fixed piece of HTML that you want to add then
you would still be better off coding it up using 'new_from_lol',
but if the content varies then you could package the lines
above as a subroutine:


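sub html_element {
  my $html = shift;
  # Parse the fragment; TreeBuilder adds implicit wrapper elements
  my $element = HTML::TreeBuilder->new_from_content($html);
  # Return the first element that was really present in $html
  return $element->look_down('_implicit', undef);
}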


and then call it like this


my $r = $ele->postinsert(html_element($tag));
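
If you call this many times you may also want to throw away the
temporary wrapper tree each time. The following is only a sketch of
one way to do that: 'html_fragment' is a name I made up, and it
assumes that detaching the element before deleting the tree is
acceptable for your use.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper (not from the thread): parse a fragment and return
# its first explicit element, detached from the temporary tree.
sub html_fragment {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    my $frag = $tree->look_down('_implicit', undef);
    $frag->detach if $frag;   # cut it loose before discarding the wrapper
    $tree->delete;            # release the implicit <html>/<body> wrapper
    return $frag;             # undef if the input held no explicit element
}

my $frag = html_fragment('<tr><td>some more tags here</td></tr>');
print $frag->as_HTML('<>&', undef, {}), "\n" if $frag;
```

The undef return lets the caller notice input that contained no
explicit element at all (plain text, for instance) before passing it
on to 'postinsert'.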


HTH,

Rob
