Re: question with as_html output
Rob Dixon wrote:
>
> Rob Dixon wrote:
> >
> > If you have a fixed piece of HTML that you want to add then
> > you would still be better off coding it up using 'new_from_lol',
> > but if the content varies then you could package the lines
> > above as a subroutine:
> >
> >   sub html_element {
> >     my $html = shift;
> >     my $element = HTML::TreeBuilder->new_from_content($html);
> >     $element->look_down('_implicit', undef);
> >   }
> >
> > and then call it like this
> >
> >   my $r = $ele->postinsert(html_element($tag));
>
> Another point: this will work, but will result in a memory
> leak as HTML::TreeBuilder objects aren't destroyed
> automatically when they go out of scope. This way is better:
>
>   sub html_element {
>     my $tree = HTML::TreeBuilder->new_from_content(shift);
>     my $element = $tree->look_down('_implicit', undef);
>     $element->detach;
>     $tree->delete;
>     $element;
>   }

I'll shut up after this post!

I've just been re-reading the POD for HTML::TreeBuilder, and note
there is a method 'disembowel' which does exactly what I've coded
above, which makes things a lot neater. It looks like this (but don't
forget that you still need to call $element->delete when you're done
using the code fragment).

HTH,

Rob

  #!perl
  use strict;
  use warnings;

  use HTML::TreeBuilder;

  my $insert = <<TEST;
  some more tags here
  TEST

  my $element = HTML::TreeBuilder->new_from_content($insert)->disembowel;

  print $element->as_HTML;

**OUTPUT**

  some more tags here
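A minimal sketch of the 'disembowel' approach wrapped as a helper, assuming an HTML::TreeBuilder recent enough to provide that method. The name html_fragment and the sample markup are made up for illustration and are not from the thread.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: parse an HTML fragment and hand back its top-level
# non-implicit nodes, discarding the implicit html/head/body wrapper.
sub html_fragment {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    # In list context disembowel returns every top-level node of the fragment
    return $tree->disembowel;
}

my ($node) = html_fragment('<div><b>TEST</b></div>');

# Record what we got before freeing the fragment
my $summary = $node->tag . ': ' . $node->as_text;
print "$summary\n";

# The caller owns the fragment and still has to free it explicitly
$node->delete;
```

Calling the helper in list context avoids the scalar-context special case, where several top-level nodes come back wrapped in an extra div.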
Re: question with as_html output
Rob Dixon wrote:
>
> If you have a fixed piece of HTML that you want to add then
> you would still be better off coding it up using 'new_from_lol',
> but if the content varies then you could package the lines
> above as a subroutine:
>
>   sub html_element {
>     my $html = shift;
>     my $element = HTML::TreeBuilder->new_from_content($html);
>     $element->look_down('_implicit', undef);
>   }
>
> and then call it like this
>
>   my $r = $ele->postinsert(html_element($tag));

Another point: this will work, but will result in a memory
leak as HTML::TreeBuilder objects aren't destroyed
automatically when they go out of scope. This way is better:

  sub html_element {
    my $tree = HTML::TreeBuilder->new_from_content(shift);
    my $element = $tree->look_down('_implicit', undef);
    $element->detach;
    $tree->delete;
    $element;
  }

Apologies.

Rob
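As a rough illustration of how the detach-and-delete helper might be used end to end, the sketch below inserts a parsed fragment into a second tree. The ids 'first' and 'second' and the sample markup are assumptions for the example only.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Same shape as the helper above: parse, pick the first node that is not
# implicit, cut it loose, then free the throwaway tree.
sub html_element {
    my $tree    = HTML::TreeBuilder->new_from_content(shift);
    my $element = $tree->look_down('_implicit', undef);
    $element->detach if $element;
    $tree->delete;
    return $element;
}

# Hypothetical target document with a single paragraph
my $page  = HTML::TreeBuilder->new_from_content('<p id="first">one</p>');
my $first = $page->look_down(id => 'first');

# Insert the new element right after the existing paragraph
$first->postinsert(html_element('<p id="second">two</p>'));

# Collect the ids in document order before freeing the tree
my @ids = map { $_->attr('id') } $page->look_down(_tag => 'p');
print "@ids\n";

$page->delete;
```

Once the element has been inserted it belongs to $page, so deleting $page frees it along with everything else.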
Re: question with as_html output
--- Rob Dixon <[EMAIL PROTECTED]> wrote:
> James.Q.L wrote:
> >
> > --- Rob Dixon <[EMAIL PROTECTED]> wrote:
> > >
> > > James.Q.L wrote:
> > > >
> > > > I am trying to add html tag to an existing html file using
> > > > HTML::TreeBuilder.
> > > >
> > > > the problem is that the added tags isn't encoded after the output.
> > > > everything else is fine.
> > > >
> > > > ###
> > > > my $tree = HTML::TreeBuilder->new;
> > > > $tree->no_space_compacting(1);
> > > > $tree->store_comments(1);
> > > > $tree->parse_file("f.html");
> > > >
> > > > my $ele;
> > > > my $tag = 'TEST';
> > > >
> > > > for (@{ $tree->extract_links('a', 'href') }) {
> > > >     my($link, $element, $attr, $tag) = @$_;
> > > >     if ($link=~ m{/mylink\.htm}) {
> > > >         $ele = $element;
> > > >         last;
> > > >     }
> > > > }
> > > > $ele = $ele->parent(); # td
> > > > $ele = $ele->parent(); # tr
> > > > my $r = $ele->postinsert($tag);
> > > >
> > > > open F,">f.html";
> > > > print F $tree->as_HTML('<>&',' ',{});
> > > > close F;
> > > > $tree->delete;
> > > >
> > > > the output in html turn out to be
> > > >
> > >
> > > The 'postinsert' method expects an HTML::Element object to add
> > > into the tree, not simple HTML text. The easiest way to build
> > > what you want here is probably with the 'new_from_lol' method
> > > like this.
> > >
> > >   my $tag = HTML::Element->new_from_lol (['tr', ['td', 'TEST']])
> > >
> > > HTH,
> > >
> > > Rob
> >
> > From HTML::Element doc,
> >   $h->postinsert($element_or_text...)
> > i suppose it means postinsert accept $element or text. and the insertion
> > didn't yield any error with text as parameter. and it looks much
> > simpler/easier than the new_from_lol.
> >
> > using HTML::Element->new_from_lol (['tr', ['td', 'TEST']]) did lead to
> > the correct html output.
> > but when i am not using 'TEST' directly in new_from_lol
> >
> >   $insert= <<TEST;
> >   some more tags here
> >   TEST
> >   $h = HTML::Element->new_from_lol (['tr', ['td', $insert]]);
> > then do postinsert($h);
> >
> > the tags in $insert still don't get decoded after output.
>
> If you want to insert a new piece of HTML then you have to supply
> 'postinsert' with an HTML::Element object. If you supply plain text
> then that text will be inserted literally. That's why the < and >
> brackets have been translated to &lt; and &gt;, which is great if
> you want to display HTML in the browser, but not for much else.
> The 'new_from_lol' method lets you build HTML elements from lists
> of lists ('lols') but if you need to create an element directly
> from HTML then you have a problem as HTML::TreeBuilder::new_from_content
> will wrap whatever you give it in <html>, <head>, <body>, etc.
> tags to make it into a valid HTML page.
>
> As Sean says you can search the resulting tree for the part you
> really want, but there is a better way. TreeBuilder is kind enough
> to set an attribute called '_implicit' on those elements that weren't
> present in the original HTML, so you can just search for the first
> element with '_implicit' undefined like this:
>
>   #!perl
>   use strict;
>   use warnings;
>
>   use HTML::TreeBuilder;
>
>   my $insert = <<TEST;
>   some more tags here
>   TEST
>
>   my $element = HTML::TreeBuilder->new_from_content($insert);
>   $element = $element->look_down('_implicit', undef);
>
>   print $element->as_HTML;
>
>   __END__
>
> OUTPUT
>
>   some more tags here
>
> So you can then go ahead and 'postinsert' your new $element.
>
> If you have a fixed piece of HTML that you want to add then
> you would still be better off coding it up using 'new_from_lol',
> but if the content varies then you could package the lines
> above as a subroutine:
>
>   sub html_element {
>     my $html = shift;
>     my $element = HTML::TreeBuilder->new_from_content($html);
>     $element->look_down('_implicit', undef);
>   }
>
> and then call it like this
>
>   my $r = $ele->postinsert(html_element($tag));
>

i like it this way. thank you, rob.

Qiang

time for me to read more doc :)
Re: question with as_html output
James.Q.L wrote:
> --- Rob Dixon <[EMAIL PROTECTED]> wrote:
> >
> > James.Q.L wrote:
> > >
> > > I am trying to add html tag to an existing html file using
> > > HTML::TreeBuilder.
> > >
> > > the problem is that the added tags isn't encoded after the output.
> > > everything else is fine.
> > >
> > > ###
> > > my $tree = HTML::TreeBuilder->new;
> > > $tree->no_space_compacting(1);
> > > $tree->store_comments(1);
> > > $tree->parse_file("f.html");
> > >
> > > my $ele;
> > > my $tag = 'TEST';
> > >
> > > for (@{ $tree->extract_links('a', 'href') }) {
> > >     my($link, $element, $attr, $tag) = @$_;
> > >     if ($link=~ m{/mylink\.htm}) {
> > >         $ele = $element;
> > >         last;
> > >     }
> > > }
> > > $ele = $ele->parent(); # td
> > > $ele = $ele->parent(); # tr
> > > my $r = $ele->postinsert($tag);
> > >
> > > open F,">f.html";
> > > print F $tree->as_HTML('<>&',' ',{});
> > > close F;
> > > $tree->delete;
> > >
> > > the output in html turn out to be
> > >
> >
> > The 'postinsert' method expects an HTML::Element object to add
> > into the tree, not simple HTML text. The easiest way to build
> > what you want here is probably with the 'new_from_lol' method
> > like this.
> >
> >   my $tag = HTML::Element->new_from_lol (['tr', ['td', 'TEST']])
> >
> > HTH,
> >
> > Rob
>
> From HTML::Element doc,
>   $h->postinsert($element_or_text...)
> i suppose it means postinsert accept $element or text. and the insertion
> didn't yield any error with text as parameter. and it looks much
> simpler/easier than the new_from_lol.
>
> using HTML::Element->new_from_lol (['tr', ['td', 'TEST']]) did lead to
> the correct html output.
> but when i am not using 'TEST' directly in new_from_lol
>
>   $insert= <<TEST;
>   some more tags here
>   TEST
>   $h = HTML::Element->new_from_lol (['tr', ['td', $insert]]);
> then do postinsert($h);
>
> the tags in $insert still don't get decoded after output.

If you want to insert a new piece of HTML then you have to supply
'postinsert' with an HTML::Element object. If you supply plain text
then that text will be inserted literally. That's why the < and >
brackets have been translated to &lt; and &gt;, which is great if
you want to display HTML in the browser, but not for much else.
The 'new_from_lol' method lets you build HTML elements from lists
of lists ('lols') but if you need to create an element directly
from HTML then you have a problem as HTML::TreeBuilder::new_from_content
will wrap whatever you give it in <html>, <head>, <body>, etc.
tags to make it into a valid HTML page.

As Sean says you can search the resulting tree for the part you
really want, but there is a better way. TreeBuilder is kind enough
to set an attribute called '_implicit' on those elements that weren't
present in the original HTML, so you can just search for the first
element with '_implicit' undefined like this:

  #!perl
  use strict;
  use warnings;

  use HTML::TreeBuilder;

  my $insert = <<TEST;
  some more tags here
  TEST

  my $element = HTML::TreeBuilder->new_from_content($insert);
  $element = $element->look_down('_implicit', undef);

  print $element->as_HTML;

  __END__

OUTPUT

  some more tags here

So you can then go ahead and 'postinsert' your new $element.

If you have a fixed piece of HTML that you want to add then
you would still be better off coding it up using 'new_from_lol',
but if the content varies then you could package the lines
above as a subroutine:

  sub html_element {
    my $html = shift;
    my $element = HTML::TreeBuilder->new_from_content($html);
    $element->look_down('_implicit', undef);
  }

and then call it like this

  my $r = $ele->postinsert(html_element($tag));

HTH,

Rob
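To make the '_implicit' idea concrete, here is a small sketch that lists every element TreeBuilder produces for a fragment and says whether it was implied. The sample markup is an assumption for illustration; the implicit accessor is the HTML::Element method that reads the '_implicit' attribute.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical fragment: only the div and the b come from the input
my $tree = HTML::TreeBuilder->new_from_content('<div><b>TEST</b></div>');

# A coderef criterion that always returns true visits every element,
# in document order, starting at the root.
my @report = map {
    sprintf '%s=%s', $_->tag, $_->implicit ? 'implicit' : 'explicit'
} $tree->look_down(sub { 1 });

print "$_\n" for @report;

# The first element that lacks '_implicit' is the start of the fragment
my $first_tag = $tree->look_down('_implicit', undef)->tag;
print "first explicit element: $first_tag\n";

$tree->delete;
```

The wrapper elements report as implicit and the elements from the input as explicit, which is why searching for '_implicit' => undef lands on the fragment itself.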
Re: question with as_html output
At 01:15 PM 2003-09-15, James.Q.L wrote:
>the output in html turn out to be

I added a hack just for this kind of thing -- I got bored making whole
treelets when I just wanted some literal stuff. So:

  use HTML::TreeBuilder;
  use strict;
  my $root = HTML::TreeBuilder->new_from_content("Hi!");
  my $p = $root->look_down('_tag' => 'p') || die "No p?!?";
  $p->postinsert(
    ['~literal', {'text' => "" } ],
    # See HTML::Element and look for "~literal"
  );
  print "\nAnd now:\n", $root->as_HTML;

--
Sean M. Burke
http://search.cpan.org/~sburke/
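A minimal sketch of the '~literal' pseudo-element next to ordinary text, to show the difference in the output. The td element and both strings are made up for the example; it uses push_content rather than postinsert so that it needs no parent tree.

```perl
use strict;
use warnings;
use HTML::Element;

my $td = HTML::Element->new('td');

# Plain text: the markup characters are escaped on output
$td->push_content('<b>as text</b>');

# '~literal' pseudo-element: its 'text' attribute is emitted verbatim
$td->push_content(
    HTML::Element->new('~literal', text => '<b>as literal</b>')
);

# '<>&' limits escaping to those three characters; the empty hashref
# asks for every end tag to be written out.
my $html = $td->as_HTML('<>&', undef, {});
print $html, "\n";
```

Because the literal text is never parsed, nothing checks that it is well-formed HTML, so this suits trusted snippets better than user input.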
Re: question with as_html output
James.Q.L wrote:
>
> I am trying to add html tag to an existing html file using
> HTML::TreeBuilder.
>
> the problem is that the added tags isn't encoded after the output.
> everything else is fine.
>
> ###
> my $tree = HTML::TreeBuilder->new;
> $tree->no_space_compacting(1);
> $tree->store_comments(1);
> $tree->parse_file("f.html");
>
> my $ele;
> my $tag = 'TEST';
>
> for (@{ $tree->extract_links('a', 'href') }) {
>     my($link, $element, $attr, $tag) = @$_;
>     if ($link=~ m{/mylink\.htm}) {
>         $ele = $element;
>         last;
>     }
> }
> $ele = $ele->parent(); # td
> $ele = $ele->parent(); # tr
> my $r = $ele->postinsert($tag);
>
> open F,">f.html";
> print F $tree->as_HTML('<>&',' ',{});
> close F;
> $tree->delete;
>
> the output in html turn out to be
>

The 'postinsert' method expects an HTML::Element object to add
into the tree, not simple HTML text. The easiest way to build
what you want here is probably with the 'new_from_lol' method
like this.

  my $tag = HTML::Element->new_from_lol (['tr', ['td', 'TEST']])

HTH,

Rob
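For reference, a short sketch of what a 'new_from_lol' call like the one above produces. The class attribute is an assumption added only to show where a hashref of attributes goes in the list.

```perl
use strict;
use warnings;
use HTML::Element;

# Each arrayref is one element: tag name first, then an optional hashref
# of attributes, then children (strings or nested arrayrefs).
my $row = HTML::Element->new_from_lol(
    ['tr',
        ['td', { class => 'note' }, 'TEST'],
    ]
);

# The empty hashref asks as_HTML to write every end tag
my $row_html = $row->as_HTML('<>&', undef, {});
print $row_html, "\n";

$row->delete;
```

The result is a real HTML::Element, so it can be handed straight to postinsert.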
Re: question with as_html output
--- Rob Dixon <[EMAIL PROTECTED]> wrote:
> James.Q.L wrote:
> >
> > I am trying to add html tag to an existing html file using
> > HTML::TreeBuilder.
> >
> > the problem is that the added tags isn't encoded after the output.
> > everything else is fine.
> >
> > ###
> > my $tree = HTML::TreeBuilder->new;
> > $tree->no_space_compacting(1);
> > $tree->store_comments(1);
> > $tree->parse_file("f.html");
> >
> > my $ele;
> > my $tag = 'TEST';
> >
> > for (@{ $tree->extract_links('a', 'href') }) {
> >     my($link, $element, $attr, $tag) = @$_;
> >     if ($link=~ m{/mylink\.htm}) {
> >         $ele = $element;
> >         last;
> >     }
> > }
> > $ele = $ele->parent(); # td
> > $ele = $ele->parent(); # tr
> > my $r = $ele->postinsert($tag);
> >
> > open F,">f.html";
> > print F $tree->as_HTML('<>&',' ',{});
> > close F;
> > $tree->delete;
> >
> > the output in html turn out to be
> >
>
> The 'postinsert' method expects an HTML::Element object to add
> into the tree, not simple HTML text. The easiest way to build
> what you want here is probably with the 'new_from_lol' method
> like this.
>
>   my $tag = HTML::Element->new_from_lol (['tr', ['td', 'TEST']])
>
> HTH,
>
> Rob
>

From HTML::Element doc,
  $h->postinsert($element_or_text...)
i suppose it means postinsert accept $element or text. and the insertion
didn't yield any error with text as parameter. and it looks much
simpler/easier than the new_from_lol.

using HTML::Element->new_from_lol (['tr', ['td', 'TEST']]) did lead to the
correct html output.
but when i am not using 'TEST' directly in new_from_lol

  $insert= <<TEST;
  some more tags here
  TEST
  $h = HTML::Element->new_from_lol (['tr', ['td', $insert]]);
then do postinsert($h);

the tags in $insert still don't get decoded after output.

any idea?

Qiang
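The behaviour described here can be reproduced with a small sketch: a string passed to 'new_from_lol' is always a text node, so any markup inside it is escaped, whereas nested arrayrefs become real elements. The sample strings are assumptions for illustration.

```perl
use strict;
use warnings;
use HTML::Element;

# Markup inside a string stays text and is escaped by as_HTML
my $as_text = HTML::Element->new_from_lol(['td', '<b>x</b>']);

# The same content expressed as a nested list becomes a real <b> element
my $as_tree = HTML::Element->new_from_lol(['td', ['b', 'x']]);

my $text_html = $as_text->as_HTML('<>&', undef, {});
my $tree_html = $as_tree->as_HTML('<>&', undef, {});

print $text_html, "\n";
print $tree_html, "\n";

$_->delete for $as_text, $as_tree;
```

This is why a heredoc of HTML has to be parsed first, or wrapped in a '~literal' element, before it is inserted.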