Hi All.

I have read the page and the O'Reilly book on LWP. I must be thick or something, but when I dump the content of the web page into the TreeBuilder via a scalar and then try to print the tag, I get:

HTML::Element=HASH(0x41b1074)->Tag ( )

Below is the code extract. I have included the HTML::Element and HTML::TreeBuilder modules.



sub download_file {
    print "accessing new stream \n";
    # content of web page passed to this routine.
    my $root = HTML::TreeBuilder->new;
    $root = HTML::TreeBuilder->new_from_content($_[0]);
    #scan_for_non_table_text($root->find_by_tag_name('h5'));
    scan_for_non_table_text($root);

    $root->delete; # erase this tree because we're done with it
    return;
} # end sub

sub scan_for_non_table_text {
    my $element = $_[0];
    #return if $element->tag eq 'table';   # prune!
    foreach my $child ($element->content_list) {

        if (ref $child) {  # it's an element
            print "This is an element\n";
            print "$child->Tag ( )\n";
            scan_for_non_table_text($child);  # recurse!
        } else {           # it's a text node!
            my $text .= $child;
            print "text node: $text\n";
        }
    }
    return;
}

I can get the text, but not the name of the tag or its attributes. I am starting at the top of the tree. The subroutine naming is misleading because I have done so much playing around and haven't tidied things up.

So what am I doing wrong?

Sean
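
A likely cause, offered as a sketch rather than a definitive answer: Perl does not interpolate method calls inside double-quoted strings, so "$child->Tag ( )" prints the stringified object followed by the literal text "->Tag ( )". The documented HTML::Element method is also lowercase tag. The example below calls the methods outside the string. The name describe_tree and the sample HTML are made up for illustration; tag, all_external_attr, content_list, new_from_content and delete are the HTML::Element / HTML::TreeBuilder methods.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Walk the tree and return one line per node: the tag name plus its
# attributes for elements, and the text itself for text nodes.
# (describe_tree is a hypothetical helper name, not part of HTML::Tree.)
sub describe_tree {
    my ($element, $depth) = @_;
    $depth //= 0;
    my $indent = '  ' x $depth;

    # all_external_attr returns name/value pairs and skips the internal
    # attributes whose names start with an underscore.
    my %attr  = $element->all_external_attr();
    my $attrs = join ' ', map { qq{$_="$attr{$_}"} } sort keys %attr;

    # Call the method outside the string and concatenate the result;
    # "$element->tag" would only interpolate $element itself.
    my @lines = ($indent . $element->tag() . ($attrs ? " [$attrs]" : ''));

    foreach my $child ($element->content_list()) {
        if (ref $child) {                        # nested element
            push @lines, describe_tree($child, $depth + 1);
        } elsif ($child =~ /\S/) {               # non-blank text node
            push @lines, "$indent  text: $child";
        }
    }
    return @lines;
}

# Sample input, assumed for illustration only.
my $html = '<html><body><h5 class="title">Hello</h5></body></html>';
my $root = HTML::TreeBuilder->new_from_content($html);
print "$_\n" for describe_tree($root);
$root->delete();    # free the tree once we are done with it
```

If the call really has to sit inside the string, the usual idiom is "@{[ $child->tag ]}", though plain concatenation tends to be easier to read.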

----- Original Message -----
From: "John Delacour" <johndelac...@gmail.com>
To: <beginners@perl.org>
Sent: Wednesday, January 12, 2011 10:17 PM
Subject: Re: Moving through tree's using LWP


At 12:52 +0200 12/01/2011, Shlomi Fish wrote:

http://search.cpan.org/~jfearn/HTML-
Tree-4.1/lib/HTML/Element.pm#$h-%3Etag%28%29_or_$h-%3Etag%28%27tagname%27%29

(sorry for the broken link).

Links won't break in proper mailers if you enclose them in <>


<http://search.cpan.org/~jfearn/HTML-Tree-4.1/lib/HTML/Element.pm#$h-%3Etag%28%29_or_$h-%3Etag%28%27tagname%27%29>


JD

--
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/




