mark,
the following is close. i played around with a number of different approaches, and i have a basic question...
#######################
my @sectionTRs = $doc->findnodes('//[EMAIL PROTECTED]"sectionheading"]]');
foreach my $section (@sectionTRs) {
    # row that follows the section heading row
    my $nextTR = $section->nextSibling();
    my @keys   = $nextTR->findnodes('td/strong');
    my @values = $nextTR->findnodes('td/text()');
    # save keys and values
}
#######################
how do i view the actual underlying html for the nextTR/key/value...
when dealing with libxml, how can i display/print out/dump what i'm looking at.
various methods produce XML::LibXML::NodeList, XML::LibXML::Element, etc... but i can't find any way to print/display the underlying html/text for a given element.
Well, use
perldoc XML::LibXML::NodeList
perldoc XML::LibXML::Element
to find out about the methods of those classes.
Unfortunately, the Element documentation doesn't make explicit that
Element is a subclass of Node, but also see
perldoc XML::LibXML::Node
to see other methods inherited by Element...
And in general,
perldoc XML::LibXML is a good starting point.
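If it is unclear which of those perldoc pages applies to a given result, asking the object for its class is one way to find out. The following is only a sketch: the HTML snippet and the variable names are invented for illustration and are not taken from the page being parsed.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample markup, not from the original page.
my $html = '<html><body><table>'
         . '<tr><td><strong>Name</strong>Bruce</td></tr>'
         . '</table></body></html>';
my $doc = XML::LibXML->new->parse_html_string($html);

# In scalar context findnodes returns a NodeList object.
my $list = $doc->findnodes('//td');
print ref($list), "\n";

# Each member is a node object; ref() names the class to look up.
my $td = $list->get_node(1);    # NodeList positions start at 1
print ref($td), "\n";
print "inherits from XML::LibXML::Node\n" if $td->isa('XML::LibXML::Node');

# nodeName is documented under XML::LibXML::Node but works on an Element.
print $td->nodeName, "\n";
```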
my question, is there a way to "convert" the node/nodelist/element to html... or am i essentially hosed!!
Looks like they already are `html'... Depending on what you are really after, $node->toString will convert the node to a string, or ...
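As a rough illustration of dumping what a node holds, the sketch below prints the markup of a single node, its text content, the members of a NodeList, and the whole document. The sample HTML is made up; textContent, get_nodelist and toStringHTML are methods from the XML::LibXML documentation and are not mentioned elsewhere in this thread.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample markup standing in for the real page.
my $html = '<html><body><table>'
         . '<tr><td><strong>Name</strong>Bruce</td></tr>'
         . '</table></body></html>';
my $doc = XML::LibXML->new->parse_html_string($html);

my ($td) = $doc->findnodes('//td');

# Markup of one node, including its children.
my $markup = $td->toString;
print "$markup\n";

# Text only, with the tags stripped.
my $text = $td->textContent;
print "$text\n";

# One way to dump a NodeList: serialize its members one by one.
my $list  = $doc->findnodes('//td/strong');
my @parts = map { $_->toString } $list->get_nodelist;
print "$_\n" for @parts;

# The whole document, serialized with HTML rules rather than XML rules.
print $doc->toStringHTML;
```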
the basic reason for the question is to try and be able to switch between using libxml/xpath and treebuilder methods as i'm parsing html docs...
I'm not sure what `treebuilder' is, but there are DOM methods in XML::LibXML, so you can create a new document (see XML::LibXML::Document), and add nodes to it; either new ones or ones extracted from another document (also see $node->cloneNode(1)).
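A minimal sketch of that approach follows, assuming the goal is to collect copies of some cells into a fresh document. The sample HTML and the root element name "extract" are arbitrary choices for the example, and adoptNode is just one way to hand the copy to the new document.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample markup standing in for the real page.
my $html = '<html><body><table>'
         . '<tr><td><strong>Name</strong>Bruce</td></tr>'
         . '</table></body></html>';
my $src = XML::LibXML->new->parse_html_string($html);

# New, empty document with a root element ("extract" is an arbitrary name).
my $new  = XML::LibXML::Document->new('1.0', 'UTF-8');
my $root = $new->createElement('extract');
$new->setDocumentElement($root);

for my $td ($src->findnodes('//td')) {
    # Deep copy, so the source document is left untouched.
    my $copy = $td->cloneNode(1);
    # Hand the copy over to the new document before attaching it.
    $new->adoptNode($copy);
    $root->appendChild($copy);
}

my $out = $new->toString;
print $out;
```

The resulting string can then be passed to whatever other parser is in use, at the cost of serializing and reparsing.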
searches of google/cpan haven't turned up a working solution...
thanks..
-bruce
-- [EMAIL PROTECTED] http://math.nist.gov/~BMiller/