http://nagoya.apache.org/bugzilla/show_bug.cgi?id=2267

*** shadow/2267 Thu Jun 21 07:48:01 2001
--- shadow/2267.tmp.9980 Thu Jun 21 07:48:01 2001
***************
*** 0 ****
--- 1,48 ----
+ +============================================================================+
+ | XPathAPI seems to apply queries always at the whole document               |
+ +----------------------------------------------------------------------------+
+ |        Bug #: 2267                        Product: XalanJ2                 |
+ |       Status: NEW                         Version: 2.0.x                   |
+ |   Resolution:                            Platform: PC                      |
+ |     Severity: Major                    OS/Version: Linux                   |
+ |     Priority: Other                     Component: org.apache.xpath        |
+ +----------------------------------------------------------------------------+
+ |  Assigned To: [EMAIL PROTECTED]                                            |
+ |  Reported By: [EMAIL PROTECTED]                                            |
+ |      CC list: Cc:                                                          |
+ +----------------------------------------------------------------------------+
+ |          URL:                                                              |
+ +============================================================================+
+ |                                DESCRIPTION                                 |
+ I am not sure whether this is really a bug or not.
+
+ I am using JTidy to parse and tidy up HTML files and apply XPath queries to them
+ in order to extract content.
+
+ I tried to apply a query to a Document and, in a second step, a different query
+ to the resulting Nodes. The problem is that the second query, applied to the
+ resulting nodes, returns results from the whole Document.
+
+ This is some sample code:
+
+     JTidyScraper myScraper = new
+         JTidyScraper("http://www.somehost.com/...../somefile.html");
+
+     Document myDocument = (Document) myScraper.getContent();
+
+     NodeList myNodeList = XPathAPI.selectNodeList(myDocument,
+         ".//form/p|.//form/blockquote/p");
+
+     for (int i = 0; i < myNodeList.getLength(); i++) {
+         Node theOuterNode = myNodeList.item(i);
+
+         // Here is the problem: the query ".//p/a" is applied to the whole
+         // Document instead of only to "theOuterNode".
+         // I would expect to get only nodes within the scope of "theOuterNode".
+         NodeList theInnerNodeList =
+             XPathAPI.selectNodeList(theOuterNode, ".//p/a");
+
+         for (int j = 0; j < theInnerNodeList.getLength(); j++) {
+             Node theInnerNode = theInnerNodeList.item(j);
+         }
+     }
\ No newline at end of file
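
For comparison, the scoping the report expects can be sketched as a self-contained program. This is only a sketch under stated assumptions: it uses the standard JAXP API (javax.xml.xpath) and the JDK's own XML parser instead of Xalan's XPathAPI and JTidy, and the class name ScopedXPathDemo, the SAMPLE markup, and the method linksPerParagraph are hypothetical and not taken from the report. The inner query is written as ".//a" because, under XPath 1.0 semantics, ".//p/a" evaluated from a "p" context node only matches "a" children of "p" elements nested below that node, so it may well be expected to match nothing when paragraphs are not nested.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ScopedXPathDemo {
    // Hypothetical test markup (assumption): two forms, one link in the
    // first paragraph and two links in the second.
    static final String SAMPLE =
        "<html><body>"
        + "<form><p><a href='a1'>one</a></p></form>"
        + "<form><blockquote><p>"
        + "<a href='b1'>two</a><a href='b2'>three</a>"
        + "</p></blockquote></form>"
        + "</body></html>";

    // Returns, for each paragraph selected by the outer query, the number
    // of links found by the inner query evaluated relative to it.
    static List<Integer> linksPerParagraph(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Outer query, evaluated with the Document as context node.
        NodeList outer = (NodeList) xpath.evaluate(
            ".//form/p|.//form/blockquote/p", doc, XPathConstants.NODESET);

        List<Integer> counts = new ArrayList<>();
        for (int i = 0; i < outer.getLength(); i++) {
            // Inner query, evaluated with the current <p> as context node.
            NodeList inner = (NodeList) xpath.evaluate(
                ".//a", outer.item(i), XPathConstants.NODESET);
            counts.add(inner.getLength());
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(linksPerParagraph(SAMPLE));
    }
}
```

If the inner query is scoped to its context node, each count covers only the links inside that one paragraph. A count equal to the document-wide total for every paragraph would correspond to the behaviour described in this report.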
