I also tried:
    public static class HTMLDocumentBuilderFactory extends
            DocumentBuilderFactoryImpl {
        public HTMLDocumentBuilderFactory() throws SAXException {
            SchemaFactory sfac =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            String schemaString = ""
                    + "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
                    + " <xs:attribute name=\"id\" type=\"xs:ID\"/>"
                    + "</xs:schema>";
            Schema schema = sfac.newSchema(new StreamSource(
                    new StringReader(schemaString)));
            setSchema(schema);
            setValidating(true);
        }
    }
and in main:
    System.setProperty("javax.xml.parsers.DocumentBuilderFactory",
            HTMLDocumentBuilderFactory.class.getName());
Ittay Dror wrote:
I've also tried this:
    Parser p = new Parser(); // from TagSoup
    SAX2DOM sax2dom = new SAX2DOM();
    Document doc = (Document) sax2dom.getDOM();
    DOMConfiguration config = doc.getDomConfig();
    config.setParameter("schema-type", "http://www.w3.org/TR/REC-xml");
    config.setParameter("schema-location", "/tmp/xhtml1-transitional.dtd");
    // config.setParameter("datatype-normalization", Boolean.FALSE);
    // config.setParameter("psvi", Boolean.TRUE);
    config.setParameter("validate", Boolean.TRUE);
    doc.insertBefore(doc.getImplementation().createDocumentType("html",
            null, "/tmp/xhtml1-transitional.dtd"), null);
    p.setContentHandler(sax2dom);
    InputSource docsrc = new InputSource("/tmp/test.html");
    docsrc.setSystemId("/tmp/xhtml1-transitional.dtd");
    p.parse(docsrc);
    System.out.println(doc.getElementById("foo"));
I get null in the console.
Thanks,
Ittay
Ittay Dror wrote:
Hi,
I'm new to Xalan and XSL. I'm trying to get elements from an HTML
document using the id() function. I'm not working with an XSL
stylesheet, just trying to get elements.
This is my code (basically a slightly modified ApplyXPathJaxp sample):
    InputSource xml = new InputSource("/tmp/test.html");
    xml.setEncoding("US-ASCII");
    String expr = "id('foo')";
    // Create a new XPath
    XPathFactory factory = XPathFactory.newInstance();
    XPath xpath = factory.newXPath();
    Object result = null;
    try {
        // Compile the XPath expression
        XPathExpression xpathExpr = xpath.compile(expr);
        // Evaluate the XPath expression against the input document
        Node node = (Node) xpathExpr.evaluate(xml, XPathConstants.NODE);
        System.out.println(node);
    }
    catch (Exception e) {
        e.printStackTrace();
    }
and my HTML is:
<html>
<body>
<label id="foo">hello</label>
</body>
</html>
I've tried the following:
1. Adding a doctype, <!DOCTYPE html SYSTEM "/tmp/xhtml1-transitional.dtd">
(the DTD and .ent files are saved locally). This is not a good option
for me, since I want to parse arbitrary HTML documents (I'll do that
with TagSoup and SAX2DOM, but first I want to get this to work).
2. Creating a DOM, setting schema-location and schema-type in its
config, and using XPathAPI or doc.getElementById.
3. Setting the system id in the InputSource.
From debugging, it seems that the parser doesn't recognize the 'id'
attribute as being of type ID.
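For illustration, a minimal sketch of one possible way around the missing ID typing, assuming a DOM Level 3 implementation: after parsing, walk the tree and flag each "id" attribute with Element.setIdAttribute, so that getElementById can find it without a DTD or schema. The class and method names (IdAttributeSketch, parse, markIdAttributes) are hypothetical, and the inline markup is just the test document from this message. Whether an XPath id() call then resolves as well depends on the XPath engine consulting the DOM's ID information, which this sketch does not verify.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of Xalan or TagSoup.
class IdAttributeSketch {

    /** Parses well-formed markup into a DOM, with no DTD or schema involved. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }

    /** Flags every attribute literally named "id" as an ID attribute (DOM Level 3). */
    static void markIdAttributes(Document doc) {
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            if (e.hasAttribute("id")) {
                e.setIdAttribute("id", true);
            }
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<html><body><label id=\"foo\">hello</label></body></html>");
        System.out.println(doc.getElementById("foo")); // lookup before marking
        markIdAttributes(doc);
        System.out.println(doc.getElementById("foo")); // lookup after marking
    }
}
```

The same markIdAttributes pass could in principle be run on a Document produced by TagSoup and SAX2DOM, under the assumption that the resulting DOM implementation supports setIdAttribute.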
I would really like the id() function to work (the XPath expressions
are written by users, and id() is more natural than defining keys,
though I don't know how to do that either).
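As a rough sketch of what defining a key looks like, assuming an XSLT 1.0 processor behind the standard JAXP TransformerFactory: xsl:key indexes elements by their "id" attribute, and key() looks one up without any DTD. The class name KeyLookupSketch, the key name by-id, and the parameter name wanted are all hypothetical, and the stylesheet is inlined only to keep the example self-contained. Note that this runs a transformation rather than a standalone XPath query.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class illustrating xsl:key; names are assumptions.
class KeyLookupSketch {

    // Index every element by its "id" attribute, then emit the text of the
    // element whose id equals the "wanted" stylesheet parameter.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:param name=\"wanted\"/>"
        + "<xsl:key name=\"by-id\" match=\"*\" use=\"@id\"/>"
        + "<xsl:template match=\"/\">"
        + "<xsl:value-of select=\"key('by-id', string($wanted))\"/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Returns the text content of the element whose id attribute equals the given id. */
    static String textOf(String xml, String id) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            t.setParameter("wanted", id);
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<html><body><label id=\"foo\">hello</label></body></html>";
        System.out.println(textOf(xml, "foo"));
    }
}
```

The trade-off is that users would have to write key('by-id', 'foo') instead of id('foo'), which is the less natural form mentioned above.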
Thanks for your help,
Ittay
--
===================================
Ittay Dror
openQRM Team Leader,
R&D, Qlusters Inc.
[EMAIL PROTECTED]
+972-3-6081994 Fax: +972-3-6081841
http://www.openQRM.org
- Keeps your Data-Center Up and Running