also tried:
public static class HTMLDocumentBuilderFactory extends DocumentBuilderFactoryImpl {
    public HTMLDocumentBuilderFactory() throws SAXException {
        SchemaFactory sfac =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

        String schemaString = ""
            + "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
            + "    <xs:attribute name=\"id\" type=\"xs:ID\"/>"
            + "</xs:schema>";
        Schema schema = sfac.newSchema(new StreamSource(
            new StringReader(schemaString)));
        setSchema(schema);
        setValidating(true);
    }
}

and in main:

System.setProperty("javax.xml.parsers.DocumentBuilderFactory",
    HTMLDocumentBuilderFactory.class.getName());


Ittay Dror wrote:
i've also tried this:
 Parser p = new Parser(); // from tagsoup

 SAX2DOM sax2dom = new SAX2DOM();
 Document doc = (Document)sax2dom.getDOM();
 DOMConfiguration config = doc.getDomConfig();
 config.setParameter("schema-type", "http://www.w3.org/TR/REC-xml");
 config.setParameter("schema-location", "/tmp/xhtml1-transitional.dtd");
 // config.setParameter("datatype-normalization", Boolean.FALSE);
 //config.setParameter("psvi", Boolean.TRUE);
 config.setParameter("validate",Boolean.TRUE);
 doc.insertBefore(doc.getImplementation().createDocumentType("html", null, "/tmp/xhtml1-transitional.dtd"), null);
 p.setContentHandler(sax2dom);
 InputSource docsrc = new InputSource("/tmp/test.html");
 docsrc.setSystemId("/tmp/xhtml1-transitional.dtd");
 p.parse(docsrc);
 System.out.println(doc.getElementById("foo"));

i get null in the console

thanx,
ittay

Ittay Dror wrote:
hi,

i'm new to xalan and xsl. i'm trying to get elements using the id function from an html document. i don't work with an xsl document, just trying to get elements.

this is my code (basically, slightly modified ApplyXPathJaxp sample):
    InputSource xml = new InputSource("/tmp/test.html");
    xml.setEncoding("US-ASCII");
    String expr = "id('foo')";

    // Create a new XPath
    XPathFactory factory = XPathFactory.newInstance();
    XPath xpath = factory.newXPath();
    Object result = null;
    try {
        // compile the XPath expression
        XPathExpression xpathExpr = xpath.compile(expr);
        // Evaluate the XPath expression against the input document
        Node node = (Node) xpathExpr.evaluate(xml, XPathConstants.NODE);
        System.out.println(node);
    }
    catch (Exception e) {
        e.printStackTrace();
    }

and my html is:

<html>
<body>
       <label id="foo">hello</label>
</body>
</html>

i've tried the following:
1. add a doctype <!DOCTYPE html SYSTEM "/tmp/xhtml1-transitional.dtd"> (the dtd and ent files are saved locally). this is not a good option for me, since i want to parse arbitrary html documents (i'll do that with tagsoup and SAX2DOM, but first i want to get this to work)
2. creating a DOM, setting in the config the schema and schema-type, and using XPathAPI, or doc.getElementById
3. setting the system id in the InputSource

from debugging, it seems that the parser doesn't recognize 'id' as being an id.
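
One possible workaround for the symptom described above, sketched here as an illustration and not taken from the code in this thread: once the DOM is built, each "id" attribute can be flagged as an ID explicitly with the DOM Level 3 method Element.setIdAttribute, so that no DTD or schema is needed. The class name IdAttributeSketch and the helpers markIdAttributes and parse are hypothetical names chosen for this example. The sketch only shows that Document.getElementById resolves afterwards; whether the XPath id() function picks up the same information depends on the XPath implementation and is not verified here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class IdAttributeSketch {

    // Hypothetical helper: flag every "id" attribute in the tree as an ID
    // attribute, which is what a DTD or schema declaration would normally do.
    static void markIdAttributes(Document doc) {
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            if (e.hasAttribute("id")) {
                e.setIdAttribute("id", true);
            }
        }
    }

    // Hypothetical helper: parse a well-formed document from a string.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
            "<html><body><label id=\"foo\">hello</label></body></html>");

        // Lookup before any attribute has been marked as an ID
        System.out.println(doc.getElementById("foo"));

        markIdAttributes(doc);

        // Lookup after marking the "id" attributes
        Element label = doc.getElementById("foo");
        System.out.println(label.getTextContent());
    }
}
```

The same helper could in principle be applied to a Document produced by tagsoup and SAX2DOM, assuming that DOM implementation supports DOM Level 3; that combination is an assumption and has not been tried here.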

i would really like the id() function to work (the xpath expressions are used by users, and id() is more natural, than defining keys, though i don't know how to do that either)

thanks for your help,
ittay





--
===================================
Ittay Dror openQRM Team Leader, R&D, Qlusters Inc.
[EMAIL PROTECTED]
+972-3-6081994 Fax: +972-3-6081841

http://www.openQRM.org
- Keeps your Data-Center Up and Running
