Hi Steffen,

At first I got the same error message as you did. After I commented out
the call
    domParser.setProperty("http://apache.org/xml/properties/dom/document-class-name",
                          "org.apache.html.dom.HTMLDocumentImpl");
it worked fine.

The problem is caused by "fDocumentImpl == null" (in
org.apache.xerces.parsers.DOMParser), when you set "document-class-name" to
some class other than the default one "DocumentImpl".

By setting fDocument and fDocumentImpl to the same value, I solved the
problem. But I don't know whether this is the desired behavior. Should
fDocumentImpl be non-null when we use a document implementation other than
DocumentImpl?

diff -w -r1.41 DOMParser.java
937a938,942
>                         if ( fDocument instanceof DocumentImpl )
>                         {
>                             fDocumentImpl = (DocumentImpl)fDocument;
>                             fDocumentImpl.setErrorChecking(false);
>                         }

If this is desired, could someone commit the patch? Or could someone
provide an explanation of fDocumentImpl?
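
To make the question concrete, here is a stand-alone sketch of the situation as I understand it. The nested classes are simplified stand-ins, not the real Xerces classes, and the assumption that the HTML document class extends DocumentImpl is mine. It only shows what the instanceof guard from the patch does to the two fields.

```java
public class DocumentImplGuardSketch {
    // Simplified stand-ins (hypothetical) for org.w3c.dom.Document,
    // DocumentImpl and HTMLDocumentImpl; not the real Xerces classes.
    interface Document {}

    static class DocumentImpl implements Document {
        boolean errorChecking = true;
        void setErrorChecking(boolean check) { errorChecking = check; }
    }

    // Assumption: the HTML document class extends DocumentImpl.
    static class HTMLDocumentImpl extends DocumentImpl {}

    // A document class that has nothing to do with DocumentImpl.
    static class ForeignDocument implements Document {}

    // Stand-ins for the two parser fields discussed above.
    Document fDocument;
    DocumentImpl fDocumentImpl;

    /** Installs a user-supplied document and applies the guard from the patch. */
    void useDocument(Document custom) {
        fDocument = custom;
        fDocumentImpl = null;
        if (fDocument instanceof DocumentImpl) {
            fDocumentImpl = (DocumentImpl) fDocument;
            fDocumentImpl.setErrorChecking(false);
        }
    }

    public static void main(String[] args) {
        DocumentImplGuardSketch parser = new DocumentImplGuardSketch();

        parser.useDocument(new HTMLDocumentImpl());
        System.out.println("HTML document, fDocumentImpl set: " + (parser.fDocumentImpl != null));

        parser.useDocument(new ForeignDocument());
        System.out.println("Foreign document, fDocumentImpl set: " + (parser.fDocumentImpl != null));
    }
}
```

With the guard, fDocumentImpl is filled in whenever the custom class is a DocumentImpl subclass and stays null otherwise, so any later code that uses fDocumentImpl would still need a null check for the second case.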

Cheers,
Sandy Gao
Software Developer, IBM Canada
(1-416) 448-3255
[EMAIL PROTECTED]

(See attached file: diff.txt)


Steffen <[EMAIL PROTECTED]> wrote on 03/20/2001 12:37 PM:

To:      xerces user <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
cc:
Subject: plz help on parsing xhtml
Please respond to xerces-j-user

Hi all,

Since nobody replied to my posting some days ago, I will post
a specific code example that gives me a headache. I hope someone
can explain this behaviour to me, because I can't.

I use Xerces 1.3.1 to parse XHTML with the following code:

public static void main(java.lang.String[] args) {

    String xhtmlString =
        "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
        + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
        + "<html xmlns=\"http://www.w3.org/1999/xhtml\">  <head>    <title>Virtual Library</title>  </head>"
        + "  <body>    <p>Moved to ae&auml;    <a href=\"http://vlib.org/\">vlib.org</a>    .</p>  </body></html>";

    ByteArrayInputStream origStream =
        new ByteArrayInputStream(xhtmlString.getBytes());
    InputSource origInput = new InputSource(origStream);

    DOMParser domParser = new DOMParser();

    try {
        // domParser.setFeature("http://apache.org/xml/features/validation/dynamic", true);

        domParser.setProperty("http://apache.org/xml/properties/dom/document-class-name",
                              "org.apache.html.dom.HTMLDocumentImpl");
        // domParser.setFeature("http://apache.org/xml/features/dom/include-ignorable-whitespace", false);
        domParser.setFeature("http://xml.org/sax/features/validation", true);

    } catch (Exception e) {
        System.out.println("error in setting up parser property" + e.getMessage());
    }

    org.w3c.dom.Document htmlDocument = null;

    try {
        domParser.parse(origInput);
        System.out.println("Parse Success");
    } catch (Exception e) {
        System.out.println("Exception :" + e.getMessage());
        // e.printStackTrace();
    }
}

I get the exception: The attribute type is required in the declaration
of attribute "events" for element "html".

I know that the xhtmlString is correct; it is taken from the XHTML
specification at w3.org.
The EntityResolver retrieves the DTDs specified in the DOCTYPE from the
Web, so they should be correct too.

I assume the exception is thrown while parsing the DTDs, but the message
is very strange, because the declaration for element "html" reads:

....
<!ELEMENT html (head, body)>
<!ATTLIST html
  %i18n;
  xmlns       %URI;          #FIXED 'http://www.w3.org/1999/xhtml'
  >
...

There is not even an attribute "events" in it.

What am I doing wrong? Or is there a bug in Xerces when parsing the XHTML
DTD (I can't believe it)?

The method where the exception is thrown,
org.apache.xerces.framework.XMLDTDScanner.scanAttlistDecl(), contains this comment:

 ...
   decreaseMarkupDepth();
   return;
  }
  // REVISIT - review this code...
  if (!sawSpace) {
   if (fEntityReader.lookingAtSpace(true)) {
    fEntityReader.skipPastSpaces();
   }
.....
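
One way to narrow down which file and line actually triggers the message is to catch SAXParseException instead of plain Exception and print its location. The following is a minimal sketch, assuming the stock JAXP parser shipped with the JDK rather than the Xerces 1.3.1 DOMParser used above; the class name, the method names and the sample input are made up for illustration. The accessors getSystemId(), getLineNumber() and getColumnNumber() are part of the SAX API.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class ErrorLocationSketch {

    /** Formats the location of a parse error: system ID, line, column, message. */
    static String describe(SAXParseException e) {
        return e.getSystemId() + ", line " + e.getLineNumber()
                + ", column " + e.getColumnNumber() + ": " + e.getMessage();
    }

    /** Parses the text and returns a description of the first error, or null if it parses. */
    static String firstError(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Rethrow errors so they reach the catch block below instead of stderr.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return describe(e);
        } catch (Exception e) {
            // Configuration or I/O problems are not what this sketch is about.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical malformed input: <child> is never closed.
        System.out.println(firstError("<root>\n  <child>\n</root>"));
    }
}
```

If the reported system ID points at one of the DTD files rather than at the document itself, that would support the guess that the error comes from the DTD parsing.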


Please help me out of my confusion,
thanks

Steffen Glomb


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



