Swanson, Brion writes:
> If I recall correctly, a conforming XML parser such as Xerces is required to
> resolve all entities it encounters in an XML document regardless of whether
> or not you are validating that document.

Why?

> In your case, (I believe) you're
> telling the parser not to load any external DTDs even for the purposes of
> entity resolution. So when it encounters your &reg; entity it checks its
> well-known entities (&lt; &gt; &amp; &apos; &quot;) and failing that,
> attempts to find the declaration of the entity in the DOCTYPE line or in an
> external DTD.
>

It only throws an exception if my document does not contain a
<!DOCTYPE ...> declaration. The parser is quite happy not to resolve
the entities as long as there is a DOCTYPE declaration in the document
and external DTD loading is turned off.

Why is this a problem? My employer is using Arbortext's Epic editor
to edit software manuals conforming to the DocBook DTD. Our manuals
consist of a book file (BOOK_TITLE_book.xml) that lists all the
chapters of the book named BOOK_TITLE as external entities. The book
file contains the DOCTYPE declaration for the book. The chapter files
are XML fragments of a larger XML document and thus have no DOCTYPE
declaration. (Actually, there is a declaration but it is commented
out. Epic uses the commented-out declaration to determine a chapter
file's doctype when a user opens the file.)

I am developing an independent Java application that needs to extract
information from our software manuals. My Java app uses Xerces to
parse the manuals, including the chapter files. However, I have
discovered that Xerces won't parse any chapter file that contains
entity references other than the five built-in XML entities.

I have found that I can work around the problem by inserting a dummy
DOCTYPE declaration in the chapter files. Epic seems to ignore the
dummy declaration, and the declaration makes Xerces 2.2.0 happy as
long as I turn off external DTD loading. My concern is that this
workaround seems to depend on possible bugs in Xerces and/or in Epic
that may someday be fixed, thereby breaking my application.

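For concreteness, here is a minimal sketch of the workaround described
above. It rests on several assumptions that are not part of my actual
setup: it goes through the JAXP SAXParserFactory (the JDK's bundled
parser is derived from Xerces but may not behave exactly like Xerces
2.2.0), the class and method names are made up for illustration, and
the dummy declaration <!DOCTYPE chapter SYSTEM "dummy.dtd"> is a
hypothetical stand-in for whatever is inserted into the chapter files.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not taken from the real application.
class ChapterParseSketch {
    static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    /**
     * Parses the given XML without validation and with external DTD
     * loading turned off, and returns the names of any entities the
     * parser reported as skipped.
     */
    static List<String> skippedEntities(String xml) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setFeature(LOAD_EXTERNAL_DTD, false);

        final List<String> skipped = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void skippedEntity(String name) {
                skipped.add(name);
            }
        };
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler); // fatal errors are rethrown
        reader.parse(new InputSource(new StringReader(xml)));
        return skipped;
    }

    public static void main(String[] args) throws Exception {
        String chapter = "<chapter><para>Acme&reg; Widget</para></chapter>";
        // Assumption: the dummy declaration names an external DTD,
        // which is never fetched because loading is turned off.
        String dummy = "<!DOCTYPE chapter SYSTEM \"dummy.dtd\">";

        System.out.println("with dummy DOCTYPE, skipped: "
            + skippedEntities(dummy + chapter));
        try {
            skippedEntities(chapter);
            System.out.println("without DOCTYPE: parsed");
        } catch (SAXParseException e) {
            System.out.println("without DOCTYPE: " + e.getMessage());
        }
    }
}
```

If I read the "Entity Declared" well-formedness constraint in the XML
1.0 spec correctly, an undeclared entity is only a fatal error when
the document has no external subset, which is why the sketch gives the
dummy declaration a system identifier. I have not confirmed that this
is what Xerces 2.2.0 or Epic actually relies on.
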
- Paul

> It doesn't surprise me then that your parser dies if you prevent it from
> finding the entity declaration it needs to continue parsing.
>
> Please correct me if I'm wrong.
>
> Cheers!
> Brion Swanson
>
> -----Original Message-----
> From: Paul Kinnucan [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, October 23, 2002 4:40 PM
> To: [EMAIL PROTECTED]
> Subject: Entity resolution problem
>
>
> Hi,
>
> Why does Xerces 2.2 throw an exception and quit when
> it encounters an entity reference (e.g., &reg;) even
> though I have specified
>
> parser.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
>
> The parser throws the exception
>
> org.xml.sax.SAXParseException: The entity "reg" was referenced, but not
> declared.
>
> If I put a DOCTYPE declaration at the head of the file, Xerces
> parses the file without any problem.
>
> - Paul
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]