I'm using the xerces 1.3.1 parser, with emphasis on SAX-2 parsing. The file I was parsing looks like this:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE xsl:stylesheet [
  <!ENTITY copy "©">
]>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ebt="http://www.ebt.com/2001/XSL/Transform">
  <xsl:variable name="ebt:company-name" select='"eBusiness Technologies"'/>
  <xsl:variable name="ebt:copyright" select='"Copyright &copy; 1999-2001 eBT"'/>
</xsl:stylesheet>
The problem I have is the sequencing of events for the entity reference &copy;. When using the SAX-2 event framework and processing the xsl:variable element named ebt:copyright, I get the event org.xml.sax.ext.LexicalHandler.startEntity BEFORE I get the event org.xml.sax.ContentHandler.startElement.

When you're processing a DOCTYPE (DTD), you have the startDTD and endDTD markers to provide some context for the startEntity event. However, there is no such marker to provide context for an element, and thus there is no way to detect that the entity you're processing is associated with an attribute node. As far as my code knows, it could just as well be processing an Entity Reference node in element content, which it handles differently from an entity reference in an attribute value.

Since the startElement event requires that the processing of the attributes has been completed (they are passed as a parameter), the only good solution I can think of is to add startAttribute and endAttribute events to the LexicalHandler, similar to what's done with the DTD events. I am not sure who controls the SAX-2 specification, what its current state is with respect to having these events added, or what influence the xerces team has in this area.

I have been able to work around this problem by adding the public method getScannerState() to the class org.apache.xerces.framework.XMLDocumentScanner. When the state is SCANNER_STATE_ATTRIBUTE_VALUE, I can conclude that I'm processing an element's attribute value, which gives me the context I need. This solution is NOT desirable in the long run, as it requires me to modify xerces and build a new xerces.jar file, and I would rather distribute an official build of xerces than a private one with our product. However, providing a public method to get the internal state of the XML scanner would not be a bad idea, and would be my fallback position if a modification cannot be made to the SAX-2 LexicalHandler interface.
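The event ordering described above can be observed with a small tracing handler. What follows is only a sketch under stated assumptions, not the code from my product: it uses the JAXP SAXParserFactory and org.xml.sax.ext.DefaultHandler2, which may not be available alongside xerces 1.3.1, and the class name EventOrderTrace, the SAMPLE string, and the "(context unknown)" label are made up for illustration. The sample declares the entity with the numeric reference &#169; purely to keep the Java source ASCII. Note also that whether startEntity is reported at all for a reference inside an attribute value depends on the parser; the SAX-2 LexicalHandler documentation says such boundaries cannot be reported, so a different parser may simply expand the entity silently.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper: records lexical and content events in arrival order.
class EventOrderTrace extends DefaultHandler2 {

    // Reduced version of the stylesheet from this post (assumption: &#169; for the copyright sign).
    static final String SAMPLE =
        "<!DOCTYPE xsl:stylesheet [ <!ENTITY copy \"&#169;\"> ]>\n"
      + "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
      + " xmlns:ebt=\"http://www.ebt.com/2001/XSL/Transform\">\n"
      + "  <xsl:variable name=\"ebt:copyright\""
      + " select='\"Copyright &copy; 1999-2001 eBT\"'/>\n"
      + "</xsl:stylesheet>";

    private final List<String> events = new ArrayList<>();
    private boolean inDtd = false;

    @Override
    public void startDTD(String name, String publicId, String systemId) {
        inDtd = true;
        events.add("startDTD " + name);
    }

    @Override
    public void endDTD() {
        inDtd = false;
        events.add("endDTD");
    }

    @Override
    public void startEntity(String name) {
        // Outside the DTD nothing tells us whether we are in element
        // content or in an attribute value; that is the missing context.
        events.add("startEntity " + name
            + (inDtd ? " (inside DTD)" : " (context unknown)"));
    }

    @Override
    public void endEntity(String name) {
        events.add("endEntity " + name);
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        // By this point the attribute values have already been fully expanded.
        String select = atts.getValue("select");
        events.add("startElement " + qName
            + (select == null ? "" : " select=" + select));
    }

    // Parses the given document and returns the recorded events.
    static List<String> trace(String xml) throws Exception {
        EventOrderTrace handler = new EventOrderTrace();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.setProperty(
            "http://xml.org/sax/properties/lexical-handler", handler);
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : trace(SAMPLE)) {
            System.out.println(event);
        }
    }
}
```

If the parser behaves like the one described in this post, a "startEntity copy (context unknown)" line would show up ahead of the startElement line for xsl:variable; if no startEntity line appears, the parser is expanding the reference without reporting it.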
I would appreciate hearing from anyone who has a better solution to my problem. If there is none, can the xerces team have this modification to the SAX-2 LexicalHandler approved and implemented? Failing that, a public method to get the XML scanner state would be greatly appreciated.

Steven L. Murray
eBusiness Technologies
[EMAIL PROTECTED]
