     [ http://issues.apache.org/jira/browse/XALANJ-1710?page=all ]

Brian Minchau closed XALANJ-1710:
---------------------------------

Closing this issue. In the 2.7 code two things have changed:

1) More accurate information about whether a character is in an encoding or not. Because of a bug introduced about 2 years ago, Sun's CharToByteConverter was not used at all and the information was often wrong, causing this SAXException. As of release 2.7 this error will only happen when the character really is not in the encoding.

2) This problem was reduced to a warning, not an error. So it only issues a message and processing continues. It might fill up your webserver's log, but there is no exception.

> Incorrect SAXException about bad integral value of a character to be written out.
> ---------------------------------------------------------------------------------
>
>          Key: XALANJ-1710
>          URL: http://issues.apache.org/jira/browse/XALANJ-1710
>      Project: XalanJ2
>         Type: Bug
>   Components: Serialization
>     Versions: Latest Development Code
>  Environment: Operating System: Other
>               Platform: Other
>     Reporter: Brian Minchau
>     Priority: Blocker
>      Fix For: 2.7
>  Attachments: TransformerImplPatch.txt, apache.patch.24278.txt,
>               apache.patch.24279.txt, bug4.xml, bug4.xsl
>
> I got this exception for Xalan-J interpretive (note that this works
> just fine with XSLTC):
> org.xml.sax.SAXException:
>   Attempt to output character of integral value 338
>   that is not represented in specified output encoding of .
>     at org.apache.xml.serializer.ToTextStream.writeNormalizedChars(ToTextStream.java:393)
>     at org.apache.xml.serializer.ToTextStream.characters(ToTextStream.java:237)
>     at org.apache.xml.utils.FastStringBuffer.sendSAXcharacters(FastStringBuffer.java:1024)
>     at org.apache.xml.dtm.ref.sax2dtm.SAX2DTM.dispatchCharactersEvents(SAX2DTM.java:599)
>     . . .
> There are two problems. I shouldn't get this message at all, but if I should
> then it should have the name of the encoding UTF-8, which it doesn't.
> I'm going to attach a simple XML/XSL pair as a testcase.
> This problem is in ToTextStream and is due to the fix for bug 795 being applied.
> The else {...} clause in writing out a character in ToTextStream:
>     if (S_LINEFEED == c && useLineSep)
>     {
>         writer.write(m_lineSep, 0, m_lineSepLen);
>     }
>     else if (c <= M_MAXCHARACTER)
>     {
>         writer.write(c);
>     }
>     else if (isUTF16Surrogate(c))
>     {
>         writeUTF16Surrogate(c, ch, i, end);
>         i++; // two input characters processed
>     }
>     else
>     {
>         String encoding = getEncoding();
>         String integralValue = Integer.toString(c);
>         throw new SAXException(XMLMessages.createXMLMessage(
>             XMLErrorResources.ER_ILLEGAL_CHARACTER,
>             new Object[]{ integralValue, encoding}));
>     }
> now gives a SAXException, but it used to just write out the character anyway.
> The problem is that M_MAXCHARACTER is 127 and the encoding is not set for
> the ToTextStream serializer at all. Should the encoding be set? I'm not sure,
> because this is an intermediate, internal use of a serializer to create a value.
> It is not the final serializer, which would be a ToXMLStream one.
> Perhaps we need a way to officially signal to a serializer that it doesn't have
> to do any escaping or worry about character encoding. We've had trouble like
> this before where '&' turned into &amp; then into &amp;amp; because of double
> processing by an intermediate and then a final serializer. It would be cleaner
> to let a serializer know that it is just an intermediate utility one. I've
> discussed this with Morris Kwan, but he doesn't think that this is needed in
> general, probably just for ToTextStream.
> Still, we've managed to make the serializer independent of Xalan-J interpretive
> and of XSLTC; I'd like to make the reverse more true and just use the
> serializer by its interface only.... but I'm digressing.
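
As a rough illustration of the two 2.7 changes described in the closing comment (decide from the encoding itself whether a character can be represented, and report a warning instead of throwing), here is a minimal Java sketch. It is not the Xalan-J code: it uses the JDK's java.nio.charset.CharsetEncoder.canEncode rather than Sun's CharToByteConverter, and the class name EncodingCheckSketch and the helper checkEncodable are hypothetical names made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.ArrayList;
import java.util.List;

public class EncodingCheckSketch {

    // Hypothetical helper (not part of Xalan-J): returns one warning per code
    // point that the named encoding cannot represent, instead of throwing a
    // SAXException on the first one.
    static List<String> checkEncodable(String text, String encodingName) {
        CharsetEncoder encoder = Charset.forName(encodingName).newEncoder();
        List<String> warnings = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            // A surrogate pair is tested as one unit; a lone surrogate is
            // tested on its own and is reported as not encodable.
            int codePoint = text.codePointAt(i);
            int width = Character.charCount(codePoint);
            if (!encoder.canEncode(text.substring(i, i + width))) {
                // Wording borrowed from the message quoted in the report.
                warnings.add("Attempt to output character of integral value "
                        + codePoint + " that is not represented in specified"
                        + " output encoding of " + encodingName + ".");
            }
            i += width;
        }
        return warnings;
    }

    public static void main(String[] args) {
        // U+0152 is the character with integral value 338 from the report.
        String text = "\u0152uvre";
        System.out.println("UTF-8 warnings: " + checkEncodable(text, "UTF-8").size());
        for (String warning : checkEncodable(text, "US-ASCII")) {
            // Processing continues after the warning; nothing is thrown.
            System.err.println("Warning: " + warning);
        }
    }
}
```

With this input, UTF-8 produces no warnings, while US-ASCII produces a single warning for character 338. What a serializer should then do with such a character (write it anyway, escape it, or drop it) is left open here.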
