Hello Simon! Thank you for your quick response.
The reason for sending this message to two groups is the following: I still think this is a bug/feature in the Xalan package, and if so it should be patched. Of course I could have written two mails with exactly the same content.

Look at this example:

I have this element:

<element>I am tired</element>

I extract the text part "I am tired".

I translate the text part into something like this: "Ich bin müde".

I replace the German ü, which is an ISO-8859-1 character with a character code larger than 127, with a proper XML numeric character entity (&#253;), giving us the following string:

"Ich bin m&#253;de"

I replace the text part of the element, giving (if the API works properly):

<element>Ich bin m&#253;de</element>

This is then serialized into

<element>Ich bin m&amp;#253;de</element>

which from my point of view is incorrect behaviour, since the text content of the previous element was totally correct from an XML point of view?!

Any good ideas?!

BR /Erik

-----Original Message-----
From: Simon Kitching [mailto:[EMAIL PROTECTED]
Sent: 24 September 2003 11:38
To: Erik Ytterman
Cc: [EMAIL PROTECTED]; 'Beatrice Nilsson'
Subject: Re: Numeric entity problem

Hi Eric,

First of all, a minor note on etiquette: it is generally frowned upon to post to both the user and dev email lists. The user list is certainly the best place for this sort of question.

I believe that Xerces is behaving exactly as expected; you told it that the content of a text node is a string containing the characters '&', '2', '3', etc. This is *text* to Xerces, and because text cannot contain a bare ampersand, it is escaped when the data is written out.

I suggest you try this:

    char[] c = {253}; // array of 1 char which is unicode char #253
    String str = new String(c);

Now put this string (containing the unicode character #253) into the node.

I suspect there is actually a way to specify unicode chars directly in string literals, maybe something like:

    String s = "\xFD";

I'm not sure about that, though.
Regards,

Simon

On Wed, 2003-09-24 at 21:11, Erik Ytterman wrote:
> Dear All!
>
> I'm struggling with a problem that needs to be solved as soon as
> possible. Hope that you will be able to help me. I will attach parts
> of the code.
>
> I'm doing the following:
>
> 1. Receive a callback with a proper XML document.
> (DocumentHandler.handleDocument())
>
> 2. Use XPath to find the element to process
> (DocumentHandler.translateDocument())
>
> 3. Find the text content of this element.
> (DocumentHandler.translateDocument())
>
> 4. Translate the textual content of the element.
> (OpenB2BUtil.translateString())
>
> 5. An ugly hack to transform any characters except ASCII into numeric
> entities. (OpenB2BUtil.etitifyIsoString())
>
> 6. Replace the textual content of the element, including numeric
> entities (DocumentHandler.translateDocument())
>
> 7. Serialize the resulting DOM tree using transformers
> (OpenB2BUtil.documentToStream())
>
> Problem:
>
> As can be seen from the code, I replace the textual content of an
> element with a string that contains numeric entities (&#253;). My
> problem is that the serialization seems to translate this into
> (&amp;#253;).
>
> Questions:
>
> 1. Is this a bug in Xalan? From my point of view, it should leave the
> numeric entity in the text payload untouched, since it is proper XML.
>
> 2. If not, is there a way to disable this "feature" in Xalan, so that
> these perfectly legal numeric entities are let through in the
> serialization?
>
> 3. If not, any suggestions on how to solve the problem?
>
> /Erik
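
For readers of the archive, the behaviour discussed in this thread can be reproduced with a small JAXP sketch. It is not the code from the original attachment: the class name NumericEntityDemo and the helper serialize() are made up for illustration, and the choice of "US-ASCII" as the output encoding is an assumption about how one might ask the serializer to emit numeric character references itself. The first call stores the literal characters '&', '#', '2', '5', '3', ';' in a text node, so the ampersand is escaped on output. The second stores the actual character U+00FD, written with Java's \u escape (Java string literals have no \x escape).

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class NumericEntityDemo {

    /**
     * Builds <element>text</element> as a DOM tree and serializes it
     * with an identity transformer, without an XML declaration.
     */
    public static String serialize(String text) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element element = doc.createElement("element");
        // The text node holds characters, not markup: nothing in it is parsed.
        element.appendChild(doc.createTextNode(text));
        doc.appendChild(element);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Assumption: an ASCII output encoding, so that the serializer has to
        // represent non-ASCII characters in some escaped form.
        transformer.setOutputProperty(OutputKeys.ENCODING, "US-ASCII");

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Pre-escaped string: the '&' is plain text and gets escaped as &amp;
        System.out.println(serialize("Ich bin m&#253;de"));

        // Real character U+00FD in the node: escaping is left to the serializer.
        System.out.println(serialize("Ich bin m\u00FDde"));
    }
}
```

With the text node holding the real character, the "entitify" step (step 5 in the original message) should no longer be needed, since producing character references becomes the serializer's job. Note also that code 253 (U+00FD) is "ý"; the "ü" in "müde" is U+00FC, i.e. &#252;.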
