I still couldn't get this solved. To meet the deadline we resorted to a hack: replacing the bytes for ® with the bytes for the String "(R)", which is really nasty. We are still trying to figure out why it isn't working.
Chris,

I think � is part of the windows-1252 encoding; http://www.juha.karvonen.name/hyoty/char/ says so, though I am not sure how reliable that site is. � is part of both the windows-1252 and iso-8859-1 encodings. What do you mean by "If you can, check to see what the � and � characters are in the Java system"?

Suraj,

I opened the source xml file in XMLSpy and I am able to view the � as is.

We still couldn't figure out what the problem is or how to solve it. I really wonder whether this is so tough, or whether I am missing something basic. I am really doing simple stuff: either reading the source into a String and passing it into the Transformer, or passing the InputStream into the Transformer.

Please let me know if there are any solutions for this.

Thanks in advance,
Pramodh.

----- Original Message -----
From: "Christopher Ebert" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, November 06, 2003 8:15 AM
Subject: RE: Directly referenced special characters as "?"

If you can, check to see what the � and � characters are in the Java system. You have to be careful, because nearly anything you do to print them out may serialize them to a character set that doesn't have them (and so print a ?). The surest way is to print out the characters as bytes along with a '?' and see if they match. This will tell you if you're losing the characters because they're not in the input character set or not the correct encoding for the character set (e.g. not in ISO-8859-1).

This often happens with Windows: Windows uses Cp1252 as the standard encoding, which is very similar to ISO-8859-1, but not the same, so it can look like it's working for a long time*. If so, fix the input encoding, or change all special characters to entities.
Chris

* See 'Dogg's Hamlet' for further discussion of the nature of this problem: http://buedg.daig-kastura.de/stoppard/stopp2.htm

-----Original Message-----
From: Pramodh Peddi [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 06, 2003 12:06 AM
To: [EMAIL PROTECTED]
Subject: Directly referenced special characters as "?"

Hi,

I couldn't solve my problem fully yet. I posted a request a couple of days ago and the responses helped me a bit, but not entirely.

I have an xml (source) file which contains different special characters, some of which are referenced through entities (like ™) and others are referenced directly (like � and �). The entity-referenced characters come out fine after transforming, but the directly referenced chars come out as "?" chars. I am using Java 1.4.2's Transformer for transforming. This is what I am doing in the Java code:

*********************************************************
if (filePath != null) {
    sftp.get(filePath, rawfileOutputStream);
    rawfileOutputStream.close();
}
ByteArrayInputStream rawfileInputStream =
    new ByteArrayInputStream(rawfileOutputStream.toByteArray());
ByteArrayOutputStream transformedFileOutputStream = new ByteArrayOutputStream();
File transformedFile = new File("../server/ic/deploy/data.war/" + this.taxXSLTResult);
FileOutputStream out = new FileOutputStream(transformedFile);
transformer.transform(
    new StreamSource(new InputStreamReader(rawfileInputStream), this.dtdURL),
    new StreamResult(out));
rawfileInputStream.close();
transformedFileOutputStream.close();
*********************************************************

The source file has a "windows-1252" encoding header. In the xsl, I tried xsl: encoding="iso-8859-1" and xsl: encoding="windows-1252". Neither of these worked. I even tried to change the bytes into a String and back into bytes. Nothing works.

I would really appreciate any help!

Thanks in advance,
Pramodh.
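
One detail in the code above may be worth checking, though nothing in this thread confirms it is the cause: new InputStreamReader(rawfileInputStream) is built without a charset, so Java decodes the bytes with the platform default encoding before the XML parser can act on the windows-1252 declaration. The sketch below is an illustration under stated assumptions, not the poster's code. The class and method names (EncodingCheck, toHex, recode, identityTransform) are hypothetical. It uses an identity transform instead of the real stylesheet, assumes the lost character is the registered sign (byte 0xAE), uses ISO-8859-1 rather than windows-1252 so that it runs on any JVM, and targets a current JDK rather than Java 1.4.2. It shows the byte-level check Chris describes, followed by a transform fed from the raw InputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class; none of these names come from the thread.
class EncodingCheck {

    /** Renders bytes as space-separated two-digit hex values. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    /** Decodes raw bytes with one charset, then re-encodes them with another. */
    static byte[] recode(byte[] raw, Charset from, Charset to) {
        return new String(raw, from).getBytes(to);
    }

    /**
     * Runs an identity transform over raw XML bytes. The source is an
     * InputStream, not a Reader, so the XML parser chooses the charset
     * from the document's own XML declaration.
     */
    static byte[] identityTransform(byte[] xml, String outputEncoding)
            throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, outputEncoding);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(
                new StreamSource(new ByteArrayInputStream(xml)),
                new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // 'A' followed by 0xAE, the registered sign in ISO-8859-1.
        byte[] raw = { 0x41, (byte) 0xAE };

        // Correct charset in and out: the byte survives.
        System.out.println(toHex(recode(raw,
                StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1))); // 41 AE

        // Wrong charset on input: 0xAE is not ASCII, so it ends up as '?' (3F).
        System.out.println(toHex(recode(raw,
                StandardCharsets.US_ASCII, StandardCharsets.ISO_8859_1)));   // 41 3F

        // Dump the transform output as hex so the console encoding cannot hide the result.
        String doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>Acme\u00AE</name>";
        byte[] result = identityTransform(
                doc.getBytes(StandardCharsets.ISO_8859_1), "ISO-8859-1");
        System.out.println(toHex(result));
    }
}
```

If the hex dump shows 3F where AE went in, the character was lost while decoding or encoding the stream rather than in the stylesheet, which narrows down where to look.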
