Character Encoding in TXTRenderer
Hi all,

I am using fop-0.20.4 to create PDF and text files from XML files encoded in ISO-8859-1. While the PDF files are OK, the text files are always UTF-8 encoded. Looking at the TXTRenderer sources, I found the reason for this behaviour: the TXTRenderer uses the TXTStream class to write to an OutputStream, and TXTStream assumes a UTF-8 encoding:

    public void add(String str) {
        if (!doOutput)
            return;
        try {
            byte buff[] = str.getBytes("UTF-8");
            out.write(buff);
        } catch (IOException e) {
            throw new RuntimeException(e.toString());
        }
    }

I don't want to simply change this to another fixed encoding, even though, as far as I know now, I will always use the same one. My FO files always contain an encoding attribute in the XML declaration, so I thought the ContentHandler might tell the renderer which encoding to use, but the SAXContentHandler does not get this information.

I would like to fix this, but I am not sure how to do it. Is there a preferred way to tell the renderer which encoding to use?

Thanks,
Torsten

--
Torsten Straube * picturesafe media/data/bank GmbH
Lüerstr. 3 * D-30175 Hannover
phone: 0511/85620-53 * fax: 0511/85620-10
mailto:[EMAIL PROTECTED]

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, email: [EMAIL PROTECTED]
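One possible direction for the question above, as a minimal sketch only: give the stream a settable encoding that defaults to UTF-8, and let whoever knows the desired encoding set it before rendering. The class name EncodingTextStream and the setEncoding method are hypothetical and are not part of FOP; how the renderer would obtain the encoding (a configuration option, a command-line switch, or something else) is left open here.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;

/** Hypothetical stand-in for TXTStream with a configurable output encoding. */
class EncodingTextStream {
    private final OutputStream out;
    private String encoding = "UTF-8"; // default keeps the behaviour described above
    private boolean doOutput = true;

    EncodingTextStream(OutputStream out) {
        this.out = out;
    }

    /** Sets the output encoding; rejects names the JVM does not support. */
    void setEncoding(String encoding) throws UnsupportedEncodingException {
        // getBytes(String) throws UnsupportedEncodingException for unknown names,
        // so a bad name fails here instead of in the middle of rendering.
        "".getBytes(encoding);
        this.encoding = encoding;
    }

    void add(String str) {
        if (!doOutput) {
            return;
        }
        try {
            out.write(str.getBytes(encoding));
        } catch (IOException e) {
            // UnsupportedEncodingException is a subclass of IOException.
            throw new RuntimeException(e.toString());
        }
    }
}
```

With the encoding set to "ISO-8859-1", a character such as ü is written as the single byte 0xFC rather than as the two-byte UTF-8 sequence.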
Re: Character Encoding
----- Original Message -----
From: J.Pietschmann [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Tuesday, July 09, 2002 9:58 PM
Subject: Re: Character Encoding

> Holger Prause wrote:
>> I use the character sequence &#8722; in an HTML page, where it is
>> displayed as a minus sign. So far so good. Now I want to use that
>> character sequence in FO, but in the
>
> It is a character reference.

Yes, you are right.

>> generated PDF it is displayed as a # sign (which stands for undefined?).
>
> This means the selected font does not have a glyph for it.

OK, I understand that; it's also written in the FOP FAQ.

>> What can I do to display this character sequence? Change the encoding
>> in the stylesheet (or use <xsl:output/>)?
>
> The only way is to get a font with a glyph for it and let FOP use it.
> The mathematical minus is pretty esoteric; you'll probably need a
> special math font. Rummage through implementations for MathML or TeX
> distributions. Why can't you use a dash or hyphen?

What I wanted was a dash, but for some reason I chose the character reference &#8722;, which is, as you already said, a mathematical minus. Now I use the character reference for a dash, and it works fine with my font.

Thanks for the quick response.

Bye,
Holger

> J.Pietschmann
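To make the distinction in this exchange concrete, here is a small sketch that maps decimal character references to their Unicode code points. The thread does not say which dash reference Holger ended up using, so the hyphen-minus (&#45;) and en dash (&#8211;) below are only illustrative candidates; the class and method names are made up for this example.

```java
/** Illustrative only: shows which code points a few character references denote. */
class MinusVersusDash {
    /** Formats a code point in the usual U+XXXX notation. */
    static String describe(int codePoint) {
        String hex = Integer.toHexString(codePoint).toUpperCase();
        while (hex.length() < 4) {
            hex = "0" + hex;
        }
        return "U+" + hex;
    }

    public static void main(String[] args) {
        System.out.println("&#8722; is " + describe(8722)); // MINUS SIGN
        System.out.println("&#45; is " + describe(45));     // HYPHEN-MINUS
        System.out.println("&#8211; is " + describe(8211)); // EN DASH
    }
}
```

Whether any of these renders correctly still depends on the selected font containing a glyph for that code point, as the reply above explains.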
Re: Character encoding on other platforms (previously os/390)
[EMAIL PROTECTED] wrote:
> I've had a couple of folks ask me for the modified code so that
> toString().getBytes() returns bytes in the proper character encoding,
> US-ASCII. It is cool that other people besides me need this.
> [..]
> I downloaded the snapshot xml-fop_20020515162132 and I don't see any
> modification to the code. Is this change going to be incorporated, or
> has it been incorporated in a way that I missed?

It hasn't been incorporated yet, but it's on my to-do list and should be in the next maintenance release.

> Many thanks,
> Jason West

Christian
Character encoding on other platforms (previously os/390)
I've had a couple of folks ask me for the modified code so that toString().getBytes() returns bytes in the proper character encoding, US-ASCII. It is cool that other people besides me need this.

I downloaded the snapshot xml-fop_20020515162132 and I don't see any modification to the code. Is this change going to be incorporated, or has it been incorporated in a way that I missed?

Basically, I had to dig through the code and change every instance of

    return result.toString().getBytes();

to

    try {
        return result.toString().getBytes(PDFConstants.Encoding);
    } catch (UnsupportedEncodingException e) {
        return result.toString().getBytes();
    }

Putting the constant in a class is just my habit; it could easily be replaced by the string literal "US-ASCII".

Is there anything else I can do to make sure that these changes get incorporated, if they are not already?

Many thanks,
Jason West
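Since the same try/catch is repeated at every call site, one option is to collect it in a single helper. The following is only a sketch of that idea: the class AsciiBytes and its ENCODING constant are hypothetical stand-ins (the PDFConstants class mentioned above is not shown in this thread), and the fallback to the platform default mirrors the change described above.

```java
import java.io.UnsupportedEncodingException;

/** Hypothetical helper that centralises the explicit-encoding conversion. */
class AsciiBytes {
    /** Stand-in for the PDFConstants.Encoding constant mentioned above. */
    static final String ENCODING = "US-ASCII";

    /** Converts a string to bytes using US-ASCII instead of the platform default. */
    static byte[] toBytes(String s) {
        try {
            return s.getBytes(ENCODING);
        } catch (UnsupportedEncodingException e) {
            // Same fallback as in the change described above: platform default.
            return s.getBytes();
        }
    }
}
```

A call site would then read `return AsciiBytes.toBytes(result.toString());`. The explicit encoding matters on platforms whose default encoding is not ASCII-compatible (OS/390 defaults to an EBCDIC code page), where a plain getBytes() would produce bytes that do not match the ASCII values expected in the output.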