For some reason, the body of my previous message did not make it to the list; only the attachment did. I'm not sure why. Here is the body, with the attachment included inline at the end.
----------------------------------------------------------------------

Using fop 0.95 with the PDF output format, if my input text mixes Roman and Cyrillic characters, what do I have to do to get fop to show the proper Cyrillic characters in the PDF output (rather than just '#')? This works with the rtf, txt, and awt output methods but not with PS or PDF. I am aware of the need to create Unicode mappings in PDF (having written software that generates PDF), but I don't see how to tell fop to do this. I understand that such a mapping is not required for the other formats since the mapping is handled by the viewer.

I apologize if this is a FAQ. I've searched the list archives and Google, and I've seen many similar questions, but they seem to refer to older versions of fop, and I haven't been able to find resolutions. I've seen documentation about embedding fonts, but it seems to be geared more toward adding typefaces than mapping Unicode characters.

I am using a Debian system. I've tested this both with the Debian fop packages and with a downloaded binary distribution. I've also installed Type 1 Cyrillic fonts and run fop with the following configuration file:

    <fonts>
      <directory>/usr/share/fonts/X11/Type1</directory>
      <auto-detect/>
    </fonts>

but this had no effect. Perhaps I need to do more than that.

Looking at the source code, I can see that fop is explicitly substituting '#' for any character that it doesn't know how to map, but it seems to be hard-coding WinAnsiEncoding for the mapping. As far as I know, WinAnsiEncoding is a single-byte encoding and is not going to have the Cyrillic characters in it. I could be off here; I've only looked lightly through the code. I am certain that the # characters are being generated by fop and are not the result of some kind of font substitution issue at viewing time.
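
As a rough illustration of the substitution described above, here is a minimal Python sketch. It is not fop's actual code. It assumes that Python's cp1252 codec is a close enough stand-in for WinAnsiEncoding (an approximation), and the function name `approximate_winansi` is hypothetical. Any character that the single-byte encoding cannot represent is replaced with '#'.

```python
def approximate_winansi(text, fallback="#"):
    """Replace every character that cp1252 cannot encode with `fallback`.

    cp1252 is used here only as an approximation of WinAnsiEncoding;
    this is an illustration, not fop's implementation.
    """
    result = []
    for ch in text:
        try:
            ch.encode("cp1252")  # raises if the character has no single-byte code
            result.append(ch)
        except UnicodeEncodeError:
            result.append(fallback)
    return "".join(result)


sample = "Russian spelling of Berkenblit: Беркенблит."
print(approximate_winansi(sample))
# prints: Russian spelling of Berkenblit: ##########.
```

The ten Cyrillic letters become ten '#' characters, which matches what shows up in the generated PDF.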

Here is an excerpt from the actual PDF content stream as generated by fop:

    q
    1 0 0 1 72 72 cm
    BT
    /F1 12 Tf
    1 0 0 -1 0 10.266 Tm
    [(Russian) ( ) (spelling) ( ) (of) ( ) (Berkenblit:) ( ) (##########.) ] TJ
    ET
    Q

I must be missing something here. Here are some specific questions:

 * Is what I'm doing supposed to work? It seems like fop should be able to do the right thing with UTF-8 encoded text in multiple languages. fop is just silently substituting # without even generating a warning, even if I run with the -d flag.

 * Do I have to set up a table somewhere that maps a range of characters to a font, or is fop supposed to do that automatically? Is there some configuration that I can use to tell it to use a different mapping that it might already know about? Is just embedding the appropriate font sufficient? I'm not aware of Type 1 fonts containing Unicode mapping data.

I've attached a sample fo input file. I am just running

    fop a.fo a.pdf

to generate the output.

Thanks for any assistance!

----------------------------------------------------------------------

<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="master" margin="1in">
      <fo:region-body region-name="main-body"/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="master">
    <fo:flow flow-name="main-body">
      <fo:block>
        Russian spelling of Berkenblit: Беркенблит.
      </fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>