Re: creating a PDF document containing UTF-8 characters
Melih Ovadya wrote:

> Thank you Jeremias, I followed the instructions and now I can make it work by embedding the font in the document. Looking at the samples in userconfig.xml, this is how I register the fonts:
>
>   <font metrics-file="ttfarialuni.xml" embed-file="C:\WINDOWS\Fonts\ArialUni.ttf" kerning="yes">
>     <font-triplet name="ArialUni" style="normal" weight="normal"/>
>     <font-triplet name="ArialUni" style="normal" weight="bold"/>
>     <font-triplet name="ArialUni" style="italic" weight="normal"/>
>     <font-triplet name="ArialUni" style="italic" weight="bold"/>
>   </font>
>   </fonts>
>
> What happens if I omit the embed-file tag? After the PDF is created, on the client machine, how will it find the right font to display the characters in that case?

You will see # symbols for every character that couldn't be mapped to a known font. The # characters are actually generated into the PDF (as far as I can tell... because every PDF viewer showed the same placeholder character.)

> With this flag, does FO include whole font in the PDF, or just a subset to be able to display the Unicode characters?

Half way between those two. It will embed enough of the font to display all characters used in the document. That is, it does noticeably enlarge the PDF even if the document only contains characters from US-ASCII.

A long-standing issue I have with this stuff is that on Windows, I have more than enough fonts to display a huge subset of Unicode, and Java makes use of font substitution, such that when you write, say, 猫, it will display it even if you've set the font to Times New Roman. But with PDF, if I don't have ArialUni.ttf on my machine and it wasn't embedded into the PDF, the PDF viewer isn't smart enough to substitute fonts.

I think (correct me if I'm wrong) that this is more an issue with PDF, though, than an issue with FOP.

I personally prefer generating HTML to generating PDF, and we only generate PDF (and TIFF, for that matter) due to client feature requests. :-/

Daniel

--
Daniel Noll
NUIX Pty Ltd
Level 8, 143 York Street, Sydney 2000
Phone: (02) 9283 9010 Fax: (02) 9283 9020

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: creating a PDF document containing UTF-8 characters
On 01.09.2005 06:25:54 Daniel Noll wrote:

> Melih Ovadya wrote:
>
>> Thank you Jeremias, I followed the instructions and now I can make it work by embedding the font in the document. Looking at the samples in userconfig.xml, this is how I register the fonts:
>>
>>   <font metrics-file="ttfarialuni.xml" embed-file="C:\WINDOWS\Fonts\ArialUni.ttf" kerning="yes">
>>     <font-triplet name="ArialUni" style="normal" weight="normal"/>
>>     <font-triplet name="ArialUni" style="normal" weight="bold"/>
>>     <font-triplet name="ArialUni" style="italic" weight="normal"/>
>>     <font-triplet name="ArialUni" style="italic" weight="bold"/>
>>   </font>
>>   </fonts>
>>
>> What happens if I omit the embed-file tag? After the PDF is created, on the client machine, how will it find the right font to display the characters in that case?
>
> You will see # symbols for every character that couldn't be mapped to a known font. The # characters are actually generated into the PDF (as far as I can tell... because every PDF viewer showed the same placeholder character.)

Actually, omitting the embed-file tag will result in Acrobat Reader complaining about not finding the font. FOP is not able to handle this properly, yet.

>> With this flag, does FO include whole font in the PDF, or just a subset to be able to display the Unicode characters?
>
> Half way between those two. It will embed enough of the font to display all characters used in the document. That is, it does noticeably enlarge the PDF even if the document only contains characters from US-ASCII.
>
> A long-standing issue I have with this stuff is that on Windows, I have more than enough fonts to display a huge subset of Unicode, and Java makes use of font substitution, such that when you write, say, 猫, it will display it even if you've set the font to Times New Roman. But with PDF, if I don't have ArialUni.ttf on my machine and it wasn't embedded into the PDF, the PDF viewer isn't smart enough to substitute fonts.
>
> I think (correct me if I'm wrong) that this is more an issue with PDF, though, than an issue with FOP.

It's more an issue with FOP in my opinion. So let's hope for you two that someone will eventually add this missing feature. Did I say that the source code of FOP is available? :-)

> I personally prefer generating HTML to generating PDF, and we only generate PDF (and TIFF, for that matter) due to client feature requests. :-/

Jeremias Maerki
Re: creating a PDF document containing UTF-8 characters
On 01.09.2005 09:33:06 Daniel Noll wrote:

> Jeremias Maerki wrote:
>
>> It's more an issue with FOP in my opinion. So let's hope for you two that someone will eventually add this missing feature. Did I say that the source code of FOP is available? :-)
>
> So there is a feature in PDF to store the actual content as UTF-8 without the font embedded, and to tell the PDF viewer to use whatever fonts it can find to render the document? Because that's what we really need to solve the problem.

I know that Acrobat can replace non-embedded Type1 fonts, so I believe it should be possible for TrueType, too. FOP is probably not even far from supporting that. It's probably only about a few details, but then I'm not the specialist on TrueType fonts in PDF. I'd have to dive in a lot further to make a really usable statement in this regard.

But! I don't know how far this font substitution goes. "Using whatever fonts it can find" is probably stretching the reality considerably. I think there are some standards for font substitution, but I have never worked with them.

> Particularly in our system, we can't guarantee that the person generating the PDF has the same fonts as the person reading it. And worse, we can't even guarantee that the person generating it has ARIALUNI.TTF (as it's not freely redistributable.) So embedding fonts in any fashion doesn't solve the problem, and neither does font substitution at the time of generation.

In such an environment it is normally by far the best solution to embed the font. You just need to find fonts that you can redistribute in embedded form. But you may need to buy licenses.

> I was thinking that the lack of font substitution at the time of rendering was more of an issue with PDF, but if you say it's an issue with FOP then I'll take your word for it because I don't know PDF well at all, just its damage to society. :-)

Damage to society when not using font embedding, yes. :-) If you can't control the target environment you're bound to run into problems if you don't embed fonts. Sad reality. I believe that font substitution will only solve 95% of your problems.

Jeremias Maerki
RE: creating a PDF document containing UTF-8 characters
Thank you Jeremias,

I followed the instructions and now I can make it work by embedding the font in the document. Looking at the samples in userconfig.xml, this is how I register the fonts:

  <font metrics-file="ttfarialuni.xml" embed-file="C:\WINDOWS\Fonts\ArialUni.ttf" kerning="yes">
    <font-triplet name="ArialUni" style="normal" weight="normal"/>
    <font-triplet name="ArialUni" style="normal" weight="bold"/>
    <font-triplet name="ArialUni" style="italic" weight="normal"/>
    <font-triplet name="ArialUni" style="italic" weight="bold"/>
  </font>
  </fonts>

What happens if I omit the embed-file tag? After the PDF is created, on the client machine, how will it find the right font to display the characters in that case?

With this flag, does FO include whole font in the PDF, or just a subset to be able to display the Unicode characters?

Regards,
melih
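For reference, the registration in this message might sit inside a complete, well-formed configuration file roughly as sketched below. The enclosing configuration and fonts elements are an assumption based on the FOP 0.20.5-style userconfig.xml layout, which the message does not state; the file names and attribute values are the ones given in the message.

```xml
<!-- Sketch only: the <configuration> and <fonts> wrappers assume the
     FOP 0.20.5-style userconfig.xml layout; other FOP versions differ. -->
<configuration>
  <fonts>
    <!-- metrics-file: XML metrics generated from the TrueType font.
         embed-file: the font file whose glyphs are embedded in the PDF. -->
    <font metrics-file="ttfarialuni.xml"
          embed-file="C:\WINDOWS\Fonts\ArialUni.ttf"
          kerning="yes">
      <!-- One triplet per style/weight combination that should
           resolve to this font when "ArialUni" is requested. -->
      <font-triplet name="ArialUni" style="normal" weight="normal"/>
      <font-triplet name="ArialUni" style="normal" weight="bold"/>
      <font-triplet name="ArialUni" style="italic" weight="normal"/>
      <font-triplet name="ArialUni" style="italic" weight="bold"/>
    </font>
  </fonts>
</configuration>
```

All four triplets point at the same font file, so bold and italic requests for ArialUni resolve to the regular face instead of falling back to a font without the needed glyphs.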
Re: creating a PDF document containing UTF-8 characters
It's not a bug. Please consult http://xmlgraphics.apache.org/fop/faq.html#pdf-characters

On 30.08.2005 01:59:56 Melih Ovadya wrote:

> Hi,
>
> I have a question regarding UTF-8 characters when creating a PDF document: I have the initial content in byte[] myBytes, and myBytes contains some Japanese characters. Yet, once I render it, each Japanese character is displayed as a pound sign (#). Is this a known bug, or am I missing something?

Jeremias Maerki
RE: creating a PDF document containing UTF-8 characters
Thank you for this info. Is there a way to set the font for the whole document so that it would render the Japanese characters correctly?

melih

-----Original Message-----
From: Jeremias Maerki [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 30, 2005 12:29 AM
To: fop-users@xmlgraphics.apache.org
Subject: Re: creating a PDF document containing UTF-8 characters

It's not a bug. Please consult http://xmlgraphics.apache.org/fop/faq.html#pdf-characters

On 30.08.2005 01:59:56 Melih Ovadya wrote:

> Hi,
>
> I have a question regarding UTF-8 characters when creating a PDF document: I have the initial content in byte[] myBytes, and myBytes contains some Japanese characters. Yet, once I render it, each Japanese character is displayed as a pound sign (#). Is this a known bug, or am I missing something?

Jeremias Maerki
Re: creating a PDF document containing UTF-8 characters
You simply need to get a TrueType font that contains the necessary characters. Create the font metrics file and add it to the configuration as shown in the documentation. You can then simply specify font-family="MyJapaneseFont" on the fo:root element. It is automatically inherited by all child elements.

On 30.08.2005 21:13:22 Melih Ovadya wrote:

> Thank you for this info. Is there a way to set the font for the whole document so that it would render the Japanese characters correctly?
>
> melih
>
> -----Original Message-----
> From: Jeremias Maerki [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, August 30, 2005 12:29 AM
> To: fop-users@xmlgraphics.apache.org
> Subject: Re: creating a PDF document containing UTF-8 characters
>
> It's not a bug. Please consult http://xmlgraphics.apache.org/fop/faq.html#pdf-characters
>
> On 30.08.2005 01:59:56 Melih Ovadya wrote:
>
>> Hi,
>>
>> I have a question regarding UTF-8 characters when creating a PDF document: I have the initial content in byte[] myBytes, and myBytes contains some Japanese characters. Yet, once I render it, each Japanese character is displayed as a pound sign (#). Is this a known bug, or am I missing something?
>
> Jeremias Maerki

Jeremias Maerki
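The suggestion of setting the font once on fo:root could look roughly like the minimal XSL-FO document below. It is only a sketch: "MyJapaneseFont" is the placeholder name from the message and has to match a font-triplet name registered in the FOP font configuration; the page size, the master name "page" and the sample text are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: "MyJapaneseFont" is a placeholder and must match the
     name of a font-triplet in the FOP font configuration. -->
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format"
         font-family="MyJapaneseFont">
  <fo:layout-master-set>
    <!-- Illustrative A4 page; the master name "page" is arbitrary. -->
    <fo:simple-page-master master-name="page"
                           page-width="210mm" page-height="297mm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <!-- No font-family here: the block inherits it from fo:root. -->
      <fo:block>日本語のテキスト</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

Because font-family is an inherited property, an individual block can still override it locally where a different font is wanted.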