Re: creating a PDF document containing UTF-8 characters

2005-09-01 Thread Daniel Noll

Melih Ovadya wrote:

> Thank you Jeremias, I followed the instructions and now I can make it work
> by embedding the font in the document.
>
> Looking at the samples in userconfig.xml, this is how I register the fonts:
>
> <fonts>
>   <font metrics-file="ttfarialuni.xml"
>         embed-file="C:\WINDOWS\Fonts\ArialUni.ttf" kerning="yes">
>     <font-triplet name="ArialUni" style="normal" weight="normal"/>
>     <font-triplet name="ArialUni" style="normal" weight="bold"/>
>     <font-triplet name="ArialUni" style="italic" weight="normal"/>
>     <font-triplet name="ArialUni" style="italic" weight="bold"/>
>   </font>
> </fonts>
>
> What happens if I omit the embed-file attribute? After the PDF is created,
> on the client machine, how will it find the right font to display the
> characters in that case?

You will see # symbols for every character that couldn't be mapped to a 
known font. The # characters are actually generated into the PDF (as far 
as I can tell... because every PDF viewer showed the same placeholder 
character.)



> With this flag, does FOP include the whole font in the PDF, or just a
> subset to be able to display the Unicode characters?
 

Halfway between those two. It will embed enough of the font to display
all characters used in the document. That said, it does noticeably enlarge
the PDF even if the document only contains characters from US-ASCII.


A long-standing issue I have with this stuff is that on Windows, I have 
more than enough fonts to display a huge subset of Unicode, and Java 
makes use of font substitution, such that when you write, say, 猫, it 
will display it even if you've set the font to Times New Roman. But with 
PDF, if I don't have ArialUni.ttf on my machine and it wasn't embedded 
into the PDF, the PDF viewer isn't smart enough to substitute fonts. I 
think (correct me if I'm wrong) that this is more an issue with PDF, 
though, than an issue with FOP. I personally prefer generating HTML to 
generating PDF, and we only generate PDF (and TIFF, for that matter) due 
to client feature requests. :-/


Daniel

--
Daniel Noll

NUIX Pty Ltd
Level 8, 143 York Street, Sydney 2000
Phone: (02) 9283 9010
Fax:   (02) 9283 9020



-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: creating a PDF document containing UTF-8 characters

2005-09-01 Thread Jeremias Maerki

On 01.09.2005 06:25:54 Daniel Noll wrote:
> Melih Ovadya wrote:
>
> > Thank you Jeremias, I followed the instructions and now I can make it work
> > by embedding the font in the document.
> >
> > Looking at the samples in userconfig.xml, this is how I register the fonts:
> >
> > <fonts>
> >   <font metrics-file="ttfarialuni.xml"
> >         embed-file="C:\WINDOWS\Fonts\ArialUni.ttf" kerning="yes">
> >     <font-triplet name="ArialUni" style="normal" weight="normal"/>
> >     <font-triplet name="ArialUni" style="normal" weight="bold"/>
> >     <font-triplet name="ArialUni" style="italic" weight="normal"/>
> >     <font-triplet name="ArialUni" style="italic" weight="bold"/>
> >   </font>
> > </fonts>
> >
> > What happens if I omit the embed-file attribute? After the PDF is created,
> > on the client machine, how will it find the right font to display the
> > characters in that case?
>
> You will see # symbols for every character that couldn't be mapped to a
> known font. The # characters are actually generated into the PDF (as far
> as I can tell... because every PDF viewer showed the same placeholder
> character.)

Actually, omitting the embed-file attribute will result in Acrobat Reader
complaining about not finding the font. FOP is not able to handle this
properly yet.
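
For illustration, a sketch of the non-embedded variant of the entry quoted
above, assuming the same userconfig.xml syntax used in this thread. Only the
metrics file is referenced, so the PDF names the font but carries no glyph
data, and the viewer has to locate ArialUni on its own. Cutting it down to a
single font-triplet is a simplification made for this sketch.

```xml
<fonts>
  <!-- Assumed variant of the entry quoted above: embed-file is removed,
       so FOP uses only the metrics and the PDF refers to the font by name. -->
  <font metrics-file="ttfarialuni.xml" kerning="yes">
    <font-triplet name="ArialUni" style="normal" weight="normal"/>
  </font>
</fonts>
```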

> > With this flag, does FOP include the whole font in the PDF, or just a
> > subset to be able to display the Unicode characters?
>
> Halfway between those two. It will embed enough of the font to display
> all characters used in the document. That said, it does noticeably enlarge
> the PDF even if the document only contains characters from US-ASCII.
>
> A long-standing issue I have with this stuff is that on Windows, I have
> more than enough fonts to display a huge subset of Unicode, and Java
> makes use of font substitution, such that when you write, say, 猫, it
> will display it even if you've set the font to Times New Roman. But with
> PDF, if I don't have ArialUni.ttf on my machine and it wasn't embedded
> into the PDF, the PDF viewer isn't smart enough to substitute fonts. I
> think (correct me if I'm wrong) that this is more an issue with PDF,
> though, than an issue with FOP.

It's more an issue with FOP in my opinion. So let's hope for you two
that someone will eventually add this missing feature. Did I say that
the source code of FOP is available? :-)

> I personally prefer generating HTML to
> generating PDF, and we only generate PDF (and TIFF, for that matter) due
> to client feature requests. :-/


Jeremias Maerki





Re: creating a PDF document containing UTF-8 characters

2005-09-01 Thread Jeremias Maerki

On 01.09.2005 09:33:06 Daniel Noll wrote:
> Jeremias Maerki wrote:
>
> > It's more an issue with FOP in my opinion. So let's hope for you two
> > that someone will eventually add this missing feature. Did I say that
> > the source code of FOP is available? :-)
>
> So there is a feature in PDF to store the actual content as UTF-8
> without the font embedded, and to tell the PDF viewer to use whatever
> fonts it can find to render the document?  Because that's what we
> really need to solve the problem.

I know that Acrobat can replace non-embedded Type1 fonts so I believe it
should be possible for TrueType, too. FOP is probably not even far from
supporting that. It's probably only about a few details, but then I'm
not the specialist on TrueType fonts in PDF. I'd have to dive in a lot
further to make a really usable statement in this regard.

But! I don't know how far this font substitution goes. "Using whatever
fonts it can find" is probably stretching reality considerably. I
think there are some standards for font substitution but I have never
worked with them.

> Particularly in our system, we can't guarantee that the person
> generating the PDF has the same fonts as the person reading it.  And
> worse, we can't even guarantee that the person generating it has
> ARIALUNI.TTF (as it's not freely redistributable.)  So embedding fonts
> in any fashion doesn't solve the problem, and neither does font
> substitution at the time of generation.

In such an environment it is normally by far the best solution to embed
the font. You just need to find fonts that you can redistribute in
embedded form. But you may need to buy licenses.

> I was thinking that the lack of font substitution at the time of
> rendering was more of an issue with PDF, but if you say it's an issue
> with FOP then I'll take your word for it because I don't know PDF well
> at all, just its damage to society. :-)

Damage to society when not using font embedding, yes. :-) If you can't
control the target environment you're bound to run into problems if you
don't embed fonts. Sad reality. I believe that font substitution will
only solve 95% of your problems.

Jeremias Maerki





RE: creating a PDF document containing UTF-8 characters

2005-08-31 Thread Melih Ovadya
Thank you Jeremias, I followed the instructions and now I can make it work
by embedding the font in the document.

Looking at the samples in userconfig.xml, this is how I register the fonts:

<fonts>
  <font metrics-file="ttfarialuni.xml"
        embed-file="C:\WINDOWS\Fonts\ArialUni.ttf" kerning="yes">
    <font-triplet name="ArialUni" style="normal" weight="normal"/>
    <font-triplet name="ArialUni" style="normal" weight="bold"/>
    <font-triplet name="ArialUni" style="italic" weight="normal"/>
    <font-triplet name="ArialUni" style="italic" weight="bold"/>
  </font>
</fonts>

What happens if I omit the embed-file attribute? After the PDF is created,
on the client machine, how will it find the right font to display the
characters in that case? With this flag, does FOP include the whole font in
the PDF, or just a subset to be able to display the Unicode characters?

Regards,
melih





Re: creating a PDF document containing UTF-8 characters

2005-08-30 Thread Jeremias Maerki
It's not a bug. Please consult
http://xmlgraphics.apache.org/fop/faq.html#pdf-characters

On 30.08.2005 01:59:56 Melih Ovadya wrote:
> Hi,
>
> I have a question regarding UTF-8 characters when creating a PDF
> document:
>
> I have the initial content in byte[] myBytes, and myBytes contains some
> Japanese characters. Yet, once I render it, each Japanese character is
> displayed as a pound sign (#).
>
> Is this a known bug, or am I missing something?


Jeremias Maerki





RE: creating a PDF document containing UTF-8 characters

2005-08-30 Thread Melih Ovadya
Thank you for this info. Is there a way to set the font for the whole
document so that it would render the Japanese characters correctly?

melih




Re: creating a PDF document containing UTF-8 characters

2005-08-30 Thread Jeremias Maerki
You simply need to get a TrueType font that contains the necessary
characters. Create the font metrics file and add it to the configuration
as shown in the documentation. You can then simply specify
font-family="MyJapaneseFont" on the fo:root element. It is automatically
inherited by all child elements.
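
A minimal sketch of what that might look like, assuming a font has been
registered in the configuration with a font-triplet named MyJapaneseFont.
That name, the master name "page" and the page size are placeholders chosen
for this example, not values from the FOP documentation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- font-family is set once on fo:root; the value must match the name
     given in a font-triplet of the FOP configuration (placeholder here). -->
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format"
         font-family="MyJapaneseFont">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page"
                           page-width="210mm" page-height="297mm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <!-- No font-family here: the block inherits it from fo:root. -->
      <fo:block>猫</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

A block that needs a different font can still override the inherited value
with its own font-family attribute.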

On 30.08.2005 21:13:22 Melih Ovadya wrote:
> Thank you for this info. Is there a way to set the font for the whole
> document so that it would render the Japanese characters correctly?
>
> melih
 

Jeremias Maerki

