RE: [PDFdev] Chinese character display

Mark Storer Thu, 17 Jul 2003 09:26:29 -0700

PDFdev is a service provided by PDFzone.com | http://www.pdfzone.com
_____________________________________________________________


I suspect that the reason the original file works in Exchange has
something to do with some custom font trickery.  No true WinAnsiEncoded font
contains any Chinese characters.

I'm not sure why it won't work in more recent versions either, but I have a
theory: with every release of Acrobat, Adobe tightens up their support of the
spec a bit, cutting out "sloppy" PDF structures.  I'm guessing whoever
produced your file in the first place was relying on some "sloppiness" to
get the text to display properly.

So when Adobe cleaned up their implementation of the PDF spec, it closed a
door your file needed to work properly.  Dead file.

I'm guessing your font just claims to be WinAnsiEncoding, but is actually a
custom pile of glyphs in a one-off font just for that subset of characters.
To test my theory, you can try copy-n-pasting the text into a regular .txt
file (or whatever).  I'll bet you won't get Chinese, just garbage.  The use
of WinAnsiEncoding tells me that each character has to be a single byte, so
you definitely won't be getting Unicode, or any Chinese character encoding.
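
Here is a minimal Python sketch of that single-byte argument.  It is only an
illustration: the sample string is mine, Big5 is just one possible Chinese
encoding, and Python's "cp1252" codec stands in for WinAnsiEncoding (the two
are close but not guaranteed identical).  It has nothing to do with how your
file was actually produced.

```python
# Sketch: why a single-byte WinAnsi (roughly cp1252) font cannot carry Chinese.
text = "中文"  # sample string, chosen only for illustration

# 1. A one-byte-per-character encoding has no slot for these characters.
try:
    text.encode("cp1252")
    fits_in_winansi = True
except UnicodeEncodeError:
    fits_in_winansi = False

# 2. Big5 (one possible Chinese encoding) needs two bytes per character here.
big5_bytes = text.encode("big5")

# 3. A reader that trusts /WinAnsiEncoding interprets those bytes one at a
#    time, so every two-byte character turns into two unrelated characters.
as_winansi = big5_bytes.decode("cp1252", errors="replace")

print("fits in WinAnsi:", fits_in_winansi)
print("Big5 byte count:", len(big5_bytes))
print("read as WinAnsi:", as_winansi)
```

The third step is what I expect the copy-n-paste test to show: the bytes
survive, but something reading them as single-byte text gets garbage.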

Properly supporting multi-byte text is fairly complicated.  I suggest you
read the PDF specification sections regarding "Type 0" fonts.  Getting hold
of some sample PDFs wouldn't hurt either.  If you can't find samples any
other way, I recommend going to some .cn or .jp web page and using the web
capture feature built into Acrobat 5 & 6.

So you create a Type 0 font in the PDF... now you have to provide text in
the proper encoding.  Most Chinese fonts will support "Big 5" and Unicode,
and perhaps other encodings.  It all depends on which "charmaps" are
supported.  Adobe has their own set of character tables for each language,
and character maps that map various values to entries in ONE of those tables.
They have Unicode charmaps, and Big 5 charmaps, and so on.

So you create a Type 0 font, select a supported charmap, then write out your
data in the format required by that charmap.
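
As a rough sketch of those three steps, assuming a non-embedded font that
the viewer's Chinese font pack supplies: the CMap names below are ones I
believe the PDF spec lists as predefined, but the pairing with Python codecs
is my own approximation, and the object number "13 0 R", the font name
"MSung-Light" and both helper functions are hypothetical.  Check everything
against the spec before relying on it.

```python
# Hypothetical pairing of predefined CMap names with Python codecs.
# The codecs only approximate the CMaps; they may differ for rare characters.
CMAP_TO_CODEC = {
    "ETen-B5-H": "big5",           # Traditional Chinese, Big 5
    "UniCNS-UCS2-H": "utf-16-be",  # Traditional Chinese, Unicode (UCS-2)
    "GBK-EUC-H": "gbk",            # Simplified Chinese, GBK
    "UniGB-UCS2-H": "utf-16-be",   # Simplified Chinese, Unicode (UCS-2)
}


def type0_font_dict(cmap_name, descendant_ref="13 0 R", base_font="MSung-Light"):
    """Step 1 and 2: a Type 0 font dictionary that names the chosen CMap.

    descendant_ref must point at a CIDFont dictionary (not shown here),
    which carries the CIDSystemInfo and FontDescriptor entries.
    """
    if cmap_name not in CMAP_TO_CODEC:
        raise ValueError("unsupported CMap: " + cmap_name)
    return "\n".join([
        "<<",
        "/Type /Font",
        "/Subtype /Type0",
        "/BaseFont /%s-%s" % (base_font, cmap_name),
        "/Encoding /%s" % cmap_name,
        "/DescendantFonts [ %s ]" % descendant_ref,
        ">>",
    ])


def show_text(text, cmap_name):
    """Step 3: encode text as the CMap expects and emit a hex-string Tj."""
    codec = CMAP_TO_CODEC[cmap_name]
    return "<" + text.encode(codec).hex().upper() + "> Tj"


print(type0_font_dict("UniCNS-UCS2-H"))
print(show_text("中文", "UniCNS-UCS2-H"))  # <4E2D6587> Tj
print(show_text("中文", "ETen-B5-H"))
```

The point is that the same two characters come out as different bytes
depending on which charmap the font's /Encoding names, so the two have to
be chosen together.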

Easy.  KIDS STUFF!

Or something.

Now how to actually do all these things depends on your development
environment, tools, etc.  So what are you using?

--Mark Storer
  Software Engineer
  Cardiff Software
#include <disclaimer>
typedef std::disclaimer<Cardiff> Discard;


> -----Original Message-----
> From: Rodgers, Bruce [mailto:[EMAIL PROTECTED]
> Sent: Thursday, July 17, 2003 7:20 AM
> To: PDFDevList (E-mail)
> Subject: [PDFdev] Chinese character display 
> 
> 
> Greetings all - 
> 
> Here is a one-time brief background:
> 
> My company sells a product which generates and displays PDF files in an
> external window similar to the ActiveView sample in the Acrobat SDK.
> For the web, it generates a PDF file that is loaded into Acrobat Reader.
> The product runs with Acrobat Business Tools 4.05, Acrobat 4.05 or
> Acrobat 5.0, as well as AcroRead 5.1, etc.
> We are working to support multibyte character set (Chinese) data.
> 
> 
> The problem is this: 
> Our PDF files (with Chinese character data) are correctly displayed in
> Acrobat Exchange 3.0 (direct US install with no Chinese packs), but they
> are incorrectly displayed in Acrobat Business Tools 4.05, Acrobat 5.0, and
> AcroRead 5.1, even with Chinese font packs applied.  When the raw PDF file
> is opened in Notepad (Chinese Win2000 Server), the characters are (for the
> most part) translated correctly!
> 
> So, it seems that the Chinese character codes themselves within the PDF
> file are correct, but they won't display correctly in Acrobat versions
> later than Exchange 3.0.
> 
> Here is a brief excerpt showing the lines that specify encoding, etc.
> 
> ...
> <<
> /Type /Font
> /Subtype /TrueType
> /Name /F0
> /BaseFont /CourierNew
> /FirstChar 32
> /LastChar 255
> /Widths [ 595 595 595 ...[deletia]... 595 595 595]
> /Encoding /WinAnsiEncoding
> /FontDescriptor 12 0 R
> >>
> ...
> 
> 
> NOTE:
> I have a 230k .ZIP file that contains the entire sample PDF file with
> Chinese characters, as well as a .DOC file with some screenshots showing
> the behavior.  But rather than spam everyone in this mailing list with it,
> I can provide it on request.
> 
> 
> Adobe has provided some good clues, and noted that the PDF does not
> contain any reference to using Chinese font resources, but they're not
> sure why it actually displays in Exchange 3.01.
> 
> Any ideas on this very strange problem, and how to generate PDFs that
> will display multibyte characters?  I'm hoping it's a simple BaseFont or
> Encoding change, but....
> 
> Thanks in advance for any help!
> 
> G. Bruce Rodgers
> Senior Software Developer
> eiStream, Inc.
> 1225 Jefferson Road
> Rochester, NY  14623
> Phone: (585) 424-1950 x262
> E-mail: [EMAIL PROTECTED]
> 
> 
> 
> To change your subscription:
> http://www.pdfzone.com/discussions/lists-pdfdev.html
> 

