RE: Issue while extracting chinese chars from pdf

Srinivaas_Venkatarayan Fri, 14 Oct 2011 04:32:20 -0700

Hi,

Can someone please help me with this issue? According to 
http://www.pinxue.net/java/PDFBox_String_Charset_analyze_en.html, PDFBox can 
handle CJK fonts, but I'm not sure what I have to do to extract Chinese 
characters.

Thanks
Srinivaas
From: Srinivaas_Venkatarayan
Sent: Wednesday, October 12, 2011 5:12 PM
To: '[email protected]'
Subject: Issue while extracting chinese chars from pdf

Hi,

I'm trying to extract the text content of a PDF file and store it in a .txt 
file using PDFBox (version 1.6.0). I'm having trouble extracting the content 
of a PDF that contains Chinese characters. Attached are the PDF and the Java 
code. I'm not sure what encoding is used in this PDF. Can you please help?

Thanks
Srini
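
One thing that may be worth ruling out on the "store it in a txt file" side: 
if the output file is written with the platform default encoding, Chinese 
characters can be lost even when the extraction itself worked. The sketch 
below is not the attached code and does not call PDFBox. It uses only the 
JDK, and the class name Utf8TextWriter, its methods, and the sample string 
are hypothetical stand-ins. It assumes the extracted text is already in a 
Java String (for example, whatever PDFTextStripper.getText returns) and 
shows writing it out explicitly as UTF-8.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of PDFBox: writes and reads text as UTF-8
// so the result does not depend on the platform default encoding.
class Utf8TextWriter {

    // Write the given text to the target file, always encoded as UTF-8.
    static void write(Path target, String text) throws IOException {
        try (Writer out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            out.write(text);
        }
    }

    // Read the whole file back, decoding it as UTF-8.
    static String read(Path source) throws IOException {
        return new String(Files.readAllBytes(source), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("extracted", ".txt");
        // Stand-in for text extracted from the PDF (two Chinese characters,
        // written as Unicode escapes to avoid source-file encoding issues).
        String extracted = "\u4e2d\u6587";
        write(target, extracted);
        // Round-trip check: the file content should equal the original string.
        System.out.println(read(target).equals(extracted));
    }
}
```

If the String is already garbled before it is written, the problem lies 
earlier, in how the PDF's fonts map to Unicode, and this sketch will not 
help with that.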
