Hi, 

I have looked at the PDF file. It looks as if the text on every page was 
scanned as an image. I am fairly certain that PDFBox cannot extract text from 
images of scanned text. Could someone correct me if I am wrong?
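
A rough way to test that guess, as a sketch only: extract the text one page at 
a time (for example with PDFTextStripper, using setStartPage and setEndPage) 
and see which pages come back with nothing but whitespace. The class and 
method names below (TextLayerCheck, isBlank, pagesWithoutText) are made up for 
illustration, and the sample strings stand in for real extraction output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TextLayerCheck {

    /** True when the extracted text has no visible characters at all. */
    public static boolean isBlank(String pageText) {
        return pageText == null || pageText.trim().isEmpty();
    }

    /**
     * Returns the 1-based numbers of the pages whose extracted text is blank.
     * Assumption: pageTexts.get(i) holds the text extracted from page i + 1.
     */
    public static List<Integer> pagesWithoutText(List<String> pageTexts) {
        List<Integer> blank = new ArrayList<Integer>();
        for (int i = 0; i < pageTexts.size(); i++) {
            if (isBlank(pageTexts.get(i))) {
                blank.add(i + 1);
            }
        }
        return blank;
    }

    public static void main(String[] args) {
        // Hypothetical per-page output; in practice each string would come
        // from running the text stripper over a single page.
        List<String> pageTexts = Arrays.asList("Tide tables 2012", "  \n", "");
        System.out.println("Pages without extractable text: "
                + pagesWithoutText(pageTexts));
    }
}
```

A page that yields only whitespace is consistent with a scanned image, but it 
does not prove it. A page that yields gibberish, on the other hand, does have 
some text in it, so the problem there is more likely in the font or encoding 
handling.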

Thanks,
Stephen

-----Original Message-----
From: Big Donkeys [mailto:[email protected]] 
Sent: Friday, 20 July 2012 6:09 AM
To: [email protected]
Subject: Can't extract text Adobe-WinCharSetFFFF-UCS2

Hi, I'm having some trouble extracting text from some South Korean PDF 
files using PDFTextStripper.  When I try, I get a "severe error could not parse 
predefined CMAP file for 'Adobe-WinCharSetFFFF-UCS2'" message, and then it 
gives me some gibberish.  The file opens and displays fine in Adobe Reader.   
I'm using pdfbox-app-1.7.0.jar.

Here is a link to an example PDF that gives me trouble:

http://eng.khoa.go.kr/inc/func/fileDownloadBlob_nori.asp?cmsCd=CM0237&ntNo=626&fNo=4

Any ideas?  

