Re: Japanese characters

Andreas Lehmkühler Wed, 28 Aug 2013 03:14:39 -0700

Hi,

> Zak Bennett <[email protected]> wrote on 28 August 2013 at 01:20:
>
>
> Hi guys,
>
> Firstly I apologise if this question has been repeated often. Having looked
> around I have found a number of individuals with the same issue as myself.
>
> Have you discovered any workarounds to the issue of returning Japanese text
> information from a PDF using pdfbox? If not, would this be an issue which
> the dev team is currently working to solve?

Please be more specific. There are three known cases:

- PDFBox can extract the text of PDFs containing non-Latin languages,
  depending on the font used.
- Text extraction doesn't work because of the font used and a wrong or
  incomplete implementation in PDFBox.
- The text can't be extracted at all; even the Adobe test fails, see [1].

So the question is: have you actually tried to extract the text? If not, give
it a try with the command line tools [2].
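
Once you have some extracted output, a quick check can tell you whether it
really contains Japanese characters or only garbage. The following is a
minimal sketch, not part of PDFBox: the class name JapaneseTextCheck and the
method countJapanese are made up for illustration, and it assumes the
extracted text was saved to a file in UTF-8. Note that the Han script also
covers Chinese characters, so this only shows that CJK text came through.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of PDFBox: counts Japanese-looking
// characters in a text file produced by a text extraction run.
final class JapaneseTextCheck {

    // Counts code points whose Unicode script is Hiragana, Katakana or Han (kanji).
    static long countJapanese(String text) {
        return text.codePoints()
                .filter(cp -> {
                    Character.UnicodeScript script = Character.UnicodeScript.of(cp);
                    return script == Character.UnicodeScript.HIRAGANA
                            || script == Character.UnicodeScript.KATAKANA
                            || script == Character.UnicodeScript.HAN;
                })
                .count();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: args[0] is the path of the extracted text file, encoded as UTF-8.
        String text = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.println("Japanese code points: " + countJapanese(text));
    }
}
```

If the count is zero for a PDF that visibly contains Japanese text, you are
most likely in the second or third case above.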

> Best regards,
>
> Zak

BR
Andreas Lehmkühler

[1] http://pdfbox.apache.org/userguide/faq.html#notext
[2] http://pdfbox.apache.org/commandline/
