[ https://issues.apache.org/jira/browse/PDFBOX-420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672967#action_12672967 ]
Andreas Lehmkühler commented on PDFBOX-420:
-------------------------------------------
I have two questions before I try to add your code to the trunk:
1. Your patch contains a package with 4 new classes. All of them have an old
pdfbox license header. If I add these changes to PDFBox, we have to change the
license to the Apache License 2.0 [1]. Is that OK with you and with the author
Pin Xue, who is mentioned in 2 of these files?
2. Is the cmapSubstitutions mapping in PDFont complete, or did you only add the
mappings you are interested in? I ask because, if I add the code, I'd like to
use a complete mapping. As far as I understand the CharCode2Unicode mapping,
some Unicode files are missing from your mapping, e.g. the Korean files.
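
To illustrate the kind of gap meant in question 2, here is a minimal sketch of a
coverage check. It is not taken from the patch: the class name, the method
names, and the table entries are hypothetical, and the table only assumes that
cmapSubstitutions maps a CMap name to the name of a Unicode mapping resource.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CMapCoverageSketch {

    // Hypothetical substitution table (CMap name -> Unicode mapping resource).
    // The entries are illustrative only and are NOT copied from the patch.
    static Map<String, String> substitutions() {
        Map<String, String> map = new LinkedHashMap<String, String>();
        map.put("90ms-RKSJ-H", "Adobe-Japan1-UCS2");
        map.put("GBK-EUC-H", "Adobe-GB1-UCS2");
        map.put("ETen-B5-H", "Adobe-CNS1-UCS2");
        return map;
    }

    // Returns every expected Unicode mapping resource that no entry points to.
    static List<String> missingTargets(Map<String, String> subs,
                                       Collection<String> expected) {
        Set<String> covered = new HashSet<String>(subs.values());
        List<String> missing = new ArrayList<String>();
        for (String target : expected) {
            if (!covered.contains(target)) {
                missing.add(target);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumed list of resources a "complete" mapping should reach.
        List<String> expected = Arrays.asList(
                "Adobe-Japan1-UCS2", "Adobe-GB1-UCS2",
                "Adobe-CNS1-UCS2", "Adobe-Korea1-UCS2");
        // With the illustrative table above, only the Korean resource is reported.
        System.out.println(missingTargets(substitutions(), expected));
    }
}
```

A check like this only shows which mapping resources are never referenced; it
says nothing about whether the individual CMap names in the table are complete.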
[1] http://www.apache.org/licenses/LICENSE-2.0
> Japanese Characters are garbled.
> --------------------------------
>
> Key: PDFBOX-420
> URL: https://issues.apache.org/jira/browse/PDFBOX-420
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 0.8.0-incubator
> Reporter: Takashi Komatsubara
> Priority: Critical
> Attachments: supportJapanese-fontbox.patch, supportJapanese.patch,
> TestFilesForJapaneseGarbledIssue.zip
>
>
> The extracted Japanese characters are completely garbled.
> This issue is very critical for Japanese users.