[jira] Commented: (PDFBOX-420) Japanese Characters are garbled.

JIRA Thu, 12 Feb 2009 05:24:26 -0800

    [ 
https://issues.apache.org/jira/browse/PDFBOX-420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672967#action_12672967
 ]


Andreas Lehmkühler commented on PDFBOX-420:
-------------------------------------------

I have two questions before I try to add your code to the trunk:

1. Your patch contains a package with 4 new classes. All of them have an old 
pdfbox license header. If I add these changes to pdfbox, we have to change the 
license to the Apache License 2.0 [1]. Is that OK with you and with the author, 
Pin Xue, who is mentioned in 2 of these files?

2. Is the cmapSubstitutions mapping in PDFont complete, or did you only add the 
mappings you are interested in? I ask because, if I add the code, I'd like to 
use a complete mapping. As far as I understand the CharCode2Unicode mapping, 
some Unicode files are missing from your mapping, e.g. the Korean files.


[1] http://www.apache.org/licenses/LICENSE-2.0

> Japanese Characters are garbled.
> --------------------------------
>
>                 Key: PDFBOX-420
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-420
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 0.8.0-incubator
>            Reporter: Takashi Komatsubara
>            Priority: Critical
>         Attachments: supportJapanese-fontbox.patch, supportJapanese.patch, 
> TestFilesForJapaneseGarbledIssue.zip
>
>
> The extracted Japanese characters are completely garbled.
> This issue is very critical for Japanese users.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
