[jira] [Comment Edited] (PDFBOX-2524) [PATCH] Two PDFont to create PDF documents in CJK and non-ISO-8859-1 languages

John Hewson (JIRA) Tue, 02 Dec 2014 19:57:37 -0800

    [ https://issues.apache.org/jira/browse/PDFBOX-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232539#comment-14232539 ]


John Hewson edited comment on PDFBOX-2524 at 12/3/14 3:56 AM:
--------------------------------------------------------------

Thanks, yes you understood correctly. Here's my review:

- Your new PDType0Font constructor shouldn't call readEncoding() or 
fetchCMapUCS2(), as those methods are for reading a font from a PDF, not 
embedding a new font.
- getFontWidthsArray parses a string of space-delimited integers that you 
created in PDCIDFontType2Embedder#setItemsForCIDFont. These two methods should 
not use strings to exchange data. What was your reason for doing this?
- CmapSubtable#getGlyphIdToCharacterCode() exposes private implementation 
details of CmapSubtable. However, I'd recommend using CID = GID rather than 
your current approach, which would mean that you won't need this information 
anyway.
- Using CID = GID would make getCIDToGID redundant and would generate smaller 
PDF files, because you can use the Identity CID-to-GID mapping.
- Please remove unused import statements.
- Please do not use wildcard (.*) imports.
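
To illustrate the point about exchanging typed data rather than strings, here is a minimal sketch that assumes CID = GID. The class and method names (CidWidthsSketch, toWidthRuns) are hypothetical and are not part of PDFBox or of the patch. The widths are passed as an int[] indexed by glyph ID and compacted into runs in the "c [w1 w2 ... wn]" layout of a CIDFont /W array, so nothing ever has to be parsed back out of a string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch only; none of these names exist in PDFBox. */
final class CidWidthsSketch {

    /**
     * Compacts per-glyph advance widths into runs of { firstCID, w0, w1, ... }.
     * Because CID = GID is assumed, the array index is used directly as the CID.
     * Glyphs whose width equals defaultWidth are skipped, since the font's
     * default width (/DW) already covers them.
     */
    static List<int[]> toWidthRuns(int[] widthsByGid, int defaultWidth) {
        List<int[]> runs = new ArrayList<>();
        int gid = 0;
        while (gid < widthsByGid.length) {
            if (widthsByGid[gid] == defaultWidth) {
                gid++;
                continue;
            }
            int start = gid;
            while (gid < widthsByGid.length && widthsByGid[gid] != defaultWidth) {
                gid++;
            }
            // First element is the starting CID, followed by the widths of the run.
            int[] run = new int[gid - start + 1];
            run[0] = start;
            System.arraycopy(widthsByGid, start, run, 1, gid - start);
            runs.add(run);
        }
        return runs;
    }

    public static void main(String[] args) {
        int[] widths = {1000, 500, 600, 1000, 1000, 250};
        for (int[] run : toWidthRuns(widths, 1000)) {
            System.out.println(Arrays.toString(run));
        }
        // prints:
        // [1, 500, 600]
        // [5, 250]
    }
}
```

With CID = GID, no separate CID-to-GID table has to be built or written. The Identity mapping is enough.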


was (Author: jahewson):
Thanks, yes you understood correctly. Here's my review:

- Your new PDType0Font constructor shouldn't call readEncoding() or 
fetchCMapUCS2(), as those methods are for reading a font from a PDF, not 
embedding a new font.
- getFontWidthsArray is parsing a string a space delimited integers, which was 
created in PDCIDFontType2Embedder#setItemsForCIDFont. These two methods should 
not be using strings to exchange data, what was your reason for doing this?
- CmapSubtable#getGlyphIdToCharacterCode() exposes private implementation 
details from CmapSubtable, however I'd recommend using CID = GID rather than 
your current approach, which would mean that you won't need this information 
anyway.
- Using CID = GID would make getCIDToGID redundant, and generate smaller PDF 
files because you can use the Identity cid2gid mapping.
- Please remove unused import statements
- Please do not import with .*

> [PATCH] Two PDFont to create PDF documents in CJK and non-ISO-8859-1 languages
> ------------------------------------------------------------------------------
>
>                 Key: PDFBOX-2524
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2524
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Writing
>    Affects Versions: 2.0.0
>            Reporter: Keiji Suzuki
>            Assignee: John Hewson
>         Attachments: Type0.java, Type0CJK.java, Type0Unicode.java, 
> cidtype0.diff, two-new-fonts.diff
>
>
> I made two PDFont classes for creating PDF documents in CJK and 
> non-ISO-8859-1 languages.
> One is PDType0CJKFont. It is for using the CJK fonts included in the Asian 
> font package of Adobe Reader. It doesn't require the target font to be 
> available when the PDF document is created. It uses UTF-16 as its text 
> encoding and supports surrogate-pair characters.
> The other is PDType0UnicodeFont. It is for using a TrueType Type 0 font that 
> can handle any Unicode character, such as ArialUnicodeMS. Only the 
> characters that are actually used in the document are embedded. To achieve 
> this, you have to call the PDType0Unicode.reloadFont() method just before 
> closing the PDPageContentStream. I think this requirement is ugly, but I 
> could not think of a suitable way to remove it. This font uses the 
> original glyph codes of the embedded font as its text encoding and also 
> supports surrogate-pair characters.
> Example programs using these two fonts are also attached.
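
As a rough illustration of what "UTF-16 as a text code" with surrogate pairs means in practice, the sketch below encodes a string to big-endian UTF-16 bytes. The class name Utf16TextCodeSketch is hypothetical, and the big-endian byte order is an assumption, since the description only says "UTF-16". It does not reproduce the attached patch.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch only; not taken from the attached patch. */
final class Utf16TextCodeSketch {

    /** Encodes text as UTF-16BE: 2 bytes per BMP character, 4 per supplementary one. */
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE);
    }

    /** Formats bytes as upper-case hex, the way they would look in a PDF hex string. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A BMP character (U+65E5) fits in a single 16-bit code unit.
        String bmp = new String(Character.toChars(0x65E5));
        // A supplementary character (U+20BB7) needs a surrogate pair.
        String supplementary = new String(Character.toChars(0x20BB7));

        System.out.println(toHex(encode(bmp)));           // prints 65E5
        System.out.println(toHex(encode(supplementary))); // prints D842DFB7
    }
}
```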



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
