[
https://issues.apache.org/jira/browse/PDFBOX-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232539#comment-14232539
]
John Hewson edited comment on PDFBOX-2524 at 12/3/14 3:56 AM:
--------------------------------------------------------------
Thanks, yes you understood correctly. Here's my review:
- Your new PDType0Font constructor shouldn't call readEncoding() or
fetchCMapUCS2(), as those methods are for reading a font from a PDF, not
embedding a new font.
- getFontWidthsArray is parsing a string of space-delimited integers, which you
created in PDCIDFontType2Embedder#setItemsForCIDFont. These two methods should
not be using strings to exchange data. What was your reason for doing this?
- CmapSubtable#getGlyphIdToCharacterCode() exposes private implementation
details from CmapSubtable. However, I'd recommend using CID = GID rather than
your current approach, which would mean that you won't need this information
anyway.
- Using CID = GID would make getCIDToGID redundant, and generate smaller PDF
files because you can use the Identity cid2gid mapping.
- Please remove unused import statements.
- Please do not use wildcard (.*) imports.
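
To illustrate the points about exchanging widths and using CID = GID, here is
a minimal sketch. It is not taken from the patch or from PDFBox: the names
WidthsSketch, Run and buildRuns are made up for this example. It assumes the
embedder already has each glyph's advance width keyed by glyph ID, and it
groups them into runs of consecutive IDs, the "c [w1 w2 ...]" form that a
CIDFont /W array uses. With CID = GID the glyph ID serves directly as the CID,
so no string round-trip and no separate CID-to-GID table is involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch; none of these names come from PDFBox or the patch. */
public class WidthsSketch {

    /** One run of a /W array: a starting CID plus the widths of consecutive CIDs. */
    static final class Run {
        final int firstCid;
        final List<Integer> widths = new ArrayList<>();

        Run(int firstCid) {
            this.firstCid = firstCid;
        }
    }

    /**
     * Groups widths into runs of consecutive glyph IDs.
     * Assumes CID = GID, so each glyph ID is used as the CID unchanged.
     */
    static List<Run> buildRuns(TreeMap<Integer, Integer> widthByGid) {
        List<Run> runs = new ArrayList<>();
        Run current = null;
        int previousGid = -2; // sentinel that can never be adjacent to a real GID
        for (Map.Entry<Integer, Integer> entry : widthByGid.entrySet()) {
            int gid = entry.getKey();
            // Start a new run whenever the glyph IDs stop being consecutive.
            if (current == null || gid != previousGid + 1) {
                current = new Run(gid);
                runs.add(current);
            }
            current.widths.add(entry.getValue());
            previousGid = gid;
        }
        return runs;
    }

    public static void main(String[] args) {
        TreeMap<Integer, Integer> widths = new TreeMap<>();
        widths.put(3, 500);
        widths.put(4, 600);
        widths.put(10, 1000);
        for (Run run : buildRuns(widths)) {
            System.out.println(run.firstCid + " " + run.widths);
        }
    }
}
```

The two methods would then pass a typed structure like this between them
instead of a formatted string, and the font dictionary could use the Identity
mapping because no CID differs from its GID.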
> [PATCH] Two PDFont to create PDF documents in CJK and non-ISO-8859-1 languages
> ------------------------------------------------------------------------------
>
> Key: PDFBOX-2524
> URL: https://issues.apache.org/jira/browse/PDFBOX-2524
> Project: PDFBox
> Issue Type: Improvement
> Components: Writing
> Affects Versions: 2.0.0
> Reporter: Keiji Suzuki
> Assignee: John Hewson
> Attachments: Type0.java, Type0CJK.java, Type0Unicode.java,
> cidtype0.diff, two-new-fonts.diff
>
>
> I made two PDFont classes for creating PDF documents in CJK and
> non-ISO-8859-1 languages.
> One is PDType0CJKFont. It is for using the CJK fonts included in the Asian
> font package of Adobe Reader. It does not require the target font to be
> available when the PDF document is created. It uses UTF-16 as its text
> code and supports surrogate pair characters.
> The other is PDType0UnicodeFont. It is for using a TrueType Type0 font which
> can deal with any Unicode characters, such as ArialUnicodeMS. Only the
> characters actually used in the document are embedded. To achieve this, you
> have to call the PDType0Unicode.reloadFont() method just before closing the
> PDPageContentStream. I think this requirement is ugly, but I could not
> think of a suitable way to remove it. This font uses the original glyph
> code of the embedded font as its text code and also supports surrogate pair
> characters.
> Example programs using these two fonts are also attached.