[
https://issues.apache.org/jira/browse/PDFBOX-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14016882#comment-14016882
]
Antti Lankila edited comment on PDFBOX-922 at 6/3/14 5:06 PM:
--------------------------------------------------------------
Version 2, more palatable.
This one uses Identity-H for charcode -> CID, and Identity for CID -> GID, and
then has a few hacks in PDFont (encodeCID has never worked as far as I can
tell) and PDType0Font to make it work.
I would have liked to use fontbox's CMap facility to do the codepoint ->
CID conversion, but I could not work out how the CMap stuff works. The method
lookupCID() is apparently a CID -> String conversion; lookup(byte[], int, int)
does the reverse, but it goes into some CodespaceRange check that is probably
not 8-bit clean. I gave up trying to figure out what this is supposed to
be doing and used a hashmap in PDType0Font for the String -> CID conversion.
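To make the hashmap approach concrete, here is a minimal standalone sketch of
the idea, not the code from the attached patch. The class name, the put()
helper and the choice of keying the map by Unicode code point are all
assumptions made for illustration. It relies only on the fact that Identity-H
treats each 2-byte big-endian character code as the CID itself.

```java
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox: encodes text for a font that uses
// Identity-H, where every character code is two bytes and equals the CID.
final class IdentityHEncoderSketch {
    // Assumed to be filled from the font's cmap table; with an Identity
    // CID -> GID mapping the CID is simply the glyph index.
    private final Map<Integer, Integer> codePointToCid = new HashMap<>();

    void put(int codePoint, int cid) {
        if (cid < 0 || cid > 0xFFFF) {
            throw new IllegalArgumentException("CID must fit in two bytes: " + cid);
        }
        codePointToCid.put(codePoint, cid);
    }

    byte[] encode(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            i += Character.charCount(codePoint);
            // Unmapped characters fall back to CID 0 (.notdef).
            Integer cid = codePointToCid.get(codePoint);
            int code = (cid != null) ? cid : 0;
            out.write((code >> 8) & 0xFF); // high byte first
            out.write(code & 0xFF);        // then low byte
        }
        return out.toByteArray();
    }
}
```

The resulting bytes are what would go into the content stream's string
operand; nothing here goes through a single-byte encoding.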
There are probably other issues remaining. For example, PDFont's
getStringWidth() starts by converting the string to ISO-8859-1, which can't be
correct for text outside Latin-1.
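As a quick illustration of why an ISO-8859-1 step is lossy, the sketch below
round-trips a string through that charset using only the JDK. It does not
reproduce PDFBox's getStringWidth(); the class and method names are made up
for the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows what survives a trip through ISO-8859-1.
final class Latin1LossSketch {
    static String roundTrip(String text) {
        // String.getBytes(Charset) substitutes the charset's replacement
        // byte ('?' for ISO-8859-1) for every unmappable character.
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }
}
```

Latin-1 text comes back unchanged, but each Greek letter comes back as '?',
so any width computed after such a conversion would be that of the wrong
glyph.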
> True type PDFont subclass only supports WinAnsiEncoding (hardcoded!)
> --------------------------------------------------------------------
>
> Key: PDFBOX-922
> URL: https://issues.apache.org/jira/browse/PDFBOX-922
> Project: PDFBox
> Issue Type: New Feature
> Components: Writing
> Affects Versions: 1.3.1
> Environment: JDK 1.6 / OS irrelevant, tried against 1.3.1 and 1.2.0
> Reporter: Thanos Agelatos
> Assignee: Andreas Lehmkühler
> Attachments: pdfbox-unicode.diff, pdfbox-unicode2.diff
>
>
> PDFBox cannot embed Identity-H or Identity-V type TTF fonts in the PDF it
> creates, making it impossible to create PDFs in any language apart from
> English and those supported by WinAnsiEncoding. This happens because the
> method PDTrueTypeFont.loadTTF has WinAnsiEncoding hardcoded inside it,
> and there are no Identity-H or Identity-V Encoding classes provided (to set
> afterwards via PDFont.setFont() ).
> This excludes the following languages plus many others:
> - Greek
> - Bulgarian
> - Swedish
> - Baltic languages
> - Maltese
> The PDF created contains garbled characters and/or squares.
> Simple test case:
> PDDocument doc = null;
> try {
>     doc = new PDDocument();
>     PDPage page = new PDPage();
>     doc.addPage(page);
>     // extract fonts for fields (extractFont is the reporter's own helper, not shown)
>     byte[] arialNorm = extractFont("arial.ttf");
>     //byte[] arialBold = extractFont("arialbd.ttf");
>     //PDFont font = PDType1Font.HELVETICA;
>     PDFont font = PDTrueTypeFont.loadTTF(doc, new ByteArrayInputStream(arialNorm));
>
>     PDPageContentStream contentStream = new PDPageContentStream(doc, page);
>     contentStream.beginText();
>     contentStream.setFont(font, 12);
>     contentStream.moveTextPositionByAmount(100, 700);
>     // text here may appear garbled; insert any text in Greek or Bulgarian or Maltese
>     contentStream.drawString("Hello world from PDFBox ελληνικά");
>     contentStream.endText();
>     contentStream.close();
>     doc.save("pdfbox.pdf");
>     System.out.println(" created!");
> } catch (Exception ioe) {
>     ioe.printStackTrace();
> } finally {
>     if (doc != null) {
>         try { doc.close(); } catch (Exception e) {}
>     }
> }