[ https://issues.apache.org/jira/browse/PDFBOX-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273185#comment-13273185 ]
Tilman Hausherr commented on PDFBOX-1296:
-----------------------------------------

I installed the source code and found two possible causes. I used the NetBeans shortcuts PDF file to test.

1) Font.canDisplayUpTo() seems to be buggy. It returns -1 although the displayed result is wrong. Even Font.canDisplay() is buggy: it sometimes returns true when the font is "DejaVu Sans Bold" or "Liberation Serif Bold". What seems to help is to ask whether the font can display space, 'a' and 'A'.

2) I suspect that Acrobat Reader does some font mapping, but PDFBox doesn't. By changing the font in org.apache.pdfbox.pdmodel.font.PDSimpleFont.drawString() to an appropriate font, I get better results. So either the font mapping should be improved, or the DejaVu and the Liberation fonts should be included in the distribution.

Here's my "improved" code. It still contains lots of logging and is meant to prove a point, not for production:

/**
 * Font.canDisplayUpTo() is buggy:
 * http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6623219
 * http://stackoverflow.com/questions/1754697/displaying-chinese-text-in-an-applet
 *
 * This method returns an offset into the String s which is the first character this Font
 * cannot display without using the missing glyph code. If the Font can display all characters,
 * -1 is returned.
 *
 * @param font
 * @param s
 * @return -1 when all is good
 */
int canDisplayUpTo(Font font, String s)
{
    if (!font.canDisplay(' ') || !font.canDisplay('a') || !font.canDisplay('A'))
        return 0; // TH: font.canDisplay does not always tell the truth

    int len = s.length();
    int index = 0;
    while (index < len)
    {
        int codePoint = s.codePointAt(index);
        if (!font.canDisplay(codePoint))
        {
            return index;
        }
        index += Character.charCount(codePoint);
    }
    return -1;
}

/**
 * {@inheritDoc}
 */
@Override
public void drawString( String string, Graphics g, float fontSize, AffineTransform at, float x, float y ) throws IOException
{
    Font _awtFont = getawtFont();

    // mdavis - fix fontmanager.so/dll on sun.font.FileFont.getGlyphImage
    // for font with bad cmaps?
    // Type1 fonts are not affected as they don't have cmaps
    // if (!isType1Font() && _awtFont.canDisplayUpTo(string) != -1)
    if (!isType1Font() && canDisplayUpTo(_awtFont,string) != -1)
    {
        log.warn("Changing font on <" + string + "> from <" + _awtFont.getName() + "> to the default font");
        if (_awtFont.getName().startsWith("DejaVu") || _awtFont.getName().startsWith("Liberation"))
        {
            log.warn("Before: " + _awtFont);
            String name;
            String style;
            if (_awtFont.getName().startsWith("DejaVu Sans"))
                name = "Lucida Sans";
            else if (_awtFont.getName().startsWith("Liberation Serif"))
                name = "Times New Roman";
            else if (_awtFont.getName().startsWith("Liberation Sans"))
                name = "Arial";
            else if (_awtFont.getName().startsWith("Liberation Mono"))
                name = "Courier New";
            else
                name = "Lucida Sans";
            switch (_awtFont.getStyle())
            {
                case Font.BOLD:
                    style = "BOLD";
                    break;
                case Font.ITALIC:
                    style = "ITALIC";
                    break;
                case Font.PLAIN:
                    style = "PLAIN";
                    break;
                default:
                    if (_awtFont.getStyle() == (Font.BOLD|Font.ITALIC))
                        style = "BOLDITALIC";
                    else
                        style = "PLAIN";
            }
            if (_awtFont.getName().endsWith("Bold"))
                style = "BOLD";
            else if (_awtFont.getName().endsWith("Bold Italic"))
                style = "BOLDITALIC";
            else if (_awtFont.getName().endsWith("Italic"))
                style = "ITALIC";
            _awtFont = Font.decode(name + "-" + style + "-" +
                    Integer.toString(_awtFont.getSize()));
            log.warn("After: " + _awtFont);
        }
        else
        {
            _awtFont = Font.decode(null);
            log.warn("Default: " + _awtFont);
        }
    }
    else
    {
        log.info("Unchanged: " + _awtFont.getName() + " for '" + string + "'" + ", upto: " + canDisplayUpTo(_awtFont,string));
    }

    Graphics2D g2d = (Graphics2D)g;
    g2d.setRenderingHint( RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON );
    writeFont(g2d, at, _awtFont, x, y, string);
}

> Warnung: Changing font on < > from <AMAKEA+TimesNewRoman> to the default font
> -----------------------------------------------------------------------------
>
>                 Key: PDFBOX-1296
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1296
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 1.6.0
>         Environment: XP, JDK 1.7
>            Reporter: Tilman Hausherr
>         Attachments: outside-in-01.png, outside-in.pdf, shortcuts-01.png, shortcuts.pdf
>
>
> Pdfbox does not produce the correct fonts in the PNG file created with the
> following code and I get a lot of warnings:
>         PDDocument document = null;
>         try
>         {
>             document = PDDocument.load(pdfFile);
>             List pages = document.getDocumentCatalog().getAllPages();
>             int p = 0;
>             for (Object pobj : pages)
>             {
>                 PDPage page = (PDPage) pobj;
>                 ++p;
>                 BufferedImage bim = page.convertToImage();
>                 // Test with output in memory, to see the size
>                 ByteArrayOutputStream memout = new ByteArrayOutputStream();
>                 boolean memoutok = ImageIO.write(bim, "png", memout);
>                 if (!memoutok)
>                     System.err.println ("mem write failed for " + p);
>                 memout.reset();
>                 memout.close();
>                 // Test with output to png file
>                 String fname = String.format("%s-%02d.png", prefix, p);
>                 boolean foutok = ImageIO.write(bim, "png", new File(fname));
>                 if (!foutok)
>                     System.err.println ("file write failed for " + p);
> ....
> Apr 26, 2012 2:41:11 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
> Information: unsupported/disabled operation: i
> Apr 26, 2012 2:41:12 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
> Information: unsupported/disabled operation: ri
> Apr 26, 2012 2:41:12 PM org.apache.pdfbox.pdmodel.font.PDSimpleFont drawString
> Warnung: Changing font on < > from <AMAKEA+TimesNewRoman> to the default font
> Apr 26, 2012 2:41:13 PM org.apache.pdfbox.pdmodel.font.PDSimpleFont drawString
> Warnung: Changing font on < > from <AMAKEA+TimesNewRoman> to the default font
> Apr 26, 2012 2:41:13 PM org.apache.pdfbox.pdmodel.font.PDSimpleFont drawString
> Warnung: Changing font on <O> from <AMAKME+Arial,Bold> to the default font
> Apr 26, 2012 2:41:13 PM org.apache.pdfbox.pdmodel.font.PDSimpleFont drawString
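
The font substitution described in point 2 of the comment can be tried out without PDFBox or a graphics environment. The following is only a minimal sketch: the class FontSubstitutionSketch and its methods are hypothetical names and not part of PDFBox, the family table is copied from the if/else chain in the comment's code, and the style is derived from the font name suffix only (the getStyle() switch is left out). It assumes the caller has already checked that the font name starts with "DejaVu" or "Liberation".

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not PDFBox API): table-driven form of the font substitution. */
final class FontSubstitutionSketch {

    // Name prefix -> substitute family, copied from the comment's if/else chain.
    private static final Map<String, String> SUBSTITUTES = new LinkedHashMap<>();
    static {
        SUBSTITUTES.put("DejaVu Sans", "Lucida Sans");
        SUBSTITUTES.put("Liberation Serif", "Times New Roman");
        SUBSTITUTES.put("Liberation Sans", "Arial");
        SUBSTITUTES.put("Liberation Mono", "Courier New");
    }

    /** Returns the substitute family, or "Lucida Sans" when no prefix matches. */
    static String substituteFamily(String fontName) {
        for (Map.Entry<String, String> e : SUBSTITUTES.entrySet()) {
            if (fontName.startsWith(e.getKey())) {
                return e.getValue();
            }
        }
        return "Lucida Sans";
    }

    /** Derives the style keyword from the font name suffix (assumption: name only). */
    static String styleFromName(String fontName) {
        if (fontName.endsWith("Bold Italic")) return "BOLDITALIC";
        if (fontName.endsWith("Bold")) return "BOLD";
        if (fontName.endsWith("Italic")) return "ITALIC";
        return "PLAIN";
    }

    /** Builds a "family-STYLE-size" string of the kind java.awt.Font.decode() accepts. */
    static String decodeDescriptor(String fontName, int size) {
        return substituteFamily(fontName) + "-" + styleFromName(fontName) + "-" + size;
    }

    public static void main(String[] args) {
        System.out.println(decodeDescriptor("DejaVu Sans Bold", 12));             // Lucida Sans-BOLD-12
        System.out.println(decodeDescriptor("Liberation Serif Bold Italic", 12)); // Times New Roman-BOLDITALIC-12
        System.out.println(decodeDescriptor("Liberation Mono", 12));              // Courier New-PLAIN-12
    }
}
```

The resulting string would be passed to Font.decode(), as in the drawString() code above. Whether the substitute families are actually installed depends on the machine, so this only checks the mapping logic, not the rendering.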