[jira] [Assigned] (PDFBOX-2294) Improve vertical text drawing as an experiment
[ https://issues.apache.org/jira/browse/PDFBOX-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Hewson reassigned PDFBOX-2294:
-----------------------------------

Assignee: John Hewson

Improve vertical text drawing as an experiment
----------------------------------------------

Key: PDFBOX-2294
URL: https://issues.apache.org/jira/browse/PDFBOX-2294
Project: PDFBox
Issue Type: Improvement
Components: Rendering
Affects Versions: 2.0.0
Environment: Windows7, JDK 1.7
Reporter: tani
Assignee: John Hewson
Priority: Minor
Attachments: vertical_writing_mode.patch

When I converted a PDF into an image using PDFToImage, the resulting image showed the text laid out horizontally even though its encoding was Identity-V. After applying the attached patch as a workaround, I could see an improvement (although some deviations in text position remain).

(Related article)
http://mail-archives.apache.org/mod_mbox/pdfbox-users/201408.mbox/%3C53FC4BC8.8070003%40marino.co.jp%3E

The PDF file I used is the following:
http://blogs.adobe.com/CCJKType/files/2012/07/TaroUTR50SortedList112.pdf

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-2294) Improve vertical text drawing as an experiment
[ https://issues.apache.org/jira/browse/PDFBOX-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113399#comment-14113399 ]

John Hewson commented on PDFBOX-2294:
-------------------------------------

Thanks, I was thinking of adding a similar patch to my branch for PDFBOX-2262; I'll do that.
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113410#comment-14113410 ]

Tilman Hausherr commented on PDFBOX-2262:
-----------------------------------------

differences:
PDFBOX-2091.pdf - y missing
PDFBOX-2191-006816.pdf - bullets between blocks wrong
PDFBOX-2192-006972.pdf - also problem with bullets

exceptions:
PDFBOX-2048-confidential.pdf
PDFBOX-1735-confidential.pdf
PDFBOX-2251-070075.pdf - NPE without stack trace?!
PDFBOX-940.pdf

{code}
SCHWERWIEGEND: Error converting file PDFBOX-1735-confidential.pdf
java.lang.NullPointerException
    at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.codeToGID(PDTrueTypeFont.java:189)
    at org.apache.pdfbox.rendering.font.TTFGlyph2D.getGIDForCharacterCode(TTFGlyph2D.java:104)
    at org.apache.pdfbox.rendering.font.TTFGlyph2D.getPathForCharacterCode(TTFGlyph2D.java:91)
    at org.apache.pdfbox.rendering.PageDrawer.drawGlyph2D(PageDrawer.java:319)
    at org.apache.pdfbox.rendering.PageDrawer.processGlyph(PageDrawer.java:295)
    at org.apache.pdfbox.util.PDFStreamEngine.processText(PDFStreamEngine.java:475)
    at org.apache.pdfbox.rendering.PageDrawer.processText(PageDrawer.java:263)
    at org.apache.pdfbox.util.PDFStreamEngine.showText(PDFStreamEngine.java:314)
{code}

{code}
SCHWERWIEGEND: Error converting file PDFBOX-2048-confidential.pdf
java.io.EOFException
    at org.apache.fontbox.ttf.TTFDataStream.readUnsignedInt(TTFDataStream.java:134)
    at org.apache.fontbox.ttf.IndexToLocationTable.read(IndexToLocationTable.java:58)
    at org.apache.fontbox.ttf.TrueTypeFont.readTable(TrueTypeFont.java:289)
    at org.apache.fontbox.ttf.TTFParser.parseTables(TTFParser.java:154)
    at org.apache.fontbox.ttf.TTFParser.parseTTF(TTFParser.java:135)
    at org.apache.fontbox.ttf.TTFParser.parseTTF(TTFParser.java:109)
    at org.apache.pdfbox.pdmodel.font.PDCIDFontType2.init(PDCIDFontType2.java:70)
    at org.apache.pdfbox.pdmodel.font.PDFontFactory.createDescendantFont(PDFontFactory.java:123)
    at org.apache.pdfbox.pdmodel.font.PDType0Font.init(PDType0Font.java:63)
    at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:81)
    at org.apache.pdfbox.pdmodel.PDResources.getFonts(PDResources.java:215)
{code}

{code}
SCHWERWIEGEND: Error converting file PDFBOX-940.pdf
java.lang.NullPointerException
    at org.apache.pdfbox.pdmodel.font.PDCIDFont.getWidth(PDCIDFont.java:194)
    at org.apache.pdfbox.pdmodel.font.PDType0Font.getWidth(PDType0Font.java:189)
    at org.apache.pdfbox.util.PDFStreamEngine.processText(PDFStreamEngine.java:416)
    at org.apache.pdfbox.rendering.PageDrawer.processText(PageDrawer.java:263)
    at org.apache.pdfbox.util.PDFStreamEngine.showAdjustedTextRun(PDFStreamEngine.java:349)
    at org.apache.pdfbox.util.PDFStreamEngine.showAdjustedText(PDFStreamEngine.java:335)
    at org.apache.pdfbox.util.operator.text.ShowTextGlyph.process(ShowTextGlyph.java:69)
{code}

Remove usage of AWT fonts
-------------------------

Key: PDFBOX-2262
URL: https://issues.apache.org/jira/browse/PDFBOX-2262
Project: PDFBox
Issue Type: Improvement
Components: PDModel, Rendering
Affects Versions: 2.0.0
Reporter: John Hewson
Assignee: John Hewson
Attachments: Basiswissen-Vorschriften.pdf, Basiswissen-Vorschriften.pdf-1.png, Basiswissen-Vorschriften.pdf-1.png-diff.png, Basiswissen-Vorschriften.pdf-9.png, Basiswissen-Vorschriften.pdf-9.png-diff.png, ELVIA-Reiserucktritt-Vollschutz.pdf-1.png, FreeSansTest.pdf, PDFBOX-1094-094730.pdf-1.png, PDFBOX-1770.pdf-1.png, bugzilla867751.pdf-2.png, bugzilla867751.pdf-2.png-diff.png, bugzilla886049.pdf, bugzilla886049.pdf-1.png, test_1fd9a_test.pdf

We're still using AWT fonts to render the standard 14 built-in fonts, which causes rendering problems and encoding issues (see PDFBOX-2140). We're also using AWT for some fallback fonts.

Removal of these AWT fonts isn't too difficult, we need to load the fonts using the existing PDFFontManager mechanism which has recently been added. All missing TrueType fonts loaded from disk have been using SystemFontManager for a number of weeks now.

We should ship some sensible default fonts with PDFBox, such as the Liberation fonts (see PDFBOX-2169, PDFBOX-2263), in case PDFFontManager can't find anything suitable, rather than falling back to the default TTF font, but by default we'll probe the system for suitable fonts.
[jira] [Commented] (PDFBOX-1442) bar chart converted from PDF is totally a black area.
[ https://issues.apache.org/jira/browse/PDFBOX-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113417#comment-14113417 ]

John Hewson commented on PDFBOX-1442:
-------------------------------------

Yes, pattern matrix (if present) is concatenated to the stream's initial matrix to get the transformation from pattern space to device space.

bar chart converted from PDF is totally a black area.
-----------------------------------------------------

Key: PDFBOX-1442
URL: https://issues.apache.org/jira/browse/PDFBOX-1442
Project: PDFBox
Issue Type: Bug
Components: Rendering
Affects Versions: 1.7.1, 1.8.0, 1.8.2, 1.8.3, 1.8.4, 1.8.5, 2.0.0
Reporter: James Zhou
Labels: shading, shadingpattern
Attachments: PATTYP1.pdf-1.png, clientfocus.PNG, clientfocus.pdf, funsh01.pdf-1.png, pdfbox-1442.pdf-1.png, pdfbox-1442.pdf-1.png

The bar charts converted from the PDF come out as a completely black area. The code is as follows:

{code}
PDFImageWriter imageWriter = new PDFImageWriter();
boolean success = imageWriter.writeImage(document, imageFormat, password,
        startPage, endPage, outputPrefix, imageType, resolution);
if (!success)
{
    // the string quotes were lost in the original posting and are restored here
    logger.error("Error: no writer found for image format '" + imageFormat + "'");
    System.exit(1);
}
{code}

I will attach the ppt and PNG files later.
[jira] [Comment Edited] (PDFBOX-1442) bar chart converted from PDF is totally a black area.
[ https://issues.apache.org/jira/browse/PDFBOX-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113417#comment-14113417 ]

John Hewson edited comment on PDFBOX-1442 at 8/28/14 6:27 AM:
--------------------------------------------------------------

Yes, the pattern matrix (if present) is concatenated to the stream's initial matrix to get the transformation from pattern space to device space.

was (Author: jahewson):
Yes, pattern matrix (if present) is concatenated to the stream's initial matrix to get the transformation from pattern space to device space.
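
The concatenation described in the comment can be illustrated with java.awt.geom.AffineTransform. This is only a sketch and not PDFBox's rendering code: the class name PatternMatrixSketch, the method patternToDevice, and the sample matrices are made up for the illustration.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

// Hypothetical helper for illustration only; not part of PDFBox.
final class PatternMatrixSketch
{
    private PatternMatrixSketch()
    {
    }

    /**
     * Builds the pattern-space to device-space transform.
     * patternMatrix may be null, since the pattern matrix is optional.
     */
    static AffineTransform patternToDevice(AffineTransform initialMatrix,
                                           AffineTransform patternMatrix)
    {
        AffineTransform result = new AffineTransform(initialMatrix);
        if (patternMatrix != null)
        {
            // concatenate() applies its argument first, then the existing
            // transform: a point goes through patternMatrix, then initialMatrix
            result.concatenate(patternMatrix);
        }
        return result;
    }

    public static void main(String[] args)
    {
        AffineTransform initial = AffineTransform.getScaleInstance(2, 2);
        AffineTransform pattern = AffineTransform.getTranslateInstance(10, 5);
        Point2D device = patternToDevice(initial, pattern)
                .transform(new Point2D.Double(1, 1), null);
        // (1,1) is translated to (11,6) and then scaled to (22,12)
        System.out.println(device.getX() + ", " + device.getY());
    }
}
```

If the pattern matrix is absent, the sketch simply returns a copy of the initial matrix.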
[jira] [Updated] (PDFBOX-2291) Differences in Overlay stamping between version 1.8.2 and 1.8.6
[ https://issues.apache.org/jira/browse/PDFBOX-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr updated PDFBOX-2291:
------------------------------------

Issue Type: Bug (was: Test)

Differences in Overlay stamping between version 1.8.2 and 1.8.6
----------------------------------------------------------------

Key: PDFBOX-2291
URL: https://issues.apache.org/jira/browse/PDFBOX-2291
Project: PDFBox
Issue Type: Bug
Components: Utilities
Affects Versions: 1.8.6, 1.8.7
Environment: JDK 7
Reporter: Markus Sticker
Labels: Overlay, Regression
Attachments: Doc1.pdf, out.2.pdf, out.3.pdf, zf_static_title.pdf

Hello,

I am trying to overlay a PDF file with pages from other PDFs: the first page should be stamped with a title page, and on the following pages I want to stamp the header and the footer of each page.

I want to use version 1.8.6 of PDFBox. The last time I tried, with version 1.8.2, it worked, but now I am stuck because my PDF viewer tells me that the stamped PDF is damaged.

To explain the attached files:
Doc1.pdf = original
out.2.pdf = stamped with 1.8.6
out.3.pdf = stamped with 1.8.2

So what is the difference between 1.8.2 and 1.8.6 in this case?

Kind regards
Markus
[jira] [Commented] (PDFBOX-1695) Improve pdfbox tests
[ https://issues.apache.org/jira/browse/PDFBOX-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113461#comment-14113461 ]

ASF subversion and git services commented on PDFBOX-1695:
---------------------------------------------------------

Commit 1621065 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1621065 ]

PDFBOX-1695: show diff result as absolute subtraction of color differences, so that minor differences will be found, but not seen

Improve pdfbox tests
--------------------

Key: PDFBOX-1695
URL: https://issues.apache.org/jira/browse/PDFBOX-1695
Project: PDFBox
Issue Type: Improvement
Affects Versions: 1.8.2, 2.0.0
Reporter: Tilman Hausherr
Assignee: Tilman Hausherr
Priority: Minor
Labels: tdd, test-driven, testing
Attachments: ccitt4.tif, jbig2test-01.png, jbig2test.pdf

I'd like to improve the tests for rendering.

org/apache/pdfbox/util/TestPDFToImage.java is disabled in pdfbox\pom.xml. It has been disabled since 2009?! So I enabled it here.

The subdirectory "rendering" is missing in pdfbox\target\test-output for these tests.

When a test fails because the rendered image is not identical, no detailed message appears on the console; it appears only in pdfbox.log. This is because of the settings in pdfbox\src\test\resources\logging.properties. If this is on purpose, please change the texts in pdfbox\src\test\java\org\apache\pdfbox\util\*.java from
    One or more failures, see test log for details
to
    One or more failures, see test logfile 'pdfbox.log' for details

I wanted to attach a PDF with CCITT G4 compression and its rendering created with version 1.8.2, but that didn't work out; it seems that CIB generates files that can be rendered properly with 1.8.2. However, I attach the TIFF G4 file and a JBIG2 test file made from it. I don't have access to a Xerox WorkCentre (enter jbig2 in google news :-) ), so I used a free service, which is why there's a watermark. It should be included in pdfbox\src\test\resources\input\rendering. I created the image myself and release it into the public domain.

If my suggestion is accepted, it would be nice if people could create files that fail in current versions or have failed in old versions, and release these files to the public domain, so that they can be added to the tests.
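
The idea in the commit message, a diff image built from the absolute per-channel difference of two renderings, might look roughly like the sketch below. It is not the code from commit 1621065; the class and method names are hypothetical, and returning null for images of different size is an assumption made for the example.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper for illustration only; not the code from the commit.
final class AbsDiffSketch
{
    private AbsDiffSketch()
    {
    }

    /**
     * Returns an image whose red, green and blue channels hold |a - b|
     * for every pixel, or null if the two images differ in size.
     */
    static BufferedImage absDiff(BufferedImage a, BufferedImage b)
    {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight())
        {
            return null;
        }
        BufferedImage diff = new BufferedImage(a.getWidth(), a.getHeight(),
                BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < a.getHeight(); y++)
        {
            for (int x = 0; x < a.getWidth(); x++)
            {
                int p = a.getRGB(x, y);
                int q = b.getRGB(x, y);
                int red = Math.abs(((p >> 16) & 0xFF) - ((q >> 16) & 0xFF));
                int green = Math.abs(((p >> 8) & 0xFF) - ((q >> 8) & 0xFF));
                int blue = Math.abs((p & 0xFF) - (q & 0xFF));
                diff.setRGB(x, y, (red << 16) | (green << 8) | blue);
            }
        }
        return diff;
    }
}
```

With this kind of diff, identical pixels come out black and a one-step colour difference comes out almost black, which is one way to read "found, but not seen": a pixel-by-pixel check still detects the nonzero value even though the eye does not.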
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113491#comment-14113491 ]

ASF subversion and git services commented on PDFBOX-2262:
---------------------------------------------------------

Commit 1621078 from [~jahewson] in branch 'pdfbox/branches/no-awt'
[ https://svn.apache.org/r1621078 ]

PDFBOX-2262: Missing encoding entries should map to .notdef
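
As a rough illustration of what the commit message describes, a code-to-glyph-name lookup can fall back to ".notdef" for codes that have no entry. The sketch below is an assumption about the general technique, not the change made in the no-awt branch; EncodingSketch and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an encoding table; not the PDFBox class.
final class EncodingSketch
{
    private final Map<Integer, String> codeToName = new HashMap<Integer, String>();

    /** Registers a glyph name for a character code. */
    void add(int code, String glyphName)
    {
        codeToName.put(code, glyphName);
    }

    /** Returns the glyph name for the code, or ".notdef" if there is no entry. */
    String getName(int code)
    {
        String name = codeToName.get(code);
        return name != null ? name : ".notdef";
    }
}
```

Callers then always receive a usable glyph name and do not need their own null check.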
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113493#comment-14113493 ]

ASF subversion and git services commented on PDFBOX-2262:
---------------------------------------------------------

Commit 1621079 from [~jahewson] in branch 'pdfbox/branches/no-awt'
[ https://svn.apache.org/r1621079 ]

PDFBOX-2262: use default width in CIDFonts
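
In the PDF format, a CIDFont dictionary may carry a default width (the DW entry, 1000 if absent) that applies to every CID without an individual width. A minimal sketch of such a lookup follows; it is not the code from commit 1621079, and CidWidthSketch with its methods is a hypothetical name chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a CIDFont width table; not the PDFBox class.
final class CidWidthSketch
{
    private final Map<Integer, Float> widths = new HashMap<Integer, Float>();
    private final float defaultWidth;

    /** Uses 1000 as default width, the value the PDF format assumes when DW is absent. */
    CidWidthSketch()
    {
        this(1000f);
    }

    CidWidthSketch(float defaultWidth)
    {
        this.defaultWidth = defaultWidth;
    }

    /** Registers an individual width for one CID. */
    void setWidth(int cid, float width)
    {
        widths.put(cid, width);
    }

    /** Returns the individual width if present, otherwise the default width. */
    float getWidth(int cid)
    {
        Float width = widths.get(cid);
        // checking for null before unboxing avoids a NullPointerException
        return width != null ? width : defaultWidth;
    }
}
```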
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113497#comment-14113497 ]

ASF subversion and git services commented on PDFBOX-2262:
---------------------------------------------------------

Commit 1621081 from [~jahewson] in branch 'pdfbox/branches/no-awt'
[ https://svn.apache.org/r1621081 ]

PDFBOX-2262: Reinstate RAFDataStream for performance
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113496#comment-14113496 ]

ASF subversion and git services commented on PDFBOX-2262:
---------------------------------------------------------

Commit 1621080 from [~jahewson] in branch 'pdfbox/branches/no-awt'
[ https://svn.apache.org/r1621080 ]

PDFBOX-2262: Better logging for TTF errors
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113499#comment-14113499 ]

ASF subversion and git services commented on PDFBOX-2262:
---------------------------------------------------------

Commit 1621082 from [~jahewson] in branch 'pdfbox/branches/no-awt'
[ https://svn.apache.org/r1621082 ]

PDFBOX-2262: FontProvider thread safety
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113512#comment-14113512 ]

John Hewson commented on PDFBOX-2262:
-------------------------------------

That should help with the exceptions; I'll take a look at the rendering issues tomorrow.
Jenkins build is back to normal : PDFBox-trunk » PDFBox parent #1220
See https://builds.apache.org/job/PDFBox-trunk/org.apache.pdfbox$pdfbox-parent/1220/
Jenkins build is back to normal : PDFBox-trunk #1220
See https://builds.apache.org/job/PDFBox-trunk/1220/changes
Jenkins build is back to normal : PDFBox-trunk » Apache XmpBox #1220
See https://builds.apache.org/job/PDFBox-trunk/org.apache.pdfbox$xmpbox/1220/
[jira] [Created] (PDFBOX-2295) Checkboxes missing
simon steiner created PDFBOX-2295:
----------------------------------

Summary: Checkboxes missing
Key: PDFBOX-2295
URL: https://issues.apache.org/jira/browse/PDFBOX-2295
Project: PDFBox
Issue Type: Bug
Components: Rendering
Affects Versions: 2.0.0
Reporter: simon steiner

java -jar ~/pdf-box-svn/app/target/pdfbox-app-2.0.0-SNAPSHOT.jar PDFToImage c21-5916.pdf
[jira] [Updated] (PDFBOX-2295) Checkboxes missing
[ https://issues.apache.org/jira/browse/PDFBOX-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

simon steiner updated PDFBOX-2295:
----------------------------------

Description:
PDF http://svn.apache.org/viewvc/incubator/pdfbox/trunk/test/input/c21-5916%20.pdf?revision=682412&view=co&pathrev=793348

java -jar ~/pdf-box-svn/app/target/pdfbox-app-2.0.0-SNAPSHOT.jar PDFToImage c21-5916.pdf

was:
java -jar ~/pdf-box-svn/app/target/pdfbox-app-2.0.0-SNAPSHOT.jar PDFToImage c21-5916.pdf
[jira] [Commented] (PDFBOX-2295) Checkboxes missing
[ https://issues.apache.org/jira/browse/PDFBOX-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113632#comment-14113632 ]

Andreas Lehmkühler commented on PDFBOX-2295:
--------------------------------------------

Seems to be a font issue. A Wingdings font is used to produce the checkbox.
[jira] [Assigned] (PDFBOX-2270) PDField.getFullyQualifiedName() returns name adding suffix '.null'
[ https://issues.apache.org/jira/browse/PDFBOX-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler reassigned PDFBOX-2270:
-----------------------------------------

Assignee: Andreas Lehmkühler

PDField.getFullyQualifiedName() returns name adding suffix '.null'
------------------------------------------------------------------

Key: PDFBOX-2270
URL: https://issues.apache.org/jira/browse/PDFBOX-2270
Project: PDFBox
Issue Type: Bug
Components: AcroForm
Affects Versions: 1.5.0, 1.7.1, 1.8.0, 1.8.6
Environment: JSE1.6
Reporter: Javier García Sánchez
Assignee: Andreas Lehmkühler
Labels: PDAcroForm, PDField, getFullyQualifiedName()
Attachments: TesterFields.java, business_loan_app_1_signer.pdf
Original Estimate: 120h
Remaining Estimate: 120h

We have several PDF files, each containing one PDF form with its own fields. We need to read all the form fields and list them in a txt file. The problem arises when a form has duplicated field names: field.getFullyQualifiedName() then returns a wrong name, with '.null' appended to the end of the field name.

--Situation:
1) A PDF file containing a PDF form.
2) The form contains a lot of fields, and some of the field names are duplicated, for example 'Applicant.city'.
3) When I try to list all the field names, the duplicated names come back with the suffix '.null'. This only happens for duplicated field names.

--Example:
1) A PDF form with 4 fields whose names are: 'Applicant.name', 'Applicant.phone', 'Applicant.ssn', 'Applicant.name'.
2) After running the code shown below, the resulting list is: 'Applicant.name.null', 'Applicant.phone', 'Applicant.ssn', 'Applicant.name.null'.

--The code for listing all form field names (generic type brackets and string quotes lost in the original posting are restored):

    import java.io.IOException;
    import java.util.HashSet;
    import java.util.Iterator;
    import java.util.List;
    import java.util.Set;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
    import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
    import org.apache.pdfbox.pdmodel.interactive.form.PDField;

    // collects the fully qualified names of all terminal fields of the form
    public static Set<String> printFields( PDDocument doc ) throws IOException
    {
        PDDocumentCatalog docCatalog = doc.getDocumentCatalog();
        PDAcroForm acroForm = docCatalog.getAcroForm();
        List fields = acroForm.getFields();
        Iterator fieldsIter = fields.iterator();
        Set<String> fieldSet = new HashSet<String>();
        while ( fieldsIter.hasNext() )
        {
            PDField field = (PDField)fieldsIter.next();
            // String fieldFullName = processField(field);
            fieldSet.addAll( processField( field ) );
        }
        return fieldSet;
    }

    // descends into the kids of a field; a field without kids contributes its own name
    private static Set<String> processField( PDField field ) throws IOException
    {
        List kids = field.getKids();
        Set<String> result = new HashSet<String>();
        if ( kids != null )
        {
            Iterator kidsIter = kids.iterator();
            while ( kidsIter.hasNext() )
            {
                Object pdfObj = kidsIter.next();
                if ( pdfObj instanceof PDField )
                {
                    PDField kid = (PDField)pdfObj;
                    result.addAll( processField( kid ) );
                }
            }
        }
        else
        {
            //System.out.println( "field.getFullyQualifiedName(): " + field.getFullyQualifiedName() );
            result.add( field.getFullyQualifiedName() );
        }
        return result;
    }

field.getFullyQualifiedName() is returning the duplicated field names with the suffix '.null'.

Thanks in advance.
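
One possible explanation for the '.null' suffix, offered only as an assumption and not confirmed anywhere in the report, is that a kid of a duplicated field has no partial name of its own and that the null value is joined onto the parent's name by plain string concatenation. The sketch below reproduces that effect with a hypothetical FieldNameSketch class that has nothing to do with the real PDField implementation, and shows a null-safe variant.

```java
// Hypothetical model of a field hierarchy; not the PDFBox PDField class.
final class FieldNameSketch
{
    private final FieldNameSketch parent;
    private final String partialName; // may be null for a kid without its own name

    FieldNameSketch(FieldNameSketch parent, String partialName)
    {
        this.parent = parent;
        this.partialName = partialName;
    }

    /** Naive joining: a null partial name ends up as the text "null". */
    String naiveName()
    {
        return parent == null ? partialName : parent.naiveName() + "." + partialName;
    }

    /** Null-safe joining: a kid without a partial name inherits the parent's name. */
    String nullSafeName()
    {
        String parentName = parent == null ? null : parent.nullSafeName();
        if (partialName == null)
        {
            return parentName;
        }
        return parentName == null ? partialName : parentName + "." + partialName;
    }

    public static void main(String[] args)
    {
        FieldNameSketch applicant = new FieldNameSketch(null, "Applicant");
        FieldNameSketch name = new FieldNameSketch(applicant, "name");
        FieldNameSketch unnamedKid = new FieldNameSketch(name, null);
        System.out.println(unnamedKid.naiveName());    // Applicant.name.null
        System.out.println(unnamedKid.nullSafeName()); // Applicant.name
    }
}
```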
[jira] [Commented] (PDFBOX-2295) Checkboxes missing
[ https://issues.apache.org/jira/browse/PDFBOX-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113864#comment-14113864 ]

Tilman Hausherr commented on PDFBOX-2295:
-----------------------------------------

It is fixed in PDFBOX-2262, in the version of this morning.
AcroForm fields and appearance stream generation
Hi,

there are cases where a form field doesn't contain an appearance, e.g. when the form was filled in and the NeedAppearances flag in the form's dictionary has been set. In such cases an appearance stream needs to be generated for rendering.

Am I right that for PDFBox
# we should respect the NeedAppearances flag when setting a field's value, so that we don't generate an appearance stream in that case
# we shouldn't generate an appearance stream during the parsing stage if none exists
# we shall generate an appearance stream if none exists when rendering the PDF

BR
Maruan
Re: AcroForm fields and appearance stream generation
Hi,

On 28.08.2014 18:23, Maruan Sahyoun wrote:
> Hi,
>
> there are cases where a form field doesn't contain an appearance, e.g. when the form was filled in and the NeedAppearances flag in the form's dictionary has been set. In such cases an appearance stream needs to be generated for rendering.
>
> Am I right that for PDFBox
> # we should respect the NeedAppearances flag when setting a field's value, so that we don't generate an appearance stream in that case
+1

> # we shouldn't generate an appearance stream during the parsing stage if none exists
+1

> # we shall generate an appearance stream if none exists when rendering the PDF
+1

> BR
> Maruan

BR
Andreas Lehmkühler
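
The three points agreed on in this thread can be summarised as a small decision function. This is only a sketch of the proposal, not existing PDFBox code: AppearancePolicySketch, Stage and shouldGenerate are hypothetical names, and generating an appearance when a value is set while NeedAppearances is not set is an assumption that the thread implies but does not state.

```java
// Hypothetical summary of the proposal; not part of PDFBox.
final class AppearancePolicySketch
{
    /** The three situations discussed in the thread. */
    enum Stage
    {
        PARSING, SETTING_VALUE, RENDERING
    }

    private AppearancePolicySketch()
    {
    }

    /**
     * Decides whether an appearance stream should be generated.
     *
     * @param needAppearances state of the NeedAppearances flag of the form
     * @param hasAppearance   whether the field already has an appearance stream
     */
    static boolean shouldGenerate(Stage stage, boolean needAppearances, boolean hasAppearance)
    {
        switch (stage)
        {
            case PARSING:
                // never generate while parsing, even if none exists
                return false;
            case SETTING_VALUE:
                // respect the flag; generating when it is unset is an assumption
                return !needAppearances;
            case RENDERING:
                // generate only if the field has no appearance yet
                return !hasAppearance;
            default:
                throw new IllegalArgumentException("Unknown stage: " + stage);
        }
    }
}
```

Keeping the decision in one place like this would make it easy to see that parsing never has side effects on the document, while rendering fills in whatever is missing.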
[jira] [Assigned] (PDFBOX-2291) Differences in Overlay stamping between version 1.8.2 and 1.8.6
[ https://issues.apache.org/jira/browse/PDFBOX-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler reassigned PDFBOX-2291: -- Assignee: Andreas Lehmkühler
Differences in Overlay stamping between version 1.8.2 and 1.8.6 --- Key: PDFBOX-2291 URL: https://issues.apache.org/jira/browse/PDFBOX-2291 Project: PDFBox Issue Type: Bug Components: Utilities Affects Versions: 1.8.6, 1.8.7 Environment: JDK 7 Reporter: Markus Sticker Assignee: Andreas Lehmkühler Labels: Overlay, Regression Attachments: Doc1.pdf, out.2.pdf, out.3.pdf, zf_static_title.pdf
Hello, I am trying to overlay a PDF file with pages from other PDFs: the first page should be stamped with a title page, and each following page should be stamped with a header and a footer. I want to use version 1.8.6 of PDFBox. This worked with version 1.8.2, but with 1.8.6 my PDF viewer tells me that the stamped PDF is damaged. To explain the attached files: Doc1.pdf = original, out.2.pdf = stamped with 1.8.6, out.3.pdf = stamped with 1.8.2. So what is the difference between 1.8.2 and 1.8.6 in this case? Kind regards Markus -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114083#comment-14114083 ] Tilman Hausherr commented on PDFBOX-2262: - In truetypefont.java, getFullName() and toString() throw an NPE if the naming table is null. (Happened to me as I wanted to do some debug logging) Remove usage of AWT fonts - Key: PDFBOX-2262 URL: https://issues.apache.org/jira/browse/PDFBOX-2262 Project: PDFBox Issue Type: Improvement Components: PDModel, Rendering Affects Versions: 2.0.0 Reporter: John Hewson Assignee: John Hewson Attachments: Basiswissen-Vorschriften.pdf, Basiswissen-Vorschriften.pdf-1.png, Basiswissen-Vorschriften.pdf-1.png-diff.png, Basiswissen-Vorschriften.pdf-9.png, Basiswissen-Vorschriften.pdf-9.png-diff.png, ELVIA-Reiserucktritt-Vollschutz.pdf-1.png, FreeSansTest.pdf, PDFBOX-1094-094730.pdf-1.png, PDFBOX-1770.pdf-1.png, bugzilla867751.pdf-2.png, bugzilla867751.pdf-2.png-diff.png, bugzilla886049.pdf, bugzilla886049.pdf-1.png, test_1fd9a_test.pdf We're still using AWT fonts to render the standard 14 built-in fonts, which causes rendering problems and encoding issues (see PDFBOX-2140). We're also using AWT for some fallback fonts. Removal of these AWT fonts isn't too difficult, we need to load the fonts using the existing PDFFontManager mechanism which has recently been added. All missing TrueType fonts loaded from disk have been using SystemFontManager for a number of weeks now. We should ship some sensible default fonts with PDFBox, such as the Liberation fonts (see PDFBOX-2169, PDFBOX-2263), in case PDFFontManager can't find anything suitable, rather than falling back to the default TTF font, but by default we'll probe the system for suitable fonts. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114094#comment-14114094 ] ASF subversion and git services commented on PDFBOX-2262: - Commit 1621174 from [~jahewson] in branch 'pdfbox/branches/no-awt' [ https://svn.apache.org/r1621174 ] PDFBOX-2262: thread safety for fonts
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114102#comment-14114102 ] John Hewson commented on PDFBOX-2262: - {quote} In truetypefont.java, getFullName() and toString() throw an NPE if the naming table is null. (Happened to me as I wanted to do some debug logging) {quote} Fixed :)
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114101#comment-14114101 ] ASF subversion and git services commented on PDFBOX-2262: - Commit 1621175 from [~jahewson] in branch 'pdfbox/branches/no-awt' [ https://svn.apache.org/r1621175 ] PDFBOX-2262: Allow getNaming()to be null in TTF
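The kind of null guard this commit message describes might look roughly like the sketch below. It is not the actual change from r1621175: `NamingTable` and `Font` here are simplified, hypothetical stand-ins, and returning null from `getFullName()` when the naming table is absent is an assumption made for illustration.

```java
class NullSafeNamingSketch {
    /** Hypothetical, simplified stand-in for a TrueType 'name' table. */
    static final class NamingTable {
        final String family;
        final String subFamily;

        NamingTable(String family, String subFamily) {
            this.family = family;
            this.subFamily = subFamily;
        }
    }

    /** Hypothetical font wrapper whose naming table may be missing. */
    static final class Font {
        private final NamingTable naming; // null when the font has no naming table

        Font(NamingTable naming) {
            this.naming = naming;
        }

        NamingTable getNaming() {
            return naming;
        }

        /** Returns null instead of throwing when there is no naming table. */
        String getFullName() {
            NamingTable n = getNaming();
            return n == null ? null : n.family + " " + n.subFamily;
        }

        /** Safe to use in debug logging even without a naming table. */
        @Override
        public String toString() {
            String name = getFullName();
            return name != null ? name : "(null)";
        }
    }

    public static void main(String[] args) {
        System.out.println(new Font(new NamingTable("Liberation Sans", "Regular")));
        System.out.println(new Font(null));
    }
}
```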
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114111#comment-14114111 ] John Hewson commented on PDFBOX-2262: - I don't have PDFBOX-2048-confidential.pdf, can you send it to me off-list?
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114134#comment-14114134 ] Tilman Hausherr commented on PDFBOX-2262: - This happens for the PDFBOX-2048-confidential.pdf file when reading the loca table:
{code}
Aug 28, 2014 8:32:33 PM org.apache.pdfbox.util.TestPDFToImage doTestFile
SCHWERWIEGEND: Error converting file PDFBOX-2048-confidential.pdf
java.io.IOException: Could not read embedded TTF for font FAAABC+TimesNewRomanPS-BoldMT
    at org.apache.pdfbox.pdmodel.font.PDCIDFontType2.init(PDCIDFontType2.java:81)
    at org.apache.pdfbox.pdmodel.font.PDFontFactory.createDescendantFont(PDFontFactory.java:123)
    at org.apache.pdfbox.pdmodel.font.PDType0Font.init(PDType0Font.java:63)
    at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:81)
    at org.apache.pdfbox.pdmodel.PDResources.getFonts(PDResources.java:215)
    at org.apache.pdfbox.util.PDFStreamEngine.getFonts(PDFStreamEngine.java:569)
    at org.apache.pdfbox.util.operator.text.SetTextFont.process(SetTextFont.java:48)
    at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:536)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:269)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:236)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:190)
    at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:161)
    at org.apache.pdfbox.rendering.PDFRenderer.renderPage(PDFRenderer.java:228)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:160)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:95)
    at org.apache.pdfbox.util.TestPDFToImage.doTestFile(TestPDFToImage.java:220)
    at org.apache.pdfbox.util.TestPDFToImage.testRenderImage(TestPDFToImage.java:332)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at junit.framework.TestCase.runTest(TestCase.java:176)
    at junit.framework.TestCase.runBare(TestCase.java:141)
    at junit.framework.TestResult$1.protect(TestResult.java:122)
    at junit.framework.TestResult.runProtected(TestResult.java:142)
    at junit.framework.TestResult.run(TestResult.java:125)
    at junit.framework.TestCase.run(TestCase.java:129)
    at junit.framework.TestSuite.runTest(TestSuite.java:255)
    at junit.framework.TestSuite.run(TestSuite.java:250)
    at junit.textui.TestRunner.doRun(TestRunner.java:116)
    at junit.textui.TestRunner.start(TestRunner.java:183)
    at junit.textui.TestRunner.main(TestRunner.java:137)
    at org.apache.pdfbox.util.TestPDFToImage.main(TestPDFToImage.java:382)
Caused by: java.io.EOFException
    at org.apache.fontbox.ttf.TTFDataStream.readUnsignedInt(TTFDataStream.java:134)
    at org.apache.fontbox.ttf.IndexToLocationTable.read(IndexToLocationTable.java:58)
    at org.apache.fontbox.ttf.TrueTypeFont.readTable(TrueTypeFont.java:285)
    at org.apache.fontbox.ttf.TTFParser.parseTables(TTFParser.java:142)
    at org.apache.fontbox.ttf.TTFParser.parseTTF(TTFParser.java:122)
    at org.apache.fontbox.ttf.TTFParser.parseTTF(TTFParser.java:96)
    at org.apache.pdfbox.pdmodel.font.PDCIDFontType2.init(PDCIDFontType2.java:72)
    ... 32 more
{code}
Remove usage of AWT fonts - Key: PDFBOX-2262 URL: https://issues.apache.org/jira/browse/PDFBOX-2262 Project: PDFBox Issue Type: Improvement Components: PDModel, Rendering Affects Versions: 2.0.0 Reporter: John Hewson Assignee: John Hewson Attachments: Basiswissen-Vorschriften.pdf, Basiswissen-Vorschriften.pdf-1.png, Basiswissen-Vorschriften.pdf-1.png-diff.png, Basiswissen-Vorschriften.pdf-9.png, Basiswissen-Vorschriften.pdf-9.png-diff.png, ELVIA-Reiserucktritt-Vollschutz.pdf-1.png, FreeSansTest.pdf, PDFBOX-1094-094730.pdf-1.png, PDFBOX-1770.pdf-1.png, bugzilla867751.pdf-2.png, bugzilla867751.pdf-2.png-diff.png, bugzilla886049.pdf, bugzilla886049.pdf-1.png, test_1fd9a_test.pdf We're still using AWT fonts to render the standard 14 built-in fonts, which causes rendering problems and encoding issues (see PDFBOX-2140). We're also using AWT for some fallback fonts. Removal of these AWT fonts isn't too difficult, we need to load the fonts using the existing PDFFontManager mechanism which has recently been added.
[jira] [Commented] (PDFBOX-2091) Some characters are not rendered (font with symbol encoding)
[ https://issues.apache.org/jira/browse/PDFBOX-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114184#comment-14114184 ] ASF subversion and git services commented on PDFBOX-2091: - Commit 1621184 from [~jahewson] in branch 'pdfbox/branches/no-awt' [ https://svn.apache.org/r1621184 ] PDFBOX-2262/PDFBOX-2091: Fallback to 'post' table for nonsymbolic TTFs Some characters are not rendered (font with symbol encoding) Key: PDFBOX-2091 URL: https://issues.apache.org/jira/browse/PDFBOX-2091 Project: PDFBox Issue Type: Bug Components: Rendering Affects Versions: 2.0.0 Reporter: Juraj Lonc Assignee: Andreas Lehmkühler Fix For: 2.0.0 Attachments: PDFBOX-2091_TTFGlyph2D.diff, missing_yaccute.pdf, output.png Some characters are not rendered (see attached PDF). In this case it is yaccute. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2091) Some characters are not rendered (font with symbol encoding)
[ https://issues.apache.org/jira/browse/PDFBOX-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114186#comment-14114186 ] John Hewson edited comment on PDFBOX-2091 at 8/28/14 7:09 PM: -- I've just fixed this issue again as part of PDFBOX-2262. The conversation above is mostly incorrect, this file is a valid PDF and rendering it follows the spec exactly. The font _does_ contain the character mappings, but not in its (3,1) or (1,0) cmap tables, instead the PostScript names for glyphs are contained in the post table. The PDF spec describes exactly what to do in the fallback situation for nonsymbolic fonts: {quote} PDF 32000, p266: In any of these cases, if the glyph name cannot be mapped as specified, the glyph name shall be looked up in the font program’s “post” table (if one is present) and the associated glyph description shall be used. {quote} I've removed the existing workaround of using the (3, 0) table and replaced it with the mechanism described in the spec. was (Author: jahewson): I've just fixed this issue again as part of PDFBOX-2262. The conversation above is mostly incorrect, this file is a valid PDF and rendering it follows the spec exactly. The font _does_ contain the character mappings, but not in it's (3,1) or (1,0) cmap tables, instead the PostScript names for glyphs are contained in the post table. The PDF spec describes exactly what to do in the fallback situation for nonsymbolic fonts: {quote} PDF 32000, p266: In any of these cases, if the glyph name cannot be mapped as specified, the glyph name shall be looked up in the font program’s “post” table (if one is present) and the associated glyph description shall be used. {quote} I've removed the existing workaround of using the (3, 0) table and replaced it with the mechanism described in the spec. 
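The fallback described in the comment above, where a glyph name that cannot be resolved through the (3,1) or (1,0) cmap is looked up in the "post" table, can be sketched with plain maps. This is a simplified model and not the PDFBox implementation: the class, the method and its parameters are hypothetical, the name-to-Unicode and name-to-Mac-Roman maps stand in for the standard glyph lists, and the "post" table is reduced to a list of glyph names indexed by glyph ID.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PostTableFallbackSketch {
    /**
     * Resolves a glyph name to a glyph ID for a nonsymbolic TrueType font.
     * Any table argument may be null when the font does not contain it.
     * Returns 0 (.notdef) when no mapping is found.
     */
    static int glyphId(String glyphName,
                       Map<String, Integer> nameToUnicode,   // stand-in for a glyph list
                       Map<Integer, Integer> cmap31,         // Unicode -> glyph ID
                       Map<String, Integer> nameToMacRoman,  // stand-in for Mac OS Roman
                       Map<Integer, Integer> cmap10,         // Mac Roman code -> glyph ID
                       List<String> postGlyphNames) {        // index = glyph ID
        Integer gid = null;
        if (cmap31 != null) {
            Integer unicode = nameToUnicode.get(glyphName);
            if (unicode != null) {
                gid = cmap31.get(unicode);
            }
        }
        if (gid == null && cmap10 != null) {
            Integer code = nameToMacRoman.get(glyphName);
            if (code != null) {
                gid = cmap10.get(code);
            }
        }
        // Fallback: the name could not be mapped via a cmap, so consult 'post'.
        if (gid == null && postGlyphNames != null) {
            int index = postGlyphNames.indexOf(glyphName);
            if (index >= 0) {
                gid = index;
            }
        }
        return gid == null ? 0 : gid;
    }

    public static void main(String[] args) {
        Map<String, Integer> nameToUnicode = new HashMap<>();
        nameToUnicode.put("yacute", 0x00FD);
        Map<Integer, Integer> cmap31 = new HashMap<>();      // has no entry for U+00FD
        List<String> post = Arrays.asList(".notdef", "space", "yacute");
        int gid = glyphId("yacute", nameToUnicode, cmap31,
                new HashMap<String, Integer>(), null, post);
        System.out.println("glyph ID for yacute: " + gid);
    }
}
```

In this toy font the cmap has no entry for U+00FD, so the glyph is only found through its PostScript name in the list standing in for the "post" table.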
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114183#comment-14114183 ] ASF subversion and git services commented on PDFBOX-2262: - Commit 1621184 from [~jahewson] in branch 'pdfbox/branches/no-awt' [ https://svn.apache.org/r1621184 ] PDFBOX-2262/PDFBOX-2091: Fallback to 'post' table for nonsymbolic TTFs
[jira] [Commented] (PDFBOX-2091) Some characters are not rendered (font with symbol encoding)
[ https://issues.apache.org/jira/browse/PDFBOX-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114186#comment-14114186 ] John Hewson commented on PDFBOX-2091: - I've just fixed this issue again as part of PDFBOX-2262. The conversation above is mostly incorrect, this file is a valid PDF and rendering it follows the spec exactly. The font _does_ contain the character mappings, but not in it's (3,1) or (1,0) cmap tables, instead the PostScript names for glyphs are contained in the post table. The PDF spec describes exactly what to do in the fallback situation for nonsymbolic fonts: {quote} PDF 32000, p266: In any of these cases, if the glyph name cannot be mapped as specified, the glyph name shall be looked up in the font program’s “post” table (if one is present) and the associated glyph description shall be used. {quote} I've removed the existing workaround of using the (3, 0) table and replaced it with the mechanism described in the spec. Some characters are not rendered (font with symbol encoding) Key: PDFBOX-2091 URL: https://issues.apache.org/jira/browse/PDFBOX-2091 Project: PDFBox Issue Type: Bug Components: Rendering Affects Versions: 2.0.0 Reporter: Juraj Lonc Assignee: Andreas Lehmkühler Fix For: 2.0.0 Attachments: PDFBOX-2091_TTFGlyph2D.diff, missing_yaccute.pdf, output.png Some characters are not rendered (see attached PDF). In this case it is yaccute. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114220#comment-14114220 ] John Hewson commented on PDFBOX-2262: - The problem with PDFBOX-2048-confidential.pdf is that the embedded TTF is 37980 bytes, but COSStream#getUnfilteredStream() is returning a stream with only 6747 bytes.
[jira] [Comment Edited] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114220#comment-14114220 ] John Hewson edited comment on PDFBOX-2262 at 8/28/14 7:37 PM: -- The problem with PDFBOX-2048-confidential.pdf is that the embedded TTF is 37980 bytes, but COSStream#getUnfilteredStream() is returning a stream with only 6747 bytes, hence EOFException. was (Author: jahewson): The problem with PDFBOX-2048-confidential.pdf is that the embedded TTF is 37980 bytes, but COSStream#getUnfilteredStream() is returning a stream with only 6747 bytes.
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114236#comment-14114236 ] John Hewson commented on PDFBOX-2262: - Lets ignore PDFBOX-2048-confidential.pdf, because the problem is not related to AWT fonts, the stream dictionarry for the FAAABC+TimesNewRomanPS-BoldMT font is bous: {code} /Length1 37908 /Length 4412 /Filter /FlateDecode {code} There's no such thing as Length1 for a TTF stream, and Length has the wrong value, so the FlateFilter is ending early.
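The effect described here, a Flate-compressed stream cut short by a too-small Length so that decoding ends early, can be reproduced in isolation with java.util.zip. The sketch below does not use PDFBox or its FlateFilter; the class and method names are hypothetical, the payload is random bytes of the size mentioned in the comment, and cutting the compressed data in half is an arbitrary stand-in for a wrong /Length value.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.Random;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class TruncatedFlateSketch {
    /** Compresses the data with zlib/deflate, as a FlateDecode stream would be. */
    static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Inflates as far as the given bytes allow and returns the decoded length. */
    static int inflatedLength(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] buffer = new byte[4096];
        int total = 0;
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // ran out of compressed input before the stream ended
                }
                total += n;
            }
        } catch (DataFormatException e) {
            // Corrupt data: keep whatever was decoded so far.
        } finally {
            inflater.end();
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] font = new byte[37980];          // stand-in for an embedded font program
        new Random(42).nextBytes(font);
        byte[] compressed = deflate(font);
        // Simulate a wrong /Length by handing the decoder only part of the data.
        byte[] cut = Arrays.copyOf(compressed, compressed.length / 2);
        System.out.println("full decode:      " + inflatedLength(compressed));
        System.out.println("truncated decode: " + inflatedLength(cut));
    }
}
```

A parser that then reads font tables from the shorter decoded buffer runs off its end, which is consistent with the EOFException in the stack trace reported earlier in this thread.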
[jira] [Comment Edited] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114236#comment-14114236 ] John Hewson edited comment on PDFBOX-2262 at 8/28/14 7:46 PM: -- Lets ignore PDFBOX-2048-confidential.pdf, because the problem is not related to AWT fonts, the stream dictionarry for the FAAABC+TimesNewRomanPS-BoldMT font is bogus: {code} /Length1 37908 /Length 4412 /Filter /FlateDecode {code} There's no such thing as Length1 for a TTF stream, and Length has the wrong value, so the FlateFilter is ending early. was (Author: jahewson): Lets ignore PDFBOX-2048-confidential.pdf, because the problem is not related to AWT fonts, the stream dictionarry for the FAAABC+TimesNewRomanPS-BoldMT font is bous: {code} /Length1 37908 /Length 4412 /Filter /FlateDecode {code} There's no such thing as Length1 for a TTF stream, and Length has the wrong value, so the FlateFilter is ending early.
[jira] [Comment Edited] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114236#comment-14114236 ] John Hewson edited comment on PDFBOX-2262 at 8/28/14 7:45 PM: -- Lets ignore PDFBOX-2048-confidential.pdf, because the problem is not related to AWT fonts, the stream dictionarry for the FAAABC+TimesNewRomanPS-BoldMT font is bous: {code} /Length1 37908 /Length 4412 /Filter /FlateDecode {code} There's no such thing as Length1 for a TTF stream, and Length has the wrong value, so the FlateFilter is ending early. was (Author: jahewson): Lets ignore PDFBOX-2048-confidential.pdf, because the problem is not related to AWT fonts, the stream dictionarry for the FAAABC+TimesNewRomanPS-BoldMT font is bous: {code} /Length1 37908 /Length 4412 /Filter /FlateDecode {code} There's no such thing as Length1 for a TTF stream, and Length has the wrong value, so the FlateFilter is ending early.
[jira] [Comment Edited] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114236#comment-14114236 ]

John Hewson edited comment on PDFBOX-2262 at 8/28/14 7:49 PM:

Let's ignore PDFBOX-2048-confidential.pdf, because the problem is not related to AWT fonts: the stream dictionary for the FAAABC+TimesNewRomanPS-BoldMT font is bogus:

{code}
/Length1 37908
/Length 4412
/Filter /FlateDecode
{code}

There's no such thing as Length1 for a TTF stream, and Length has the wrong value, so the FlateFilter is ending early.

All other files are now rendering correctly for me.
[jira] [Assigned] (PDFBOX-2295) Checkboxes missing
[ https://issues.apache.org/jira/browse/PDFBOX-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Hewson reassigned PDFBOX-2295:

Assignee: John Hewson

Checkboxes missing
Key: PDFBOX-2295
URL: https://issues.apache.org/jira/browse/PDFBOX-2295
Project: PDFBox
Issue Type: Bug
Components: Rendering
Affects Versions: 2.0.0
Reporter: simon steiner
Assignee: John Hewson

PDF: http://svn.apache.org/viewvc/incubator/pdfbox/trunk/test/input/c21-5916%20.pdf?revision=682412&view=co&pathrev=793348

java -jar ~/pdf-box-svn/app/target/pdfbox-app-2.0.0-SNAPSHOT.jar PDFToImage c21-5916.pdf
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114251#comment-14114251 ]

Tilman Hausherr commented on PDFBOX-2262:

I traced through the flate filter: it reads 6748 bytes, i.e. it isn't a bug in getUnfilteredStream(). However, in the trunk a premature EOF in a table didn't result in an exception, i.e. that file renders.

Yes, the other files are ok now :-)
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114267#comment-14114267 ]

Tilman Hausherr commented on PDFBOX-2262:

/Length1 is described in the spec:

{quote}
(Required for Type 1 and TrueType fonts) The length in bytes of the clear-text portion of the Type 1 font program (see below), or the entire TrueType font program, after it has been decoded using the filters specified by the stream’s Filter entry, if any.
{quote}
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114278#comment-14114278 ]

Tilman Hausherr commented on PDFBOX-2262:

The length 4412 is definitely wrong. So this might be a new bug in PDFBox too, because it should detect when the length is wrong and read the stream the hard way.
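To illustrate what "detect when the length is wrong and read the stream the hard way" could mean, here is a minimal sketch. It is not PDFBox code: the class `StreamLengthCheck` and its methods `keywordAt` and `resolveLength` are hypothetical names made up for this example. It assumes the raw file bytes are in memory and that `start` is the offset of the first stream byte. The declared /Length is trusted only if the `endstream` keyword really follows it; otherwise the keyword is searched for directly.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of PDFBox: validates a declared /Length
// against the actual position of the "endstream" keyword.
final class StreamLengthCheck {
    private static final byte[] ENDSTREAM =
            "endstream".getBytes(StandardCharsets.US_ASCII);

    /** True if the "endstream" keyword starts exactly at index pos. */
    static boolean keywordAt(byte[] data, int pos) {
        if (pos < 0 || pos + ENDSTREAM.length > data.length) {
            return false;
        }
        for (int i = 0; i < ENDSTREAM.length; i++) {
            if (data[pos + i] != ENDSTREAM[i]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Returns the declared length if "endstream" (after optional EOL bytes)
     * follows it; otherwise scans forward from start for the keyword.
     */
    static int resolveLength(byte[] data, int start, int declared) {
        if (declared >= 0 && declared <= data.length - start) {
            int p = start + declared;
            while (p < data.length && (data[p] == '\r' || data[p] == '\n')) {
                p++;
            }
            if (keywordAt(data, p)) {
                return declared; // the dictionary told the truth
            }
        }
        // The hard way: look for the keyword itself.
        for (int end = start; end < data.length; end++) {
            if (keywordAt(data, end)) {
                // Drop the single EOL that precedes the keyword, if any.
                if (end > start && data[end - 1] == '\n') {
                    end--;
                }
                if (end > start && data[end - 1] == '\r') {
                    end--;
                }
                return end - start;
            }
        }
        throw new IllegalArgumentException("no endstream keyword found");
    }
}
```

This sketch stops at the first `endstream` byte sequence, which could in principle also occur inside binary stream data, so a real parser would need to be more careful than this.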
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114321#comment-14114321 ]

John Hewson commented on PDFBOX-2262:

I'm going to take a look at PDFBOX-2294 and do some cleaning up and refactoring, then if everything still works it will be time to merge this branch into trunk.
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114318#comment-14114318 ]

John Hewson commented on PDFBOX-2262:

{quote}
/Length1 is described in the spec:
{quote}

So it is, I had thought it was Type 1 only. We could probably use Length1 to sanity check the decoded data length and keep reading if it is too short.

Yes, it's a new PDFBox bug.
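The Length1 sanity check suggested above could look roughly like the following sketch. It uses only `java.util.zip` and is not PDFBox code: the class `Length1Check` and its methods are hypothetical, and the `deflate` helper exists only to build sample data. The idea is to decode as much as the encoded bytes allow and then compare the decoded size with /Length1. If the result is shorter, the declared /Length probably cut the encoded data off early.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of PDFBox.
final class Length1Check {
    /** Inflates zlib data, returning whatever could be decoded before the input ran out. */
    static byte[] inflate(byte[] encoded) {
        Inflater inflater = new Inflater();
        inflater.setInput(encoded);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0) {
                    break; // no progress possible: input exhausted early
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt flate data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Compresses raw bytes; only used to build sample data. */
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    /** True if fewer bytes were decoded than /Length1 promises. */
    static boolean isTooShort(byte[] decoded, int length1) {
        return decoded.length < length1;
    }
}
```

When `isTooShort` returns true, a caller could re-read the stream with a corrected encoded length and decode again. How that retry would fit into the existing parsing code is left open here.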
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114325#comment-14114325 ]

ASF subversion and git services commented on PDFBOX-2262:

Commit 1621202 from [~jahewson] in branch 'pdfbox/branches/no-awt'
[ https://svn.apache.org/r1621202 ]

PDFBOX-2262: Expose encoding details for debugging
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114334#comment-14114334 ]

ASF subversion and git services commented on PDFBOX-2262:

Commit 1621203 from [~jahewson] in branch 'pdfbox/branches/no-awt'
[ https://svn.apache.org/r1621203 ]

PDFBOX-2262: Refactor text positioning
[jira] [Commented] (PDFBOX-2262) Remove usage of AWT fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114468#comment-14114468 ]

Tilman Hausherr commented on PDFBOX-2262:

I found the bug, see PDFBOX-2296. With the fix, the file renders fine and much better than in the trunk.
[jira] [Updated] (PDFBOX-2296) Wrong stream length used for truetype font
[ https://issues.apache.org/jira/browse/PDFBOX-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr updated PDFBOX-2296:

Description: The file of PDFBOX-2048 has a wrong encoded font length: it is 4412 in the PDF but is really about 27350. This wrong length is used to read the encoded font stream, which results in further trouble (EOF). The problem is that the wrong length is passed to createFilteredStream() instead of just calling it without parameters. In cosStream.doDecode(), unFilteredStream = filteredStream (there is a FIXME there!!!), and in doDecode(COSName filterName, int filterIndex), unFilteredStream.getLength() is used, which returns the expectedLength.

was: The file of PDFBOX-2048 has a wrong font length: it is 4412 in the PDF but is really about 27350. This wrong length is used to read the font, which results in further trouble. The problem is that the wrong length is passed to createFilteredStream() instead of just calling it without parameters. In cosStream.doDecode(), unFilteredStream = filteredStream (there is a FIXME there!!!), and in doDecode(COSName filterName, int filterIndex), unFilteredStream.getLength() is used, which returns the expectedLength.

Wrong stream length used for truetype font
Key: PDFBOX-2296
URL: https://issues.apache.org/jira/browse/PDFBOX-2296
Project: PDFBox
Issue Type: Bug
Components: Parsing
Affects Versions: 2.0.0
Reporter: Tilman Hausherr