[jira] [Commented] (PDFBOX-3732) IllegalArgumentException when refreshing an appearance and no font resources are defined
[ https://issues.apache.org/jira/browse/PDFBOX-3732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005940#comment-16005940 ]

ASF subversion and git services commented on PDFBOX-3732:

Commit 1794785 from [~msahyoun] in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1794785 ]

PDFBOX-3732: ensure default entries for /DA and /DR when accessing an AcroForm and the form doesn't contain these

> IllegalArgumentException when refreshing an appearance and no font resources are defined
>
>                 Key: PDFBOX-3732
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3732
>             Project: PDFBox
>          Issue Type: Bug
>          Components: AcroForm
>    Affects Versions: 2.0.5
>            Reporter: simon steiner
>             Fix For: 2.0.6, 3.0.0
>
>         Attachments: out.pdf, out-reader.pdf, PDFBOX3732-minimal.pdf, PDFBOX3732-minimal-reader.pdf, refreshAppearances.patch
>
> PDDocument doc = PDDocument.load(new File("out.pdf"));
> doc.getDocumentCatalog().getAcroForm().setNeedAppearances(false);
> doc.getDocumentCatalog().getAcroForm().refreshAppearances();
> doc.save("pdfbox.pdf");
> doc.close();
> Exception in thread "main" java.lang.IllegalArgumentException: /DR is a required entry
>     at org.apache.pdfbox.pdmodel.interactive.form.PDDefaultAppearanceString.<init>(PDDefaultAppearanceString.java:82)

--
This message was sent by Atlassian JIRA (v6.3.15#6346)

To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org
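The commit message for PDFBOX-3732 describes filling in default /DA and /DR entries when a form lacks them. A minimal sketch of that idea follows, assuming a plain `Map` as a stand-in for the AcroForm dictionary. The class `AcroFormDefaults`, the method `ensureDefaults`, and the default appearance string are hypothetical illustrations and not the PDFBox implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class AcroFormDefaults {
    // Hypothetical stand-in for an AcroForm dictionary; not a PDFBox type.
    static Map<String, Object> ensureDefaults(Map<String, Object> acroForm) {
        // Only fill in entries that are absent, so values read from the file are kept.
        acroForm.putIfAbsent("DA", "/Helv 0 Tf 0 g"); // assumed illustrative default
        acroForm.putIfAbsent("DR", new HashMap<String, Object>()); // empty resources
        return acroForm;
    }

    public static void main(String[] args) {
        Map<String, Object> form = new HashMap<>(); // form without /DA and /DR
        ensureDefaults(form);
        System.out.println(form.containsKey("DA") && form.containsKey("DR"));
    }
}
```

With defaults in place before any appearance is generated, a later lookup of /DR would find an entry instead of hitting the "required entry" check shown in the report.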
[jira] [Commented] (PDFBOX-3788) java.lang.RuntimeException: java.io.IOException: Catalog cannot be found
[ https://issues.apache.org/jira/browse/PDFBOX-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005924#comment-16005924 ]

Andreas Lehmkühler commented on PDFBOX-3788:

I already had this feeling when I went to bed yesterday that my changes might be a bad idea ... however, I've reverted my changes. Thanks [~tilman] for the pointer.

> java.lang.RuntimeException: java.io.IOException: Catalog cannot be found
>
>                 Key: PDFBOX-3788
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3788
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.6
>            Reporter: Andreas Lehmkühler
>            Assignee: Andreas Lehmkühler
>              Labels: regression
>             Fix For: 2.0.6, 3.0.0
>
>         Attachments: genko_oc_shiryo1.pdf, YVFDWHF767TEYTT7IVFSLUIJTDF3YP57.pdf
>
> This file was parsed in 2.0.5 but no longer now:
> {code}
> Caused by: java.io.IOException: Catalog cannot be found
>     org.apache.pdfbox.cos.COSDocument.getCatalog(COSDocument.java:373)
>     org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:238)
>     org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:310)
>     org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1000)
>     org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:938)
>     org.apache.pdfbox.debugger.PDFDebugger.parseDocument(PDFDebugger.java:1288)
>     org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1209)
>     org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1194)
> {code}
[jira] [Commented] (PDFBOX-3732) IllegalArgumentException when refreshing an appearance and no font resources are defined
[ https://issues.apache.org/jira/browse/PDFBOX-3732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005921#comment-16005921 ]

ASF subversion and git services commented on PDFBOX-3732:

Commit 1794784 from [~msahyoun] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1794784 ]

PDFBOX-3732: ensure default entries for /DA and /DR when accessing an AcroForm and the form doesn't contain these
[jira] [Commented] (PDFBOX-3788) java.lang.RuntimeException: java.io.IOException: Catalog cannot be found
[ https://issues.apache.org/jira/browse/PDFBOX-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005919#comment-16005919 ]

ASF subversion and git services commented on PDFBOX-3788:

Commit 1794783 from [~lehmi] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1794783 ]

PDFBOX-3788: revert former changes due to a regression
[jira] [Commented] (PDFBOX-3788) java.lang.RuntimeException: java.io.IOException: Catalog cannot be found
[ https://issues.apache.org/jira/browse/PDFBOX-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005918#comment-16005918 ]

ASF subversion and git services commented on PDFBOX-3788:

Commit 1794782 from [~lehmi] in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1794782 ]

PDFBOX-3788: revert former changes due to a regression
[jira] [Resolved] (PDFBOX-3789) Some text missing in rendering
[ https://issues.apache.org/jira/browse/PDFBOX-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr resolved PDFBOX-3789.

       Resolution: Fixed
    Fix Version/s: 3.0.0
                   2.0.6

> Some text missing in rendering
>
>                 Key: PDFBOX-3789
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3789
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>              Labels: regression
>             Fix For: 2.0.6, 3.0.0
>
>         Attachments: PDFBOX-3789-4KBI7ITHG6MSXR7DOTKZX6DQZJ5UF64V.pdf, PDFBOX-3789-4KBI7ITHG6MSXR7DOTKZX6DQZJ5UF64V_unc.pdf
>
> The text in the table is missing, it was there in 2.0.5. I suspect it is due to the missing width (Adobe mentions it). The file is truncated but is parsed; the error happens also when saving the parsed file and rendering that one.
[jira] [Commented] (PDFBOX-3789) Some text missing in rendering
[ https://issues.apache.org/jira/browse/PDFBOX-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005514#comment-16005514 ]

Tilman Hausherr commented on PDFBOX-3789:

I could also have rewritten containsKey() to return false if the entry is null but this isn't the same.

> Some text missing in rendering
>
>                 Key: PDFBOX-3789
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3789
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>              Labels: regression
>
>         Attachments: PDFBOX-3789-4KBI7ITHG6MSXR7DOTKZX6DQZJ5UF64V.pdf, PDFBOX-3789-4KBI7ITHG6MSXR7DOTKZX6DQZJ5UF64V_unc.pdf
>
> The text in the table is missing, it was there in 2.0.5. I suspect it is due to the missing width (Adobe mentions it). The file is truncated but is parsed; the error happens also when saving the parsed file and rendering that one.
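The distinction drawn in the comment above, between a key that is present and a key that maps to a usable value, can be illustrated with a plain `java.util.HashMap`. This is only a sketch: the map stands in for a font dictionary, the key is spelled `Widths` here, and `hasUsableEntry` is a hypothetical helper, not a PDFBox method.

```java
import java.util.HashMap;
import java.util.Map;

public class NullEntryCheck {
    // True only when the key is present AND maps to a non-null value.
    static boolean hasUsableEntry(Map<String, Object> dict, String key) {
        return dict.get(key) != null;
    }

    public static void main(String[] args) {
        Map<String, Object> font = new HashMap<>(); // stand-in for a font dictionary
        font.put("Widths", null);                   // key present, value null

        System.out.println(font.containsKey("Widths"));      // prints "true"
        System.out.println(hasUsableEntry(font, "Widths"));   // prints "false"
        System.out.println(hasUsableEntry(font, "Missing"));  // prints "false"
    }
}
```

Checking the value at the call site treats a null entry like a missing one without changing what `containsKey()` means for every other caller, which appears to be the trade-off the comment alludes to.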
[jira] [Updated] (PDFBOX-3789) Some text missing in rendering
[ https://issues.apache.org/jira/browse/PDFBOX-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr updated PDFBOX-3789:

    Affects Version/s:     (was: 2.0.6)
[jira] [Commented] (PDFBOX-3789) Some text missing in rendering
[ https://issues.apache.org/jira/browse/PDFBOX-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005512#comment-16005512 ]

ASF subversion and git services commented on PDFBOX-3789:

Commit 1794767 from [~tilman] in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1794767 ]

PDFBOX-3789: treat /WIDTHS with null entry as if /WIDTHS was missing

> Some text missing in rendering
>
>                 Key: PDFBOX-3789
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3789
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 2.0.6
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>              Labels: regression
>
>         Attachments: PDFBOX-3789-4KBI7ITHG6MSXR7DOTKZX6DQZJ5UF64V.pdf, PDFBOX-3789-4KBI7ITHG6MSXR7DOTKZX6DQZJ5UF64V_unc.pdf
>
> The text in the table is missing, it was there in 2.0.5. I suspect it is due to the missing width (Adobe mentions it). The file is truncated but is parsed; the error happens also when saving the parsed file and rendering that one.
[jira] [Commented] (PDFBOX-3789) Some text missing in rendering
[ https://issues.apache.org/jira/browse/PDFBOX-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005511#comment-16005511 ]

ASF subversion and git services commented on PDFBOX-3789:

Commit 1794766 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1794766 ]

PDFBOX-3789: treat /WIDTHS with null entry as if /WIDTHS was missing
[jira] [Commented] (PDFBOX-3788) java.lang.RuntimeException: java.io.IOException: Catalog cannot be found
[ https://issues.apache.org/jira/browse/PDFBOX-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005498#comment-16005498 ]

Tilman Hausherr commented on PDFBOX-3788:

More problems:
- PDFBOX-3714-2.pdf, the signature can no longer be seen
- PDFBOX-2990 and PDFBOX-3369 same exception
[jira] [Updated] (PDFBOX-3788) java.lang.RuntimeException: java.io.IOException: Catalog cannot be found
[ https://issues.apache.org/jira/browse/PDFBOX-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr updated PDFBOX-3788:

    Attachment: genko_oc_shiryo1.pdf
[jira] [Reopened] (PDFBOX-3788) java.lang.RuntimeException: java.io.IOException: Catalog cannot be found
[ https://issues.apache.org/jira/browse/PDFBOX-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr reopened PDFBOX-3788:

{code}
java.io.IOException: Missing root object specification in trailer.
    org.apache.pdfbox.pdfparser.COSParser.parseTrailerValuesDynamically(COSParser.java:2128)
    org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:227)
    org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:276)
    org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1005)
{code}
with the attached file

> java.lang.RuntimeException: java.io.IOException: Catalog cannot be found
>
>                 Key: PDFBOX-3788
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3788
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.6
>            Reporter: Andreas Lehmkühler
>            Assignee: Andreas Lehmkühler
>              Labels: regression
>             Fix For: 2.0.6, 3.0.0
>
>         Attachments: YVFDWHF767TEYTT7IVFSLUIJTDF3YP57.pdf
>
> This file was parsed in 2.0.5 but no longer now:
> {code}
> Caused by: java.io.IOException: Catalog cannot be found
>     org.apache.pdfbox.cos.COSDocument.getCatalog(COSDocument.java:373)
>     org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:238)
>     org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:310)
>     org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1000)
>     org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:938)
>     org.apache.pdfbox.debugger.PDFDebugger.parseDocument(PDFDebugger.java:1288)
>     org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1209)
>     org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1194)
> {code}
[jira] [Commented] (PDFBOX-3783) java.io.IOException: Expected root dictionary, but got this: COSNull{}
[ https://issues.apache.org/jira/browse/PDFBOX-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005462#comment-16005462 ]

ASF subversion and git services commented on PDFBOX-3783:

Commit 1794764 from [~lehmi] in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1794764 ]

PDFBOX-3783: removed misleading comment

> java.io.IOException: Expected root dictionary, but got this: COSNull{}
>
>                 Key: PDFBOX-3783
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3783
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.6
>            Reporter: Tilman Hausherr
>            Assignee: Andreas Lehmkühler
>              Labels: regression
>             Fix For: 2.0.6, 3.0.0
>
>         Attachments: PDFBOX-3783-72GLBIGUC6LB46ELZFBARRJTLN4RBSQM.pdf
>
> This file was parsed in 2.0.5 but no longer now:
> {code}
> java.io.IOException: Expected root dictionary, but got this: COSNull{}
>     org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:230)
>     org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:276)
>     org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1005)
>     org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:943)
>     org.apache.pdfbox.debugger.PDFDebugger.parseDocument(PDFDebugger.java:1375)
>     org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1293)
>     org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1276)
>     org.apache.pdfbox.debugger.PDFDebugger.main(PDFDebugger.java:262)
>     org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:85)
> {code}
[jira] [Commented] (PDFBOX-3783) java.io.IOException: Expected root dictionary, but got this: COSNull{}
[ https://issues.apache.org/jira/browse/PDFBOX-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005443#comment-16005443 ]

ASF subversion and git services commented on PDFBOX-3783:

Commit 1794762 from [~lehmi] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1794762 ]

PDFBOX-3783: removed misleading comment
[jira] [Updated] (PDFBOX-3789) Some text missing in rendering
[ https://issues.apache.org/jira/browse/PDFBOX-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr updated PDFBOX-3789:

    Component/s: PDModel
[jira] [Updated] (PDFBOX-3789) Some text missing in rendering
[ https://issues.apache.org/jira/browse/PDFBOX-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr updated PDFBOX-3789:

    Description: The text in the table is missing, it was there in 2.0.5. I suspect it is due to the missing width (Adobe mentions it). The file is truncated but is parsed; the error happens also when saving the parsed file and rendering that one.  (was: The text in the table is missing. I suspect it is due to the missing width (Adobe mentions it). The file is truncated but is parsed; the error happens also when saving the parsed file and rendering that one.)

> Some text missing in rendering
>
>                 Key: PDFBOX-3789
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3789
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.6
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>              Labels: regression
>
>         Attachments: PDFBOX-3789-4KBI7ITHG6MSXR7DOTKZX6DQZJ5UF64V.pdf, PDFBOX-3789-4KBI7ITHG6MSXR7DOTKZX6DQZJ5UF64V_unc.pdf
>
> The text in the table is missing, it was there in 2.0.5. I suspect it is due to the missing width (Adobe mentions it). The file is truncated but is parsed; the error happens also when saving the parsed file and rendering that one.
[jira] [Updated] (PDFBOX-3789) Some text missing in rendering
[ https://issues.apache.org/jira/browse/PDFBOX-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr updated PDFBOX-3789:

    Labels: regression  (was: )

> Some text missing in rendering
>
>                 Key: PDFBOX-3789
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3789
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.6
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>              Labels: regression
>
>         Attachments: PDFBOX-3789-4KBI7ITHG6MSXR7DOTKZX6DQZJ5UF64V.pdf, PDFBOX-3789-4KBI7ITHG6MSXR7DOTKZX6DQZJ5UF64V_unc.pdf
>
> The text in the table is missing. I suspect it is due to the missing width (Adobe mentions it). The file is truncated but is parsed; the error happens also when saving the parsed file and rendering that one.
[jira] [Updated] (PDFBOX-3789) Some text missing in rendering
[ https://issues.apache.org/jira/browse/PDFBOX-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr updated PDFBOX-3789:

    Attachment: PDFBOX-3789-4KBI7ITHG6MSXR7DOTKZX6DQZJ5UF64V.pdf
                PDFBOX-3789-4KBI7ITHG6MSXR7DOTKZX6DQZJ5UF64V_unc.pdf

> Some text missing in rendering
>
>                 Key: PDFBOX-3789
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3789
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.6
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>
>         Attachments: PDFBOX-3789-4KBI7ITHG6MSXR7DOTKZX6DQZJ5UF64V.pdf, PDFBOX-3789-4KBI7ITHG6MSXR7DOTKZX6DQZJ5UF64V_unc.pdf
>
> The text in the table is missing. I suspect it is due to the missing width (Adobe mentions it). The file is truncated but is parsed; the error happens also when saving the parsed file and rendering that one.
[jira] [Created] (PDFBOX-3789) Some text missing in rendering
Tilman Hausherr created PDFBOX-3789:

             Summary: Some text missing in rendering
                 Key: PDFBOX-3789
                 URL: https://issues.apache.org/jira/browse/PDFBOX-3789
             Project: PDFBox
          Issue Type: Bug
    Affects Versions: 2.0.6
            Reporter: Tilman Hausherr
            Assignee: Tilman Hausherr

The text in the table is missing. I suspect it is due to the missing width (Adobe mentions it). The file is truncated but is parsed; the error happens also when saving the parsed file and rendering that one.
[jira] [Resolved] (PDFBOX-3788) java.lang.RuntimeException: java.io.IOException: Catalog cannot be found
[ https://issues.apache.org/jira/browse/PDFBOX-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler resolved PDFBOX-3788.

    Resolution: Fixed

I've removed the repair mechanism which was triggered during parsing the xref information. Now, an IOException is thrown and the rebuildTrailer mechanism is called.
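The control flow described in the resolution above (no repair inside xref parsing; an IOException leads to rebuilding the trailer) can be sketched roughly as follows. All names in this block (`TrailerFallback`, `parseXref`, `rebuildTrailer`, `loadTrailer`) and the returned strings are hypothetical illustrations and do not reproduce the PDFBox parser code.

```java
import java.io.IOException;

public class TrailerFallback {
    // Hypothetical xref parsing step: fails fast instead of repairing in place.
    static String parseXref(boolean xrefIntact) throws IOException {
        if (!xrefIntact) {
            throw new IOException("xref information could not be parsed");
        }
        return "trailer from xref";
    }

    // Hypothetical fallback that reconstructs the trailer some other way.
    static String rebuildTrailer() {
        return "trailer rebuilt by scanning objects";
    }

    static String loadTrailer(boolean xrefIntact) {
        try {
            return parseXref(xrefIntact);
        } catch (IOException e) {
            // Single recovery path: any xref failure leads to a full rebuild.
            return rebuildTrailer();
        }
    }

    public static void main(String[] args) {
        System.out.println(loadTrailer(true));
        System.out.println(loadTrailer(false));
    }
}
```

Routing every failure through one rebuild path keeps the recovery logic in a single place, at the cost of discarding whatever the xref parsing had partially recovered.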
[jira] [Commented] (PDFBOX-3788) java.lang.RuntimeException: java.io.IOException: Catalog cannot be found
[ https://issues.apache.org/jira/browse/PDFBOX-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005305#comment-16005305 ]

ASF subversion and git services commented on PDFBOX-3788:

Commit 1794754 from [~lehmi] in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1794754 ]

PDFBOX-3788: optimized debug message
[jira] [Commented] (PDFBOX-3788) java.lang.RuntimeException: java.io.IOException: Catalog cannot be found
[ https://issues.apache.org/jira/browse/PDFBOX-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005304#comment-16005304 ] ASF subversion and git services commented on PDFBOX-3788: Commit 1794753 from [~lehmi] in branch 'pdfbox/trunk' [ https://svn.apache.org/r1794753 ] PDFBOX-3788: optimized debug message
[jira] [Commented] (PDFBOX-3788) java.lang.RuntimeException: java.io.IOException: Catalog cannot be found
[ https://issues.apache.org/jira/browse/PDFBOX-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005294#comment-16005294 ] ASF subversion and git services commented on PDFBOX-3788: Commit 1794751 from [~lehmi] in branch 'pdfbox/branches/2.0' [ https://svn.apache.org/r1794751 ] PDFBOX-3788: remove repair mechanism when parsing the xref information, trigger rebuilding the trailer instead
[jira] [Updated] (PDFBOX-3788) java.lang.RuntimeException: java.io.IOException: Catalog cannot be found
[ https://issues.apache.org/jira/browse/PDFBOX-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler updated PDFBOX-3788: Fix Version/s: 3.0.0, 2.0.6
[jira] [Commented] (PDFBOX-3788) java.lang.RuntimeException: java.io.IOException: Catalog cannot be found
[ https://issues.apache.org/jira/browse/PDFBOX-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005284#comment-16005284 ] ASF subversion and git services commented on PDFBOX-3788: Commit 1794750 from [~lehmi] in branch 'pdfbox/trunk' [ https://svn.apache.org/r1794750 ] PDFBOX-3788: remove repair mechanism when parsing the xref information, trigger rebuilding the trailer instead
Re: 2.0.6 release ?
On 10.05.2017 at 17:12, Tilman Hausherr wrote:
> Thanks for the test... the sum is still negative, but if we'd ignore the truncated files I bet we'd be positive. I have downloaded a few of the regressions but won't create issues this time, as yesterday's turned out to be duplicates. I'll wait for Andreas' next commit and will create issues only if these aren't solved.
I guess the new exceptions aren't related. I've already created an issue for the first one, PDFBOX-3788. I didn't have a chance to look at the second file. I just tested my fix for the first one and it still fails.
> @Andreas - ping me if you didn't keep the "secret" URL.
It isn't that secret, as Tim posted it somewhere in this thread ...
> Some misc thoughts...
> 039800.pdf: "refinery's" is a different token than refinery. Shouldn't "refinery's" be three tokens? I mention this because refinery is probably in a dictionary.
> Some differences are because of a different treatment of the space in bad fonts. Some were improved, and some now look like this "C I T I E S W I T H O U T D R U G S". There is an open issue about these. It is tricky because if we treat these like 1 word, we'd also lose spaces where we don't want to.
> I can't find commoncrawl2/5N/5NSKV4CTVY4KT7R2FGY4XJDIK4PRLA4Z. I used http://XXX.XXX.XXX.XXX/docs/commoncrawl2/5N/5NSKV4CTVY4KT7R2FGY4XJDIK4PRLA4Z
> Tilman
> On 10.05.2017 at 11:42, Allison, Timothy B. wrote:
>> Haven't had a chance to look. Reports are here: http://162.242.228.174/reports/reports_pdfbox_2_0_6_20170510.tar.gz
- To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Commented] (PDFBOX-3788) java.lang.RuntimeException: java.io.IOException: Catalog cannot be found
[ https://issues.apache.org/jira/browse/PDFBOX-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005178#comment-16005178 ] Andreas Lehmkühler commented on PDFBOX-3788: I've already found the cause and a possible solution, but some of the isator tests fail. > java.lang.RuntimeException: java.io.IOException: Catalog cannot be found > > > Key: PDFBOX-3788 > URL: https://issues.apache.org/jira/browse/PDFBOX-3788 > Project: PDFBox > Issue Type: Bug > Components: Parsing >Affects Versions: 2.0.6 >Reporter: Andreas Lehmkühler >Assignee: Andreas Lehmkühler > Labels: regression > Attachments: YVFDWHF767TEYTT7IVFSLUIJTDF3YP57.pdf > > > This file was parsed in 2.0.5 but no longer now: > {code} > Caused by: java.io.IOException: Catalog cannot be found > org.apache.pdfbox.cos.COSDocument.getCatalog(COSDocument.java:373) > org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:238) > org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:310) > org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1000) > org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:938) > > org.apache.pdfbox.debugger.PDFDebugger.parseDocument(PDFDebugger.java:1288) > org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1209) > org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1194) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Created] (PDFBOX-3788) java.lang.RuntimeException: java.io.IOException: Catalog cannot be found
Andreas Lehmkühler created PDFBOX-3788:
--
Summary: java.lang.RuntimeException: java.io.IOException: Catalog cannot be found
Key: PDFBOX-3788
URL: https://issues.apache.org/jira/browse/PDFBOX-3788
Project: PDFBox
Issue Type: Bug
Components: Parsing
Affects Versions: 2.0.6
Reporter: Andreas Lehmkühler
Assignee: Andreas Lehmkühler
Attachments: YVFDWHF767TEYTT7IVFSLUIJTDF3YP57.pdf
This file was parsed in 2.0.5 but no longer now:
{code}
Caused by: java.io.IOException: Catalog cannot be found
org.apache.pdfbox.cos.COSDocument.getCatalog(COSDocument.java:373)
org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:238)
org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:310)
org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1000)
org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:938)
org.apache.pdfbox.debugger.PDFDebugger.parseDocument(PDFDebugger.java:1288)
org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1209)
org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1194)
{code}
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
- To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Commented] (PDFBOX-3782) Text extraction loses whitespace
[ https://issues.apache.org/jira/browse/PDFBOX-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16004913#comment-16004913 ] Tilman Hausherr commented on PDFBOX-3782: - That problem is also with Adobe Reader: {code} such as“BC/AD”,“a.m./p.m.”,“FBI”, and“CD” {code} The spaces are missing because some of the glyphs have larger width than what is black. You can see this by marking the quote before FBI. In theory, we could calculate our own widths from the font paths instead of trusting the fonts. But this might bring some new surprises. (And it would be slower) > Text extraction loses whitespace > > > Key: PDFBOX-3782 > URL: https://issues.apache.org/jira/browse/PDFBOX-3782 > Project: PDFBox > Issue Type: Bug > Components: Text extraction >Affects Versions: 2.0.4, 2.0.5, 2.0.6 > Environment: Java/Tika >Reporter: Tony Bray >Priority: Minor > Attachments: PDFBOX-3782-reduced.pdf, Test doc - Japanese writing > system - Kanji Hiragana Katakana.pdf, Test doc - Japanese writing system - > Kanji Hiragana Katakana.txt > > > I have a PDF document that I am using Tika/PDFBox to extract the content. In > several areas, the content extracted loses the whitespace, causing a > tokenization problem for indexing/searching. > I have attached the original document and the text output. If you search > (Ctrl+f) the text document for "Another example". Here you will see no space > after "is" and the Japanese text. The same issue shows for > "whichmeans"eraser"" at the end of the sentence. > Another example is消しゴム (Rō- maji: keshigomu) whichmeans“eraser” > I get the warning "WARNING: No Unicode mapping for CID+0 (0) in font > RGOFPX+IPAexMincho" during extraction but have been unable to find any > information on it. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Commented] (PDFBOX-3782) Text extraction loses whitespace
[ https://issues.apache.org/jira/browse/PDFBOX-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16004877#comment-16004877 ] Tony Bray commented on PDFBOX-3782: --- It seems to be around the quotes and other punctuation. Here's an example sentence with spacing not honored: To a lesser extent, modern written Japanese also uses acronyms from the Latin alphabet, for example in terms such as“BC/AD”,“a.m./p.m.”,“FBI”, and“CD” . > Text extraction loses whitespace > > > Key: PDFBOX-3782 > URL: https://issues.apache.org/jira/browse/PDFBOX-3782 > Project: PDFBox > Issue Type: Bug > Components: Text extraction >Affects Versions: 2.0.4, 2.0.5, 2.0.6 > Environment: Java/Tika >Reporter: Tony Bray >Priority: Minor > Attachments: PDFBOX-3782-reduced.pdf, Test doc - Japanese writing > system - Kanji Hiragana Katakana.pdf, Test doc - Japanese writing system - > Kanji Hiragana Katakana.txt > > > I have a PDF document that I am using Tika/PDFBox to extract the content. In > several areas, the content extracted loses the whitespace, causing a > tokenization problem for indexing/searching. > I have attached the original document and the text output. If you search > (Ctrl+f) the text document for "Another example". Here you will see no space > after "is" and the Japanese text. The same issue shows for > "whichmeans"eraser"" at the end of the sentence. > Another example is消しゴム (Rō- maji: keshigomu) whichmeans“eraser” > I get the warning "WARNING: No Unicode mapping for CID+0 (0) in font > RGOFPX+IPAexMincho" during extraction but have been unable to find any > information on it. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Commented] (PDFBOX-3782) Text extraction loses whitespace
[ https://issues.apache.org/jira/browse/PDFBOX-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16004860#comment-16004860 ] Tilman Hausherr commented on PDFBOX-3782: - Can you tell what part "resisted" the extraction? > Text extraction loses whitespace > > > Key: PDFBOX-3782 > URL: https://issues.apache.org/jira/browse/PDFBOX-3782 > Project: PDFBox > Issue Type: Bug > Components: Text extraction >Affects Versions: 2.0.4, 2.0.5, 2.0.6 > Environment: Java/Tika >Reporter: Tony Bray >Priority: Minor > Attachments: PDFBOX-3782-reduced.pdf, Test doc - Japanese writing > system - Kanji Hiragana Katakana.pdf, Test doc - Japanese writing system - > Kanji Hiragana Katakana.txt > > > I have a PDF document that I am using Tika/PDFBox to extract the content. In > several areas, the content extracted loses the whitespace, causing a > tokenization problem for indexing/searching. > I have attached the original document and the text output. If you search > (Ctrl+f) the text document for "Another example". Here you will see no space > after "is" and the Japanese text. The same issue shows for > "whichmeans"eraser"" at the end of the sentence. > Another example is消しゴム (Rō- maji: keshigomu) whichmeans“eraser” > I get the warning "WARNING: No Unicode mapping for CID+0 (0) in font > RGOFPX+IPAexMincho" during extraction but have been unable to find any > information on it. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Comment Edited] (PDFBOX-3782) Text extraction loses whitespace
[ https://issues.apache.org/jira/browse/PDFBOX-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16004860#comment-16004860 ] Tilman Hausherr edited comment on PDFBOX-3782 at 5/10/17 3:31 PM: -- Can you tell what part "resisted" the extraction with the modified parameter? was (Author: tilman): Can you tell what part "resisted" the extraction? > Text extraction loses whitespace > > > Key: PDFBOX-3782 > URL: https://issues.apache.org/jira/browse/PDFBOX-3782 > Project: PDFBox > Issue Type: Bug > Components: Text extraction >Affects Versions: 2.0.4, 2.0.5, 2.0.6 > Environment: Java/Tika >Reporter: Tony Bray >Priority: Minor > Attachments: PDFBOX-3782-reduced.pdf, Test doc - Japanese writing > system - Kanji Hiragana Katakana.pdf, Test doc - Japanese writing system - > Kanji Hiragana Katakana.txt > > > I have a PDF document that I am using Tika/PDFBox to extract the content. In > several areas, the content extracted loses the whitespace, causing a > tokenization problem for indexing/searching. > I have attached the original document and the text output. If you search > (Ctrl+f) the text document for "Another example". Here you will see no space > after "is" and the Japanese text. The same issue shows for > "whichmeans"eraser"" at the end of the sentence. > Another example is消しゴム (Rō- maji: keshigomu) whichmeans“eraser” > I get the warning "WARNING: No Unicode mapping for CID+0 (0) in font > RGOFPX+IPAexMincho" during extraction but have been unable to find any > information on it. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Commented] (PDFBOX-3782) Text extraction loses whitespace
[ https://issues.apache.org/jira/browse/PDFBOX-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16004859#comment-16004859 ] Tilman Hausherr commented on PDFBOX-3782: - Some background: in this font, the space width is not defined, so a default width is used, which is 600 in this font. That is unusually large: the "0" has a width of 500, and in fonts the space usually has a width of around 250. Because of that large space width, PDFBox assumes that the gap between glyphs isn't a space. Now you may argue: "I don't care, Adobe does it correctly so I want it here too". Our algorithm does a lot of "magic" and changing it is risky, because it may degrade files that were good... I don't have any good idea right now, but will keep this open and add your file to my test set.
-- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
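The effect of the space width on word breaks described in the comment above can be illustrated with a small, self-contained Java sketch. It is a deliberately simplified model and not PDFBox's actual algorithm: the class `SpacingSketch`, the `Glyph` type, the glyph positions, the tolerance values, and the rule "insert a space when the gap exceeds space width times tolerance" are all assumptions made for illustration. Only the widths 600 and 250 come from the comment.

```java
import java.util.Arrays;
import java.util.List;

/** Simplified model of gap-based space detection (NOT PDFBox's real heuristic). */
final class SpacingSketch {

    /** Hypothetical glyph run: its text, left edge and width in font units. */
    static final class Glyph {
        final String text;
        final float x;
        final float width;

        Glyph(String text, float x, float width) {
            this.text = text;
            this.x = x;
            this.width = width;
        }
    }

    /**
     * Joins the runs, inserting a space whenever the horizontal gap to the
     * previous run is larger than spaceWidth * tolerance (assumed rule).
     */
    static String extract(List<Glyph> glyphs, float spaceWidth, float tolerance) {
        StringBuilder out = new StringBuilder();
        boolean first = true;
        float previousEnd = 0f;
        for (Glyph g : glyphs) {
            if (!first && g.x - previousEnd > spaceWidth * tolerance) {
                out.append(' ');
            }
            out.append(g.text);
            previousEnd = g.x + g.width;
            first = false;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up layout: "is" ends at 800, the next run starts at 1000 (gap of 200).
        List<Glyph> line = Arrays.asList(
                new Glyph("is", 0f, 800f),
                new Glyph("keshigomu", 1000f, 4000f));

        System.out.println(extract(line, 600f, 0.5f)); // default-like space width of 600
        System.out.println(extract(line, 250f, 0.5f)); // typical space width of 250
        System.out.println(extract(line, 600f, 0.3f)); // width 600, lower tolerance
    }
}
```

With these made-up positions, the 600-unit space width swallows the 200-unit gap, while a 250-unit width or a lower tolerance recovers it. That matches the report that adjusting the spacing tolerance helps but cannot fix every case, since any fixed threshold will misjudge some gaps.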
[jira] [Commented] (PDFBOX-3782) Text extraction loses whitespace
[ https://issues.apache.org/jira/browse/PDFBOX-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16004855#comment-16004855 ] Tony Bray commented on PDFBOX-3782: --- Hi and thank you. I tried extraction with the "setSpacingTolerance" and it helped but was not 100%. > Text extraction loses whitespace > > > Key: PDFBOX-3782 > URL: https://issues.apache.org/jira/browse/PDFBOX-3782 > Project: PDFBox > Issue Type: Bug > Components: Text extraction >Affects Versions: 2.0.4, 2.0.5, 2.0.6 > Environment: Java/Tika >Reporter: Tony Bray >Priority: Minor > Attachments: PDFBOX-3782-reduced.pdf, Test doc - Japanese writing > system - Kanji Hiragana Katakana.pdf, Test doc - Japanese writing system - > Kanji Hiragana Katakana.txt > > > I have a PDF document that I am using Tika/PDFBox to extract the content. In > several areas, the content extracted loses the whitespace, causing a > tokenization problem for indexing/searching. > I have attached the original document and the text output. If you search > (Ctrl+f) the text document for "Another example". Here you will see no space > after "is" and the Japanese text. The same issue shows for > "whichmeans"eraser"" at the end of the sentence. > Another example is消しゴム (Rō- maji: keshigomu) whichmeans“eraser” > I get the warning "WARNING: No Unicode mapping for CID+0 (0) in font > RGOFPX+IPAexMincho" during extraction but have been unable to find any > information on it. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Updated] (PDFBOX-3782) Text extraction loses whitespace
[ https://issues.apache.org/jira/browse/PDFBOX-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-3782: Affects Version/s: 2.0.6, 2.0.5
[jira] [Updated] (PDFBOX-3782) Text extraction loses whitespace
[ https://issues.apache.org/jira/browse/PDFBOX-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-3782: Summary: Text extraction loses whitespace (was: WARNING: No Unicode mapping for CID+0 (0) in font RGOFPX+IPAexMincho)
[jira] [Updated] (PDFBOX-3782) WARNING: No Unicode mapping for CID+0 (0) in font RGOFPX+IPAexMincho
[ https://issues.apache.org/jira/browse/PDFBOX-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-3782: Component/s: (was: Parsing) Text extraction > WARNING: No Unicode mapping for CID+0 (0) in font RGOFPX+IPAexMincho > > > Key: PDFBOX-3782 > URL: https://issues.apache.org/jira/browse/PDFBOX-3782 > Project: PDFBox > Issue Type: Bug > Components: Text extraction >Affects Versions: 2.0.4 > Environment: Java/Tika >Reporter: Tony Bray >Priority: Minor > Attachments: PDFBOX-3782-reduced.pdf, Test doc - Japanese writing > system - Kanji Hiragana Katakana.pdf, Test doc - Japanese writing system - > Kanji Hiragana Katakana.txt > > > I have a PDF document that I am using Tika/PDFBox to extract the content. In > several areas, the content extracted loses the whitespace, causing a > tokenization problem for indexing/searching. > I have attached the original document and the text output. If you search > (Ctrl+f) the text document for "Another example". Here you will see no space > after "is" and the Japanese text. The same issue shows for > "whichmeans"eraser"" at the end of the sentence. > Another example is消しゴム (Rō- maji: keshigomu) whichmeans“eraser” > I get the warning "WARNING: No Unicode mapping for CID+0 (0) in font > RGOFPX+IPAexMincho" during extraction but have been unable to find any > information on it. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
Re: 2.0.6 release ?
Thanks for the test... the sum is still negative, but if we'd ignore the truncated files I bet we'd be positive. I have downloaded a few of the regressions but won't create issues this time, as yesterday's turned out to be duplicates. I'll wait for Andreas' next commit and will create issues only if these aren't solved.
@Andreas - ping me if you didn't keep the "secret" URL.
Some misc thoughts...
039800.pdf: "refinery's" is a different token than refinery. Shouldn't "refinery's" be three tokens? I mention this because refinery is probably in a dictionary.
Some differences are because of a different treatment of the space in bad fonts. Some were improved, and some now look like this "C I T I E S W I T H O U T D R U G S". There is an open issue about these. It is tricky because if we treat these like 1 word, we'd also lose spaces where we don't want to.
I can't find commoncrawl2/5N/5NSKV4CTVY4KT7R2FGY4XJDIK4PRLA4Z. I used http://XXX.XXX.XXX.XXX/docs/commoncrawl2/5N/5NSKV4CTVY4KT7R2FGY4XJDIK4PRLA4Z
Tilman
On 10.05.2017 at 11:42, Allison, Timothy B. wrote:
> Haven't had a chance to look. Reports are here: http://162.242.228.174/reports/reports_pdfbox_2_0_6_20170510.tar.gz
- To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Commented] (PDFBOX-3316) Add comment to PDF
[ https://issues.apache.org/jira/browse/PDFBOX-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16004650#comment-16004650 ] Tilman Hausherr commented on PDFBOX-3316: - You can't use the parser, you would have to read the content stream yourself with PDPage.getContents(). In the future please use the user mailing list, as your question is only loosely related to the (closed) issue. > Add comment to PDF > -- > > Key: PDFBOX-3316 > URL: https://issues.apache.org/jira/browse/PDFBOX-3316 > Project: PDFBox > Issue Type: Improvement > Components: Rendering >Affects Versions: 2.0.0, 2.0.1, 2.0.2, 3.0.0 >Reporter: Jerrol Etheredge >Assignee: Tilman Hausherr >Priority: Minor > Fix For: 2.0.2, 3.0.0 > > > For our application we use some comment texts (prepended by a %) to mark > content and perform text replacement. > We currently use the appendRawCommands() method to add these, but since this > method has been marked as deprecated since version 2.0. > Would it be possible to add some like a addComment() method to > PDPageContentStream? > The code would probably be something trivial like: > public void addComment(String comment) { > output.write("%" + comment + "\n"); > } -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
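Reading the content stream yourself, as suggested in the comment above, could look roughly like the following sketch. It uses only the Java standard library and scans raw content stream bytes for `%` comments; in PDFBox 2.0 the bytes would come from the `InputStream` returned by `PDPage.getContents()`. The class name `ContentStreamComments` and its `find` method are hypothetical, and the scanner is a simplification: it skips `%` inside literal strings `( ... )`, but it does not handle inline image data (`BI ... ID ... EI`), which may contain arbitrary bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: collects % comments from raw content stream bytes. */
final class ContentStreamComments {

    static List<String> find(InputStream in) throws IOException {
        List<String> comments = new ArrayList<>();
        int depth = 0; // nesting depth of ( ... ) literal strings
        int b;
        while ((b = in.read()) != -1) {
            if (depth > 0) {
                if (b == '\\') {
                    in.read();          // skip the escaped byte, e.g. \( or \)
                } else if (b == '(') {
                    depth++;            // balanced parentheses may nest
                } else if (b == ')') {
                    depth--;
                }
            } else if (b == '(') {
                depth = 1;
            } else if (b == '%') {
                // A comment runs from % to the end of the line.
                StringBuilder text = new StringBuilder();
                while ((b = in.read()) != -1 && b != '\n' && b != '\r') {
                    text.append((char) b);
                }
                comments.add(text.toString());
            }
        }
        return comments;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream that PDPage.getContents() would return.
        String stream = "BT\n%marker-1\n(50% off \\) still text) Tj\nET %end";
        InputStream in = new ByteArrayInputStream(stream.getBytes(StandardCharsets.ISO_8859_1));
        System.out.println(find(in));
    }
}
```

A page whose content stream yields the marker comment could then be handled by the caller; how to remove or replace that stream is left out here because it depends on how the document was built.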
[jira] [Commented] (PDFBOX-3316) Add comment to PDF
[ https://issues.apache.org/jira/browse/PDFBOX-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16004644#comment-16004644 ] Peter Pinnau commented on PDFBOX-3316: -- Is there a way to read such comments with PDFBox? I tried the PDFStreamParser but it seems to ignore % comments since they are not tokens. I am looking for a way to identify content streams which contain a certain comment and remove those streams from the document.
RE: 2.0.6 release ?
> "Allison, Timothy B." wrote on 10 May 2017 at 11:42:
>
> Haven't had a chance to look. Reports are here:
> http://162.242.228.174/reports/reports_pdfbox_2_0_6_20170510.tar.gz
Thanks for running the report again.
I had a quick look and there are 2 new exceptions. It seems to be a regression. I'm going to dig deeper later when I'm back home.
Here are 2 sample PDFs, one for each exception:
commoncrawl2/YV/YVFDWHF767TEYTT7IVFSLUIJTDF3YP57
commoncrawl2/5W/5WULWDW54DAQ4ORVJSACEE2KCXQ7PQLL
Andreas
- To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Updated] (PDFBOX-3512) PDFDebugger Mac App
[ https://issues.apache.org/jira/browse/PDFBOX-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler updated PDFBOX-3512: --- Fix Version/s: (was: 2.0.6) 2.0.7 > PDFDebugger Mac App > --- > > Key: PDFBOX-3512 > URL: https://issues.apache.org/jira/browse/PDFBOX-3512 > Project: PDFBox > Issue Type: New Feature > Components: Utilities > Environment: Mac OS X >Reporter: John Hewson >Assignee: John Hewson >Priority: Minor > Fix For: 2.0.7, 3.0.0 > > > Using the PDFDebugger on the Mac isn't a great experience (see PDFBOX-3507). > We should package the jar into a native Mac .app bundle. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
RE: 2.0.6 release ?
Haven't had a chance to look. Reports are here: http://162.242.228.174/reports/reports_pdfbox_2_0_6_20170510.tar.gz