[ https://issues.apache.org/jira/browse/PDFBOX-5838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854934#comment-17854934 ]
Andreas Lehmkühler commented on PDFBOX-5838:
--------------------------------------------

Unfortunately, this is again one of those either/or cases :-(

I'm inclined to keep the current implementation: not only does the golden master, Adobe, follow the same rules as we do, but other tools do the same. I've checked pdftotext and evince. Both use poppler and produce the same output as Adobe and PDFBox.

> Text extraction garbled in this file, was OK in 3.0.2 / 2.0.31
> --------------------------------------------------------------
>
>                 Key: PDFBOX-5838
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5838
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 2.0.32, 3.0.3 PDFBox
>            Reporter: Tilman Hausherr
>            Priority: Major
>              Labels: regression
>         Attachments: OFLSV3YFD3TDOU4YZTL2QY745W53W3DW.pdf, PDFBOX-5838-0024320-reduced.pdf
>
>
> discovered in 2.0.32 regression tests