[ https://issues.apache.org/jira/browse/PDFBOX-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849518#comment-17849518 ]
Michael Klink commented on PDFBOX-5829:
---------------------------------------

Are you sure you really want to force-interpret every bit of junk that appears in a PDF where a number is expected? There is always a chance that you interpret it differently than originally intended.

The PDFBOX-3500 interpretations are already questionable: what makes you sure {{0.-262}} was meant to mean {{-0.262}} and not, e.g., two numbers {{0. -262}}? Similarly here, is {{-12.-1}} actually {{12.1}} (minus times minus), {{-12.1}} (overeager minus addition), {{-12. -1}} (two numbers), or something else entirely?

Yes, you can of course look at how Acrobat appears to interpret this and copy that behavior, but Acrobat is allowed to be a moving target in its interpretation of invalid data.

As an alternative, what about an option to register a listener that allows customizing the handling of invalid numbers (or other data structures with an invalid format, e.g. invalid dates)? PDFBox could ship with two implementations: a strict one that rejects all invalid input, and a more relaxed one that tries to repair it the same way Acrobat does.

> IOException: Error expected floating point numberactual='-12.-1'
> ----------------------------------------------------------------
>
>                 Key: PDFBOX-5829
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5829
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.31, 3.0.2 PDFBox, 4.0.0
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>            Priority: Major
>             Fix For: 2.0.32, 3.0.3 PDFBox, 4.0.0
>
>         Attachments: PDFBOX-5829.pdf
>
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
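
To make the listener idea from the comment more concrete, here is a minimal sketch of how such a hook might look. All names in it ({{InvalidNumberHandler}}, {{StrictNumberHandler}}, {{LenientNumberHandler}}, {{NumberParser}}) are hypothetical and do not exist in PDFBox, and {{Float.parseFloat}} only stands in for the real token parser. The lenient repair rule (drop the minus after the decimal point and make the whole number negative) is just one of the possible readings listed in the comment, chosen for illustration; it is not a claim about what Acrobat or PDFBox actually does.

```java
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical callback for tokens that fail to parse as a number. Not a PDFBox API. */
interface InvalidNumberHandler {
    /** Returns a replacement value, or throws IOException to reject the token. */
    float handle(String token) throws IOException;
}

/** Strict policy: every invalid token is rejected. */
final class StrictNumberHandler implements InvalidNumberHandler {
    @Override
    public float handle(String token) throws IOException {
        throw new IOException("Invalid number: '" + token + "'");
    }
}

/** Relaxed policy: repairs one specific malformation, rejects everything else. */
final class LenientNumberHandler implements InvalidNumberHandler {
    // Matches digits, a dot, a stray minus, then digits, e.g. "0.-262" or "-12.-1".
    private static final Pattern MISPLACED_MINUS = Pattern.compile("-?(\\d+)\\.-(\\d+)");

    @Override
    public float handle(String token) throws IOException {
        Matcher m = MISPLACED_MINUS.matcher(token);
        if (!m.matches()) {
            throw new IOException("Invalid number: '" + token + "'");
        }
        // Assumed reading: the stray minus belongs in front of the number.
        return Float.parseFloat("-" + m.group(1) + "." + m.group(2));
    }
}

/** Stand-in for the real parser: delegates to the handler only when parsing fails. */
final class NumberParser {
    private final InvalidNumberHandler handler;

    NumberParser(InvalidNumberHandler handler) {
        this.handler = handler;
    }

    float parse(String token) throws IOException {
        try {
            return Float.parseFloat(token);
        } catch (NumberFormatException e) {
            return handler.handle(token);
        }
    }
}

/** Small demo: the same token under the relaxed and the strict policy. */
final class InvalidNumberDemo {
    public static void main(String[] args) {
        String token = "-12.-1";
        try {
            System.out.println(new NumberParser(new LenientNumberHandler()).parse(token));
            System.out.println(new NumberParser(new StrictNumberHandler()).parse(token));
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With a design along these lines, the choice between rejecting and repairing would stay with the caller, and a changed interpretation of invalid data would only mean swapping the handler rather than touching the parser.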