[ 
https://issues.apache.org/jira/browse/PDFBOX-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849518#comment-17849518
 ] 

Michael Klink commented on PDFBOX-5829:
---------------------------------------

Are you sure you really want to force an interpretation onto every bit of junk 
that appears in a PDF where a number is expected?

There is always a chance that you interpret it differently than originally 
intended. The PDFBOX-3500 interpretations are already questionable: what makes 
you sure {{0.-262}} was meant to mean {{-0.262}} and not, e.g., two numbers 
{{0. -262}}? Similarly here, is {{-12.-1}} actually {{12.1}} (minus times 
minus), {{-12.1}} (overeager minus addition), {{-12. -1}} (two numbers), or 
something else entirely?

Yes, you can of course look at how Acrobat appears to interpret that and copy 
its behavior, but Acrobat is free to change how it interprets invalid data, 
which makes it a moving target.

As an alternative, what about an option to register a listener that allows 
customizing the handling of invalid numbers (or other data structures with an 
invalid format, e.g. invalid dates)? PDFBox could ship with two 
implementations: a strict one that rejects all invalid data, and a more 
relaxed one that tries to repair it the way Acrobat does.

> IOException: Error expected floating point number actual='-12.-1'
> -----------------------------------------------------------------
>
>                 Key: PDFBOX-5829
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5829
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.31, 3.0.2 PDFBox, 4.0.0
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>            Priority: Major
>             Fix For: 2.0.32, 3.0.3 PDFBox, 4.0.0
>
>         Attachments: PDFBOX-5829.pdf
>
>




