[ 
https://issues.apache.org/jira/browse/PDFBOX-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15044986#comment-15044986
 ] 

Uwe Schindler edited comment on PDFBOX-3155 at 12/7/15 2:03 PM:
----------------------------------------------------------------

The stack trace we have seen:
{noformat}
Caused by: java.lang.NumberFormatException: For input string: "9-ea"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:675)
        at java.lang.Integer.parseInt(Integer.java:793)
        at org.apache.pdfbox.util.PDFTextStripper.<clinit>(PDFTextStripper.java:122)
        ... 57 more
{noformat}

If the "-ea" were not part of the string (e.g. in the final release of Java 9), 
it would fail with ArrayIndexOutOfBoundsException on the following line.
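
To make the two failure modes concrete, here is a minimal sketch of the kind of parsing that behaves this way. It is a hypothetical reconstruction for illustration, not the actual PDFBox source; the class and method names (NaiveVersionParse, naiveMinor, outcome) are made up.

```java
// Hypothetical reconstruction of a fragile "java.version" parser.
// This is NOT the actual PDFBox code; all names here are illustrative.
final class NaiveVersionParse {

    /** Parses "major.minor" assuming a dot is present and both parts are numeric. */
    static int naiveMinor(String javaVersion) {
        String[] parts = javaVersion.split("\\.");
        int major = Integer.parseInt(parts[0]); // NumberFormatException for "9-ea"
        int minor = Integer.parseInt(parts[1]); // ArrayIndexOutOfBoundsException for "9"
        return minor;
    }

    /** Returns the simple name of the exception thrown, or "ok" if parsing succeeded. */
    static String outcome(String javaVersion) {
        try {
            naiveMinor(javaVersion);
            return "ok";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // An old-style version string, an early-access string, and a plain major version.
        for (String v : new String[] {"1.8.0_66", "9-ea", "9"}) {
            System.out.println(v + " -> " + outcome(v));
        }
    }
}
```

In a static initializer either exception prevents the class from loading, which is why it shows up as "Caused by" under a class initialization error.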

See http://openjdk.java.net/jeps/223 for the new version number formats. Be 
prepared for them, and don't let code fail with exceptions caused by missing 
consistency checks.

This "note" from the above JEP applies to your code:
bq. Note that all code which has historically detected . in any of these system 
properties as part of version identification will need to be examined and 
potentially modified. For example, 
System.getProperty("java.version").indexOf('.') will return -1 for major 
releases.


> org.apache.pdfbox.util.PDFTextStripper class initialization throws 
> NumberFormatException with recent Verona-enabled Java 9 JVMs
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-3155
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3155
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 1.8.8, 1.8.10
>            Reporter: Uwe Schindler
>            Priority: Critical
>
> Lucene/Solr runs its whole testsuite also with Java 9 EA releases to trigger 
> bugs early. In our tests (Solr + TIKA) we found out that 
> org.apache.pdfbox.util.PDFTextStripper throws a NumberFormatException in its 
> static initializer when parsing the "java.version" system property. The 
> reason for failure is a change in Java 9, where version numbers got a new 
> format.
> There are 3 problems:
> - It should not assume that every component is a number. It should 
> catch NumberFormatException and fall back to an "unknown" version.
> - The code should really use "java.specification.version". This property is 
> standardized and contains only digits and dots.
> - The code should also be prepared to handle version numbers without a minor 
> version. E.g. Java 9 has just "9" instead of "1.9" as its main version number.
> For this use case I would remove the check and find a better workaround.
> Relying on string parsing of non-standardized system properties in a static 
> class initializer is the reason why this bug is raised to level "Critical".
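
The points in the issue description could be combined roughly as follows. This is only a sketch under assumptions: the class name JavaMajorVersion, the UNKNOWN sentinel and the method names are hypothetical, and it does not represent the fix that was applied in PDFBox. It reads "java.specification.version", accepts both "1.8" and "9", and never throws.

```java
// Sketch of defensive version detection; names are illustrative assumptions.
final class JavaMajorVersion {

    /** Sentinel returned when the version cannot be determined. */
    static final int UNKNOWN = -1;

    /** Maps "1.8" to 8 and "9" to 9; returns UNKNOWN instead of throwing. */
    static int parse(String specVersion) {
        if (specVersion == null) {
            return UNKNOWN;
        }
        String[] parts = specVersion.trim().split("\\.");
        if (parts.length == 0) {
            return UNKNOWN; // e.g. the input was just "."
        }
        try {
            int first = Integer.parseInt(parts[0]);
            // Old scheme "1.x": the second component is the major version.
            if (first == 1 && parts.length > 1) {
                return Integer.parseInt(parts[1]);
            }
            // New scheme (JEP 223): the first component is the major version.
            return first;
        } catch (NumberFormatException e) {
            return UNKNOWN; // non-numeric component, e.g. "9-ea"
        }
    }

    /** Major version of the running JVM, or UNKNOWN if it cannot be parsed. */
    static int current() {
        return parse(System.getProperty("java.specification.version"));
    }
}
```

A caller can then branch on the result, e.g. apply a Java-version-specific workaround only when current() returns a known value, and treat UNKNOWN as "assume a recent JVM".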


