Parsing the PDF document attached to bug report
https://issues.apache.org/jira/browse/PDFBOX-816 (TaxReturn-1.pdf)
throws a NumberFormatException.
#getEncodingFromFont uses a StringTokenizer to split a line into
separate tokens:
StringTokenizer st = new StringTokenizer(line);
The following line, however, results in a NumberFormatException because
0/NUL is read as one token:
dup 0/NUL put
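The behaviour can be reproduced outside PDFBox with a small standalone sketch. The class name DefaultTokenizerDemo and the helper tokens() are made up for this mail and are not PDFBox code; the use of Integer.parseInt on the second token is an assumption about how the value is converted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of PDFBox.
public class DefaultTokenizerDemo {

    // Splits a line using the default delimiters of StringTokenizer.
    static List<String> tokens(String line) {
        List<String> result = new ArrayList<String>();
        StringTokenizer st = new StringTokenizer(line);
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> t = tokens("dup 0/NUL put");
        System.out.println(t); // [dup, 0/NUL, put]
        try {
            // Assumption: the second token is expected to be the character code.
            Integer.parseInt(t.get(1));
        } catch (NumberFormatException e) {
            System.out.println("NumberFormatException for token: " + t.get(1));
        }
    }
}
```

The '/' is not a delimiter here, so the character code and the glyph name stay glued together.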
By default, the StringTokenizer only treats the following chars as token
delimiters:
" \t\n\r\f".
I think this is not correct, because some PostScript delimiter chars
appear to be missing, such as (, ), <, >, [, ], {, }, /, and %.
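As a rough idea of what I mean, here is a minimal sketch that passes the extra delimiter chars to StringTokenizer and asks it to return them as tokens. This is only an illustration under my own assumptions, not a proposed patch: the class name PostScriptDelimiterDemo and the helper tokens() are hypothetical, and the sketch does not handle strings in parentheses or % comments properly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of PDFBox.
public class PostScriptDelimiterDemo {

    private static final String WHITESPACE = " \t\n\r\f";
    // Whitespace plus the delimiter chars listed above.
    private static final String DELIMITERS = WHITESPACE + "()<>[]{}/%";

    // Splits a line on whitespace and delimiter chars; delimiter chars are
    // kept as one-char tokens, whitespace is dropped.
    static List<String> tokens(String line) {
        List<String> result = new ArrayList<String>();
        // returnDelims = true: delimiters come back as separate tokens.
        StringTokenizer st = new StringTokenizer(line, DELIMITERS, true);
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            if (WHITESPACE.contains(token)) {
                continue; // skip whitespace delimiters
            }
            result.add(token);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> t = tokens("dup 0/NUL put");
        System.out.println(t);                         // [dup, 0, /, NUL, put]
        System.out.println(Integer.parseInt(t.get(1))); // 0
    }
}
```

With this split, the character code is a token of its own and can be parsed as an integer; the caller would still have to join "/" with the name that follows it.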
Is this a bug?
Kind regards,
Martijn Brinkers