Hi there, I am attempting text extraction with PDFBox 1.8.2.
For reasons I cannot explain, I am sometimes sent PDFs with no version number in the header, e.g. %PDF-\r\n instead of, say, %PDF-1.7\r\n. (I have checked, and the version number does not appear in the next couple of lines either.) This causes PDFParser.parseHeader() to die, because it ends up computing a negative substring offset. My question is: if I detect this situation and default the header to a really low version (%PDF-1.0?), would that be safe, or would other things break later on? Thanks for any help. - Chris
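
To make the question concrete, here is a minimal sketch of the detect-and-default idea, applied to the raw bytes before they reach PDFBox. The class name PdfHeaderFix and the method patchHeader are hypothetical and not part of PDFBox. It assumes the file starts with %PDF- at byte 0 and treats "no digit right after the marker" as "version missing". Note that inserting the version shifts every later byte offset by a few bytes, and whether the parser tolerates that shift is untested here.

```java
import java.nio.charset.StandardCharsets;

public class PdfHeaderFix {
    // The fixed part of a PDF header line, without the version number.
    private static final byte[] MARKER = "%PDF-".getBytes(StandardCharsets.ISO_8859_1);

    /**
     * Returns the input unchanged unless it starts with "%PDF-" and the
     * next byte is not a digit; in that case the default version
     * (e.g. "1.0") is inserted right after the marker.
     */
    public static byte[] patchHeader(byte[] data, String defaultVersion) {
        if (data.length < MARKER.length) {
            return data; // too short to carry a header at all
        }
        for (int i = 0; i < MARKER.length; i++) {
            if (data[i] != MARKER[i]) {
                return data; // does not start with %PDF-, leave it alone
            }
        }
        if (data.length > MARKER.length) {
            byte next = data[MARKER.length];
            if (next >= '0' && next <= '9') {
                return data; // a version number is already present
            }
        }
        // Assumption: the version is missing, so splice in the default.
        byte[] version = defaultVersion.getBytes(StandardCharsets.ISO_8859_1);
        byte[] out = new byte[data.length + version.length];
        System.arraycopy(MARKER, 0, out, 0, MARKER.length);
        System.arraycopy(version, 0, out, MARKER.length, version.length);
        System.arraycopy(data, MARKER.length, out,
                MARKER.length + version.length, data.length - MARKER.length);
        return out;
    }
}
```

The patched bytes could then be wrapped in a ByteArrayInputStream and handed to PDDocument.load(InputStream) as usual.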

