Hi,

Thanks for finding this. Sadly there were no tests. I'll investigate.


Tilman

On 17.05.2024 10:25, pascal.schumac...@t-systems.com wrote:
Hi,

concerning commit "PDFBOX-5660: add warning / exception, as suggested by mkl in SO 78307200"
(https://github.com/apache/pdfbox/commit/5c0abf94367c12c9ac0b464046784d456ce4caf5)

After this commit, this code:

for (int pageNumber = 0; pageNumber < pdDocument.getNumberOfPages(); pageNumber++) {
     PDPage pdPage = pdDocument.getPage(pageNumber);
     ...
     String textForRegion = extractText(pdPage, rect);
}

private static String extractText(PDPage pdPage, Rectangle2D rect) throws IOException {
     String regionName = "rectangle";
     PDFTextStripperByArea textStripper = new PDFTextStripperByArea();
     textStripper.setSortByPosition(true);
     textStripper.addRegion(regionName, rect);
     textStripper.extractRegions(pdPage);
     return textStripper.getTextForRegion(regionName);
}

which worked with PDFBox 3 and trunk before this change, now fails with:

java.lang.IllegalArgumentException: Parameter must be 1-based, but is 0
        at org.apache.pdfbox.text.PDFTextStripper.setStartPage(PDFTextStripper.java:956)
        at org.apache.pdfbox.text.PDFTextStripperByArea.extractRegions(PDFTextStripperByArea.java:117)
...

I believe 0 should still be allowed, or am I missing something?

Thanks and kind regards,
Pascal

By the way: Thank you very much for providing PDFBox.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org


