Hi all, I'm new to the list... I beg your indulgence if I'm out of line here, but here goes...
I'm working on a PDF table extractor. This is my second attempt at it, and this one is based on extending PageDrawer. In particular, I look for table cells delineated by vertical and horizontal lines, and then grab whatever text is inside each rectangle.

This works well for most PDFs I've tried (admittedly all from the same source), but there's a large subset that it doesn't work on. I've debugged my way through one, and it appears that when

processStream(page, page.findResources(), page.getContents().getStream());

calls fillPath() or strokePath() to draw the lines, they aren't drawn in the correct place. They seem to be offset some distance down the page.

I've looked at a couple of my troublesome PDFs, and one thing they have in common is that they are v1.4, whereas the ones that work are v1.7.

So: has anyone encountered this before? Is there a known bug with PageDrawer.processStream(), or perhaps with PDFStreamEngine and the drawing of v1.4 PDFs?

I'm happy to share my source code and example PDFs with anyone if it would help.

Thanks,
Frank
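
The cell-matching idea described above (rectangles bounded by ruling lines, then text collected per rectangle) can be sketched roughly as follows. This is a standalone illustration using only java.awt.geom, not the extractor's actual code: the class name, the method names, and the sample coordinates are all hypothetical, and it assumes the rulings have already been reduced to x and y positions in the same coordinate space as the text.

```java
import java.awt.geom.Point2D;
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only; not part of PDFBox.
final class CellSketch {

    // Builds one rectangle per grid cell from the x positions of the
    // vertical rulings and the y positions of the horizontal rulings.
    static List<Rectangle2D> cellsFromRulings(double[] xs, double[] ys) {
        double[] sx = xs.clone();
        double[] sy = ys.clone();
        Arrays.sort(sx);
        Arrays.sort(sy);
        List<Rectangle2D> cells = new ArrayList<>();
        for (int row = 0; row + 1 < sy.length; row++) {
            for (int col = 0; col + 1 < sx.length; col++) {
                cells.add(new Rectangle2D.Double(
                        sx[col], sy[row],
                        sx[col + 1] - sx[col], sy[row + 1] - sy[row]));
            }
        }
        return cells;
    }

    // Joins, in input order, every text fragment whose position lies
    // inside the given cell. positions.get(i) belongs to texts.get(i).
    static String textInCell(Rectangle2D cell,
                             List<Point2D> positions,
                             List<String> texts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < positions.size(); i++) {
            if (cell.contains(positions.get(i))) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(texts.get(i));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up rulings: three vertical lines and two horizontal lines.
        List<Rectangle2D> cells = cellsFromRulings(
                new double[] {0, 100, 200}, new double[] {0, 50});
        List<Point2D> positions = new ArrayList<>();
        List<String> texts = new ArrayList<>();
        positions.add(new Point2D.Double(10, 20));
        texts.add("Name");
        positions.add(new Point2D.Double(150, 20));
        texts.add("Qty");
        for (Rectangle2D cell : cells) {
            System.out.println(cell + " -> " + textInCell(cell, positions, texts));
        }
    }
}
```

In a sketch like this, the match depends entirely on the rulings and the text sharing one coordinate space: if the lines are shifted down the page relative to the text, fragments fall outside the cell they visually belong to.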

