On 02.08.2014 22:36, -A wrote:
Tilman,
Ok thank you! Glad to hear it isn't anything on my end.
I will take note of the preferred loading mechanism and swap out my old
calls.
What is the reasoning behind that, or is load just deprecated?
It isn't deprecated, but it should be :-) loadNonSeq() is the more
correct PDF parser, but load() will sometimes get better results with
malformed PDFs.
Tilman
Thanks!
-Aaron
On Thu, Jul 31, 2014 at 3:49 PM, Tilman Hausherr <thaush...@t-online.de>
wrote:
Hi,
this is a malformed PDF. If you get a correct text extraction, then don't
bother.
Btw it is better to use loadNonSeq(file, null) instead of load(file). An
even better strategy is to use loadNonSeq() and then load() in the
exception catch.
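A minimal sketch of that fallback strategy, in case it helps. FallbackLoader and its Loader interface are hypothetical names made up for this illustration, not part of PDFBox. The only PDFBox calls referred to are the two mentioned above, and they appear in a comment so the snippet compiles without the library. It also assumes that a failed parse surfaces as an IOException.

```java
import java.io.IOException;

/** Hypothetical helper (not part of PDFBox): try one loader, fall back to another. */
final class FallbackLoader {

    /** A loading step that may fail with an IOException. */
    interface Loader<T> {
        T load() throws IOException;
    }

    /**
     * Runs the primary loader; if it throws an IOException, runs the fallback.
     * If both fail, the fallback's exception is thrown and the primary's
     * exception is attached to it as a suppressed exception.
     */
    static <T> T loadWithFallback(Loader<T> primary, Loader<T> fallback) throws IOException {
        try {
            return primary.load();
        } catch (IOException primaryFailure) {
            try {
                return fallback.load();
            } catch (IOException fallbackFailure) {
                // Keep the first failure visible for debugging.
                fallbackFailure.addSuppressed(primaryFailure);
                throw fallbackFailure;
            }
        }
    }
}

// Intended use with the two calls discussed in this thread (needs PDFBox on the classpath):
//
//   PDDocument doc = FallbackLoader.loadWithFallback(
//           () -> PDDocument.loadNonSeq(file, null),   // stricter parser first
//           () -> PDDocument.load(file));              // more lenient parser as fallback
```

Remember to close the returned document when you are done with it.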
Tilman
On 31.07.2014 22:46, -A wrote:
Hello;
I am just going to jump in and ask about the following warning when used
with the default PDFTextStripper class:
WARNING: Count in xref table is 0 at offset 96825
Attached is the document that causes it. I thought it might have to do with
the properties file that Tilman Hausherr pointed out to me, but it didn't.
This isn't a big issue as the program still functions, but if I could get
rid of the warning so I don't have to look at it - more the merrier!
I am also getting to the PDF spec. If there is anything I could assist with
should the properties file become an active issue (even just testing), let me
know.