On 02.08.2014 22:36, -A wrote:
Tilman,

Ok thank you! Glad to hear it isn't anything on my end.

I will take note of the preferred loading mechanism and swap out my old
calls.

What is the reasoning behind that, or is load just deprecated?

It isn't deprecated, but it should be :-) loadNonSeq() is the more correct PDF parser, but load() will sometimes get better results with malformed PDFs.

Tilman



Thanks!

-Aaron


On Thu, Jul 31, 2014 at 3:49 PM, Tilman Hausherr <thaush...@t-online.de>
wrote:

Hi,

this is a malformed PDF. If you get a correct text extraction, then don't
bother.

Btw it is better to use loadNonSeq(file, null) instead of load(file). An
even better strategy is to use loadNonSeq() and then load() in the
exception catch.
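
A minimal sketch of that fallback strategy, kept free of library dependencies so it runs on its own. The class name FallbackLoad, the Loader interface, and loadWithFallback() are hypothetical names made up for illustration, not PDFBox API, and the sketch assumes that a parse failure surfaces as an IOException.

```java
import java.io.IOException;

public class FallbackLoad {

    /** Hypothetical stand-in for any loading call that may throw an IOException. */
    public interface Loader<T> {
        T load() throws IOException;
    }

    /**
     * Tries the primary loader first. If it throws an IOException,
     * the fallback loader is tried instead. If both fail, the fallback's
     * exception is thrown with the primary failure attached as suppressed.
     */
    public static <T> T loadWithFallback(Loader<T> primary, Loader<T> fallback)
            throws IOException {
        try {
            return primary.load();
        } catch (IOException primaryFailure) {
            try {
                return fallback.load();
            } catch (IOException fallbackFailure) {
                // Keep the first error so neither failure is lost.
                fallbackFailure.addSuppressed(primaryFailure);
                throw fallbackFailure;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulated loaders: the strict one fails, the lenient one succeeds.
        String result = loadWithFallback(
                () -> { throw new IOException("strict parser rejected the file"); },
                () -> "loaded by lenient parser");
        System.out.println(result);
    }
}
```

With PDFBox 1.8.x on the classpath, the primary loader would presumably be () -> PDDocument.loadNonSeq(file, null) and the fallback () -> PDDocument.load(file), with the returned PDDocument closed once processing is done.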

Tilman

On 31.07.2014 22:46, -A wrote:

Hello;

I am just going to jump in and ask about the following warning when used
with the default PDFTextStripper class:

WARNING: Count in xref table is 0 at offset 96825

Attached is the document that triggers it. I thought it might have to do with
the properties file that Tilman Hausherr pointed out to me, but it doesn't.

This isn't a big issue as the program still functions, but if I could get
rid of the warning so I don't have to look at it - more the merrier!

I am also getting into the PDF spec. If there is anything I could help with
should the properties file become an active issue (even just testing), let me
know.



