Re: org.apache.pdfbox.io.PushBackInputStream on some PDFs

2010-12-13 Thread Adam
From: Alex Rodriguez Lopez To: users@pdfbox.apache.org Date: 12/10/2010 02:23 Subject: Re: org.apache.pdfbox.io.PushBackInputStream on some PDFs Thanks Adam, I was kind of expecting this to be a problem with the PDF file. Is there a quick and easy way to tell if a PDF is valid before going ahead with th

Re: org.apache.pdfbox.io.PushBackInputStream on some PDFs

2010-12-10 Thread Alex Rodriguez Lopez
ould serve as a good example on some things we need to watch out for. [1] https://issues.apache.org/jira/browse/PDFBOX Thanks, Adam From: Alex Rodriguez Lopez To: users@pdfbox.apache.org Date: 12/07/2010 08:28 Subject: org.apache.pdfbox.io.PushBackInputStream on some PDFs Hi list! I

Re: org.apache.pdfbox.io.PushBackInputStream on some PDFs

2010-12-09 Thread Adam
uez Lopez To: users@pdfbox.apache.org Date: 12/07/2010 08:28 Subject: org.apache.pdfbox.io.PushBackInputStream on some PDFs Hi list! I'm using PDFBox through Apache Tika 0.8, and it gives me an error on some files when parsing. The resulting file (after the exception is raised) is

org.apache.pdfbox.io.PushBackInputStream on some PDFs

2010-12-07 Thread Alex Rodriguez Lopez
Hi list! I'm using PDFBox through Apache Tika 0.8, and it gives me an error on some files when parsing. The resulting file (after the exception is raised) is a partial text extraction: text from some pages at the beginning followed by text from the end of the PDF, missing pages at the mid