On 01.08.2016 at 21:28, Dirk Groeneveld wrote:
The file is public. GDrive is just being difficult. Here is a Dropbox link instead: https://www.dropbox.com/s/8thckx5crdc15ml/bb3ddd9a7de5aa494cd5611128e433ea8791c569.pdf?dl=0



No problem, I did get the file. I've opened issue
https://issues.apache.org/jira/browse/PDFBOX-3446

Tilman


I had a feeling the file might be corrupt. We’re processing over 6M PDFs with this, so we’re bound to find some edge cases.

Dirk

On August 1, 2016 at 11:48:51, Tilman Hausherr ([email protected]) wrote:

On 01.08.2016 at 20:20, Dirk Groeneveld wrote:
> https://drive.google.com/a/allenai.org/file/d/0BxI7RAiTuio0a1k2amhoa1kxS1U/view?usp=sharing
>
> I hope that works?

Yes, although it requires authorization. Is the file public or not?

>
> There are actually two concerns. Clearly it should not go into an infinite loop, so that’s concern one. But even if it does, it would be good if the thread was interruptible. It might already be. I have not tried that yet.
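To illustrate the second concern: in Java, cancelling a task only sets the thread's interrupt flag, so a long-running loop stops only if it checks that flag itself. The sketch below is a minimal illustration under that assumption. `spinUntilInterrupted` is a hypothetical stand-in for a parser loop, not PDFBox code, and `stopsWhenCancelled` is a helper name invented for this example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class InterruptibleLoop {
    /** Hypothetical stand-in for a parser loop that never finishes on its own. */
    public static long spinUntilInterrupted() throws InterruptedException {
        long iterations = 0;
        while (true) {
            // Cooperative cancellation: without this check, interrupting the
            // thread would have no effect on the loop.
            if (Thread.currentThread().isInterrupted()) {
                throw new InterruptedException("cancelled after " + iterations + " iterations");
            }
            iterations++;
        }
    }

    /**
     * Runs the task with a time limit. Returns true if the task timed out
     * and its worker thread then stopped within one second of being interrupted.
     */
    public static boolean stopsWhenCancelled(Callable<?> task, long timeoutMillis) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<?> future = pool.submit(task);
            try {
                future.get(timeoutMillis, TimeUnit.MILLISECONDS);
                return false; // finished on its own, no cancellation needed
            } catch (TimeoutException e) {
                future.cancel(true); // sets the worker's interrupt flag
            }
        } finally {
            pool.shutdownNow();
        }
        return pool.awaitTermination(1, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(stopsWhenCancelled(InterruptibleLoop::spinUntilInterrupted, 100));
    }
}
```

A loop that never checks the interrupt flag would keep its thread busy even after `cancel(true)`, which is why a timeout wrapper alone does not help with non-interruptible code.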

It isn't interruptible... Your file is corrupt, it has this:

0000497410 00000 n
0000497457 00000 n
0000497532 00000 n
0000497579 00000 n
0000497654 00000 n
0000497701 00000 ¶ñw%–CÞ—ò.þ=^VPƒ»y2+‰6Aºo;-Ó›^€úrhf-d„lÍ£YYD
lƒ}j¶xïÊÞúÊÿ\ü¡ËnP^P–ÜÓ(W=ÊÚò¶enIxGúiº9pÉN‘Á¿¶èˆ> ×À+sJ´ç7Ã
<æ£Ùm/

Of course it shouldn't loop forever.
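For reference, a well-formed cross-reference entry consists of a 10-digit byte offset, a space, a 5-digit generation number, a space, and the keyword `n` or `f`, followed by a two-byte end-of-line marker. The sketch below checks a single line against that layout. It is only an illustration of the format, not how PDFBox parses or repairs xref tables, and the class and method names are made up for this example.

```java
import java.util.regex.Pattern;

public class XrefEntryCheck {
    // 10-digit offset, space, 5-digit generation, space, 'n' (in use) or 'f' (free).
    private static final Pattern ENTRY = Pattern.compile("\\d{10} \\d{5} [nf]");

    /** Returns true if the line, ignoring surrounding whitespace, is a well-formed entry. */
    public static boolean isValidEntry(String line) {
        return ENTRY.matcher(line.trim()).matches();
    }

    public static void main(String[] args) {
        // An intact entry from the dump above.
        System.out.println(isValidEntry("0000497654 00000 n"));
        // The start of the damaged entry: the keyword is replaced by binary garbage.
        System.out.println(isValidEntry("0000497701 00000 \u00b6\u00f1w%"));
    }
}
```

Run as is, this prints `true` for the intact entry and `false` for the damaged one.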

Tilman

>
> Cheers!
>
> On August 1, 2016 at 11:07:09, Tilman Hausherr ([email protected]) wrote:
>
> On 01.08.2016 at 19:59, Dirk Groeneveld wrote:
>> I found a PDF that causes PDFBox to go into an infinite loop. I
>> attached it to this email. The problem is easy to reproduce.
> PDF Attachments are not allowed, please upload your file somewhere.
>
> Tilman
>
>
>



