Re: FYI: Workaround for incorrect XRef/XRefStm input

Brzrk One Mon, 21 Nov 2016 14:45:11 -0800

Oops... I left out that this was with PDFBox 1.8.9.

On Mon, Nov 21, 2016 at 5:12 PM, Brzrk One wrote:


> I have a PDF file (which I cannot share) with the trailer:
>
> trailer
> <<
> /Size 16922
> /Root 1 0 R
> /Info 9 0 R
> /ID [<495BB8DD62106B9AB4E6E1C8B591C982> <91EB7F87537B4838AF45C0D28A988280>]
> /XRefStm 5347791
> >>
> startxref
> 5135270
>
> But there is only a single xref table in this PDF file: there is no object
> with /Type /XRef.
> In this situation, NonSequentialPDFParser.parseXref() enters the XREF_STM
> block. Since there is no object with /Type /XRef at offset 5347791 (a
> position that lands smack dab in the middle of the xref table), it does a
> brute-force search for some xref entry and returns offset 5135270, which
> is the location of the one and only xref table in the file.
>
> I added this check to the XREF_STM block, which seems to get around the
> problem:
>
>
> if (streamOffset != prev) {
>     // if the positions are the same, this is a hybrid xref table / xrefstm
>     // trailer with no /XRef stream, so skip parsing it...
>     parseXrefObjStream(prev, false);
> }
>
>
> I see similar code in COSParser.parseXref() in PDFBox 2.0.3.
> HTH, Pat
>
>
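
For anyone who wants to see the guard in isolation, here is a minimal Java
sketch of the decision described above. It does not use PDFBox at all: the
class XrefStmGuard and the helpers readLongAfter and shouldParseXrefStream
are hypothetical names made up for illustration, and the brute-force search
is replaced by a stand-in that simply hands back the startxref offset, which
is what the post reports happening for this file.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: not PDFBox code, all names here are hypothetical.
final class XrefStmGuard {

    // Reads the integer that follows a key such as "/XRefStm" or "startxref".
    // Returns -1 when the key is not present.
    static long readLongAfter(String text, String key) {
        Matcher m = Pattern.compile(Pattern.quote(key) + "\\s+(\\d+)").matcher(text);
        return m.find() ? Long.parseLong(m.group(1)) : -1L;
    }

    // The guard from the post: parse an xref stream only when the resolved
    // stream offset is valid and is not the offset of the xref table itself.
    static boolean shouldParseXrefStream(long resolvedStreamOffset, long tableOffset) {
        return resolvedStreamOffset > 0 && resolvedStreamOffset != tableOffset;
    }

    public static void main(String[] args) {
        String tail = "trailer\n<<\n/Size 16922\n/Root 1 0 R\n/Info 9 0 R\n"
                + "/XRefStm 5347791\n>>\nstartxref\n5135270\n";

        long declaredStreamOffset = readLongAfter(tail, "/XRefStm");
        long tableOffset = readLongAfter(tail, "startxref");

        // Assumption: nothing with /Type /XRef exists at the declared offset,
        // so the repair step falls back to the only xref table in the file.
        long resolvedStreamOffset = tableOffset;

        System.out.println("declared /XRefStm offset: " + declaredStreamOffset);
        System.out.println("xref table offset:        " + tableOffset);
        System.out.println("parse xref stream?        "
                + shouldParseXrefStream(resolvedStreamOffset, tableOffset));
    }
}
```

With the trailer from the post, the resolved stream offset equals the table
offset, so the sketch decides against parsing an xref stream. For a genuine
hybrid file the two offsets would differ and the stream would still be parsed.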
