[
https://issues.apache.org/jira/browse/PDFBOX-3338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278128#comment-15278128
]
Tilman Hausherr edited comment on PDFBOX-3338 at 5/10/16 2:34 PM:
------------------------------------------------------------------
I tried to run the RLE branch too...
- PDFBOX-1708.pdf was unchanged (EncodedByteAlign=true, EndOfBlock=false)
- PDFBOX-2585-390618.pdf error "Unknown code in Huffman RLE stream"
- PDFBOX-2653.pdf error (default options => EncodedByteAlign=false, EndOfLine=false)
- PDFBOX-2778.pdf error (default options => EncodedByteAlign=false, EndOfLine=false)
After changing decodeRowType2() to call resetBuffer() only when
optionByteAligned is set, the files decode, except for PDFBOX-2778.pdf. That
file has trailing garbage after the image data, so the decoder needs to count
rows.
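
To illustrate the two points above, here is a minimal sketch, not the actual
decoder code. It assumes that resetBuffer() amounts to discarding the rest of
the partially consumed byte, and it replaces the Huffman/RLE decoding with raw
bits so it stays short. The class RowDecodeSketch and its methods are
hypothetical names made up for this example.

```java
/**
 * Hypothetical illustration only; not the PDFBox or TwelveMonkeys decoder.
 * Each "row" is simply `columns` raw bits, standing in for a decoded scan line.
 */
final class RowDecodeSketch {
    private final byte[] data;
    private int bitPos; // absolute bit index into data

    RowDecodeSketch(byte[] data) {
        this.data = data;
    }

    /** Reads one bit, most significant bit first. */
    private int readBit() {
        int current = data[bitPos >> 3] & 0xFF;
        int bit = (current >> (7 - (bitPos & 7))) & 1;
        bitPos++;
        return bit;
    }

    /** Assumed equivalent of resetBuffer(): skip to the next byte boundary. */
    private void alignToByte() {
        bitPos = (bitPos + 7) & ~7;
    }

    /**
     * Decodes exactly `rows` rows and then stops, so trailing garbage after
     * the image data is never read.
     */
    static int[][] decode(byte[] data, int columns, int rows, boolean byteAligned) {
        RowDecodeSketch in = new RowDecodeSketch(data);
        int[][] out = new int[rows][columns];
        for (int r = 0; r < rows; r++) {
            // Only discard leftover bits when the stream is byte aligned;
            // doing so unconditionally would drop real data otherwise.
            if (byteAligned) {
                in.alignToByte();
            }
            for (int c = 0; c < columns; c++) {
                out[r][c] = in.readBit();
            }
        }
        return out;
    }
}
```

With columns=4 and rows=2, the same input gives different second rows
depending on the byteAligned flag, and any bytes after the second row are
ignored in both cases.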
> CCITT Fax decoder fails
> -----------------------
>
> Key: PDFBOX-3338
> URL: https://issues.apache.org/jira/browse/PDFBOX-3338
> Project: PDFBox
> Issue Type: Bug
> Affects Versions: 1.8.12, 2.0.1
> Reporter: Petr Slaby
> Labels: CCITTFaxDecode, ccitt
> Attachments: 1.tiff, CCITTFaxFilter.patch, TestCCITTFaxDecoder.java
>
>
> I have a PDF which does not render in PDFBox. It contains pages from a
> scanner, encoded as CCITT Fax Tiffs. On each page, the decoder always runs
> into IOException("TIFFFaxDecoder: EOL encountered in black run.") (or the
> same message just with "white" instead of "black"). Unfortunately, the PDF
> contains sensitive data and I cannot share it.
> As a test, I have replaced the TIFFFaxDecoder with the class
> CCITTFaxDecoderStream from the Twelve Monkeys ImageIO library. All worked
> fine after that and PDFToImage produced the expected result.
> I have extracted the first few bytes of the TIFF to show the problem without
> sharing the confidential content. See the attached test program and test file.
> I have tested this against the latest trunk version of PDFBox, but I think the
> decoder implementation is basically the same in all versions.