[ 
https://issues.apache.org/jira/browse/PDFBOX-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082391#comment-13082391
 ] 

Jeremias Maerki commented on PDFBOX-1086:
-----------------------------------------

As noted in the associated SVN commit, I've replaced the Group3 1D 
decoder. The same problem can also occur with Group3 2D and Group4, so 
PDFBox will eventually need new implementations for these two as well. 
Group4 in particular is widely used, which is why I'm keeping this issue 
open for the moment.

> Error when decoding CCITT compressed data that contains EOLs, fill bits etc.
> ----------------------------------------------------------------------------
>
>                 Key: PDFBOX-1086
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1086
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>            Reporter: Jeremias Maerki
>            Assignee: Jeremias Maerki
>
> The TIFFFaxDecoder class (originally from JAI, via XML Graphics 
> Commons) does not handle cases such as EOLs between lines or before the 
> first line. But the 
> PDF CCITTFaxDecode filter needs to allow many different variants of the 
> encoding. Apparently, TIFF has a relatively restricted way of encoding CCITT 
> data, so TIFFFaxDecoder was not written to be as flexible as we need it. 
> Ideally, PDFBox should handle anything that gets thrown at it.
> It appears that it would be rather difficult to retrofit TIFFFaxDecoder with 
> the necessary flexibility. So, new decoders for T.4 and T.6 should probably 
> be written.
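
To illustrate the kind of flexibility described above, here is a minimal 
sketch of skipping optional fill bits and an EOL code in a T.4 bit stream. 
It relies on the T.4 definition of EOL as eleven 0 bits followed by a 1, 
with fill bits being extra 0 bits inserted before the EOL, and it assumes 
MSB-first bit order. The class EolSkipper and its methods are hypothetical 
names for illustration only; they are not part of PDFBox or of the decoder 
committed for this issue.

```java
/** Hypothetical helper (not a PDFBox class): skips fill bits plus one EOL. */
class EolSkipper {
    private final byte[] data;
    private int bitPos = 0; // index of the next unread bit, MSB first

    EolSkipper(byte[] data) {
        this.data = data;
    }

    /** Returns the bit at the given bit index, or -1 past the end of data. */
    private int peekBit(int pos) {
        if (pos >= data.length * 8) {
            return -1;
        }
        return (data[pos >> 3] >> (7 - (pos & 7))) & 1;
    }

    /**
     * If the stream continues with zero or more fill bits followed by an
     * EOL (at least eleven 0 bits, then a 1), consumes them and returns
     * true. Otherwise leaves the position unchanged and returns false.
     */
    boolean skipFillAndEol() {
        int pos = bitPos;
        int zeros = 0;
        while (peekBit(pos) == 0) {
            zeros++;
            pos++;
        }
        if (peekBit(pos) == 1 && zeros >= 11) {
            bitPos = pos + 1; // continue right after the terminating 1 bit
            return true;
        }
        return false;
    }

    /** Current bit position, exposed for inspection. */
    int position() {
        return bitPos;
    }
}
```

A decoder built along these lines could call skipFillAndEol() before each 
scan line and simply carry on when it returns false, so that data with and 
without EOLs is accepted. A real implementation would also have to deal 
with the 2D tag bit that follows EOL in Group3 2D and with the RTC/EOFB 
end markers, which this sketch ignores.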


        
