[ https://issues.apache.org/jira/browse/PDFBOX-3338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271045#comment-15271045 ]
Petr Slaby commented on PDFBOX-3338:
------------------------------------

{quote}
> It has an Apache license, so this isn't a problem.
{quote}
Cool, that saves me some worries.

{quote}
I suspect that the encodedByteAlign option isn't supported; one would have to implement it. See in rev 1581603 and 1581602 / PDFBOX-1074.
{quote}
I can try; it seems quite straightforward at first glance.

{quote}
Another problem in that code is "continue" with label. I've never seen that one before, ever. When was this added to java?
{quote}
It has always been there. See e.g. the examples at https://docs.oracle.com/javase/tutorial/java/nutsandbolts/branch.html.

I hope you are just exaggerating with the word "problem"? I find the code much better and more readable than the current decoder class in PDFBox. At the very least, it does not need to jump back and forth in the input; it reads it byte by byte instead. Not that I really understand what is going on in detail in either implementation. For that, one would have to study the standard first.

> CCITT Fax decoder fails
> -----------------------
>
>                 Key: PDFBOX-3338
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3338
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 1.8.12, 2.0.1
>            Reporter: Petr Slaby
>         Attachments: 1.tiff, TestCCITTFaxDecoder.java
>
>
> I have a PDF which does not render in PDFBox. It contains pages from a
> scanner, encoded as CCITT Fax TIFFs. On each page, the decoder always runs
> into IOException("TIFFFaxDecoder: EOL encountered in black run.") (or the
> same message just with "white" instead of "black"). Unfortunately, the PDF
> contains sensitive data and I cannot share it.
> As a test, I have replaced the TIFFFaxDecoder by the class
> CCITTFaxDecoderStream from the Twelve Monkeys ImageIO library. All worked
> fine after that and PDFToImage produced the expected result.
> I have extracted the first few bytes of the TIFF to show the problem without
> sharing the confidential content.
> See the attached test program and test file.
> I have tested this against the latest trunk version of PDFBox, but I think the
> decoder implementation is basically the same in all versions.
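
For anyone else who has not met a labeled "continue" before, here is a minimal sketch of how it behaves. It has nothing to do with the decoder code itself; the class name, the method name and the sample data are made up purely for illustration.

```java
public class LabeledContinueDemo {

    /** Counts the rows that do not contain a zero (hypothetical example method). */
    public static int countRowsWithoutZero(int[][] rows) {
        int count = 0;
        nextRow: // label attached to the outer loop
        for (int[] row : rows) {
            for (int value : row) {
                if (value == 0) {
                    // Skip the rest of this row and go straight to the
                    // next iteration of the OUTER loop.
                    continue nextRow;
                }
            }
            // Reached only if the inner loop finished without hitting a zero.
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int[][] rows = { {1, 2, 3}, {4, 0, 6}, {7, 8, 9} };
        System.out.println(countRowsWithoutZero(rows)); // prints 2
    }
}
```

Without the label, one would typically need a boolean flag plus a "break" to get the same effect, which is why a loop that scans input step by step may prefer the labeled form.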