[ https://issues.apache.org/jira/browse/PDFBOX-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tilman Hausherr closed PDFBOX-944. ---------------------------------- Resolution: Cannot Reproduce Closing because of missing example. And yes, loadNonSeq() usually solves such problems, always use it :-) > number of pages returns the incorrect number for some PDFs > ---------------------------------------------------------- > > Key: PDFBOX-944 > URL: https://issues.apache.org/jira/browse/PDFBOX-944 > Project: PDFBox > Issue Type: Bug > Components: Parsing > Affects Versions: 1.4.0 > Reporter: Adam Nichols > > This is a regression bug which appeared between 1.3.1 and 1.4.0, as the > former returns the correct page count while the latter does not. > Unfortunately, the PDF which demonstrates this problem is confidential, so I > can not attach it here, however I will describe the things which may be > causing this problem as best I can. > The problem does not occur after using the "uncompress" feature of pdftk. > The problem does not occur after using PdfDecompressor from PDFBox. The > original file which was given to me is Linearized. In Adobe Acrobat Standard > -> File -> Properties, it says the Application was "Adobe Photoshop CS4 > Windows", the PDF Producer was "Adobe Photoshop for Windows -- Image > Conversion Plug-in" and the PDF Version is 1.7 (Acrobat 8.x). Fast Web View > is set to "No". I suspect that the problem has to do with the fact that it's > Lineraized or the fact that it uses ObjStm. I don't have enough time to > trace through this, so I'm either going to revert back to PDFBox 1.3.1 or > pre-process all the ObjStm objects, save the uncompressed file, and then > process that. The latter is less efficient, but I think it'll handle more > cases. I just wanted to make sure to open an issue here on JIRA so we can > eventually get a proper solution to this problem. -- This message was sent by Atlassian JIRA (v6.2#6252)