[ https://issues.apache.org/jira/browse/PDFBOX-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeremias Maerki reopened PDFBOX-1437:
-------------------------------------

    Assignee: Jeremias Maerki  (was: Guillaume Bailleul)

This seems to have introduced a regression that causes characters like \n to be swallowed when getString() is called. PDFDocEncoding doesn't handle all valid characters.

{code}
testStr = "Line1\nLine2\nLine3\n";
COSString lineFeedString = new COSString(testStr);
assertEquals(testStr, lineFeedString.getString());

//Same as previous but this time as a dictionary value
lineFeedString = new COSString(true);
for (int i = 0; i < testStr.length(); i++) {
    lineFeedString.append(testStr.charAt(i));
}
assertEquals(testStr, lineFeedString.getString()); //currently fails
{code}

Direct link to the change causing the regression:
http://svn.apache.org/viewvc?view=revision&revision=1406628

I'm currently investigating what the best way is to fix that.

> Title invalidly read in DocumentInformation
> -------------------------------------------
>
>                 Key: PDFBOX-1437
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1437
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 1.7.1
>            Reporter: Guillaume Bailleul
>            Assignee: Jeremias Maerki
>             Fix For: 1.8.0
>
>         Attachments: AA.pdf, Loader.java
>
>
> The value returned by document.getDocumentInformation().getTitle() is invalid with the attached document.
> The last character is badly deserialized.
> The method returns
> Microsoft Word - LA_LAN01-#230492-v1-j2-Zilker_-_Motion_for_Letters_Rogator
> Adobe reader proposes :
> Microsoft Word - LA_LAN01-#230492-v1-j2-Zilker_-_Motion_for_Letters_Rogator…
> with a HORIZONTAL ELLIPSIS



--
This message was sent by Atlassian JIRA
(v6.2#6252)