[ 
https://issues.apache.org/jira/browse/PDFBOX-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremias Maerki reopened PDFBOX-1437:
-------------------------------------

      Assignee: Jeremias Maerki  (was: Guillaume Bailleul)

The change made for this issue seems to have introduced a regression that 
causes characters like \n to be swallowed when getString() is called. 
PDFDocEncoding doesn't handle all valid characters.

{code}
String testStr = "Line1\nLine2\nLine3\n";
COSString lineFeedString = new COSString(testStr);
assertEquals(testStr, lineFeedString.getString());

//Same as previous but this time as a dictionary value
lineFeedString = new COSString(true);
for (int i = 0; i < testStr.length(); i++) {
    lineFeedString.append(testStr.charAt(i));
}
assertEquals(testStr, lineFeedString.getString()); //currently fails
{code}
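
As a rough illustration of the suspected failure mode, here is a small 
standalone sketch. It does not use any PDFBox classes: IncompleteTableDemo, 
its toy mapping table and both decode methods are hypothetical stand-ins, 
assuming that a table-driven decoder silently drops codes it has no mapping 
for. The fallback variant is only one conceivable way around that, not 
necessarily the fix that will be committed.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a table-driven decoder; this is NOT PDFBox code.
final class IncompleteTableDemo {

    // Toy code-to-character table covering printable ASCII only (0x20..0x7E).
    // Control characters such as '\n' (0x0A) are deliberately left out.
    private static final Map<Integer, Character> TABLE = new HashMap<>();
    static {
        for (int code = 0x20; code <= 0x7E; code++) {
            TABLE.put(code, (char) code);
        }
    }

    // Assumed failure mode: codes without a mapping are silently dropped.
    static String decodeLossy(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            Character c = TABLE.get(b & 0xFF);
            if (c != null) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // One possible alternative: pass unmapped codes through unchanged.
    static String decodeWithFallback(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            int code = b & 0xFF;
            Character c = TABLE.get(code);
            if (c != null) {
                sb.append(c);
            } else {
                sb.append((char) code);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String testStr = "Line1\nLine2\nLine3\n";
        byte[] raw = testStr.getBytes(StandardCharsets.ISO_8859_1);
        // The lossy decoder loses the line feeds, the fallback keeps them.
        System.out.println(testStr.equals(decodeLossy(raw)));
        System.out.println(testStr.equals(decodeWithFallback(raw)));
    }
}
```

With the lossy variant the line feeds disappear from the decoded string, 
which matches the symptom seen in the failing assertion above.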

Direct link to the change causing the regression:
http://svn.apache.org/viewvc?view=revision&revision=1406628

I'm currently investigating what the best way is to fix that.

> Title invalidly read in DocumentInformation
> -------------------------------------------
>
>                 Key: PDFBOX-1437
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1437
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 1.7.1
>            Reporter: Guillaume Bailleul
>            Assignee: Jeremias Maerki
>             Fix For: 1.8.0
>
>         Attachments: AA.pdf, Loader.java
>
>
> The value returned by document.getDocumentInformation().getTitle() is invalid 
> with the attached document.
> The last character is badly deserialized.
> The method returns 
> Microsoft Word - LA_LAN01-#230492-v1-j2-Zilker_-_Motion_for_Letters_RogatorÂ
> Adobe Reader displays:
> Microsoft Word - LA_LAN01-#230492-v1-j2-Zilker_-_Motion_for_Letters_Rogator…
> with a HORIZONTAL ELLIPSIS



--
This message was sent by Atlassian JIRA
(v6.2#6252)
