[jira] [Commented] (PDFBOX-3347) COSName parsing/writing interprets byte sequences as UTF-8 when parsing

John Hewson (JIRA) Tue, 10 May 2016 11:59:06 -0700

    [ 
https://issues.apache.org/jira/browse/PDFBOX-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278679#comment-15278679
 ]


John Hewson commented on PDFBOX-3347:
-------------------------------------

{{E5}} on its own is not a valid UTF-8 sequence: it is the lead byte of a 
three-byte sequence and must be followed by two continuation bytes. That must 
be how iText detects that the encoding of this name is wrong and falls back to 
ISO-8859-1. We can do the same.
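
A minimal sketch of that fallback, assuming a strict UTF-8 decode is tried 
first and ISO-8859-1 is used only when the bytes are malformed. The class and 
method names ({{NameDecodingSketch}}, {{decodeName}}) are hypothetical and are 
not taken from PDFBox or iText; only standard {{java.nio.charset}} calls are 
used.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class NameDecodingSketch {

    // Hypothetical helper: decode the raw bytes of a name as UTF-8 if they
    // form a valid UTF-8 sequence, otherwise fall back to ISO-8859-1.
    static String decodeName(byte[] bytes) {
        // REPORT makes the decoder throw instead of silently substituting
        // U+FFFD, so malformed input can be detected.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            // ISO-8859-1 maps every byte to exactly one character, so this
            // decode cannot fail and the original bytes stay recoverable.
            return new String(bytes, StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) {
        // A lone E5 between ASCII letters: invalid UTF-8, falls back.
        byte[] loneE5 = { 'A', (byte) 0xE5, 'B' };
        // C3 A5 is the valid two-byte UTF-8 encoding of U+00E5.
        byte[] validUtf8 = { 'A', (byte) 0xC3, (byte) 0xA5, 'B' };

        System.out.println(decodeName(loneE5).length());
        System.out.println(decodeName(validUtf8).length());
    }
}
```

Both inputs in {{main}} decode to the same three-character string, which 
shows the ambiguity: the fallback recovers a readable name, but the decoded 
string alone no longer says which byte sequence it came from.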

> COSName parsing/writing interprets byte sequences as UTF-8 when parsing
> -----------------------------------------------------------------------
>
>                 Key: PDFBOX-3347
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3347
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing, Writing
>    Affects Versions: 1.8.12, 2.0.1, 2.0.2
>            Reporter: Maruan Sahyoun
>            Assignee: John Hewson
>            Priority: Minor
>
> As discussed here 
> http://stackoverflow.com/questions/36964496/pdfbox-2-0-overcoming-dictionary-key-encoding/
>  a byte sequence making up a COSName is interpreted during parsing and 
> writing where it shouldn't be. Details are given in mkl's excellent analysis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

