[ https://issues.apache.org/jira/browse/PDFBOX-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278679#comment-15278679 ]
John Hewson commented on PDFBOX-3347:
-------------------------------------

{{E5}} on its own is not a valid UTF-8 sequence, which must be how iText detects that the encoding of this name is wrong and falls back to ISO-8859-1. We can do the same.

> COSName parsing/writing interprets byte sequences as UTF-8 when parsing
> -----------------------------------------------------------------------
>
>                 Key: PDFBOX-3347
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3347
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing, Writing
>    Affects Versions: 1.8.12, 2.0.1, 2.0.2
>            Reporter: Maruan Sahyoun
>            Assignee: John Hewson
>            Priority: Minor
>
> As discussed here
> http://stackoverflow.com/questions/36964496/pdfbox-2-0-overcoming-dictionary-key-encoding/
> a byte sequence making up a COSName is interpreted during parsing and
> writing where it shouldn't be. Details are given in mkl's excellent analysis.
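
The fallback described in the comment can be sketched in Java roughly as follows. This is only an illustration of the idea (strict UTF-8 decoding first, ISO-8859-1 if that fails), not the actual PDFBox or iText code; the class {{NameDecoder}} and the method {{decodeName}} are hypothetical names made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the PDFBox API.
final class NameDecoder {

    // Try a strict UTF-8 decode; if the bytes are not valid UTF-8,
    // fall back to ISO-8859-1, which maps every byte to one character.
    static String decodeName(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            return new String(bytes, StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) {
        // A lone 0xE5 is a UTF-8 lead byte without continuation bytes,
        // so the strict decode fails and ISO-8859-1 is used instead.
        byte[] loneE5 = { 'A', (byte) 0xE5, 'B' };
        System.out.println(decodeName(loneE5).equals("A\u00E5B")); // prints "true"

        // 0xC3 0xA5 is the valid UTF-8 encoding of U+00E5.
        byte[] validUtf8 = { (byte) 0xC3, (byte) 0xA5 };
        System.out.println(decodeName(validUtf8).equals("\u00E5")); // prints "true"
    }
}
```

Using {{CodingErrorAction.REPORT}} matters here: {{new String(bytes, UTF_8)}} would silently substitute U+FFFD for the bad byte instead of signalling that the input is not UTF-8. Because ISO-8859-1 assigns a character to every byte value, the fallback path never loses bytes, so a name decoded this way can be written back unchanged.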