[ https://issues.apache.org/jira/browse/PDFBOX-612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gang Luo updated PDFBOX-612:
----------------------------

    Attachment: 1DE9A100d01.pdf

Fail for 'GBK-EUC-H' encoding

> Unknown encoding for 'GBK-EUC-H'
> --------------------------------
>
>                 Key: PDFBOX-612
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-612
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 0.8.0-incubator
>         Environment: Windows
>            Reporter: Gang Luo
>              Labels: encoding
>         Attachments: 1DE9A100d01.pdf
>
>
> PDFBox reports an unknown encoding, 'GBK-EUC-H', for a Chinese PDF document. To fix it:
>
> 1. Add this method to org.apache.pdfbox.pdmodel.font.PDFont.java:
>
>     public String getEncodingName() {
>         COSBase encoding = font.getDictionaryObject(COSName.ENCODING);
>         if (encoding != null) {
>             if (encoding instanceof COSName) {
>                 return ((COSName) encoding).getName();
>             }
>         }
>         return null;
>     }
>
> 2. Modify the encode method.
>
> From:
>
>     if( retval == null && cmap != null )
>     {
>         retval = cmap.lookup( c, offset, length );
>     }
>     //if we haven't found a value yet and
>     //we are still on the first byte and
>     //there is no cmap or the cmap does not have 2 byte mappings then try to encode
>     //using fallback methods.
>
> To:
>
>     if( retval == null && cmap != null )
>     {
>         String encodingStr = getEncodingName();
>         if (encodingStr != null) {
>             EncodingConverter converter =
>                 EncodingConversionManager.getConverter(encodingStr);
>             if (converter != null) {
>                 if (length == 1) return null;
>                 retval = converter.convertBytes(c, offset, length, cmap);
>             } else {
>                 retval = cmap.lookup( c, offset, length );
>             }
>         } else {
>             retval = cmap.lookup( c, offset, length );
>         }
>     }
>     //if we haven't found a value yet and
>     //we are still on the first byte and
>     //there is no cmap or the cmap does not have 2 byte mappings then try to encode
>     //using fallback methods.
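
The `length == 1` guard in the modified encode method can be illustrated outside PDFBox. What follows is only a minimal sketch, not the converter used by the patch: `GbkDecodeSketch` and its `decode` method are hypothetical names, and it assumes the JDK's built-in `GBK` charset is a reasonable stand-in for what a 'GBK-EUC-H' converter would do. It returns null for a lone lead byte, so the caller can retry with two bytes, and decodes the pair on the second call.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of PDFBox: mirrors the shape of the
// converter call in the patch, using the JDK's GBK charset instead.
class GbkDecodeSketch {
    // Assumption: the running JDK provides the GBK charset.
    private static final Charset GBK = Charset.forName("GBK");

    // Returns null for a single byte (same guard as the patch),
    // otherwise decodes the given byte range as GBK.
    static String decode(byte[] c, int offset, int length) {
        if (length == 1) {
            return null;
        }
        return new String(c, offset, length, GBK);
    }

    public static void main(String[] args) {
        // 0xD6 0xD0 is the GBK code for U+4E2D.
        byte[] bytes = { (byte) 0xD6, (byte) 0xD0 };
        System.out.println(decode(bytes, 0, 1)); // prints null
        String s = decode(bytes, 0, 2);
        System.out.println(Integer.toHexString(s.codePointAt(0))); // prints 4e2d
    }
}
```

Like the patch, this sketch never decodes a single byte, so one-byte codes would still have to be handled by the fallback path that follows the guard in the encode method.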