[ https://issues.apache.org/jira/browse/PDFBOX-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tilman Hausherr updated PDFBOX-4318:
------------------------------------
    Component/s: PDModel

> PDFont.encode results change on identical input
> -----------------------------------------------
>
>                 Key: PDFBOX-4318
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4318
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 2.0.11, 3.0.0 PDFBox
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>            Priority: Major
>
> As reported by Daniel Wildschut in the user mailing list:
> Hello, we use PDFBox to fill in PDF Forms and stumbled on a potential bug
> while sanitizing the input.
> {quote}We call PDFont.encode to check beforehand if a given character can be
> inserted using the given font.
> However we noticed that the results of the method call can change depending
> on what other strings have been checked before.
> Apparently PDType1Font stores previous results in a codeToBytesMap, which
> then causes the unexpected behavior.
> I'd say that the key used in "codeToBytesMap.put(code, bytes);" is wrong; you
> probably want to use the method parameter "unicode" instead.
> I tested 2.0.11, the current 2.0.x branch and the 3.0.x branch and was able
> to reproduce the problem with all of them.
> Code to reproduce: {quote}
> {code:java}
> import java.io.IOException;
>
> import org.apache.pdfbox.pdmodel.font.PDFont;
> import org.apache.pdfbox.pdmodel.font.PDType1Font;
>
> public class PDFBoxEncodeTest
> {
>     public static void main( final String[] args )
>     {
>         final PDType1Font font = PDType1Font.HELVETICA_BOLD;
>         // Same input before and after encoding the euro sign
>         tryEncode(font, "\u0080");
>         tryEncode(font, "€");
>         tryEncode(font, "\u0080");
>     }
>
>     // Reports whether the first character of str can be encoded with the font
>     private static void tryEncode(final PDFont font, final String str) {
>         try {
>             font.encode(str);
>             System.out.println("Character " + str.codePointAt(0) + " can be encoded in Font " + font);
>         } catch (final IOException | IllegalArgumentException e) {
>             System.out.println("Character " + str.codePointAt(0) + " cannot be encoded in Font " + font + ": " + e.getMessage());
>         }
>     }
> }
> {code}
> {quote}
> Expected output:
> Character 128 cannot be encoded in Font PDType1Font Helvetica-Bold: U+0080 ('.notdef') is not available in this font Helvetica-Bold encoding: WinAnsiEncoding
> Character 8364 can be encoded in Font PDType1Font Helvetica-Bold
> Character 128 cannot be encoded in Font PDType1Font Helvetica-Bold: U+0080 ('.notdef') is not available in this font Helvetica-Bold encoding: WinAnsiEncoding
> Actual output:
> Character 128 cannot be encoded in Font PDType1Font Helvetica-Bold: U+0080 ('.notdef') is not available in this font Helvetica-Bold encoding: WinAnsiEncoding
> Character 8364 can be encoded in Font PDType1Font Helvetica-Bold
> Character 128 can be encoded in Font PDType1Font Helvetica-Bold
> {quote}
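
The cache-key mismatch described in the report can be modelled without PDFBox. The following is a simplified sketch, not the PDFBox implementation: the class EncodeCacheSketch, the toy table UNICODE_TO_CODE and the keyByUnicode switch are hypothetical names introduced for illustration. It assumes, as the report suggests, that the cache is looked up by Unicode code point but filled under the encoded byte code, and that the euro sign (U+20AC) is encoded as 0x80.

```java
import java.util.HashMap;
import java.util.Map;

public class EncodeCacheSketch
{
    // Toy stand-in for an encoding table: Unicode code point -> single-byte code.
    // The euro sign entry (U+20AC -> 0x80) is the one that triggers the collision.
    private static final Map<Integer, Integer> UNICODE_TO_CODE = new HashMap<>();
    static
    {
        UNICODE_TO_CODE.put(0x20AC, 0x80);
        UNICODE_TO_CODE.put((int) 'A', 0x41);
    }

    // Cache of earlier results (plays the role of codeToBytesMap in the report).
    private final Map<Integer, byte[]> cache = new HashMap<>();

    // false: store under the encoded code (behaviour described in the report);
    // true: store under the Unicode code point (the change the reporter suggests).
    private final boolean keyByUnicode;

    public EncodeCacheSketch(boolean keyByUnicode)
    {
        this.keyByUnicode = keyByUnicode;
    }

    public byte[] encode(int unicode)
    {
        // The lookup always uses the Unicode code point.
        byte[] bytes = cache.get(unicode);
        if (bytes != null)
        {
            return bytes;
        }
        Integer code = UNICODE_TO_CODE.get(unicode);
        if (code == null)
        {
            throw new IllegalArgumentException(
                    String.format("U+%04X is not available in this sketch", unicode));
        }
        bytes = new byte[] { (byte) code.intValue() };
        // Storing under 'code' lets a later lookup for a different code point hit.
        cache.put(keyByUnicode ? unicode : code, bytes);
        return bytes;
    }

    public boolean canEncode(int unicode)
    {
        try
        {
            encode(unicode);
            return true;
        }
        catch (IllegalArgumentException e)
        {
            return false;
        }
    }

    public static void main(String[] args)
    {
        for (boolean keyByUnicode : new boolean[] { false, true })
        {
            EncodeCacheSketch font = new EncodeCacheSketch(keyByUnicode);
            System.out.println("keyByUnicode=" + keyByUnicode + ": "
                    + font.canEncode(0x80) + " "
                    + font.canEncode(0x20AC) + " "
                    + font.canEncode(0x80));
        }
    }
}
```

With keyByUnicode set to false, the third check flips to true after the euro sign has been encoded, mirroring the "Actual output" above. With it set to true, the result for U+0080 stays false regardless of what was encoded before.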