[ https://issues.apache.org/jira/browse/PDFBOX-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14185117#comment-14185117 ]
Laurent Richard commented on PDFBOX-2419:
-----------------------------------------

The problem is specific to the XML-based XFDF format, where special characters must be escaped. There is no issue with FDF.

I have attached a sample PDF file with a simple AcroForm containing such characters (in the field named "Nom").

With code like

{code}
PDDocument pdf = PDDocument.load("SampleForm.pdf");
PDAcroForm form = pdf.getDocumentCatalog().getAcroForm();
FDFDocument fdf = form.exportFDF();
List<FDFField> fields = fdf.getCatalog().getFDF().getFields();
StringWriter writer = new StringWriter();
fdf.saveXFDF(writer);
return writer.toString();
{code}

we get the following content

{code}
<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
  <ids original="40DE256FBEC20B428C72BCF68015AB9E" modified="3E3C2606FFB360C3FA74D3921A630318" />
  <fields>
    <field name="choix_1_J-64ncBSov7NcPeTY8oJ3A">
      <value>Oui</value>
    </field>
    <field name="prenom_By11Gk3puTlnwwnv4WA0-g">
      <value>special XML characters < > &</value>
    </field>
    <field name="nom_yQacEuz649N*BJguviO5Ow">
      <value>special XML characters < > &</value>
    </field>
  </fields>
</xfdf>
{code}

which is not valid XML, since '<', '>' and '&' should be escaped. The correct result would be:

{code}
<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
  <ids original="40DE256FBEC20B428C72BCF68015AB9E" modified="3E3C2606FFB360C3FA74D3921A630318" />
  <fields>
    <field name="choix_1_J-64ncBSov7NcPeTY8oJ3A">
      <value>Oui</value>
    </field>
    <field name="prenom_By11Gk3puTlnwwnv4WA0-g">
      <value>special XML characters &lt; &gt; &amp;</value>
    </field>
    <field name="nom_yQacEuz649N*BJguviO5Ow">
      <value>special XML characters &lt; &gt; &amp;</value>
    </field>
  </fields>
</xfdf>
{code}

Ideally, relying on JAXP (Java API for XML Processing) instead of directly manipulating String content would take care of such escaping.
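To illustrate the JAXP suggestion, here is a minimal sketch that writes field values through the JDK's StAX `XMLStreamWriter`, so escaping is left to the XML API rather than to string concatenation. The class `XfdfEscapeSketch` and the helper `toXfdf` are hypothetical names made up for this example and are not part of PDFBox. The sketch only covers flat name/value pairs and omits the `<ids>` element and the `xml:space` attribute.

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical illustration only; this is not PDFBox code.
final class XfdfEscapeSketch {

    private static final String XFDF_NS = "http://ns.adobe.com/xfdf/";

    // Serializes flat field name/value pairs as a simplified XFDF document.
    static String toXfdf(Map<String, String> fields) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartDocument();
            xml.writeStartElement("xfdf");
            xml.writeDefaultNamespace(XFDF_NS);
            xml.writeStartElement("fields");
            for (Map.Entry<String, String> field : fields.entrySet()) {
                xml.writeStartElement("field");
                // Attribute values are escaped by the writer.
                xml.writeAttribute("name", field.getKey());
                xml.writeStartElement("value");
                // Character data is escaped by the writer as well.
                xml.writeCharacters(field.getValue());
                xml.writeEndElement(); // value
                xml.writeEndElement(); // field
            }
            xml.writeEndElement(); // fields
            xml.writeEndElement(); // xfdf
            xml.writeEndDocument();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not serialize XFDF", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("nom", "special XML characters < > &");
        System.out.println(toXfdf(fields));
    }
}
```

With this approach the caller passes the raw field value and never has to decide which characters need escaping. The same effect could be obtained by building a DOM tree and serializing it with a `Transformer`.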
> XFDF export is not XML compliant
> --------------------------------
>
>                 Key: PDFBOX-2419
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2419
>             Project: PDFBox
>          Issue Type: Bug
>          Components: AcroForm
>    Affects Versions: 1.8.7
>            Reporter: Laurent Richard
>              Labels: FDF
>             Fix For: 1.8.8
>
>         Attachments: SampleForm.pdf
>
>
> The XFDF content is written as a simple string instead of XML nodes.
> As a result, field values containing special characters (&, <, >, ...) are
> not escaped and the resulting XML is invalid.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)