[ 
https://issues.apache.org/jira/browse/PDFBOX-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851294#comment-13851294
 ] 

Guillaume Bailleul commented on PDFBOX-1812:
--------------------------------------------

I guess the XML generation needs to escape special characters.
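
A rough sketch of what such escaping could look like, assuming the output 
targets XML 1.0. The class XmlSanitizer and its methods are hypothetical 
names for illustration only; they are not part of Preflight or PDFBox. The 
idea is to escape the markup characters and to replace any code point that 
falls outside the XML 1.0 Char production with U+FFFD:

```java
// Hypothetical helper, not existing PDFBox/Preflight code.
final class XmlSanitizer {
    private XmlSanitizer() {}

    // True if the code point is allowed by the XML 1.0 "Char" production.
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Escapes markup characters and replaces illegal code points
    // (control characters, unpaired surrogates, U+FFFE/U+FFFF) with U+FFFD.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (!isLegalXmlChar(cp)) {
                out.append('\uFFFD');
                continue;
            }
            switch (cp) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }
}
```

Replacing rather than dropping the illegal characters keeps a visible hint in 
the report that the original token contained binary data. Whether that is the 
right behaviour for the Preflight report is an open question.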

However, something has changed in the parser that makes the message 
different, and it should not have. Can you determine which was the last 
version that worked correctly?

> Illegal characters in XML output
> --------------------------------
>
>                 Key: PDFBOX-1812
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1812
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Preflight
>    Affects Versions: 2.0.0
>         Environment: Bug reproduced under Win 7, Ubuntu
>            Reporter: Johan van der Knijff
>              Labels: characters, utf-8, xml
>             Fix For: 2.0.0
>
>         Attachments: 013814.pdf, 013814.xml, 013814_old.xml, 598659.pdf, 
> 598659.xml, 598659_old.xml, 600111.pdf, 600111.xml, 600111_old.xml
>
>
> When running Preflight in XML mode, the latest Preflight version (I used the 
> JAR from build #747) sometimes produces output that contains characters that 
> are illegal in XML. This can cause unexpected behavior if such files are 
> further processed with tools that expect well-formed XML.  See attached PDFs, 
> which all result in illegal characters in the description of a 1.0 Syntax 
> error, Error: Expected a long type. Output of older versions of Preflight 
> didn't contain these illegal characters; instead they would give something 
> like *actual='/O'*, *actual='Pages'*, etc. So I suppose this must have been 
> caused by a fairly recent change.
> See attachments below.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
