ben.guillon <ben.guil...@gmail.com> wrote:

Hi Benoît,

> Hi,
>
> This bug was not so easy to fix, since it is related to the conversion from
> UTF-8 chars to valid LaTeX strings. Please find attached a patch. BTW, the
> behaviour of xmllint with xinclude, the postvalid option, and so on, is quite
> weird, as shown by Andreas: on my machine it does not give the same results,
> but they are still wrong...
>
> Andreas, tell me if the patch is enough for you to proceed,

Thanks for the patch.  It is definitely an improvement, as it succeeds
for the example document bad-title.xml.  However, success or failure
seems to depend on which non-ASCII Latin-1 character is used: "À" is
okay, but other Latin-1 characters like "æ" are not; see the example
document [1].

I don't know why pdflatex is able to handle some non-ASCII characters,
but not others.  Perhaps a more robust approach would be to transform
every label/hyperlabel to pure ASCII, e.g. by replacing non-ASCII
characters with their Unicode code point: æ → U+00E6.
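
To illustrate the idea, here is a minimal sketch in Python.  The
function name ascii_label is hypothetical and not part of dblatex, and
I am assuming the ids arrive as already decoded text strings.  Whether
"+" is safe in every LaTeX label context is also an assumption that
would need checking.

```python
import re

def ascii_label(label):
    """Return label with every non-ASCII character replaced by its
    Unicode code point in U+XXXX notation (hypothetical helper)."""
    return re.sub(
        r"[^\x00-\x7f]",                          # any character outside ASCII
        lambda m: "U+%04X" % ord(m.group()),      # e.g. "æ" -> "U+00E6"
        label,
    )

print(ascii_label("id-with-æ"))
```

Pure ASCII ids pass through unchanged.  Note that the mapping is not
guaranteed to be collision-free: a document that already contains the
literal id "id-with-U+00E6" would clash with the converted one.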

[1] 
<?xml version="1.0"?>
<!DOCTYPE article
          PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
                 "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<article lang="en">
  <title>title</title>
  <para id="id-with-æ">
    body
  </para>
</article>

Regards, Andreas
-- 
Andreas Hoenen <andr...@hoenen-terstappen.de>
GPG: 1024D/B888D2CE
     A4A6 E8B5 593A E89B 496B
     82F0 728D 8B7E B888 D2CE
