Hello,

I am trying to build an EPUB from some HTML documentation, after running a few
XPath queries on it. XPath requires well-formed XML, so I tried to clean up the
HTML with xmllint first. If I'm not mistaken, the --html option breaks the
encoding.

Without --html:

$ echo '<title>Introduction — Vue.js</title>' | xmllint --encode UTF-8 --format -

[...]
<title>Introduction — Vue.js</title>



With --html (which seems to be required for entire HTML documents), the "—" is
turned into "&acirc;&#128;&#148;":

$ echo '<title>Introduction — Vue.js</title>' | xmllint --html --htmlout --encode UTF-8 --format -

[...]
<title>Introduction &acirc;&#128;&#148; Vue.js</title>
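
In case it helps with the diagnosis: the three entities look like the UTF-8
bytes of the dash (e2 80 94) read one byte at a time as ISO-8859-1. Below is a
quick check in Python 3. It is my own sketch, not something xmllint prints, and
it rests on my assumption that the HTML parser falls back to a single-byte
encoding when the input declares no charset.

```python
# Sketch: reinterpret the UTF-8 bytes of the dash as ISO-8859-1 (Latin-1).
# Assumption (not confirmed): this is what happens to the input under --html.
dash = "\u2014"  # the dash character from the <title> above

utf8_bytes = dash.encode("utf-8")              # the bytes actually piped to xmllint
misread = utf8_bytes.decode("latin-1")         # each byte taken as one character
code_points = [ord(ch) for ch in misread]

print(utf8_bytes.hex())   # e28094
print(code_points)        # [226, 128, 148]
```

Code point 226 is "â" (&acirc;), and 128 and 148 match &#128; and &#148; in
the output above.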



$ xmllint --version 
xmllint: using libxml version 20904
   compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP 
HTTP DTDValid HTML Legacy C14N Catalog XPath XPointer XInclude Iconv ICU 
ISO8859X Unicode Regexps Automata Expr Schemas Schematron Modules Debug Zlib 
Lzma


Thanks in advance.
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml
