I am using Maven 1.1-beta-2 and maven-xdoc-plugin-1.9.2 on a Windows XP
workstation.  When I attempt to build UTF-8 encoded HTML from UTF-8 XML
source files, every non-ASCII character is scrambled.  I have not analyzed
the output in detail, but my guess is that each multi-byte character is being
treated as a sequence of single-byte characters.
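
To make that guess concrete, here is a minimal Python sketch of the failure
I suspect.  The sample word and the choice of windows-1252 (cp1252) as the
single-byte charset are assumptions for illustration only; I have not
confirmed which charset, if any, the build actually applies.

```python
# Illustration only: what happens when UTF-8 bytes are decoded with a
# single-byte charset. cp1252 is an ASSUMED platform default for Windows XP.
original = "caf\u00e9"                 # "café", one two-byte character in UTF-8
utf8_bytes = original.encode("utf-8")  # b"caf\xc3\xa9"

# Each byte of the two-byte sequence is turned into its own character.
scrambled = utf8_bytes.decode("cp1252")

# Writing the scrambled text back out as UTF-8 produces a file that is
# valid UTF-8 (so browsers detect it correctly) but shows the wrong glyphs.
double_encoded = scrambled.encode("utf-8")

# The damage is reversible as long as every byte mapped to a cp1252 character.
recovered = scrambled.encode("cp1252").decode("utf-8")

print(repr(scrambled), repr(recovered))
```

If this is what is happening, the output would be well-formed UTF-8 with
one extra character for each extra byte in the source.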

We are using the Maven xDoc plug-in to generate our on-line user guide.  We
sent the English XML source and I18N properties file to be translated into 9
languages.  The returned files are UTF-8 encoded.

Each source file begins with:

<?xml version="1.0" encoding="UTF-8" ?>

I build each language tree separately and then combine the output trees into
a single web site.  In my project properties, I specify:

maven.xdoc.includeProjectDocumentation=no
maven.xdoc.date=navigation-bottom
maven.xdoc.jsl=file:${basedir}/src/site.jsl
maven.docs.outputencoding=UTF-8

maven.docs.src=${basedir}/src/xdoc/en
maven.faq.src=${basedir}/src/xdoc/en/Faq

maven.xdoc.bundle.src=${basedir}/src/i18nBundles
maven.xdoc.bundle=wasce
maven.xdoc.locale.default=en
maven.docs.dest=${maven.build.dir}/docs/en

When I want to generate a site in a different language, I override the
properties on the maven command line like this:

-Dmaven.docs.src=${basedir}/src/xdoc/xx
-Dmaven.faq.src=${basedir}/src/xdoc/xx/Faq
-Dmaven.xdoc.locale.default=xx
-Dmaven.docs.dest=${maven.build.dir}/docs/xx

where xx is replaced with the language to be generated (de, es, fr, it, ko,
pt_BR, ru, zh_CN, zh_TW).
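
The per-language overrides can be sketched as a small Python helper that
builds one argument list per locale.  The helper name `maven_args` and the
goal name `xdoc` are assumptions for illustration, not necessarily what my
build invokes; the property names and paths are the ones shown above.

```python
# Hypothetical helper: assembles the Maven 1 command line for one locale.
# The ${...} placeholders are left literal so that Maven expands them.
LOCALES = ["de", "es", "fr", "it", "ko", "pt_BR", "ru", "zh_CN", "zh_TW"]


def maven_args(locale, goal="xdoc"):
    """Return the argument list for one language build (goal is an assumption)."""
    return [
        "maven",
        "-Dmaven.docs.src=${basedir}/src/xdoc/" + locale,
        "-Dmaven.faq.src=${basedir}/src/xdoc/" + locale + "/Faq",
        "-Dmaven.xdoc.locale.default=" + locale,
        "-Dmaven.docs.dest=${maven.build.dir}/docs/" + locale,
        goal,
    ]


# Print the command for each translated language instead of running it.
for locale in LOCALES:
    print(" ".join(maven_args(locale)))
```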

In my UTF-8 enabled editor, the source files appear to be properly encoded.
Firefox and Internet Explorer both detect the generated HTML as UTF-8, yet
the characters are still scrambled.
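
Rather than relying on the editor and the browsers, both observations could
be checked with a short Python sketch.  The function names are hypothetical,
and the cp1252 round trip is only a heuristic that assumes windows-1252 was
the charset involved.

```python
def is_valid_utf8(data):
    """Return True if the raw bytes decode as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def looks_double_encoded(text):
    """Heuristic: True if the text turns into different, valid text when its
    characters are mapped back to cp1252 bytes and re-read as UTF-8."""
    try:
        repaired = text.encode("cp1252").decode("utf-8")
    except UnicodeError:
        # Not representable in cp1252, or not UTF-8 afterwards: not this bug.
        return False
    return repaired != text


# Example inputs: a correctly decoded word and its scrambled counterpart.
good = "caf\u00e9"
bad = good.encode("utf-8").decode("cp1252")
print(looks_double_encoded(good), looks_double_encoded(bad))
```

Running `is_valid_utf8` on the bytes of a source file and
`looks_double_encoded` on the decoded text of a generated page would show
whether the damage happens before or during the build.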

What am I doing wrong?

Lance
